Research

Applied ML research in NLP, development economics, and recommender systems.

In-Context Learning Robustness via Curriculum Learning

2025

UC Berkeley (EECS 282)

Trained GPT-style Transformer models to study robustness of in-context learning under noisy demonstrations across synthetic regression and NLP benchmarks. Evaluated curriculum-based noise schedules on GPT-Neo 2.7B and Llama-2 7B.

Paper: Robustness of In-context Learning via Curriculum Learning | Code
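As a rough illustration of what a curriculum-based noise schedule can look like in the synthetic regression setting, the sketch below ramps label noise linearly from zero to a maximum over training. The linear shape, the function names, and the `max_std` parameter are assumptions for illustration, not the schedules evaluated in the project.

```python
import random


def noise_std_at(step, total_steps, max_std=1.0):
    """Hypothetical linear curriculum: noise grows from 0 to max_std."""
    if total_steps <= 1:
        return max_std
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return max_std * frac


def noisy_demonstrations(xs, weight, step, total_steps, rng, max_std=1.0):
    """Build (x, y) in-context demonstrations for y = weight * x,
    corrupting each label with Gaussian noise at the current curriculum level."""
    std = noise_std_at(step, total_steps, max_std)
    return [(x, weight * x + rng.gauss(0.0, std)) for x in xs]


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 5, 10):
        demos = noisy_demonstrations([1.0, 2.0, 3.0], 2.0, step, 11, rng)
        print(step, noise_std_at(step, 11), demos)
```

Early steps see clean demonstrations and later steps see progressively noisier ones; other schedule shapes (step-wise, cosine) would slot into `noise_std_at` without changing the rest.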

Ask Before You Summarize: NLI-Guided Clarification-Aware Abstractive Summarization

2026

UC Berkeley (CS 288)

Clarification-aware abstractive summarization system that uses multi-sample disagreement and NLI-based uncertainty estimation to decide when to ask a targeted clarification question. The pipeline samples diverse candidates from BART, builds a semantic entailment graph, gates on global uncertainty, localizes the riskiest sentence, and regenerates conditioned on the resolved answer. The full pipeline improves on the greedy baseline by +0.079 mean entailment (p = 0.0001) on a 40-example test set, and ablations confirm that both the selection policy and targeted localization contribute.

Paper: Ask Before You Summarize: NLI-Guided Uncertainty and Clarification-Aware Abstractive Summarization
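The gate-then-localize steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the pairwise entailment matrix and per-sentence support scores are taken as given inputs (in practice they would come from an NLI model), and the aggregation rule, the threshold value, and all function names are hypothetical rather than the project's actual implementation.

```python
def global_uncertainty(entail):
    """Assumed aggregate: 1 minus the mean off-diagonal entailment score.

    entail[i][j] is the (hypothetical) probability that candidate i
    entails candidate j, so strong agreement gives low uncertainty.
    """
    n = len(entail)
    if n < 2:
        return 0.0
    pairs = [entail[i][j] for i in range(n) for j in range(n) if i != j]
    return 1.0 - sum(pairs) / len(pairs)


def should_ask(entail, threshold=0.5):
    """Gate: ask a clarification question only when candidates disagree."""
    return global_uncertainty(entail) > threshold


def riskiest_sentence(sentence_support):
    """Localize: index of the sentence least supported by other candidates."""
    return min(range(len(sentence_support)), key=lambda i: sentence_support[i])


if __name__ == "__main__":
    agree = [[1.0, 0.9, 0.8], [0.9, 1.0, 0.7], [0.8, 0.7, 1.0]]
    disagree = [[1.0, 0.1], [0.2, 1.0]]
    print(should_ask(agree), should_ask(disagree))
    print(riskiest_sentence([0.9, 0.3, 0.7]))
```

Separating the global gate from localization mirrors the ablation structure mentioned above, since each piece can be switched off independently.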

Trading Engagement for Sustainability: Carbon-Aware Re-ranking for E-commerce Recommendations

2026 (ongoing)

UC Berkeley (EECS 294)

Carbon-aware re-ranking framework for e-commerce recommendations. Estimates product carbon footprints by combining retrieval from the Carbon Catalogue with few-shot LLM prompting (Spearman 0.853), then applies a tunable post-hoc re-ranking policy over BPR, LightGCN, and NeuMF candidates (RecBole). On Electronics, BPR sustains full NDCG@10 through λ = 0.90 while cutting average footprint by over 35%, revealing a plateau-then-cliff trade-off structure across all models.

Paper: Trading Engagement for Sustainability: Carbon-Aware Re-ranking for E-commerce Recommendations | Code
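One simple way to realize a tunable post-hoc re-ranking policy is a convex combination of normalized relevance and normalized carbon footprint, controlled by λ. The scoring rule below, the min-max normalization, and the names `rerank` and `min_max` are assumptions for illustration; the project's actual policy may differ.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rerank(items, relevance, carbon, lam):
    """Re-rank candidates by (1 - lam) * relevance + lam * (1 - carbon).

    lam = 0 keeps the recommender's order; lam = 1 ranks purely by
    lowest estimated footprint. Inputs are parallel lists.
    """
    rel = min_max(relevance)
    co2 = min_max(carbon)
    scores = [(1.0 - lam) * r + lam * (1.0 - c) for r, c in zip(rel, co2)]
    order = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return [items[i] for i in order]


if __name__ == "__main__":
    items = ["a", "b", "c"]
    relevance = [0.9, 0.8, 0.1]   # hypothetical recommender scores
    carbon = [50.0, 5.0, 1.0]     # hypothetical footprint estimates
    for lam in (0.0, 0.5, 1.0):
        print(lam, rerank(items, relevance, carbon, lam))
```

In this toy example, item "b" rises to the top at λ = 0.5 because it is nearly as relevant as "a" with a far smaller footprint, which is the kind of near-free substitution that would produce a plateau before relevance starts to drop.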

Seasonal Healthcare Accessibility Modeling (Zambia)

2026 (ongoing)

UC Berkeley (INFO 288)

Modeling barriers to healthcare access using DHS survey data (16k+ responses) combined with geospatial travel-time, rainfall, and infrastructure data. Achieved a Spearman correlation of ~0.53 and identified key drivers of access inequality in rural regions.

Connectivity and Data Management for Environmental Monitoring

2025

NTNU x NINA

Bachelor thesis project developing a scalable system to manage and visualize passive acoustic environmental data, improving automated biodiversity monitoring workflows.

Paper: Connectivity and Data Management for Environmental Monitoring