My Work
Selected research and projects.
-
We apply machine learning and natural language processing techniques to support large-scale meta-analysis, literature review, and data collection — focusing on the validity and interchangeability of rainfall metrics used in instrumental variables regression. We test state-of-the-art NLP methods to assess their utility in academic research contexts. This paper contributes to both the literature critiquing instrumental variables and the growing interest in the labor market implications of large language models. It is currently a working paper, and results are forthcoming.
-
This paper explores the relationship between antiparasitic drug use during pregnancy and birth outcomes in India. I analyze data from the Demographic and Health Surveys (DHS) for India (2019–2021). I employ OLS with normalized individual sampling weights for women aged 15–49 to account for heterogeneity in selection probabilities and non-response. Using birth weight as a proxy for general health at birth, I find no statistically significant association with antiparasitic use during pregnancy. Poor model performance suggests that the findings should be interpreted with caution, and uncovering true relationships requires a more controlled environment. This was a capstone project for an upper-division biostatistics course I took as an undergraduate.
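As a rough illustration of the weighting step described above, here is a minimal sketch of OLS with normalized sampling weights. It is not the project's actual code: the data are synthetic, the variable names (`treated`, `birth_weight`, `raw_w`) are hypothetical, and normalizing the weights to average 1 is an assumed reading of "normalized".

```python
import numpy as np

def normalize_weights(raw_weights):
    """Rescale sampling weights so they sum to the sample size (mean of 1)."""
    w = np.asarray(raw_weights, dtype=float)
    return w * len(w) / w.sum()

def weighted_ols(x, y, weights):
    """Weighted least squares of y on an intercept and x.

    Scaling each row by sqrt(weight) turns the weighted problem into an
    ordinary least-squares problem.
    """
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))
    beta, *_ = np.linalg.lstsq(X * sqrt_w[:, None],
                               np.asarray(y, dtype=float) * sqrt_w,
                               rcond=None)
    return beta  # [intercept, slope]

# Synthetic stand-in data (NOT DHS data): no true effect of treatment.
rng = np.random.default_rng(0)
n = 500
treated = rng.integers(0, 2, size=n)           # hypothetical drug-use indicator
birth_weight = 3000 + rng.normal(0, 400, n)    # hypothetical birth weight in grams
raw_w = rng.uniform(0.5, 2.0, size=n) * 1000   # hypothetical raw sampling weights

w = normalize_weights(raw_w)
beta = weighted_ols(treated, birth_weight, w)
print("intercept, slope:", beta)
```

Rescaling the weights by a constant leaves the point estimates unchanged. Normalization mainly matters for quantities that depend on the effective sample size, such as standard errors.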
-
We explore common rainfall metrics used in instrumental variables (IV) regression and demonstrate the extent of their ad hoc selection in predicting a range of outcomes. By testing the predictive power of the 14 most common rainfall metrics used in the economics literature on agricultural productivity, we find substantial heterogeneity in metric performance. Moreover, we find evidence of plausible exclusion restriction violations and weak instrument problems in the use of rainfall as an IV, underscoring the need for more systematic, transparent, and well-justified metric selection.
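One common way to screen for the weak instrument problem mentioned above is the first-stage F-statistic. The sketch below is illustrative only and uses synthetic data. The paper's actual diagnostics are not described here, and the names `rain_strong`, `rain_weak`, and `productivity` are hypothetical.

```python
import numpy as np

def first_stage_f(endog, instrument):
    """F-statistic for a single instrument in the first-stage regression.

    Compares the regression of the endogenous variable on an intercept and
    the instrument (unrestricted) with the intercept-only model (restricted).
    """
    x = np.asarray(endog, dtype=float)
    z = np.asarray(instrument, dtype=float)
    n = len(x)
    Z = np.column_stack([np.ones(n), z])
    beta, *_ = np.linalg.lstsq(Z, x, rcond=None)
    rss_unrestricted = np.sum((x - Z @ beta) ** 2)
    rss_restricted = np.sum((x - x.mean()) ** 2)
    q, k = 1, 2  # one excluded instrument; two first-stage parameters
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - k))

# Synthetic example: one rainfall metric that predicts the endogenous
# variable, and one that is unrelated to it.
rng = np.random.default_rng(1)
n = 1000
rain_strong = rng.normal(size=n)
rain_weak = rng.normal(size=n)
productivity = 0.5 * rain_strong + rng.normal(size=n)

f_strong = first_stage_f(productivity, rain_strong)
f_weak = first_stage_f(productivity, rain_weak)
print(f"F (relevant metric):  {f_strong:.1f}")
print(f"F (unrelated metric): {f_weak:.1f}")
```

A widely cited rule of thumb treats a first-stage F below about 10 as a sign of a weak instrument. That threshold is a heuristic, not a formal test.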
-
I identify common pitfalls in instrumental variables regression and make a case for greater scrutiny in IV selection criteria. This paper includes a thorough literature review and proposes a new technique for synthesizing data from large bodies of literature using LLMs. It was written as part of the UROC Program’s Summer Research Internship, where I was advised by Dr. Anna Josephson.
-
We replicate and expand upon "Building Nations through Shared Experiences: Evidence from African Football." The original paper examines whether collective experiences foster the construction of national identity by analyzing the impact of national football victories in sub-Saharan Africa on trust, identity, and violence. We validate the original findings and extend the analysis by suggesting additional robustness checks and opportunities to apply the framework to other contexts.
-
This paper quantifies the significance and magnitude of measurement error in Earth Observation (EO) data in the context of smallholder agricultural productivity. Contributing to the literature critically assessing weather variables and data quality in economics, we find that results across EO sources are not robust to dataset choice and that outcomes are not simple transformations of one another. These findings highlight the need for researchers using EO data to more carefully ensure the robustness of their variable selection and data sources.