Projects

I use statistical and data science methods to address societal challenges. From experimental design and hypothesis testing to classification modeling and sentiment analysis, my projects focus on uncovering insights that contribute to meaningful change. They include analyzing the effects of perceived emotional damage on forgiveness, exploring the relationship between economic systems and public health priorities during the COVID-19 pandemic, and developing the IPEDS CRAN package to improve access to educational data. Across this work, including machine learning models that address global financial equity and combat political misinformation, I prioritize fairness, accessibility, and data-driven solutions that benefit communities.

Statistics: Experimental Design

A Chance of Forgiveness

This study explores the impact of Perceived Emotional Damage (PED) on forgiveness, examining how varying levels of emotional hurt influence the likelihood of granting forgiveness in different scenarios. Four hypotheses were tested, predicting that forgiveness would differ based on the severity of PED, with participants experiencing higher damage (Maximal scenario) being less likely to forgive compared to those in Mediocre or Minimal scenarios.

A one-way ANOVA with follow-up pairwise comparisons revealed significant differences in forgiveness likelihood across the PED levels: greater perceived emotional damage was associated with lower forgiveness. However, the study relied on a convenience sample made up primarily of Smith College students, which limits generalizability. Future studies should recruit more diverse samples and consider additional factors that influence forgiveness.
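For readers unfamiliar with the method, here is a minimal sketch of how a one-way ANOVA F statistic is computed, using only the Python standard library. The ratings below are made up for illustration (a hypothetical 1 to 7 forgiveness scale) and are not the study's data; the function name `one_way_anova_f` is also just an illustrative choice.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)      # number of conditions
    n = len(all_values)  # total number of observations

    # Between-group sum of squares: how far each group mean
    # sits from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-group sum of squares: spread of scores around
    # their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up forgiveness ratings per scenario (NOT the study's data).
minimal = [6, 7, 5, 6, 6]
mediocre = [4, 5, 4, 5, 4]
maximal = [2, 3, 2, 1, 3]

f_stat, df_b, df_w = one_way_anova_f([minimal, mediocre, maximal])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group spread would suggest. In practice, the p-value would come from the F distribution (for example via `scipy.stats.f_oneway`), and a significant result would be followed by pairwise comparisons with a correction for multiple testing.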

Statistics: Hypothesis Testing

Economic Systems and Postmaterialism

This project explores the relationship between economic systems, materialism, and postmaterialism through the lens of the COVID-19 pandemic. Drawing from Pew Research data and global economic freedom indices, the study investigates why more Americans prioritized the economy over public health in the early stages of the pandemic. Hypotheses center on the association between economic freedom and materialist values, examining the psychological costs of economic systems like American Corporate Capitalism (ACC). The analysis compares national economic systems and their impact on perceived COVID-19 danger to uncover how these values influenced public attitudes.

Data Science: Data Accessibility

IPEDS

The IPEDS package provides 2021 data on postsecondary institutions and is designed to be usable with only basic R knowledge. It offers detailed data on admissions, enrollment, staff, financial aid, and institutional offerings, making it a valuable tool for prospective students, college counselors, and researchers interested in higher education. The package gives users a clear picture of colleges and universities so they can evaluate institutions on a variety of key metrics.

Data Science: Classification Modelling

Analysis of Access to Emergency Funds in Sub-Saharan Countries: A Human Rights-Based Approach

Most people require access to emergency funds at least once in their lives. These funds act as an important safety net in emergencies. The purpose of our project is to predict access to emergency funds for adults in Sub-Saharan countries. Our analysis is based on the 2017 Global Findex Database, which includes demographic and financial information for a sample of individuals within each country.

We used a decision tree classifier, implemented in Python, to predict access to emergency funds with 68% accuracy. We assessed the fairness of our model with respect to gender using a variety of group and individual fairness metrics, and we evaluated the implications of each metric with respect to our data and the goals of the analysis. We then implemented a variety of pre-processing, in-processing, and post-processing techniques to minimize bias and maximize fairness. We documented our analysis in a Jupyter notebook to make it accessible to a broader undergraduate audience.
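As an illustration of what a group fairness check can look like, the sketch below computes two common metrics, demographic parity difference and equal opportunity difference, from a classifier's predictions. It is a minimal example under stated assumptions: the labels, predictions, and group values are toy data, the function names are hypothetical, and these two metrics are not necessarily the ones used in the project.

```python
def rate(values):
    """Share of 1s in a list of 0/1 values (0.0 for an empty list)."""
    return sum(values) / len(values) if values else 0.0


def group_fairness(y_true, y_pred, group, group_a, group_b):
    """Compare a binary classifier's behavior for group_a vs. group_b.

    Returns differences (group_a minus group_b) in:
      - selection rate: share predicted positive (demographic parity)
      - true positive rate: share of actual positives predicted
        positive (equal opportunity)
    """
    selection = {}
    tpr = {}
    for g in (group_a, group_b):
        preds = [p for p, s in zip(y_pred, group) if s == g]
        preds_on_positives = [
            p for t, p, s in zip(y_true, y_pred, group) if s == g and t == 1
        ]
        selection[g] = rate(preds)
        tpr[g] = rate(preds_on_positives)
    return {
        "demographic_parity_diff": selection[group_a] - selection[group_b],
        "equal_opportunity_diff": tpr[group_a] - tpr[group_b],
    }


# Toy data (NOT from the Global Findex Database):
# 1 = has access to emergency funds, 0 = does not.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
gender = ["male"] * 4 + ["female"] * 4

metrics = group_fairness(y_true, y_pred, gender, "female", "male")
print(metrics)
```

Values near zero mean the model treats the two groups similarly on that metric. A negative value here means the model predicts access less often for women, either overall (demographic parity) or among those who actually have access (equal opportunity).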

Data Science: NLP Sentiment Analysis

Proof Over Promises

The Proof Over Promises (POP) project aims to combat political misinformation and enhance civic engagement by developing a machine learning tool that connects users with politicians who align with their beliefs. In a politically polarized environment where access to accurate information is crucial, the project will integrate congressional bill texts and roll call voting data from Congress.gov. By performing data wrangling, sentiment analysis, and creating embeddings based on politicians' voting records, the tool will provide users with personalized recommendations of representatives who genuinely support their views. Ultimately, POP seeks to empower citizens to make informed democratic decisions in an increasingly complex political landscape.
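One way the recommendation step could work is sketched below. It is a simplified illustration, not the project's implementation. It assumes each politician is represented by a vector of roll call votes (+1 for yea, -1 for nay, 0 for absent) and that the user answers the same questions in the same encoding. The representative names, votes, and function names are all hypothetical, and raw vote vectors stand in for learned embeddings.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def recommend(user_vector, vote_vectors, top_n=2):
    """Rank politicians by how closely their votes match the user's views."""
    scored = [
        (name, cosine(user_vector, votes))
        for name, votes in vote_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical votes on four bills: +1 yea, -1 nay, 0 absent.
votes = {
    "Rep. A": [1, 1, -1, 1],
    "Rep. B": [-1, -1, 1, -1],
    "Rep. C": [1, -1, -1, 0],
}

# The user's own position on the same four bills.
user = [1, 1, -1, 1]

ranking = recommend(user, votes)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Cosine similarity ranks by direction of agreement rather than by number of votes cast, so a representative with a few absences is not penalized just for having a shorter record.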