This is a collection of projects I worked on during my master’s program.

Transcriptomics of Thyroid Cancer

To demonstrate proficiency in statistical genetics and omics analyses, I performed a transcriptome-wide association study (TWAS) coupled with a gene set enrichment analysis (GSEA) on papillary thyroid cancer. I used data from The Cancer Genome Atlas (TCGA) to identify differentially expressed genes and enriched gene sets. The following figures were created in the course of the project.

Volcano plot showing TWAS results. Each dot is a gene product. The grey horizontal line marks a q-value of 0.05, and the grey vertical lines mark a log2 fold change of ±1. Genes in red are upregulated relative to normal tissue, while genes in blue are downregulated. The 30 genes with the highest differential expression are labelled.
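The coloring rule in the volcano plot can be sketched as a small classification function. This is only an illustration, assuming the vertical lines refer to log2 fold change and that a gene is colored only when it passes both cutoffs; the function name and the example values are hypothetical and are not results from the project.

```python
def classify_gene(log2_fc, q_value, q_cutoff=0.05, lfc_cutoff=1.0):
    """Label a gene using a q-value cutoff and an absolute log2 fold change cutoff.

    Assumption: a gene is colored only if it clears both thresholds.
    """
    if q_value >= q_cutoff or abs(log2_fc) < lfc_cutoff:
        return "not significant"
    return "upregulated" if log2_fc > 0 else "downregulated"


# Hypothetical (log2 fold change, q-value) pairs, for illustration only.
examples = {
    "gene_a": (2.3, 0.001),   # large positive change, significant
    "gene_b": (-1.8, 0.01),   # large negative change, significant
    "gene_c": (0.4, 0.0001),  # significant but small change
    "gene_d": (3.0, 0.2),     # large change but not significant
}

labels = {gene: classify_gene(lfc, q) for gene, (lfc, q) in examples.items()}
for gene, label in labels.items():
    print(gene, label)
```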
GSEA plots for the four most enriched gene sets in the TCGA thyroid cancer project. The locations of gene set members along the log2 fold change ranking (top panel) and the running enrichment score (bottom panel) are shown for A) GO:0030574 (collagen catabolism), B) GO:0031424 (keratinization), C) GO:0042403 (thyroid hormone metabolism), and D) GO:0035082 (axoneme assembly).
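The running enrichment score in the bottom panels can be illustrated with a minimal sketch of the weighted running-sum statistic commonly used in GSEA. It assumes the genes are already ranked by log2 fold change; the function name, the exponent `p`, and the toy values are hypothetical, and this is not the software used in the project.

```python
def running_enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return the running-sum statistic along a ranked gene list.

    ranked_genes: gene names sorted by decreasing score
    scores: ranking metric for each gene, in the same order
    gene_set: set of gene names belonging to the gene set
    p: weighting exponent applied to the absolute score of each hit
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weight = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if n_miss == 0 or hit_weight == 0:
        raise ValueError("need at least one hit with nonzero score and one miss")

    running, total = [], 0.0
    for s, h in zip(scores, hits):
        if h:
            total += abs(s) ** p / hit_weight  # step up at a gene set member
        else:
            total -= 1.0 / n_miss              # step down at a non-member
        running.append(total)
    return running


# Toy ranking, for illustration only.
genes = ["g1", "g2", "g3", "g4"]
metric = [2.0, 1.0, -1.0, -2.0]
curve = running_enrichment_score(genes, metric, {"g1", "g2"})
enrichment_score = max(curve, key=abs)  # largest deviation from zero
print(curve, enrichment_score)
```

The enrichment score reported for a gene set is the point where this curve deviates furthest from zero.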

UFO Visualization

I worked with other M.S. Biostatistics students to create a website that explores and analyzes reports of UFO sightings. We used data collected by the National UFO Reporting Center (NUFORC) and adapted by Timothy Renner for this project. If you are interested, our report summarizes our findings.

A heatmap showing the number of UFO sightings from 1970 to 2022, normalized by population.
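Normalizing by population can be sketched as a per-capita rate. The example below is only an illustration: the regions, counts, populations, and the choice of a per-100,000 scale are all hypothetical and are not taken from the project data.

```python
def sightings_per_100k(sightings, population):
    """Convert raw sighting counts to sightings per 100,000 residents.

    Regions with a missing or zero population are skipped.
    """
    return {
        region: 100_000 * count / population[region]
        for region, count in sightings.items()
        if population.get(region)
    }


# Hypothetical counts and populations, for illustration only.
counts = {"region_a": 50, "region_b": 50, "region_c": 10}
residents = {"region_a": 1_000_000, "region_b": 250_000, "region_c": 0}

rates = sightings_per_100k(counts, residents)
print(rates)
```

Two regions with the same raw count can end up with very different rates, which is why a population-normalized heatmap can look different from a map of raw counts.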

Breast Cancer Survival

For my final project in Biostatistical Methods 1, I collaborated with two other students to predict the risk of death in patients with breast cancer. We built and optimized a logistic regression model and evaluated its performance across racial groups. Our report is available here.

Separation plots by race. Each stripe represents one patient, and stripes are arranged in order of increasing predicted probability of death for (A) White, (B) Black, and (C) Other race patients. Stripes are colored yellow if the patient survived and red if the patient died. The black line indicates the predicted probability of death.
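The ordering behind a separation plot can be sketched in a few lines. This is a minimal illustration with hypothetical predictions and outcomes, not the project's model output; the function name is also an assumption.

```python
def separation_stripes(predicted, died):
    """Order patients by predicted probability and assign each a stripe color.

    predicted: predicted probability of death for each patient
    died: True if the patient died, False if the patient survived
    Returns a list of (color, predicted probability) pairs, lowest risk first.
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    return [("red" if died[i] else "yellow", predicted[i]) for i in order]


# Hypothetical predictions and outcomes, for illustration only.
probs = [0.80, 0.10, 0.45, 0.30]
outcomes = [True, False, True, False]

stripes = separation_stripes(probs, outcomes)
for color, prob in stripes:
    print(color, prob)
```

In a well-separated model the red stripes cluster at the high-probability end, which is what the plot is designed to show at a glance.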

Interactive NOAA Dashboard

I made this dashboard as part of my data science course at Mailman. It pulls data from weather stations across the state of New York and displays various pieces of information using the plotly and leaflet libraries.