This is a collection of projects I worked on during my master’s
program.
Transcriptomics of Thyroid Cancer
To demonstrate proficiency in statistical genetics and omics
analyses, I performed a transcriptome-wide association study (TWAS)
coupled with a gene set enrichment analysis (GSEA) on papillary thyroid
cancer. Using data from The Cancer Genome Atlas (TCGA),
I identified differentially expressed genes and enriched gene sets.
The following figures were created during the project.
Volcano plot showing TWAS results. Each dot is a gene product. The grey
horizontal line marks a q-value of 0.05, and the grey vertical lines mark
log2 fold changes of ±1. Red genes are upregulated compared to normal
tissue, while blue genes are downregulated. The 30 genes with the highest
differential expression are labelled.
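The coloring rule in the caption above can be sketched in a few lines of Python. This is only an illustration, not the code used for the figure: it assumes a gene is colored only when it passes both cutoffs (q-value below 0.05 and absolute log2 fold change of at least 1), which the caption does not state outright, and the function name, gene names, and values are all hypothetical.

```python
def classify_gene(log2_fc, q_value, fc_cutoff=1.0, q_cutoff=0.05):
    """Label a gene 'up' (red), 'down' (blue), or 'ns' (not significant).

    Assumption: both the q-value and the fold-change cutoff must be met.
    """
    if q_value >= q_cutoff or abs(log2_fc) < fc_cutoff:
        return "ns"
    return "up" if log2_fc > 0 else "down"


# Hypothetical (gene, log2 fold change, q-value) rows, for illustration only.
results = [
    ("GENE_A", 2.3, 0.001),   # large increase, significant
    ("GENE_B", -1.7, 0.01),   # large decrease, significant
    ("GENE_C", 0.4, 0.0001),  # significant but small change
    ("GENE_D", 3.0, 0.2),     # large change but not significant
]

labels = {gene: classify_gene(fc, q) for gene, fc, q in results}
print(labels)
```

Requiring both cutoffs is a common convention for volcano plots; an analysis that colors on the q-value alone would drop the fold-change check.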
GSEA plots for the four most enriched gene sets in the TCGA thyroid
cancer project. The locations of gene set members along the log2 fold
change ranking (top panel) and the running enrichment score (bottom
panel) are shown for A) GO:0030574 (collagen catabolism), B) GO:0031424
(keratinization), C) GO:0042403 (thyroid hormone metabolism),
and D) GO:0035082 (axoneme assembly).
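For readers unfamiliar with the bottom panel, the running enrichment score can be sketched as a weighted running sum over the ranked gene list. The Python below is a simplified sketch of the standard GSEA statistic, not the software used for the figures: it steps up at each gene set member (weighted by the absolute value of its ranking score) and steps down at each non-member. The function name, the exponent `p`, and the toy data are hypothetical.

```python
def running_enrichment(ranked_genes, scores, gene_set, p=1.0):
    """Return the running-sum curve and its most extreme value.

    ranked_genes: gene names sorted by decreasing ranking score
    scores: ranking score (e.g. log2 fold change) per gene, same order
    gene_set: set of gene names belonging to the gene set
    p: exponent applied to the absolute score of each set member
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need at least one weighted member and one non-member")

    running, curve = 0.0, []
    for weight, hit in zip(hit_weights, in_set):
        # Step up at members, step down by a constant at non-members.
        running += weight / total_hit if hit else -1.0 / n_miss
        curve.append(running)

    # The enrichment score is the largest deviation from zero.
    return curve, max(curve, key=abs)


# Toy ranking, for illustration only: both set members sit at the top.
genes = ["g1", "g2", "g3", "g4"]
fold_changes = [2.0, 1.0, -1.0, -2.0]
curve, es = running_enrichment(genes, fold_changes, {"g1", "g2"})
print(curve, es)
```

A gene set whose members cluster at either end of the ranking produces a curve with a pronounced peak or trough, which is the shape the bottom panels display.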
UFO Visualization
I worked with other M.S. Biostatistics students to create a
website that explores and analyzes reports of UFO sightings. For this
project, we used data collected by the National UFO Reporting Center
(NUFORC) and adapted by Timothy Renner.
If you are interested, our report
summarizes our findings.
A heatmap showing the number of UFO sightings from 1970 to 2022,
normalized by population.
Breast Cancer Survival
For my final project in Biostatistical Methods 1, I collaborated with
two other students to predict the risk of death in patients with breast
cancer. We built and optimized a logistic regression model and evaluated
its performance across racial groups. Our report is available here.
Separation plots by race. Each stripe represents one patient, and stripes
are arranged by increasing predicted probability of death for (A) White,
(B) Black, and (C) Other race patients. A stripe is colored yellow if the
patient survived and red if the patient died. The black line indicates
the predicted probability of death.
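The layout of a separation plot comes down to one sorting step, sketched below in Python. This is an illustration rather than the project's code: the function name and the patient data are hypothetical, and the colors simply follow the caption above.

```python
def separation_layout(predicted_probs, died):
    """Order patients by predicted probability and assign stripe colors.

    predicted_probs: model-predicted probability of death per patient
    died: True if the patient died, False if the patient survived
    Returns the sorted probabilities (the black line) and stripe colors.
    """
    ordered = sorted(zip(predicted_probs, died), key=lambda pair: pair[0])
    line = [prob for prob, _ in ordered]
    colors = ["red" if outcome else "yellow" for _, outcome in ordered]
    return line, colors


# Hypothetical predictions and outcomes for five patients.
probs = [0.80, 0.10, 0.55, 0.30, 0.95]
outcomes = [True, False, False, True, True]

line, colors = separation_layout(probs, outcomes)
print(line)
print(colors)
```

When a model discriminates well, the red stripes gather at the right-hand side where predicted probabilities are highest; red stripes scattered toward the left mark deaths the model rated as unlikely.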