dc.creator: Stein-O'Brien, Genevieve Lauren
dc.date.accessioned: 2018-05-22T03:38:28Z
dc.date.available: 2018-05-22T03:38:28Z
dc.date.created: 2017-12
dc.date.issued: 2017-06-21
dc.date.submitted: December 2017
dc.identifier.uri: http://jhir.library.jhu.edu/handle/1774.2/58615
dc.description.abstract: Starting from a single fertilized egg, the compendium of human cells is generated via stochastic perturbations of earlier generations. Concurrently, canalization of developmental pathways limits the type and degree of variation to ensure viability; thus, it is unsurprising that deviations early in life have been linked to late-manifesting diseases. Human pluripotent stem cells (hPSCs) are a highly robust and uniquely human experimental system in which to model the sources and consequences of this variability. Further, variation in hPSCs’ transcriptomes has been directly linked to both genomic background and biases in differentiation efficiency. Taking advantage of this link between genomic background and developmental phenotypes, we developed Genome-Wide CoGAPS Analysis in Parallel Sets (GWCoGAPS), the first robust whole-genome Bayesian non-negative matrix factorization (NMF), to find conserved transcriptional signatures representative of the functional effect of human genetic variation. Time course RNA-seq data obtained from three human embryonic stem cell (ESC) lines and three human induced pluripotent stem cell (IPSC) lines in three different experimental conditions were analyzed. GWCoGAPS distinguished shared developmental trajectories from unique transcriptional signatures of each of the cell lines. Further analysis of these “identity” signatures found they were predictive of lineage biases during neuronal differentiation. Additionally, lineage biases were consistent with early differences in morphogenetic phenotypes within monolayer culture, thus linking transcriptional genomic signatures to stable, quantifiable cellular features. To test whether the cell line signatures were genome specific, we next developed the projectoR algorithm to assess a given signature's robustness in independent data sets. By using the identity signatures as inputs to projectoR, we were able to identify samples from the same donor genome in datasets from multiple tissues and across technical platforms, including RNA-seq results from post-mortem brain, microarray data from embryoid bodies, and publicly available datasets. The identification of signatures that define the functional rather than physical background of an individual’s genome has the potential to profoundly influence our view of human variation and disease.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.publisher: Johns Hopkins University
dc.subject: NMF (en_US)
dc.subject: Transcriptomics (en_US)
dc.subject: development (en_US)
dc.subject: human Embryonic Stem Cells (en_US)
dc.subject: projectoR (en_US)
dc.subject: CoGAPS (en_US)
dc.subject: GWCoGAPS (en_US)
dc.title: Finding human genetic variation in whole genome expression data with applications for “missing” heritability: The GWCoGAPS algorithm, the PatternMarkers statistic, and the ProjectoR package
dc.type: Thesis
thesis.degree.discipline: Human Genetics and Molecular Biology
thesis.degree.grantor: Johns Hopkins University
thesis.degree.grantor: School of Medicine
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.
dc.date.updated: 2018-05-22T03:38:29Z
dc.type.material: text
thesis.degree.department: McKusick-Nathans Institute of Genetic Medicine
dc.contributor.committeeMember: Smith, Kirby D.
dc.contributor.committeeMember: Colantuoni, Carlo
dc.contributor.committeeMember: Fertig, Elana
dc.contributor.committeeMember: McKay, Ronald
dc.contributor.committeeMember: Hoeppner, Daniel
dc.contributor.committeeMember: Chenoweth, Josh
dc.publisher.country: USA
dc.creator.orcid: 0000-0001-8681-9110

