Methods for genome interpretation: causal gene discovery and personal phenotype prediction
Genome interpretation, the task of elucidating how genomic variation affects phenotypic variation, is one of the central questions of the early 21st century. Deciphering the mapping between genotypes and phenotypes requires collecting large amounts of both genetic and phenotypic data. Phenotypic profiles have been systematically recorded and archived in hospitals and national health-related organizations for years. Human genomes, however, were not sequenced in a high-throughput manner until next-generation sequencing technologies became available in 2005. Since then, vast amounts of genotype-phenotype data have been collected, creating an unprecedented opportunity for genome interpretation. Genome interpretation is an ambitious, poorly understood goal that may require collaboration between many disciplines. In this dissertation, I focus on the development of computational methods for genome interpretation. Following recent interest in relating genotypes and phenotypes, I divide the task into two stages: discovery (Chapters 2-6) and prediction (Chapters 7-10). In the discovery stage, genomic loci associated with a phenotype of interest are identified from sequence-based case-control studies. In the prediction stage, I propose a probabilistic model that predicts personal phenotypes from an individual's genome by integrating many sources of information, including the phenotype-associated loci found in the discovery stage.