    Sample-based Measures of Dysregulation and Heterogeneity in Cancer Molecular Profiles

    KE-DISSERTATION-2021.pdf (4.268Mb)
    Date
    2021-06-29
    Author
    Ke, Qian
    0000-0002-6897-7315
    Abstract
    Computational models are essential to understand the molecular mechanisms underlying cell function and tissue organization in both health and disease. Clinically relevant molecular signatures derived from high-dimensional omics data represent unprecedented tools to personalize treatment. However, the absence of mechanistic underpinnings for the signatures generated by machine learning algorithms, and the absence of robust, quantitative measures of dysregulation, represent important barriers to successful clinical implementation. This thesis is focused on developing general procedures for representing and distinguishing among disease phenotypes, which embed biological mechanisms into the statistical learning process, and quantify levels of dysregulation and heterogeneity. We introduce a unified theory called “divergence” to convert an omics profile to a digital representation by comparing the profile of an individual to the range of landscapes in a baseline population. The reduction in complexity facilitates a more personalized and biologically interpretable analysis of variation. We introduce several new representations of multi-omics profiles which are highly simplified and yet sufficiently rich to account for observed heterogeneity. Starting from the network of gene-gene interactions existing in Reactome, we build a library of “Source-Target Pairs” (STPs); each consists of a “source” gene and a “target” gene whose expression is plausibly controlled by the source gene. An STP is “aberrant” if the source gene is DNA-aberrant and the target gene is RNA-aberrant. To further reduce complexity, we use integer programming to “cover” the disease samples with a set of aberrant STPs, that is, to find the smallest family of STPs such that every sample displays at least one aberrant STP within that family. Given such a covering, inter-sample heterogeneity is quantified by the entropy of the distribution of covering states over the population.
Finally, we develop a prediction model, “weighted voting”, which incorporates gene regulatory network information into the model parameters. The features are aberration states of individual genes and STPs, selected to display sharp differences in distribution among phenotypes of interest. We apply the entire framework to TCGA data from six distinct tumor types, demonstrating that our approach is well-suited to accommodate the expanding complexity of cancer genomes emerging from large consortia projects.
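As a rough illustration of the covering and entropy steps described in the abstract, the following Python sketch works through a toy example. It is not the dissertation's code. The gene names ("A", "X", ...), the sample data, and all function names are hypothetical, and the DNA- and RNA-aberration calls are assumed to be given as sets. An exhaustive search stands in for the integer program, which is only practical for very small inputs, and the covering state of a sample is assumed to be its vector of 0/1 indicators over the STPs in the cover.

```python
from collections import Counter
from itertools import combinations
from math import log2


def aberrant_stps(dna_aberrant, rna_aberrant, stps):
    """Return the STPs (source, target) that are aberrant in one sample.

    An STP is aberrant when its source gene is DNA-aberrant and its
    target gene is RNA-aberrant.
    """
    return {(s, t) for (s, t) in stps if s in dna_aberrant and t in rna_aberrant}


def minimal_cover(sample_stps):
    """Find a smallest family of STPs hitting every coverable sample.

    Exhaustive search replaces the integer program here (assumption).
    Samples with no aberrant STP cannot be covered and are ignored.
    """
    coverable = [s for s in sample_stps.values() if s]
    candidates = sorted(set().union(*coverable)) if coverable else []
    for size in range(1, len(candidates) + 1):
        for family in combinations(candidates, size):
            chosen = set(family)
            if all(chosen & s for s in coverable):
                return family
    return ()


def covering_entropy(sample_stps, cover):
    """Shannon entropy (in bits) of covering states over the samples.

    A covering state is assumed to be the tuple of 0/1 indicators, one
    per STP in the cover, marking which cover STPs are aberrant.
    """
    states = Counter(
        tuple(int(stp in s) for stp in cover) for s in sample_stps.values()
    )
    n = sum(states.values())
    return -sum((c / n) * log2(c / n) for c in states.values())


# Hypothetical STP library and per-sample (DNA-aberrant, RNA-aberrant) calls.
STPS = [("A", "X"), ("B", "Y"), ("C", "Z")]
CALLS = {
    "s1": ({"A"}, {"X"}),
    "s2": ({"A", "B"}, {"X", "Y"}),
    "s3": ({"B"}, {"Y"}),
    "s4": ({"B", "C"}, {"Y"}),
}

sample_stps = {
    name: aberrant_stps(dna, rna, STPS) for name, (dna, rna) in CALLS.items()
}
cover = minimal_cover(sample_stps)
entropy = covering_entropy(sample_stps, cover)
print(cover, entropy)
```

In this toy population no single STP is aberrant in every sample, so the smallest cover has two STPs. The four samples fall into three covering states, giving an entropy of 1.5 bits. A more homogeneous population would concentrate on fewer states and score lower.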
    URI
    http://jhir.library.jhu.edu/handle/1774.2/64370
    Collections
    • ETD -- Doctoral Dissertations

    DSpace software copyright © 2002-2016  DuraSpace
    Policies | Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     
