Query: UMLS:C0017636 (glioblastoma)
18,345 document(s) hit in 31,850,051 MEDLINE articles (0.00 seconds)
Increased availability of multi-platform genomics data on matched samples has sparked research efforts to discover how diverse molecular features interact both within and between platforms. In addition, simultaneous measurements of genetic and epigenetic characteristics illuminate the roles their complex relationships play in disease progression and outcomes. However, integrative methods for diverse genomics data are faced with the challenges of ultra-high dimensionality and the existence of complex interactions both within and between platforms. We propose a novel modeling framework for integrative analysis based on decompositions of the large number of platform-specific features into a smaller number of latent features. Subsequently we build a predictive model for clinical outcomes accounting for both within- and between-platform interactions based on Bayesian model averaging procedures. Principal components, partial least squares and non-negative matrix factorization as well as sparse counterparts of each are used to define the latent features, and the performance of these decompositions is compared both on real and simulated data. The latent feature interactions are shown to preserve interactions between the original features and not only aid prediction but also allow explicit selection of outcome-related features. The methods are motivated by and applied to a glioblastoma multiforme data set from The Cancer Genome Atlas to predict patient survival times integrating gene expression, microRNA, copy number and methylation data. For the glioblastoma data, we find a high concordance between our selected prognostic genes and genes with known associations with glioblastoma. In addition, our model discovers several relevant cross-platform interactions such as copy number variation associated gene dosing and epigenetic regulation through promoter methylation. On simulated data, we show that our proposed method successfully incorporates interactions within and between genomic platforms to aid accurate prediction and variable selection. Our methods perform best when principal components are used to define the latent features.
IEEE/ACM Trans Comput Biol Bioinform
PMID:Latent Feature Decompositions for Integrative Analysis of Multi-Platform Genomic Data. 2614 92
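The latent-feature idea in the abstract above can be illustrated with a minimal sketch. It assumes principal components as the decomposition (the variant the abstract reports as performing best), takes only the leading component per platform, and forms one between-platform interaction as the product of the two latent scores. The toy matrices and the function names `center` and `first_pc_scores` are hypothetical, and the Bayesian model averaging step is not reproduced here.

```python
import math

def center(X):
    """Subtract each column's mean from a samples-by-features matrix."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]

def first_pc_scores(X, iters=200):
    """Scores on the leading principal axis, found by power iteration."""
    Xc = center(X)
    n, p = len(Xc), len(Xc[0])
    v = [1.0 / math.sqrt(p)] * p  # starting direction (assumed not orthogonal to the axis)
    for _ in range(iters):
        scores = [sum(r[j] * v[j] for j in range(p)) for r in Xc]           # X v
        w = [sum(Xc[i][j] * scores[i] for i in range(n)) for j in range(p)]  # X^T X v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in Xc]

# Hypothetical toy data: 4 matched samples measured on two platforms.
expression = [[2.0, 1.9, 0.5], [1.0, 1.1, 0.4], [3.0, 3.2, 0.6], [0.5, 0.4, 0.5]]
methylation = [[0.8, 0.7], [0.4, 0.5], [0.9, 0.9], [0.2, 0.1]]

z_expr = first_pc_scores(expression)
z_meth = first_pc_scores(methylation)

# Design matrix: two latent main effects plus one between-platform interaction.
design = [[a, b, a * b] for a, b in zip(z_expr, z_meth)]
```

Each row of `design` could then feed any outcome model; the point of the sketch is only that interactions are formed among a few latent features and not among thousands of original ones.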
The human genome contains a large number of protein polymorphisms due to individual genome variation. How many of these polymorphisms lead to altered protein-protein interaction is unknown. We have developed a method to address this question. The intersection of the SKEMPI database (of affinity constants among interacting proteins) and CAPRI 4.0 docking benchmark was docked using HADDOCK, leading to a training set of 166 mutant pairs. A random forest classifier based on the differences in resulting docking scores between the 166 mutant pairs and their wild-types was used to distinguish between variants that have either completely or partially lost binding ability. Fifty percent of non-binders were correctly predicted with a false discovery rate of only 2 percent. The model was tested on a set of 15 HIV-1-human, as well as seven human-human glioblastoma-related, mutant protein pairs: 50 percent of combined non-binders were correctly predicted with a false discovery rate of 10 percent. The model was also used to identify 10 protein-protein interactions between human proteins and their HIV-1 partners that are likely to be abolished by rare non-synonymous single-nucleotide polymorphisms (nsSNPs). These nsSNPs may represent novel and potentially therapeutically-valuable targets for anti-viral therapy by disruption of viral binding.
IEEE/ACM Trans Comput Biol Bioinform
PMID:Predicting nsSNPs that Disrupt Protein-Protein Interactions Using Docking. 2681 31
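The two figures quoted in the abstract above (fraction of non-binders recovered, and false discovery rate) are standard confusion-matrix quantities. The sketch below shows how they are computed, treating "non-binder" as the positive class. The counts are hypothetical, chosen only to reproduce a 50 percent / 2 percent pair; they are not taken from the paper.

```python
def recall_and_fdr(tp, fp, fn):
    """Return (recall, false discovery rate) for the positive class."""
    recall = tp / (tp + fn)  # share of true non-binders that were flagged
    fdr = fp / (tp + fp)     # share of flagged pairs that actually still bind
    return recall, fdr

# Hypothetical counts, NOT the paper's data.
recall, fdr = recall_and_fdr(tp=49, fp=1, fn=49)
```

A low false discovery rate with moderate recall means the classifier misses half of the non-binders but is rarely wrong about the ones it does flag.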
Identifying gene network rewiring under different biological conditions is important for understanding the mechanisms underlying complex diseases. Gaussian graphical models, which assume the data follow the multivariate normal distribution, are widely used to identify gene network rewiring. However, the normality assumption often fails in reality since the data are contaminated by extreme outliers in general. In this study, we propose a new robust differential graphical model to identify gene network rewiring between two conditions based on the multivariate t-distribution. The multivariate t-distribution is more robust to outliers than the normal distribution since it has heavy tails and allows values far from the mean. A fused lasso penalty is used to borrow information across conditions to improve the results. We develop an expectation maximization algorithm to solve the optimization model. Experimental results on simulated data show that our method outperforms the state-of-the-art methods. Our method is also applied to identify gene network rewiring between luminal A and basal-like subtypes of breast cancer, and gene network rewiring between the proneural and mesenchymal subtypes of glioblastoma. Several key genes which drive gene network rewiring are discovered.
IEEE/ACM Trans Comput Biol Bioinform
PMID:Identifying Gene Network Rewiring Using Robust Differential Graphical Model with Multivariate t-Distribution. 3080 72
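Why a t-distribution resists outliers can be seen in the textbook EM updates for a t model with fixed degrees of freedom: each observation receives a weight that shrinks as its standardized distance from the center grows. The sketch below is a univariate illustration under that assumption, not the paper's penalized multivariate algorithm; `t_location_em`, the choice `nu=3.0`, and the data are hypothetical.

```python
def t_location_em(x, nu=3.0, iters=100):
    """EM for the location and scale of a univariate t model with fixed nu."""
    n = len(x)
    mu = sum(x) / n
    var = sum((xi - mu) ** 2 for xi in x) / n
    w = [1.0] * n
    for _ in range(iters):
        # E-step: weight (nu + 1) / (nu + squared standardized distance)
        w = [(nu + 1.0) / (nu + (xi - mu) ** 2 / var) for xi in x]
        # M-step: weighted mean and weighted scale
        mu = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
        var = sum(wi * (xi - mu) ** 2 for wi, xi in zip(w, x)) / n
    return mu, var, w

# Hypothetical data: five values near 10 and one extreme outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
naive_mean = sum(data) / len(data)
robust_mu, robust_var, weights = t_location_em(data)
```

The outlier ends up with a much smaller weight than the other points, so `robust_mu` stays near the bulk of the data while `naive_mean` is pulled toward 50.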
The development and prognosis of complex diseases usually involves changes in regulatory relationships among biomolecules. Understanding how the regulatory relationships change with genetic alterations can help to reveal the underlying biological mechanisms for complex diseases. Although several models have been proposed to estimate the differential network between two different states, they are not suitable to deal with situations where the molecules of interest are affected by other covariates. Nor can they make use of prior information that provides insights about the structures of biomolecular networks. In this study, we introduce a novel weighted fused conditional Gaussian graphical model to jointly estimate two state-specific biomolecular regulatory networks and their difference between two different states. Unlike previous differential network estimation methods, our model can take into account the related covariates and the prior network information when inferring differential networks. The effectiveness of our proposed model is first evaluated based on simulation studies. Experimental results demonstrate that our model outperforms other state-of-the-art differential network estimation models in all cases. We then apply our model to identify the differential gene network between two subtypes of glioblastoma based on gene expression and miRNA expression data. Our model is able to discover known mechanisms of glioblastoma and provide interesting predictions.
IEEE/ACM Trans Comput Biol Bioinform
PMID:Differential Network Analysis via Weighted Fused Conditional Gaussian Graphical Model. 3124 59
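For orientation, a differential network between two states is usually read off as the set of entries where the two state-specific precision matrices differ, and a fused-type penalty encourages that set to be small. The sketch below uses a generic, unweighted fused lasso penalty on off-diagonal entries; the paper's weighting by prior network information and its covariate adjustment are not reproduced, and all names and matrices are hypothetical.

```python
def fused_penalty(theta_a, theta_b, lam_sparse, lam_fused):
    """Generic fused-lasso-type penalty on off-diagonal precision entries."""
    p = len(theta_a)
    sparse = fused = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                sparse += abs(theta_a[i][j]) + abs(theta_b[i][j])  # sparsity within each state
                fused += abs(theta_a[i][j] - theta_b[i][j])        # similarity across states
    return lam_sparse * sparse + lam_fused * fused

def differential_edges(theta_a, theta_b, tol=1e-8):
    """Pairs (i, j), i < j, whose precision entries differ between states."""
    p = len(theta_a)
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(theta_a[i][j] - theta_b[i][j]) > tol]

# Hypothetical 3-node precision matrices for two states.
theta_a = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.0]]
theta_b = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]

edges = differential_edges(theta_a, theta_b)
penalty = fused_penalty(theta_a, theta_b, lam_sparse=0.1, lam_fused=0.5)
```

Here only the edge between nodes 1 and 2 changes across states, so it is the single differential edge; raising `lam_fused` in an actual estimator would push more such differences to exactly zero.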