30,119 research outputs found

Post-transcriptional knowledge in pathway analysis increases the accuracy of phenotypes classification

Author: Acunzo Mario
Alaimo Salvatore
Ferro Alfredo
Giugno Rosalba
Pulvirenti Alfredo
Veneziano Dario
Publication venue: 'Impact Journals, LLC'
Publication date: 01/01/2016

Motivation: Prediction of phenotypes from high-dimensional data is a crucial task in precision biology and medicine. Many technologies employ genomic biomarkers to characterize phenotypes. However, such elements are not sufficient to explain the underlying biology. To improve this, pathway analysis techniques have been proposed. Nevertheless, such methods have shown a lack of accuracy in phenotype classification. Results: Here we propose a novel methodology called MITHrIL (Mirna enrIched paTHway Impact anaLysis) for the analysis of signaling pathways, which builds on the work of Tarca et al., 2009. MITHrIL extends pathways by adding missing regulatory elements, such as microRNAs, and their interactions with genes. The method takes as input the expression values of genes and/or microRNAs and returns a list of pathways sorted according to their degree of deregulation, together with the corresponding statistical significance (p-values). Our analysis shows that MITHrIL outperforms its competitors even in the worst case. In addition, our method is able to correctly classify sets of tumor samples drawn from TCGA. Availability: MITHrIL is freely available at the following URL: http://alpha.dmi.unict.it/mithril

arXiv.org e-Print Archive

PubMed Central

Catalogo dei prodotti della ricerca

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Network-based approaches to explore complex biological systems towards network medicine

Author: Conte Federica
Farina Lorenzo
Fiscon Giulia
Paci Paola
Publication venue: 'MDPI AG'
Publication date: 01/01/2018

Network medicine relies on different types of networks: from the molecular level of protein–protein interactions to gene regulatory networks and correlation studies of gene expression. Among network approaches based on the analysis of the topological properties of protein–protein interaction (PPI) networks, we discuss the widespread DIAMOnD (disease module detection) algorithm. Starting from the assumption that PPI networks can be viewed as maps where diseases can be identified with localized perturbations within a specific neighborhood (i.e., disease modules), DIAMOnD performs a systematic analysis of the human PPI network to uncover new disease-associated genes by exploiting connectivity significance instead of connection density. The past few years have witnessed an increasing interest in understanding the molecular mechanism of post-transcriptional regulation, with a special emphasis on non-coding RNAs, since they are emerging as key regulators of many cellular processes in both physiological and pathological states. Recent findings show that coding genes are not the only targets that microRNAs interact with. In fact, there is a pool of different RNAs, including long non-coding RNAs (lncRNAs), competing with each other to attract microRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The framework of regulatory networks provides a powerful tool to gather new insights into ceRNA regulatory mechanisms. Here, we describe a data-driven model recently developed to explore the lncRNA-associated ceRNA activity in breast invasive carcinoma. On the other hand, a very promising example of a co-expression network is the one implemented by the software SWIM (switch miner), which combines topological properties of correlation networks with gene expression data in order to identify a small pool of genes, called switch genes, critically associated with drastic changes in cell phenotype. Here, we describe the SWIM tool along with its applications to cancer research and compare its predictions with DIAMOnD disease genes.
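
The idea of ranking candidate genes by connectivity significance rather than connection density can be illustrated with a hypergeometric tail probability. The sketch below is a minimal illustration, not the DIAMOnD implementation: the function names, the toy network sizes, and the exact choice of population size are assumptions made here for the example.

```python
from math import comb


def connectivity_pvalue(n_total, n_seeds, degree, seed_links):
    """Probability that a node with `degree` links has at least `seed_links`
    of them pointing to seed (disease) nodes, if links were placed at random
    among `n_total` nodes of which `n_seeds` are seeds (hypergeometric tail)."""
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, i) * comb(n_total - n_seeds, degree - i)
        for i in range(seed_links, upper + 1)
    )
    return tail / comb(n_total, degree)


def rank_candidates(candidates, n_total, n_seeds):
    """Sort hypothetical candidates, given as (name, degree, seed_links),
    by increasing p-value, i.e. most significant connectivity first."""
    scored = [
        (name, connectivity_pvalue(n_total, n_seeds, degree, seed_links))
        for name, degree, seed_links in candidates
    ]
    return sorted(scored, key=lambda item: item[1])


# Toy example: a hub with many links but few to seeds versus a
# low-degree node whose links mostly reach seeds.
toy = [("hub", 50, 3), ("focused", 4, 3)]
ranking = rank_candidates(toy, n_total=1000, n_seeds=20)
```

In this toy setting both nodes have three links to seed nodes, yet the low-degree node ranks first because three seed links out of four are far less likely by chance than three out of fifty.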

Directory of Open Access Journals

Archivio della ricerca- Università di Roma La Sapienza

Prediction of non-genotoxic carcinogenicity based on genetic profiles of short term exposure assays

Author: González José Rolando
Peral Garcia Pilar
Perez Luis Orlando
Publication venue: 'The Korean Society of Toxicology'
Publication date: 01/06/2016

Non-genotoxic carcinogens are substances that induce tumorigenesis by non-mutagenic mechanisms, and long term rodent bioassays are required to identify them. Recent studies have shown that transcription profiling can be applied to develop early identifiers for long term phenotypes. In this study, we used rat liver expression profiles from the NTP (National Toxicology Program, Research Triangle Park, USA) DrugMatrix Database to construct a gene classifier that can distinguish between non-genotoxic carcinogens and other chemicals. The model was based on short term exposure assays (3 days) and the training was limited to oxidative stressors, peroxisome proliferators and hormone modulators. Validation of the predictor was performed on independent toxicogenomic data (TG-GATEs, Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System, Osaka, Japan). To build our model we used Random Forests together with a recursive elimination algorithm (VarSelRF). Gene set enrichment analysis was employed for functional interpretation. A total of 770 microarrays comprising 96 different compounds were analyzed and a predictor of 54 genes was built. Prediction accuracy was 0.85 in the training set, 0.87 in the test set and increased with increasing concentration in the validation set: 0.6 at low dose, 0.7 at medium doses and 0.81 at high doses. Pathway analysis revealed a prominence of genes involved in cellular respiration, energy production and lipoprotein metabolism. The main goal of toxicogenomics is to accurately predict the toxicity of unknown drugs. In this analysis, we presented a classifier that can predict non-genotoxic carcinogenicity by using short term exposure assays. In this approach, dose level is critical when evaluating chemicals at early time points.

Affiliation: Perez, Luis Orlando. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
Affiliation: González José, Rolando. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
Affiliation: Peral Garcia, Pilar. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico CONICET- La Plata. Instituto de Genética Veterinaria; Argentin

CONICET Digital

PubMed Central

Reverse Engineering Gene Networks with ANN: Variability in Network Inference Algorithms

Author: A Barabasi
A Baralla
A Braunstein
A Braunstein
A Krishnan
A Margolin
B Di Camillo
B Matthews
C Bishop
C Marr
C Steinhoff
D Marbach
D Stokic
E Dimitrova
E Keedwell
F He
F Markowetz
F Mordelet
G Altay
G Altay
G Karlebach
Giuseppe Jurman
I Nemenman
J Faith
J Peregrin-Alvarez
J Supper
L Song
M Bailly-Bechet
M Bansal
Marco Grimaldi
MB Eisen
N Friedman
P Baldi
P Erdös
P Langfelder
P Meyer
Paolo Provero
R De Smet
R Neal
R Neal
Roberto Visintainer
S Kauffman
S Lahabar
S Tuna
T Cover
VA Huynh-Thu
Y Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 24/09/2010

Motivation: Reconstructing the topology of a gene regulatory network is one of the key tasks in systems biology. Despite the wide variety of proposed methods, very little work has been dedicated to the assessment of their stability properties. Here we present a methodical comparison of the performance of a novel method (RegnANN) for gene network inference based on multilayer perceptrons with three reference algorithms (ARACNE, CLR, KELLER), focusing our analysis on the prediction variability induced by both the intrinsic structure of the network and the available data. Results: The extensive evaluation on both synthetic data and a selection of gene modules of "Escherichia coli" indicates that all the algorithms suffer from instability and variability issues with regard to the reconstruction of the topology of the network. This instability makes it objectively very hard to establish which method performs best. Nevertheless, RegnANN shows MCC scores that compare very favorably with all the other inference methods tested. Availability: The software for the RegnANN inference algorithm is distributed under GPL3 and it is available at the corresponding author home page (http://mpba.fbk.eu/grimaldi/regnann-supmat
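
The MCC (Matthews correlation coefficient) scores mentioned above compare a predicted network topology with a reference one. A minimal sketch of that comparison follows, assuming undirected, unweighted edges without self-loops; the helper names and the four-node toy network are hypothetical and are not taken from the RegnANN software.

```python
from math import sqrt


def confusion_counts(true_edges, predicted_edges, nodes):
    """Count TP, FP, FN, TN over all unordered node pairs."""
    true_set = {frozenset(edge) for edge in true_edges}
    pred_set = {frozenset(edge) for edge in predicted_edges}
    tp = fp = fn = tn = 0
    node_list = sorted(nodes)
    for i, u in enumerate(node_list):
        for v in node_list[i + 1:]:
            pair = frozenset((u, v))
            in_true = pair in true_set
            in_pred = pair in pred_set
            if in_true and in_pred:
                tp += 1
            elif in_pred:
                fp += 1
            elif in_true:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy reference and inferred networks on four genes.
nodes = {"A", "B", "C", "D"}
reference = [("A", "B"), ("B", "C"), ("C", "D")]
inferred = [("A", "B"), ("B", "C"), ("A", "D")]
score = mcc(*confusion_counts(reference, inferred, nodes))
```

MCC ranges from -1 to 1, with 0 corresponding to a prediction no better than chance, which makes it useful when true edges are much rarer than absent ones.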

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Directory of Open Access Journals

PubMed Central

Novel translational approaches to the search for precision therapies for acute respiratory distress syndrome.

Author: Calfee Carolyn S
Meyer Nuala J
Publication venue: eScholarship, University of California
Publication date: 01/06/2017

In the 50 years since acute respiratory distress syndrome (ARDS) was first described, substantial progress has been made in identifying the risk factors for and the pathogenic contributors to the syndrome and in characterising the protein expression patterns in plasma and bronchoalveolar lavage fluid from patients with ARDS. Despite this effort, however, pharmacological options for ARDS remain scarce. Frequently cited reasons for this absence of specific drug therapies include the heterogeneity of patients with ARDS, the potential for a differential response to drugs, and the possibility that the wrong targets have been studied. Advances in applied biomolecular technology and bioinformatics have enabled breakthroughs for other complex traits, such as cardiovascular disease or asthma, particularly when a precision medicine paradigm, wherein a biomarker or gene expression pattern indicates a patient's likelihood of responding to a treatment, has been pursued. In this Review, we consider the biological and analytical techniques that could facilitate a precision medicine approach for ARDS.

eScholarship - University of California

Isoform-level gene signature improves prognostic stratification and accurately classifies glioblastoma subtypes.

Author: Bi Yingtao
Davuluri Ramana V
Macyszyn Luke
O'Rourke Donald M
Pal Sharmistha
Showe Louise C
Publication venue: eScholarship, University of California
Publication date: 06/02/2014

Molecular stratification of tumors is essential for developing personalized therapies. Although patient stratification strategies have been successful, computational methods to accurately translate a gene signature from a high-throughput platform to a clinically adaptable low-dimensional platform are currently lacking. Here, we describe PIGExClass (platform-independent isoform-level gene-expression based classification-system), a novel computational approach to derive and then transfer gene signatures from one analytical platform to another. We applied PIGExClass to design a reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR) based molecular-subtyping assay for glioblastoma multiforme (GBM), the most aggressive primary brain tumor. Unsupervised clustering of TCGA (The Cancer Genome Atlas Consortium) GBM samples, based on isoform-level gene-expression profiles, recaptured the four known molecular subgroups but switched the subtype for 19% of the samples, resulting in significant (P = 0.0103) survival differences among the refined subgroups. The PIGExClass-derived four-class classifier, which requires only 121 transcript variants, assigns the molecular subtype of GBM patients with 92% accuracy. This classifier was translated to an RT-qPCR assay and validated in an independent cohort of 206 GBM samples. Our results demonstrate the efficacy of PIGExClass in the design of clinically adaptable molecular subtyping assays and have implications for developing robust diagnostic assays for cancer patient stratification.

PubMed Central

eScholarship - University of California

Qualitative Assessment of Gene Expression in Affymetrix Genechip Arrays

Author: Ashkenazy
Balazsi
Beran
Chen
Chen
di Bernardo
Gardner
Golub
Held
Hu
Irizarry
Irizarry
Kantelhardt
Lockhart
Makse
Meenakshi Upreti
Naef
Nagarajan
Peng
Radhakrishnan Nagarajan
Schena
Shehadeh
Speed
Theiler
Ueda
Xu
Yeung
Publication venue: 'Elsevier BV'
Publication date: 19/05/2006

Affymetrix Genechip microarrays are used widely to determine the simultaneous expression of genes in a given biological paradigm. Probes on the Genechip array are atomic entities which by definition are randomly distributed across the array and in turn govern the gene expression. In the present study, we make several interesting observations. We show that there is considerable correlation between the probe intensities across the array, which defies the independence assumption. While the mechanism behind such correlations is unclear, we show that the scaling behavior and the profiles of perfect match (PM) as well as mismatch (MM) probes are similar and immune to background subtraction. We believe that the observed correlations are possibly an outcome of inherent non-stationarities or patchiness in the array, devoid of biological significance. This is demonstrated by inspecting the scaling behavior and profiles of the PM and MM probe intensities obtained from publicly available Genechip arrays from three eukaryotic genomes, namely Drosophila melanogaster, Homo sapiens and Mus musculus, across distinct biological paradigms and across laboratories, with and without background subtraction. The fluctuation functions were estimated using detrended fluctuation analysis (DFA) with fourth order polynomial detrending. The results presented in this study provide new insights into correlation signatures of PM and MM probe intensities and suggest the choice of DFA as a tool for qualitative assessment of Affymetrix Genechip microarrays prior to their analysis. A more detailed investigation is necessary in order to understand the source of these correlations.

Comment: 22 Pages, 7 Figures, 1 Tabl
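
For readers unfamiliar with DFA, the fluctuation function referred to above can be written in its standard textbook form. The symbols below are assumptions introduced here for illustration and do not appear in the abstract: $x_i$ is the sequence of probe intensities, $N$ its length, $n$ the window size, and $y_n(k)$ the local polynomial trend (fourth order in this study) fitted within each window of length $n$.

```latex
\begin{align}
  y(k) &= \sum_{i=1}^{k} \left( x_i - \bar{x} \right),
  \qquad \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \\
  F(n) &= \sqrt{ \frac{1}{N} \sum_{k=1}^{N} \left[ y(k) - y_n(k) \right]^2 } \\
  F(n) &\sim n^{\alpha}
\end{align}
```

Under this formulation a scaling exponent $\alpha$ near 0.5 indicates uncorrelated data, while $\alpha > 0.5$ indicates persistent long-range correlations of the kind the abstract reports.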

arXiv.org e-Print Archive

Crossref