Inferring gene regulatory networks using ensembles of feature selection techniques

Author: Demeester Piet
Dhaene Tom
Geurts Pierre
Huynh-Thu Vân Anh
Ruyssinck Joeri
Saeys Yvan
Publication date: 01/01/2012

Nonparametric Bayesian inference for perturbed and orthologous gene regulatory networks

Author: Christopher A. Penfold
David L. Wild
Katherine J. Denby
Vicky Buchanan-Wollaston
Publication venue: Oxford University Press (OUP)
Publication date: 09/06/2012

Motivation: The generation of time series transcriptomic datasets collected under multiple experimental conditions has proven to be a powerful approach for disentangling complex biological processes, allowing for the reverse engineering of gene regulatory networks (GRNs). Most methods for reverse engineering GRNs from multiple datasets assume that each of the time series was generated from networks with identical topology. In this study, we outline a hierarchical, non-parametric Bayesian approach for reverse engineering GRNs using multiple time series that can be applied in a number of novel situations, including: (i) where different but overlapping sets of transcription factors are expected to bind in the different experimental conditions, that is, where switching events could potentially arise under the different treatments; and (ii) for inference in evolutionarily related species in which orthologous GRNs exist. More generally, the method can be used to identify context-specific regulation by leveraging time series gene expression data alongside methods that can identify putative lists of transcription factors or transcription factor targets. Results: The hierarchical inference outperforms related (but non-hierarchical) approaches when the networks used to generate the data were identical, and performs comparably even when the networks used to generate the data were independent. The method was subsequently used alongside yeast one-hybrid and microarray time series data to infer potential transcriptional switches in the Arabidopsis thaliana response to stress. The results confirm previous biological studies and allow for additional insights into gene regulation under various abiotic stresses. Availability: The methods outlined in this article have been implemented in Matlab and are available on request.

Crossref

PubMed Central

Warwick Research Archives Portal Repository

A systematic review of data quality issues in knowledge discovery tasks

Author: Corrales David Camilo
Corrales Juan Carlos
Ledezma Agapito Ismael
Publication venue: Universidad de Medellín
Publication date: 07/11/2015

The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Universidad de Medellín: Revistas Científicas

Repositorio Institucional Universidad de Medellín

DIALNET

A hybrid algorithm for Bayesian network structure learning with application to multi-label learning

Author: Aussem Alex
Elghazel Haytham
Gasse Maxime
Publication venue: Elsevier BV
Publication date: 01/11/2014

We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC's ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusion that local structural learning with H2PC, in the form of local neighborhood induction, is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available.
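
The abstract above describes encoding a label powerset as a single multi-class variable. A minimal Python sketch of that encoding step follows. It is the plain label powerset transformation over the full label set, not the minimal label powersets that H2PC identifies from the learned network; the function names `label_powerset_encode` and `label_powerset_decode` are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


def label_powerset_encode(
    label_sets: Iterable[Iterable[str]],
) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Map each distinct label combination to one multi-class index.

    Illustrative only: this is the generic label powerset transformation,
    not the H2PC procedure from the paper.
    """
    class_of: Dict[FrozenSet[str], int] = {}
    encoded: List[int] = []
    for labels in label_sets:
        key = frozenset(labels)  # order of labels within a sample is irrelevant
        if key not in class_of:
            class_of[key] = len(class_of)  # next unused class index
        encoded.append(class_of[key])
    return encoded, class_of


def label_powerset_decode(
    class_ids: Iterable[int], class_of: Dict[FrozenSet[str], int]
) -> List[FrozenSet[str]]:
    """Turn predicted class indices back into label sets."""
    inverse = {idx: key for key, idx in class_of.items()}
    return [inverse[c] for c in class_ids]


if __name__ == "__main__":
    y = [["sports", "politics"], ["politics"], ["politics", "sports"], []]
    classes, mapping = label_powerset_encode(y)
    print(classes)
    print(label_powerset_decode(classes, mapping))
```

Any ordinary multi-class classifier can then be trained on the encoded indices, and its predictions decoded back into label sets.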

arXiv.org e-Print Archive

Crossref

HAL

Hal-Diderot

Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes

Author: Beretta Stefano
Castelli Mauro
Goncalves Ivo
Henriques Roberto
Ramazzotti Daniele
Publication date: 01/01/2018

One of the most challenging tasks when adopting Bayesian Networks (BNs) is learning their structure from data. This task is complicated by the huge search space of possible solutions and by the fact that the problem is NP-hard. Hence, full enumeration of all the possible solutions is not always feasible and approximations are often required. However, to the best of our knowledge, a quantitative analysis of the performance and characteristics of the different heuristics for solving this problem has never been done before. For this reason, in this work, we provide a detailed comparison of many different state-of-the-art methods for structural learning on simulated data, considering BNs with both discrete and continuous variables and with different rates of noise in the data. In particular, we investigate the performance of different widespread scores and algorithmic approaches proposed for the inference and the statistical pitfalls within them.

arXiv.org e-Print Archive

Directory of Open Access Journals

Repositório da Universidade Nova de Lisboa

Estudo Geral

Global Functional Atlas of Escherichia coli Encompassing Previously Uncharacterized Proteins

Author: Ali Mehrab
Babu Mohan
Butland Gareth
Chandran Shamanta
Christopoulos Constantine
Emili Andrew
Eroukova Veronika
Golshani Ashkan
Greenblatt Jack F.
Guo Xinghua
Hu Pingzhao
Janga Sarath Chandra
Moreno-Hagelsieb Gabriel
Musso Gabriela
Nazarians-Armavil Anaies
Nazemof Nazila
Paccanaro Alberto
Phanse Sadhna
Pogoutse Oxana
Wong Peter
Yang Wenhong
Publication venue: Scholars Commons @ Laurier
Publication date: 01/04/2009

One-third of the 4,225 protein-coding genes of Escherichia coli K-12 remain functionally unannotated (orphans). Many map to distant clades such as Archaea, suggesting involvement in basic prokaryotic traits, whereas others appear restricted to E. coli, including pathogenic strains. To elucidate the orphans’ biological roles, we performed an extensive proteomic survey using affinity-tagged E. coli strains and generated comprehensive genomic context inferences to derive a high-confidence compendium for virtually the entire proteome consisting of 5,993 putative physical interactions and 74,776 putative functional associations, most of which are novel. Clustering of the respective probabilistic networks revealed putative orphan membership in discrete multiprotein complexes and functional modules together with annotated gene products, whereas a machine-learning strategy based on network integration implicated the orphans in specific biological processes. We provide additional experimental evidence supporting orphan participation in protein synthesis, amino acid metabolism, biofilm formation, motility, and assembly of the bacterial cell envelope. This resource provides a “systems-wide” functional blueprint of a model microbe, with insights into the biological and evolutionary significance of previously uncharacterized proteins.

Wilfrid Laurier University

Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires

Author: Afzal
Ahmed
Albert
Amanna
Andrews
Angermueller
Apeltsin
Arden
Atchley
Avnir
Barabási
Barak
Bashford-Rogers
Bastian
Baum
Becattini
Ben-Hamo
Berger
Betz
Boc
Bolen
Bolkhovskaya
Bolotin
Bouckaert
Boyd
Breden
Brown
Burnet
Bürckert
Calis
Castro
Chang
Chao
Chen
Ching
Cobey
Collins
Corcoran
Covacu
Csardi
Cui
Dash
de Bourcy
DeKosky
DeWitt
Dziubianau
Elhanati
Ellebedy
Emerson
Felsenstein
Friedensohn
Gadala-Maria
Galson
Geering
Georgiou
Ghraichy
Giribet
Giudicelli
Glanville
Good
Granato
Greiff
Grigaityte
Guindon
Gupta
Hagberg
Halliley
Hammarlund
Heather
Hershberg
Hochreiter
Hoehn
Horns
Howie
Iversen
Jackson
Janeway
Jiang
Johnston
Jost
Jurtz
Kaplinsky
Kendall
Khavrutskii
Kidd
Kidera
Kirik
Konishi
Kumar
Landsverk
Larkin
Lavinder
Laydon
Lee
Lewitus
Li
Lindeman
Lindner
Liu
Love
Lozupone
Madi
Malissen
Mamoshina
Mangul
Manz
Martin
Meng
Miho
Mora
Morisita
Murugan
Nazarov
Nouri
Oakes
Ostmeyer
Paradis
Parameswaran
Parola
Pinheiro
Pollok
Ralph
Ravn
Reddy
Rempala
Revell
Rizzetto
Robinson
Ronquist
Roybal
Rubelt
Safonova
Schliep
Schramm
Schwab
Shannon
Sheng
Shlemov
Shugay
Snir
Stamatakis
Stern
Strauli
Stubbington
Sun
Sun Cinelli
Swofford
Thomas
Tickotsky
Tonegawa
Torkamani
Trepel
VanDuijn
Venturi
Vieira
Vita
Wang
Wardemann
Warren
Watson
Weinstein
Wine
Wu
Yaari
Yang
Yeap
Yermanos
Yokota
Yu
Zhu
Publication venue: Frontiers Media SA
Publication date: 29/11/2017

The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity in order to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic and (iv) machine learning methods applied to dissect, quantify and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology towards coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics.
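
As a small illustration of the diversity methods named in the abstract above, the following Python sketch computes the Shannon entropy of a clonotype list and the corresponding effective number of clones. It assumes a repertoire is given as a flat list of clonotype identifiers, one per sequenced read. The function names and the toy CDR3-like strings are hypothetical, and the review itself may use other indices or estimators.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_diversity(clonotypes: Iterable[str]) -> float:
    """Shannon entropy (natural log) of clonotype frequencies.

    Assumes one identifier per read; an empty input raises ValueError.
    """
    counts = Counter(clonotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("repertoire is empty")
    # H = -sum(p_i * ln p_i) over observed clonotypes
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def hill_number_q1(clonotypes: Iterable[str]) -> float:
    """Effective number of clonotypes, exp(H), i.e. the Hill number of order 1."""
    return math.exp(shannon_diversity(clonotypes))


if __name__ == "__main__":
    # Toy data: one expanded clone and two rare ones (made-up identifiers).
    reads = ["CASSLGTDTQYF"] * 8 + ["CASSPGQGNTEAFF", "CASRDRGYEQYF"]
    print(shannon_diversity(reads))
    print(hill_number_q1(reads))
```

A repertoire dominated by a single expanded clone gives an entropy near zero, while an even repertoire of k clonotypes gives ln k and an effective number of k.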

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

NORA - Norwegian Open Research Archives