Search CORE

711 research outputs found

Stability of building gene regulatory networks with sparse autoregressive models

Author: A Barabasi
A Fujita
A Fujita
A Uren
B Amati
C Fogelberg
F Ferrazzi
F Geier
H de Jong
H Zou
H Zou
I Chaturvedi
J Bretz
J Lawler
Jagath C Rajapakse
K Murphy
M Grzegorczyk
M Petrucci
M Whitfield
N Friedman
P Li
P Menéndez
Piyushkumar A Mundra
R Tibshirani
S Chao
S Kim
T Hastie
T Shimamura
W Zhao
Y Benjamini
Y Huang
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Inferring species interaction networks from species abundance data: a comparative evaluation of various statistical and machine learning methods

Author: Albert
Ali Faisal
Alstroem
Araujo
Augustin
Barabási
Batjes
Beale
Beisner
Blüthgen
Chickering
Cohen
Colin M. Beale
Dale
Davis
de Silva
Dirk Husmeier
Dondelinger
Dunne
Edwards
Elle
Faisal
Frank Dondelinger
Friedman
Friedman
Garcia
Gaston
Geiger
Grafen
Grandvalet
Grzegorczyk
Grzegorczyk
Grzegorczyk
Hagemeijer
Hartemink
Heckerman
Heckerman
Henneman
Holt
Huntley
Husmeier
Ings
La Sorte
Lande
Legendre
Lennon
Losos
Luce
MacKay
MacKay
Maddison
Madigan
Memmott
Memmott
Milo
Murray
Needham
New
Opgen-Rhein
Park
Prentice
Proulx
Rogers
Rolando
Sachs
Schmitz
Schäfer
Schäfer
Schäfer
Sinclair
Smith
Snow
Thomas
Thuiller
Tibshirani
Tipping
Tipping
Tirelli
Tong
Valiente
van Someren
van Veen
Vázquez
Watts
Watts
Werhli
Werhli
Werner
Williams
Williams
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010
Field of study

Crossref

Enlighten

Big data analytics in computational biology and bioinformatics

Author: Byron Kevin
Publication venue: Digital Commons @ NJIT
Publication date: 01/04/2017
Field of study

Big data analytics in computational biology and bioinformatics refers to an array of operations including biological pattern discovery, classification, prediction, inference, clustering as well as data mining in the cloud, among others. This dissertation addresses big data analytics by investigating two important operations, namely pattern discovery and network inference. The dissertation starts by focusing on biological pattern discovery at a genomic scale. Research reveals that the secondary structure in non-coding RNA (ncRNA) is more conserved during evolution than its primary nucleotide sequence. Using a covariance model approach, the stems and loops of an ncRNA secondary structure are represented as a statistical image against which an entire genome can be efficiently scanned for matching patterns. The covariance model approach is then further extended, in combination with a structural clustering algorithm and a random forests classifier, to perform genome-wide search for similarities in ncRNA tertiary structures. The dissertation then presents methods for gene network inference. Vast bodies of genomic data containing gene and protein expression patterns are now available for analysis. One challenge is to apply efficient methodologies to uncover more knowledge about the cellular functions. Very little is known concerning how genes regulate cellular activities. A gene regulatory network (GRN) can be represented by a directed graph in which each node is a gene and each edge or link is a regulatory effect that one gene has on another gene. By evaluating gene expression patterns, researchers perform in silico data analyses in systems biology, in particular GRN inference, where the “reverse engineering” is involved in predicting how a system works by looking at the system output alone. Many algorithmic and statistical approaches have been developed to computationally reverse engineer biological systems. However, there are no known bioin-formatics tools capable of performing perfect GRN inference. Here, extensive experiments are conducted to evaluate and compare recent bioinformatics tools for inferring GRNs from time-series gene expression data. Standard performance metrics for these tools based on both simulated and real data sets are generally low, suggesting that further efforts are needed to develop more reliable GRN inference tools. It is also observed that using multiple tools together can help identify true regulatory interactions between genes, a finding consistent with those reported in the literature. Finally, the dissertation discusses and presents a framework for parallelizing GRN inference methods using Apache Hadoop in a cloud environment

Digital Commons @ New Jersey Institute of Technology (NJIT)

Demand forecasting using exogenous leading indicators

Author: Aghezzaf El-Houssaine
Desmet Bram
Kourentzes Nikolaos
Sagaert Yves
Publication venue
Publication date: 01/01/2014
Field of study

Ghent University Academic Bibliography

Grouped graphical Granger modeling for gene expression regulatory networks discovery

Author: A. C. Lozano
Aressy
Carpenter
Fujita
Li
Liu
N. Abe
Ong
Ray
S. Rosset
Salon
Segal
Xu
Y. Liu
Publication venue: Oxford University Press
Publication date
Field of study

We consider the problem of discovering gene regulatory networks from time-series microarray data. Recently, graphical Granger modeling has gained considerable attention as a promising direction for addressing this problem. These methods apply graphical modeling methods on time-series data and invoke the notion of ‘Granger causality’ to make assertions on causality through inference on time-lagged effects. Existing algorithms, however, have neglected an important aspect of the problem—the group structure among the lagged temporal variables naturally imposed by the time series they belong to. Specifically, existing methods in computational biology share this shortcoming, as well as additional computational limitations, prohibiting their effective applications to the large datasets including a large number of genes and many data points. In the present article, we propose a novel methodology which we term ‘grouped graphical Granger modeling method’, which overcomes the limitations mentioned above by applying a regression method suited for high-dimensional and large data, and by leveraging the group structure among the lagged temporal variables according to the time series they belong to. We demonstrate the effectiveness of the proposed methodology on both simulated and actual gene expression data, specifically the human cancer cell (HeLa S3) cycle data. The simulation results show that the proposed methodology generally exhibits higher accuracy in recovering the underlying causal structure. Those on the gene expression data demonstrate that it leads to improved accuracy with respect to prediction of known links, and also uncovers additional causal relationships uncaptured by earlier works

Crossref

PubMed Central

Dynamic Analysis of High Dimensional Microarray Time Series Data Using Various Dimensional Reduction Methods

Author: Aolhasani Samareh Banafsheh
Publication venue: 'Oklahoma State University Library'
Publication date: 01/12/2013
Field of study

This dissertation focuses on dynamic analysis of reduced dimension models of two microarray time series datasets. Underlying research achieves two main objectives; namely, (1) various dimension reduction techniques used on time series microarray data, and (2) estimating autoregressive coefficients using several penalized regression methods like ridge, SCAD, and lasso.The research methodology includes two research tasks. Firstly, applying several dimension reduction methods on two microarray data sets, and modeling comparisons based on accuracy and computation cost. Secondly, applying the sparse vector autoregressive (SVAR) model to estimate gene regulatory network based on gene expression profile from time series microarray experiment on two datasets and the autoregressive coefficients estimation were calculated using several penalized regression methods, and then performing comparisons among various regression methods for each dimension reduction model.Study results show that the dimension reduction methods producing orthogonal independent variables are performing better because orthogonality leads to reasonable coefficient estimation with low standard errors. On the other hand, regarding dynamic analysis, it could be seen that factor analysis (FA) outperformed the rest of dimension reduction methods with regards to goodness of fit after applying several penalized regression methods on each model. The reason behind this is due to using varimax rotation in FA, in which most of the coordinates are set closer to zero, and in turn makes the data sparser. Hence inducing additional sparsity subject to maintaining a certain goodness of fit.Industrial Engineering & Managemen

SHAREOK repository