873 research outputs found

Modeling and identification of gene regulatory networks: A Granger causality approach

Author: Chan SC
Hu Y
Hung YS
Xu WC
Zhang ZG
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

It is of increasing interest in systems biology to discover gene regulatory networks (GRNs) from time-series genomic data, i.e., to explore the interactions among a large number of genes and gene products over time. Currently, one common approach is based on Granger causality, which models the time-series genomic data as a vector autoregressive (VAR) process and estimates the GRNs from the VAR coefficient matrix. The main challenge in identifying VAR models is the high dimensionality of genes combined with the limited number of time points, which results in statistically inefficient solutions and high computational complexity. Therefore, fast and efficient variable selection techniques are highly desirable. In this paper, an introductory review of identification methods and variable selection techniques for VAR models in learning the GRNs is presented. Furthermore, a dynamic VAR (DVAR) model, which accounts for dynamic GRNs changing with time during the experimental cycle, and its identification methods are introduced. © 2010 IEEE. The 9th International Conference on Machine Learning and Cybernetics (ICMLC 2010), Qingdao, China, 11-14 July 2010. In Proceedings of the 9th ICMLC, 2010, v. 6, p. 3073-307
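The idea of reading a network off a VAR coefficient matrix can be illustrated with a minimal sketch. It is not the method of the paper above: it assumes a first-order VAR, plain least squares, and a hypothetical magnitude threshold, and it works only because the toy data has many more time points than genes. The function names (`fit_var1`, `edges_from_coefficients`) and all numeric values are illustrative assumptions. The high-dimensional, few-samples regime described in the abstract would need the variable selection techniques it mentions.

```python
import numpy as np

def fit_var1(X):
    """Least-squares fit of X[t] ~ A @ X[t-1] for a (T, n) array X."""
    past, future = X[:-1], X[1:]
    # Solve past @ B ~ future in the least-squares sense; then A = B.T.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T

def edges_from_coefficients(A, threshold):
    """Return sorted (source, target) pairs with |A[target, source]| > threshold.

    The threshold is a hypothetical cutoff, not a calibrated significance test.
    Self-loops (the diagonal of A) are ignored.
    """
    targets, sources = np.nonzero(np.abs(A) > threshold)
    return sorted((int(s), int(t)) for s, t in zip(sources, targets) if s != t)

# Toy data (assumed for illustration): gene 0 drives gene 1, gene 2 is independent.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.0, 0.0],
                   [0.4, 0.5, 0.0],
                   [0.0, 0.0, 0.5]])
X = np.zeros((2000, 3))
for t in range(1, X.shape[0]):
    X[t] = A_true @ X[t - 1] + 0.1 * rng.standard_normal(3)

A_hat = fit_var1(X)
edges = edges_from_coefficients(A_hat, threshold=0.2)
print(np.round(A_hat, 2))
print(edges)
```

Entry `A[i, j]` describes the influence of gene `j` at time `t-1` on gene `i` at time `t`, which is why the edge direction is taken from column to row.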

HKU Scholars Hub

A temporal precedence based clustering method for gene expression microarray data

Author: Buchanan-Wollaston Vicky
Krishna Ritesh V.
Li Chang-Tsun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing whether one time-series can be used for forecasting the other. Results: A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system. Conclusions: Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits.
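The pipeline in this abstract (pairwise Granger tests, an association matrix, then a graph step) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a single lag, compares the F statistic against a hypothetical `cutoff` instead of a calibrated p-value, and substitutes ordinary connected components for the paper's "highly connected components" technique. All function names and the toy data are assumptions.

```python
import numpy as np

def granger_f_stat(x, y):
    """F statistic for 'x Granger-causes y' using one lag.

    Compares y[t] ~ c + y[t-1] (restricted) with
    y[t] ~ c + y[t-1] + x[t-1] (full).
    """
    target = y[1:]
    ones = np.ones(len(target))
    restricted = np.column_stack([ones, y[:-1]])
    full = np.column_stack([ones, y[:-1], x[:-1]])
    rss = []
    for design in (restricted, full):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        rss.append(float(resid @ resid))
    dof = len(target) - full.shape[1]
    # One extra regressor in the full model, so the numerator has 1 degree of freedom.
    return (rss[0] - rss[1]) / (rss[1] / dof)

def association_matrix(X, cutoff):
    """M[i, j] is True when series i is judged to Granger-cause series j."""
    n = X.shape[1]
    M = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j:
                M[i, j] = granger_f_stat(X[:, i], X[:, j]) > cutoff
    return M

def connected_components(M):
    """Connected components of the undirected version of M (depth-first search)."""
    undirected = M | M.T
    seen, components = set(), []
    for start in range(len(undirected)):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in np.flatnonzero(undirected[node]):
                nb = int(nb)
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(sorted(comp))
    return components

# Toy data (assumed): series 0 drives series 1 with a one-step delay; 2 and 3 are noise.
rng = np.random.default_rng(1)
T = 1000
X = rng.standard_normal((T, 4))
X[1:, 1] = 0.8 * X[:-1, 0] + 0.3 * rng.standard_normal(T - 1)

# A deliberately strict cutoff keeps false positives rare across the 12 pairwise tests.
M = association_matrix(X, cutoff=20.0)
components = connected_components(M)
print(M.astype(int))
print(components)
```

With many genes the number of pairwise tests grows quadratically, so a real analysis would need a proper multiple-testing correction in place of the fixed cutoff used here.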

Deakin Research Online

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

The Local Edge Machine: inference of dynamic models of gene regulation

Author: Adam R. Leman
Anastasia Deckard
Christina M. Kelliher
John B. Hogenesch
John L. Harer
Kevin A. McGoff
Lauren J. Francey
Steven B. Haase
Xin Guo
Publication venue: Springer Nature
Publication date: 22/05/2017

We present a novel approach, the Local Edge Machine, for the inference of regulatory interactions directly from time-series gene expression data. We demonstrate its performance, robustness, and scalability on in silico datasets with varying behaviors, sizes, and degrees of complexity. Moreover, we demonstrate its ability to incorporate biological prior information and make informative predictions on a well-characterized in vivo system using data from budding yeast that have been synchronized in the cell cycle. Finally, we use an atlas of transcription data in a mammalian circadian system to illustrate how the method can be used for discovery in the context of large complex networks. Department of Applied Mathematic

The Hong Kong Polytechnic University Pao Yue-kong Library

Springer - Publisher Connector

PolyU Institutional Repository

FigShare

Causal machine learning for single-cell genomics

Author: Aliee Hananeh
Bauer Stefan
Bengio Yoshua
Bertin Paul
Tejada-Lapuerta Alejandro
Theis Fabian J.
Publication date: 23/10/2023

Advances in single-cell omics allow for unprecedented insights into the transcription profiles of individual cells. When combined with large-scale perturbation screens, through which specific biological mechanisms can be targeted, these technologies allow for measuring the effect of targeted perturbations on the whole transcriptome. These advances provide an opportunity to better understand the causative role of genes in complex biological processes such as gene regulation, disease progression or cellular development. However, the high-dimensional nature of the data, coupled with the intricate complexity of biological systems, renders this task nontrivial. Within the machine learning community, there has been a recent increase of interest in causality, with a focus on adapting established causal techniques and algorithms to handle high-dimensional data. In this perspective, we delineate the application of these methodologies within the realm of single-cell genomics and their challenges. We first present the model that underlies most current causal approaches to single-cell biology and discuss and challenge the assumptions it entails from the biological point of view. We then identify open problems in the application of causal approaches to single-cell data: generalising to unseen environments, learning interpretable models, and learning causal models of dynamics. For each problem, we discuss how various research directions - including the development of computational approaches and the adaptation of experimental protocols - may offer ways forward, or on the contrary pose some difficulties. With the advent of single cell atlases and increasing perturbation data, we expect causal models to become a crucial tool for informed experimental design. Comment: 35 pages, 7 figures, 3 tables, 1 bo

arXiv.org e-Print Archive

Big data analytics in computational biology and bioinformatics

Author: Byron Kevin
Publication venue: Digital Commons @ NJIT
Publication date: 01/04/2017

Big data analytics in computational biology and bioinformatics refers to an array of operations including biological pattern discovery, classification, prediction, inference, clustering as well as data mining in the cloud, among others. This dissertation addresses big data analytics by investigating two important operations, namely pattern discovery and network inference. The dissertation starts by focusing on biological pattern discovery at a genomic scale. Research reveals that the secondary structure in non-coding RNA (ncRNA) is more conserved during evolution than its primary nucleotide sequence. Using a covariance model approach, the stems and loops of an ncRNA secondary structure are represented as a statistical image against which an entire genome can be efficiently scanned for matching patterns. The covariance model approach is then further extended, in combination with a structural clustering algorithm and a random forests classifier, to perform genome-wide search for similarities in ncRNA tertiary structures. The dissertation then presents methods for gene network inference. Vast bodies of genomic data containing gene and protein expression patterns are now available for analysis. One challenge is to apply efficient methodologies to uncover more knowledge about the cellular functions. Very little is known concerning how genes regulate cellular activities. A gene regulatory network (GRN) can be represented by a directed graph in which each node is a gene and each edge or link is a regulatory effect that one gene has on another gene. By evaluating gene expression patterns, researchers perform in silico data analyses in systems biology, in particular GRN inference, where the “reverse engineering” is involved in predicting how a system works by looking at the system output alone. Many algorithmic and statistical approaches have been developed to computationally reverse engineer biological systems. 
However, there are no known bioinformatics tools capable of performing perfect GRN inference. Here, extensive experiments are conducted to evaluate and compare recent bioinformatics tools for inferring GRNs from time-series gene expression data. Standard performance metrics for these tools based on both simulated and real data sets are generally low, suggesting that further efforts are needed to develop more reliable GRN inference tools. It is also observed that using multiple tools together can help identify true regulatory interactions between genes, a finding consistent with those reported in the literature. Finally, the dissertation discusses and presents a framework for parallelizing GRN inference methods using Apache Hadoop in a cloud environment.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Data based identification and prediction of nonlinear and complex dynamical systems

Author: Grebogi Celso
Lai Ying-Cheng
Wang Wen-Xu
Publication venue: 'Elsevier BV'
Publication date: 27/04/2017

We thank Dr. R. Yang (formerly at ASU), Dr. R.-Q. Su (formerly at ASU), and Mr. Zhesi Shen for their contributions to a number of original papers on which this Review is partly based. This work was supported by ARO under Grant No. W911NF-14-1-0504. W.-X. Wang was also supported by NSFC under Grants No. 61573064 and No. 61074116, as well as by the Fundamental Research Funds for the Central Universities, Beijing Nova Programme. Peer reviewed. Postprin

arXiv.org e-Print Archive

Aberdeen University Research

The Local Edge Machine: inference of dynamic models of gene regulation

Author: A Greenfield
A Nayak
Adam R. Leman
AL Barabasi
Anastasia Deckard
CA Penfold
Christina M. Kelliher
CJ Oates
CT Workman
CWJ Granger
D Marbach
DA Orlando
DT Banos
EE Zhang
ENCODE Project Consortium
ER Morrissey
F Dondelinger
G Lillacci
H Kitano
I Simon
J Mazur
J Yu
JM Cherry
JM Raser
JN Bazil
John B. Hogenesch
John L. Harer
Kevin A. McGoff
LA Simmons Kovacs
Lauren J. Francey
M Bansal
M Hecker
MB Elowitz
MC Teixeira
MS Yeung
N Reynolds
NE Buchler
P Zoppoli
PG Bissiri
PT Spellman
R Bonneau
R Wong
R Zhang
RA Fisher
RC Anafi
RS McIsaac
SB Haase
SM Hill
Steven B. Haase
T Pramila
T Äijö
TI Lee
TS Gardner
TS Price
VA Huynh-Thu
W Jiang
Xin Guo
Y Benjamini
Y Peng
Y Setty
Publication venue: 'Springer Science and Business Media LLC'

Crossref