37,770 research outputs found

Multi-test Decision Tree and its Application to Microarray Data Classification

Author: Marcin Czajkowski
Marek Grześ
Marek Kretowski
Publication venue: 'Elsevier BV'
Publication date: 01/05/2014

Objective: A desirable property of tools used to investigate biological data is that they produce easy-to-understand models and predictive decisions. Decision trees are particularly promising in this regard because their comprehensible structure resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have a tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. Methods: We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. Results: Experimental validation was performed on several real-life gene expression datasets. Comparisons with eight classifiers show that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers, and it was highly competitive with ensemble learning algorithms. The proposed solution outperformed its baseline algorithm on 14 datasets by an average of 6 percent. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. Conclusion: This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high-dimensional microarray data than their popular counterparts.
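
The idea of applying several univariate tests in one non-terminal node can be illustrated with a minimal sketch. The abstract does not say how the tests are combined, so the simple majority vote used here is an assumption, and the names `UnivariateTest` and `multi_test_route`, the feature indices, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class UnivariateTest:
    """One threshold test on a single feature (e.g. one gene's expression)."""
    feature: int      # index of the feature being tested (hypothetical)
    threshold: float  # split point (hypothetical)

    def goes_left(self, sample: Sequence[float]) -> bool:
        return sample[self.feature] <= self.threshold


def multi_test_route(tests: Sequence[UnivariateTest],
                     sample: Sequence[float]) -> str:
    """Route a sample through a node holding several univariate tests.

    Assumption: the branch is chosen by a strict majority of the tests;
    ties fall to the right branch.
    """
    left_votes = sum(t.goes_left(sample) for t in tests)
    return "left" if 2 * left_votes > len(tests) else "right"


# A node with three tests on three different features.
node = [UnivariateTest(0, 0.5), UnivariateTest(2, 1.0), UnivariateTest(3, -0.2)]
sample = [0.4, 9.9, 1.3, -0.5]
branch = multi_test_route(node, sample)  # two of the three tests vote "left"
```

With an odd number of tests no tie can occur, and a single noisy feature value cannot flip the routing on its own, which is one plausible reading of the "more stable and reliable predictions" the abstract mentions.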

Crossref

Kent Academic Repository

Stochastic Modeling of Expression Kinetics Identifies Messenger Half-Lives and Reveals Sequential Waves of Co-ordinated Transcription and Decay

Author: Cacace Filippo
Cusimano Valerio
Farina Lorenzo
Germani Alfredo
Paci Paola
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

The transcriptome in a cell is finely regulated by a large number of molecular mechanisms that control the balance between mRNA production and degradation. Recent experimental findings have shown that fine and specific regulation of degradation is needed for proper orchestration of a global cell response to environmental conditions. We developed a computational technique based on stochastic modeling to infer condition-specific individual mRNA half-lives directly from gene expression time-courses. Predictions from our method were validated by experimentally measured mRNA decay rates during the intraerythrocytic developmental cycle of Plasmodium falciparum. We then applied our methodology to publicly available data on the reproductive and metabolic cycle of budding yeast. Strikingly, our analysis revealed, in all cases, the presence of periodic changes in decay rates of sequentially induced genes and co-ordination strategies between transcription and degradation, thus suggesting a general principle for the proper coordination of transcription and degradation machinery in response to internal and/or external stimuli.

Citation: Cacace F, Paci P, Cusimano V, Germani A, Farina L (2012) Stochastic Modeling of Expression Kinetics Identifies Messenger Half-Lives and Reveals Sequential Waves of Co-ordinated Transcription and Decay. PLoS Comput Biol 8(11): e1002772. doi:10.1371/journal.pcbi.100277
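
For orientation, the link between an mRNA decay rate and a half-life can be sketched with the textbook first-order decay relation, t½ = ln 2 / k. This is not the stochastic inference method of the paper. The function names and the example numbers are hypothetical, and the two-point estimate assumes pure exponential decay with no new transcription.

```python
import math


def half_life(decay_rate: float) -> float:
    """Half-life for first-order decay: t_half = ln(2) / k."""
    if decay_rate <= 0:
        raise ValueError("decay_rate must be positive")
    return math.log(2) / decay_rate


def decay_rate_from_two_points(x0: float, x1: float, dt: float) -> float:
    """Decay rate k from abundances x0 and x1 measured dt apart.

    Assumes x(t) = x0 * exp(-k * t), i.e. no production during the interval.
    """
    if x0 <= 0 or x1 <= 0 or dt <= 0:
        raise ValueError("abundances and dt must be positive")
    return math.log(x0 / x1) / dt


# Hypothetical example: abundance drops from 100 to 25 over 20 minutes,
# which is two halvings, so the half-life is 10 minutes (up to rounding).
k = decay_rate_from_two_points(100.0, 25.0, 20.0)
t_half = half_life(k)
```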

CiteSeerX

Directory of Open Access Journals

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

FigShare

CAMUR: Knowledge extraction from RNA-seq cancer data through equivalent classification rules

Author: Bertolazzi Paola
Cestarelli Valerio
Felici Giovanni
Fiscon Giulia
Weitschek Emanuel
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2015

Methods for extracting knowledge from Next Generation Sequencing data are in high demand. In this work, we focus on RNA-seq gene expression analysis, and specifically on case-control studies with rule-based supervised classification algorithms that build a model able to discriminate cases from controls. State-of-the-art algorithms compute a single classification model that contains few features (genes). In contrast, our goal is to elicit a larger amount of knowledge by computing many classification models, and therefore to identify most of the genes related to the predicted class.

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

Interpretable Categorization of Heterogeneous Time Series Data

Author: Kochenderfer Mykel J.
Lee Ritchie
Mengshoel Ole J.
Silbermann Joshua
Publication date: 26/01/2018

Understanding heterogeneous multivariate time series data is important in many applications ranging from smart homes to aviation. Learning models of heterogeneous multivariate time series that are also human-interpretable is challenging and not adequately addressed by the existing literature. We propose grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs extend decision trees with a grammar framework. Logical expressions derived from a context-free grammar are used for branching in place of simple thresholds on attributes. The added expressivity enables support for a wide range of data types while retaining the interpretability of decision trees. In particular, when a grammar based on temporal logic is used, we show that GBDTs can be used for the interpretable classification of high-dimensional and heterogeneous time series data. Furthermore, we show how GBDTs can also be used for categorization, which is a combination of clustering and generating interpretable explanations for each cluster. We apply GBDTs to analyze the classic Australian Sign Language dataset as well as data on near mid-air collisions (NMACs). The NMAC data comes from aircraft simulations used in the development of the next-generation Airborne Collision Avoidance System (ACAS X).

Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data Mining (SDM) 201

arXiv.org e-Print Archive

Crossref

NASA Technical Reports Server

Iron deficiency-mediated stress regulation of four subgroup Ib BHLH genes in Arabidopsis thaliana

Author: Bauer P.
Baumlein H.
Jakoby M.
Klatte M.
Wang H.
Weisshaar B.
Publication date: 01/09/2007

MPG.PuRe

Pulsed Feedback Defers Cellular Differentiation

Author: Joe H. Levine
Jonathan Dworkin
Michael B. Elowitz
Michael Laub
Michelle E. Fontes
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

Environmental signals induce diverse cellular differentiation programs. In certain systems, cells defer differentiation for extended time periods after the signal appears, proliferating through multiple rounds of cell division before committing to a new fate. How can cells set a deferral time much longer than the cell cycle? Here we study Bacillus subtilis cells that respond to sudden nutrient limitation with multiple rounds of growth and division before differentiating into spores. A well-characterized genetic circuit controls the concentration and phosphorylation of the master regulator Spo0A, which rises to a critical concentration to initiate sporulation. However, it remains unclear how this circuit enables cells to defer sporulation for multiple cell cycles. Using quantitative time-lapse fluorescence microscopy of Spo0A dynamics in individual cells, we observed pulses of Spo0A phosphorylation at a characteristic cell cycle phase. Pulse amplitudes grew systematically and cell-autonomously over multiple cell cycles leading up to sporulation. This pulse growth required a key positive feedback loop involving the sporulation kinases, without which the deferral of sporulation became ultrasensitive to kinase expression. Thus, deferral is controlled by a pulsed positive feedback loop in which kinase expression is activated by pulses of Spo0A phosphorylation. This pulsed positive feedback architecture provides a more robust mechanism for setting deferral times than constitutive kinase expression. Finally, using mathematical modeling, we show how pulsing and time delays together enable “polyphasic” positive feedback, in which different parts of a feedback loop are active at different times. Polyphasic feedback can enable more accurate tuning of long deferral times. Together, these results suggest that Bacillus subtilis uses a pulsed positive feedback loop to implement a “timer” that operates over timescales much longer than a cell cycle

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Caltech Authors

Recommended from our members

The tumor-promoting functions of Ataxia-telangiectasia mutated (ATM) in cancer cells

Author: Chen Wei-Ta
Publication date: 16/10/2015

Ataxia-telangiectasia mutated (ATM) protein kinase regulates the DNA damage response (DDR) and is associated with cancer suppression by protecting cells from DNA double-strand breaks (DSBs). However, how ATM functions outside of DSB signaling is less clearly understood. Here, we report a new cancer-promoting role for ATM in stimulating cell migration and invasion independently of DSB signaling or induction. We used two highly metastatic human breast cancer cell lines to corroborate that ATM is required for cell migration and invasion. Microarray analysis of cells depleted for ATM identified interleukin-8 (IL-8) as a target since the exogenous addition of IL-8 rescued migration and invasion defects in ATM-deficient cells. Finally, ATM depletion in human cancer cells reduced lung metastasis in a mouse xenograft model. These findings shed light on tumor-promoting functions of ATM. Therefore, in addition to its canonical roles in tumor suppression, ATM promotes tumor progression as well.

Cellular and Molecular Biolog

Texas ScholarWorks

Quantitative single-cell splicing analysis reveals an ‘economy of scale’ filter for gene expression

Author: Ding Fangyuan
Elowitz Michael B.
Publication date: 30/10/2018

In eukaryotic cells, splicing affects the fate of each pre-mRNA transcript, helping to determine whether it is ultimately processed into an mRNA, or degraded. The efficiency of splicing plays a key role in gene expression. However, because it depends on the levels of multiple isoforms at the same transcriptional active site (TAS) in the same cell, splicing efficiency has been challenging to measure. Here, we introduce a quantitative single-molecule FISH-based method that enables determination of the absolute abundances of distinct RNA isoforms at individual TASs. Using this method, we discovered that splicing efficiency behaves in an unexpected ‘economy of scale’ manner, increasing, rather than decreasing, with gene expression levels, opposite to a standard enzymatic process. This behavior could result from an observed correlation between splicing efficiency and spatial proximity to nuclear speckles. Economy of scale splicing represents a non-linear filter that amplifies the expression of genes when they are more strongly transcribed. This method will help to reveal the roles of splicing in the quantitative control of gene expression

Caltech Authors