Causal Compressive Sensing for Gene Network Inference

Author: A Butte
A Fujita
A Margolin
A Rao
A Shojaie
A Werhli
AC Lozano
Amin Emad
BE Perrin
BS Chen
C Olsen
C Sima
CA Penfold
CWJ Granger
D Husmeier
D Marbach
D Ruklisa
Daniele Marinazzo
DL Donoho
E Van Den Berg
EJ Candès
F Emmert-Streib
G Altay
G Della Gatta
G Stolovitzky
H de Jong
HE Samad
I Cantone
J Dingel
J Dougherty
J Watkinson
J Wright
J Yu
JF Geweke
JJ Faith
K Liang
M Bansal
M Deng
M Xu
M Zou
ML Whitfield
N Friedman
N Mukhopadhyay
Olgica Milenkovic
PE Meyer
PM Long
R Laubenbacher
R Penrose
R Tibshirani
RR Vallabhajosyula
S Becker
S Kauffman
T Chen
TS Gardner
W Dai
W Liu
W Zhao
X Cai
Y Prat
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

We propose a novel framework for studying causal inference of gene interactions using a combination of compressive sensing and Granger causality techniques. The gist of the approach is to discover sparse linear dependencies between time series of gene expressions via a Granger-type elimination method. The method is tested on the Gardner dataset for the SOS network in E. coli, for which both known and unknown causal relationships are discovered.
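
As an illustration of the Granger-type idea mentioned in this abstract, the following minimal sketch checks whether the past of one series improves a lag-1 linear prediction of another. It is not the authors' algorithm: the sparsity-promoting (compressive sensing) step is omitted, the lag of 1 is an assumption, and the helper names `solve`, `rss`, and `granger_gain` as well as the synthetic series are hypothetical.

```python
import random

def solve(a, b):
    """Solve the square system a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w

def rss(columns, target):
    """Residual sum of squares of the least-squares fit target ~ columns."""
    k, n = len(columns), len(target)
    gram = [[sum(columns[i][t] * columns[j][t] for t in range(n)) for j in range(k)]
            for i in range(k)]
    rhs = [sum(columns[i][t] * target[t] for t in range(n)) for i in range(k)]
    w = solve(gram, rhs)
    return sum((target[t] - sum(w[i] * columns[i][t] for i in range(k))) ** 2
               for t in range(n))

def granger_gain(cause, effect):
    """Relative drop in RSS when cause[t-1] is added to a lag-1 model of effect[t]."""
    target = effect[1:]
    ones = [1.0] * len(target)
    restricted = rss([ones, effect[:-1]], target)
    full = rss([ones, effect[:-1], cause[:-1]], target)
    return (restricted - full) / restricted if restricted > 0 else 0.0

# Synthetic (made-up) series: y follows x with a one-step delay plus small noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
gain_xy = granger_gain(x, y)  # large: the past of x helps predict y
gain_yx = granger_gain(y, x)  # small: the past of y does not help predict x
```

A gain close to 1 suggests a directed dependency, while a gain near 0 suggests the candidate can be eliminated; a practical method would also need a significance threshold and a way to handle many candidate genes at once.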

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

FigShare

Graph Kernels

Author: Borgwardt Karsten M.
Kondor Risi
Schraudolph Nicol N.
Vishwanathan S. V. N.
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2010

We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexity of kernel computation between unlabeled graphs with n vertices from O(n^6) to O(n^3). We find a spectral decomposition approach even more efficient when computing entire kernel matrices. For labeled graphs we develop conjugate gradient and fixed-point methods that take O(dn^3) time per iteration, where d is the size of the label set. By extending the necessary linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for d-dimensional edge kernels, and O(n^4) in the infinite-dimensional case; on sparse graphs these algorithms only take O(n^2) time per iteration in all cases. Experiments on graphs from bioinformatics and other application domains show that these techniques can speed up computation of the kernel by an order of magnitude or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004) when specialized to graphs reduce to our random walk graph kernel. Finally, we relate our framework to R-convolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment kernel of Fröhlich et al. (2006) yet provably positive semi-definite.
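
To make the fixed-point approach mentioned in this abstract concrete, here is a minimal sketch of a geometric random walk kernel between two unlabeled graphs. It rests on assumptions that are not taken from the abstract: plain (unnormalized) symmetric adjacency matrices, uniform start and stop weights, and a decay factor `lam` small enough for the iteration to converge. The function name `random_walk_kernel` is hypothetical. The sketch avoids building the product graph explicitly by iterating X = P + lam * A1 X A2^T in matrix form.

```python
def random_walk_kernel(a1, a2, lam, tol=1e-12, max_iter=1000):
    """Geometric random walk kernel via fixed-point iteration (illustrative sketch).

    a1, a2: symmetric adjacency matrices as lists of lists.
    lam: decay factor; assumed smaller than 1 / (rho(a1) * rho(a2)) for convergence.
    """
    n1, n2 = len(a1), len(a2)
    p = 1.0 / (n1 * n2)  # uniform start/stop weight on product-graph nodes (assumption)
    x = [[p] * n2 for _ in range(n1)]
    for _ in range(max_iter):
        # t = a1 @ x
        t = [[sum(a1[i][k] * x[k][j] for k in range(n1)) for j in range(n2)]
             for i in range(n1)]
        # new_x = P + lam * (t @ a2^T), i.e. one step of x = p + lam * W x
        new_x = [[p + lam * sum(t[i][l] * a2[j][l] for l in range(n2))
                  for j in range(n2)] for i in range(n1)]
        change = max(abs(new_x[i][j] - x[i][j])
                     for i in range(n1) for j in range(n2))
        x = new_x
        if change < tol:
            break
    # Kernel value q^T x with uniform stopping weights q.
    return p * sum(sum(row) for row in x)

edge = [[0, 1], [1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

k_edge = random_walk_kernel(edge, edge, 0.5)
k_tp = random_walk_kernel(triangle, path, 0.1)
k_pt = random_walk_kernel(path, triangle, 0.1)
```

For dense graphs each iteration here uses two small matrix products instead of a product with the full n^2 by n^2 product-graph matrix, which is the kind of saving the abstract refers to; the Sylvester equation and spectral decomposition methods are not shown.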

Caltech Authors

Computational biology in the 21st century

Author: Berger Leighton Bonnie
Daniels Noah
Yu Yun William
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/05/2018

Computational biologists answer biological and biomedical questions by using computation in support of, or in place of, laboratory procedures, hoping to obtain more accurate answers at a greatly reduced cost. The past two decades have seen unprecedented technological progress with regard to generating biological data; next-generation sequencing, mass spectrometry, microarrays, cryo-electron microscopy, and other high-throughput approaches have led to an explosion of data. However, this explosion is a mixed blessing. On the one hand, the scale and scope of data should allow new insights into genetic and infectious diseases, cancer, basic biology, and even human migration patterns. On the other hand, researchers are generating datasets so massive that it has become difficult to analyze them to discover patterns that give clues to the underlying biological processes. Funding: National Institutes of Health (U.S.) (grant GM108348); Hertz Foundation.

Entropy-scaling search of massive biological data

Author: Berger Bonnie
Daniels Noah M.
Danko David Christian
Yu Y. William
Publication venue: 'Elsevier BV'
Publication date: 01/06/2015

Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres), if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology. Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
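
The role of covering hyperspheres in this abstract can be illustrated with a generic two-stage range search: points are grouped into balls of a fixed radius, and a query first discards whole balls using the triangle inequality before examining individual points. This is a sketch under assumptions, not the implementation of Ammolite, MICA, or esFragBag; the function names `build_cover` and `range_search`, the Euclidean metric, the greedy clustering, and the random test points are all hypothetical choices made for illustration.

```python
import math
import random

def build_cover(points, cluster_radius):
    """Greedily choose centers so every point lies within cluster_radius of its center."""
    clusters = []  # list of (center, members) pairs
    for p in points:
        for center, members in clusters:
            if math.dist(p, center) <= cluster_radius:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))  # p starts a new ball
    return clusters

def range_search(clusters, cluster_radius, query, radius):
    """Return all points within `radius` of `query`, skipping balls that cannot contain hits."""
    hits = []
    for center, members in clusters:
        # Triangle inequality: a member within `radius` of the query implies
        # its center is within radius + cluster_radius of the query.
        if math.dist(query, center) <= radius + cluster_radius:
            hits.extend(p for p in members if math.dist(query, p) <= radius)
    return hits

# Made-up data: 500 random points in the unit square.
rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(500)]
clusters = build_cover(points, 0.1)
query = (0.5, 0.5)
hits = range_search(clusters, 0.1, query, 0.15)
brute = [p for p in points if math.dist(query, p) <= 0.15]
```

The number of balls in `clusters` plays the role of the covering count: the coarse stage touches one center per ball, so the fewer balls needed to cover the data, the less work each query does before the fine stage.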

arXiv.org e-Print Archive

Computational solutions for omics data

Author: A Butte
A Chatr-aryamontri
A Franceschini
A Joshi
A Lan
A Mortazavi
A Subramanian
A Tanay
AC Jungkamp
AJ Pinho
AK Wong
AR Whitney
B Langmead
B Paten
Bonnie Berger
BP Kelley
C Huttenhower
C Kingsford
C Trapnell
C Wang
CH Yeang
CJ Vaske
CS Liao
D Croft
D Earl
D Kim
D Park
DB Allison
DB Jaffe
DR Zerbino
E Banks
E Cerami
E Nabieva
E Segal
E Yeger-Lotem
EJ Rossin
ER Mardis
ES Lander
ET Wang
F Hach
F Markowetz
F Ozsolak
F Vandin
F Vezzi
GE Zinman
H Li
I Ulitsky
IA Adzhubei
J Butler
J Clarke
J Flannick
J Goecks
J Lamb
J Pandey
JC Marioni
JC Venter
Jian Peng
JT Dudley
JT Leek
JT Simpson
K Rhrissorrakrai
KI Goh
KY Yeung
L Parts
LD Stein
LH Hartwell
LM Heiser
LR Meyer
M Ascano
M Burrows
M Garber
M Gross
M Gstaiger
M Hafner
M Hsi-Yang Fritz
M Kircher
M Koyuturk
M Narayanan
M Reich
M Schatz
M Schmid
M Sirota
M Steffen
M Yandell
MB Gerstein
MC Brandon
MC Schatz
MG Grabherr
MH Maathuis
ML Metzker
Mona Singh
N Atias
N de Souza
N Tuncbag
NP Palmer
NT Ingolia
O Hirose
O Litvin
O Ogasawara
O Stegle
O Vanunu
P Ferragina
P Flicek
P Jiang
P Kumar
P Lu
P Shannon
PA Pevzner
PE Compeau
PG Doyle
PO Brown
PR Loh
PR Schmid
R Colak
R Gaujoux
R Li
R Singh
RC Gentleman
S Anders
S Batzoglou
S Christley
S Deorowicz
S Erten
S Kohler
S Levy
S Navlakha
S Ng
S Suthram
SA Chowdhury
SD Kahn
SF Altschul
SG Tringe
SL Salzberg
SS Huang
SS Shen-Orr
T Barrett
T Ideker
T Michoel
TS Furey
U Manber
UD Akavia
W Ali
W Li
W Tembe
WJ Kent
X Liu
X Wang
X Zhou
Y Prat
Y Wang
Y Zhang
YA Kim
Z Tu
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/04/2013

High-throughput experimental technologies are generating increasingly massive and complex genomic data sets. The sheer enormity and heterogeneity of these data threaten to make the arising problems computationally infeasible. Fortunately, powerful algorithmic techniques lead to software that can answer important biomedical questions in practice. In this Review, we sample the algorithmic landscape, focusing on state-of-the-art techniques, the understanding of which will aid the bench biologist in analysing omics data. We spotlight specific examples that have facilitated and enriched analyses of sequence, transcriptomic and network data sets. Funding: National Institutes of Health (U.S.) (Grant GM081871)

Dietary grape pomace supplementation in dairy cows: Effect on nutritional quality of milk and its derived dairy products

Author: Ianni A.
Martino G.
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

Grape pomace (GP) is the main solid by-product of winemaking and represents a rich source of potent bioactive compounds which could display a wide range of beneficial effects in human health for their association with reduced risk of several chronic diseases. Several studies have proposed the use of GP as a macro-ingredient to obtain economically worthwhile animal feedstuffs naturally enriched by polyphenols and dietary fibers. Moreover, the research carried out in this field in the last two decades evidences the ability of GP to induce beneficial effects in cow milk and its derived dairy products. First of all, a general increase in concentration of polyunsaturated fatty acids (PUFA) was observed, and this could be considered the reflection of the high content of these compounds in the by-product. Furthermore, an improvement in the oxidative stability of dairy products was observed, presumably as a direct consequence of the high content of bioactive compounds in GP that are credited with high and well-characterized antioxidant functions. Last but not least, particularly in ripened cheeses, volatile compounds (VOCs) were identified, arising both from lipolytic and proteolytic processes and commonly associated with pleasant aromatic notes. In conclusion, the GP introduction in the diet of lactating cows made it possible to obtain dairy products characterized by improved nutritional properties and high health functionality. Furthermore, the presumable improvement of organoleptic properties seems to be effective in contributing to an increase in the consumer acceptability of the novel products. This review aims to evaluate the effect of the dietary GP supplementation on the quality of milk and dairy products deriving from lactating dairy cows.

Archivio della Ricerca - Università degli Studi di Teramo

Algorithms for Inferring Multiple Microbial Networks

Author: Tavakoli Sahar
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2020

The interactions among the constituent members of a microbial community play a major role in determining the overall behavior of the community and the abundance levels of its members. These interactions can be modeled using a network whose nodes represent microbial taxa and edges represent pairwise interactions. A microbial network is a weighted graph that is constructed from a sample-taxa count matrix and can be used to model co-occurrences and/or interactions of the constituent members of a microbial community. The nodes in this graph represent microbial taxa and the edges represent pairwise associations amongst these taxa. A microbial network is typically constructed from a sample-taxa count matrix that is obtained by sequencing multiple biological samples and identifying taxa counts. From large-scale microbiome studies, it is evident that microbial community compositions and interactions are impacted by environmental and/or host factors. Thus, it is not unreasonable to expect that a sample-taxa matrix generated as part of a large study involving multiple environmental or clinical parameters can be associated with more than one microbial network. However, to our knowledge, microbial network inference methods proposed thus far assume that the sample-taxa matrix is associated with a single network. This dissertation addresses the scenario when the sample-taxa matrix is associated with K microbial networks and considers the computational problem of inferring K microbial networks from a given sample-taxa matrix. The contributions of this dissertation include 1) new frameworks to generate synthetic sample-taxa count data; 2) novel methods to combine mixture modeling with probabilistic graphical models to infer multiple interaction/association networks from microbial count data; 3) dealing with the compositionality aspect of microbial count data; 4) extensive experiments on real and synthetic data; 5) new methods for model selection to infer the correct value of K.