297 research outputs found

Efficient Training of Graph-Regularized Multitask SVMs

Author: A. Torralba
C. Cortes
D. Bertsekas
K.R. Müller
M. Kloft
R. Fan
R.M. Rifkin
S. Sonnenburg
S. Sonnenburg
T. Evgeniou
T. Joachims
T.W.T.C.C. Consortium
W. Samek
Y. Xue
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and thus are inapplicable when the number of training examples n is large (they are typically limited to n < 20,000, even for just a few tasks). In this paper, we present a primal optimization criterion, allowing for general loss functions, and derive its dual representation. Building on the work of Hsieh et al. [1,2], we derive an algorithm for optimizing the large-margin objective and prove its convergence. Our computational experiments show a speedup of up to three orders of magnitude over LibSVM and SVMLight for several standard benchmarks as well as challenging data sets from the application domain of computational biology. Combining our optimization methodology with the COFFIN large-scale learning framework [3], we are able to train a multi-task SVM using over 1,000,000 training points stemming from 4 different tasks. An efficient C++ implementation of our algorithm is being made publicly available as a part of the SHOGUN machine learning toolbox [4].
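To make the idea of a graph-regularized primal objective more concrete, the following is a minimal sketch of one common textbook form of such an objective. It is not necessarily the exact criterion used in the paper. The function name `mt_svm_objective`, the similarity matrix `A`, the hinge loss, and the weighting of the terms are all illustrative assumptions.

```python
def mt_svm_objective(W, A, X, y, task, C=1.0):
    """Evaluate an assumed graph-regularized multi-task SVM primal objective.

    W    : list of T weight vectors, one per task
    A    : T x T symmetric, non-negative task-similarity matrix (assumed)
    X    : list of n feature vectors
    y    : list of n labels in {-1, +1}
    task : list of n task indices in range(T)
    C    : trade-off between regularization and loss
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    T = len(W)

    # Standard norm regularizer: 1/2 * sum_t ||w_t||^2
    norm_term = 0.5 * sum(dot(w, w) for w in W)

    # Graph regularizer: 1/2 * sum over unordered task pairs of
    # A[s][t] * ||w_s - w_t||^2, pulling similar tasks together.
    graph_term = 0.0
    for s in range(T):
        for t in range(s + 1, T):
            diff = [a - b for a, b in zip(W[s], W[t])]
            graph_term += 0.5 * A[s][t] * dot(diff, diff)

    # Hinge loss; each example is scored by the weight vector of its own task.
    hinge = sum(
        max(0.0, 1.0 - yi * dot(W[ti], xi))
        for xi, yi, ti in zip(X, y, task)
    )
    return norm_term + graph_term + C * hinge


# Toy example with two tasks and two examples (made-up numbers).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.0, 1.0], [1.0, 0.0]]
X = [[1.0, 0.0], [0.0, 2.0]]
y = [1, -1]
task = [0, 1]
value = mt_svm_objective(W, A, X, y, task)
```

Working directly with the weight vectors, as here, avoids forming a kernel matrix over all n training examples, which is the practical motivation the abstract gives for a primal formulation.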

Crossref

MPG.PuRe

Recognition and Degradation of Plant Cell Wall Polysaccharides by Two Human Gut Symbionts

Competition for nutrients contained in diverse types of plant cell wall-associated polysaccharides may explain the evolution of substrate-specific catabolic gene modules in common bacterial members of the human gut microbiota.

Public Library of Science (PLOS)

Crossref

PubMed Central

Chikungunya risk assessment for Europe: recommendations for action

Author: Boutin J.P.
Brooker S.
Coulombier D.
Depoortere E.
Dieckmann S.
Fontenille D.
Gould E.
Nathan M.
Nilsson M.
Schaffner F.
Sonnenburg F., von
Takken W.
Valk H., de
Publication date: 01/01/2006

Since March 2005, 255 000 cases of chikungunya fever are estimated to have occurred on the island of Réunion, a French overseas department in the Indian Ocean [1]. A huge increase in estimated cases occurred at the end of December 2005, culminating in an estimated peak incidence of more than 40 000 cases in week 5 of 2006 [2]. Since then, the estimated weekly incidence has trended downwards, although there have been an estimated 3000 new cases per week since week 13 of 2006. In total, 213 deaths have been linked to the disease [1]. In Mayotte, the nearby French territorial collectivity, 5834 cases have been notified [3]. Chikungunya cases have also been reported on other islands in the Indian Ocean, and imported cases have been confirmed in several European countries.

Wageningen University & Research Publications

jFuzzyLogic: a Java Library to Design Fuzzy Logic Controllers According to the Standard for Fuzzy Control Programming

Author: Acampora G.
Alcalá R.
Alcalá R.
Bonissone P.P.
Cho E.
Chávez F.
Cordón O.
Demir O.
Eskridge B.E.
Hellendoorn H.
Juang Ch.-F.
Lee C.C.
Mamdani E.H.
Mamdani E.H.
Mucientes M.
Otero J.
Pezzulo G.
Reisig W.
Sonnenburg S.
Wang L.X.
Zadeh L.A.
Zhao Y.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2013

Fuzzy Logic Controllers are a specific model of Fuzzy Rule Based Systems suitable for engineering applications for which classic control strategies do not achieve good results or for which it is too difficult to obtain a mathematical model. Recently, the International Electrotechnical Commission has published a standard for fuzzy control programming in part 7 of the IEC 61131 norm in order to offer a well-defined common understanding of the basic means with which to integrate fuzzy control applications in control systems. In this paper, we introduce an open source Java library called jFuzzyLogic which offers a fully functional and complete implementation of a fuzzy inference system according to this standard, providing a programming interface and Eclipse plugin to easily write and test code for fuzzy control applications. A case study is given to illustrate the use of jFuzzyLogic.
McGill University, Genome Quebec; Spanish Government TIN2011-28488; Andalusian Government P10-TIC-685
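As a rough illustration of what a fuzzy inference system does, here is a minimal sketch in plain Python. It does not use jFuzzyLogic or its API. The variable names, the triangular membership functions, the three rules, and the weighted-average defuzzification over singleton outputs are all hypothetical choices made for this example.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fan_speed(temp):
    """Map a temperature to a fan speed with three hypothetical fuzzy rules.

    Rules (illustrative only):
      IF temp IS cold THEN speed IS 20
      IF temp IS warm THEN speed IS 60
      IF temp IS hot  THEN speed IS 100
    """
    # Fuzzification: degree to which the input belongs to each term.
    degrees = {
        "cold": tri(temp, -10.0, 0.0, 20.0),
        "warm": tri(temp, 10.0, 20.0, 30.0),
        "hot": tri(temp, 20.0, 40.0, 50.0),
    }
    # Each rule's consequent is a singleton output value.
    outputs = {"cold": 20.0, "warm": 60.0, "hot": 100.0}

    total = sum(degrees.values())
    if total == 0.0:
        return None  # no rule fires for this input

    # Defuzzification: average of the singletons, weighted by rule activation.
    return sum(degrees[k] * outputs[k] for k in degrees) / total


speed_at_20 = fan_speed(20.0)   # only the "warm" rule fires
speed_at_25 = fan_speed(25.0)   # "warm" and "hot" both fire partially
```

A standard-conforming library would read the terms and rules from a declarative description rather than hard-coding them, so this sketch only shows the fuzzify, evaluate, defuzzify cycle.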

Crossref

Directory of Open Access Journals

Repositorio Institucional Universidad de Granada

Methods to study splicing from high-throughput RNA Sequencing data

Author: A Ameur
A Bhasi
A Dobin
A Mortazavi
A Oshlack
A Roberts
A Roberts
AM Mezlini
AN Brooks
B Jackson
B Kakaradov
B Langmead
B Li
B Li
BJ Haas
BJ Haas
C Trapnell
C Trapnell
C Trapnell
D Hiller
D Singh
DL Wood
DW Bryant
E Eyras
E Lee
E Turro
ET Wang
F Birzele
F Bona De
F Denoeud
F Tang
G Robertson
G Xu
GA Sacomoto
GR Grant
GS Slater
H Bao
H Jiang
H Jiang
H Kim
H Richard
J Behr
J Du
J Feng
J Hu
J Lovén
J Martin
J Salzman
J Seok
J Seok
J Wu
J Wu
JE Allen
JJ Li
JP Venables
K Schneeberger
K Wang
KD Hansen
KF Au
KL Howe
KM Borgwardt
L Chen
L Chen
L Wang
L Wang
LY Chen
M Aschoff
M Fiume
M Garber
M Griffith
M Guttman
M Stanke
M Stanke
M Sultan
MC Ryan
MF Rogers
MG Grabherr
MH Schulz
MT Dimon
N Cloonan
N Cloonan
N Deng
N Leng
N Nicolae
N Philippe
N Vijay
NA Fonseca
O Stegle
P Drewe
P Glaus
PL Martelli
PP Labaj
Q Liu
Q Liu
Q Pan
QY Zhao
R Bohnert
R Guigó
R Li
S Anders
S Djebali
S Filichkin
S Heber
S Huang
S Lee
S Mangul
S Marco-Sola
S Shen
S Sonnenburg
S Srivastava
S Tang
S Zheng
SB Montgomery
SH Nagaraj
SK Lou
T Bonfert
TA Clark
TD Wu
TD Wu
W Li
W Li
W Wang
WJ Kent
Y Hu
Y Katz
Y Li
Y Liao
Y Surget-Groba
Y Xing
Y Xing
Y Zhang
Z Xia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/07/2015

The development of novel high-throughput sequencing (HTS) methods for RNA (RNA-Seq) has provided a very powerful means to study splicing under multiple conditions at unprecedented depth. However, the complexity of the information to be analyzed has turned this into a challenging task. In the last few years, a plethora of tools have been developed, allowing researchers to process RNA-Seq data to study the expression of isoforms and splicing events, and their relative changes under different conditions. We provide an overview of the methods available to study splicing from short RNA-Seq data. We group the methods according to the different questions they address: 1) Assignment of the sequencing reads to their likely gene of origin. This is addressed by methods that map reads to the genome and/or to the available gene annotations. 2) Recovering the sequence of splicing events and isoforms. This is addressed by transcript reconstruction and de novo assembly methods. 3) Quantification of events and isoforms. Either after reconstructing transcripts or using an annotation, many methods estimate the expression level or the relative usage of isoforms and/or events. 4) Providing an isoform or event view of differential splicing or expression. These include methods that compare relative event/isoform abundance or isoform expression across two or more conditions. 5) Visualizing splicing regulation. Various tools facilitate the visualization of the RNA-Seq data in the context of alternative splicing. In this review, we do not describe the specific mathematical models behind each method. Our aim is rather to provide an overview that could serve as an entry point for users who need to decide on a suitable tool for a specific analysis. We also attempt to propose a classification of the tools according to the operations they perform, to facilitate the comparison and choice of methods.
Comment: 31 pages, 1 figure, 9 tables. Small corrections added.
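The "relative usage" of a splicing event mentioned under points 3 and 4 is often summarized as a percent-spliced-in style ratio. The sketch below assumes the simplest possible version, computed from raw read counts without length normalization or statistical modelling. The function names and the counts are hypothetical and are not taken from any specific tool covered by the review.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Fraction of reads supporting inclusion of an event (naive estimate)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # event not covered; no estimate possible
    return inclusion_reads / total


def delta_psi(cond_a, cond_b):
    """Change in inclusion ratio from condition A to condition B.

    Each argument is an (inclusion_reads, exclusion_reads) pair.
    """
    psi_a = percent_spliced_in(*cond_a)
    psi_b = percent_spliced_in(*cond_b)
    if psi_a is None or psi_b is None:
        return None
    return psi_b - psi_a


# Made-up counts for one exon-skipping event in two conditions.
psi_control = percent_spliced_in(30, 10)
change = delta_psi((30, 10), (10, 30))
```

Real tools typically add read-length or effective-length corrections and an uncertainty estimate, which is why low-coverage events are returned here as `None` rather than as a ratio.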

arXiv.org e-Print Archive

Crossref

De-Novo Discovery of Differentially Abundant Transcription Factor Binding Sites Including Their Positional Preference

Author: AD Smith
AM Benotmane
C Linhart
CE Lawrence
CT Harbison
DJ Galas
DJ Lockhart
DJC MacKay
DS Johnson
E Redhead
E Wingender
G Mönke
G Pavesi
GA Wray
GK Sandve
H Wettig
Harmen J. Bussemaker
HM Wallach
IA Paponov
Ivan A. Paponov
Ivo Grosse
J Cerquides
J Davis
J Wu
Jan Grau
JC Bryne
JD Hughes
Jens Keilwagen
LM Hellman
LV Sun
M Tompa
Marc Strickert
NK Kim
O Elemento
S Sonnenburg
S Sonnenburg
Stefan Posch
T Ulmasov
T Ulmasov
TD Schneider
TJ Guilfoyle
TL Bailey
V Matys
VV Raghavan
W Ao
W Thompson
WA Thompson
WD Teale
Publication venue: Public Library of Science
Publication date: 10/02/2011

Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in the case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery.
Hence, we make the tool freely available as a component of the open-source Java framework Jstacs and as a stand-alone application at http://www.jstacs.de/index.php/Dispom

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Considerations for best practices in studies of fiber or other dietary components and the intestinal microbiome

Author: Allen-Vercoe Emma
Chang Eugene B
Chassaing Benoit
Davis Cindy D
Fahey George C, Jr
Hamaker Bruce R
Holscher Hannah D
Karp Robert W
Klurfeld David M
Lampe Johanna W
Lynch Christopher J
Marette Andre
Martens Eric
O'Keefe Stephen J
Rose Devin J
Saarela Maria
Schneeman Barbara O
Slavin Joanne L
Sonnenburg Justin L
Swanson Kelly S
Wu Gary D
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 21/08/2018

Considerations for best practices in studies of fiber or other dietary components and the intestinal microbiome. Am J Physiol Endocrinol Metab 315: E1087–E1097, 2018. First published August 21, 2018; doi:10.1152/ajpendo.00058.2018.
A 2-day workshop organized by the National Institutes of Health and U.S. Department of Agriculture included 16 presentations focused on the role of diet in alterations of the gastrointestinal microbiome, primarily that of the colon. Although thousands of research projects have been funded by U.S. federal agencies to study the intestinal microbiome of humans and a variety of animal models, only a minority addresses dietary effects, and a small subset is described in sufficient detail to allow reproduction of a study. Whereas there are standards being developed for many aspects of microbiome studies, such as sample collection, nucleic acid extraction, data handling, etc., none has been proposed for the dietary component; thus this workshop focused on the latter specific point. It is important to foster rigor in design and reproducibility of published studies to maintain high quality and enable designs that can be compared in systematic reviews. Speakers addressed the influence of the structure of the fermentable carbohydrate on the microbiota and the variables to consider in design of studies using animals, in vitro models, and human subjects. For all types of studies, strengths and weaknesses of various designs were highlighted, and for human studies, comparisons between controlled feeding and observational designs were discussed. Because of the lack of published, best-diet formulations for specific research questions, the main recommendation is to describe dietary ingredients and treatments in as much detail as possible to allow reproduction by other scientists.

DigitalCommons@University of Nebraska

Generating Explainable and Effective Data Descriptors Using Relational Learning: Application to Cancer Biology

Author: A Cherkasov
A Clare
A Gaulton
A Koleti
A Srinivasan
AE Hoerl
DS Wishart
EP Barracchia
I Olier
J Verma
JW Lloyd
L Breiman
L Dehaspe
M Ceci
M Zitnik
MP Menden
NP Tatonetti
R Tibshirani
RD King
RD King
S Fröhler
S Muggleton
S Sonnenburg
SJ Russell
T Dash
T Takeda
W Jeon
Y Chen
Y LeCun
Y Park
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

The key to success in machine learning is the use of effective data representations. The success of deep neural networks (DNNs) is based on their ability to utilize multiple neural network layers, and big data, to learn how to convert simple input representations into richer internal representations that are effective for learning. However, these internal representations are sub-symbolic and difficult to explain. In many scientific problems explainable models are required, and the input data is semantically complex and unsuitable for DNNs. This is true in the fundamental problem of understanding the mechanism of cancer drugs, which requires complex background knowledge about the functions of genes/proteins, their cells, and the molecular structure of the drugs. This background knowledge cannot be compactly expressed propositionally, and requires at least the expressive power of Datalog. Here we demonstrate the use of relational learning to generate new data descriptors in such semantically complex background knowledge. These new descriptors are effective: adding them to standard propositional learning methods significantly improves prediction accuracy. They are also explainable, and add to our understanding of cancer. Our approach can readily be expanded to include other complex forms of background knowledge, and combines the generality of relational learning with the efficiency of standard propositional learning.

Crossref

Chalmers Research

Expression of Colonization Factor CS5 of Enterotoxigenic Escherichia coli (ETEC) Is Enhanced In Vivo and by the Bile Component Na Glycocholate Hydrate

Author: A Svennerholm
AF Hofmann
AG Torres
AM Svennerholm
AM Svennerholm
Ann-Mari Svennerholm
Astrid von Mentzer
BH Abuaita
D Favre
DG Evans
DT Hung
F Qadri
F Qadri
F Qadri
F Qadri
F Qadri
F Qadri
F von Sonnenburg
FE Willson
Firdausi Qadri
H Abe
HI Shaheen
HM Grewal
HM Grewal
I Bölin
J Nataro
J Sanchez
K Barrett
KF Stensrud
LA de Haan
M Begley
M Lunelli
M Nicklasson
M Pichel
M Wolf
Matilda Nicklasson
ME Merritt
Michael Hensel
MK Wolf
MK Wolf
MT Gallegos
S Gupta
TG Duthy
V Rivera-Amill
W Gaastra
WHO
WHO
Å Lothigius
Å Sjöling
Å Sjöling
Åsa Sjöling
Publication venue: Public Library of Science
Publication date: 30/04/2012

Enterotoxigenic Escherichia coli (ETEC) is an important cause of acute watery diarrhoea in developing countries. Colonization factors (CFs) on the bacterial surface mediate adhesion to the small intestinal epithelium. Two of the most common CFs worldwide are coli surface antigens 5 and 6 (CS5, CS6). In this study we investigated the expression of CS5 and CS6 in vivo, and the effects of bile and sodium bicarbonate, present in the human gut, on the expression of CS5. Five CS5+CS6 ETEC isolates from adult Bangladeshi patients with acute diarrhoea were studied. The level of transcription from the CS5 operon was approximately 100-fold higher than from the CS6 operon in ETEC bacteria recovered directly from diarrhoeal stool without sub-culturing (in vivo). The glyco-conjugated primary bile salt sodium glycocholate hydrate (NaGCH) induced phenotypic expression of CS5 in a dose-dependent manner and caused a 100-fold up-regulation of CS5 mRNA levels; this is the first description of NaGCH as an enteropathogenic virulence inducer. The relative transcription levels from the CS5 and CS6 operons in the presence of bile or NaGCH in vitro were similar to those in vivo. Another bile salt, sodium deoxycholate (NaDC), previously reported to induce enteropathogenic virulence, also induced expression of CS5, whereas sodium bicarbonate did not.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases

Author: AA Gomaa
AP Desbois
B Mnif
C Westwater
CJ Paddon
CT Chung
D Bikard
D Dong
D Pérez-Mendoza
DG Gibson
DJ Dwyer
E Deltcheva
F Duan
F Hayes
GA Jacoby
JB Kaper
JE Garneau
JJ Williams
JK Rasheed
JK Rasheed
JL Sonnenburg
JM Pennington
KD Seed
L Chasteen
M Jinek
Mark Mimee
P Mali
P Nordmann
R Barrangou
R Edgar
R Lutz
RB Vercoe
Robert J Citorik
RS Sikorski
S Datta
Timothy K Lu
TK Lu
TK Lu
W Jiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2014

Current antibiotics tend to be broad spectrum, leading to indiscriminate killing of commensal bacteria and accelerated evolution of drug resistance. Here, we use CRISPR-Cas technology to create antimicrobials whose spectrum of activity is chosen by design. RNA-guided nucleases (RGNs) targeting specific DNA sequences are delivered efficiently to microbial populations using bacteriophage or bacteria carrying plasmids transmissible by conjugation. The DNA targets of RGNs can be undesirable genes or polymorphisms, including antibiotic resistance and virulence determinants in carbapenem-resistant Enterobacteriaceae and enterohemorrhagic Escherichia coli. Delivery of RGNs significantly improves survival in a Galleria mellonella infection model. We also show that RGNs enable modulation of complex bacterial populations by selective knockdown of targeted strains based on genetic signatures. RGNs constitute a class of highly discriminatory, customizable antimicrobials that enact selective pressure at the DNA level to reduce the prevalence of undesired genes, minimize off-target effects and enable programmable remodeling of microbiota.
National Institutes of Health (U.S.) (New Innovator Award 1DP2OD008435); National Centers for Systems Biology (U.S.) (Grant 1P50GM098792); United States. Defense Threat Reduction Agency (HDTRA1-14-1-0007); Massachusetts Institute of Technology. Institute for Soldier Nanotechnologies (W911NF13D0001); National Institute of General Medical Sciences (U.S.) (Interdepartmental Biotechnology Training Program 5T32 GM008334); Fonds de la recherche en sante du Quebec (Master's Training Award

DSpace@MIT

Crossref

PubMed Central