Search CORE

666 research outputs found

Overview of Brazilian remote sensing activities

Author: Dejesusparada N.
Sonnenburg C. R.

There are no author-identified significant results in this report

NASA Technical Reports Server

INPE remote sensing program

Author: Dejesusparada N.
Sonnenburg C. R.

There are no author-identified significant results in this report

NASA Technical Reports Server

Improving the Caenorhabditis elegans Genome Annotation Using Machine Learning

Author: Bernhard Schölkopf
Gunnar Rätsch
Hanh Witte
Jagan Srinivasan
Klaus-R Müller
Ralf-J Sommer
Sören Sonnenburg
The Caenorhabditis elegans sequencing consortium
Uwe Ohler
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

For modern biology, precise genome annotations are of prime importance, as they allow the accurate definition of genic regions. We employ state-of-the-art machine learning methods to assay and improve the accuracy of the genome annotation of the nematode Caenorhabditis elegans. The proposed machine learning system is trained to recognize exons and introns on the unspliced mRNA, utilizing recent advances in support vector machines and label sequence learning. In 87% (coding and untranslated regions) and 95% (coding regions only) of all genes tested in several out-of-sample evaluations, our method correctly identified all exons and introns. Notably, only 37% and 50%, respectively, of the presently unconfirmed genes in the C. elegans genome annotation agree with our predictions, thus we hypothesize that a sizable fraction of those genes are not correctly annotated. A retrospective evaluation of the Wormbase WS120 annotation [1] of C. elegans reveals that splice form predictions on unconfirmed genes in WS120 are inaccurate in about 18% of the considered cases, while our predictions deviate from the truth only in 10%–13%. We experimentally analyzed 20 controversial genes on which our system and the annotation disagree, confirming the superiority of our predictions. While our method correctly predicted 75% of those cases, the standard annotation was never completely correct. The accuracy of our system is further corroborated by a comparison with two other recently proposed systems that can be used for splice form prediction: SNAP and ExonHunter. We conclude that the genome annotation of C. elegans and other organisms can be greatly enhanced using modern machine learning technology

CiteSeerX

Public Library of Science (PLOS)

Crossref

Fraunhofer-ePrints

Directory of Open Access Journals

PubMed Central

Caltech Authors

MPG.PuRe

A Unifying View of Multiple Kernel Learning

Author: A. Rakotomamonjy
B. Schölkopf
C. Zhu
F.R. Bach
G.R.G. Lanckriet
H. Zou
K.-R. Müller
M. Kloft
P.L. Bartlett
R.M. Rifkin
R.T. Rockafellar
S. Sonnenburg
V.N. Vapnik
Publication date: 01/01/2010

Recent research on multiple kernel learning has led to a number of approaches for combining kernels in regularized risk minimization. The proposed approaches include different formulations of objectives and varying regularization strategies. In this paper we present a unifying general optimization criterion for multiple kernel learning and show how existing formulations are subsumed as special cases. We also derive the criterion's dual representation, which is suitable for general smooth optimization algorithms. Finally, we evaluate multiple kernel learning in this framework analytically, using a Rademacher complexity bound on the generalization error, and empirically in a set of experiments.
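
As a rough illustration of what "combining kernels" means in this setting, the sketch below forms a nonnegative weighted sum of several kernel matrices and checks that the result is still positive semidefinite. It is not the unifying criterion from the paper: the weights are fixed by hand rather than learned, and the helper names (`rbf_kernel`, `linear_kernel`, `combine_kernels`) and the toy data are assumptions made only for this example.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def linear_kernel(X):
    """Linear kernel matrix <x_i, x_j>."""
    return X @ X.T


def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices with nonnegative weights.

    The weights are normalized to sum to one. In multiple kernel
    learning they would be optimized jointly with the classifier;
    here they are simply given (an assumption of this sketch).
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be nonnegative and not all zero")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))


# Toy data: 20 points in 3 dimensions (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

kernels = [linear_kernel(X), rbf_kernel(X, 0.5), rbf_kernel(X, 2.0)]
K = combine_kernels(kernels, [0.2, 0.5, 0.3])

# A nonnegative combination of PSD matrices is PSD, so the smallest
# eigenvalue should be nonnegative up to rounding error.
min_eig = np.linalg.eigvalsh(K).min()
print("smallest eigenvalue of combined kernel:", min_eig)
```

The combined matrix `K` can be passed to any kernel method that accepts a precomputed kernel; the regularization strategies the abstract mentions differ in how the weight vector is constrained.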

arXiv.org e-Print Archive

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive

Assessment of the damage caused by the frost of 1975 to coffee and wheat crops in the northwest of the state of Parana using LANDSAT images with automatic classification

Author: Dejesusparada N.
Palestino C. V. B.
Sonnenburg C. R.
Tardin A. T.

There are no author-identified significant results in this report

NASA Technical Reports Server

The Feature Importance Ranking Measure

Author: A. Graf
B. Schölkopf
B. Üstün
C. Strobl
G. Rätsch
G.R.G. Lanckriet
J. Friedman
J. Schäfer
K. Bennett
M. Laan van der
R. Tibshirani
S. Sonnenburg
Publication date: 01/01/2009

The most accurate predictions are typically obtained by learning machines with complex feature spaces (e.g., as induced by kernels). Unfortunately, such decision rules are hardly accessible to humans and cannot easily be used to gain insights about the application domain. Therefore, one often resorts to linear models in combination with variable selection, thereby sacrificing some predictive power for presumptive interpretability. Here, we introduce the Feature Importance Ranking Measure (FIRM), which, by retrospective analysis of arbitrary learning machines, makes it possible to achieve both excellent predictive performance and superior interpretation. In contrast to standard raw feature weighting, FIRM takes the underlying correlation structure of the features into account. Thereby, it is able to discover the most relevant features, even if their appearance in the training data is entirely prevented by noise. The desirable properties of FIRM are investigated analytically and illustrated in simulations. Comment: 15 pages, 3 figures. To appear in the Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 200

arXiv.org e-Print Archive

Evaluation of antigens for the serodiagnosis of kala-azar and oriental sores by means of the indirect immunofluorescence antibody test (IFAT)

Author: A. Zuckermann
E. C. Hedge
E. Mannweiler
F. Falkner v. Sonnenburg
G. Piekarski
G. Weiland
Gh. H. Endrissian
H. E. Krampitz
J. J. Shaw
J. Ranque
L. Prüfer
M. Lopez-Brea
N. Beforouz
R. S. Bray
T. I. Aljeboori
Th. Löscher
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/1981

Antigens and corresponding sera were collected from travellers with leishmaniasis returning to Germany from different endemic areas of the Old World. The antigenicity of these Leishmania strains, which were maintained in Syrian hamsters, was compared by the indirect immunofluorescence antibody test (IFAT). Antigenicity was demonstrated by antibody titres in 18 sera from 11 patients. The amastigote stages of nine strains of Leishmania donovani and four strains of Leishmania tropica were compared with each other and with the culture forms of two insect flagellates (Strigomonas oncopelti and Leptomonas ctenocephali). Eighteen sera from 11 patients were available for antibody determination with these antigens. The maximal antibody titres in a single serum varied considerably depending on which antigen was used for the test. High antibody titres could only be obtained when Leishmania donovani was employed as the antigen, but considerable differences also occurred between the different strains of this species. The other antigens were weaker. No differences in antigenicity were observed between amastigotes and promastigotes of the same Leishmania donovani strain. Selecting suitable antigens is crucial for detecting the highest possible antibody titres in the IFAT. Low titres may be of doubtful specificity and are a poor baseline for the fall in titre, which is an essential index of effective treatment.

Crossref

Open Access LMU

Federated Ensemble Regression Using Classification

Author: A Ahmad
A Ali
A Koleti
CN Silla
E Dolgin
J Mendes-Moreira
L Breiman
N Japkowicz
N Rooney
NV Chawla
OI Orhobor
PA Futreal
R Dash
R Ihaka
S Sonnenburg
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Ensemble learning has been shown to significantly improve predictive accuracy in a variety of machine learning problems. For a given predictive task, the goal of ensemble learning is to improve predictive accuracy by combining the predictive power of multiple models. In this paper, we present an ensemble learning algorithm for regression problems which leverages the distribution of the samples in a learning set to achieve improved performance. We apply the proposed algorithm to a problem in precision medicine where the goal is to predict drug perturbation effects on genes in cancer cell lines. The proposed approach significantly outperforms the base case.

Crossref

Chalmers Research

Kernel learning for ligand-based virtual screening: discovery of a new PPARγ agonist

Author: B Henke
B Schölkopf
C Rücker
Ewgenij Proschak
Gisbert Schneider
H Zettl
K Hansen
K-R Müller
M Rupp
M Schubert-Zsilavecz
Matthias Rupp
O Rau
R Steri
S Sonnenburg
T Schroeter
Publication venue: BioMed Central
Publication date: 01/01/2010

Poster presentation at the 5th German Conference on Cheminformatics: 23. CIC-Workshop, Goslar, Germany, 8-10 November 2009.
We demonstrate the theoretical and practical application of modern kernel-based machine learning methods to ligand-based virtual screening by successful prospective screening for novel agonists of the peroxisome proliferator-activated receptor gamma (PPARgamma) [1]. PPARgamma is a nuclear receptor involved in lipid and glucose metabolism, and related to type-2 diabetes and dyslipidemia. Applied methods included a graph kernel designed for molecular similarity analysis [2], kernel principal component analysis [3], multiple kernel learning [4], and Gaussian process regression [5]. In the machine learning approach to ligand-based virtual screening, one uses the similarity principle [6] to identify potentially active compounds based on their similarity to known reference ligands. Kernel-based machine learning [7] uses the "kernel trick", a systematic approach to the derivation of non-linear versions of linear algorithms like separating hyperplanes and regression. Prerequisites for kernel learning are similarity measures with the mathematical property of positive semidefiniteness (kernels). The iterative similarity optimal assignment graph kernel (ISOAK) [2] is defined directly on the annotated structure graph, and was designed specifically for the comparison of small molecules. In our virtual screening study, its use improved results, e.g., in principal component analysis-based visualization and Gaussian process regression. Following a thorough retrospective validation using a data set of 176 published PPARgamma agonists [8], we screened a vendor library for novel agonists. Subsequent testing of 15 compounds in a cell-based transactivation assay [9] yielded four active compounds. The most interesting hit, a natural product derivative with cyclobutane scaffold, is a full selective PPARgamma agonist (EC50 = 10 ± 0.2 microM, inactive on PPARalpha and PPARbeta/delta at 10 microM). We demonstrate how the interplay of several modern kernel-based machine learning approaches can successfully improve ligand-based virtual screening results.
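
The positive semidefiniteness prerequisite and the kernel principal component analysis mentioned in this abstract can be illustrated with a small sketch. It is only an illustration under stated assumptions: a Gaussian kernel on random vectors stands in for the ISOAK graph kernel on molecules (which is not reproduced here), and the function names `gaussian_kernel`, `is_psd`, and `kernel_pca` are hypothetical.

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Gaussian kernel matrix; a stand-in for a molecular graph kernel."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def is_psd(K, tol=1e-8):
    """True if K is symmetric with no eigenvalue below -tol."""
    return bool(np.allclose(K, K.T) and np.linalg.eigvalsh(K).min() >= -tol)


def kernel_pca(K, n_components=2):
    """Project the training points onto the leading kernel principal components.

    Works on the kernel matrix alone (the "kernel trick"): the matrix is
    centered in feature space, eigendecomposed, and each point's coordinate
    along component k is sqrt(lambda_k) times the k-th eigenvector entry.
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    return vecs * np.sqrt(np.maximum(vals, 0.0))


# Hypothetical descriptor vectors for 15 compounds.
rng = np.random.default_rng(1)
X = rng.normal(size=(15, 4))

K = gaussian_kernel(X)
print("valid kernel matrix:", is_psd(K))

Z = kernel_pca(K, n_components=2)            # 2-D coordinates for plotting
print("embedding shape:", Z.shape)
```

A similarity measure that fails the `is_psd` check on some data set cannot be used directly as a kernel; the two-column output of `kernel_pca` is the kind of embedding used for kernel-based visualization.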

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Hochschulschriftenserver - Universität Frankfurt am Main

Efficient Training of Graph-Regularized Multitask SVMs

Author: A. Torralba
C. Cortes
D. Bertsekas
K.R. Müller
M. Kloft
R. Fan
R.M. Rifkin
S. Sonnenburg
T. Evgeniou
T. Joachims
T.W.T.C.C. Consortium
W. Samek
Y. Xue
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and thus are inapplicable when the number of training examples n is large (they are typically limited to n < 20,000, even for just a few tasks). In this paper, we present a primal optimization criterion, allowing for general loss functions, and derive its dual representation. Building on the work of Hsieh et al. [1,2], we derive an algorithm for optimizing the large-margin objective and prove its convergence. Our computational experiments show a speedup of up to three orders of magnitude over LibSVM and SVMLight for several standard benchmarks as well as challenging data sets from the application domain of computational biology. Combining our optimization methodology with the COFFIN large-scale learning framework [3], we are able to train a multi-task SVM using over 1,000,000 training points stemming from 4 different tasks. An efficient C++ implementation of our algorithm is being made publicly available as a part of the SHOGUN machine learning toolbox [4].

Crossref

MPG.PuRe