34 research outputs found

Online Transitivity Clustering of Biological Data with Missing Values

Author: Baumbach Jan
Kreutzer Christoph
Vu Thuy Duong
Wittkop Tobias
Publication venue: OASIcs - OpenAccess Series in Informatics. German Conference on Bioinformatics 2012
Publication date: 01/01/2012

Dagstuhl Research Online Publication Server

MPG.PuRe

Large scale clustering of protein sequences with FORCE - a layout based heuristic for weighted cluster editing

Author: Baumbach Jan
Lobo Francisco P
Rahmann Sven
Wittkop Tobias
Publication venue: BioMed Central
Publication date: 01/01/2007

Wittkop T, Baumbach J, Lobo FP, Rahmann S. Large scale clustering of protein sequences with FORCE - a layout based heuristic for weighted cluster editing. BMC Bioinformatics. 2007;8(1):396. Background: Detecting groups of functionally related proteins from their amino acid sequence alone has been a long-standing challenge in computational genome research. Several clustering approaches, following different strategies, have been published to attack this problem. Today, new sequencing technologies provide huge amounts of sequence data that has to be efficiently clustered with constant or increased accuracy, at increased speed. Results: We advocate that the model of weighted cluster editing, also known as transitive graph projection, is well-suited to protein clustering. We present the FORCE heuristic, which is based on transitive graph projection and clusters arbitrary sets of objects, given pairwise similarity measures. In particular, we apply FORCE to the problem of protein clustering and show that it outperforms the most popular existing clustering tools (Spectral clustering, TribeMCL, GeneRAGE, Hierarchical clustering, and Affinity Propagation). Furthermore, we show that FORCE is able to handle huge datasets by calculating clusters for all 192 187 prokaryotic protein sequences (66 organisms) obtained from the COG database. Finally, FORCE is integrated into the corynebacterial reference database CoryneRegNet. Conclusion: FORCE is an applicable alternative to existing clustering algorithms. Its theoretical foundation, weighted cluster editing, can outperform other clustering paradigms on protein homology clustering. FORCE is open source and implemented in Java. The software, including the source code, the clustering results for COG and CoryneRegNet, and all evaluation datasets are available at http://gi.cebitec.uni-bielefeld.de/comet/force/

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Publications at Bielefeld University

Genetic Correction of Huntington's Disease Phenotypes in Induced Pluripotent Stem Cells

Author: An Mahru C.
Ellerby Lisa M.
Melov Simon
Montoro Daniel
Mooney Sean
Scott Gary
Wittkop Tobias
Zhang Ningzhe
Publication venue: Elsevier Inc.
Publication date: 03/08/2012

Summary: Huntington's disease (HD) is caused by a CAG expansion in the huntingtin gene. Expansion of the polyglutamine tract in the huntingtin protein results in massive cell death in the striatum of HD patients. We report that human induced pluripotent stem cells (iPSCs) derived from HD patient fibroblasts can be corrected by the replacement of the expanded CAG repeat with a normal repeat using homologous recombination, and that the correction persists in iPSC differentiation into DARPP-32-positive neurons in vitro and in vivo. Further, correction of the HD-iPSCs normalized pathogenic HD signaling pathways (cadherin, TGF-β, BDNF, and caspase activation) and reversed disease phenotypes such as susceptibility to cell death and altered mitochondrial bioenergetics in neural stem cells. The ability to make patient-specific, genetically corrected iPSCs from HD patients will provide relevant disease models in identical genetic backgrounds and is a critical step for the eventual use of these cells in cell replacement therapy.

Elsevier - Publisher Connector

PubMed Central

STOP using just GO: a multi-ontology hypothesis generation tool for high throughput experimentation

Author: Ari E Berman
Corey Powell
Emily TerAvest
K Fleisch
Nigam H Shah
Sean D Mooney
Tobias Wittkop
Uday S Evani
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Crossref

Springer - Publisher Connector

clusterMaker: a multi-algorithm clustering plugin for Cytoscape

Background: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section.
Conclusions: The Cytoscape plugin clusterMaker provides a number of clustering algorithms and visualizations that can be used independently or in combination for analysis and visualization of biological data sets, and for confirming or generating hypotheses about biological function. Several of these visualizations and algorithms are only available to Cytoscape users through the clusterMaker plugin. clusterMaker is available via the Cytoscape plugin manager.

University of Toronto Research Repository

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

MPG.PuRe

Deep Blue Documents at the University of Michigan

Geographic and temporal trends in the molecular epidemiology and genetic mechanisms of transmitted HIV-1 drug resistance: an individual-patient- and sequence-level meta-analysis

Author: A Lindstrom
AC Pineda-Pena
AF Aghokeng
Africa Holguin
AJ Drummond
Alaka K. Deshpande
Amilcar Tanuri
AN Phillips
Andrew Carr
Anne Derache
Anne-Mieke Vandamme
Avelin F. Aghokeng
BH Chi
Bum Sik Chin
CB Holmes
Charles A. B. Boucher
Chunfu Yang
Cillian F. De Gascun
Cyrille F. Djoko
D Frentz
Davey M. Smith
David A. M. C. van de Vijver
David Katzenstein
DE Bennett
Deogratius Ssemwanga
Diane Descamps
DR Kuritzkes
E Andersson
E Paradis
F Tanser
G Zhang
Gillian M. Hunt
GM Ellis
Gonzalo Yebra
GQ Lee
Gustavo Reyes-Terán
H Castro
Hermann Bussmann
Herve Fleury
Hiroshi Ichimura
Hong-Ha M. Truong
Irja Lutsar
J Bor
James I. Brooks
Jan Albert
Jerome H. Kim
JG Garcia-Lerma
JH McMahon
John P. A. Ioannidis
Jonathan Taylor
Jose Luis Blanco
Junko Hattori
JW Eaton
JZ Li
K Borroto-Esoda
K Mollan
Kee Peng NG
KM Stadeli
Kok Keng Tee
L Wittkop
Lynn Morris
M Ragonnet-Cronin
Maja Stanojevic
Manon L. Ragonnet-Cronin
Marcelo A. Soares
Mariane A. Stefani
Marie-Laure Chaix
Mario Poljak
Martine Peeters
Matt A. Price
MH Chung
Michael P. Busch
Michael R. Jordan
Morgane Rolland
MW Tang
N Sluis-Cremer
Nicaise Ndembi
Nicole Vidal
P. Richard Harrigan
Pascal O. Bessong
PC Lambert
Philippe Lemey
Pierre Frange
Pontiano Kaleebu
Radko Avi
Ramesh S. Paranjape
Raph L. Hamers
RD Kouyos
RE Barth
RJ Gifford
RK Gupta
RL Hamers
Robert W. Shafer
Rongge Yang
S Hue
S Yerly
Santiago Avila-Rios
Sasisopin Kiertiburanakul
Seiichiro Fujisaki
Silvia Bertagnolio
Somnuek Sungkanuparph
Soo-Yon Rhee
Sunee Sirivichayakul
Sung Soon Kim
Susan H. Eshleman
SY Rhee
Tobias F. Rinke de Wit
Toni T. D’Aquin
V Cambiano
V Jain
Vici Varghese
Vincent V. Soriano
Wataru Sugiura
Yanpeng Li
ZA Antoniadou
Zabrina L. Brumme
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2015

Regional and subtype-specific mutational patterns of HIV-1 transmitted drug resistance (TDR) are essential for informing first-line antiretroviral (ARV) therapy guidelines and designing diagnostic assays for use in regions where standard genotypic resistance testing is not affordable. We sought to understand the molecular epidemiology of TDR and to identify the HIV-1 drug-resistance mutations responsible for TDR in different regions and virus subtypes.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Repositório da Universidade Nova de Lisboa

Edinburgh Research Explorer

Spiral - Imperial College Digital Repository

Erasmus University Digital Repository

Lirias

Public Library of Science (PLOS)

Crossref

LSHTM Research Online

PubMed Central

EUR Research Repository

eScholarship - University of California

Repository of the University of Ljubljana

Clustering biological data by unraveling hidden transitive substructures (Clusterung von biologischen Daten durch Aufdecken verborgener transitiver Strukturen)

Author: Wittkop Tobias
Publication venue: Bielefeld University
Publication date: 01/01/2010

Wittkop T. Clustering biological data by unraveling hidden transitive substructures. Bielefeld (Germany): Bielefeld University; 2010. Clustering is a computational technique for the assignment of objects into groups of similar elements. Generally, it is widely used for business data interpretation, natural language analyses, and image processing, just to name a few. Typical bioinformatic applications are: (1) detection of homologous proteins; single and multi domain, (2) prediction of protein complexes in protein-protein interaction networks, (3) identification of overrepresented DNA sequence patterns, and (4) gene co-expression studies. Traditionally, we distinguish between partitional, overlapping, and hierarchical approaches. Partitional and overlapping approaches follow two different strategies: (1) center-based approaches for the detection of appropriate cluster representatives, such as k-means and (2) methods for the identification of homogeneous clusters, such as Markov Clustering. Hierarchical approaches allow for the construction of a tree structure; single linkage agglomerative clustering may serve as an example here. Solving the following problems is crucial for a successful cluster analysis: (1) Probably most challenging is the identification of a problem-specific similarity function. (2) Every clustering approach incorporates at least one parameter that influences the size and number of the clusters. Determining such a density parameter strongly depends on the problem and the chosen similarity function. Preferably, one can even prove certain attributes of a clustering result, given a similarity function and the density parameter. (3) Currently, high throughput experiments produce huge amounts of data. Hence, a clustering environment has to be capable of processing hundreds of thousands of data objects. (4) The integration of existing knowledge into a cluster analysis is highly valuable for improving the clustering output.
The integration of known assignments may serve as an example here. (5) It is clear that the method needs to be robust against noise and outliers. (6) From an end-user's point of view, integration with standard software, appropriate visualization capabilities, and easy-to-use evaluation methods are highly beneficial. This thesis introduces Transitivity Clustering (TC) and its accompanying software framework TransClust, a method which addresses all of the aforementioned problems. It is a homogeneous partitioning method based on Weighted Transitive Graph Projection (WTGP), which aims to unravel hidden transitive substructures in a given similarity graph deduced from a pairwise similarity measure. TC solves the aforementioned problems (2-5). The software implementation TransClust is an easy-to-use standalone and online application that solves the problems mentioned in (1,6). Furthermore, in TC, the density parameter can be chosen intuitively and the underlying weighted transitive graph projection model allows certain criteria of the clustering results to be proven. In addition, the model has been extended in order to allow for the following advanced features: (1) The integration of existing knowledge, for instance, by means of upper and lower bounds, (2) the computation of a hierarchical clustering, and (3) the calculation of overlapping clusterings. These extensions widen the applicability of TC and provide features that distinguish TC from other bioinformatics alternatives. The flexibility of TC makes it suitable for various real-world applications. In this work, we concentrate on protein sequence clustering and the detection of protein complexes in protein-protein interaction networks, showing that TC outperforms the most commonly used bioinformatics clustering techniques.
The software implementation of TC, TransClust, is available online at http://transclust.cebitec.uni-bielefeld.de as a web application, as a standalone tool, and as a plugin for the standard network analysis tool Cytoscape. It provides results of similar or superior accuracy to those of alternative approaches. It is unique in that it features an easy-to-use clustering environment that contributes to all the important steps in a cluster analysis: (1) the choice and evaluation of a meaningful similarity function, (2) the detection of an appropriate density parameter, (3) the efficient computation of a clustering, and (4) the interpretation and evaluation of the clustering results.

Publications at Bielefeld University

More than shots in the dark: driving vaccine efficacy in cirrhosis.

Author: Boettler Tobias
Wittkop Linda
Publication venue: BMJ Publishing Group
Publication date: 07/06/2023

INRIA a CCSD electronic archive server

Oskar Bordeaux

INTRODUCTION: ADVANCES IN COMPUTATIONAL SYSTEMS BIOINFORMATICS

Author: SEAN D. MOONEY
TOBIAS WITTKOP
Publication venue: World Scientific Pub Co Pte Lt
Publication date

Crossref