675 research outputs found

    Active transitivity clustering of large-scale biomedical datasets

    Get PDF
    Clustering is a popular computational approach for partitioning data sets into groups of objects that share common traits. Due to recent advances in wet-lab technology, the amount of available biological data grows exponentially and increasingly poses problems of computational complexity for current clustering approaches. In this thesis, we introduce two novel approaches, TransClustMV and ActiveTransClust, that enable the handling of large-scale datasets by drastically reducing the amount of required information through the exploitation of missing values. Furthermore, there exists a plethora of different clustering tools and standards, making it very difficult for researchers to choose the correct method for a given problem. To clarify this multifarious field, we developed ClustEval, which streamlines the clustering process and enables practitioners to conduct large-scale cluster analyses in a standardized and bias-free manner. We conclude the thesis by demonstrating the power of clustering tools, and the need for the previously developed methods, through real-world analyses. We transferred the regulatory network of E. coli K-12 to pathogenic EHEC organisms based on evolutionary conservation, thereby avoiding tedious and potentially dangerous wet-lab experiments. In another example, we identify pathogenicity-specific core genomes of actinobacteria in order to identify potential drug targets.
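The core idea behind transitivity clustering, namely that pairwise similarities above a user-defined threshold should ideally partition objects into disjoint cliques, can be illustrated with a toy greedy merge heuristic. This is an illustrative sketch only, not the TransClust or TransClustMV implementation; the function and data layout are our own assumptions:

```python
# Toy sketch of the transitivity-clustering idea: merge groups whose
# average pairwise similarity exceeds a threshold, so that the resulting
# clusters approximate disjoint cliques in the thresholded similarity graph.
# Missing pairs (as exploited by TransClustMV) default to similarity 0.

def transitivity_cluster(sim, threshold):
    """sim: dict mapping frozenset({a, b}) -> similarity score."""
    objects = set()
    for pair in sim:
        objects |= pair
    clusters = [{o} for o in objects]  # start with singleton clusters

    def avg_sim(c1, c2):
        scores = [sim.get(frozenset({a, b}), 0.0) for a in c1 for b in c2]
        return sum(scores) / len(scores)

    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if avg_sim(clusters[i], clusters[j]) >= threshold:
                    clusters[i] |= clusters.pop(j)  # merge j into i
                    merged = True
                    break
            if merged:
                break
    return clusters
```

With a similarity matrix containing two tight groups, the heuristic recovers them as two clusters regardless of merge order.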

    Methodological challenges and analytic opportunities for modeling and interpreting Big Healthcare Data

    Full text link
    Abstract Managing, processing and understanding big healthcare data is challenging, costly and demanding. Without a robust fundamental theory for representation, analysis and inference, a roadmap for uniform handling and analyzing of such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic and healthcare data we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, new innovative technologies need to be developed that enhance, scale and optimize the management and processing of large, complex and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure and education will be critical to realize the huge potential of big data, to reap the expected information benefits and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable and efficient data-driven discovery and analytics. Big data will affect every sector of the economy, and its hallmark will be ‘team science’.
    http://deepblue.lib.umich.edu/bitstream/2027.42/134522/1/13742_2016_Article_117.pd

    Reliable transfer of transcriptional gene regulatory networks between taxonomically related organisms

    Get PDF
    Baumbach J, Rahmann S, Tauch A. Reliable transfer of transcriptional gene regulatory networks between taxonomically related organisms. BMC Systems Biology. 2009;3(1):8.
    Background: Transcriptional regulation of gene activity is essential for any living organism. Transcription factors recognize specific binding sites within the DNA to regulate the expression of particular target genes. The genome-scale reconstruction of the emerging regulatory networks is important for biotechnology and human medicine but cost-intensive, time-consuming, and infeasible to perform for every species separately. By using bioinformatics methods, one can partially transfer networks from well-studied model organisms to closely related species. However, the prediction quality is limited by the low level of evolutionary conservation of the transcription factor binding sites, even within organisms of the same genus.
    Results: Here we present an integrated bioinformatics workflow that assures the reliability of transferred gene regulatory networks. Our approach combines three methods that can be applied on a large scale: re-assessment of annotated binding sites, subsequent binding site prediction, and homology detection. A gene regulatory interaction is considered to be conserved if (1) the transcription factor, (2) the adjusted binding site, and (3) the target gene are conserved. The power of the approach is demonstrated by transferring gene regulations from the model organism Corynebacterium glutamicum to the human pathogens C. diphtheriae, C. jeikeium, and the biotechnologically relevant C. efficiens. For these three organisms we identified reliable transcriptional regulations for ~40% of the common transcription factors, compared to ~5% for which knowledge was available before.
    Conclusion: Our results suggest that trustworthy genome-scale transfer of gene regulatory networks between organisms is feasible in general but still limited by the level of evolutionary conservation.
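The three conservation criteria above translate naturally into a simple filter over candidate regulations. The sketch below is illustrative only; the ortholog map, the site-conservation test, and all identifiers are hypothetical placeholders rather than the paper's actual pipeline:

```python
# Illustrative filter for the three conservation criteria: a regulation
# (tf, site, target) is transferred only if the transcription factor,
# the (adjusted) binding site, and the target gene are all conserved.
# The identifiers below are hypothetical, not real locus tags.

def transfer_network(interactions, orthologs, site_conserved):
    """interactions: list of (tf, site, target) in the model organism.
    orthologs: dict mapping model-organism genes to target-organism genes.
    site_conserved: predicate testing binding-site conservation."""
    transferred = []
    for tf, site, target in interactions:
        if tf in orthologs and target in orthologs and site_conserved(site):
            transferred.append((orthologs[tf], site, orthologs[target]))
    return transferred
```

A regulation whose transcription factor lacks an ortholog, or whose binding site fails the conservation test, is simply dropped rather than transferred with low confidence.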

    MULTIVARIATE MODELING OF COGNITIVE PERFORMANCE AND CATEGORICAL PERCEPTION FROM NEUROIMAGING DATA

    Get PDF
    State-of-the-art cognitive neuroscience mainly uses hypothesis-driven statistical testing to characterize and model neural disorders and diseases. While such techniques have proven powerful in understanding diseases and disorders, they are inadequate for explaining causal relationships as well as individuality and variation. In this study, we proposed multivariate data-driven approaches for predictive modeling of cognitive events and disorders. We developed network descriptions of both structural and functional connectivity that are critical in multivariate modeling of cognitive performance (i.e., fluency, attention, and working memory) and categorical perception (i.e., emotion, speech perception). We also performed dynamic network analysis on brain connectivity measures to determine the role of different functional areas in relation to categorical perception and cognitive events. Our empirical studies of structural connectivity were performed using Diffusion Tensor Imaging (DTI). The main objective was to discover the role of structural connectivity in selecting clinically interpretable features that are consistent over a large range of model parameters in classifying cognitive performance in relation to Acute Lymphoblastic Leukemia (ALL). The proposed approach substantially improved accuracy (13%-26%) over existing models and also selected a relevant, small subset of features that were verified by domain experts. In summary, the proposed approach produced interpretable models with better generalization.
    Functional connectivity refers to similar patterns of activation in different brain regions regardless of the apparent physical connectedness of the regions. The proposed data-driven approach to source-localized electroencephalogram (EEG) data includes an array of tools such as graph mining, feature selection, and multivariate analysis to determine the functional connectivity in categorical perception. We used the network description to correctly classify listeners' behavioral responses with an accuracy of over 92% on 35 participants. State-of-the-art network descriptions of the human brain assume static connectivity. However, brain networks in relation to perception and cognition are complex and dynamic. Analysis of transient functional networks with spatiotemporal variations to understand cognitive functions remains challenging. One of the critical missing links is the lack of sophisticated methodologies for understanding dynamic neural activity patterns. We proposed a clustering-based complex dynamic network analysis of source-localized EEG data to understand the commonalities and differences in gender-specific emotion processing. We also adopted a Bayesian nonparametric framework for segmenting neural activity into a finite number of microstates. This approach enabled us to find the default network and transient patterns of the underlying neural mechanism in relation to categorical perception. In summary, the multivariate and dynamic network analysis methods developed in this dissertation to analyze structural and functional connectivity will have a far-reaching impact on computational neuroscience, helping to identify meaningful changes in spatiotemporal brain activity.
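A minimal version of building a functional connectivity network from multichannel recordings is a thresholded correlation graph: channels whose signals co-vary strongly are linked. This sketch assumes NumPy and uses plain Pearson correlation in place of the dissertation's source-localized EEG pipeline; it is illustrative only:

```python
import numpy as np

def functional_connectivity(eeg, threshold=0.7):
    """eeg: array of shape (channels, samples).
    Returns a binary adjacency matrix linking channel pairs whose
    absolute Pearson correlation meets the threshold (a crude stand-in
    for more elaborate connectivity estimators)."""
    corr = np.corrcoef(eeg)                      # channel-by-channel correlations
    adj = (np.abs(corr) >= threshold).astype(int)
    np.fill_diagonal(adj, 0)                     # no self-loops
    return adj
```

The resulting adjacency matrix can then be fed into standard graph-mining tools (degree, clustering coefficient, community detection) as features for a downstream classifier.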

    Brain connectivity analysis from EEG signals using stable phase-synchronized states during face perception tasks

    Get PDF
    This is the author accepted manuscript. The final version is available from Elsevier via the DOI in this record.
    The degree of phase synchronization between different electroencephalogram (EEG) channels is known to be a manifestation of the underlying mechanism of information coupling between different brain regions. In this paper, we apply a continuous wavelet transform (CWT) based analysis technique to EEG data, captured during face perception tasks, to explore the temporal evolution of phase synchronization from the onset of a stimulus. Our explorations show that there exists a small set (typically 3-5) of unique synchronized patterns or synchrostates, each of which is stable on the order of milliseconds. In particular, in the beta (ÎČ) band, which has been reported to be associated with visual processing tasks, the number of such stable states has consistently been found to be three. During processing of the stimulus, the switching between these states occurs abruptly, but the switching characteristic follows a well-behaved and repeatable sequence. This is observed in a single-subject analysis as well as a multiple-subject group analysis in adults during face perception. We also show that although these patterns remain topographically similar for the general category of face perception tasks, the sequence of their occurrence and their temporal stability vary markedly between different face perception scenarios (stimuli), indicating different dynamical characteristics of information processing that are stimulus-specific in nature. Subsequently, we translated these stable states into brain complex networks and derived informative network measures for characterizing the degree of segregated processing and information integration in those synchrostates, leading to a new methodology for characterizing information processing in the human brain.
    The proposed methodology of modeling functional brain connectivity through synchrostates may be viewed as a new way of quantitatively characterizing the cognitive ability of the subject, the stimuli, and information integration/segregation capability.
    The work presented in this paper was supported by the FP7 EU funded MICHELANGELO project, Grant Agreement #288241. Website: www.michelangelo-project.eu/
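The synchrostate idea, grouping time points by their pairwise phase-difference topography, can be sketched with plain k-means over circularly embedded phase differences. The following assumes NumPy, uses a simple deterministic initialization, and stands in for the paper's CWT-based pipeline rather than reproducing it:

```python
import numpy as np

def synchrostates(phases, k, iters=20):
    """phases: (channels, samples) instantaneous phases, e.g. extracted
    from a CWT at one frequency band. Clusters samples into k phase-
    topography states (a plain k-means sketch; the published pipeline
    is considerably more elaborate)."""
    ch, n = phases.shape
    i, j = np.triu_indices(ch, 1)
    diff = phases[i] - phases[j]                 # pairwise phase gaps, (pairs, samples)
    # embed angles on the unit circle so that -pi and +pi coincide
    feats = np.concatenate([np.cos(diff), np.sin(diff)]).T  # (samples, 2*pairs)
    # deterministic initialization: centers spread evenly over time
    centers = feats[np.linspace(0, n - 1, k).astype(int)].copy()
    labels = np.zeros(n, dtype=int)
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)            # assign each sample to nearest state
        for c in range(k):
            if np.any(labels == c):
                centers[c] = feats[labels == c].mean(axis=0)
    return labels
```

Runs of identical labels correspond to the millisecond-scale stable states described above, and label transitions mark the abrupt switching between them.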
    • 

    corecore