
8 research outputs found

Free Energy Estimates of All-Atom Protein Structures Using Generalized Belief Propagation

Author: Aji S.M.
Christopher J. Langmead
Eric P. Xing
Hetunandan Kamisetty
Liu Y.
Yanover C.
Yedidia J.S.
Publication venue: 'Mary Ann Liebert Inc'

Segmentation Conditional Random Fields (SCRFs): A New Approach for Protein Fold Recognition

Author: Jaime G. Carbonell (5361842)
Peter Weigele (5430035)
Vanathi Gopalakrishnan (5415902)
Yan Liu (5411249)
Publication date: 30/06/2018

Protein fold recognition is an important step towards understanding protein three-dimensional structures and their functions. A conditional graphical model, i.e., segmentation conditional random fields (SCRFs), is proposed as an effective solution to this problem. In contrast to traditional graphical models, such as the hidden Markov model (HMM), SCRFs follow a discriminative approach. Therefore, they are flexible enough to include any features in the model, such as overlapping or long-range interaction features over the whole sequence. The model also employs a convex optimization function, which results in globally optimal solutions to the model parameters. On the other hand, the segmentation setting in SCRFs makes their graphical structures intuitively similar to the protein 3-D structures and, more importantly, provides a framework to model the long-range interactions between secondary structures directly. Our model is applied to predict the parallel β-helix fold, an important fold in bacterial pathogenesis and carbohydrate binding/cleavage. The cross-family validation shows that SCRFs not only can score all known β-helices higher than non-β-helices in the Protein Data Bank (PDB), but also accurately locate rungs in known β-helix proteins. Our method outperforms BetaWrap, a state-of-the-art algorithm for predicting β-helix folds, and HMMER, a general motif detection algorithm based on HMM, and has the additional advantage of general application to other protein folds. Applying our prediction model to the Uniprot Database, we identify previously unknown potential β-helices.

Segmentation Conditional Random Fields (SCRFs): A New Approach for Protein Fold Recognition

Author: A. Murzin
C. Bystroff
C. Orengo
D. Jones
H. Berman
J. Kreisberg
J. Lafferty
J. Thompson
K. Karplus
M. Yoder
M. Yoder
M.I. Jordan
R. Durbin
R. Leinonen
R. Steward
S. Altschul
Y. Liu
Publication venue: ACM Press
Publication date: 01/01/2005

Protein fold recognition is an important step towards understanding protein three-dimensional structures and their functions. A conditional graphical model, i.e., segmentation conditional random fields (SCRFs), is proposed to solve the problem. In contrast to traditional graphical models such as the hidden Markov model (HMM), SCRFs follow a discriminative approach. They have the flexibility to include overlapping or long-range interaction features over the whole sequence, as well as globally optimal solutions for the parameters. On the other hand, the segmentation setting in SCRFs makes their graphical structures intuitively similar to the protein 3-D structures and, more importantly, provides a framework to model the long-range interactions directly.

CiteSeerX

Crossref

Learning, Probability and Logic: Toward a Unified Approach for Content-Based Music Information Retrieval

Author: Al Farabi
Anglade
Anglade
Anglade
Anglade
Arabi
Aucouturier
Bartsch
Bello
Bello
Bengio
Bergmann
Besold
Blockeel
Boulanger-Lewandowski
Burgoyne
Burgoyne
Böck
Casey
Cella
Cho
Crane
d'Avila Garcez
Dannenberg
Davis
De Raedt
De Raedt
De Raedt
De Raedt
De Raedt
Deng
Deng
Dobrian
Domingos
Donadello
Donadello
Dovey
Downie
Ellis
Ellis
Ellis
Flach
Foote
Friedman
Fujishima
Gaudefroy
Getoor
Getoor
Grosche
Gurevych
Haack
Hamel
Harte
Herremans
Humphrey
Humphrey
Humphrey
Jain
Jain
Jernite
Kameoka
Kempf
Kernfeld
Kersting
Kersting
Kim
Kimmig
Kindermann
Kok
Kok
Koller
Koops
Korzeniowski
Korzeniowski
Korzeniowski
Korzeniowski
Krumhansl
Kuzelka
Lafferty
Lee
Leivant
Lew
Lewin
Liu
Lostanlen
Malkin
Mallat
Mallory
Maresz
Marsík
Mauch
Mauch
McFee
McVicar
Mihalkova
Minsky
Mishkin
Morales
Morales
Muggleton
Muller
Murphy
Müller
Müller
Ni
Nilsson
Ojima
Orio
Oudre
Pachet
Paiement
Pan
Papadopoulos
Papadopoulos
Papadopoulos
Papadopoulos
Papadopoulos
Papai
Paulus
Pauwels
Pawar
Pearl
Pereira
Poole
Poon
Poon
Poon
Prince
Pápai
Raedt
Rameau
Ramirez
Ramirez
Repetto
Richardson
Richardson
Riedel
Riemann
Russell
Salamon
Sarkhel
Schedl
Schedl
Schoenberg
Schuller
Serrà
Serrá
Sheh
Shenoy
Sigtia
Singla
Smith
Snidaro
Socher
Srinivasamurthy
Sutton
Sztyler
Thimm
Tsushima
Van Baelen
Van Haaren
Venugopal
Wang
Widmer
Wu
Zalkow
Zhou
Šourek
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2019

Within the last 15 years, the field of Music Information Retrieval (MIR) has made tremendous progress in the development of algorithms for organizing and analyzing the ever-increasing large and varied amount of music and music-related data available digitally. However, the development of content-based methods to enable or ameliorate multimedia retrieval still remains a central challenge. In this perspective paper, we critically look at the problem of automatic chord estimation from audio recordings as a case study of content-based algorithms, and point out several bottlenecks in current approaches: expressiveness and flexibility are obtained at the expense of robustness and vice versa; available multimodal sources of information are little exploited; modeling multi-faceted and strongly interrelated musical information is limited with current architectures; models are typically restricted to short-term analysis that does not account for the hierarchical temporal structure of musical signals. Dealing with music data requires the ability to tackle both uncertainty and complex relational structure at multiple levels of representation. Traditional approaches have generally treated these two aspects separately, with probability and learning being the usual way to represent uncertainty in knowledge, and logical representation being the usual way to represent knowledge and complex relational information. We advocate that the identified hurdles of current approaches could be overcome by recent developments in the area of Statistical Relational Artificial Intelligence (StarAI) that unifies probability, logic and (deep) learning. We show that existing approaches used in MIR find powerful extensions and unifications in StarAI, and we explain why we think it is time to consider the new perspectives offered by this promising research field.

HAL-CentraleSupelec

Crossref

Directory of Open Access Journals

HAL-Rennes 1

Computational approaches to modeling the conserved structural core among distantly homologous proteins

Author: Menke Matthew Ewald, 1978-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2009

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 95-103). Modern techniques in biology have produced sequence data for huge quantities of proteins, and 3-D structural information for a much smaller number of proteins. We introduce several algorithms that make use of the limited available structural information to classify and annotate proteins with structures that are unknown, but similar to solved structures. The first algorithm is actually a tool for better understanding solved structures themselves. Namely, we introduce the multiple alignment algorithm Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments. Matt temporarily allows small translations and rotations to bring sets of fragments into closer alignment than physically possible under rigid body transformation. The second algorithm, BetaWrapPro, is designed to recognize sequences of unknown structure that belong to specific all-beta fold classes. BetaWrapPro employs a "wrapping" algorithm that uses long-distance pairwise residue preferences to recognize sequences belonging to the beta-helix and the beta-trefoil classes. It uses hand-curated beta-strand templates based on solved structures. Finally, SMURF (Structural Motifs Using Random Fields) combines ideas from both these algorithms into a general method to recognize beta-structural motifs using both sequence information and long-distance pairwise correlations involved in beta-sheet formation. For any beta-structural fold, SMURF uses Matt to automatically construct a template from an alignment of solved 3-D structures. From this template, SMURF constructs a Markov random field that combines a profile hidden Markov model together with pairwise residue preferences of the type introduced by BetaWrapPro. The efficacy of SMURF is demonstrated on three beta-propeller fold classes. By Matthew Ewald Menke. Ph.D.

CiteSeerX

DSpace@MIT

Probabilistic Graphical Models and Algorithms for

Author: Jiao Feng
Publication venue: 'University of Waterloo'
Publication date: 01/01/2008

In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein structure prediction and protein structure alignment. First, in the machine learning work, I focus on a special kind of graphical model: conditional random fields (CRFs). Here, I present a new semi-supervised training procedure for CRFs that can be used to train sequence segmentors and labellers from a combination of labeled and unlabeled training data. Such learning algorithms can be applied to protein and gene name entity recognition problems. This work provides one of the first semi-supervised discriminative training methods for structured classification. Second, in my computational biology work, I focus mainly on protein problems. In particular, I first propose a tree decomposition method for solving the protein structure prediction and protein structure alignment problems. In so doing, I reveal why tree decomposition is a good method for many protein problems. Then, I propose a computational framework for detection of similar structures of a target protein with sparse NMR data, which can help to predict protein structure using experimental data. Finally, I propose a new machine learning approach, LS_Boost, to solve the protein fold recognition problem, which is one of the key steps in protein structure prediction. After a thorough comparison, the algorithm is shown to be both more accurate and more efficient than the traditional z-Score method and other machine learning methods.

University of Waterloo's Institutional Repository

Probabilistic Graphical Models for Human Interaction Analysis

Author: Zhang Dong
Publication venue: IDIAP
Publication date: 11/02/2010

The objective of this thesis is to develop probabilistic graphical models for analyzing human interaction in meetings based on multimodal cues. We use meetings as a case study of human interactions since research shows that high complexity information is mostly exchanged through face-to-face interactions. Modeling human interaction provides several challenging research issues for the machine learning community. In meetings, each participant is a multimodal data stream. Modeling human interaction involves simultaneous recording and analysis of multiple multimodal streams. These streams may be asynchronous, have different frame rates, exhibit different stationarity properties, and carry complementary (or correlated) information. In this thesis, we developed three probabilistic graphical models for human interaction analysis. The proposed models use the "probabilistic graphical model" formalism, a formalism that exploits the conjoined capabilities of graph theory and probability theory to build complex models out of simpler pieces. We first introduce the multi-layer framework, in which the first layer models typical individual activity from low-level audio-visual features, and the second layer models the interactions. The two layers are linked by a set of posterior probability-based features. Next, we describe the team-player influence model, which learns the influence of interacting Markov chains within a team. The team-player influence model has a two-level structure: individual-level and group-level. The individual level models the actions of each player, and the group level models the actions of the team as a whole. The influence of each player on the team is jointly learned with the rest of the model parameters in a principled manner using the Expectation-Maximization (EM) algorithm. Finally, we describe the semi-supervised adapted HMMs for unusual event detection. Unusual events are characterized by a number of features (rarity, unexpectedness, and relevance) that limit the application of traditional supervised model-based approaches. We propose a semi-supervised adapted Hidden Markov Model (HMM) framework, in which usual event models are first learned from a large amount of (commonly available) training data, while unusual event models are learned by Bayesian adaptation in an unsupervised manner.

Infoscience - École polytechnique fédérale de Lausanne

Adaptive Methods for Robust Document Image Understanding

Author: Konya Iuliu
Publication venue: Universitäts- und Landesbibliothek Bonn

A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document types available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typeset prints, a theoretically optimal solution for the document binarization problem from both the computational complexity and the threshold selection point of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.

bonndoc – Der Publikationsserver der Universität Bonn