
20 research outputs found

Hidden Markov Model Variants and their Application

Author: A Katchalsky
C Kittel
E Jaynes
G Jumarie
I Prigogine
L Sklar
P Bak
R Durbin
S Winters-Hilt
SM Bezrukov
Stephen Winters-Hilt
Y Sinai
Publication venue: BioMed Central
Publication date: 01/01/2006

Markov statistical methods may make it possible to develop an unsupervised learning process that can automatically identify genomic structure in prokaryotes in a comprehensive way. This approach is based on mutual information, probabilistic measures, hidden Markov models, and other purely statistical inputs, and it provides a uniquely common ground for comparative prokaryotic genomics. The approach is by its nature an ongoing effort: a multi-pass learning process in which each round is more informed than the last, allowing a shift at each iteration to the more powerful methods available for supervised learning. It is envisaged that this "bootstrap" learning process will also be useful as a knowledge discovery tool. For such an ab initio prokaryotic gene-finder to work, however, it needs a mechanism to identify critical motif structure, such as that around the start of coding or the start of transcription (and then, hopefully, more). For eukaryotes, even with better start-of-coding identification, parsing of eukaryotic coding regions by the HMM is still limited by the HMM's single-gene assumption, as evidenced by the poor performance in alternatively spliced regions. To address these complications, an approach is described that expands the states in a eukaryotic gene-predictor HMM to operate with two layers of DNA parsing. This extension from the single-layer gene-prediction parse is indicated by preliminary analysis of the C. elegans alt-splice statistics. State profiles make use of a novel hash-interpolating MM (hIMM) method. A new implementation for an HMM-with-Duration is also described, with far-reaching application to gene-structure identification and analysis of channel current blockade data.
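As a rough illustration of the "mutual information" input mentioned in the abstract above, the following minimal sketch estimates the mutual information (in bits) between symbols separated by a fixed gap in a sequence. The function name `mutual_information`, the `gap` parameter, and the plug-in frequency estimator are assumptions made for this example; the abstract does not specify how the authors compute or use the measure.

```python
from collections import Counter
from math import log2


def mutual_information(seq, gap):
    """Plug-in estimate (in bits) of the mutual information between
    seq[i] and seq[i + gap], using empirical pair frequencies.
    Hypothetical helper for illustration only."""
    pairs = [(seq[i], seq[i + gap]) for i in range(len(seq) - gap)]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # counts of (left, right) pairs
    left = Counter(a for a, _ in pairs)    # marginal counts, left symbol
    right = Counter(b for _, b in pairs)   # marginal counts, right symbol
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) == count * n / (left[a] * right[b])
        mi += p_ab * log2(count * n / (left[a] * right[b]))
    return mi
```

On a perfectly periodic toy sequence such as `"ACGT" * 50`, a gap equal to the period makes each symbol fully predictive of its partner, so the estimate reaches its maximum for a four-letter alphabet; on a constant sequence it is zero.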

Crossref

Springer - Publisher Connector

PubMed Central

Proceedings of the Fourth Annual Conference of the MidSouth Computational Biology and Bioinformatics Society

Author: Bridges Susan
Gusev Yuriy
Loganantharaj Raja
Wilkins Dawn
Winters-Hilt Stephen
Wren Jonathan D
Publication venue: BioMed Central
Publication date: 01/01/2007


Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Proceedings of the Third Annual Conference of the MidSouth Computational Biology and Bioinformatics Society

Author: A Kel
A Ptitsyn
Andrey Ptitsyn
H Fang
H Hong
H Sun
JD Wren
Jonathan D Wren
L Guo
L Shi
LA Nahum
N Mei
NR Garge
Q Xie
R Delongchamp
R Loganantharaj
RL Frank
RR Delongchamp
RT Iqbal
S Winters-Hilt
SF Jennings
Stephen Winters-Hilt
T Chen
T Han
TG Smolinski
V Nagarajan
V Thodima
Y Ding
Yuriy Gusev
Z Xu
Publication venue: BioMed Central
Publication date: 01/01/2006

Crossref

Springer - Publisher Connector

PubMed Central

Analysis of Nanopore Detector Measurements using Machine Learning Methods, with Application to Single-Molecule Kinetics

Author: Landry Matthew
Publication venue: ScholarWorks@UNO
Publication date: 18/05/2007

At its core, a nanopore detector has a nanometer-scale biological membrane across which a voltage is applied. The voltage draws a DNA molecule into an α-hemolysin channel in the membrane. Consequently, a distinctive channel current blockade signal is created as the molecule flexes and interacts with the channel. This flexing of the molecule is characterized by different blockade levels in the channel current signal. Previous experiments have shown that a nanopore detector is sensitive enough that nearly identical DNA molecules can be classified successfully using machine learning techniques such as Hidden Markov Models and Support Vector Machines in a channel-current-based signal analysis platform [4-9]. In this paper, methods for improving feature extraction are presented, both to improve classification and to provide biologists and chemists with a better understanding of the physical properties of a given molecule.

University of New Orleans

Duration learning for analysis of nanopore ionic current blockades

Author: Alexander Churbanov
B Walker
C Mitchell
Carl Baribault
I Miklós
J Bilmes
J Gouaux
J Kasianowicz
J Mathé
L Rabiner
L Song
M Akeson
R Durbin
S Bhakdi
S Winters-Hilt
Stephen Winters-Hilt
V DeGuzman
W Vercoutere
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Ionic current blockade signal processing, for use in nanopore detection, offers a promising new way to analyze single-molecule properties, with potential implications for DNA sequencing. The alpha-Hemolysin transmembrane channel interacts with a translocating molecule in a nontrivial way, frequently evidenced by a complex ionic flow blockade pattern. Typically, recorded current blockade signals have several levels of blockade, with various durations, all obeying a fixed statistical profile for a given molecule. Hidden Markov Model (HMM) based duration learning experiments on artificial two-level Gaussian blockade signals helped us to identify a proper modeling framework. We then apply our framework to the real multi-level DNA hairpin blockade signal. Results: The identified upper-level blockade state is observed with durations that are geometrically distributed (consistent with a physical decay process for remaining in any given state). We show that a mixture of convolution chains of geometrically distributed states is better for representing multimodal, long-tailed duration phenomena. Based on learned HMM profiles we are able to classify 9 base-pair DNA hairpins with accuracy up to 99.5% on signals from same-day experiments. Conclusion: We have demonstrated several implementations for de novo estimation of the duration distribution probability density function within the HMM framework and applied our model topology to the real data. The proposed design could be useful in molecular analysis based on the nanopore current blockade signal.
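To make the duration argument in the abstract above concrete, here is a minimal sketch, under stated assumptions, of why a single HMM state yields geometric durations and how chaining states changes the shape. A state with self-transition probability `stay_prob` is occupied for d steps with probability stay_prob^(d-1) * (1 - stay_prob); passing through several such states in series convolves these distributions, and a mixture is a weighted sum of such chains. The function names and the choice of identical `stay_prob` for every state in a chain are assumptions for illustration, not the authors' model topology.

```python
def geometric_pmf(stay_prob, max_len):
    """pmf[d] = P(duration == d) for one state with self-transition
    probability stay_prob; index 0 is unused (a duration is at least 1)."""
    pmf = [0.0] * (max_len + 1)
    for d in range(1, max_len + 1):
        pmf[d] = stay_prob ** (d - 1) * (1.0 - stay_prob)
    return pmf


def chain_duration_pmf(stay_prob, n_states, max_len):
    """Duration pmf for n_states identical geometric states visited in
    series: the n_states-fold convolution of the single-state pmf,
    truncated at max_len (values up to max_len are exact)."""
    single = geometric_pmf(stay_prob, max_len)
    total = single
    for _ in range(n_states - 1):
        nxt = [0.0] * (max_len + 1)
        for i, p_i in enumerate(total):
            if p_i == 0.0:
                continue
            for j, q_j in enumerate(single):
                if i + j > max_len:
                    break
                nxt[i + j] += p_i * q_j
        total = nxt
    return total


def mixture_pmf(weights, pmfs):
    """Weighted sum of duration pmfs of equal length (weights sum to 1)."""
    return [sum(w * pmf[d] for w, pmf in zip(weights, pmfs))
            for d in range(len(pmfs[0]))]
```

With one state the most probable duration is always 1, whereas a chain of several states shifts the peak to longer durations; mixing chains of different lengths or `stay_prob` values is one way to obtain a multimodal, long-tailed shape.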

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Analysis of nanopore detector measurements using Machine-Learning methods, with application to single-molecule kinetic analysis

Author: A Churbanov
A Prabhakaran
C Baribault
CJC Burges
E Osuna
K Thomson
Matthew Landry
R Durbin
R Iqbal
S Winters-Hilt
Stephen Winters-Hilt
TH Cormen
VN Vapnik
W Vercoutere
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: A nanopore detector has a nanometer-scale trans-membrane channel across which a potential difference is established, resulting in an ionic current through the channel in the pA-nA range. A distinctive channel current blockade signal is created as individually "captured" DNA molecules interact with the channel and modulate the channel's ionic current. The nanopore detector is sensitive enough that nearly identical DNA molecules can be classified with very high accuracy using machine learning techniques such as Hidden Markov Models (HMMs) and Support Vector Machines (SVMs). Results: A non-standard implementation of an HMM, emission inversion, is used for improved classification. Additional features are also considered for the feature vector employed by the SVM for classification: the addition of a single feature representing spike density is shown to notably improve classification results. Another, much larger, feature set expansion was studied (2500 additional features instead of 1), derived by including all the HMM's transition probabilities. The expanded features can introduce redundant, noisy information (as well as diagnostic information) into the current feature set, and thus degrade classification performance. A hybrid Adaptive Boosting approach was used for feature selection to alleviate this problem. Conclusion: The methods shown here, for more informed feature extraction, both improve classification and provide biologists and chemists with tools for obtaining a better understanding of the kinetic properties of molecules of interest.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Nanopore current transduction analysis of protein binding to non-terminal and terminal DNA regions: analysis of transcription factor binding, retroviral DNA terminus dynamics, and retroviral integrase-DNA binding

Author: A Mara
A Meller
AJ Storm
Amanda Davis
D Branton
D Mitsui
D Stein
DM Stein
DS Latchman
DW Deamer
E Heins
Eric Morales
Iftekhar Amin
J Li
JJ Kasianowicz
K Ken Healy
K Thomson
M Akeson
M Landry
P Chen
P Hindmarsh
PY Apel
R Durbin
S Howorka
S Winters-Hilt
SE Henrickson
SM Bezrukov
Stephen Winters-Hilt
TH Cormen
W Vercoutere
WH Coulter
Z Siwy
Publication venue: BioMed Central
Publication date: 01/01/2007

Crossref

Springer - Publisher Connector

PubMed Central

A novel, fast, HMM-with-Duration implementation – for application with a new, pattern recognition informed, nanopore detector

Author: A Churbanov
A Mara
A Meller
AJ Storm
B Sin
C Mitchell
Carl Baribault
CJC Burges
D Branton
D Stein
DM Stein
DW Deamer
E Heins
E Osuna
J Ferguson
J Li
JJ Kasianowicz
K Ken Healy
M Akeson
M Johnson
N Yoma
P Chen
P Ramesh
Park YK
PY Apel
R Durbin
S Howorka
S Levinson
S Vaseghi
S Winters-Hilt
SE Henrickson
SM Bezrukov
Stephen Winters-Hilt
T Koski
T Mitsui
TH Cormen
VN Vapnik
W Vercoutere
Z Siwy
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Hidden Markov Models (HMMs) provide an excellent means for structure identification and feature extraction on stochastic sequential data. An HMM-with-Duration (HMMwD) is an HMM that can also exactly model the hidden-label length (recurrence) distributions, whereas the regular HMM imposes a best-fit geometric distribution in its modeling/representation. Results: A novel, fast HMMwD implementation is presented, and experimental results demonstrate its performance on two-state synthetic data designed to model nanopore detector data. The HMMwD experimental results are compared to (i) the ideal model and (ii) the conventional HMM. Its accuracy is a clear improvement over the standard HMM, and matches that of the ideal solution in many cases where the standard HMM does not. Computationally, the new HMMwD has all the speed advantages of the conventional (simpler) HMM implementation. In preliminary work shown here, HMM feature extraction is then used to establish the first pattern recognition-informed (PRI) sampling control of a nanopore detector device (on a "live" data stream). Conclusion: The improved accuracy of the new HMMwD implementation, at the same order of computational cost as the standard HMM, is an important augmentation for applications in gene structure identification and channel current analysis, especially where speed is essential, as in PRI sampling control. The PRI experiment was designed to inherit the high accuracy of the well-characterized and distinctive blockades of the DNA hairpin molecules used as controls (or blockade "test-probes"). For this test set, the accuracy inherited is 99.9%.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Cheminformatics methods for novel nanopore analysis of HIV DNA termini

Author
Publication venue: BioMed Central
Publication date: 26/09/2006

Springer - Publisher Connector

Minería de datos sobre comunidades biológicas (Data mining on biological communities)

Author: Santa María Cristóbal
Soria Marcelo A.
Publication venue
Publication date: 01/05/2010

Scientific and technological practice often brings together concepts originating in different disciplines to develop profiles and potential uses that acquire a certain conceptual unity and independence. Such is the case of data mining, which, starting from database technology, gradually incorporated ideas from artificial intelligence and statistics in order to classify and/or predict outcomes across a wide variety of systems. The research project presented here studies bioinformatics techniques applied to soil microbial communities. These methods aim to classify the organisms that make up the environment and to predict their diversity. The analysis starts from the computational representation of the DNA that encodes genetic information and, using data obtained from samples, establishes the properties of the set of microorganisms that make up the community. This type of study, called metagenomics, makes it possible to group the different types of organisms into clusters that represent a taxonomic category such as species, genus, or family. From these groupings it is also possible to estimate biodiversity, providing information on the potential and richness of the soil. The research project has two objectives: first, to establish a Markovian bioinformatics model for comparing DNA sequences for classification purposes; and second, to present a critical analysis of the data mining procedures applied to assessing richness in different ecosystems. Track: Databases and data mining. Red de Universidades con Carreras en Informática (RedUNCI)