Search CORE

7,388 research outputs found

A standalone version of IsoFinder for the computational prediction of isochores in genome sequences

Author: Bernaola-Galván Pedro
Carpena Pedro
Oliver José L.
Publication venue
Publication date: 01/01/2008
Field of study

Isochores are long genome segments relatively homogeneous in G+C. A heuristic algorithm based on entropic segmentation has been developed by our group, and a web server implementing all the required components is available. However, a researcher may want to perform batch processing of many sequences simultaneously in its local machine, instead of analyzing them on one by one basis through the web. To this end, standalone versions are required. We report here the implementation of two standalone programs, able to predict isochores at the sequence level: 1) a command-line version (IsoFinder) for Windows and Linux systems; and 2) a user-friendly version (IsoFinderWin) running under Windows.Comment: 7 pages, 3 figure

arXiv.org e-Print Archive

Repositorio Institucional Universidad de Granada

Recommended from our members

EpiAlign: an alignment-based bioinformatic tool for comparing chromatin state sequences.

Author: Ge Xinzhou
Kwon Soo Bin
Li Jingyi Jessica
Li Wei Vivian
Xie Lingjue
Zhang Haowen
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

The availability of genome-wide epigenomic datasets enables in-depth studies of epigenetic modifications and their relationships with chromatin structures and gene expression. Various alignment tools have been developed to align nucleotide or protein sequences in order to identify structurally similar regions. However, there are currently no alignment methods specifically designed for comparing multi-track epigenomic signals and detecting common patterns that may explain functional or evolutionary similarities. We propose a new local alignment algorithm, EpiAlign, designed to compare chromatin state sequences learned from multi-track epigenomic signals and to identify locally aligned chromatin regions. EpiAlign is a dynamic programming algorithm that novelly incorporates varying lengths and frequencies of chromatin states. We demonstrate the efficacy of EpiAlign through extensive simulations and studies on the real data from the NIH Roadmap Epigenomics project. EpiAlign is able to extract recurrent chromatin state patterns along a single epigenome, and many of these patterns carry cell-type-specific characteristics. EpiAlign can also detect common chromatin state patterns across multiple epigenomes, and it will serve as a useful tool to group and distinguish epigenomic samples based on genome-wide or local chromatin state patterns

eScholarship - University of California

The Unsupervised Acquisition of a Lexicon from Continuous Speech

Author: de Marcken Carl
Publication venue
Publication date: 01/01/1995
Field of study

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.Comment: 27 page technical repor

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Bayesian modeling of recombination events in bacterial populations

Author: A Baldwin
A Baldwin
A Baldwin
A Rambaut
A Skalka
Adam Baldwin
C Fraser
Chris Dowson
CP Robert
CX Chan
D Falush
D Husmeier
D Posada
DJ Hand
E Mahenthiralingam
E Mahenthiralingam
EHL Aarts
Eshwar Mahenthiralingam
FM Cohan
J Corander
J Corander
J Corander
J Corander
J Felsenstein
J Hein
J Maynard Smith
JG Lawrence
JS Sinsheimer
Jukka Corander
JV Braun
M Arenas
M Hasegawa
MA Suchard
MJ Schervish
NC Grassly
P Marttinen
Pekka Marttinen
R Jain
RA Elton
S Sawyer
SA Sisson
VN Minin
VN Minin
William P Hanage
WJ Wiersinga
WP Hanage
X Didelot
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008
Field of study

Background: We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have a limited applicability in practice as the number of strains in a data set increases. Results: We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and to estimate the marginal posterior probabilities of putative origins over the sites. Conclusion: A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL http://web.abo.fi/fak/ mnf//mate/jc/software/brat.html

Crossref

Online Research @ Cardiff

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

An Overview of the Use of Neural Networks for Data Mining Tasks

Author: Alberts B
Alpaydin E
Ando T
Blake CL
Bramer MA
Castanheira LG
Han J
Lu H
Mitchell M
Ni X
Quinlan RJ
Rumelhart DE
Shafer JC
Shendure J
Simić D
Stahl F
Steinwart I
Surjandari I
Wei JS
Widrow B
Witten IH
Zaslavsky B
Zhang D
Publication venue: 'Wiley'
Publication date: 01/01/2012
Field of study

In the recent years the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is a substantial commercial interest as well as research investigations in the area that aim to develop new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks

Central Archive at the University of Reading

Crossref

Portsmouth University Research Portal (Pure)

Bournemouth University Research Online