Pseudo-Boolean Programming for Partially Ordered Genomes

Authors: Angibaud Sébastien, Fertin Guillaume, Thevenin Annelyse, Vialette Stéphane
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/09/2009

Comparing genomes of different species is a crucial problem in comparative genomics. Different measures have been proposed to compare two genomes: number of common intervals, number of adjacencies, number of reversals, etc. These measures are classically used between two totally ordered genomes. However, genetic mapping techniques often give rise to different maps with some unordered genes. Starting from a partial order between genes of a genome, one method to find a total order consists in optimizing a given measure between a linear extension of this partial order and a given total order of a close and well-known genome. However, for most common measures, the problem turns out to be NP-hard. In this paper, we propose a (0, 1)-linear programming approach to compute a linear extension of one genome that maximizes the number of common intervals (resp. the number of adjacencies) between this linear extension and a given total order. Next, we propose an algorithm to find linear extensions of two partial orders that maximize the number of adjacencies.
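
To make the optimization problem in this abstract concrete, here is a minimal brute-force sketch. It is not the paper's (0, 1)-linear programming formulation: it simply enumerates every linear extension of a small partial order and keeps the one with the most adjacencies relative to a reference total order. The function names, the unsigned definition of an adjacency (two genes consecutive in both orders, in either direction), and the toy data are all assumptions made for illustration.

```python
from itertools import permutations


def is_linear_extension(order, precedes):
    """True if every (a, b) constraint in `precedes` has a before b in `order`."""
    pos = {gene: i for i, gene in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in precedes)


def adjacencies(order, reference):
    """Count gene pairs that are consecutive in both orders (unsigned, assumed)."""
    ref_pairs = {frozenset(p) for p in zip(reference, reference[1:])}
    return sum(frozenset(p) in ref_pairs for p in zip(order, order[1:]))


def best_linear_extension(genes, precedes, reference):
    """Exhaustive search; exponential time, so only usable on tiny inputs."""
    candidates = (p for p in permutations(genes)
                  if is_linear_extension(p, precedes))
    return max(candidates, key=lambda p: adjacencies(p, reference))


# Toy instance (hypothetical): a and b must both precede c; d is unconstrained.
genes = ["a", "b", "c", "d"]
precedes = {("a", "c"), ("b", "c")}
reference = ["b", "a", "d", "c"]
best = best_linear_extension(genes, precedes, reference)
```

Because the problem is NP-hard for most common measures, as the abstract notes, exhaustive enumeration like this only serves to illustrate the objective; it does not scale.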

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

HAL-Rennes 1

Large Genomes Assembly Using MAPREDUCE Framework

Author: Zhang Yuehua
Publication venue: Clemson University Libraries
Publication date: 01/12/2022

Knowing the genome sequence of an organism is the essential step toward understanding its genomic and genetic characteristics. Currently, whole genome shotgun (WGS) sequencing is the most widely used genome sequencing technique to determine the entire DNA sequence of an organism. Recent advances in next-generation sequencing (NGS) techniques have enabled biologists to generate large DNA sequences in a high-throughput and low-cost way. However, the assembly of NGS reads faces significant challenges due to short reads and an enormously high volume of data. Despite recent progress in genome assembly, current NGS assemblers cannot generate high-quality results or efficiently handle large genomes with billions of reads. In this research, we proposed a new Genome Assembler based on MapReduce (GAMR), which tackles both limitations. GAMR is based on a bi-directed de Bruijn graph and implemented using the MapReduce framework. We designed a distributed algorithm for each step in GAMR, making it scalable in assembling large-scale genomes. We also proposed novel gap-filling algorithms to improve assembly results to achieve higher accuracy and more extended continuity. We evaluated the assembly performance of GAMR using benchmark data and compared it against other NGS assemblers. We also demonstrated the scalability of GAMR by using it to assemble loblolly pine (~22 Gbp). The results showed that GAMR finished the assembly much faster and with a much lower requirement of computing resources.
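
As a rough illustration of the idea named in this abstract, the sketch below builds a plain (not bi-directed) de Bruijn graph from reads using a map step and a reduce step written as ordinary Python functions. It is not GAMR's algorithm: the function names, the single-strand simplification, and the in-memory "reduce" are assumptions chosen to keep the example short.

```python
from collections import defaultdict


def map_read(read, k):
    """Map step: emit one (prefix, suffix) edge per k-mer in the read."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield kmer[:-1], kmer[1:]


def reduce_edges(pairs):
    """Reduce step: group emitted edges by their prefix node."""
    graph = defaultdict(set)
    for prefix, suffix in pairs:
        graph[prefix].add(suffix)
    return dict(graph)


def build_graph(reads, k):
    """Run the map step over all reads, then reduce into an adjacency map."""
    return reduce_edges(pair for read in reads for pair in map_read(read, k))


# Toy reads (hypothetical) that overlap by three bases.
graph = build_graph(["ACGT", "CGTA"], k=3)
```

In a real MapReduce deployment the grouping by key is done by the framework's shuffle phase across machines; here a dictionary stands in for it.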

Clemson University: TigerPrints

An integer linear programming approach for genome scaffolding

Authors: Briot Nicolas, Chateau Annie, Coletta Remi, De Givry Simon, Leleux Philippe, Schiex Thomas
Publication venue: HAL CCSD
Publication date: 08/09/2014

This paper presents a simple and fast approach for genome scaffolding, combining constraint modeling and simple graph manipulation. We model the scaffolding problem as an optimization problem on a graph built from an alignment of paired-end reads on contigs, then describe a heuristic that solves this problem by iteratively combining local constraint solving and cycle-breaking phases. We tested our approach on a benchmark of various genomes and compared it with several commonly used scaffolders. The proposed method is quick, flexible, and provides results comparable in quality to those of other scaffolders. In addition, contrary to state-of-the-art approaches that require dedicated servers, it can be run on a basic notebook computer even for large genomes.

INRIA a CCSD electronic archive server

HAL Descartes

ProdInra

Hal-Diderot

Arapan-S: a fast and highly accurate whole-genome assembly software for viruses and small genomes

Authors: B Chevreux, DD Sommer, DR Zerbino, DW Bryant, EW Myers, GG Sutton, I Maccallum, J Butler, JT Simpson, MJ Chaisson, Mohammed Sahli, P Medvedev, PA Pevzner, R Li, RL Warren, Tetsuo Shibuya, X Huang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Killing Two Birds with One Stone: The Concurrent Development of the Novel Alignment Free Tree Building Method, Scrawkov-Phy, and the Extensible Phyloinformatics Utility, EMU-Phy.

Author: Fisk J. Nick
Publication venue: RIT Scholar Works
Publication date: 27/03/2016

Many components of phylogenetic inference belong to the most computationally challenging and complex domain of problems. To further escalate the challenge, the genomics revolution has exponentially increased the amount of data available for analysis. This, combined with the foundational nature of phylogenetic analysis, has prompted the development of novel methods for managing and analyzing phylogenomic data, as well as improving or intelligently utilizing current ones. In this study, a novel alignment-free tree-building algorithm using Quasi-Hidden Markov Models (QHMMs), Scrawkov-Phy, is introduced. Additionally, exploratory work in the design and implementation of an extensible phyloinformatics tool, EMU-Phy, is described. Lastly, features of the best-practice tools are inspected and provisionally incorporated into Scrawkov-Phy to evaluate the algorithm’s suitability for said features. This study shows that Scrawkov-Phy, as utilized through EMU-Phy, captures phylogenetic signal and reconstructs reasonable phylogenies without the need for multiple-sequence alignment or high-order statistical models. There are numerous additions to both Scrawkov-Phy and EMU-Phy that would improve their efficacy, and the results of the provisional study show that such additions are compatible.

RIT Scholar Works

AI Methods in Algorithmic Composition: A Comprehensive Survey

Authors: Fernández Rodríguez Jose David, Vico-Vela Francisco Jose
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2013

Algorithmic composition is the partial or total automation of the process of music composition by using computers. Since the 1950s, different computational techniques related to Artificial Intelligence have been used for algorithmic composition, including grammatical representations, probabilistic methods, neural networks, symbolic rule-based systems, constraint programming and evolutionary algorithms. This survey aims to be a comprehensive account of research on algorithmic composition, presenting a thorough view of the field for researchers in Artificial Intelligence. This study was partially supported by a grant for the MELOMICS project (IPT-300000-2010-010) from the Spanish Ministerio de Ciencia e Innovación, and a grant for the CAUCE project (TSI-090302-2011-8) from the Spanish Ministerio de Industria, Turismo y Comercio. The first author was supported by a grant for the GENEX project (P09-TIC-5123) from the Consejería de Innovación y Ciencia de Andalucía.

arXiv.org e-Print Archive

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional Universidad de Málaga