Pseudo-Boolean Programming for Partially Ordered Genomes
Comparing genomes of different species is a crucial problem in comparative genomics. Different measures have been proposed to compare two genomes: number of common intervals, number of adjacencies, number of reversals, etc. These measures are classically used between two totally ordered genomes. However, genetic mapping techniques often give rise to different maps with some unordered genes. Starting from a partial order between genes of a genome, one method to find a total order consists in optimizing a given measure between a linear extension of this partial order and a given total order of a close and well-known genome. However, for most common measures, the problem turns out to be NP-hard. In this paper, we propose a (0, 1)-linear programming approach to compute a linear extension of one genome that maximizes the number of common intervals (resp. the number of adjacencies) between this linear extension and a given total order. Next, we propose an algorithm to find linear extensions of two partial orders that maximize the number of adjacencies.
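To make the objective concrete, here is a minimal brute-force sketch, not the (0, 1)-linear programming formulation of the paper. It enumerates every linear extension of a small partial order and keeps one that maximizes the number of adjacencies with a reference total order. The function names, the encoding of the partial order as `(a, b)` pairs meaning "a before b", and the unsigned definition of an adjacency are all assumptions made for illustration. Enumeration is exponential and only practical for a handful of genes.

```python
from itertools import permutations

def is_linear_extension(order, precedes):
    """True if every required pair (a, b) has a placed before b in `order`."""
    pos = {gene: i for i, gene in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in precedes)

def count_adjacencies(order, reference):
    """Count neighbouring pairs of `order` that are also neighbours in
    `reference`, ignoring orientation (assumed unsigned genomes)."""
    ref_pairs = {frozenset(p) for p in zip(reference, reference[1:])}
    return sum(frozenset(p) in ref_pairs for p in zip(order, order[1:]))

def best_linear_extension(genes, precedes, reference):
    """Exhaustively search the linear extensions of the partial order and
    return (score, order) for the first one with the highest score."""
    best = None
    for perm in permutations(genes):
        if is_linear_extension(perm, precedes):
            score = count_adjacencies(perm, reference)
            if best is None or score > best[0]:
                best = (score, list(perm))
    return best

if __name__ == "__main__":
    # Hypothetical toy instance: genes 1..4 with 1 before 3 and 2 before 4.
    print(best_linear_extension([1, 2, 3, 4], [(1, 3), (2, 4)], [1, 2, 3, 4]))
```

An exact solver replaces the enumeration with 0/1 variables and linear constraints, which is what makes larger instances tractable in practice.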
Large Genomes Assembly Using MAPREDUCE Framework
Knowing the genome sequence of an organism is the essential step toward understanding its genomic and genetic characteristics. Currently, whole genome shotgun (WGS) sequencing is the most widely used genome sequencing technique to determine the entire DNA sequence of an organism. Recent advances in next-generation sequencing (NGS) techniques have enabled biologists to generate large DNA sequences in a high-throughput and low-cost way. However, the assembly of NGS reads faces significant challenges due to short reads and an enormously high volume of data. Despite recent progress in genome assembly, current NGS assemblers cannot generate high-quality results or efficiently handle large genomes with billions of reads. In this research, we proposed a new Genome Assembler based on MapReduce (GAMR), which tackles both limitations. GAMR is based on a bi-directed de Bruijn graph and implemented using the MapReduce framework. We designed a distributed algorithm for each step in GAMR, making it scalable in assembling large-scale genomes. We also proposed novel gap-filling algorithms to improve assembly results to achieve higher accuracy and more extended continuity. We evaluated the assembly performance of GAMR using benchmark data and compared it against other NGS assemblers. We also demonstrated the scalability of GAMR by using it to assemble loblolly pine (~22 Gbp). The results showed that GAMR finished the assembly much faster and with a much lower requirement of computing resources.
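The idea of building a de Bruijn graph in map and reduce steps can be sketched in a few lines of plain Python. This is an illustration only and does not reproduce GAMR: it runs in a single process, uses a simple directed graph rather than a bi-directed one, and ignores reverse complements. The function names and the choice of (k-1)-mers as nodes are assumptions.

```python
from collections import defaultdict

def map_read(read, k):
    """Map step: for each k-mer in a read, emit an edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield kmer[:-1], kmer[1:]

def reduce_edges(pairs):
    """Reduce step: group edges by source node and count multiplicities."""
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst in pairs:
        graph[src][dst] += 1
    return {src: dict(dsts) for src, dsts in graph.items()}

def build_graph(reads, k):
    """Chain the map and reduce steps over all reads."""
    pairs = (pair for read in reads for pair in map_read(read, k))
    return reduce_edges(pairs)

if __name__ == "__main__":
    # Hypothetical toy reads; real inputs would be billions of NGS reads.
    print(build_graph(["ACGT", "CGTA"], k=3))
```

In a distributed setting the map step runs independently per read and the framework routes all edges sharing a source node to the same reducer, which is what lets the graph construction scale with the number of reads.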
An integer linear programming approach for genome scaffolding
This paper presents a simple and fast approach for genome scaffolding, combining constraint modeling and simple graph manipulation. We model the scaffolding problem as an optimization problem on a graph built from a paired-end read alignment on contigs, then describe a heuristic that solves this problem by iteratively combining local constraint-solving and cycle-breaking phases. We tested our approach on a benchmark of various genomes and compared it with several commonly used scaffolders. The proposed method is quick, flexible, and provides results comparable to other scaffolders in terms of quality. In addition, contrary to state-of-the-art approaches that require dedicated servers, it can be run on a basic notebook computer even for large genomes.
Killing Two Birds with One Stone: The Concurrent Development of the Novel Alignment Free Tree Building Method, Scrawkov-Phy, and the Extensible Phyloinformatics Utility, EMU-Phy.
Many components of phylogenetic inference belong to the most computationally challenging and complex domain of problems. To further escalate the challenge, the genomics revolution has exponentially increased the amount of data available for analysis. This, combined with the foundational nature of phylogenetic analysis, has prompted the development of novel methods for managing and analyzing phylogenomic data, as well as improving or intelligently utilizing current ones. In this study, a novel alignment-free tree-building algorithm using Quasi-Hidden Markov Models (QHMMs), Scrawkov-Phy, is introduced. Additionally, exploratory work in the design and implementation of an extensible phyloinformatics tool, EMU-Phy, is described. Lastly, features of the best-practice tools are inspected and provisionally incorporated into Scrawkov-Phy to evaluate the algorithm’s suitability for said features.
This study shows that Scrawkov-Phy, as utilized through EMU-Phy, captures phylogenetic signal and reconstructs reasonable phylogenies without the need for multiple-sequence alignment or high-order statistical models. There are numerous additions to both Scrawkov-Phy and EMU-Phy that would improve their efficacy, and the results of the provisional study show that such additions are compatible.
AI Methods in Algorithmic Composition: A Comprehensive Survey
Algorithmic composition is the partial or total automation of the process of music composition
by using computers. Since the 1950s, different computational techniques related to
Artificial Intelligence have been used for algorithmic composition, including grammatical
representations, probabilistic methods, neural networks, symbolic rule-based systems, constraint
programming and evolutionary algorithms. This survey aims to be a comprehensive
account of research on algorithmic composition, presenting a thorough view of the field for
researchers in Artificial Intelligence. This study was partially supported by a grant for the MELOMICS project
(IPT-300000-2010-010) from the Spanish Ministerio de Ciencia e Innovación, and a grant for
the CAUCE project (TSI-090302-2011-8) from the Spanish Ministerio de Industria, Turismo
y Comercio. The first author was supported by a grant for the GENEX project (P09-TIC-5123)
from the Consejería de Innovación y Ciencia de Andalucía.