780 research outputs found

Fine-Grained Parallel Genomic Sequence Comparison

Author: Dominique Lavenier
Publication venue: IntechOpen
Publication date: 01/01/2010

IntechOpen

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Extreme Scale De Novo Metagenome Assembly

Author: Arndt Bill
Buluc Aydin
Egan Rob
Georganas Evangelos
Goltsman Eugene
Hofmeyr Steven
Oliker Leonid
Tritt Andrew
Yelick Katherine
Publication date: 01/01/2018

Metagenome assembly is the process of transforming a set of short, overlapping, and potentially erroneous DNA segments from environmental samples into an accurate representation of the underlying microbiomes' genomes. State-of-the-art tools require large shared-memory machines and cannot handle contemporary metagenome datasets that exceed terabytes in size. In this paper, we introduce the MetaHipMer pipeline, a high-quality and high-performance metagenome assembler that employs an iterative de Bruijn graph approach. MetaHipMer leverages a specialized scaffolding algorithm that produces long scaffolds and accommodates the idiosyncrasies of metagenomes. MetaHipMer is end-to-end parallelized using the Unified Parallel C language and therefore can run seamlessly on shared- and distributed-memory systems. Experimental results show that MetaHipMer matches or outperforms the state-of-the-art tools in terms of accuracy. Moreover, MetaHipMer scales efficiently to large concurrencies and is able to assemble previously intractable grand challenge metagenomes. We demonstrate the unprecedented capability of MetaHipMer by computing the first full assembly of the Twitchell Wetlands dataset, consisting of 7.5 billion reads (2.6 TBytes in size). Comment: Accepted to SC1
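
As a rough illustration of the de Bruijn graph idea named in the abstract above, the sketch below builds such a graph from the k-mers of a few toy reads. It is not MetaHipMer's implementation (which, per the abstract, is written in Unified Parallel C and iterates the construction); the function name `build_de_bruijn`, the example reads, and the choice k = 3 are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read.

    Hypothetical helper for illustration only: every k-mer in a read
    contributes one edge from its prefix (first k-1 bases) to its
    suffix (last k-1 bases).
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two overlapping toy reads (assumed data, not from the paper).
reads = ["ACGTC", "CGTCA"]
graph = build_de_bruijn(reads, k=3)
```

Walking the single path through `graph` (AC, CG, GT, TC, CA) spells out ACGTCA, the sequence both reads came from; real assemblers must additionally deal with sequencing errors, repeats, and uneven coverage.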

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

HapPart: partitioning algorithm for multiple haplotyping from haplotype conflict graph

Author: Abdullah Abu-Bakar Muhammad
Hossain Md. Monowar
Shill Pintu Chandra
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/06/2022

Each chromosome in the human genome has two copies. The haplotype assembly challenge entails reconstructing two haplotypes (chromosomes) using aligned fragments of genomic sequence. Plants such as wheat, paddy, and banana have more than two copies of each chromosome. Multiple haplotype reconstruction has been a major research topic, and several approaches have been designed for reconstructing multiple haplotypes for a polyploid organism. Researchers remain fascinated by the computational challenge. This article introduces a partitioning algorithm, HapPart, for dividing the fragments into k groups, with a focus on reducing computational time. HapPart uses the minimum error correction curve to determine the value of k at which the growth of the gain measure between two consecutive values of k, multiplied by its diversity, is maximum. A haplotype conflict graph is used to construct every possible number of groups. The distance between two nodes in the graph is the dissimilarity between the two corresponding haplotypes. By merging the two nodes with the minimum distance between them, the algorithm ensures minimum error among fragments in the same group. Experimental results on real and simulated data show that HapPart can partition fragments efficiently and with less computational time.
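
The minimum error correction (MEC) measure mentioned in the abstract above can be sketched as follows. This is a generic MEC computation under stated assumptions, not HapPart itself: fragments are assumed to be equal-length strings over the alleles '0' and '1' with '-' marking uncovered positions, and the names `consensus` and `mec` are hypothetical.

```python
from collections import Counter


def consensus(fragments):
    """Majority allele per column, ignoring gaps ('-').

    Assumes all fragments have the same length. Ties are broken by
    first occurrence; a column with no covered position yields '-'.
    """
    result = []
    for column in zip(*fragments):
        counts = Counter(allele for allele in column if allele != "-")
        result.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(result)


def mec(groups):
    """Total number of allele flips needed so that every fragment
    agrees with the consensus haplotype of its own group."""
    total = 0
    for fragments in groups:
        cons = consensus(fragments)
        for fragment in fragments:
            total += sum(
                1 for allele, ref in zip(fragment, cons)
                if allele != "-" and allele != ref
            )
    return total


# Toy partition into k = 2 groups (assumed data, for illustration).
groups = [["0101", "01-1", "0111"], ["1010", "10-0"]]
score = mec(groups)
```

A lower `score` means the fragments within each group are more mutually consistent, which is why a partitioning method can use MEC to compare candidate groupings for different values of k.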

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

DNA Fragment Assembly Algorithms: Toward a Solution for Long Repeats

Author: Li Ching
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2008

In this work, we describe our efforts to seek optimal solutions for the DNA Fragment Assembly Problem in terms of assembly accuracy and runtime efficiency. The main obstacles to DNA fragment assembly are analyzed. After reviewing various advanced algorithms adopted by some assemblers in the bioinformatics industry, this work explores the feasibility of assembling fragments for a target sequence containing perfect long repeats, which is deemed theoretically impossible without tedious finishing reaction experiments. Innovative algorithms incorporating statistical analysis proposed in this work make the restoration of DNA sequences containing long perfect repeats an attainable goal.

SJSU ScholarWorks

JANE: efficient mapping of prokaryotic ESTs and variable length sequence reads on related template genomes

Author: A Raghunathan
AD Smith
Alexander Schmid
Andres Moya
AR Subramanian
B Langmead
B Morgenstern
BP Howden
C Liang
CA Hutchison III
Chunguang Liang
CJ Sigrist
D Gilbert
DW Mount
E Birney
E Gaidos
ER Xavier
F Meyer
GS Slater
H Jiang
H Li
JE Stajich
JP McCutcheon
Jörg Bernhardt
M Krzywinski
María José López-Sánchez
N Sanapareddy
R Li
R Mott
R Seshadri
R Wernersson
RK Aziz
RL Tatusov
Roy Gross
S Stoll
S Yang
SF Altschul
T Smith
Thomas Dandekar
X Huang
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

The application of artificial intelligence techniques to a sequencing problem in the biological domain

Author: Walker Joan
Publication date: 01/01/1995

SIGLE. Available from British Library Document Supply Centre - DSC:DXN002816 / BLDSC - British Library Document Supply Centre. GB. United Kingdo

Abertay Research Portal

OpenGrey Repository

On the role of metaheuristic optimization in bioinformatics

Author: Benito Sergio
Calvet Laura
Juan Angel A
Prados Ferran
Publication venue: Royal College of Obstetricians & Gynaecologists (RCOG)
Publication date: 01/01/2022

Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From the previous analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics.

UCL Discovery