Anu Sharma; Dwijesh Chandra Mishra; Neeraj Budhlakoti; Sanjeev Kumar; Shashi Bhushan Lal

Publication date: 10 November 2019

Abstract

Current de novo assemblers cannot make effective use of the long reads generated by present single-molecule sequencing technologies, primarily because of their considerable error rate. In this project, both long and short reads were used to obtain efficient assembly results. Error correction was performed by aligning short reads to the long reads, which reduces the errors in the long reads before they are used for assembly. Our approach exploits long-read technology by complementing it with shorter, high-identity sequences, resulting in long, accurate transcripts and improved assemblies. The result of this hybrid approach is higher-quality assemblies with fewer errors and gaps, which will drive down the cost of genome finishing and enable more accurate downstream analyses.

High-quality assemblies are critical for all aspects of genomics, especially genome annotation and comparative genomics. Higher-quality assemblies, with long unbroken contigs, will have a positive impact on a wide range of disciplines. In this way, high error rates need not be a barrier to genome assembly. High-error long reads can be efficiently assembled in combination with complementary short reads to produce assemblies not possible with any prior technology, bringing us one step closer to the goal of “one chromosome, one contig.” The rapid turnaround time possible with PacBio and other technologies, such as Ion Torrent, can make it possible to produce high-quality genome assemblies in a fraction of the time once required.

Many bioinformatics tools run on parallelized computational infrastructure to obtain results in comparatively less time, because of the heavy computational algorithms or job sizes involved. In this work, parallelized tools installed on supercomputing infrastructure were used for faster results. Genome assembly is carried out as a pipeline, with the tools running in an HPC environment.
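
The error-correction idea described above (aligning short, high-identity reads over a long read and using them to reduce its errors) can be illustrated with a toy sketch. This is not the method used by the software, whose internals the abstract does not describe. It assumes gapless alignments at known offsets and corrects substitutions only, whereas real long-read errors are often insertions and deletions and real tools use gapped alignment. The names `LongReadCorrector`, `AlignedShortRead`, `correct`, and the `minCoverage` threshold are all hypothetical.

```java
import java.util.List;

/** Toy sketch (hypothetical): per-position majority vote of short reads over a long read. */
final class LongReadCorrector {

    private static final String BASES = "ACGT";

    /** A short read placed on the long read at a 0-based offset (gapless alignment assumed). */
    static final class AlignedShortRead {
        final int offset;
        final String bases;

        AlignedShortRead(int offset, String bases) {
            this.offset = offset;
            this.bases = bases;
        }
    }

    /**
     * Returns a copy of longRead in which each position covered by at least
     * minCoverage short-read bases is replaced by the strict-majority base.
     * Positions with too little coverage or no strict majority are left unchanged.
     */
    static String correct(String longRead, List<AlignedShortRead> alignments, int minCoverage) {
        int n = longRead.length();
        int[][] counts = new int[n][BASES.length()]; // per-position counts of A, C, G, T

        for (AlignedShortRead read : alignments) {
            for (int i = 0; i < read.bases.length(); i++) {
                int pos = read.offset + i;
                int base = BASES.indexOf(read.bases.charAt(i));
                if (pos >= 0 && pos < n && base >= 0) {
                    counts[pos][base]++;
                }
            }
        }

        StringBuilder corrected = new StringBuilder(n);
        for (int pos = 0; pos < n; pos++) {
            int best = 0;
            int total = 0;
            for (int b = 0; b < BASES.length(); b++) {
                total += counts[pos][b];
                if (counts[pos][b] > counts[pos][best]) {
                    best = b;
                }
            }
            boolean confident = total >= minCoverage && counts[pos][best] * 2 > total;
            corrected.append(confident ? BASES.charAt(best) : longRead.charAt(pos));
        }
        return corrected.toString();
    }
}
```

For example, with the long read `ACGTTCGA`, short reads `GTAC`, `TACG` and `ACGA` at offsets 2, 3 and 4, and `minCoverage` of 2, the mismatching base at position 4 is outvoted and the call returns `ACGTACGA`.
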
This study was undertaken with the objective of creating web-based software for the various components of assembly: pre-processing, alignment for error correction, long-read assembly and scaffolding. The software has been developed using JSP, Java, HTML and CSS. It performs the series of computations required for all the steps involved. These computations are run on the ASHOKA supercomputing system to obtain faster results. The results are shown to the user in the browser and can also be downloaded to the client’s local hard disk.
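
The four components listed above run as a pipeline, each step consuming the output of the previous one. A minimal Java sketch of that ordering follows. It is an assumption about structure, not the software's actual code: `AssemblyPipeline`, `Stage`, `StageRunner` and `runAll` are hypothetical names, and the runner stands in for whatever mechanism submits each job to the supercomputing system.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of running the assembly steps in a fixed order. */
final class AssemblyPipeline {

    /** The four components named in the abstract, in execution order. */
    enum Stage {
        PRE_PROCESSING,
        ERROR_CORRECTION_ALIGNMENT,
        LONG_READ_ASSEMBLY,
        SCAFFOLDING
    }

    /** Placeholder for the code that runs one stage and returns the path of its output. */
    interface StageRunner {
        String run(Stage stage, String inputPath);
    }

    /**
     * Runs every stage in declaration order, feeding each stage's output
     * to the next one, and returns the output path of each stage.
     */
    static List<String> runAll(String inputPath, StageRunner runner) {
        List<String> outputs = new ArrayList<>();
        String current = inputPath;
        for (Stage stage : Stage.values()) {
            current = runner.run(stage, current);
            outputs.add(current);
        }
        return outputs;
    }
}
```

Keeping the runner behind an interface means the same ordering logic can be exercised locally with a stub and, in a deployment like the one described, backed by jobs on the HPC system.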

Available Versions

KRISHI Publications and Data Repository

oai:krishi.icar.gov.in:1234567...

Last time updated on 16/11/2021