Current de novo assemblers are unable to effectively use the long-read sequencing data generated
by present single-molecule sequencing technologies, primarily because of the considerable error
rate. In this project, both long and short reads were used to obtain efficient assembly results.
Error correction of the long reads was performed by aligning short reads over the long reads, and
the corrected long reads were then used for assembly. Our approach exploits this technology
by complementing it with shorter, high-identity sequences resulting in long, accurate transcripts
and improved assemblies. The result of our hybrid approach is higher quality assemblies with
fewer errors and gaps, which will drive down the cost of genome finishing and enable
more accurate downstream analyses. High-quality assemblies are critical for all aspects of
genomics, especially genome annotation and comparative genomics. It is clear that higher-quality
assemblies, with long unbroken contigs, will have a positive impact on a wide range of disciplines.

In this way, high error rates need not be a barrier to genome assembly. High-error,
long reads can be efficiently assembled in combination with complementary short reads to
produce assemblies not possible with any prior technology, bringing us one step closer to the goal
of “one chromosome, one contig.” The rapid turnaround time possible with PacBio and other
technologies, such as Ion Torrent, can make it possible to produce high-quality genome assemblies
in a fraction of the time once required.

Many bioinformatics tools run on parallelized computational infrastructure to deliver results in
comparatively less time, because of the heavy algorithms or large job sizes involved. In this
work, the parallelized tools installed on supercomputing infrastructure were utilized for faster
results. The genome assembly is carried out as a pipeline, with the tools running in an HPC
environment. This study was undertaken with the objective of creating web-based software for
the various components of assembly, namely pre-processing, alignment for error correction, long-read
assembly and scaffolding. The software has been developed using JSP, Java, HTML and CSS.
This software performs a series of computations for all the steps involved. These computations are
run on the ASHOKA supercomputing system to obtain results faster. The results are shown to the
user in the browser and can also be downloaded to the client’s local hard disk.
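
As an illustration of the error-correction step described in this abstract (aligning short reads over long reads), the following is a minimal Java sketch, not the software's actual implementation. It assumes the short reads have already been placed on the long read by an external aligner, that the alignments are gap-free, and that a simple per-position majority vote is an acceptable stand-in for the real consensus procedure. The class name `LongReadCorrector` and the method `correct` are hypothetical.

```java
// Hypothetical sketch: majority-vote correction of one long read using
// short reads that are assumed to be already aligned without gaps.
final class LongReadCorrector {

    private static final String BASES = "ACGT";

    /**
     * Returns a copy of longRead in which every position covered by short
     * reads is replaced by the most frequent short-read base at that position.
     * Uncovered positions and positions with a tied vote keep the original base.
     *
     * offsets[i] is the assumed 0-based start of shortReads[i] on the long read.
     */
    static String correct(String longRead, int[] offsets, String[] shortReads) {
        if (offsets.length != shortReads.length) {
            throw new IllegalArgumentException("one offset is required per short read");
        }

        // votes[pos][b] counts how many short reads support base b at pos.
        int[][] votes = new int[longRead.length()][BASES.length()];
        for (int r = 0; r < shortReads.length; r++) {
            String shortRead = shortReads[r];
            for (int i = 0; i < shortRead.length(); i++) {
                int pos = offsets[r] + i;
                int base = BASES.indexOf(shortRead.charAt(i));
                // Ignore overhanging positions and non-ACGT symbols such as N.
                if (pos >= 0 && pos < longRead.length() && base >= 0) {
                    votes[pos][base]++;
                }
            }
        }

        StringBuilder corrected = new StringBuilder(longRead);
        for (int pos = 0; pos < longRead.length(); pos++) {
            int best = -1;
            int bestCount = 0;
            boolean tie = false;
            for (int b = 0; b < BASES.length(); b++) {
                if (votes[pos][b] > bestCount) {
                    best = b;
                    bestCount = votes[pos][b];
                    tie = false;
                } else if (bestCount > 0 && votes[pos][b] == bestCount) {
                    tie = true;
                }
            }
            if (best >= 0 && !tie) {
                corrected.setCharAt(pos, BASES.charAt(best));
            }
        }
        return corrected.toString();
    }
}
```

For example, calling `LongReadCorrector.correct("ACGTTCGTAC", new int[]{0, 2, 3}, new String[]{"ACGTA", "GTACG", "TACGT"})` replaces the erroneous `T` at position 4 with `A`, because all three short reads covering that position agree. A production pipeline would also need to handle insertions and deletions, which dominate single-molecule sequencing errors and which this sketch deliberately leaves out.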