
    Functional Comparison of Current Software Tools for Genomic Assembly from High Throughput Sequencing Data

    De novo genomic sequencing, the process of determining the sequence of a genome that has not previously been elucidated, presents unique challenges, especially for larger genomes. Modern high-throughput sequencing technologies have addressed the problem of covering the entire genome in a reasonable time by fragmenting the genome into portions that can be examined in a massively parallel fashion. While these technologies have saved considerable time and cost in the chemical process of determining the sequence of a genome, they produce sets of many tens of millions of sequence fragments called reads, each typically only 100 to 300 bases long. Assembling these reads into a genomic sequence is highly computationally complex. A variety of assembly software packages are readily available for this purpose. In this project, a set of genomic assemblers was selected for examination. These programs were then tested with an Illumina data set for the grape species Vitis romanetii. Experimental runs with this data set were performed to evaluate the run time required as well as the contiguity, completeness, and accuracy of the resulting assemblies. Different approaches to quality-control preprocessing of the sequence data were also explored and evaluated. The results strongly support the use of the program MaSuRCA, run with data that has not been preprocessed for quality control. The second-highest recommendation is ABySS with data preprocessed via QuorUM error correction. It was also hoped that these tests would produce at least the beginnings of a draft genome for V. romanetii. The assemblies that came closest to publication quality were produced by MaSuRCA. Examination of these with the assessment software BUSCO suggests that the best of them may well be approaching publishable quality.
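
    The abstract does not say which contiguity metric was used. As an illustration only, a minimal sketch of N50, a widely used contiguity statistic, is given below; the function name and the choice of N50 are assumptions and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly: the length L such that contigs of
    length >= L together cover at least half of the total assembled bases.
    Illustrative helper; not taken from the study described above."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum past half is the N50 contig.
        if running >= half_total:
            return length
    return 0  # empty assembly


if __name__ == "__main__":
    # Hypothetical contig lengths, for demonstration only.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    A higher N50 indicates that more of the assembly sits in long contigs. It says nothing about completeness or accuracy, which the abstract evaluates separately.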

    Ecological indicators of water quality in small rivers

    Hydrobiological indicators are now widely used to monitor surface water quality. This report presents the results of applying the methods suggested at the 1st Soviet-American seminar (1975), the development of improved methods, and an estimation of their usefulness under various conditions. Among the criteria that permit an estimation of the degree and character of changes in water quality, and of their connection with the functioning of river ecosystems in general, biological testing of natural waters appears to be the most universal. It is being carried out in two main directions: ecological and physiological. This study summarises approaches in both directions.

    Reference-Free Validation of Short Read Data

    High-throughput DNA sequencing techniques offer the ability to rapidly and cheaply sequence material such as whole genomes. However, the short-read data produced by these techniques can be biased or compromised at several stages in the sequencing process; the sources and properties of some of these biases are not always known. Accurate assessment of bias is required for experimental quality control, genome assembly, and interpretation of coverage results. An additional challenge is that, for new genomes or material from an unidentified source, there may be no reference available against which the reads can be checked. [...] k-mers. We apply our methodology to a wide range of short-read data and show that, surprisingly, strong biases appear to be present. These include gross overrepresentation of some poly-base sequences, per-position biases towards some bases, and apparent preferences for some starting positions over others. The existence of biases in short-read data is known, but they appear to be greater and more diverse than identified in previous literature. Statistical analysis of a set of short reads can help identify issues prior to assembly or resequencing, and should help guide chemical or statistical methods for bias rectification.
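
    The abstract's own method is not fully recoverable from this listing. As a rough sketch only, two of the biases it names (per-position base bias and poly-base overrepresentation) can be measured without a reference, as below. The function names and the uniform, independent-base baseline are assumptions made for illustration, not the authors' methodology.

```python
from collections import Counter


def per_position_base_fractions(reads):
    """For each read position, return the fraction of reads showing each base."""
    counts = []  # counts[i] tallies the bases seen at position i
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    fractions = []
    for position_counts in counts:
        total = sum(position_counts.values())
        fractions.append({b: n / total for b, n in position_counts.items()})
    return fractions


def poly_base_excess(reads, k):
    """Ratio of observed to expected counts for the four homopolymer k-mers.

    ASSUMPTION: the expectation treats bases as uniform and independent,
    which ignores GC content; it is a crude baseline for illustration.
    """
    kmers = Counter()
    for read in reads:
        r = read.upper()
        for i in range(len(r) - k + 1):
            kmers[r[i:i + k]] += 1
    total = sum(kmers.values())
    if total == 0:
        return {}
    expected = total / 4 ** k
    return {b: kmers[b * k] / expected for b in "ACGT"}


if __name__ == "__main__":
    toy_reads = ["AAAA", "AAAC", "ACGT"]  # hypothetical reads, not real data
    print(per_position_base_fractions(toy_reads)[0])
    print(poly_base_excess(toy_reads, 2))
```

    A ratio well above 1 for a homopolymer such as `AAAA...` would flag the kind of poly-base overrepresentation the abstract describes. A real analysis would need a baseline that accounts for the sample's base composition.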

    The computer revolution in science: steps towards the realization of computer-supported discovery environments

    The tools that scientists use in their search processes together form so-called discovery environments. The promise of artificial intelligence and other branches of computer science is to radically transform conventional discovery environments by equipping scientists with a range of powerful computer tools, including large-scale shared knowledge bases and discovery programs. We describe the future computer-supported discovery environments that may result, and illustrate by means of a realistic scenario how scientists arrive at new discoveries in these environments. To make the step from the current generation of discovery tools to computer-supported discovery environments like the one presented in the scenario, developers should realize that such environments are large-scale sociotechnical systems. They should not focus only on isolated computer programs, but should also pay attention to how these programs will be used and maintained by scientists in research practice. To help developers of discovery programs integrate their tools into discovery environments, we formulate a set of guidelines that developers could follow.

    Special Libraries, July 1978

    Volume 69, Issue 7

    Physico-chemical foundations underpinning microarray and next-generation sequencing experiments

    Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen, Germany, in 2011 to discuss present knowledge and the limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of state-of-the-art, physico-chemically founded approaches to modeling the hybridization of nucleic acids on solid surfaces. In addition, practical application of current knowledge is emphasized.