32 research outputs found

    ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS)

    Abstract

    Background: Genome survey sequences (GSS) offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimate of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Eliminating repetitive sequences from the initial assembly process is also important to avoid errors and unnecessary complexity. Repetitive sequences are of interest in a variety of other studies as well, for instance as molecular markers.

    Results: We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniosis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454 reads from Escherichia coli. The behaviour of several parameters in the algorithm is evaluated, and suggestions are made for tuning the analysis.

    Conclusion: The ReRep approach for identifying repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in an L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties of the dataset, in particular to the length of sequencing reads and the genome coverage. ReRep is freely available for academic use at http://bioinfo.pdtis.fiocruz.br/ReRep/.

    Development and analysis of the Soil Water Infiltration Global database

    In this paper, we present and analyze a novel global database of soil infiltration measurements, the Soil Water Infiltration Global (SWIG) database. In total, 5023 infiltration curves were collected across all continents in the SWIG database. These data were either provided and quality checked by the scientists who performed the experiments or digitized from published articles. Data from 54 different countries were included in the database, with major contributions from Iran, China, and the USA. In addition to its extensive geographical coverage, the collected infiltration curves cover research from 1976 to late 2017. Basic information on measurement location and method, soil properties, and land use was gathered along with the infiltration data, making the database valuable for the development of pedotransfer functions (PTFs) for estimating soil hydraulic properties, for the evaluation of infiltration measurement methods, and for developing and validating infiltration models. Soil textural information (clay, silt, and sand content) is available for 3842 out of 5023 infiltration measurements (~76%), covering nearly all USDA soil textural classes except for the sandy clay and silt classes. Information on land use is available for 76% of the experimental sites, with agricultural land use as the dominant type (~40%). We are convinced that the SWIG database will allow for a better parameterization of the infiltration process in land surface models and for testing infiltration models. All collected data and related soil characteristics are provided online in *.xlsx and *.csv formats for reference, and we add a disclaimer that the database is for public domain use only and can be copied freely by referencing it. Supplementary data are available at https://doi.org/10.1594/PANGAEA.885492 (Rahmati et al., 2018). Data quality assessment is strongly advised prior to any use of this database. Finally, we would like to encourage scientists to extend and update the SWIG database by uploading new data to it.

    Modern drama

    xxvi, 491 p. ; 21 cm

    Classic through modern drama: An introductory anthology
