52 research outputs found

    Simultaneous identification of long similar substrings in large sets of sequences

    Abstract. Background: Sequence comparison faces new challenges today, now that many complete genomes and large libraries of transcripts are known. Gene annotation pipelines match these sequences in order to identify genes and their alternative splice forms. However, the software currently available cannot simultaneously compare sets of sequences as large as necessary, especially if errors must be considered. Results: We therefore present a new algorithm for the identification of almost perfectly matching substrings in very large sets of sequences. Its implementation, called ClustDB, is considerably faster and can handle 16 times more data than VMATCH, the most memory-efficient exact program known today. ClustDB simultaneously generates large sets of exactly matching substrings of a given minimum length as seeds for a novel method of match extension with errors. It generates alignments of maximum length with at most a given maximum number of errors within each overlapping window of a given size. Such alignments are not optimal in the usual sense but are faster to calculate and often more appropriate than traditional alignments for genomic sequence comparisons, EST and full-length cDNA matching, and genomic sequence assembly. The method is used to check the overlaps and to reveal possible assembly errors for 1377 Medicago truncatula BAC-size sequences published at http://www.medicago.org/genome/assembly_table.php?chr=1. Conclusion: The program ClustDB shows that window alignment is an efficient way to find long sequence sections of homogeneous alignment quality, as expected in the case of random errors, and to detect systematic errors resulting from sequence contaminations. Such inserts are systematically overlooked in long alignments controlled only by tuning penalties for mismatches and gaps. ClustDB is freely available for academic use.
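The window criterion described in the abstract above can be illustrated with a minimal sketch. This is not ClustDB's implementation: the function name `extend_right`, its parameters, and the restriction to gap-free extension (mismatches only) are assumptions made for illustration. It extends a match to the right for as long as every window of `window` consecutive aligned positions contains at most `max_err` mismatches.

```python
from collections import deque


def extend_right(a, b, i, j, window=40, max_err=10):
    """Gap-free extension of a match starting at a[i] and b[j].

    Hypothetical helper, not part of ClustDB. Extends to the right and
    stops before the first position that would push the mismatch count
    of the current sliding window above max_err.
    Returns (aligned_length, total_mismatches).
    """
    recent = deque()   # mismatch flags of the last `window` positions
    in_window = 0      # mismatches currently inside the sliding window
    total = 0          # mismatches over the whole extension
    length = 0
    while i + length < len(a) and j + length < len(b):
        miss = a[i + length] != b[j + length]
        if len(recent) == window:
            # Slide the window: the oldest position drops out.
            in_window -= recent.popleft()
        if in_window + miss > max_err:
            break
        recent.append(miss)
        in_window += miss
        total += miss
        length += 1
    return length, total


if __name__ == "__main__":
    # One isolated mismatch: the extension runs to the end of the strings.
    print(extend_right("ACGTACGTAC", "ACGTTCGTAC", 0, 0, window=4, max_err=1))
    # Two adjacent mismatches exceed the limit, so the extension stops early.
    print(extend_right("AAAAAAAA", "AATTAAAA", 0, 0, window=4, max_err=1))
```

The sketch does not trim a trailing mismatch or introduce gaps; a real extension step would also run to the left of the seed and handle insertions and deletions.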

    The Influence of Behavioral, Social, and Environmental Factors on Reproducibility and Replicability in Aquatic Animal Models

    The publication of reproducible, replicable, and translatable data in studies utilizing animal models is a scientific, practical, and ethical necessity. This requires careful planning and execution of experiments and accurate reporting of results. Recognition that numerous developmental, environmental, and test-related factors can affect experimental outcomes is essential for a quality study design. Factors commonly considered when designing studies utilizing aquatic animal species include strain, sex, or age of the animal; water quality; temperature; and acoustic and light conditions. However, in the aquatic environment, it is equally important to consider normal species behavior, group dynamics, stocking density, and environmental complexity, including tank design and structural enrichment. Here, we outline the normal species and social behavior of two commonly used aquatic species: zebrafish (Danio rerio) and Xenopus (X. laevis and X. tropicalis). We also provide examples of how these behaviors and the complexity of the tank environment can influence research results, and we give general recommendations for improving reproducibility and replicability, particularly with regard to behavior and environmental complexity, when utilizing these popular aquatic models. © The Author(s) 2020. Published by Oxford University Press on behalf of the National Academies of Sciences, Engineering, and Medicine. All rights reserved. A.V.K.'s research was supported by the Russian Science Foundation grant 19-15-00053. He is the Chair of the International Zebrafish Neuroscience Research Consortium (ZNRC). This collaboration was supported, in part, through the NIH/NCI Cancer Center Support Grant P30 CA008748. The authors would like to thank Gregory Paull for sharing his photographs and insight into the natural habitat of zebrafish in Bangladesh

    Simultaneous identification of long similar substrings in large sets of sequences-0

    Copyright information: Taken from "Simultaneous identification of long similar substrings in large sets of sequences", http://www.biomedcentral.com/1471-2105/8/S5/S7. BMC Bioinformatics 2007;8(Suppl 5):S7-S7. Published online 24 May 2007. PMCID: PMC1892095.
    (len2 = 298559 bp). The alignment has length aln = 105638 including 20 gaps in AC148340 and 9 gaps in AC148483. Its displayed part shows five clusters of mismatches surrounded by long perfect matches so that the number of errors does not exceed 10 in each window of size 40. Therefore, ClustDB does not look for improvements by introducing gaps and reports 69 errors, i.e. 16 errors more than the exact alignment computes. Note that each cluster of mismatches can be realigned with two gaps

    Simultaneous identification of long similar substrings in large sets of sequences-3

    and Sequence 2 are matching parts of the concatenated long sequence S that is stored in memory. The sliding window of length 10 is marked in grey

    Simultaneous identification of long similar substrings in large sets of sequences-1

    w complexity subsequence section B. Removing A from AC146774 yields a perfect alignment with one mismatch

    Ein Beitrag zur Lehre vom Myxödem (A contribution to the study of myxedema)


    Visibility of flat line and structured road markings for machine vision

    Road markings are equally needed for human drivers and for machine vision equipment; their visibility demands high contrast ratio. Particularly difficult is achieving visibility under the conditions of wetness at night and in the presence of glare from an oncoming vehicle. A field evaluation of visibility by camera and by LiDAR was done on two types of road markings that were applied as pedestrian crossing: flat lines with typical retroreflectivity and structured lines with high retroreflectivity – thus, extreme cases were assessed. The measured camera contrast ratio dropped meaningfully in the presence of moisture in case of flat line markings, but remained high in case of structured markings that facilitated moisture drainage. Glare was detrimental to visibility, bringing it to almost naught regardless of the type of road markings. Simultaneous evaluation with LiDAR showed profound differences under the conditions of moisture: while the response from flat line markings dropped to nil (recovery time >20 s after wetting), the structured markings continuously provided meaningful response. This outcome proves that for reliable guiding by machine vision, as well as for human drivers, structured road markings that facilitate water drainage should be used. For dependable steering by machine vision equipment under adverse conditions, a combination of LiDAR and camera is seen as necessary