104 research outputs found

    PatMaN: rapid alignment of short sequences to large databases

    Summary: We present a tool for searching large databases for many short nucleotide sequences, allowing a predefined number of gaps and mismatches. The command-line-driven program implements a non-deterministic automata matching algorithm on a keyword tree of the search strings. Queries both with and without ambiguity codes can be searched. Search time is short for perfect matches, and retrieval time rises exponentially with the number of edits allowed.
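
    The keyword-tree idea in this summary can be illustrated with a minimal sketch. It is not PatMaN's implementation: it assumes mismatches only (no gaps, no ambiguity codes), and the names `TrieNode`, `build_trie` and `search` are hypothetical. All queries are stored in one trie, and for each start position in the text a set of (node, mismatches) states is advanced in parallel, which mimics a non-deterministic automaton.

```python
from typing import Dict, List, Tuple


class TrieNode:
    """One node of the keyword tree (hypothetical helper, not PatMaN code)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.patterns: List[str] = []  # queries that end at this node


def build_trie(patterns: List[str]) -> TrieNode:
    """Insert every query string into a shared keyword tree."""
    root = TrieNode()
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.children.setdefault(ch, TrieNode())
        node.patterns.append(pattern)
    return root


def search(text: str, patterns: List[str], max_mismatches: int) -> List[Tuple[int, str, int]]:
    """Return sorted (start, pattern, mismatches) hits; substitutions only."""
    root = build_trie(patterns)
    hits: List[Tuple[int, str, int]] = []
    for start in range(len(text)):
        states = [(root, 0)]  # active (node, mismatches so far) pairs
        pos = start
        while states and pos < len(text):
            ch = text[pos]
            next_states = []
            for node, mismatches in states:
                # Follow every edge; a differing label costs one mismatch.
                for edge, child in node.children.items():
                    cost = mismatches + (edge != ch)
                    if cost <= max_mismatches:
                        next_states.append((child, cost))
            for node, mismatches in next_states:
                for pattern in node.patterns:
                    hits.append((start, pattern, mismatches))
            states = next_states
            pos += 1
    return sorted(hits)


exact = search("ACGTACGA", ["ACGT", "CGA"], 0)
fuzzy = search("ACGTACGA", ["ACGT", "CGA"], 1)
```

    With zero mismatches only one path per start position survives, so the search is fast; each additional allowed edit multiplies the number of live states, which is consistent with the exponential growth in retrieval time that the summary reports.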

    FUNC: a package for detecting significant associations between gene sets and ontological annotations

    BACKGROUND: Genome-wide expression, sequence and association studies typically yield large sets of gene candidates, which must then be further analysed and interpreted. Information about these genes is increasingly being captured and organized in ontologies, such as the Gene Ontology. Relationships between the gene sets identified by experimental methods and biological knowledge can be made explicit and used in the interpretation of results. However, it is often difficult to assess the statistical significance of such analyses since many inter-dependent categories are tested simultaneously. RESULTS: We developed the program package FUNC, which includes and expands on currently available methods to identify significant associations between gene sets and ontological annotations. It implements several tests that are particularly well suited to genome-wide sequence comparisons, estimates of the family-wise error rate and the false discovery rate, a sensitive estimator of the global significance of the results, and an algorithm to reduce the complexity of the results. CONCLUSION: FUNC is a versatile and useful tool for the analysis of genome-wide data. It is freely available under the GPL and also accessible via a web service.
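
    The multiple-testing problem described in this abstract can be made concrete with a small sketch. It is not FUNC's code, and the abstract does not say which tests the package uses: the example assumes a one-sided hypergeometric over-representation test per category followed by a Benjamini-Hochberg adjustment, a common way to estimate the false discovery rate. The function names `hypergeom_sf` and `benjamini_hochberg` are hypothetical. Note that Benjamini-Hochberg assumes independent or positively dependent tests, which is exactly what inter-dependent ontology categories put in question.

```python
from math import comb
from typing import List


def hypergeom_sf(k: int, population: int, annotated: int, drawn: int) -> float:
    """P(X >= k) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` belong to the category."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(max(k, 0), upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 10 genes in total, 3 annotated to a category, 4 candidates,
# all 3 annotated genes among the candidates.
p_category = hypergeom_sf(3, 10, 3, 4)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

    In the toy example the single-category p-value is 7/210, about 0.033, and the three raw p-values adjust to 0.03, 0.04 and 0.04.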

    BOWiki: an ontology-based wiki for annotation of data and integration of knowledge in biology.

    MOTIVATION: Ontology development and the annotation of biological data using ontologies are time-consuming exercises that currently require input from expert curators. Open, collaborative platforms for biological data annotation enable the wider scientific community to become involved in developing and maintaining such resources. However, this openness raises concerns regarding the quality and correctness of the information added to these knowledge bases. The combination of a collaborative web-based platform with logic-based approaches and Semantic Web technology can be used to address some of these challenges and concerns. RESULTS: We have developed the BOWiki, a web-based system that includes a biological core ontology. The core ontology provides background knowledge about biological types and relations. Against this background, an automated reasoner assesses the consistency of new information added to the knowledge base. The system provides a platform for research communities to integrate information and annotate data collaboratively. AVAILABILITY: The BOWiki and supplementary material is available at . The source code is available under the GNU GPL from

    High-coverage genome of the Tyrolean Iceman reveals unusually high Anatolian farmer ancestry

    The Tyrolean Iceman is known as one of the oldest human glacier mummies, directly dated to 3350-3120 calibrated BCE. A previously published low-coverage genome provided novel insights into European prehistory, despite high present-day DNA contamination. Here, we generate a high-coverage genome (15.3×) with low contamination to gain further insights into the genetic history and phenotype of this individual. Contrary to previous studies, we found no detectable Steppe-related ancestry in the Iceman. Instead, he retained the highest Anatolian-farmer-related ancestry among contemporaneous European populations, indicating a rather isolated Alpine population with limited gene flow from hunter-gatherer-ancestry-related populations. Phenotypic analysis revealed that the Iceman likely had darker skin than present-day Europeans and carried risk alleles associated with male-pattern baldness, type 2 diabetes, and obesity-related metabolic syndrome. These results corroborate phenotypic observations of the preserved mummified body, such as high pigmentation of his skin and the absence of hair on his head.

    The Neandertal genome and ancient DNA authenticity

    Recent advances in high-throughput DNA sequencing have made genome-scale analyses of genomes of extinct organisms possible. With these new opportunities come new difficulties in assessing the authenticity of the DNA sequences retrieved. We discuss how these difficulties can be addressed, particularly with regard to analyses of the Neandertal genome. We argue that only direct assays of DNA sequence positions in which Neandertals differ from all contemporary humans can serve as a reliable means to estimate human contamination. Indirect measures, such as the extent of DNA fragmentation, nucleotide misincorporations, or comparison of derived allele frequencies in different fragment size classes, are unreliable. Fortunately, interim approaches based on mtDNA differences between Neandertals and current humans, detection of male contamination through Y chromosomal sequences, and repeated sequencing from the same fossil to detect autosomal contamination allow initial large-scale sequencing of Neandertal genomes. This will result in the discovery of fixed differences in the nuclear genome between Neandertals and current humans that can serve as future direct assays for contamination. For analyses of other fossil hominins, which may become possible in the future, we suggest a similar ‘boot-strap’ approach in which interim approaches are applied until sufficient data for more definitive direct assays are acquired.

    Targeted high-throughput sequencing of tagged nucleic acid samples

    High-throughput 454 DNA sequencing technology allows much faster and more cost-effective sequencing than traditional Sanger sequencing. However, the technology imposes inherent limitations on the number of samples that can be processed in parallel. Here we introduce parallel tagged sequencing (PTS), a simple, inexpensive and flexible barcoding technique that can be used for parallel sequencing of any number and type of double-stranded nucleic acid samples. We demonstrate that PTS is particularly powerful for sequencing contiguous DNA fragments such as mtDNA genomes: in theory as many as 250 mammalian mtDNA genomes can be sequenced in a single GS FLX run. PTS dramatically increases the sequencing throughput of samples in parallel and thus fully mobilizes the resources of the 454 technology for targeted sequencing.

    Transcription Factors Are Targeted by Differentially Expressed miRNAs in Primates

    MicroRNAs (miRNAs) are small RNA molecules involved in the regulation of mammalian gene expression. Together with other transcription regulators, miRNAs modulate the expression of genes and thereby potentially contribute to tissue and species diversity. To identify miRNAs that are differentially expressed between tissues and/or species, and the genes regulated by these, we have quantified expression of miRNAs and messenger RNAs in five tissues from multiple human, chimpanzee, and rhesus macaque individuals using high-throughput sequencing. The breadth of this tissue and species data allows us to show that downregulation of target genes by miRNAs is more pronounced between tissues than between species and that downregulation is more pronounced for genes with fewer binding sites for expressed miRNAs. Intriguingly, we find that tissue- and species-specific miRNAs target transcription factor genes (TFs) significantly more often than expected. Through their regulatory effect on transcription factors, miRNAs may therefore exert an indirect influence on a larger proportion of genes than previously thought.

    Cases of trisomy 21 and trisomy 18 among historic and prehistoric individuals discovered from ancient DNA

    Aneuploidies, and in particular trisomies, represent the most common genetic aberrations observed in human genetics today. To explore the presence of trisomies in historic and prehistoric populations, we screen nearly 10,000 ancient human individuals for the presence of three copies of any of the target autosomes. We find clear genetic evidence for six cases of trisomy 21 (Down syndrome) and one case of trisomy 18 (Edwards syndrome), and all cases are present in infant or perinatal burials. We perform comparative osteological examinations of the skeletal remains and find overlapping skeletal markers, many of which are consistent with these syndromes. Interestingly, three cases of trisomy 21 and the case of trisomy 18 were detected in two contemporaneous sites in early Iron Age Spain (800-400 BCE), potentially suggesting a higher frequency of burials of trisomy carriers in those societies. Notably, the care with which the burials were conducted and the items found with these individuals indicate that ancient societies likely acknowledged these individuals with trisomy 18 and 21 as members of their communities, from the perspective of burial practice.