    Characterization of Nucleotide Misincorporation Patterns in the Iceman's Mitochondrial DNA

    BACKGROUND: The degradation of DNA is one of the main obstacles in the genetic analysis of archeological specimens. In recent years, a particular kind of post-mortem DNA modification that gives rise to nucleotide misincorporation ("miscoding lesions") has been investigated extensively. METHODOLOGY/PRINCIPAL FINDINGS: To improve our knowledge of the nature and incidence of ancient DNA nucleotide misincorporations, we used 6,859 (629,975 bp) mitochondrial (mt) DNA sequences obtained from the 5,350-5,100-year-old, freeze-desiccated human mummy popularly known as the Tyrolean Iceman or Ötzi. To generate the sequences, we applied a mixed PCR/pyrosequencing procedure that yields particularly high sequence coverage. As a control, we produced a further 8,982 (805,155 bp) mtDNA sequences from a contemporary specimen using the same system and starting from the same template copy number as the ancient sample. From the analysis of the nucleotide misincorporation rate in ancient, modern, and putative contaminant sequences, we observed that the rate of misincorporation is significantly lower in the modern and putative contaminant sequence datasets than in the ancient sequences. In the ancient sequences, type 2 transitions represent the vast majority (85%) of the observed nucleotide misincorporations. CONCLUSIONS/SIGNIFICANCE: This study contributes further to the knowledge of nucleotide misincorporation patterns in DNA sequences obtained from freeze-preserved archeological specimens. In the Iceman system, ancient sequences can be clearly distinguished from contaminants on the basis of nucleotide misincorporation rates. This observation confirms a previous identification of the ancient mummy sequences made on a purely phylogenetic basis. The present investigation provides further indication that the majority of ancient DNA damage is reflected by type 2 (cytosine→thymine/guanine→adenine) transitions and that type 1 transitions are essentially PCR artifacts.
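The distinction between type 1 and type 2 transitions in the abstract above can be illustrated with a minimal Python sketch. It assumes two equal-length, gap-free aligned strings and takes type 1 to mean A→G/T→C, the usual convention in the ancient-DNA literature (the abstract itself spells out only type 2). The function names and the toy sequences are hypothetical and are not taken from the study.

```python
from collections import Counter

# Assumed convention: type 1 = A->G / T->C, type 2 = C->T / G->A.
TYPE1 = {("A", "G"), ("T", "C")}
TYPE2 = {("C", "T"), ("G", "A")}


def classify(ref_base, read_base):
    """Classify one aligned base pair (reference base, read base)."""
    pair = (ref_base.upper(), read_base.upper())
    if pair[0] == pair[1]:
        return "match"
    if pair in TYPE2:
        return "type2_transition"
    if pair in TYPE1:
        return "type1_transition"
    return "transversion"


def misincorporation_profile(reference, read):
    """Count mismatch classes over two equal-length, gap-free strings.

    Positions where either base is not A, C, G or T are skipped.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    return Counter(
        classify(r, q)
        for r, q in zip(reference, read)
        if r.upper() in "ACGT" and q.upper() in "ACGT"
    )


def misincorporation_rate(counts):
    """Fraction of compared positions that are mismatches."""
    total = sum(counts.values())
    return (total - counts["match"]) / total if total else 0.0


# Toy data, invented for illustration only.
profile = misincorporation_profile("ACGTACGTCA", "ATGTACATCG")
print(dict(profile), misincorporation_rate(profile))
```

Comparing such per-read rates between datasets is the kind of test the abstract describes for separating ancient sequences from contaminants. A real analysis would work from proper alignments and account for strand and sequencing error.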

    Computational pan-genomics: Status, promises and challenges

    Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations

    FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation

    BACKGROUND: Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. For querying biological annotations with Semantic Web technologies, there was no standard that described this potentially complex location information as subject-predicate-object triples. DESCRIPTION: We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions, independent of file formats, allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. CONCLUSIONS: Our ontology allows users to uniformly describe, and potentially merge, sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federated SPARQL queries against public SPARQL endpoints and/or local private triple stores.
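To make the idea of a feature location expressed as subject-predicate-object triples concrete, here is a minimal Python sketch that uses no RDF library. The FALDO terms (`Region`, `ExactPosition`, `begin`, `end`, `position`, `reference`, `location`, and the strand classes) are used as I understand the published ontology and should be checked against it. The `http://example.org/` URIs, the coordinates, and both function names are hypothetical.

```python
FALDO = "http://biohackathon.org/resource/faldo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
EX = "http://example.org/"  # hypothetical namespace for illustration


def region_triples(feature, reference, begin, end, forward=True):
    """Describe a feature's location as a region with exact begin/end positions."""
    region = feature + "/region"
    strand = FALDO + ("ForwardStrandPosition" if forward else "ReverseStrandPosition")
    triples = [
        (feature, FALDO + "location", region),
        (region, RDF_TYPE, FALDO + "Region"),
    ]
    for name, pos in (("begin", begin), ("end", end)):
        node = f"{region}/{name}"
        triples += [
            (region, FALDO + name, node),
            (node, RDF_TYPE, FALDO + "ExactPosition"),
            (node, RDF_TYPE, strand),
            (node, FALDO + "position", pos),       # integer literal
            (node, FALDO + "reference", reference),
        ]
    return triples


def to_ntriples(triples):
    """Serialize the triples as N-Triples text; ints become xsd:integer literals."""
    lines = []
    for s, p, o in triples:
        obj = f'"{o}"^^<{XSD_INT}>' if isinstance(o, int) else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = region_triples(EX + "feature/gene1", EX + "sequence/chr1", 1000, 2500)
print(to_ntriples(triples))
```

Because every position carries its own reference sequence and strand, the same pattern can describe features from different sources in one triple store, which is the integration benefit the abstract points to.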

    The Ruby UCSC API: accessing the UCSC genome database using Ruby

    Background: The University of California, Santa Cruz (UCSC) genome database is among the most widely used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. However, a simple application programming interface (API) in a scripting language aimed at the biologist was not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. Results: The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Conclusions: Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will help biologists query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help are provided via the website at http://rubyucscapi.userecho.com/
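The bin index mentioned in the abstract above can be sketched in Python. This follows the standard UCSC binning scheme as I understand it (five levels, with the smallest bins spanning 128 kb and each higher level 8 times larger) and is not the Ruby UCSC API's own code. The function names are hypothetical, coordinates are assumed to be zero-based and half-open, and the sketch covers only positions below 512 Mb.

```python
# Offsets of the first bin at each level, from smallest bins to largest.
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)  # 585, 73, 9, 1, 0
FIRST_SHIFT = 17   # smallest bins span 2**17 bases (128 kb)
NEXT_SHIFT = 3     # each level up is 2**3 = 8 times larger
MAX_END = 1 << 29  # the standard scheme covers the first 512 Mb


def _check(start, end):
    if not 0 <= start < end <= MAX_END:
        raise ValueError("need 0 <= start < end <= 2**29")


def bin_from_range(start, end):
    """Smallest bin that fully contains the interval [start, end)."""
    _check(start, end)
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("interval does not fit in any bin")


def overlapping_bins(start, end):
    """All bins, at every level, that could hold features overlapping [start, end)."""
    _check(start, end)
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    bins = []
    for offset in BIN_OFFSETS:
        bins.extend(range(offset + start_bin, offset + end_bin + 1))
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    return bins


print(bin_from_range(100_000, 200_000))
print(overlapping_bins(0, 1000))
```

A range query then restricts the search to rows whose bin is in `overlapping_bins(start, end)` before comparing coordinates, so the database can use an index on the bin column instead of scanning every feature on the chromosome.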

    Whole-Genome Pyrosequencing of an Epidemic Multidrug-Resistant Acinetobacter baumannii Strain Belonging to the European Clone II Group

    The whole-genome sequence of an epidemic, multidrug-resistant Acinetobacter baumannii strain (strain ACICU) belonging to the European clone II group and carrying the plasmid-mediated blaOXA-58 carbapenem resistance gene was determined. The A. baumannii ACICU genome was compared with the genomes of A. baumannii ATCC 17978 and Acinetobacter baylyi ADP1, with the aim of identifying novel genes related to virulence and drug resistance. A. baumannii ACICU has a single chromosome of 3,904,116 bp (which is predicted to contain 3,758 genes) and two plasmids, pACICU1 and pACICU2, of 28,279 and 64,366 bp, respectively. Genome comparison showed 86.4% synteny with A. baumannii ATCC 17978 and 14.8% synteny with A. baylyi ADP1. A conspicuous number of transporters belonging to different superfamilies was predicted for A. baumannii ACICU. The relative number of transporters was much higher in ACICU than in ATCC 17978 and ADP1 (76.2, 57.2, and 62.5 transporters per Mb of genome, respectively). An antibiotic resistance island, AbaR2, was identified in ACICU and had plausibly evolved by reductive evolution from the AbaR1 island previously described in multiresistant strain A. baumannii AYE. Moreover, 36 putative alien islands (pAs) were detected in the ACICU genome; 24 of these had previously been described in the ATCC 17978 genome, 4 are proposed here for the first time and are present in both ATCC 17978 and ACICU, and 8 are unique to the ACICU genome. Fifteen of the pAs in the ACICU genome encode genes related to drug resistance, including membrane transporters and ex novo acquired resistance genes. These findings provide novel insight into the genetic basis of A. baumannii resistance.

    Single cell-derived spheroids capture the self-renewing subpopulations of metastatic ovarian cancer

    High-grade serous ovarian cancer (HGSOC) is a major unmet need in oncology, due to its precocious dissemination and the lack of meaningful human models for the investigation of disease pathogenesis in a patient-specific manner. To overcome this roadblock, we present a new method to isolate and grow single cells directly from patients' metastatic ascites, establishing the conditions for propagating them as 3D cultures that we refer to as single cell-derived metastatic ovarian cancer spheroids (sMOCS). Using single-cell RNA sequencing (scRNAseq), we define the cellular composition of metastatic ascites and trace its propagation in 2D and 3D culture paradigms, finding that sMOCS retain and amplify key subpopulations from the original patients' samples and recapitulate features of the original metastasis that do not emerge from classical 2D culture, including retention of individual patients' specificities. By enabling the enrichment of uniquely informative cell subpopulations from HGSOC metastasis and the clonal interrogation of their diversity at the functional and molecular level, this method provides a powerful instrument for precision oncology in ovarian cancer.