42 research outputs found

    biobambam: tools for read pair collation based algorithms on BAM files

    Sequence alignment data is often ordered by coordinate (ID of the reference sequence plus the position on that sequence where the fragment was mapped) when stored in BAM files, as this simplifies the extraction of variants between the mapped data and the reference or of variants within the mapped data. In this order, paired reads are usually separated in the file, which complicates other applications such as duplicate marking or conversion to the FastQ format, which require access to the full information of the pairs. In this paper we introduce biobambam, an API for efficient BAM file reading that supports the efficient collation of alignments by read name without a complete resorting of the input file, together with some tools based on this API that perform tasks such as marking duplicate reads and converting to the FastQ format. Compared with previous approaches to problems involving the collation of alignments by read name, such as the BAM-to-FastQ or duplicate marking utilities in the Picard suite, biobambam can often perform an equivalent task more efficiently in terms of required main memory and run-time. Comment: 17 pages, 3 figures, 2 tables
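
The idea of collating alignments by read name without resorting the whole file can be illustrated with a small sketch. This is not biobambam's API or implementation, which the abstract does not show; the `Alignment` record and `collate_pairs` function are hypothetical names, and the sketch assumes every read name occurs exactly twice (one record per mate). It streams through coordinate-ordered records once and keeps only the reads whose mate has not yet been seen.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class Alignment(NamedTuple):
    """Hypothetical minimal alignment record (not a real BAM record)."""
    name: str       # read name shared by both mates
    is_first: bool  # True for the first mate of the pair
    ref_id: int     # id of the reference sequence
    pos: int        # mapping position on that reference


def collate_pairs(alignments: Iterable[Alignment]) -> Iterator[Tuple[Alignment, Alignment]]:
    """Yield (first mate, second mate) pairs from a single pass over the input."""
    pending: Dict[str, Alignment] = {}  # reads still waiting for their mate
    for aln in alignments:
        mate = pending.pop(aln.name, None)
        if mate is None:
            pending[aln.name] = aln      # mate not seen yet: remember this read
        elif aln.is_first:
            yield aln, mate
        else:
            yield mate, aln


# Coordinate-ordered input in which the mates of each pair are separated.
reads = [
    Alignment("r1", True, 0, 100),
    Alignment("r2", True, 0, 150),
    Alignment("r1", False, 0, 300),
    Alignment("r2", False, 0, 420),
]
pairs = list(collate_pairs(reads))
```

The dictionary here grows with the number of unmatched reads, so a practical tool would need some way to bound that memory; the sketch leaves this out.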

    A new Loureedia species on overgrazed former cork oak forest in Morocco (Araneae: Eresidae)

    In this paper a new velvet spider species from Morocco is described from an overgrazed former cork oak [Quercus suber (Linné 1753)] forest. It is the second known species of the hitherto monotypic genus Loureedia. Loureedia maroccana sp. n. is distinguished from L. annulipes (Lucas, 1857) by the morphology of the conductor, the anteriorly widening cephalic region of the prosoma and opisthosoma decorated with a lobed, bright red marking on the dorsal side. Furthermore, three partial gene fragment sequences (histone 3, 28S ribosomal and cytochrome c oxidase) are also given, supporting the establishment of the new species

    Challenges and opportunities in understanding microbial communities with metagenome assembly (accompanied by IPython Notebook tutorial)

    Metagenomic investigations hold great promise for informing the genetics, physiology, and ecology of environmental microorganisms. Current challenges for metagenomic analysis are related to our ability to connect the dots between sequencing reads, their population of origin, and their encoded functions. Assembly-based methods reduce dataset size by extending overlapping reads into larger contiguous sequences (contigs), providing contextual information for genetic sequences that does not rely on existing references. These methods, however, tend to be computationally intensive and are again challenged by sequencing errors as well as by genomic repeats. While numerous tools have been developed based on these methodological concepts, they present confounding choices and training requirements to metagenomic investigators. To help with accessibility to assembly tools, this review also includes an IPython Notebook metagenomic assembly tutorial. This tutorial has instructions for execution on any operating system using Amazon Elastic Compute Cloud and guides users through downloading, assembly, and mapping reads to contigs of a mock microbiome metagenome. Despite its challenges, metagenomic analysis has already revealed novel insights into many environments on Earth. As software, training, and data continue to emerge, metagenomic data access and its discoveries will grow
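
The notion of extending overlapping reads into contigs can be shown with a toy greedy merge. This is only a sketch for illustration and is not taken from the review or its tutorial: the function names and the `min_len` threshold are assumptions, and it ignores sequencing errors, reverse complements, and repeats, which are the difficulties the abstract mentions.

```python
from typing import List


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of sequences with the largest suffix-prefix overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep input order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining overlaps of at least min_len
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three error-free reads that tile a 13-base sequence.
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
contigs = greedy_assemble(reads)
```

The all-pairs comparison is quadratic in the number of sequences per merge, which hints at why assembly of real datasets is computationally intensive.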

    Comparative genomics of Burkholderia multivorans, a ubiquitous pathogen with a highly conserved genomic structure

    The natural environment serves as a reservoir of opportunistic pathogens. A well-established method for studying the epidemiology of such opportunists is multilocus sequence typing, which in many cases has defined strains predisposed to causing infection. Burkholderia multivorans is an important pathogen in people with cystic fibrosis (CF) and its epidemiology suggests that strains are acquired from non-human sources such as the natural environment. This raises the central question of whether the isolation source (CF or environment) or the multilocus sequence type (ST) of B. multivorans better predicts their genomic content and functionality. We identified four pairs of B. multivorans isolates, representing distinct STs and consisting of one CF and one environmental isolate each. All genomes were sequenced using the PacBio SMRT sequencing technology, which resulted in eight high-quality B. multivorans genome assemblies. The present study demonstrated that the genomic structure of the examined B. multivorans STs is highly conserved and that the B. multivorans genomic lineages are defined by their ST. Orthologous protein families were not uniformly distributed among chromosomes, with core orthologs being enriched on the primary chromosome and ST-specific orthologs being enriched on the second and third chromosome. The ST-specific orthologs were enriched in genes involved in defense mechanisms and secondary metabolism, corroborating the strain-specificity of these virulence characteristics. Finally, the same B. multivorans genomic lineages occur in both CF and environmental samples and on different continents, demonstrating their ubiquity and evolutionary persistence

    Mutational analysis of the antitoxin in the Lactococcal type III toxin-antitoxin system AbiQ

    The lactococcal abortive phage infection mechanism AbiQ recently was classified as a type III toxin-antitoxin system in which the toxic protein (ABIQ) is regulated following cleavage of its repeated noncoding RNA antitoxin (antiQ). In this study, we investigated the role of the antitoxin in antiphage activity. The cleavage of antiQ by ABIQ was characterized using 5' rapid amplification of cDNA ends PCR and was located in an adenine-rich region of antiQ. We next generated a series of derivatives with point mutations within antiQ or with various numbers of antiQ repetitions. These modifications were analyzed for their effect on the antiphage activity (efficiency of plaquing) and on the endoribonuclease activity (Northern hybridization). We observed that increasing or reducing the number of antiQ repeats significantly decreased the antiphage activity of the system. Several point mutations had a similar effect on the antiphage activity and were associated with changes in the digestion profile of antiQ. Interestingly, a point mutation in the putative pseudoknot structure of antiQ mutants led to an increased AbiQ antiphage activity, thereby offering a novel way to increase the activity of an abortive infection mechanism