32 research outputs found

    Ergatis: a web interface and scalable software system for bioinformatics workflows

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users.

    CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

    Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing. https://doi.org/10.1186/1471-2105-12-35

    Draft genome sequence of Lactobacillus rhamnosus CRL1505, an immunobiotic strain used in social food programs in Argentina

    We report the draft genome sequence of the probiotic Lactobacillus rhamnosus strain CRL1505. This new probiotic strain has been included in official Nutritional Programs in Argentina. The draft genome sequence is composed of 3,417,633 bp with 3,327 coding sequences.
    Affiliation: Taranto, Maria Pia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina
    Affiliation: Villena, Julio Cesar. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina
    Affiliation: Salva, Maria Susana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina
    Affiliation: Alvarez, Gladis Susana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina; Universidad Nacional de Tucuman. Facultad de Bioquimica, Quimica y Farmacia. Instituto de Bioquimica Clinica Aplicada; Argentina
    Affiliation: Savoy, Graciela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina; Universidad Nacional de Tucumán. Facultad de Bioquímica, Química y Farmacia; Argentina
    Affiliation: Font, Graciela Maria. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina; Universidad Nacional de Tucumán. Facultad de Bioquímica, Química y Farmacia; Argentina
    Affiliation: Hebert, Elvira Maria. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina

    METHODS FOR HIGH-THROUGHPUT COMPARATIVE GENOMICS AND DISTRIBUTED SEQUENCE ANALYSIS

    High-throughput sequencing has accelerated applications of genomics throughout the world. The increased production and decentralization of sequencing has also created bottlenecks in computational analysis. In this dissertation, I provide novel computational methods to improve analysis throughput in three areas: whole genome multiple alignment, pan-genome annotation, and bioinformatics workflows. To aid in the study of populations, tools are needed that can quickly compare multiple genome sequences that are millions of nucleotides in length. I present a new multiple alignment tool for whole genomes, named Mugsy, that implements a novel method for identifying syntenic regions. Mugsy is computationally efficient, does not require a reference genome, and is robust in identifying a rich complement of genetic variation including duplications, rearrangements, and large-scale gain and loss of sequence in mixtures of draft and completed genome data. Mugsy is evaluated on the alignment of several dozen bacterial chromosomes on a single computer and was the fastest program evaluated for the alignment of assembled human chromosome sequences from four individuals. A distributed version of the algorithm is also described and provides increased processing throughput using multiple CPUs. Numerous individual genomes are sequenced to study diversity and evolution and to classify pan-genomes. Pan-genome annotations contain inconsistencies and errors that hinder comparative analysis, even within a single species. I introduce a new tool, Mugsy-Annotator, that identifies orthologs and anomalous gene structure across a pan-genome using whole genome multiple alignments. Identified anomalies include inconsistently located translation initiation sites and disrupted genes due to draft genome sequencing or pseudogenes. An evaluation of pan-genomes indicates that such anomalies are common and alternative annotations suggested by the tool can improve annotation consistency and quality.
    Finally, I describe the Cloud Virtual Resource, CloVR, a desktop application for automated sequence analysis that improves usability and accessibility of bioinformatics software and cloud computing resources. CloVR is installed on a personal computer as a virtual machine and requires minimal installation, addressing challenges in deploying bioinformatics workflows. CloVR also seamlessly accesses remote cloud computing resources for improved processing throughput. In a case study, I demonstrate the portability and scalability of CloVR and evaluate the costs and resources for microbial sequence analysis.

    Draft Genome Sequence of the Polyextremophilic Halorubrum sp. Strain AJ67, Isolated from Hyperarsenic Lakes in the Argentinian Puna

    Halorubrum sp. AJ67 is an extremely halophilic, UV-resistant archaeon that was isolated from Laguna Antofalla in the Argentinean Puna. The draft genome sequence suggests potent enzyme candidates that are essential for survival under multiple extreme environmental conditions, such as high UV radiation, elevated salinity, and the presence of critical arsenic concentrations.
    Affiliation: Burguener, Germán Federico. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo. Plataforma de Bioinformática Argentina; Argentina
    Affiliation: Maldonado, Marcos Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Planta Piloto de Procesos Industriales Microbiológicos (i); Argentina. Universidad Nacional de Jujuy. Facultad de Ciencias Agrarias; Argentina
    Affiliation: Revale, Santiago. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Rosario. Instituto de Biología Molecular y Celular de Rosario; Argentina. Instituto de Agrobiotecnología de Rosario; Argentina
    Affiliation: Fernández Do Porto, Darío Augusto. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo. Plataforma de Bioinformática Argentina; Argentina
    Affiliation: Rascovan, Nicolas. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular; Argentina. Instituto de Agrobiotecnología de Rosario; Argentina
    Affiliation: Vazquez, Martin Pablo. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular; Argentina. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo. Plataforma de Bioinformática Argentina; Argentina
    Affiliation: Farias, Maria Eugenia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Planta Piloto de Procesos Industriales Microbiológicos (i); Argentina
    Affiliation: Marti, Marcelo Adrian. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo. Plataforma de Bioinformática Argentina; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Biológica; Argentina
    Affiliation: Turjanski, Adrian. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo. Plataforma de Bioinformática Argentina; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Biológica; Argentina

    Genome sequence of the cheese-starter strain Lactobacillus delbrueckii subsp. lactis CRL 581

    We report the genome sequence of Lactobacillus delbrueckii subsp. lactis CRL 581 (1,911,137 bp, GC 49.7%), a proteolytic strain isolated from a homemade Argentinian hard cheese which has a key role in bacterial nutrition and releases bioactive health-beneficial peptides from milk proteins.
    Affiliation: Hebert, Elvira Maria. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina
    Affiliation: Raya, Raul Ricardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina
    Affiliation: Brown, Lucia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina
    Affiliation: Font, Graciela Maria. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina; Universidad Nacional de Tucumán. Facultad de Bioquímica, Química y Farmacia; Argentina
    Affiliation: Savoy, Graciela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina; Universidad Nacional de Tucumán. Facultad de Bioquímica, Química y Farmacia; Argentina
    Affiliation: Taranto, Maria Pia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tucumán. Centro de Referencia para Lactobacilos (i); Argentina

    Draft Genome Sequence of the Nonstarter Bacteriocin-Producing Strain Enterococcus mundtii CRL35

    Enterococcus mundtii CRL35 is a bacteriocinogenic strain isolated from an artisanal cheese of northwestern Argentina. Here we report its draft genome sequence, consisting of 82 contigs. In silico genomic analysis of biotechnological properties was performed to determine the potential of this microorganism to be used in a food model system.
    Affiliation: Bonacina, Julieta. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina
    Affiliation: Saavedra, Maria Lucila. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina
    Affiliation: Suárez, Nadia Elina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina
    Affiliation: Sesma, Fernando Juan Manuel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina

    Draft Genome Sequence of the Moderately Halophilic Bacterium Pseudoalteromonas ruthenica Strain CP76

    Pseudoalteromonas ruthenica strain CP76, isolated from a saltern in Spain, is a moderately halophilic bacterium belonging to the Gammaproteobacteria. Here we report the draft genome sequence of this strain, which consists of a 4.0-Mb chromosome. The strain is able to produce the extracellular enzyme haloprotease CPI.

    Draft Genome Sequence of the Moderately Halophilic Bacterium Marinobacter lipolyticus Strain SM19

    Marinobacter lipolyticus strain SM19, isolated from saline soil in Spain, is a moderately halophilic bacterium belonging to the class Gammaproteobacteria. Here, we report the draft genome sequence of this strain, which consists of a 4.0-Mb chromosome. The strain is able to produce the halophilic enzyme lipase LipBL.