17 research outputs found

    Complete gene expression profiling of Saccharopolyspora erythraea using GeneChip DNA microarrays

    Get PDF
    The Saccharopolyspora erythraea genome sequence, recently published, presents considerable divergence from those of streptomycetes in gene organization and function, confirming the remarkable potential of S. erythraea for producing many other secondary metabolites in addition to erythromycin. In order to investigate, at whole transcriptome level, how S. erythraea genes are modulated, a DNA microarray was specifically designed and constructed on the S. erythraea strain NRRL 2338 genome sequence, and the expression profiles of 6494 ORFs were monitored during growth in complex liquid medium

    Complete genome sequence of a serotype 11A, ST62 Streptococcus pneumoniae invasive isolate

    Get PDF
    <p>Abstract</p> <p>Background</p> <p><it>Streptococcus pneumoniae </it>is an important human pathogen representing a major cause of morbidity and mortality worldwide. We sequenced the genome of a serotype 11A, ST62 <it>S. pneumoniae </it>invasive isolate (AP200), that was erythromycin-resistant due to the presence of the <it>erm</it>(TR) determinant, and carried out analysis of the genome organization and comparison with other pneumococcal genomes.</p> <p>Results</p> <p>The genome sequence of <it>S. pneumoniae </it>AP200 is 2,130,580 base pair in length. The genome carries 2216 coding sequences (CDS), 56 tRNA, and 12 rRNA genes. Of the CDSs, 72.9% have a predicted biological known function. AP200 contains the pilus islet 2 and, although its phenotype corresponds to serotype 11A, it contains an 11D capsular locus. Chromosomal rearrangements resulting from a large inversion across the replication axis, and horizontal gene transfer events were observed. The chromosomal inversion is likely implicated in the rebalance of the chromosomal architecture affected by the insertions of two large exogenous elements, the <it>erm</it>(TR)-carrying Tn<it>1806 </it>and a functional prophage designated ϕSpn_200. Tn<it>1806 </it>is 52,457 bp in size and comprises 49 ORFs. Comparative analysis of Tn<it>1806 </it>revealed the presence of a similar genetic element or part of it in related species such as <it>Streptococcus pyogenes </it>and also in the anaerobic species <it>Finegoldia magna, Anaerococcus prevotii </it>and <it>Clostridium difficile</it>. The genome of ϕSpn_200 is 35,989 bp in size and is organized in 47 ORFs grouped into five functional modules. Prophages similar to ϕSpn_200 were found in pneumococci and in other streptococcal species, showing a high degree of exchange of functional modules. ϕSpn_200 viral particles have morphologic characteristics typical of the <it>Siphoviridae </it>family and are capable of infecting a pneumococcal recipient strain.</p> <p>Conclusions</p> <p>The sequence of <it>S. pneumoniae </it>AP200 chromosome revealed a dynamic genome, characterized by chromosomal rearrangements and horizontal gene transfers. The overall diversity of AP200 is driven mainly by the presence of the exogenous elements Tn<it>1806 </it>and ϕSpn_200 that show large gene exchanges with other genetic elements of different bacterial species. These genetic elements likely provide AP200 with additional genes, such as those conferring antibiotic-resistance, promoting its adaptation to the environment.</p

    Community-driven development for computational biology at Sprints, Hackathons and Codefests

    Get PDF
    Background: Computational biology comprises a wide range of technologies and approaches. Multiple technologies can be combined to create more powerful workflows if the individuals contributing the data or providing tools for its interpretation can find mutual understanding and consensus. Much conversation and joint investigation are required in order to identify and implement the best approaches. Traditionally, scientific conferences feature talks presenting novel technologies or insights, followed up by informal discussions during coffee breaks. In multi-institution collaborations, in order to reach agreement on implementation details or to transfer deeper insights in a technology and practical skills, a representative of one group typically visits the other. However, this does not scale well when the number of technologies or research groups is large. Conferences have responded to this issue by introducing Birds-of-a-Feather (BoF) sessions, which offer an opportunity for individuals with common interests to intensify their interaction. However, parallel BoF sessions often make it hard for participants to join multiple BoFs and find common ground between the different technologies, and BoFs are generally too short to allow time for participants to program together. Results: This report summarises our experience with computational biology Codefests, Hackathons and Sprints, which are interactive developer meetings. They are structured to reduce the limitations of traditional scientific meetings described above by strengthening the interaction among peers and letting the participants determine the schedule and topics. These meetings are commonly run as loosely scheduled "unconferences" (self-organized identification of participants and topics for meetings) over at least two days, with early introductory talks to welcome and organize contributors, followed by intensive collaborative coding sessions. We summarise some prominent achievements of those meetings and describe differences in how these are organised, how their audience is addressed, and their outreach to their respective communities. Conclusions: Hackathons, Codefests and Sprints share a stimulating atmosphere that encourages participants to jointly brainstorm and tackle problems of shared interest in a self-driven proactive environment, as well as providing an opportunity for new participants to get involved in collaborative projects

    Characterization of Nucleotide Misincorporation Patterns in the Iceman's Mitochondrial DNA

    Get PDF
    BACKGROUND: The degradation of DNA represents one of the main issues in the genetic analysis of archeological specimens. In the recent years, a particular kind of post-mortem DNA modification giving rise to nucleotide misincorporation ("miscoding lesions") has been the object of extensive investigations. METHODOLOGY/PRINCIPAL FINDINGS: To improve our knowledge regarding the nature and incidence of ancient DNA nucleotide misincorporations, we have utilized 6,859 (629,975 bp) mitochondrial (mt) DNA sequences obtained from the 5,350-5,100-years-old, freeze-desiccated human mummy popularly known as the Tyrolean Iceman or Otzi. To generate the sequences, we have applied a mixed PCR/pyrosequencing procedure allowing one to obtain a particularly high sequence coverage. As a control, we have produced further 8,982 (805,155 bp) mtDNA sequences from a contemporary specimen using the same system and starting from the same template copy number of the ancient sample. From the analysis of the nucleotide misincorporation rate in ancient, modern, and putative contaminant sequences, we observed that the rate of misincorporation is significantly lower in modern and putative contaminant sequence datasets than in ancient sequences. In contrast, type 2 transitions represent the vast majority (85%) of the observed nucleotide misincorporations in ancient sequences. CONCLUSIONS/SIGNIFICANCE: This study provides a further contribution to the knowledge of nucleotide misincorporation patterns in DNA sequences obtained from freeze-preserved archeological specimens. In the Iceman system, ancient sequences can be clearly distinguished from contaminants on the basis of nucleotide misincorporation rates. This observation confirms a previous identification of the ancient mummy sequences made on a purely phylogenetical basis. The present investigation provides further indication that the majority of ancient DNA damage is reflected by type 2 (cytosine-->thymine/guanine-->adenine) transitions and that type 1 transitions are essentially PCR artifacts

    The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009.</p> <p>Results</p> <p>Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs.</p> <p>Conclusions</p> <p>Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.</p

    The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*

    Get PDF
    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues arisen from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields, however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies

    The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies.

    Get PDF
    BACKGROUND: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research. RESULTS: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting those data using their tools and interfaces. We discussed on topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization. CONCLUSION: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer.RIGHTS : This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may : copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are

    BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

    Get PDF
    The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed
    corecore