34 research outputs found

    Dealing with the Data Deluge – New Strategies in Prokaryotic Genome Analysis

    Recent technological innovations have ignited an explosion in microbial genome sequencing that has fundamentally changed our understanding of the biology of microbes and profoundly impacted public health policy. This huge increase in DNA sequence data presents new challenges for the bioinformatics tools used for annotation, analysis, and visualization. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of the associated data. Genomes are organized in a hierarchical distance tree using single-copy ribosomal protein marker distances for distance calculation. Protein distance measures the dissimilarity between markers of the same type, and the subsequent genomic distance averages over the majority of marker distances, ignoring the outliers. More than 30,000 genomes from public archives have been organized in a marker distance tree resulting in 6,438 species-level clades representing 7,597 taxonomic species. This computational infrastructure provides a foundation for prokaryotic gene and genome analysis, allowing easy access to pre-calculated genome groups at various distance levels. One of the most challenging problems in the current data deluge is the presentation of the relevant data at an appropriate resolution for each application, eliminating data redundancy but keeping biologically interesting variations.
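The abstract above describes a genomic distance that averages over the majority of marker distances while ignoring outliers. A minimal sketch of that idea is a symmetric trimmed mean. The function name `genomic_distance`, the 25% trim fraction, and the sample values are all illustrative assumptions; the abstract does not state how the authors select the majority or identify outliers.

```python
def genomic_distance(marker_distances, trim_fraction=0.25):
    """Average per-marker distances after discarding extreme values.

    Assumption: outliers are removed symmetrically from both ends of the
    sorted list (a trimmed mean). The trim fraction is hypothetical.
    """
    if not marker_distances:
        raise ValueError("at least one marker distance is required")
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")

    values = sorted(marker_distances)
    k = int(len(values) * trim_fraction)  # number dropped at each end
    kept = values[k:len(values) - k]      # central majority of the markers
    return sum(kept) / len(kept)


# Hypothetical distances for five ribosomal protein markers; the last value
# mimics a single aberrant marker that should not dominate the average.
example = [0.01, 0.02, 0.02, 0.03, 0.90]
robust = genomic_distance(example)
plain = sum(example) / len(example)
```

With these made-up inputs the trimmed value stays close to the bulk of the markers, whereas the plain mean is pulled upward by the single aberrant marker.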

    Phage annotation guide: Guidelines for assembly and high-quality annotation

    All sequencing projects of bacteriophages (phages) should seek to report an accurate and comprehensive annotation of their genomes. This article defines 14 questions for those new to phage genomics that should be addressed before submitting a genome sequence to the International Nucleotide Sequence Database Collaboration or writing a publication.

    The ACS LCID Project: RR Lyrae stars as tracers of old population gradients in the isolated dwarf spheroidal galaxy Tucana

    We present a study of the radial distribution of RR Lyrae variables, which present a range of photometric and pulsational properties, in the dwarf spheroidal galaxy Tucana. We find that the fainter RR Lyrae stars, having a shorter period, are more centrally concentrated than the more luminous, longer period RR Lyrae variables. Through comparison with the predictions of theoretical models of stellar evolution and stellar pulsation, we interpret the fainter RR Lyrae stars as a more metal-rich subsample. In addition, we show that they must be older than about 10 Gyr. Therefore, the metallicity gradient must have appeared very early on in the history of this galaxy.
    Comment: 5 pages, 5 figures in emulateapj style. Submitted to ApJ Letters.

    The ACS LCID Project. I. Short-Period Variables in the Isolated Dwarf Spheroidal Galaxies Cetus & Tucana

    (abridged) We present the first study of the variable star populations in the isolated dwarf spheroidal galaxies (dSph) Cetus and Tucana. Based on Hubble Space Telescope images obtained with the Advanced Camera for Surveys in the F475W and F814W bands, we identified 180 and 371 variables in Cetus and Tucana, respectively. The vast majority are RR Lyrae stars. In Cetus we also found three anomalous Cepheids, four candidate binaries and one candidate long-period variable (LPV), while six anomalous Cepheids and seven LPV candidates were found in Tucana. Of the RR Lyrae stars, 147 were identified as fundamental mode (RRab) and only eight as first-overtone mode (RRc) in Cetus, with mean periods of 0.614 and 0.363 day, respectively. In Tucana we found 216 RRab and 82 RRc giving mean periods of 0.604 and 0.353 day. These values place both galaxies in the so-called Oosterhoff Gap, as is generally the case for dSph. We calculated the distance modulus to both galaxies using different approaches based on the properties of RRab and RRc, namely the luminosity-metallicity and period-luminosity-metallicity relations, and found values in excellent agreement with previous estimates using independent methods: (m-M)_{0,Cet} = 24.46 ± 0.12 and (m-M)_{0,Tuc} = 24.74 ± 0.12, corresponding to 780 ± 40 kpc and 890 ± 50 kpc. We also found numerous RR Lyrae variables pulsating in both modes simultaneously (RRd): 17 in Cetus and 60 in Tucana. Tucana is, after Fornax, the second dSph in which such a large fraction of RRd (~17%) has been observed. We provide the photometry and pulsation parameters for all the variables, and compare the latter with values from the literature for well-studied dSph of the Local Group and Galactic globular clusters.
    Comment: 26 pages, 24 figures, in emulateapj format. To be published in ApJ. Some figures heavily degraded; see http://www.iac.es/project/LCID/?p=publications for a version with full resolution figures.
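The conversion from the quoted distance moduli to distances in kpc follows the standard relation d = 10^((m-M)_0/5 + 1) pc. The sketch below reproduces that arithmetic for the Cetus and Tucana values quoted in the abstract. The function names are illustrative, and the first-order error propagation is an assumption about how the quoted uncertainties could be obtained, not something the abstract states.

```python
import math


def distance_kpc(mu):
    """Distance in kpc from a true distance modulus mu = (m - M)_0."""
    parsecs = 10 ** (mu / 5 + 1)
    return parsecs / 1000.0


def distance_error_kpc(mu, sigma_mu):
    """First-order uncertainty: sigma_d = d * ln(10) / 5 * sigma_mu."""
    return distance_kpc(mu) * math.log(10) / 5 * sigma_mu


# Distance moduli and uncertainties as quoted in the abstract.
for name, mu, sigma in [("Cetus", 24.46, 0.12), ("Tucana", 24.74, 0.12)]:
    d = distance_kpc(mu)
    err = distance_error_kpc(mu, sigma)
    print(f"{name}: {d:.0f} +/- {err:.0f} kpc")
```

Rounded to the nearest 10 kpc, these values agree with the 780 ± 40 kpc and 890 ± 50 kpc reported in the abstract.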

    The National Center for Biotechnology Information's Protein Clusters Database

    Rapid increases in DNA sequencing capabilities have led to a vast increase in the data generated from prokaryotic genomic studies, which has been a boon to scientists studying micro-organism evolution and to those who wish to understand the biological underpinnings of microbial systems. The NCBI Protein Clusters Database (ProtClustDB) has been created to efficiently maintain and keep the deluge of data up to date. ProtClustDB contains both curated and uncurated clusters of proteins grouped by sequence similarity. The May 2008 release contains a total of 285 386 clusters derived from over 1.7 million proteins encoded by 3806 nt sequences from the RefSeq collection of complete chromosomes and plasmids from four major groups: prokaryotes, bacteriophages and the mitochondrial and chloroplast organelles. There are 7180 clusters containing 376 513 proteins with curated gene and protein functional annotation. PubMed identifiers and external cross references are collected for all clusters and provide additional information resources. A suite of web tools is available to explore more detailed information, such as multiple alignments, phylogenetic trees and genomic neighborhoods. ProtClustDB provides an efficient method to aggregate gene and protein annotation for researchers and is available at http://www.ncbi.nlm.nih.gov/sites/entrez?db=proteinclusters

    The ACS LCID Project: VIII. The short-period Cepheids of Leo A

    We present the results of a new search for variable stars in the Local Group dwarf galaxy Leo A, based on deep photometry from the Advanced Camera for Surveys onboard the Hubble Space Telescope. We detected 166 bona fide variables in our field, of which about 60 percent are new discoveries, and 33 candidate variables. Of the confirmed variables, we found 156 Cepheids, but only 10 RR Lyrae stars despite nearly 100 percent completeness at the magnitude of the horizontal branch. The RR Lyrae stars include 7 fundamental and 3 first-overtone pulsators, with mean periods of 0.636 and 0.366 day, respectively. From their position on the period-luminosity (PL) diagram and light-curve morphology, we classify 91, 58, and 4 Cepheids as fundamental, first-overtone, and second-overtone mode Classical Cepheids (CC), respectively, and two as population II Cepheids. However, due to the low metallicity of Leo A, about 90 percent of the detected Cepheids have periods shorter than 1.5 days. Comparison with theoretical models indicates that some of the fainter stars classified as CC could be Anomalous Cepheids. We estimate the distance to Leo A using the tip of the RGB (TRGB) and various methods based on the photometric and pulsational properties of the Cepheids and RR Lyrae stars. The distances obtained with the TRGB and RR Lyrae stars agree well with each other, while that from the Cepheid PL relations is somewhat larger, which may indicate a mild metallicity effect on the luminosity of the short-period Cepheids. Due to its very low metallicity, Leo A thus serves as a valuable calibrator of the metallicity dependencies of the variable star luminosities.
    Comment: 16 pages, 13 figures. MNRAS, in press.

    Analysis of spounaviruses as a case study for the overdue reclassification of tailed phages

    Tailed bacteriophages are the most abundant and diverse viruses in the world, with genome sizes ranging from 10 kbp to over 500 kbp. Yet, due to historical reasons, all this diversity is confined to a single virus order, Caudovirales, composed of just four families: Myoviridae, Siphoviridae, Podoviridae, and the newly created Ackermannviridae family. In recent years, this morphology-based classification scheme has started to crumble under the constant flood of phage sequences, revealing that tailed phages are even more genetically diverse than once thought. This prompted us, the Bacterial and Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV), to consider overall reorganization of phage taxonomy. In this study, we used a wide range of complementary methods (including comparative genomics, core genome analysis, and marker gene phylogenetics) to show that the group of Bacillus phage SPO1-related viruses previously classified into the Spounavirinae subfamily is clearly distinct from other members of the family Myoviridae and that its diversity deserves the rank of an autonomous family. Thus, we removed this group from the Myoviridae family and created the family Herelleviridae, a new taxon of the same rank. In the process of the taxon evaluation, we explored the feasibility of different demarcation criteria and critically evaluated the usefulness of our methods for phage classification. The convergence of results, drawing a consistent and comprehensive picture of a new family with associated subfamilies, regardless of method, demonstrates that the tools applied here are particularly useful in phage taxonomy. We are convinced that creation of this novel family is a crucial milestone toward much-needed reclassification in the Caudovirales order.
    Peer reviewed.

    HAYDN: High-precision AsteroseismologY of DeNse stellar fields

    In the last decade, the Kepler and CoRoT space-photometry missions have demonstrated the potential of asteroseismology as a novel, versatile and powerful tool to perform exquisite tests of stellar physics, and to enable precise and accurate characterisations of stellar properties, with impact on both exoplanetary and Galactic astrophysics. Based on our improved understanding of the strengths and limitations of such a tool, we argue for a new small/medium space mission dedicated to gathering high-precision, high-cadence, long photometric series in dense stellar fields. Such a mission will lead to breakthroughs in stellar astrophysics, especially in the metal poor regime, will elucidate the evolution and formation of open and globular clusters, and aid our understanding of the assembly history and chemodynamics of the Milky Way’s bulge and a few nearby dwarf galaxies.