31 research outputs found

    Identification of anti-schistosomal, anthelmintic and anti-parasitic compounds curated and text-mined from the scientific literature

    More than a billion people are infected with parasitic worms, including nematodes, such as hookworms, and flatworms, such as blood flukes. Few drugs are available to treat worm infections, but high-throughput screening approaches hold promise to identify novel drug candidates. One problem for researchers who find an interesting 'hit' from a high-throughput screen is to determine whether that compound, or a similar compound, has previously been published as having anthelmintic or anti-parasitic activity. Here, we present (i) data sets of 2,828 anthelmintic compounds and 1,269 specific anti-schistosomal compounds, manually curated from scientific papers and books, and (ii) a data set of 24,335 potential anthelmintic and anti-parasitic compounds identified by text-mining PubMed abstracts. We provide their structures in simplified molecular-input line-entry system (SMILES) format so that researchers can easily compare 'hits' from their screens to these anthelmintic and anti-parasitic compounds and find previous literature on them to support or halt their progression in drug discovery pipelines.

    Genome sequences and comparative genomics of two Lactobacillus ruminis strains from the bovine and human intestinal tracts

    Peer-reviewed. Background: The genus Lactobacillus is characterized by an extraordinary degree of phenotypic and genotypic diversity, which recent genomic analyses have further highlighted. However, the choice of species for sequencing has been non-random and unequal in distribution, with only a single representative genome from the L. salivarius clade available to date. Furthermore, there are no data to facilitate a functional genomic analysis of motility in the lactobacilli, a trait that is restricted to the L. salivarius clade. Results: The 2.06 Mb genome of the bovine isolate Lactobacillus ruminis ATCC 27782 comprises a single circular chromosome and has a G+C content of 44.4%. In silico analysis identified 1901 coding sequences, including genes for a pediocin-like bacteriocin, a single large exopolysaccharide-related cluster, two sortase enzymes, two CRISPR loci and numerous IS elements and pseudogenes. A cluster of genes related to a putative pilin was identified and shown to be transcribed in vitro. A high-quality draft assembly of the genome of a second L. ruminis strain, ATCC 25644, isolated from humans, suggested a slightly larger genome of 2.138 Mb that exhibited a high degree of synteny with the ATCC 27782 genome. In contrast, comparative analysis of L. ruminis and L. salivarius identified a lack of long-range synteny between these closely related species. Comparison of the L. salivarius clade core proteins with those of nine other Lactobacillus species distributed across 4 major phylogenetic groups identified the set of shared proteins and the proteins unique to each group. Conclusions: The genome of L. ruminis provides a comparative tool for directing functional analyses of other members of the L. salivarius clade, and it increases understanding of the divergence of this distinct Lactobacillus lineage from other commensal lactobacilli. The genome sequence provides a definitive resource to facilitate investigation of the genetics, biochemistry and host interactions of these motile intestinal lactobacilli.

    TreeFam: a curated database of phylogenetic trees of animal gene families

    TreeFam is a database of phylogenetic trees of gene families found in animals. It aims to develop a curated resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments. Curated families are being added progressively, based on seed alignments and trees in a similar fashion to Pfam. Release 1.1 of TreeFam contains curated trees for 690 families and automatically generated trees for another 11 646 families. These represent over 128 000 genes from nine fully sequenced animal genomes and over 45 000 other animal proteins from UniProt; ∼40–85% of proteins encoded in the fully sequenced animal genomes are included in TreeFam. TreeFam is freely available.

    Non-perturbative dynamics of hot non-Abelian gauge fields: beyond leading log approximation

    Many aspects of high-temperature gauge theories, such as the electroweak baryon number violation rate, color conductivity, and the hard gluon damping rate, have previously been understood only at leading logarithmic order (that is, neglecting effects suppressed only by an inverse logarithm of the gauge coupling). We discuss how to systematically go beyond leading logarithmic order in the analysis of physical quantities. Specifically, we extend to next-to-leading-log order (NLLO) the simple leading-log effective theory due to Bodeker that describes non-perturbative color physics in hot non-Abelian plasmas. A suitable scaling analysis is used to show that no new operators enter the effective theory at next-to-leading-log order. However, an NLLO calculation of the color conductivity is required, and we report the resulting value. Our NLLO result for the color conductivity can be trivially combined with previous numerical work by G. Moore to yield an NLLO result for the hot electroweak baryon number violation rate. Comment: 20 pages, 1 figure.

    The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics

    The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.