25 research outputs found

    Genome-wide subcellular localization of putative outer membrane and extracellular proteins in Leptospira interrogans serovar Lai genome using bioinformatics approaches

    Abstract. Background: In bacterial pathogens, both cell surface-exposed outer membrane proteins and proteins secreted into the extracellular environment play crucial roles in host-pathogen interaction and pathogenesis. Considerable efforts have been made to identify outer membrane (OM) and extracellular (EX) proteins produced by Leptospira interrogans, which may be used as novel targets for the development of infection markers and leptospirosis vaccines. Result: In this study we used a novel computational framework, based on combined prediction methods with a deduction concept, to identify putative OM and EX proteins encoded by the Leptospira interrogans genome. The framework consists of the following steps: (1) identifying proteins homologous to known proteins in subcellular localization databases derived from the "consensus vote" of computational predictions, (2) incorporating homology-based search and structural information to enhance gene annotation and functional identification, in order to infer the specific structural characters and localizations, and (3) developing a specific classifier for cytoplasmic proteins (CP) and cytoplasmic membrane proteins (CM) using linear discriminant analysis (LDA). We have identified 114 putative EX and 63 putative OM proteins, of which 41% are conserved or hypothetical proteins containing sequence and/or protein folding structures similar to those of known EX and OM proteins. Conclusion: Overall, the results derived from the combined computational analysis correlate with the available experimental evidence. This is the most extensive in silico protein subcellular localization identification to date for the Leptospira interrogans serovar Lai genome, and it may be useful in protein annotation, discovery of novel genes and understanding the biology of Leptospira.
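Step (3) of the framework above, a CP versus CM classifier built with LDA, can be illustrated with a minimal two-class Fisher discriminant. This is only a sketch: the abstract does not state which features or training set the authors used, so the two features here (hydrophobic residue fraction and predicted transmembrane helix count) and every numeric value are hypothetical and invented for illustration.

```python
from statistics import mean

def fit_lda(class0, class1):
    """Two-class Fisher LDA for 2-D feature vectors; returns (w, threshold)."""
    m0 = [mean(col) for col in zip(*class0)]
    m1 = [mean(col) for col in zip(*class1)]
    # Pooled within-class scatter matrix (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class0, m0), (class1, m1)):
        for x in rows:
            d = [x[0] - m[0], x[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    # Invert the 2x2 scatter matrix in closed form.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    # Discriminant direction w = S^-1 (m1 - m0).
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Decision threshold: projection of the midpoint between class means.
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold

def predict(w, threshold, x):
    """Label a feature vector as 'CM' (class 1) or 'CP' (class 0)."""
    return "CM" if w[0] * x[0] + w[1] * x[1] > threshold else "CP"

# Hypothetical training data: (hydrophobic fraction, predicted TM helices).
cp_train = [(0.30, 0), (0.35, 1), (0.28, 0), (0.33, 0)]
cm_train = [(0.55, 4), (0.60, 6), (0.52, 3), (0.58, 5)]

w, threshold = fit_lda(cp_train, cm_train)
print(predict(w, threshold, (0.57, 5)))
print(predict(w, threshold, (0.31, 0)))
```

In practice a library implementation with more features and proper cross-validation would be the natural choice; the closed-form 2x2 inverse is used here only to keep the sketch dependency-free.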

    World Data Centre for Microorganisms: an information infrastructure to explore and utilize preserved microbial strains worldwide

    The World Data Centre for Microorganisms (WDCM) was established 50 years ago as the data center of the World Federation for Culture Collections (WFCC) Microbial Resource Center (MIRCEN). WDCM aims to provide integrated information services using big data technology for microbial resource centers and microbiologists all over the world. Here, we provide an overview of WDCM including all of its integrated services. Culture Collections Information Worldwide (CCINFO) provides metadata information on 708 culture collections from 72 countries and regions. Global Catalogue of Microorganisms (GCM) gathers strain catalogue information and provides a data retrieval, analysis, and visualization system for microbial resources. Currently, GCM includes more than 368,000 strains from 103 culture collections in 43 countries and regions. Analyzer of Bioresource Citation (ABC) is a data mining tool that extracts strain-related publications, patents, nucleotide sequences and genome information from public data sources to form a knowledge base. Reference Strain Catalogue (RSC) maintains a database of strains listed in International Standards Organization (ISO) and other international or regional standards. RSC allocates a unique identifier to strains recommended for use in diagnosis and quality control, and hence serves as a valuable cross-platform reference. WDCM provides free access to all these services at www.wdcm.org.
    Funding: National High Technology Research and Development Program of China [2014AA021501, 2014AA021503, 2015AA020108]; International S&T Cooperation Program of China (ISTCP) [2015DFG32550]; Bureau of Science & Technology for Development of Chinese Academy of Sciences (Strategic bio-resources information center); Field Cloud Project of Chinese Academy of Sciences [XXH12503-05-01].
    Funding for open access charge: National High Technology Research and Development Program of China [2014AA021501, 2014AA021503, 2015AA020108]; International S&T Cooperation Program of China (ISTCP) [2015DFG32550]; Bureau of Science & Technology for Development of Chinese Academy of Sciences (Strategic bio-resources information center); Field Cloud Project of Chinese Academy of Sciences [XXH12503-05-01].

    MycoBank gearing up for new horizons.

    MycoBank, a registration system for fungi established in 2004 to capture all taxonomic novelties, acts as a coordination hub between repositories such as Index Fungorum and Fungal Names. Since January 2013, registration of fungal names has been a mandatory requirement for valid publication under the International Code of Nomenclature for algae, fungi and plants (ICN). This review explains the database innovations that have been implemented over the past few years, and discusses new features such as advanced queries, registration of typification events (MBT numbers for lecto-, epi- and neotypes), the multi-lingual database interface, the nomenclature discussion forum, the annotation system, and web services with links to third parties. MycoBank has also introduced novel identification services, linking DNA sequence data to numerous related databases to enable intelligent search queries. Although MycoBank fills an important void for taxon registration, challenges remain for the future: to improve links between taxonomic names and DNA data, and to introduce a formal system for naming fungi known from DNA sequence data only. To further improve the quality of MycoBank data, remote access will now allow registered mycologists to act as MycoBank curators, using Citrix software.

    Global catalogue of microorganisms (gcm): a comprehensive database and information retrieval, analysis, and visualization system for microbial resources

    Abstract. Background: Throughout the long history of industrial and academic research, many microbes have been isolated, characterized and preserved (whenever possible) in culture collections. With the steady accumulation of observational biodiversity data as well as microbial sequencing data, bio-resource centers have to function as data and information repositories to serve academia, industry, and regulators on behalf of and for the general public. Hence, the World Data Centre for Microorganisms (WDCM) took on the responsibility of constructing an effective information environment that would promote and sustain microbial research data activities, and bridge the gaps currently present within and outside the microbiology communities. Description: Strain catalogue information was collected from collections by online submission. We developed tools for automatic extraction of strain numbers and species names from various sources, including GenBank, PubMed, and SwissProt. These new tools connect strain catalogue information with the corresponding nucleotide and protein sequences, as well as with genome sequences and references citing a particular strain. All information has been processed and compiled to create a comprehensive database of microbial resources, named the Global Catalogue of Microorganisms (GCM). The current version of GCM contains information on over 273,933 strains, which include 43,436 bacterial, fungal and archaea species from 52 collections in 25 countries and regions. A number of online analysis and statistical tools have been integrated, together with advanced search functions, which should greatly facilitate the exploration of the content of GCM.
    Conclusion: A comprehensive dynamic database of microbial resources has been created, which unveils the resources preserved in culture collections, especially those whose informatics infrastructures are still under development. It should foster cumulative research and facilitate the activities of microbiologists worldwide who work in both public and industrial research centres. This database is available from http://gcm.wfcc.info.
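The automatic extraction of strain numbers described above can be sketched with a simple pattern match. This is not the GCM tooling, whose implementation the abstract does not describe; the acronym list, the pattern, and the sample sentence are assumptions chosen for illustration.

```python
import re

# Assumed, non-exhaustive list of culture collection acronyms.
COLLECTION_ACRONYMS = ["ATCC", "DSM", "JCM", "CGMCC"]

# An acronym, optional whitespace, then a number with an optional
# dotted suffix (for example "1.2345").
STRAIN_PATTERN = re.compile(
    r"\b(" + "|".join(COLLECTION_ACRONYMS) + r")\s?(\d+(?:\.\d+)?)\b"
)

def extract_strain_numbers(text):
    """Return (acronym, number) pairs found in free text, in order."""
    return STRAIN_PATTERN.findall(text)

# Illustrative input text, not taken from any database record.
sample = "Bacillus subtilis ATCC 6051 and Escherichia coli DSM 30083 were compared."
hits = extract_strain_numbers(sample)
print(hits)
```

A production system would need a far larger acronym list and handling for variants such as hyphenated or prefixed identifiers, which this sketch ignores.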

    The global catalogue of microorganisms 10K type strain sequencing project: closing the genomic gaps for the validly published prokaryotic and fungi species

    Genomic information is essential for taxonomic, phylogenetic and functional studies to comprehensively decipher the characteristics of microorganisms, to explore microbiomes through metagenomics, and to answer fundamental questions of nature and human life. However, large gaps remain in the available genomic sequencing information published for bacterial and archaeal species, and the gaps are even larger for fungal type strains. The Global Catalogue of Microorganisms (GCM) leads an internationally coordinated effort to sequence type strains and close gaps in the genomic maps of microbes. Hence, the GCM aims to promote research by deep-mining genomic data.
    This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (grant XDA19050301), the Bureau of International Cooperation of the Chinese Academy of Sciences (grants 153211KYSB20160029 and 153211KYSB20150010), the National Key Research Program of China (grants 2017YFC1201202, 2016YFC1201303, and 2016YFC0901702), the 13th Five-year Informatization Plan of the Chinese Academy of Sciences (grant XXH13506), and the National Science Foundation for Young Scientists of China (grant 31701157).

    Associate Editor: Dr. Limsoon Wong

    Summary: sMOL Explorer is a 2D ligand-based computational tool that provides three major functionalities through a Web interface: data management, information retrieval and extraction, and statistical analysis and data mining. With sMOL Explorer, users can create personal databases by adding each small molecule via a drawing interface or by uploading data files from internal and external projects into the sMOL database. The database can then be browsed and queried with textual and structural similarity searches. A molecule can also be submitted for search against external public databases including PubChem, KEGG, DrugBank and eMolecules. Moreover, users can easily access a variety of data mining tools from Weka and R packages to perform analyses including (1) finding frequent substructures, (2) clustering molecular fingerprints, (3) identifying and removing irrelevant attributes from the data, and (4) building classification models of biological activity. Availability: sMOL Explorer is an Open Source project and is freely available to all interested users a
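A structural similarity search over molecular fingerprints, as mentioned in the summary above, is commonly scored with the Tanimoto coefficient. The summary does not say which measure sMOL Explorer uses, so Tanimoto is an assumption here, and the molecule names and fingerprint bits below are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # Convention chosen here for two empty fingerprints.
    return len(fp_a & fp_b) / len(union)

def similarity_search(query, database, min_score=0.0):
    """Rank database entries by similarity to the query, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    kept = [(name, score) for name, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical fingerprints: each set lists the indices of bits that are on.
database = {
    "mol_a": {1, 2, 3},
    "mol_b": {2, 3, 4},
    "mol_c": {7, 8, 9},
}
query = {1, 2, 3}
ranking = similarity_search(query, database, min_score=0.1)
print(ranking)
```

Representing fingerprints as sets keeps the arithmetic obvious; a real cheminformatics toolkit would typically use packed bit vectors for speed.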