362 research outputs found

    An ontology for major histocompatibility restriction

    BACKGROUND: MHC molecules are a highly diverse family of proteins that play a key role in cellular immune recognition. Over time, different techniques and terminologies have been developed to identify the specific type(s) of MHC molecule involved in a specific immune recognition context. No consistent nomenclature exists across different vertebrate species. PURPOSE: To correctly represent MHC-related data in The Immune Epitope Database (IEDB), we built upon a previously established MHC ontology and created an ontology to represent MHC molecules as they relate to immunological experiments. DESCRIPTION: This ontology models MHC protein chains from 16 species, deals with different approaches used to identify MHC, such as direct sequencing versus serotyping, relates engineered MHC molecules to naturally occurring ones, connects genetic loci, alleles, protein chains and multi-chain proteins, and establishes evidence codes for MHC restriction. Where available, this work is based on existing ontologies from the OBO Foundry. CONCLUSIONS: Overall, representing MHC molecules provides a challenging and practically important test case for ontology building, and could serve as an example of how to integrate other ontology building efforts into web resources. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-016-0045-5) contains supplementary material, which is available to authorized users.

    TogoDoc Server/Client System: Smart Recommendation and Efficient Management of Life Science Literature

    In this paper, we describe a server/client literature management system specialized for the life science domain, the TogoDoc system (Togo, pronounced Toe-Go, is a romanization of a Japanese word for integration). The server and the client program cooperate closely over the Internet to provide life scientists with an effective literature recommendation service and efficient literature management. The content-based and personalized literature recommendation helps researchers to isolate interesting papers from the “tsunami” of literature, in which, on average, more than one biomedical paper is added to MEDLINE every minute. Because researchers these days need to cover updates of much wider topics to generate hypotheses using massive datasets obtained from public databases or omics experiments, the importance of having an effective literature recommendation service is rising. The automatic recommendation is based on the content of personal literature libraries of electronic PDF papers. The client program automatically analyzes these files, which are sometimes deeply buried in storage disks of researchers' personal computers. Just saving PDF papers to the designated folders makes the client program automatically analyze and retrieve metadata, rename file names, synchronize the data to the server, and receive the recommendation lists of newly published papers, thus accomplishing effortless literature management. In addition, the tag suggestion and associative search functions are provided for easy classification of and access to past papers (researchers who read many papers sometimes only vaguely remember or completely forget what they read in the past). The TogoDoc system is available for both Windows and Mac OS X and is free. The TogoDoc Client software is available at http://tdc.cb.k.u-tokyo.ac.jp/, and the TogoDoc server is available at https://docman.dbcls.jp/pubmed_recom

    mmView: a web-based viewer of the mmCIF format

    Background: Structural biomolecular data are commonly stored in the PDB format. The PDB format is widely supported by software vendors because of its simplicity and readability. However, the PDB format cannot fully address many informatics challenges related to the growing amount of structural data. To overcome the limitations of the PDB format, a new textual format, mmCIF, was released in June 1997 in its version 1.0. mmCIF provides extra information, which has the advantage of being in a computer-readable form. However, this advantage becomes a disadvantage if a human must read and understand the stored data. While software tools exist to help prepare mmCIF files, the number of available systems simplifying the comprehension and interpretation of mmCIF files is limited. Findings: In this paper we present mmView, a cross-platform web-based application for comfortably exploring the structural data of biomacromolecules stored in the mmCIF format. The mmCIF categories can be easily browsed in a tree-like structure, and the corresponding data are presented in a well-arranged tabular form. The application can also display and investigate biomolecular structures via the integrated Java application Jmol. Conclusions: The mmView software system is primarily intended for educational purposes, but it can also serve as a useful research tool. The mmView application is offered in two flavors: as an open-source stand-alone application (available from http://sourceforge.net/projects/mmview) that can be installed on the user's computer, and as a publicly available web server.

    Chimpanzees modify intentional gestures to coordinate a search for hidden food

    Humans routinely communicate to coordinate their activities, persisting and elaborating signals to pursue goals that cannot be accomplished individually. Communicative persistence is associated with complex cognitive skills such as intentionality, because interactants modify their communication in response to another's understanding of their meaning. Here we show that two language-trained chimpanzees effectively use intentional gestures to coordinate with an experimentally naive human to retrieve hidden food, providing some of the most compelling evidence to date for the role of communicative flexibility in successful coordination in nonhumans. Both chimpanzees (named Panzee and Sherman) increase the rate of nonindicative gestures when the experimenter approaches the location of the hidden food. Panzee also elaborates her gestures in relation to the experimenter's pointing, which enables her to find food more effectively than Sherman. Communicative persistence facilitates effective communication during behavioural coordination and is likely to have been important in shaping language evolution

    Modelling height in adolescence: a comparison of methods for estimating the age at peak height velocity

    Background: Controlling for maturational status and timing is crucial in lifecourse epidemiology. One popular non-invasive measure of maturity is the age at peak height velocity (PHV). There are several ways to estimate age at PHV, but it is unclear which of these to use in practice. Aim: To find the optimal approach for estimating age at PHV. Subjects and methods: Methods included the Preece & Baines non-linear growth model, multi-level models with fractional polynomials, SuperImposition by Translation And Rotation (SITAR) and functional data analysis. These were compared through a simulation study and using data from a large cohort of adolescent boys from the Christ’s Hospital School. Results: The SITAR model gave close to unbiased estimates of age at PHV, but convergence issues arose when measurement error was large. The Preece & Baines model also achieved close to unbiased estimates, but it shares similarities with the data generation model for our simulation study and was computationally inefficient, taking 24 hours to fit the data from Christ’s Hospital School. Functional data analysis consistently converged, but had higher mean bias than SITAR. Almost all methods demonstrated strong correlations (r > 0.9) between true and estimated age at PHV. Conclusions: Both SITAR and the Preece & Baines model (PBGM) are useful models for adolescent growth and provide close to unbiased estimates of age at peak height velocity. Care should be taken, as substantial bias and variance can occur with large measurement error.

    PubChem3D: Diversity of shape

    Background: The shape diversity of 16.4 million biologically relevant molecules from the PubChem Compound database and their 1.46 billion diverse conformers was explored as a function of molecular volume. Results: The diversity of shape space was investigated by determining the shape similarity threshold that maximizes the count of reference shapes per unit of conformer volume. The rate of growth in shape space, as represented by a decreasing shape similarity threshold, was found to be remarkably smooth as a function of volume. There was no apparent correlation between the count of conformers per unit volume and their diversity, meaning that a single reference shape can describe the shape space of many chemical structures. The ability of a volume to describe the shape space of lesser volumes was also examined. It was shown that a given volume was able to describe 40-70% of the shape diversity of lesser volumes, for the majority of the volume range considered in this study. Conclusion: The relative growth of shape diversity as a function of volume and shape similarity is surprisingly uniform. Given the distribution of chemicals in PubChem versus what is theoretically synthetically possible, the results from this analysis should be considered a conservative estimate of the true diversity of shape space.

    Representative Proteomes: A Stable, Scalable and Unbiased Proteome Set for Sequence Analysis and Functional Annotation

    The accelerating growth in the number of protein sequences taxes both the computational and manual resources needed to analyze them. One approach to dealing with this problem is to minimize the number of proteins subjected to such analysis in a way that minimizes loss of information. To this end we have developed a set of Representative Proteomes (RPs), each selected from a Representative Proteome Group (RPG) containing similar proteomes calculated based on co-membership in UniRef50 clusters. A Representative Proteome is the proteome that can best represent all the proteomes in its group in terms of the majority of the sequence space and information. RPs at 75%, 55%, 35% and 15% co-membership thresholds (CMT) are provided to allow users to decrease or increase the granularity of the sequence space based on their requirements. We find that a CMT of 55% (RP55) most closely follows standard taxonomic classifications. Further analysis of this set reveals that sequence space is reduced by more than 80% relative to UniProtKB, while retaining both sequence diversity (over 95% of InterPro domains) and annotation information (93% of experimentally characterized proteins). All sets can be browsed and are available for sequence similarity searches and download at http://www.proteininformationresource.org/rps, while the set of 637 RPs determined using a 55% CMT is also available for text searches. Potential applications include sequence similarity searches, protein classification and targeted protein annotation and characterization.

    Finding related sentence pairs in MEDLINE

    We explore the feasibility of automatically identifying sentences in different MEDLINE abstracts that are related in meaning. We compared traditional vector space models with machine learning methods for detecting relatedness, and found that machine learning was superior. The Huber method, a variant of Support Vector Machines which minimizes the modified Huber loss function, achieves 73% precision when the score cutoff is set high enough to identify about one related sentence per abstract on average. We illustrate how an abstract viewed in PubMed might be modified to present the related sentences found in other abstracts by this automatic procedure