7 research outputs found

    Informatics Technology Mimics Ecology: Dense, Mutualistic Collaboration Networks Are Associated with Higher Publication Rates

    Information technology (IT) adoption enables biomedical research. Publications are an accepted measure of research output, and network models can describe the collaborative nature of publication. In particular, ecological networks can serve as analogies for publication and technology adoption. We constructed network models of adoption of bioinformatics programming languages and health IT (HIT) from the literature

    New in protein structure and function annotation: Hotspots, single nucleotide polymorphisms and the 'Deep Web'

    The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the World Wide Web in any browser with the 'Deep Web' (i.e., proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation

    Web-based Named Entity Recognition and Data Integration to Accelerate Molecular Biology Research

    Finding information about a biological entity is a step tightly bound to molecular biology research. Despite ongoing efforts, this task is both tedious and time-consuming, and tends to become Sisyphean as the number of entities increases. Our aim is to assist researchers by providing them with summary information about biological entities while they are browsing the web, as well as with simplified programmatic access to biological data. To materialise this aim we employ emerging web technologies offering novel web-browsing experiences and new ways of software communication. Reflect is a tool that couples biological named entity recognition with informative summaries, and can be applied to any web page during web browsing. Invoked either via its browser extensions or via its web page, Reflect highlights gene, protein and chemical molecule names in a web page and dynamically attaches summary information to them. The latter provides an overview of what is known about the entity, such as a description, the domain composition, the 3D structure and links to more detailed resources. The annotation process occurs via easy-to-use interfaces. The fast performance allows Reflect to be an interactive companion for scientific readers and researchers while they are surfing the internet. OnTheFly is a web-based application that not only extends Reflect functionality to Microsoft Word, Microsoft Excel, PDF and plain text format files, but also supports the extraction of networks of known and predicted interactions about the entities recognised in a document. A combination of Reflect and OnTheFly offers a data annotation solution for documents used by life science researchers throughout their work. EasySRS is a set of remote methods that expose the functionality of the Sequence Retrieval System (SRS), a data integration platform used in providing access to life science information including genetic, protein, expression and pathway data.
EasySRS supports simultaneous queries to all of the integrated resources. Accessed from a single point, via the web, and based on a simple, common query format, EasySRS facilitates the task of biological data collection and annotation. EasySRS has been employed to enrich the entries of a Plant Defence Mechanism database. UniprotProfiler is a prototype application that employs EasySRS to generate graphs of knowledge based on database record cross-references. These graphs are converted into 3D diagrams of interconnected data. The 3D diagram generation occurs via Systems Biology visualisation tools that employ intuitive graphs to replace long result lists and facilitate hypothesis generation and knowledge discovery

    Requirements-oriented methodology for evaluating ontologies

    Ontologies play key roles in many applications today. Therefore, whether using a newly-specified ontology or an existing ontology for use in its target application, it is important to determine the suitability of an ontology to the application at hand. This need is addressed by carrying out ontology evaluation, which determines qualities of an ontology using methodologies, criteria or measures. However, for addressing the ontology requirements from a given application, it is necessary to determine what the appropriate set of criteria and measures are. In this thesis, we propose a Requirements-Oriented Methodology for Evaluating Ontologies (ROMEO). ROMEO outlines a methodology for determining appropriate methods for ontology evaluation that incorporates a suite of existing ontology evaluation criteria and measures. ROMEO helps ontology engineers to determine relevant ontology evaluation measures for a given set of ontology requirements by linking these requirements to existing ontology evaluation measures through a set of questions. There are three main parts to ROMEO. First, ontology requirements are elicited from a given application and form the basis for an appropriate evaluation of ontologies. Second, appropriate questions are mapped to each ontology requirement. Third, relevant ontology evaluation measures are mapped to each of those questions. From the ontology requirements of an application, ROMEO is used to determine appropriate methods for ontology evaluation by mapping applicable questions to the requirements and mapping those questions to appropriate measures. In this thesis, we perform the ROMEO methodology to obtain appropriate ontology evaluation methods for ontology-driven applications through case studies of Lonely Planet and Wikipedia. Since the mappings determined by ROMEO are dependent on the analysis of the ontology engineer, the validation of these mappings is needed. 
As such, in addition to proposing the ROMEO methodology, a method for the empirical validation of ROMEO mappings is proposed in this thesis. We report on two empirical validation experiments that are carried out in controlled environments to examine the performance of the ontologies over a set of tasks. These tasks vary and are used to compare the performance of a set of ontologies in the respective experimental environment. The ontologies used vary on a specific ontology quality or measure being examined. Empirical validation experiments are conducted for two mappings between questions and their associated measures, which are drawn from case studies of Lonely Planet and Wikipedia. These validation experiments focus on mappings between questions and their measures. Furthermore, as these mappings are application-independent, they may be reusable in subsequent applications of the ROMEO methodology. Using a ROMEO mapping from the Lonely Planet case study, we validate a mapping of a coverage question to the F-measure. The validation experiment carried out for this mapping was inconclusive, thus requiring further analysis. Using a ROMEO mapping from the Wikipedia case study, we carry out a separate validation experiment examining a mapping between an intersectedness question and the tangledness measure. The results from this experiment showed the mapping to be valid. For future work, we propose additional validation experiments for mappings that have been identified between questions and measures
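
    The three-part ROMEO structure described above (requirements, then questions, then measures) and the F-measure used in the coverage mapping can be illustrated with a minimal sketch. The requirement and question wording, the term sets, and the function name `f_measure` are hypothetical and chosen only for illustration; the sketch assumes coverage is scored by comparing the set of ontology terms against a set of domain terms, which the abstract does not specify.

```python
# Hypothetical ROMEO-style mapping: requirement -> questions -> measures.
# The wording below is illustrative, not taken from the thesis.
ROMEO_MAPPING = {
    "Adequate coverage of the travel domain": {
        "Does the ontology cover the concepts of the domain?": ["F-measure"],
    },
    "Easily navigable category structure": {
        "How intersected are the categories?": ["tangledness"],
    },
}


def f_measure(ontology_terms, domain_terms, beta=1.0):
    """Standard F-measure between two term sets (assumed coverage scoring)."""
    ontology_terms = set(ontology_terms)
    domain_terms = set(domain_terms)
    overlap = ontology_terms & domain_terms
    if not overlap:
        return 0.0
    precision = len(overlap) / len(ontology_terms)  # share of ontology terms that are relevant
    recall = len(overlap) / len(domain_terms)       # share of domain terms that are covered
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def measures_for(requirement, mapping=ROMEO_MAPPING):
    """Collect every measure reachable from a requirement via its questions."""
    questions = mapping.get(requirement, {})
    return sorted({m for measures in questions.values() for m in measures})


if __name__ == "__main__":
    score = f_measure({"hotel", "beach", "museum"},
                      {"hotel", "museum", "airport", "train"})
    print(round(score, 4))
    print(measures_for("Adequate coverage of the travel domain"))
```

    Keeping the mapping as plain nested dictionaries mirrors the abstract's point that question-to-measure mappings are application-independent: the inner dictionaries could be reused under a different requirement.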

    Ontology Driven Dynamic Linking of Biology Resources

    Biologists were early adopters of the Web and continue to use it as the primary means of delivering data, tools and knowledge to their community. The Web is made by the links between pages, yet these links have many limitations: they are static and maintained by hand; they can only link one lexical item to another single resource; ownership is necessary for the placement of link anchors and the link mechanism is essentially inflexible. Dynamic linking services, supported by ontologies, offer a mechanism to overcome such restrictions. The Conceptual Open Hypermedia Service (COHSE) system enhances web resources through the dynamic addition of hypertext links. These links are derived through the use of an ontology and associated lexicon along with a mapping from concepts to possible link targets. We describe an application of COHSE to Bioinformatics, using the Gene Ontology (GO) as an ontology and associated keyword mappings and GO associations as link targets. The resulting demonstrator (referred to here as GOHSE) provides both glossary functionality and the possibility of building knowledge based hypertext structures linking bioinformatics resources.
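
    The linking mechanism described above (a lexicon mapping terms to ontology concepts, plus a mapping from concepts to link targets) can be sketched in a few lines. This is not the COHSE implementation: the function name `add_links`, the two-entry lexicon, and the example.org URLs are assumptions made for illustration, and only the first link target per concept is used.

```python
import re

# Illustrative lexicon: lowercase term -> ontology concept identifier.
LEXICON = {
    "apoptosis": "GO:0006915",
    "mitochondrion": "GO:0005739",
}

# Illustrative mapping: concept -> candidate link targets (placeholder URLs).
LINK_TARGETS = {
    "GO:0006915": ["https://example.org/go/GO:0006915"],
    "GO:0005739": ["https://example.org/go/GO:0005739"],
}


def add_links(text, lexicon=LEXICON, targets=LINK_TARGETS):
    """Wrap lexicon terms found in text with anchors derived from their concept."""
    if not lexicon:
        return text
    # Longest terms first so multi-word terms win over their substrings.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )

    def link(match):
        concept = lexicon[match.group(0).lower()]
        urls = targets.get(concept, [])
        if not urls:
            return match.group(0)  # known term but no target: leave text unchanged
        return f'<a href="{urls[0]}" title="{concept}">{match.group(0)}</a>'

    return pattern.sub(link, text)


if __name__ == "__main__":
    print(add_links("Apoptosis begins in the mitochondrion."))
```

    Because the links are computed at read time from the two mappings, changing the lexicon or the targets changes the links on every page without editing the pages themselves, which is the contrast with hand-maintained static links that the abstract draws.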