
    PowerAqua: fishing the semantic web

    The Semantic Web (SW) offers an opportunity to develop novel, sophisticated forms of question answering (QA). Specifically, the availability of distributed semantic markup on a large scale opens the way to QA systems that can use such semantic information to provide precise, formally derived answers to questions. At the same time, the distributed, heterogeneous, large-scale nature of the semantic information introduces significant challenges. In this paper we describe the design of PowerAqua, a QA system that exploits semantic markup on the web to answer questions posed in natural language. PowerAqua does not assume that the user has any prior information about the semantic resources. The system takes a natural language query as input and translates it into a set of logical queries, which are then answered by consulting and aggregating information derived from multiple heterogeneous semantic sources.

    Relationship between interRAI HC and the ICF: opportunity for operationalizing the ICF

    Background: The International Classification of Functioning, Disability and Health (ICF) is embraced as a framework to conceptualize human functioning and disability. Health professionals choose measures to represent the domains of the framework. The ICF coding classification is an administrative system, but multiple studies have linked diverse clinical assessments to ICF codes. The interRAI HC (Home Care) is an assessment designed to assist planning of care for patients receiving home care. Examining the relationship between the ICF and the interRAI HC is of particular interest because the interRAI assessments are widely used in clinical practice and research, are computerized, and are uploaded to databases that serve multiple purposes, including public reporting of quality in Canada and internationally. The objective of this study was to examine the relationship between the interRAI HC assessment and the ICF. Specifically, the goal was to determine the proportion of interRAI HC items that can be linked to each of the major domains of the ICF (Body Function, Body Structure, Activities and Participation, and Environmental Factors), as well as to the chapters and the specific ICF codes.

    GOSim – an R-package for computation of information theoretic GO similarities between terms and gene products

    Background: With the increased availability of high-throughput data, such as DNA microarray data, researchers can produce large amounts of biological data. Analysis of such data often requires exploring the similarity of genes not only with respect to their expression but also with respect to their functional annotation, which can be obtained from the Gene Ontology (GO). Results: We present the freely available software package GOSim, which calculates the functional similarity of genes based on various information-theoretic similarity concepts for GO terms. GOSim extends existing tools by providing additional, recently developed functional similarity measures for genes. These can be used, for example, to cluster genes according to their biological function. Conversely, they can be used to evaluate the homogeneity of a given grouping of genes with respect to their GO annotation. GOSim hence provides the researcher with a flexible and powerful tool to combine knowledge stored in GO with experimental data. It can be seen as complementary to other tools that, for instance, search for significantly overrepresented GO terms within a given group of genes. Conclusion: GOSim is implemented as a package for the statistical computing environment R and is distributed under the GPL within the CRAN project.
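
One widely used information-theoretic similarity for ontology terms is Resnik's measure: the information content (IC) of the most informative common ancestor of two terms. The following is a minimal Python sketch of that idea on a toy ontology. It is not GOSim's API (GOSim is an R package), and the term names, annotation counts, and the choice of the maximum over term pairs for gene-level similarity are all assumptions made for illustration.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms (a DAG, like GO).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "sugar_metabolism": {"metabolism"},
}

# Hypothetical number of annotations made directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "metabolism": 2,
    "signaling": 4,
    "lipid_metabolism": 1,
    "sugar_metabolism": 1,
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(total / count(t)), with counts propagated to ancestors.

    Assumes every term ends up with at least one propagated annotation.
    """
    total = sum(DIRECT_COUNTS.values())
    propagated = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for ancestor in ancestors(term):
            propagated[ancestor] += count
    return {term: math.log(total / count) for term, count in propagated.items()}


IC = information_content()


def resnik(term1, term2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term1) & ancestors(term2)
    return max(IC[term] for term in common)


def gene_similarity(terms1, terms2):
    """Gene-level similarity as the best-matching term pair (one of several options)."""
    return max(resnik(a, b) for a in terms1 for b in terms2)


print(resnik("lipid_metabolism", "sugar_metabolism"))
print(gene_similarity({"lipid_metabolism"}, {"sugar_metabolism", "signaling"}))
```

In this toy setup two terms that share only the root have similarity zero, because the root carries no information; terms that share a more specific ancestor score higher.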

    Telephone and face to face methods of assessment of veteran's community reintegration yield equivalent results

    Background: The Community Reintegration of Service Members (CRIS) is a new measure of community reintegration developed to measure veterans' participation in life roles. It consists of three sub-scales: Extent of Participation (Extent), Perceived Limitations with Participation (Perceived), and Satisfaction with Participation (Satisfaction). Testing of the CRIS measure to date has used in-person administration. Administration of the CRIS measure by telephone, if equivalent to in-person administration, would be desirable to lower cost and decrease administrative burden. The purpose of this study was to test the equivalence of telephone and in-person modes of CRIS administration. Methods: A convenience sample of 102 subjects (76% male, 24% female; mean age = 49 years, standard deviation = 8.3) was randomly assigned to receive either a telephone interview at Visit 1 and an in-person interview at Visit 2, or an in-person interview at Visit 1 and a telephone interview at Visit 2. Both visits were conducted within one week. Intraclass correlation coefficients, ICC (2,1), were used to evaluate correspondence between modes for both item scores and summary scores. ANOVAs with mode order as a covariate were used to test for the presence of an ordering effect. Results: ICCs (95% CI) for the subscales were 0.92 (0.88-0.94) for Extent, 0.85 (0.80-0.90) for Perceived, and 0.89 (0.84-0.93) for Satisfaction. No ordering effect was observed. Conclusion: Telephone administration of the CRIS measure yielded results equivalent to in-person administration. Telephone administration of the CRIS may enable lower costs of administration and greater adoption.
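
For readers unfamiliar with ICC (2,1), the sketch below computes it from the mean squares of a two-way layout (subjects by administration mode), following the standard Shrout and Fleiss definition for a two-way random-effects, absolute-agreement, single-measurement coefficient. The scores are made up for illustration and are not data from the study; confidence intervals are not computed here.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per subject, each holding one score per mode.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of administration modes
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), modes (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: each row is one subject, columns are (telephone, in-person).
identical_modes = [[1, 1], [2, 2], [3, 3]]
shifted_modes = [[1, 2], [2, 3], [3, 4]]  # in-person always one point higher

print(icc_2_1(identical_modes))
print(icc_2_1(shifted_modes))
```

Because ICC (2,1) measures absolute agreement, a constant offset between modes lowers the coefficient even when the two modes rank subjects identically, which is why it suits a test of mode equivalence.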

    Identification of disease-causing genes using microarray data mining and gene ontology

    Background: One of the best and most accurate methods for identifying disease-causing genes is monitoring gene expression values in different samples using microarray technology. One of the shortcomings of microarray data is that they provide few samples relative to the number of genes. This problem reduces the classification accuracy of the methods, so gene selection is essential to improve the predictive accuracy and to identify potential marker genes for a disease. Among numerous existing methods for gene selection, support vector machine-based recursive feature elimination (SVMRFE) has become one of the leading methods, but its performance can be reduced because of the small sample size, noisy data and the fact that the method does not remove redundant genes. Methods: We propose a novel framework for gene selection which uses the advantageous features of conventional methods and addresses their weaknesses. Specifically, we have combined the Fisher method and SVMRFE to utilize the advantages of a filtering method as well as an embedded method. Furthermore, we have added a redundancy reduction stage to address the weakness of the Fisher method and SVMRFE. In addition to gene expression values, the proposed method uses Gene Ontology, which is a reliable source of information on genes. The use of Gene Ontology can compensate, in part, for the limitations of microarrays, such as having a small number of samples and erroneous measurement results. Results: The proposed method has been applied to colon, Diffuse Large B-Cell Lymphoma (DLBCL) and prostate cancer datasets. The empirical results show that our method has improved classification performance in terms of accuracy, sensitivity and specificity. In addition, the study of the molecular function of selected genes strengthened the hypothesis that these genes are involved in the process of cancer growth. Conclusions: The proposed method addresses the weakness of conventional methods by adding a redundancy reduction stage and utilizing Gene Ontology information. It predicts marker genes for colon, DLBCL and prostate cancer with high accuracy. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help in the search for a cure for cancers.
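
To make the filtering and redundancy-reduction ideas concrete, here is a small Python sketch that ranks genes by a two-class Fisher score and then drops any gene that is highly correlated with one already kept. This is only an illustration: the abstract does not specify how its redundancy stage works, so the Pearson-correlation threshold is an assumption, and the SVMRFE and Gene Ontology stages are omitted entirely. Gene names and expression values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score: squared mean difference over summed class variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    denom = pvariance(pos) + pvariance(neg)
    return (mean(pos) - mean(neg)) ** 2 / denom if denom > 0 else 0.0


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = (sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)) ** 0.5
    return cov / norm if norm > 0 else 0.0


def select_genes(expression, labels, top_k, max_corr=0.95):
    """Rank genes by Fisher score, skipping genes redundant with those already kept.

    `max_corr` is an assumed redundancy threshold, not a value from the paper.
    """
    ranked = sorted(
        expression,
        key=lambda gene: fisher_score(expression[gene], labels),
        reverse=True,
    )
    kept = []
    for gene in ranked:
        if all(abs(pearson(expression[gene], expression[k])) < max_corr for k in kept):
            kept.append(gene)
        if len(kept) == top_k:
            break
    return kept


# Hypothetical expression values for six samples (three controls, three cases).
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "geneA_like": [1.1, 1.2, 0.9, 3.0, 3.1, 2.8],  # nearly duplicates geneA
    "geneB": [1.0, 2.0, 1.5, 2.5, 3.5, 3.0],
    "noise": [2.0, 1.0, 3.0, 2.0, 1.0, 3.0],
}

print(select_genes(expression, labels, top_k=2))
```

A pure filter would keep both `geneA` and `geneA_like` because both separate the classes well; the redundancy check is what frees the second slot for a gene carrying different information.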

    Exact score distribution computation for ontological similarity searches

    Background: Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score. Results: In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling. Conclusions: The new algorithm enables, for the first time, exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that uses only the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic usage at: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
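
To illustrate what an exact score distribution and P-value mean here, the Python sketch below uses brute-force enumeration on a toy ontology. This corresponds to the naive approach the abstract mentions, not to the subgraph-collapsing algorithm, which the abstract does not describe in detail. The ontology, the information content (IC) values, the averaged best-match Resnik score, and the null model (every query of a fixed size is equally likely) are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical toy ontology (term -> parents) with made-up IC values.
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "A1": ["A"], "A2": ["A"], "B1": ["B"]}
IC = {"root": 0.0, "A": 1.0, "B": 1.5, "A1": 2.0, "A2": 2.5, "B1": 3.0}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    found = {term}
    for parent in PARENTS[term]:
        found |= ancestors(parent)
    return found


def resnik(term1, term2):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(IC[t] for t in ancestors(term1) & ancestors(term2))


def score(query, target):
    """Average, over query terms, of the best Resnik match in the target."""
    return sum(max(resnik(q, t) for t in target) for q in query) / len(query)


def exact_distribution(target, query_size):
    """Enumerate every query of the given size and tabulate score probabilities."""
    terms = sorted(PARENTS)
    counts = Counter(score(q, target) for q in combinations(terms, query_size))
    total = sum(counts.values())
    return {s: Fraction(c, total) for s, c in counts.items()}


def p_value(observed, distribution):
    """Probability of a score at least as high as the observed one under the null."""
    return sum(p for s, p in distribution.items() if s >= observed)


target = ("A1", "B")   # hypothetical annotation set, e.g. of a disease
query = ("A1", "B1")   # hypothetical query of the same size
distribution = exact_distribution(target, query_size=len(query))
print(score(query, target), p_value(score(query, target), distribution))
```

Enumeration grows combinatorially with the number of terms and the query size, which is the cost the abstract's faster algorithm is designed to avoid; sampling random queries would instead give only an approximate P-value.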

    Sequencing of 15 622 Gene-bearing BACs Clarifies the Gene-dense Regions of the Barley Genome

    Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley–Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant.