Medical WordNet: A new methodology for the construction and validation of information resources for consumer health
A consumer health information system must be able to comprehend both expert and non-expert medical vocabulary and to map between the two. We describe an ongoing
project to create a new lexical database called Medical WordNet (MWN), consisting of
medically relevant terms used by and intelligible to non-expert subjects and supplemented by a corpus of natural-language sentences that is designed to provide
medically validated contexts for MWN terms. The corpus derives primarily from online health information sources targeted to consumers, and involves two sub-corpora, called Medical FactNet (MFN) and Medical BeliefNet (MBN), respectively. The former consists of statements accredited as true on the basis of a rigorous process of validation, the latter of statements which non-experts believe to be true. We summarize the MWN / MFN / MBN project, and describe some of its applications
The biomedical abbreviation recognition and resolution (BARR) track: Benchmarking, evaluation and importance of abbreviation recognition systems applied to Spanish biomedical abstracts
Healthcare professionals generate a substantial volume of clinical data in narrative form. As healthcare providers are confronted with serious time constraints, they frequently use telegraphic phrases, domain-specific abbreviations and shorthand notes. Efficient clinical text processing tools need to cope with the recognition and resolution of abbreviations, a task that has been extensively studied for English documents. Despite the outstanding number of clinical documents written worldwide in Spanish, only a marginal number of studies have been published on this subject. In clinical texts, as opposed to the medical literature, abbreviations are generally used without their definitions or expanded forms. The aim of the first Biomedical Abbreviation Recognition and Resolution (BARR) track, held at the IberEval 2017 evaluation campaign, was to assess and promote the development of systems for generating a sense inventory of medical abbreviations. The BARR track required the detection of mentions of abbreviations or short forms and their corresponding long forms or definitions in Spanish medical abstracts. For this track, the organizers provided the BARR medical document collection, the BARR corpus of manually annotated abstracts labelled by domain experts, and the BARR-Markyt evaluation platform. A total of 7 teams submitted 25 runs for the two BARR subtasks: (a) the identification of mentions of abbreviations and their definitions and (b) the correct detection of short form-long form pairs. Here we describe the BARR track setting, the obtained results and the methodologies used by participating systems. The BARR task summary, corpus, resources and evaluation tool for testing systems beyond this campaign are available at: http://temu.inab.org.
We acknowledge the Encomienda MINETAD-CNIO/OTG Sanidad Plan TL and the Open-Minted (654021) H2020 project for funding.
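The core of subtask (b), pairing a short form with its long form, can be illustrated with a simplified version of the classic Schwartz-Hearst character-matching heuristic. This is a minimal sketch for illustration only, not one of the participating systems or the BARR baseline; function names are my own.

```python
import re

def find_sf_lf_pairs(text):
    """Find candidate short-form/long-form pairs of the shape
    'long form (SF)': each character of the short form must appear,
    in order, in the candidate long form, with the first character
    anchored to a word start (simplified Schwartz-Hearst)."""
    pairs = []
    for m in re.finditer(r'\(([A-Za-z][\w-]{0,9})\)', text):
        sf = m.group(1)
        # Candidate long form: up to len(sf)+5 words before the parenthesis.
        words = text[:m.start()].rstrip().split()
        candidate = ' '.join(words[-(len(sf) + 5):])
        lf = match_long_form(sf, candidate)
        if lf:
            pairs.append((sf, lf))
    return pairs

def match_long_form(sf, candidate):
    """Scan both strings right-to-left; return the matched long form
    or None if the short form's characters cannot be aligned."""
    s, l = len(sf) - 1, len(candidate) - 1
    while s >= 0:
        c = sf[s].lower()
        if not c.isalnum():
            s -= 1
            continue
        # For the first short-form character, require a word-initial match.
        while l >= 0 and (candidate[l].lower() != c or
                          (s == 0 and l > 0 and candidate[l - 1].isalnum())):
            l -= 1
        if l < 0:
            return None
        s -= 1
        l -= 1
    # Expand left to the start of the word containing the first match.
    return candidate[candidate.rfind(' ', 0, l + 1) + 1:]
```

For example, `find_sf_lf_pairs("the Biomedical Abbreviation Recognition and Resolution (BARR) track")` pairs `BARR` with `Biomedical Abbreviation Recognition and Resolution`. Real systems add filters (length ratios, stop-word handling) and, for clinical text where definitions are absent, fall back to sense inventories rather than in-text matching.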
Matching Scales: The impact of ecosystem service scales on a planning and policy environment
Consideration of ecosystem services (ES) is increasing within the planning, policy, and research sectors. As more sectors work with ES, scale mismatches are becoming more frequent, leading to mismanagement of ecosystem services. These scale issues were investigated using a combination of methods: a systematic review of both scientific and grey literature, which analysed 112 documents; a survey of 72 subjects working with ES across different sectors; and, finally, 19 in-depth interviews, undertaken to understand fully the scale issues and the potential solutions being used. The systematic review found that much of the ES scientific literature was based on, or connected with, the global issue of climate change; in contrast, the survey found that both researchers and those in policy work at a regional spatial scale or below. The in-depth interviews attributed this to many factors, including the pressure to publish in high-impact journals and to apply for funding. The survey found that the different sectors work at different scales, and even where they work at the same scale, they define that scale term differently. The survey and in-depth interviews found that funding can influence the extent of a project and that funding timelines shape its temporal scale. Funding can also encourage collaboration with stakeholders and between sectors in order to pool resources and expertise. Alongside clarity of terms and project expectations, collaboration was put forward as one of the methods that can alleviate scale mismatches.
Resolving semantic conflicts through ontological layering
We examine the problem of semantic interoperability in modern software systems, which exhibit pervasiveness, a range of heterogeneities and, in particular, semantic heterogeneity of data models built upon ubiquitous data repositories. We investigate whether we can build ontologies upon heterogeneous data repositories in order to resolve semantic conflicts in them and achieve their semantic interoperability. We propose a layered software architecture that accommodates ontological layering at its core, resulting in a Generic ontology for Context aware, Interoperable and Data sharing (Go-CID) software applications. The software architecture supports retrievals from various data repositories and resolves semantic conflicts arising from the heterogeneities inherent in them. It allows extensibility of heterogeneous data repositories through ontological layering, whilst preserving the autonomy of their individual elements.
Our specific ontological layering for interoperable data repositories is based on clearly defined reasoning mechanisms for performing ontology mappings. The reasoning mechanisms depend on the user's involvement in retrievals and on the types of semantic conflicts that have to be resolved after semantically related data have been identified. Ontologies are described in terms of ontological concepts and their semantic roles, which make the types of semantic conflicts explicit. We contextualise semantically related data through our own categorisation of semantic conflicts and their degrees of similarity.
Our software architecture has been tested through a case study of retrievals of semantically related data across repositories in pervasive healthcare, and deployed with Semantic Web technology. The research results extend to the applicability of our ontological layering and reasoning mechanisms in various problem domains and in environments where we need to (i) establish if and when we have overlapping "semantics", and (ii) infer/assert a correct set of "semantics" which can support decision making in such domains.
PhenoMeter: A Metabolome Database Search Tool Using Statistical Similarity Matching of Metabolic Phenotypes for High-Confidence Detection of Functional Links
This article describes PhenoMeter, a new type of metabolomics database search that accepts metabolite response patterns as queries and searches the MetaPhen database of reference patterns for responses that are statistically significantly similar or inverse, for the purpose of detecting functional links. To identify a similarity measure that would detect functional links as reliably as possible, we compared the performance of four statistics in correctly top-matching metabolic phenotypes of Arabidopsis thaliana metabolism mutants affected in different steps of the photorespiration metabolic pathway to reference phenotypes of mutants affected in the same enzymes by independent mutations. The best-performing statistic, the PhenoMeter Score (PM Score), was a function of both Pearson correlation and Fisher's Exact Test of directional overlap. This statistic outperformed Pearson correlation, biweight midcorrelation and Fisher's Exact Test used alone. To demonstrate general applicability, we show that PhenoMeter reliably retrieved the most closely functionally linked response in the database when queried with responses to a wide variety of environmental and genetic perturbations. Attempts to match metabolic phenotypes between independent studies met with varying success, and possible reasons for this are discussed. Overall, our results suggest that integrating pattern-based search tools into metabolomics databases will aid the functional annotation of newly recorded metabolic phenotypes, analogously to the way sequence similarity search algorithms have aided the functional annotation of genes and proteins. PhenoMeter is freely available at MetabolomeExpress (https://www.metabolome-express.org/phenometer.php).
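The general idea of combining Pearson correlation with a Fisher's Exact Test of directional overlap can be sketched as follows. Note that the actual PM Score formula is defined in the paper; the particular combination below (correlation weighted by the negative log of the Fisher p-value) is an illustrative stand-in, and the function name is my own.

```python
import numpy as np
from scipy.stats import pearsonr, fisher_exact

def pm_score_sketch(query, reference):
    """Illustrative similarity score for two metabolite response
    patterns (log fold-changes, one value per metabolite, in the
    same order). Combines Pearson correlation with a one-sided
    Fisher's Exact Test on the 2x2 table of response directions.
    NOT the published PM Score formula; a hedged sketch only."""
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    r, _ = pearsonr(query, reference)
    # 2x2 contingency table of directional overlap: up/down in the
    # query crossed with up/down in the reference.
    q_up, r_up = query > 0, reference > 0
    table = [[np.sum(q_up & r_up), np.sum(q_up & ~r_up)],
             [np.sum(~q_up & r_up), np.sum(~q_up & ~r_up)]]
    # One-sided test for enrichment of directional agreement.
    _, p = fisher_exact(table, alternative="greater")
    return r * -np.log10(max(p, 1e-300))
```

Querying a phenotype against itself yields a strongly positive score, while an unrelated pattern yields a score near zero; a signed score of this kind lets one ranking surface both significantly similar and significantly inverse responses.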
Exploiting a perdurantist foundational ontology and graph database for semantic data integration
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. The view of reality that is inherent to perdurantist philosophical ontologies, often termed four-dimensional (4D) ontologies, has not been widely adopted within the mainstream of information system design practice. However, as the closed world of enterprise systems is opened to Internet-scale Semantic Web and Open Data information sources, there is a need to better understand the semantics of both internal and external data and how they can be integrated. Philosophical foundational ontologies can help establish this understanding and there is, therefore, an emerging need to research how they can be applied to the problem of semantic data integration. Therefore, a prime objective of this research was to develop a framework through which to apply a 4D foundational ontology and a graph database to the problem of semantic data integration, and to assess the effectiveness of the approach. The research employed design science, a methodology which is applicable to undertaking research within information systems as it encompasses methods through which the research can be undertaken and the resultant artefacts evaluated. This methodology has a number of discrete stages: problem awareness; a core design-build-evaluate iterative cycle through which the research is conducted; and a conclusion stage. The design science research was conducted through the development of a number of artefacts, the prime being the 4D-Semantic Extract Load (4D-SETL) framework. The effectiveness of the framework was assessed by applying it to semantically interpret and integrate a number of large-scale datasets and to instantiate a prototype graph database warehouse to persist the resultant ontology.
A series of technical experiments confirmed that directly reflecting the model patterns of the 4D ontology within a prototype data warehouse was an effective means of both structuring and semantically integrating complex datasets, and that the artefacts produced by 4D-SETL could function at scale. Through an illustrative scenario, the effectiveness of the approach is described in relation to the ability of the framework to address a number of weaknesses in current approaches. Furthermore, the major advantages of 4D-SETL are elaborated; these include the framework's ability to combine foundational, domain and instance-level ontological models in a single coherent system, dispensing with much of the translation normally undertaken between conceptual, logical and physical data models. Additionally, adopting a perdurantist realist foundational ontology provided a clear means of establishing and maintaining the identity of physical objects as their constituent temporal and spatial parts unfold over the course of time.
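The perdurantist modelling pattern mentioned above, a stable identity whose properties live on time-indexed temporal parts, can be sketched in a few lines. The class and field names here are illustrative, not taken from 4D-SETL, and 4D-SETL persists such patterns as nodes and edges in a graph database rather than as in-memory objects.

```python
from dataclasses import dataclass, field

@dataclass
class TemporalPart:
    """One temporal part ('state') of an individual, valid over a
    half-open interval [valid_from, valid_to). ISO-8601 strings are
    used so that lexicographic comparison matches temporal order."""
    valid_from: str
    valid_to: str
    properties: dict

@dataclass
class Individual:
    """A 4D individual: the identity is stable, while all changeable
    properties attach to its temporal parts."""
    identity: str
    parts: list = field(default_factory=list)

    def add_state(self, valid_from, valid_to, **props):
        self.parts.append(TemporalPart(valid_from, valid_to, dict(props)))

    def state_at(self, t):
        # Return the properties of the temporal part containing t, if any.
        for p in self.parts:
            if p.valid_from <= t < p.valid_to:
                return p.properties
        return None

# Hypothetical usage: the same patient identity persists across states.
patient = Individual("patient:42")
patient.add_state("2020-01-01", "2021-01-01", ward="A", weight=82)
patient.add_state("2021-01-01", "9999-12-31", ward="B", weight=78)
```

Because updates append new temporal parts instead of overwriting attributes, data arriving from different sources at different times can be integrated against one identity without conflict, which is the integration benefit the thesis attributes to the 4D approach.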
Towards the Next Generation of Clinical Decision Support: Overcoming the Integration Challenges of Genomic Data and Electronic Health Records
The wide adoption of electronic health records (EHRs), the unprecedented abundance of genomic data, and the rapid advancements in computational methods have paved the way for next generation clinical decision support (NGCDS) systems. NGCDS provides significant opportunities for the prevention, early detection, and the personalized treatment of complex diseases. The integration of genomic and EHR data into the NGCDS workflow is faced with significant challenges due to the high complexity and sheer magnitude of the associated data.
This dissertation performs an in-depth investigation to address the computational and algorithmic challenges of integrating genomic and EHR data within the NGCDS workflow. In particular, the dissertation (i) defines the major genomic challenges NGCDS faces and discusses possible resolution directions, (ii) proposes an accelerated method for processing raw genomic data, (iii) introduces a data representation and compression method to store the processed genomic outcomes in a database schema, and finally, (iv) investigates the feasibility of using EHR data to produce accurate disease risk assessments. We hope that the proposed solutions will expedite the adoption of NGCDS and help advance the state of healthcare.