2,003 research outputs found


    Medical Subject Headings (MeSH) is a controlled vocabulary used by the National Library of Medicine to index medical articles, abstracts, and journals contained within the MEDLINE database. Although MeSH imposes uniformity and consistency on the indexing process, it has been proven that using MeSH indices results in only a small increase in precision over free-text indexing. Moreover, studies have shown that the use of controlled vocabularies in the indexing process is not an effective method of increasing semantic relevance in information retrieval. To address the need for semantic relevance, we present an ontology-based information retrieval system for the MEDLINE collection that results in a 37.5% increase in precision when compared to free-text indexing systems. The presented system uses the ontology to provide an alternative to text representation for medical articles, to find relationships among co-occurring terms in abstracts, and to index terms that appear in the text as well as the discovered relationships. The presented system is then compared to existing MeSH and free-text information retrieval systems. This dissertation provides a proof of concept for an online retrieval system capable of providing increased semantic relevance when searching through medical abstracts in MEDLINE.

    An experiment with ontology mapping using concept similarity

    This paper describes a system for automatically mapping between concepts in different ontologies. The motivation for the research stems from the Diogene project, in which the project's own ontology covering the ICT domain is mapped to external ontologies, in order that their associated content can automatically be included in the Diogene system. An approach involving measuring the similarity of concepts is introduced, in which standard Information Retrieval indexing techniques are applied to concept descriptions. A matrix representing the similarity of concepts in two ontologies is generated, and a mapping is performed based on two parameters: the domain coverage of the ontologies, and their levels of granularity. Finally, some initial experimentation is presented which suggests that our approach meets the project's unique set of requirements
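The similarity-matrix step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes TF-IDF weighting with cosine similarity as the "standard Information Retrieval indexing technique", and it replaces the paper's two mapping parameters (domain coverage and granularity) with a single hypothetical `threshold`. The concept names and descriptions are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep purely alphabetic tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf_vectors(descriptions):
    """Turn each description into a sparse TF-IDF vector (term -> weight)."""
    docs = [Counter(tokenize(d)) for d in descriptions]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    # Smoothed IDF so that terms occurring everywhere still get weight 1.
    return [
        {t: tf * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
         for t, tf in doc.items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(source, target):
    """Return matrix[source_concept][target_concept] = cosine similarity."""
    src_names, tgt_names = list(source), list(target)
    vectors = tfidf_vectors(
        [source[n] for n in src_names] + [target[n] for n in tgt_names]
    )
    src_vecs = vectors[:len(src_names)]
    tgt_vecs = vectors[len(src_names):]
    return {
        s: {t: cosine(src_vecs[i], tgt_vecs[j]) for j, t in enumerate(tgt_names)}
        for i, s in enumerate(src_names)
    }


def best_mapping(matrix, threshold=0.1):
    """Map each source concept to its most similar target concept.

    `threshold` is an assumed cut-off standing in for the paper's
    coverage and granularity parameters.
    """
    mapping = {}
    for src, row in matrix.items():
        tgt, score = max(row.items(), key=lambda item: item[1])
        if score >= threshold:
            mapping[src] = tgt
    return mapping


# Hypothetical concept descriptions from two ICT ontologies.
SOURCE = {
    "Networking": "protocols for computer network communication and routing",
    "Databases": "storage and querying of structured data in relational systems",
}
TARGET = {
    "DataManagement": "relational data storage and query languages",
    "NetworkProtocols": "routing and communication protocols for computer networks",
}

if __name__ == "__main__":
    print(best_mapping(similarity_matrix(SOURCE, TARGET)))
```

Because the sketch does no stemming, "network" and "networks" count as different terms; a real system would normally normalise such variants before indexing.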

    Facilitating the development of controlled vocabularies for metabolomics technologies with text mining

    BACKGROUND: Many bioinformatics applications rely on controlled vocabularies or ontologies to consistently interpret and seamlessly integrate information scattered across public resources. Experimental data sets from metabolomics studies need to be integrated with one another, but also with data produced by other types of omics studies in the spirit of systems biology, hence the pressing need for vocabularies and ontologies in metabolomics. However, it is time-consuming and non-trivial to construct these resources manually. RESULTS: We describe a methodology for rapid development of controlled vocabularies, a study originally motivated by the need for vocabularies describing metabolomics technologies. We present case studies involving two controlled vocabularies (for nuclear magnetic resonance spectroscopy and gas chromatography) whose development is currently underway as part of the Metabolomics Standards Initiative. The initial vocabularies were compiled manually, providing a total of 243 and 152 terms, respectively. A total of 5,699 and 2,612 new terms, respectively, were acquired automatically from the literature. The analysis of the results showed that full-text articles (especially the Materials and Methods sections), rather than paper abstracts, are the major source of technology-specific terms. CONCLUSIONS: We suggest a text mining method for efficient corpus-based term acquisition as a way of rapidly expanding a set of controlled vocabularies with the terms used in the scientific literature. We adopted an integrative approach, combining relatively generic software and data resources for time- and cost-effective development of a text mining tool for expansion of controlled vocabularies across various domains, as a practical alternative to both manual term collection and tailor-made named entity recognition methods.

    Terminology server for improved resource discovery: analysis of model and functions

    This paper considers the potential to improve distributed information retrieval via a terminologies server. The restriction upon effective resource discovery caused by the use of disparate terminologies across services and collections is outlined, before considering a DDC spine based approach involving inter-scheme mapping as a possible solution. The developing HILT model is discussed alongside other existing models and alternative approaches to solving the terminologies problem. Results from the current HILT pilot are presented to illustrate functionality and suggestions are made for further research and development

    Mapping the relationship between knowledge management and information architecture

    Includes bibliographical references (leaves 106-115). This dissertation defines knowledge in terms of traditional epistemological ideals and as a strategic resource. Knowledge management is defined in terms of the ability of organizations to manage knowledge as a strategic resource in order to gain an advantage from it. In the knowledge management framework, knowledge is presented as a continuum consisting of tacit, implicit and explicit knowledge. Tacit and implicit knowledge are managed by acknowledging the social nature of knowledge. One method of achieving this is communities of practice. At the other end of the spectrum, explicit knowledge is very close in nature and character to information. Due to the expansion of available information resources, the design and structure of information (explicit knowledge) for effective retrieval has become very important. Information architecture is a field that specializes in the design and structure of information for effective retrieval. Traditional information architecture tools such as metadata and subject classification address some of the issues, but experience difficulty in heterogeneous environments such as the Internet. Topic maps are considered as a possible solution to the concerns of metadata classification and subject-based classification. Due to the extent and nature of the information recorded in a topic map, it becomes an information resource in itself. Topic maps also act as an enabling technology for knowledge management, as they map the complex relationships between concepts and include a range of information resources. The conclusion of this dissertation is the presentation of a conceptual model based on the themes developed in this dissertation. The main advantage of the conceptual model is the clear and direct link between knowledge management and information architecture.

    A model for information retrieval driven by conceptual spaces

    A retrieval model describes the transformation of a query into a set of documents. The question is: what drives this transformation? For semantic information retrieval models, this transformation is driven by the content and structure of the semantic models. In this case, Knowledge Organization Systems (KOSs) are the semantic models that encode the meaning employed for monolingual and cross-language retrieval. The focus of this research is the relationship between these representations of meaning and their role and potential in augmenting the effectiveness of existing retrieval models. The proposed approach is unique in explicitly interpreting a semantic reference as a pointer to a concept in the semantic model that activates all its linked neighboring concepts. It is in fact the formalization of the information retrieval model and the integration of knowledge resources from the Linguistic Linked Open Data cloud that distinguish it from other approaches. The preprocessing of the semantic model using Formal Concept Analysis enables the extraction of conceptual spaces (formal contexts) that are based on sub-graphs of the original structure of the semantic model. The types of conceptual spaces built in this case are limited by the KOSs' structural relations relevant to retrieval: exact match, broader, narrower, and related. They capture the definitional and relational aspects of the concepts in the semantic model. Also, each formal context is assigned an operational role in the flow of processes of the retrieval system, enabling a clear path towards the implementation of monolingual and cross-lingual systems. When a retrieval system was constructed by following this model's theoretical description, evaluation showed statistically significant results in both monolingual and bilingual settings when no methods for query expansion were used. The test suite was run on the Cross-Language Evaluation Forum Domain Specific 2004-2006 collection with additional extensions to match the specifics of this model.
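The idea of a semantic reference activating a concept's linked neighbours over the four KOS relations can be sketched as below. This is only an illustration under stated assumptions: the `ConceptGraph` class, the relation labels, and the sample concepts are hypothetical, and the sketch does not perform the Formal Concept Analysis preprocessing that the abstract describes.

```python
from collections import defaultdict

# The four structural relations named in the abstract; the identifiers
# themselves are assumed labels, not taken from the original work.
RELATIONS = ("exactMatch", "broader", "narrower", "related")


class ConceptGraph:
    """A toy KOS: concepts linked by typed, directed relations."""

    def __init__(self):
        # concept -> relation -> set of neighbouring concepts
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        """Record that `source` is linked to `target` by `relation`."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._edges[source][relation].add(target)

    def activate(self, concept, relations=RELATIONS):
        """Return the concept plus its direct neighbours over `relations`."""
        activated = {concept}
        by_relation = self._edges.get(concept, {})
        for relation in relations:
            activated |= by_relation.get(relation, set())
        return activated


def build_sample_graph():
    """Build a small, invented bilingual graph for demonstration."""
    graph = ConceptGraph()
    graph.add("unemployment", "exactMatch", "Arbeitslosigkeit")
    graph.add("unemployment", "broader", "labour market")
    graph.add("unemployment", "related", "job seeking")
    return graph


if __name__ == "__main__":
    kos = build_sample_graph()
    # Restricting activation to exact matches gives a cross-language view.
    print(sorted(kos.activate("unemployment", relations=("exactMatch",))))
    print(sorted(kos.activate("unemployment")))
```

Restricting `relations` to one relation type loosely mirrors the abstract's point that each conceptual space is limited by a structural relation and plays its own role in the retrieval flow.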

    Why Geospatial Linked Open Data for Smart Mobility?

    While the concept of Smart Cities is gaining momentum around the world and government data are increasingly available and accessible on the World Wide Web, key issues remain about Open Data and data standards for smart cities. Better integration and interoperability of data through the World Wide Web is only possible when everyone agrees on the standards for data representation and sharing. Linked Open Data positions itself as a solution for such standardization, being a method of publishing structured data using standard Web technologies. This facilitates the interlinking of datasets and makes them machine-readable and easily accessible on the World Wide Web. We illustrate this through the example of an evolution from a traditional Content Management System with a geoportal to a semantic-based approach. The Traffic Safety Monitor was developed between 2012 and 2015 to monitor road safety and to support policy development on road safety in Flanders (the northern part of Belgium). The system is built as a Content Management System (CMS), with publication tools to present geospatial indicators on road safety (e.g. the number of accidents with cars and the number of positive alcohol tests) as Web maps using standardized Open Geospatial Consortium Web services. The Traffic Safety Monitor is currently being developed further into a Mobility Monitor. Here, the focus is on the development of a business process model for the semantic exchange and publication of spatial data using Linked Open Data principles, targeting indicators of sustainable and smart mobility. In the future, the usability of cycling infrastructure for vehicles such as mobility scooters and bicycle trailers can be assessed using Linked Open Data. The data and metadata are published in Linked Open Data format, opening the door for their reuse by a wide range of (smart) applications.

    DRIVER Technology Watch Report

    This report is part of the Discovery Workpackage (WP4) and is the third of four deliverables. The objective of this report is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. This report consists of two main parts. The first part focuses on interoperability standards for enhanced publications. The second part consists of three subchapters, which give a landscape picture of current and emerging technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.

    Peer to Peer Information Retrieval: An Overview

    Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these has seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom.

    Topic Maps and library and information science : an exploratory study of Topic Maps principles from a Knowledge and Information Organization perspective

    Purpose: This master's thesis attempts to present the state of the art of the placement of Topic Maps (ISO 13250) in Library and Information Science (LIS), through an extensive literature review and a synthesis based on their principles. It is situated within a Knowledge and Information Organization perspective, represented by Elaine Svenonius's The Intellectual Foundation of Information Organization and some of the concepts of Knowledge Organization. This thesis also intends to present a conceptual and theoretical framework for future research. Design/methodology/approach: The study under review presents a qualitative approach based on Grounded Theory principles to analyse the literature and build the conceptual framework for its analysis. The literature reviewed consisted of more than sixty documents, which included, among others, journal articles, conference presentations and papers, student reports and theses, as well as a book chapter. Moreover, this was complemented with information obtained from mailing lists, blog postings and websites, and some unstructured interviews. Findings: Topic Maps appears to be a development aligned with the tradition of Knowledge and Information Organization, but one completely adapted to the context of the Web and digital environments. From an LIS perspective, it is a bibliographic meta-language able to represent, extend and, above all, integrate all the existing Knowledge Organization Systems in a standards-based generic model applicable to digital content and online presentation. Conceptually, Topic Maps sits on the border between the LIS discipline and Knowledge Representation and Computer Science, where LIS conceptual models play the role of intermediaries by providing the ontologies for the 'bibliographic universe'. Topic Maps questions traditional LIS views and principles. Even though some of them remain the same, such as the meaning-based identification of entities, the notions of 'document' and 'subject' require further study. Some important applications demonstrate the capabilities and potential for further development and research on Topic Maps in LIS. The main fields of application are the Digital Humanities and the presentation of TEI-codified texts. Joint Master Degree in Digital Library Learning (DILL)