280 research outputs found

    Integrating Medical Ontology and Pseudo Relevance Feedback For Medical Document Retrieval

    The purpose of this thesis is to improve the accuracy of locating relevant documents in a large amount of Electronic Medical Data (EMD). The goal of this research is to propose a new way of using medical ontology that gives patients an easier and more reliable way to understand their diseases, and that helps doctors find and further improve possible methods of diagnosis and treatment. The empirical studies were based on the health care dataset provided by CLEF. In this research, I used Information Retrieval to find relevant information within the large data sets provided by CLEF. I then used the ranking functionality of the Terrier platform to score and evaluate the matching documents in the collection. BM25 was used as the base normalization method to retrieve the results, and a Pseudo Relevance Feedback weighting model was used to retrieve information regarding patients' health history and medical records in order to find more accurate results. I then used the Unified Medical Language System (UMLS) to index the queries when searching the Internet for health-related documents. The UMLS software was used to link the computer system with health and biomedical terms and vocabularies in classification tools; it works as a dictionary for patients by translating medical terms. In future work, I would like to use medical ontology to relate the medical documents to my retrieved results.
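
    The BM25 ranking and pseudo relevance feedback steps mentioned in this abstract can be illustrated with a minimal sketch. It is not the Terrier implementation used in the thesis: the function names `bm25_scores` and `expand_query`, the parameter defaults, the toy documents, and the choice of expansion terms by raw frequency in the top-ranked documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per tokenized document in `docs`."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            # Length normalization: longer documents are penalized.
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_query(query, docs, top_docs=2, new_terms=2):
    """Pseudo relevance feedback (simplified, hypothetical term selection):
    treat the top-ranked documents as relevant and add their most
    frequent non-query terms to the query."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    pool = Counter()
    for i in ranked[:top_docs]:
        pool.update(t for t in docs[i] if t not in query)
    return list(query) + [t for t, _ in pool.most_common(new_terms)]


if __name__ == "__main__":
    # Toy collection, invented for this example.
    docs = [
        "diabetes insulin glucose treatment".split(),
        "insulin therapy for diabetes glucose control".split(),
        "library catalog search".split(),
        "glucose monitoring device".split(),
    ]
    expanded = expand_query(["diabetes"], docs)
    print(expanded)
    print(bm25_scores(expanded, docs))
```

    Re-scoring with the expanded query lets a document that never mentions the original query term (here, the glucose monitoring document) receive a non-zero score, which is the effect pseudo relevance feedback is meant to achieve.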

    The application of intelligent agents in libraries: a survey

    Purpose - The purpose of this article is to provide a comprehensive literature review on the utilisation of intelligent agent technology in the library environment. Design/methodology/approach - Research papers since 1990 on the use of various intelligent agent technologies in libraries are divided into two main application areas: digital library (DL), including agent-based DL projects, multi-agent architecture for DLs, intelligent agents for distributed heterogeneous information retrieval and agent support to information search process in DLs; and services in traditional libraries, including user interface for library information systems, automatic reference services and multi-agent architecture for library services. For each paper on the topic, its new ideas or models, referred work, analyses, experiments, findings and conclusions are addressed. Findings - The majority of the literature covers DLs and there have been fewer studies about services in traditional libraries. A variety of architectures, frameworks and models integrating agent technology in library systems or services are proposed, but only a few have been implemented in a practical environment. The application of agent technology is still at the research and experimentation stage. Agent technology has great potential in many areas in the library context; however, it presents challenges to libraries that want to be involved in its adoption. Practical implications - The survey has practical implications for libraries, librarians and computer professionals in developing projects that employ intelligent agent technology to meet end-users' expectations as well as to improve information services within limited resources in library settings. Originality/value - The paper provides a comprehensive survey on the development and research of intelligent agents in libraries in the literature.

    Semantic Interoperability in Digital Library Systems

    This report is a state-of-the-art overview of activities and research being undertaken in areas relating to semantic interoperability in digital library systems. It has been undertaken as part of the cluster activity of WP5: Knowledge Extraction and Semantic Interoperability (KESI). The authors and contributors draw on the research expertise and experience of a number of organisations (UKOLN, ICS-FORTH, NETLAB, TUC-MUSIC, University of Glamorgan) as well as several work-packages (WP5: Knowledge Extraction and Semantic Interoperability; WP3: Audio-Visual and Non-traditional Objects) within the DELOS2 NoE. In addition, a workshop was held [KESI Workshop Sept. 2004] (co-located with ECDL 2004) in order to provide a forum for the discussion of issues relevant to the topic of this report. We are grateful to those who participated in the forum and for their valuable comments, which have helped to shape this report. Definitions of interoperability, syntactic interoperability and semantic interoperability are presented noting that semantic interoperability is very much about matching concepts as a basis. The NSF Post Digital Libraries Futures Workshop: Wave of the Future [NSF Workshop] has identified semantic interoperability as being of primary importance in digital library research

    The Ensemble MESH-Term Query Expansion Models Using Multiple LDA Topic Models and ANN Classifiers in Health Information Retrieval

    Information retrieval in the health field has several challenges. Health information terminology is difficult for consumers (laypeople) to understand. Formulating a query with professional terms is not easy for consumers because health-related terms are more familiar to health professionals. Automatically adding health terms related to a query would help consumers find relevant information. The proposed query expansion (QE) models show how to expand a query using MeSH (Medical Subject Headings) terms. The documents were represented by the MeSH terms included in the full-text articles (i.e. Bag-of-MeSH), and these MeSH terms were then used to generate LDA (Latent Dirichlet Allocation) topic models. A query and the top k retrieved documents were used to find MeSH terms as topic words related to the query. LDA topic words were filtered by 1) threshold values of topic probability (TP) and word probability (WP) or 2) an ANN (Artificial Neural Network) classifier. In an LDA model with a specific number of topics, threshold values were effective in increasing IR performance in terms of infAP (inferred Average Precision) and infNDCG (inferred Normalized Discounted Cumulative Gain), which are common IR metrics for large data collections with incomplete judgments. The top k words were chosen by a word score based on TP × WP and retrieved document ranking in an LDA model with specific thresholds. The QE model with specific thresholds for TP and WP showed improved mean infAP and infNDCG scores in an LDA model compared with the baseline result. However, the threshold values optimized for a particular LDA model did not perform well in other LDA models with different numbers of topics. An ANN classifier was employed to overcome this weakness of the threshold-dependent QE model by automatically categorizing MeSH terms (positive/negative/neutral) for QE. ANN classifiers were trained on word features related to the LDA model and collection.
    Two types of QE models (WSW & PWS) using an LDA model and an ANN classifier were proposed: 1) Word Score Weighting (WSW), where the probability of being a positive/negative/neutral word was used to weight the original word score, and 2) Positive Word Selection (PWS), where positive words were identified by the ANN classifier. Forty WSW models showed better average mean infAP and infNDCG scores than the PWS models when the top 7 words were selected for QE. Both approaches based on a binary ANN classifier produced statistically significant increases in infAP and infNDCG compared with the scores of the baseline run. A 3-class classifier performed worse than the binary classifier. The proposed ensemble QE models integrated multiple ANN classifiers with multiple LDA models. Ensemble QE models combined multiple WSW/PWS models and one or multiple classifiers. Multiple classifiers were more effective in selecting relevant words for QE than one classifier. In ensemble QE (WSW/PWS) models, the top k words added to the original queries were effective in increasing infAP and infNDCG scores. The ensemble QE model (WSW) using three classifiers showed statistically significant improvements in the mean infAP and infNDCG scores for 30 queries when the top 3 words were added. The ensemble QE model (PWS) using four classifiers showed statistically significant improvements for 30 queries in the mean infAP and infNDCG scores.
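
    The threshold filtering and Word Score Weighting ideas in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact formulas, so taking the maximum TP × WP across topics, multiplying by the classifier's positive-class probability for WSW, and leaving out the retrieved-document-ranking component are all simplifications. The function names and every probability value are hypothetical.

```python
def score_words(topic_probs, word_probs, tp_min, wp_min):
    """Score candidate words by TP * WP, keeping only topics with
    TP >= tp_min and words with WP >= wp_min.

    topic_probs: {topic_id: topic probability for the query context}
    word_probs:  {topic_id: {word: probability of the word in that topic}}
    """
    scores = {}
    for topic, tp in topic_probs.items():
        if tp < tp_min:
            continue
        for word, wp in word_probs[topic].items():
            if wp < wp_min:
                continue
            # Assumption: a word seen in several topics keeps its best score.
            scores[word] = max(scores.get(word, 0.0), tp * wp)
    return scores


def top_k(scores, k):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def word_score_weighting(scores, p_positive):
    """WSW-style reweighting (assumed form): scale each word score by the
    classifier's probability that the word is a positive expansion term."""
    return {w: s * p_positive.get(w, 0.0) for w, s in scores.items()}


if __name__ == "__main__":
    # Hypothetical LDA output for one query; all numbers are made up.
    topic_probs = {0: 0.6, 1: 0.3, 2: 0.05}
    word_probs = {
        0: {"Insulin": 0.5, "Glucose": 0.2, "Humans": 0.02},
        1: {"Glucose": 0.5, "Diet": 0.4},
        2: {"Libraries": 0.9},
    }
    scores = score_words(topic_probs, word_probs, tp_min=0.1, wp_min=0.05)
    print(top_k(scores, 2))

    # Hypothetical classifier probabilities of being a positive word.
    p_positive = {"Insulin": 0.2, "Glucose": 0.9, "Diet": 0.9}
    print(top_k(word_score_weighting(scores, p_positive), 2))
```

    With fixed thresholds the selected words depend directly on `tp_min` and `wp_min`, which is consistent with the abstract's observation that thresholds tuned for one LDA model did not transfer to models with a different number of topics.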

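    Concepção, implementação e validação de um enfoque para integração e recuperação de conhecimento distribuído em bases de dados heterogêneas [Design, implementation and validation of an approach for integrating and retrieving knowledge distributed across heterogeneous databases]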

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia e Gestão do Conhecimento, Florianópolis, 2010. With the growing demand for and creation of knowledge bases for a wide range of purposes, and their availability through the Internet, a need has emerged to organize this knowledge and to integrate it, so as to make it more accessible and easier to maintain and use, given that these bases are dispersed and heterogeneous in format. This dissertation proposes a system that integrates knowledge from databases in a generic context, using as a case study emergency care at CIT (Centro de Informações Toxicológicas de Santa Catarina, the Toxicological Information Center of Santa Catarina). The system also supports the maintenance and manipulation of this artifact by grouping information retrieval, semantic enhancement, query expansion and phonetic techniques in a single mechanism. The best options offered by previous studies in these areas were evaluated through a systematic literature review in order to find the best combination to use in the mechanism. The final product was also analyzed in a comparison with mechanisms previously used by professionals in emergency care.

    The application of ontologies in digital library: a meta-synthesis approach

    Objective: The present study examines the current status of the use of ontologies in the digital library area through the analysis of studies in this field. Methodology: The present research is a qualitative study using the meta-synthesis method. Data were collected using the library method and analyzed using the seven-step meta-synthesis process of Sandelowski & Barroso. The research population includes related studies (articles and dissertations) on ontology applications in digital libraries retrieved from scientific databases. The CASP evaluation checklist was used to ensure the quality of the studies. Finally, out of 267 retrieved studies, 43 titles were selected and analyzed. Findings: Analysis of studies on ontology application in the digital library led to the identification of 4 categories, 8 components, and 48 dimensions in this field. The main categories include the application of ontology in digital library services, the application of ontology in digital library structures, the basis of ontology application in digital libraries, and the application of ontologies in covering the subject domain of digital libraries. Originality: This study, which appears to be the first of its kind, presents a comprehensive analysis of the field of ontology application in digital libraries, its current situation and its dimensions. Also, by clarifying the topics that have been less addressed, it provides new research subjects for researchers in this field.

    Social shaping of digital publishing: exploring the interplay between culture and technology

    The processes and forms of electronic publishing have been changing since the advent of the Web. In recent years, the open access movement has been a major driver of scholarly communication, and change is also evident in other fields such as e-government and e-learning. Whilst many changes are driven by technological advances, an altered social reality is also pushing the boundaries of digital publishing. With 23 articles and 10 posters, Elpub 2012 focuses on the social shaping of digital publishing and explores the interplay between culture and technology. This book contains the proceedings of the conference, consisting of 11 accepted full articles and 12 articles accepted as extended abstracts. The articles are presented in groups, and cover the topics: digital scholarship and publishing; special archives; libraries and repositories; digital texts and readings; and future solutions and innovations. Offering an overview of the current situation and exploring the trends of the future, this book will be of interest to all those whose work involves digital publishing

    Improving search engines with open Web-based SKOS vocabularies

    Dissertation for the degree of Master in Informatics Engineering (Mestre em Engenharia Informática). The volume of digital information keeps growing, and even though organizations are making more of this information available, users without the proper tools have great difficulty retrieving documents about subjects of interest. Good information retrieval mechanisms are crucial for answering user information needs. Nowadays, search engines are unavoidable: they are an essential feature in document management systems. However, achieving good relevancy is a difficult problem, particularly when dealing with specific technical domains where vocabulary mismatch problems can be detrimental. Numerous research works have found that exploiting the lexical or semantic relations of terms in a collection attenuates this problem. In this dissertation, we aim to improve search results and user experience by investigating the use of potentially connected Web vocabularies in information retrieval engines. In the context of open Web-based SKOS vocabularies, we propose a query expansion framework implemented in a widely used IR system (Lucene/Solr) and evaluated using standard IR evaluation datasets. The components described in this thesis were applied in the development of a new search system that was integrated with a rapid application development tool in the context of an internship at Quidgest S.A. Fundação para a Ciência e Tecnologia - ImTV research project, in the context of the UTAustin-Portugal collaboration (UTA-Est/MAI/0010/2009); QSearch project (FCT/Quidgest
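
    A small sketch can show what vocabulary-based query expansion of this kind might look like. It does not reproduce the framework from the dissertation: the `Concept` class, the `expand` function, the boost values and the toy vocabulary are all hypothetical, and the SKOS data is modeled as plain Python objects rather than parsed from RDF. The output uses the boost (`^`) and `OR` syntax of the Lucene classic query parser.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A simplified stand-in for a SKOS concept."""
    pref_label: str                                 # skos:prefLabel
    alt_labels: list = field(default_factory=list)  # skos:altLabel values
    narrower: list = field(default_factory=list)    # labels of skos:narrower concepts


def _quote(label):
    """Quote multi-word labels so they are treated as phrases."""
    return f'"{label}"' if " " in label else label


def expand(term, vocabulary, alt_boost=0.5, narrower_boost=0.3):
    """Build an expanded query string for `term`.

    The original term keeps full weight; alternative labels and narrower
    concepts are added with lower, assumed boost values.
    """
    clauses = [_quote(term)]
    concept = vocabulary.get(term.lower())
    if concept is not None:
        clauses += [f"{_quote(a)}^{alt_boost}" for a in concept.alt_labels]
        clauses += [f"{_quote(n)}^{narrower_boost}" for n in concept.narrower]
    return " OR ".join(clauses)


if __name__ == "__main__":
    # Toy vocabulary keyed by lower-cased preferred label (invented data).
    vocabulary = {
        "car": Concept("car", alt_labels=["automobile", "motor car"], narrower=["taxi"]),
    }
    print(expand("car", vocabulary))   # car OR automobile^0.5 OR "motor car"^0.5 OR taxi^0.3
    print(expand("boat", vocabulary))  # boat
```

    Giving expansion terms a lower boost than the original term is one common way to limit query drift; the specific weights here are placeholders, not values from the dissertation.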