4 research outputs found

    An effective biomedical document classification scheme in support of biocuration: addressing class imbalance.

    Published literature is an important source of knowledge supporting biomedical research. Given the large and increasing number of publications, automated document classification plays an important role in biomedical research. Effective biomedical document classifiers are especially needed for bio-databases, in which the information stems from many thousands of biomedical publications that curators must read in detail and annotate. In addition, biomedical document classification often amounts to identifying a small subset of relevant publications within a much larger collection of available documents. As such, addressing class imbalance is essential to a practical classifier. We present here an effective classification scheme for automatically identifying papers among a large pool of biomedical publications that contain information relevant to a specific topic, which the curators are interested in annotating. The proposed scheme is based on a meta-classification framework using cluster-based under-sampling combined with named-entity recognition and statistical feature selection strategies. We examined the performance of our method over a large imbalanced data set that was originally manually curated by the Jackson Laboratory's Gene Expression Database (GXD). The set consists of more than 90 000 PubMed abstracts, of which about 13 000 documents are labeled as relevant to GXD while the others are not relevant. Our results, 0.72 precision, 0.80 recall and 0.75 F-measure, demonstrate that our proposed classification scheme effectively categorizes such a large data set in the face of data imbalance.
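To make the idea of cluster-based under-sampling concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how clusters are formed or how samples are drawn from them, so this sketch assumes a tiny k-means over the majority class followed by proportional random sampling from each cluster. The function names (`kmeans`, `cluster_undersample`), the toy 2-D points, and the 90-to-13 class ratio (a scaled-down echo of the data set sizes in the abstract) are all hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means; returns one cluster index per point (illustrative only)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep about n_keep majority samples, drawn from each cluster
    in proportion to the cluster's size (an assumed sampling rule)."""
    rng = random.Random(seed)
    assign = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, a in zip(majority, assign) if a == c]
        quota = round(n_keep * len(members) / len(majority))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept


# Toy data: 90 "not relevant" points versus 13 "relevant" points.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 3), data_rng.gauss(0, 3)) for _ in range(90)]
minority = [(data_rng.gauss(5, 1), data_rng.gauss(5, 1)) for _ in range(13)]

# Shrink the majority class to roughly the size of the minority class.
balanced_majority = cluster_undersample(majority, n_keep=len(minority))
print(len(majority), "->", len(balanced_majority), "majority samples kept")
```

Sampling per cluster rather than uniformly at random is meant to keep the reduced majority set representative of the different kinds of non-relevant documents. Because quotas are rounded per cluster, the number of samples kept can differ from `n_keep` by one or two.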

    The Role of Title, Metadata and Abstract in Identifying Clinically Relevant Journal Articles

    Access to current clinical information involves searches of bibliographic databases, such as MEDLINE®, and subsequent evaluation of retrieval results for relevance to a specific clinical situation and quality of the reported research. We establish the amount of information that needs to be provided by an information retrieval system to assist healthcare practitioners in identifying clinically relevant information and evaluating its potential strength of evidence. We find 92% of titles informative enough for a practitioner to correctly classify publications as clinical, but not sufficient for classification of research quality. We suggest automatic organization of retrieval results into strength of evidence categories to supplement title-based judgments and provide quick access to the abstracts of the most promising articles. We find information in the abstracts sufficient to identify articles potentially immediately useful for clinical decision support. These findings are important to the design of information retrieval systems supporting small, low-bandwidth handheld computers.

    Níveis de utilização dos sistemas de informação de base tecnológica: a gamification como estratégia de melhoria [Levels of use of technology-based information systems: gamification as an improvement strategy]

    The exploration of information and communication technologies has undergone considerable growth and development. However, people, as users of technology in corporate, personal, and social settings, do not always use these resources in a thorough and systematic way, which prevents them from taking full advantage of what is available. Creating a positive work experience and, above all, ensuring complete and adequate use of the available means in a way that serves management's interests in control and decision making requires managers to pursue innovative motivation initiatives. One such initiative is gamification, i.e. the use of game elements in non-game contexts. This technique has been adopted in recent years by various organizations across a wide range of fields to increase user engagement in a given environment. This exploratory work presents a systematic process of document analysis and text mining applied to 68 scientific documents related to gamification. The approach makes it possible to identify the potential of game elements for improving employees' experience in the workplace. In addition, an instrument was designed to assess how CH Business Consulting's employees perceive their experiences with corporate applications.