    DutchHatTrick: semantic query modeling, ConText, section detection, and match score maximization

    This report discusses the collaborative work of ErasmusMC, the University of Twente, and the University of Amsterdam on the TREC 2011 Medical track, where the task is to retrieve patient visits from the University of Pittsburgh NLP Repository for 35 topics. The repository consists of 101,711 patient reports, and a patient visit is recorded in one or more reports.

    Literature-based priors for gene regulatory networks

    Motivation: The use of prior knowledge to improve gene regulatory network modelling has often been proposed. In this paper we present the first research on the massive incorporation of prior knowledge from literature for Bayesian network learning of gene networks. As the publication rate of scientific papers grows, updating online databases, which have been proposed as potential prior knowledge in past research, becomes increasingly challenging. The novelty of our approach lies in the use of gene-pair association scores that describe the overlap in the contexts in which the genes are mentioned, generated from a large database of scientific literature, distilling the information contained in a huge number of documents into a simple, clear format. Results: We present a method to transform such literature-based gene association scores into network prior probabilities, and apply it to learn gene sub-networks for yeast, E. coli and human. We also investigate the effect of weighting the influence of the prior knowledge. Our findings show that literature-based priors can improve both the number of true regulatory interactions present in the network and the accuracy of expression value prediction on genes, in comparison to a network learnt solely from expression data. Networks learnt with priors also show an improved biological interpretation, with identified subnetworks that coincide with known biological pathways.
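
    A minimal sketch of how such a score-to-prior transformation might look, assuming a hypothetical logistic tempering controlled by a weight kappa; the function name and the exact mapping are illustrative, not the paper's method:

```python
import numpy as np

def scores_to_edge_priors(scores, kappa=1.0):
    """Map literature gene-pair association scores in (0, 1) to prior
    edge probabilities for Bayesian network structure learning.

    kappa weights the influence of the literature (illustrative rule,
    not the paper's exact transformation): kappa = 0 collapses every
    prior to an uninformative 0.5, while larger kappa pushes priors
    further toward 0 or 1.
    """
    s = np.clip(np.asarray(scores, dtype=float), 1e-9, 1 - 1e-9)
    logit = kappa * np.log(s / (1.0 - s))   # temper the log-odds
    return 1.0 / (1.0 + np.exp(-logit))     # back to a probability

# Three gene pairs with weak, neutral, and strong literature support:
print(scores_to_edge_priors([0.1, 0.5, 0.95], kappa=2.0))
# -> approximately [0.012, 0.5, 0.997]
```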

    Associative conceptual space-based information retrieval systems

    In this 'Information Era', with the availability of large collections of books, articles, journals, CD-ROMs, video films and so on, there exists an increasing need for intelligent information retrieval systems that enable users to find the desired information easily. Many attempts have been made to construct such retrieval systems, including the electronic ones used in libraries and the search engines for the World Wide Web. In many cases, however, the so-called 'precision' and 'recall' of these systems leave much to be desired. In this paper, a new AI-based retrieval system is proposed, inspired by, among other things, the WEBSOM algorithm. However, contrary to that approach, where domain knowledge is extracted from the full text of all books, we propose a system where certain specific meta-information is automatically assembled using only the index of every document. This knowledge extraction process results in a new type of concept space, the so-called Associative Conceptual Space, where the 'concepts' found in all documents are clustered using a Hebbian-type learning algorithm. Each document can then be characterised by comparing the concepts occurring in it to those present in the associative conceptual space. Using these characterisations, all documents can be clustered such that semantically similar documents lie close together on a Self-Organising Map, which can easily be inspected by its user.
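
    The Hebbian clustering step can be pictured with a small sketch, assuming the 'concepts' are simply index terms and that the link between two terms is reinforced whenever they co-occur in a document's index, with a slow global decay; the update rule and all names below are illustrative, not the paper's exact algorithm:

```python
import itertools
import numpy as np

def build_association_matrix(doc_indexes, vocab, eta=0.1, decay=0.001):
    """Hebbian-style accumulation of concept associations (illustrative).

    doc_indexes: list of sets of index terms, one set per document.
    Terms that co-occur in a document's index strengthen their link
    (Hebbian reinforcement); all links slowly decay otherwise.
    """
    idx = {term: i for i, term in enumerate(vocab)}
    W = np.zeros((len(vocab), len(vocab)))
    for terms in doc_indexes:
        W *= (1.0 - decay)                      # slow global decay
        for a, b in itertools.combinations(sorted(terms & idx.keys()), 2):
            i, j = idx[a], idx[b]
            W[i, j] += eta * (1.0 - W[i, j])    # bounded reinforcement
            W[j, i] = W[i, j]                   # keep the matrix symmetric
    return W

docs = [{"neural", "network", "learning"},
        {"retrieval", "index", "network"},
        {"neural", "learning"}]
vocab = sorted(set().union(*docs))
print(np.round(build_association_matrix(docs, vocab), 3))
```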

    Concept based document retrieval for genomics literature

    Cross Language Information Retrieval for Biomedical Literature

    Efficient GPU-accelerated fitting of observational health-scaled stratified and time-varying Cox models

    The Cox proportional hazards model stands as a widely used semi-parametric approach for survival analysis in medical research and many other fields. Numerous extensions of the Cox model have further expanded its versatility. Statistical computing challenges arise, however, when applying many of these extensions to the increasing complexity and volume of modern observational health datasets. To address these challenges, we demonstrate how to employ massive parallelization through graphics processing units (GPUs) to enhance the scalability of the stratified Cox model, the Cox model with time-varying covariates, and the Cox model with time-varying coefficients. First, we establish how the Cox model with time-varying coefficients can be transformed into the Cox model with time-varying covariates when using discrete time-to-event data. We then demonstrate how to recast both of these into a stratified Cox model and identify their shared computational bottleneck, which arises when evaluating the now-segmented partial likelihood and its gradient with respect to the regression coefficients at scale. These computations mirror a highly transformed segmented scan operation. While this bottleneck is not an immediately obvious target for multi-core parallelization, we convert it into an unsegmented operation to leverage the efficient many-core parallel scan algorithm. Our massively parallel implementation significantly accelerates model fitting of large-scale, high-dimensional Cox models with stratification or time-varying effects, delivering an order-of-magnitude speedup over traditional central processing unit-based implementations.
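
    The segmented-to-unsegmented conversion can be illustrated with the classic scan trick of pairing each value with a segment head flag and scanning under a modified associative operator; the sketch below applies the operator sequentially for clarity, whereas on a GPU it would drive a standard parallel scan. This illustrates the general technique only, not the authors' implementation:

```python
import numpy as np

def seg_op(a, b):
    """Associative operator on (flag, value) pairs that expresses a
    segmented +-scan as an ordinary (unsegmented) scan: the running
    value resets whenever the right operand starts a new segment."""
    fa, va = a
    fb, vb = b
    return (fa | fb, vb if fb else va + vb)

def segmented_cumsum(values, head_flags):
    """Inclusive segmented prefix sum via one unsegmented scan.
    head_flags[i] == 1 marks the start of a new stratum/segment."""
    acc = (0, 0.0)
    out = []
    for pair in zip(head_flags, values):
        acc = seg_op(acc, pair)   # on a GPU: one parallel scan pass
        out.append(acc[1])
    return np.array(out)

# Two strata: [1, 2, 3] and [10, 20]
vals = [1.0, 2.0, 3.0, 10.0, 20.0]
flags = [1, 0, 0, 1, 0]
print(segmented_cumsum(vals, flags))   # -> [ 1.  3.  6. 10. 30.]
```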

    Massive Parallelization of Massive Sample-size Survival Analysis

    Large-scale observational health databases are increasingly popular for conducting comparative effectiveness and safety studies of medical products. However, the increasing number of patients poses computational challenges when fitting survival regression models in such studies. In this paper, we use graphics processing units (GPUs) to parallelize the computational bottlenecks of massive sample-size survival analyses. Specifically, we develop and apply time- and memory-efficient single-pass parallel scan algorithms for Cox proportional hazards models and forward-backward parallel scan algorithms for Fine-Gray models, for analyses with and without competing risks, using a cyclic coordinate descent optimization approach. We demonstrate that GPUs accelerate the computation of fitting these complex models in large databases by orders of magnitude as compared to traditional multi-core CPU parallelism. Our implementation enables efficient large-scale observational studies involving millions of patients and thousands of patient characteristics.
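
    To see why a single-pass scan suffices, note that the risk-set denominators of the Cox partial likelihood reduce to one cumulative sum over subjects sorted by time, which is exactly the operation a GPU prefix scan accelerates. A minimal NumPy sketch, assuming no tied event times (illustrative, not the authors' implementation):

```python
import numpy as np

def cox_log_likelihood(x_beta, event, time):
    """Cox partial log-likelihood via a single cumulative sum --
    the scan structure that the paper parallelizes on the GPU.

    x_beta: linear predictors x_i @ beta; event: 1 if the event was
    observed, 0 if censored; time: observation time. Assumes no tied
    event times, for brevity.
    """
    order = np.argsort(-time)        # latest observation times first
    xb, ev = x_beta[order], event[order]
    # Risk-set denominators: cumulative sum of exp(x_beta) over all
    # subjects still at risk -- one forward scan in this ordering.
    risk = np.cumsum(np.exp(xb))
    return np.sum(ev * (xb - np.log(risk)))

rng = np.random.default_rng(0)
n = 5
xb = rng.normal(size=n)
print(cox_log_likelihood(xb, np.array([1, 0, 1, 1, 0]),
                         rng.uniform(size=n)))
```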