
    An Ontology-Based Recommender System with an Application to the Star Trek Television Franchise

    Collaborative filtering-based recommender systems have proven extremely successful in settings where user preference data on items is abundant. However, collaborative filtering algorithms are hindered by their vulnerability to the item cold-start problem and their general lack of interpretability. Ontology-based recommender systems exploit hierarchical organizations of users and items to enhance browsing, recommendation, and profile construction. While ontology-based approaches address the shortcomings of their collaborative filtering counterparts, ontological organizations of items can be difficult to obtain for items that mostly belong to the same category (e.g., television series episodes). In this paper, we present an ontology-based recommender system that integrates the knowledge represented in a large ontology of literary themes to produce fiction content recommendations. The main novelty of this work is an ontology-based method for computing similarities between items and its integration with the classical Item-KNN (K-nearest neighbors) algorithm. As a case study, we evaluated the proposed method against other approaches by performing the classical rating prediction task on a collection of Star Trek television series episodes in an item cold-start scenario. This transverse evaluation provides insights into the utility of different information resources and methods for the initial stages of recommender system development. We found our proposed method to be a convenient alternative to collaborative filtering approaches for collections of mostly similar items, particularly when other content-based approaches are not applicable or otherwise unavailable. Aside from the new methods, this paper contributes a testbed for future research and an online framework for collaboratively extending the ontology of literary themes to cover other narrative content.
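
    To make the core idea concrete, here is a minimal sketch of an ontology-based item similarity plugged into Item-KNN rating prediction. The toy parent-of ontology, the theme names, and the Jaccard-over-ancestor-closures similarity are illustrative assumptions, not the paper's actual ontology or measure.

```python
# Minimal sketch: ontology-based item similarity + Item-KNN prediction.
# The ontology, theme names, and similarity measure are illustrative assumptions.

PARENT = {  # hypothetical toy theme ontology: child theme -> parent theme
    "the desire for vengeance": "vengeance",
    "vengeance": "conflict",
    "first contact": "the alien encounter",
    "the alien encounter": "science fiction",
}

def ancestors(theme):
    """Return a theme together with all of its ancestors in the ontology."""
    out = {theme}
    while theme in PARENT:
        theme = PARENT[theme]
        out.add(theme)
    return out

def item_similarity(themes_a, themes_b):
    """Jaccard similarity over the ancestor closures of two items' theme sets."""
    closure_a = set().union(*(ancestors(t) for t in themes_a)) if themes_a else set()
    closure_b = set().union(*(ancestors(t) for t in themes_b)) if themes_b else set()
    union = closure_a | closure_b
    return len(closure_a & closure_b) / len(union) if union else 0.0

def predict_rating(user_ratings, item_themes, target_themes, k=5):
    """Item-KNN: similarity-weighted average of the user's ratings on the
    k rated items most similar to the target item."""
    neighbours = sorted(
        ((item_similarity(target_themes, item_themes[i]), r)
         for i, r in user_ratings.items()),
        reverse=True,
    )[:k]
    weight = sum(s for s, _ in neighbours)
    return sum(s * r for s, r in neighbours) / weight if weight else None
```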

    Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface.

    Objective: UK primary care databases, which contain diagnostic, demographic and prescribing information for millions of patients geographically representative of the UK, represent a significant resource for health services and clinical research. They can be used to identify patients with a specified disease or condition (phenotyping) and to investigate patterns of diagnosis and symptoms. Currently, extracting such information manually is time-consuming and requires considerable expertise. In order to exploit more fully the potential of these large and complex databases, our interdisciplinary team developed generic methods allowing access to different types of user. Materials and methods: Using the Clinical Practice Research Datalink database, we have developed an online user-focused system (TrialViz), which enables users to interactively select suitable medical general practices based on two criteria: suitability of the patient base for the intended study (phenotyping) and measures of data quality. Results: An end-to-end system, underpinned by an innovative search algorithm, allows the user to extract information in near real-time via an intuitive query interface and to explore this information using interactive visualization tools. A usability evaluation of this system produced positive results. Discussion: We present the challenges and results in the development of TrialViz and our plans for its extension to wider applications of clinical research. Conclusions: Our fast search algorithms and intuitive query interface represent a significant advance for users of clinical research databases.
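
    The two selection criteria can be pictured with a short sketch over a hypothetical data model; the Practice fields, thresholds, and ranking rule below are assumptions for illustration, not TrialViz's actual schema or search algorithm.

```python
from dataclasses import dataclass

@dataclass
class Practice:
    practice_id: str
    n_patients: int
    n_with_phenotype: int   # patients matching the study's diagnostic code list
    data_quality: float     # hypothetical 0..1 score, e.g. recording completeness

def select_practices(practices, min_prevalence=0.01, min_quality=0.8):
    """Keep practices meeting both criteria; rank the best candidates first."""
    eligible = [
        p for p in practices
        if p.n_patients > 0
        and p.n_with_phenotype / p.n_patients >= min_prevalence
        and p.data_quality >= min_quality
    ]
    return sorted(eligible,
                  key=lambda p: (p.data_quality, p.n_with_phenotype),
                  reverse=True)
```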

    Addendum to Informatics for Health 2017: Advancing both science and practice

    This article presents the presentation and poster abstracts that were mistakenly omitted from the original publication.

    Care episode retrieval: distributional semantic models for information retrieval in the clinical domain

    Patients' health-related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a (possibly unfinished) care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on an experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the proposed methods outperform a state-of-the-art search engine (Lucene) on the retrieval task.
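
    One common way to lift word-level distributional semantics to episode-level retrieval is to average the word vectors of an episode's notes and rank stored episodes by cosine similarity. The sketch below assumes pre-trained vectors (for example, from a word2vec model) in a plain dict; it shows one plausible aggregation scheme, not necessarily the one used in the paper.

```python
import numpy as np

def episode_vector(tokens, word_vectors):
    """Mean of the word vectors of an episode's tokens (OOV tokens skipped)."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else None

def most_similar_episodes(query_tokens, episodes, word_vectors, top_n=10):
    """Rank stored episodes by cosine similarity to a (possibly unfinished)
    query episode; `episodes` maps episode id -> token list."""
    q = episode_vector(query_tokens, word_vectors)
    if q is None:
        return []
    scored = []
    for episode_id, tokens in episodes.items():
        v = episode_vector(tokens, word_vectors)
        if v is None:
            continue
        sim = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        scored.append((sim, episode_id))
    return sorted(scored, reverse=True)[:top_n]
```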

    FinBook: literary content as digital commodity

    This short essay explains the significance of the FinBook intervention, and invites the reader to participate. We have associated each chapter within this book with a financial robot (FinBot), and created a market whereby book content will be traded with financial securities. As human labour increasingly consists of unstable and uncertain work practices and as algorithms replace people on the virtual trading floors of the world's markets, we see members of society taking advantage of FinBots to invest and make extra funds. Bots of all kinds are making financial decisions for us, searching online on our behalf to help us invest and to consume products and services. Our contribution to this compilation is to turn the collection of chapters in this book into a dynamic investment portfolio, and thereby play out what might happen to the process of buying and consuming literature in the not-so-distant future. By attaching identities (through QR codes) to each chapter, we create a market in which the chapter can ‘perform’. Our FinBots will trade based on features extracted from the authors’ words in this book: the political, ethical and cultural values embedded in the work, and the extent to which the FinBots share authors’ concerns; and the performance of chapters amongst those human and non-human actors that make up the market and readership. In short, the FinBook model turns our work and the work of our co-authors into an investment portfolio, mediated by the market and the attention of readers. By creating a digital economy specifically around the content of online texts, our chapter and the FinBook platform aim to challenge the reader to consider how their personal values align them with individual articles, and how these alignments become contested as readers make different value judgements about the financial performance of each chapter and the book as a whole. At the same time, by introducing ‘autonomous’ trading bots, we also explore how ‘network’ affordances differ between paper-based books, whose scarcity derives from their analogue form, and digital books, whose uniqueness is achieved through encryption. We thereby speak to wider questions about the conditions of an aggressive market in which algorithms subject cultural and intellectual items (books) to economic parameters, and about the increasing ubiquity of data bots as actors in our social, political, economic and cultural lives. We understand that our marketization of literature may be an uncomfortable juxtaposition against the conventionally imagined way a book is created, enjoyed and shared: it is intended to be.

    Towards sophisticated learning from EHRs: increasing prediction specificity and accuracy using clinically meaningful risk criteria

    Computer-based analysis of Electronic Health Records (EHRs) has the potential to provide major novel insights that benefit both specific individuals, in the context of personalized medicine, and population-wide health care and policy. The present paper introduces a novel algorithm that uses machine learning to discover longitudinal patterns in disease diagnoses. Two key technical novelties are introduced: a novel learning paradigm that enables greater learning specificity, and a risk-driven method for identifying confounding diagnoses. We present a series of experiments that demonstrate the effectiveness of the proposed techniques and reveal novel insights regarding the most promising future research directions.

    Extractive Summarization: Experimental work on nursing notes in Finnish

    Natural Language Processing (NLP) is a subfield of artificial intelligence and linguistics concerned with how computers interact with human language. With increasing computational power and advances in technology, researchers have successfully proposed various NLP tasks that have already been implemented in real-world applications. Automated text summarization is one of the tasks that has not yet fully matured, particularly in the health sector. Success in this task would enable healthcare professionals to grasp a patient's history in minimal time, resulting in the faster decisions required for better care. Automatic text summarization is a process that helps shorten a large text without sacrificing important information. This can be achieved by paraphrasing the content (the abstractive method) or by concatenating relevant extracted sentences (the extractive method). In general, this process requires converting text into numerical form, after which a method is executed to identify and extract relevant text. This thesis explores NLP techniques used in extractive text summarization, particularly in the health domain. The work includes a comparison of basic summarization models applied to a corpus of patient notes written by nurses in Finnish. Concepts and research studies required to understand the implementation are documented along with a description of the code. A Python-based project is structured to build a corpus and execute multiple summarization models. For this thesis, we observe the performance of two textual embeddings: Term Frequency-Inverse Document Frequency (TF-IDF), which is based on a simple statistical measure, and Word2Vec, which is based on neural networks. For both models, LexRank, an unsupervised stochastic graph-based sentence scoring algorithm, is used for sentence extraction, and random selection is used as a baseline for evaluation. To evaluate and compare the performance of the models, summaries of 15 patient care episodes from each model were given to two human evaluators for manual evaluation. On this small sample, both evaluators agreed in preferring the summaries produced by Word2Vec LexRank over those generated by TF-IDF LexRank. Both evaluators also judged both models to perform better than the random selection baseline.
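
    A LexRank-style scorer over TF-IDF sentence vectors fits in a few lines; the similarity threshold, damping factor, and iteration count below are conventional defaults rather than the thesis's actual settings, and the Word2Vec variant would substitute averaged word vectors for the TF-IDF rows.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def lexrank_summary(sentences, n_select=3, threshold=0.1, damping=0.85):
    """Score sentences by centrality in a TF-IDF cosine-similarity graph
    (LexRank-style power iteration) and return the top ones in document order."""
    tfidf = TfidfVectorizer().fit_transform(sentences)  # rows are L2-normalized,
    sim = (tfidf @ tfidf.T).toarray()                   # so this is cosine similarity
    adj = np.where(sim >= threshold, sim, 0.0)          # drop weak edges
    np.fill_diagonal(adj, 0.0)
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0                       # isolated sentences: no edges
    transition = adj / row_sums                         # row-stochastic matrix
    n = len(sentences)
    scores = np.full(n, 1.0 / n)
    for _ in range(50):                                 # PageRank power iteration
        scores = (1 - damping) / n + damping * (transition.T @ scores)
    top = sorted(np.argsort(scores)[-n_select:])        # restore original order
    return [sentences[i] for i in top]
```

    The random-selection baseline mentioned in the abstract would simply sample n_select sentence indices uniformly instead of power-iterating the graph.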

    The Bolivian Hyperinflation and Stabilization

    Chapter 1 gives a brief introduction to the Bolivian economy. Chapter 2 provides an overview of the political economy of macroeconomic policymaking in Bolivia since the 1952 Revolution. Great stress is put on the weakness of fiscal institutions in the face of heavy social and sectoral demands. Chapter 3 highlights some of the main directions of development policy during 1952-85, especially involving public investment spending and trade policy. In chapter 4 we consider important characteristics of Bolivia's international trade, focusing both on structural features (e.g., the heavy dependence on a small number of primary commodities) and on policy choices. Chapter 5 describes the process of foreign debt accumulation, which was the counterpart of the large budget deficits of the public sector in the 1970s and early 1980s. Chapter 6 lays out the dynamics of the hyperinflation during 1982-85, focusing on the complex causal links among the budget deficit, the money supply, the exchange rate, and the price level. In chapter 7 we detail the process of stabilization since 1985 and discuss some of the general lessons about ending high inflation that might be applied to other economies in the region. Chapter 8 describes the novel arrangements that Bolivia has negotiated in order to escape the severe overhang of external debt. In the concluding chapter 9, we discuss briefly the challenges facing Bolivia in the future, once stabilization has been accomplished.
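
    The deficit-money-price chain traced in chapter 6 is conventionally formalized with a Cagan-style money demand function; the display below is the standard textbook framework, offered only to make that chain concrete, not as the authors' own specification.

```latex
% Standard Cagan-style framework (an illustrative assumption, not the book's model):
% real money demand falls with expected inflation, and a deficit financed by
% money creation must be covered by seigniorage.
\[
  \frac{M_t}{P_t} = C\, e^{-\alpha \pi^{e}_t},
  \qquad
  \underbrace{\frac{\dot{M}_t}{P_t}}_{\text{seigniorage}}
  = \frac{\dot{M}_t}{M_t}\cdot\frac{M_t}{P_t} .
\]
```

    In that standard story, a larger money-financed deficit requires more seigniorage and hence faster money growth and inflation, while higher expected inflation shrinks the real-balance tax base, which is one common reading of the explosive 1982-85 dynamics.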