Online Transfer Learning for RSV Case Detection
Transfer learning has become a pivotal technique in machine learning and has
proven to be effective in various real-world applications. However, utilizing
this technique for classification tasks with sequential data often faces
challenges, primarily attributed to the scarcity of class labels. To address
this challenge, we introduce Multi-Source Adaptive Weighting (MSAW), an online
multi-source transfer learning method. MSAW integrates a dynamic weighting
mechanism into an ensemble framework, enabling automatic adjustment of weights
based on the relevance and contribution of each source (representing historical
knowledge) and target model (learning from newly acquired data). We demonstrate
the effectiveness of MSAW by applying it to detect Respiratory Syncytial Virus
cases within Emergency Department visits, utilizing multiple years of
electronic health records from the University of Pittsburgh Medical Center. Our
method demonstrates performance improvements over many baselines, including
refining pre-trained models with online learning as well as three static
weighting approaches, showing MSAW's capacity to integrate historical knowledge
with progressively accumulated new data. This study indicates the potential of
online transfer learning in healthcare, particularly for developing machine
learning models that dynamically adapt to evolving situations where new data is
incrementally accumulated. Comment: 10 pages, 2 figures
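The abstract does not spell out how MSAW updates its weights, so the following is only a minimal sketch of the general idea of dynamic ensemble weighting. It assumes a standard multiplicative-weights (Hedge-style) update with squared-error loss, which may differ from the authors' rule. The function names, the learning rate `eta`, and the toy data stream are all hypothetical.

```python
import math


def update_weights(weights, probs, label, eta=0.5):
    """Multiplicative-weights update (an assumed rule, not necessarily MSAW's).

    Each model's weight is scaled by exp(-eta * loss) and the weights are
    then renormalized so that they sum to 1.
    """
    # Squared error between each model's predicted probability and the 0/1 label.
    losses = [(p - label) ** 2 for p in probs]
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def ensemble_predict(weights, probs):
    """Weighted average of the individual models' predicted probabilities."""
    return sum(w * p for w, p in zip(weights, probs))


# Hypothetical stream: ([source_1, source_2, target] probabilities, true label).
stream = [
    ([0.9, 0.2, 0.5], 1),
    ([0.8, 0.3, 0.6], 1),
    ([0.1, 0.6, 0.3], 0),
]

weights = [1 / 3, 1 / 3, 1 / 3]  # start with equal trust in every model
predictions = []
for probs, label in stream:
    # Predict first, then learn from the revealed label (online setting).
    predictions.append(ensemble_predict(weights, probs))
    weights = update_weights(weights, probs, label)
```

In this toy stream the first source model is consistently closest to the labels, so its weight grows at the expense of the other two. A static weighting scheme would keep the initial weights unchanged.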
Exploiting Background Knowledge in Automated Discovery
Prior work in automated scientific discovery has been successful in finding patterns in data, given that a reasonably small set of mostly relevant features is specified. The work described in this paper places data in the context of large bodies of background knowledge. Specifically, data items are connected to multiple databases of background knowledge represented as inheritance networks. The system has made a practical impact on botanical toxicology research, which required linking cases of plant exposures to databases of botanical, geographical, and climate background knowledge.
Augmenting Medical Databases with Domain Knowledge
In this paper we discuss a method of linking databases with domain knowledge to provide an extended semantics for use with statistical, machine learning, and automated discovery programs. We focus on the use of data in conjunction with domain knowledge for automated discovery in medical databases, and show how an induction program can find new knowledge in a database by reasoning about classes and relationships that are implicit in the original data, but explicit in the representation of domain knowledge. Programs for automated, inductive discovery have been shown to be effective in discovering patterns from data. Some discoveries have been made that are important enough to be published in the literature of the scientific subject domain [9]. Although induction programs by themselves can make interesting discoveries, we focus here on removing the severe restriction that a learning program always works within a small, fixed, semantic bias. We illustrate these points in the domain of plant exposures, with the RL program [2] extended and applied to a large, multi-year database of toxic and non-toxic plant exposures. The present work is far from complete; however, it shows how knowledge bases and databases codified for other purposes can introduce an open-endedness to the bias within which an induction program operates. Our long-term view is to maintain access, perhaps over the internet, to large stores of background knowledge relevant to a given domain of inquiry. Our goal is that this background knowledge can be linked to the data for different induction problems to extend the semantic bias of the discovery program.
The WoRLD: Knowledge Discovery from Multiple Distributed Databases
Inductive machine learning offers techniques for discovering new knowledge from business, medical, and scientific databases. Most techniques assume that all the relevant information for discovery has been gathered and assembled into a single table or database. With multiple databases it is possible to combine features from several perspectives and thus move beyond the confines of an ontology that was fixed by the designers of a single database. We introduce WoRLD ("Worldwide Relational Learning Daemon"), a system that uses spreading activation to enable inductive learning from multiple tables in multiple databases spread across the network. We describe the paradigm and the system, provide demonstrations on synthetic data sets, and then replicate two real-world successes of automated discovery.
A Method for Detecting and Characterizing Multiple Outbreaks of Infectious Diseases
We have developed an automated system that reads emergency department clinical reports and constructs models of multiple, possibly overlapping outbreaks. The system relies on a Bayesian scoring metric and search algorithms to find appropriate models. The system has been tested on simulated and actual data with good results.
A Bayesian approach for detecting a disease that is not being modeled
Over the past decade, outbreaks of new or reemergent viruses such as severe acute respiratory syndrome (SARS) virus, Middle East respiratory syndrome (MERS) virus, and Zika have claimed thousands of lives and cost governments and healthcare systems billions of dollars. Because the appearance of new or transformed diseases is likely to continue, the detection and characterization of emergent diseases is an important problem. We describe a Bayesian statistical model that can detect and characterize previously unknown and unmodeled diseases from patient-care reports and evaluate its performance on historical data.
The design and evaluation of a Bayesian system for detecting and characterizing outbreaks of influenza
The prediction and characterization of outbreaks of infectious diseases such as influenza remains an open and important problem. This paper describes a framework for detecting and characterizing outbreaks of influenza and the results of testing it on data from ten outbreaks collected from two locations over five years. We model outbreaks with compartment models and explicitly model non-influenza influenza-like illnesses.
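The abstract does not say which compartment model the authors use. As a rough illustration of what a compartment model is, the sketch below assumes the classic SIR (susceptible, infected, recovered) form, stepped one day at a time. The function name `simulate_sir` and all parameter values are hypothetical, and this is not the paper's Bayesian framework.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR simulation (an assumed model, one step per day).

    beta  : transmission rate per day (assumed value supplied by the caller)
    gamma : recovery rate per day (assumed value supplied by the caller)
    Returns a list of (susceptible, infected, recovered) tuples, one per day,
    including day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # People move S -> I through contact and I -> R through recovery.
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical outbreak: 10,000 people, 10 initially infected.
history = simulate_sir(population=10_000, initial_infected=10,
                       beta=0.4, gamma=0.2, days=120)
peak_infected = max(i for _, i, _ in history)
```

A detection system along the lines the abstract describes would compare curves like this one against observed case counts. The non-influenza influenza-like illnesses would need their own model component, which this sketch omits.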