Learning programs by learning from failures
We describe an inductive logic programming (ILP) approach called learning
from failures. In this approach, an ILP system (the learner) decomposes the
learning problem into three separate stages: generate, test, and constrain. In
the generate stage, the learner generates a hypothesis (a logic program) that
satisfies a set of hypothesis constraints (constraints on the syntactic form of
hypotheses). In the test stage, the learner tests the hypothesis against
training examples. A hypothesis fails when it does not entail all the positive
examples or entails a negative example. If a hypothesis fails, then, in the
constrain stage, the learner learns constraints from the failed hypothesis to
prune the hypothesis space, i.e. to constrain subsequent hypothesis generation.
For instance, if a hypothesis is too general (entails a negative example), the
constraints prune generalisations of the hypothesis. If a hypothesis is too
specific (does not entail all the positive examples), the constraints prune
specialisations of the hypothesis. This loop repeats until either (i) the
learner finds a hypothesis that entails all the positive and none of the
negative examples, or (ii) there are no more hypotheses to test. We introduce
Popper, an ILP system that implements this approach by combining answer set
programming and Prolog. Popper supports infinite problem domains, reasoning
about lists and numbers, learning textually minimal programs, and learning
recursive programs. Our experimental results on three domains (toy game
problems, robot strategies, and list transformations) show that (i) constraints
drastically improve learning performance, and (ii) Popper can outperform
existing ILP systems, both in terms of predictive accuracies and learning
times. Comment: Accepted for the machine learning journal
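The generate, test, and constrain loop described in this abstract can be sketched on a toy problem. This is only an illustration under stated assumptions, not Popper itself: hypotheses here are conjunctions of propositional literals rather than logic programs, the names `entails` and `learn` are hypothetical, and plain Python set checks stand in for the answer set programming and Prolog components. In this toy setting a hypothesis with fewer literals is more general, so a subset is a generalisation and a superset is a specialisation.

```python
from itertools import combinations


def entails(hypothesis, example):
    """A conjunction entails an example if every literal it requires holds."""
    return hypothesis <= example


def learn(literals, positives, negatives, max_size=3):
    """Toy generate/test/constrain loop (an assumption-laden sketch).

    Returns (solution or None, number of hypotheses actually tested).
    """
    too_general = []   # failed hypotheses: prune their generalisations (subsets)
    too_specific = []  # failed hypotheses: prune their specialisations (supersets)
    tested = 0
    # Enumerate by increasing size so the first solution is textually minimal.
    for size in range(max_size + 1):
        for combo in combinations(sorted(literals), size):
            h = frozenset(combo)
            # Generate: skip hypotheses ruled out by learned constraints.
            if any(h <= g for g in too_general):
                continue
            if any(h >= s for s in too_specific):
                continue
            # Test: check the hypothesis against the training examples.
            tested += 1
            covers_all_pos = all(entails(h, p) for p in positives)
            covers_a_neg = any(entails(h, n) for n in negatives)
            if covers_all_pos and not covers_a_neg:
                return h, tested
            # Constrain: learn from the failure.
            if covers_a_neg:
                too_general.append(h)
            if not covers_all_pos:
                too_specific.append(h)
    return None, tested


if __name__ == "__main__":
    positives = [{"a", "b", "c"}, {"a", "b", "d"}]
    negatives = [{"a", "c", "d"}, {"b", "c"}]
    solution, tested = learn({"a", "b", "c", "d"}, positives, negatives)
    print(sorted(solution), tested)
```

Note that a single failed hypothesis can be both too general and too specific, in which case both kinds of constraint are recorded.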
Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (COVID-19).
Background: Estimation of the fraction and contagiousness of undocumented novel coronavirus (COVID-19) infections is critical for understanding the overall prevalence and pandemic potential of this disease. Many mild infections are typically not reported and, depending on their contagiousness, may support stealth transmission and the spread of documented infection. Methods: Here we use observations of reported infection and spread within China in conjunction with mobility data, a networked dynamic metapopulation model and Bayesian inference, to infer critical epidemiological characteristics associated with the emerging coronavirus, including the fraction of undocumented infections and their contagiousness. Results: We estimate 86% of all infections were undocumented (95% CI: [82%-90%]) prior to the Wuhan travel shutdown (January 23, 2020). Per person, these undocumented infections were 52% as contagious as documented infections ([44%-69%]) and were the source of infection for two-thirds of documented cases. Our estimate of the reproductive number (2.23; [1.77-3.00]) aligns with earlier findings; however, after travel restrictions and control measures were imposed, this number falls considerably. Conclusions: A majority of COVID-19 infections were undocumented prior to implementation of control measures on January 23, and these undocumented infections substantially contributed to virus transmission. These findings explain the rapid geographic spread of COVID-19 and indicate containment of this virus will be particularly challenging. Our findings also indicate that heightened awareness of the outbreak, increased use of personal protective measures, and travel restrictions have been associated with reductions of the overall force of infection; however, it is unclear whether this reduction will be sufficient to stem the virus spread.
Constructing refinement operators by decomposing logical implication
Inductive learning models [Plotkin 1971; Shapiro 1981] often use a search space of clauses, ordered by a generalization hierarchy. To find solutions in the model, search algorithms use different generalization and specialization operators. In this article we will decompose the quasi-ordering induced by logical implication into six increasingly weak orderings. The difference between two successive orderings will be small, and can therefore be understood easily. Using this decomposition, we will describe upward and downward refinement operators for all orderings, including θ-subsumption and logical implication.
Predicting under-5 diarrhea outbreaks in Botswana: Understanding the relationships between environmental variability and diarrhea transmission
Diarrhea is the second leading cause of death in children under-5; it kills more children than HIV/AIDS, measles, and malaria combined. Despite this significant health burden, our ability to anticipate and prepare for diarrhea outbreaks remains limited. Precipitation and temperature variability have been shown to affect diarrhea dynamics and therefore contribute to outbreak predictions, but the observed environment-diarrhea relationships are complex and context-specific, depending on local pathogen distribution, host population behavior, and physical environments. To date, studies in sub-Saharan Africa, where the burden of under-5 diarrhea is particularly high, are limited due to sparse diarrheal disease surveillance data. In this dissertation, we leverage unique under-5 diarrhea incidence data to explore the effects of meteorological variability on childhood diarrhea incidence and develop a real-time forecasting system for diarrheal disease in Botswana, where diarrhea remains an important cause of childhood morbidity and mortality. The study focuses on Chobe District, which has an annual dry (April – September) and wet (October – March) season, during which the Chobe River, the primary source of drinking water in the region, floods. Weekly cases of under-5 diarrhea in Chobe District exhibit strong seasonal dynamics with biannual outbreaks occurring during the wet and the dry season. In Chapter 1, we show that wet season diarrhea incidence is strongly associated with increased rainfall and Escherichia coli concentrations in the Chobe River, while dry season incidence is associated with declines in Chobe River flood height and increased total suspended solids in the river. In Chapter 2, we confirm the existence of an El Niño-Southern Oscillation teleconnection with southern Africa by demonstrating that La Niña conditions are associated with cooler temperatures, increased rainfall, and higher flooding in Chobe District during the wet season. In turn, we show that La Niña conditions lagged 0-5 months are associated with higher than average incidence of under-5 diarrhea in the early wet season (December – February). In Chapter 4, we develop and test an epidemiological forecast model for childhood diarrheal disease in Chobe District. The prediction system uses a compartmental susceptible-infected-recovered-susceptible (SIRS) model coupled with Bayesian data assimilation to infer relevant epidemiological parameter values and generate retrospective forecasts. The model system accurately forecasts diarrhea outbreaks up to six weeks before the predicted peak of the outbreak, and prediction accuracy increases over the progression of the outbreak. Many forecasts generated by the model system are more accurate than predictions made using only historical data trends. This dissertation work is an important step forward in our understanding of the links between proximal and distal climatic variability and childhood diarrhea in arid regions of sub-Saharan Africa. Furthermore, it advances methods for generating accurate long-term and short-term forecasts of under-5 diarrhea. We demonstrate the potential use of ENSO data, which are publicly available, to prepare for and mitigate diarrheal disease outbreaks in a low-resource setting up to 5 months in advance, and develop a model-inference system that can generate accurate predictions during an outbreak. Deaths caused by diarrhea are preventable using low-cost treatments. Hence, accurate predictions of diarrhea outbreak magnitudes could help healthcare providers and public health officials prepare for and mitigate the significant morbidity and mortality resulting from diarrhea outbreaks.
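For readers unfamiliar with the compartmental model named in this abstract, a generic SIRS system can be sketched as below. This is a textbook formulation stepped with a simple Euler scheme, not the dissertation's model: the function name `simulate_sirs` and all parameter values are hypothetical, and the Bayesian data assimilation step is omitted entirely.

```python
def simulate_sirs(beta, infectious_days, immunity_days, population,
                  initial_infected, days, dt=0.1):
    """Deterministic SIRS model integrated with Euler steps of size dt (days).

    beta            -- transmission rate per day (hypothetical value)
    infectious_days -- mean infectious period
    immunity_days   -- mean duration of immunity before returning to S
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt  # S -> I
            recoveries = i / infectious_days * dt            # I -> R
            waned = r / immunity_days * dt                   # R -> S
            s += waned - new_infections
            i += new_infections - recoveries
            r += recoveries - waned
        trajectory.append((s, i, r))
    return trajectory


if __name__ == "__main__":
    # Illustrative parameters only; they are not estimates from any study.
    run = simulate_sirs(beta=0.5, infectious_days=5, immunity_days=180,
                        population=100_000, initial_infected=10, days=100)
    peak_day = max(range(len(run)), key=lambda d: run[d][1])
    print("peak day:", peak_day)
```

Because recovered individuals return to the susceptible pool, this structure can produce recurring outbreaks, which is one reason it is a common choice for diseases without lasting immunity.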
Differentiable Inductive Logic Programming for Structured Examples
The differentiable implementation of logic yields a seamless combination of
symbolic reasoning and deep neural networks. Recent research, which has
developed a differentiable framework to learn logic programs from examples, can
even acquire reasonable solutions from noisy datasets. However, this framework
severely limits expressions for solutions, e.g., no function symbols are
allowed, and the shapes of clauses are fixed. As a result, the framework cannot
deal with structured examples. Therefore we propose a new framework to learn
logic programs from noisy and structured examples, including the following
contributions. First, we propose an adaptive clause search method by looking
through structured space, which is defined by the generality of the clauses, to
yield an efficient search space for differentiable solvers. Second, we propose
an enumeration algorithm for ground atoms, which determines a necessary and
sufficient set of ground atoms to perform differentiable inference functions.
Finally, we propose a new method to compose logic programs softly, enabling the
system to deal with complex programs consisting of several clauses. Our
experiments show that our new framework can learn logic programs from noisy and
structured examples, such as sequences or trees. Our framework can be scaled to
deal with complex programs that consist of several clauses with function
symbols. Comment: Accepted by AAAI202
Retrospective Parameter Estimation and Forecast of Respiratory Syncytial Virus in the United States
Recent studies have shown that systems combining mathematical modeling and Bayesian inference methods can be used to generate real-time forecasts of future infectious disease incidence. Here we develop such a system to study and forecast respiratory syncytial virus (RSV). RSV is the most common cause of acute lower respiratory infection and bronchiolitis. Advance warning of the epidemic timing and volume of RSV patient surges has the potential to reduce well-documented delays of treatment in emergency departments. We use a susceptible-infectious-recovered (SIR) model in conjunction with an ensemble adjustment Kalman filter (EAKF) and ten years of regional U.S. specimen data provided by the Centers for Disease Control and Prevention. The data and EAKF are used to optimize the SIR model and i) estimate critical epidemiological parameters over the course of each outbreak and ii) generate retrospective forecasts. The basic reproductive number, R0, is estimated at 3.0 (standard deviation 0.6) across all seasons and locations. The peak magnitude of RSV outbreaks is forecast with nearly 70% accuracy (i.e. nearly 70% of forecasts within 25% of the actual peak), four weeks before the predicted peak. This work represents a first step in the development of a real-time RSV prediction system.
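The accuracy criterion quoted in this abstract (a forecast counts as accurate when it falls within 25% of the actual peak) can be written out as a small scoring function. The sketch below is an assumption about how such a score might be computed, not the authors' evaluation code; the name `fraction_within_tolerance` and the sample numbers are purely illustrative.

```python
def fraction_within_tolerance(forecast_peaks, actual_peak, tolerance=0.25):
    """Share of forecasts whose peak magnitude lies within +/- tolerance
    (as a fraction of the actual peak) of the observed peak."""
    if not forecast_peaks:
        raise ValueError("at least one forecast is required")
    allowed_error = tolerance * actual_peak
    hits = sum(1 for f in forecast_peaks if abs(f - actual_peak) <= allowed_error)
    return hits / len(forecast_peaks)


if __name__ == "__main__":
    # Made-up forecast values for illustration; not data from the study.
    forecasts = [80, 100, 120, 130, 70]
    print(fraction_within_tolerance(forecasts, actual_peak=100))
```

In a retrospective evaluation, a score like this would typically be computed separately for each forecast lead time, for example four weeks before the predicted peak.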