Understanding Learned Models by Identifying Important Features at the Right Resolution
In many application domains, it is important to characterize how complex
learned models make their decisions across the distribution of instances. One
way to do this is to identify the features and interactions among them that
contribute to a model's predictive accuracy. We present a model-agnostic
approach to this task that makes the following specific contributions. Our
approach (i) tests feature groups, in addition to base features, and tries to
determine the level of resolution at which important features can be
determined, (ii) uses hypothesis testing to rigorously assess the effect of
each feature on the model's loss, (iii) employs a hierarchical approach to
control the false discovery rate when testing feature groups and individual
base features for importance, and (iv) uses hypothesis testing to identify
important interactions among features and feature groups. We evaluate our
approach by analyzing random forest and LSTM neural network models learned in
two challenging biomedical applications.

Comment: First two authors contributed equally to this work. Accepted for presentation at the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).
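As a rough illustration of the kind of procedure this abstract describes, below is a minimal sketch of permutation-based importance testing with false-discovery-rate control over feature groups. The function names, the permutation test, and the Benjamini-Hochberg correction are illustrative assumptions, not the paper's exact method.

    # Minimal sketch of permutation-based importance testing with FDR control.
    # All function and variable names here are illustrative, not from the paper.
    import numpy as np
    from statsmodels.stats.multitest import multipletests

    def permutation_pvalue(model, X, y, loss_fn, cols, n_perm=1000, rng=None):
        """p-value for H0: permuting feature group `cols` does not increase loss."""
        rng = rng or np.random.default_rng(0)
        base = loss_fn(y, model.predict(X))
        exceed = 0
        for _ in range(n_perm):
            Xp = X.copy()
            Xp[:, cols] = rng.permutation(Xp[:, cols], axis=0)
            if loss_fn(y, model.predict(Xp)) <= base:
                exceed += 1  # permuting this group did not hurt the model
        return (exceed + 1) / (n_perm + 1)

    # Test coarse feature groups first, then drill into the base features of
    # rejected groups, controlling the false discovery rate at each level.
    def test_level(model, X, y, loss_fn, groups, alpha=0.05):
        pvals = [permutation_pvalue(model, X, y, loss_fn, g) for g in groups]
        reject, _, _, _ = multipletests(pvals, alpha=alpha, method="fdr_bh")
        return [g for g, r in zip(groups, reject) if r]

Testing coarse groups first and descending only into the base features of rejected groups yields the hierarchical, resolution-seeking behaviour the abstract refers to.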
A Portfolio of Academic, Therapeutic Practice and Research Work. Including an Investigation of Psycho-Diagnostic Categories, Psychopathology and Counselling Psychologists' Talk.
Abstract Not Provided
Biomedical event extraction from abstracts and full papers using search-based structured prediction.
BACKGROUND: Biomedical event extraction has attracted substantial attention as it can assist researchers in understanding the plethora of interactions among genes that are described in publications in molecular biology. While most recent work has focused on abstracts, the BioNLP 2011 shared task evaluated the submitted systems on both abstracts and full papers. In this article, we describe our submission to the shared task, which decomposes event extraction into a set of classification tasks that can be learned either independently or jointly using the search-based structured prediction framework. Our intention is to explore how these two learning paradigms compare in the context of the shared task.

RESULTS: We report that models learned using search-based structured prediction exceed the accuracy of independently learned classifiers by 8.3 points in F-score, with the gains being more pronounced on the more complex Regulation events (13.23 points). Furthermore, we show how the trade-off between recall and precision can be adjusted in both learning paradigms and that search-based structured prediction achieves better recall at all precision points. Finally, we report on experiments with a simple domain-adaptation method, resulting in the second-best performance achieved by a single system.

CONCLUSIONS: We demonstrate that joint inference using the search-based structured prediction framework can achieve better performance than independently learned classifiers, thus demonstrating the potential of this learning paradigm for event extraction and other similarly complex information-extraction tasks.
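To make the decomposition concrete, here is a minimal sketch of event extraction split into trigger and argument classification stages. The Event structure and classifier interfaces are assumptions for illustration; the stages below are learned independently, whereas search-based structured prediction would couple these decisions during training.

    # Illustrative sketch only: decomposing event extraction into classification
    # stages, in the spirit of the system described above. Names and interfaces
    # are assumptions, not the authors' code.
    from dataclasses import dataclass, field

    @dataclass
    class Event:
        trigger: str
        event_type: str
        arguments: list = field(default_factory=list)  # (role, entity) pairs

    def extract_events(tokens, entities, trigger_clf, argument_clf):
        """Two classification stages: detect triggers, then attach arguments.
        Learned independently this is a pipeline; search-based structured
        prediction would instead score complete structures during training."""
        events = []
        for i, tok in enumerate(tokens):
            etype = trigger_clf(tokens, i)           # e.g. 'Regulation' or None
            if etype is None:
                continue
            event = Event(trigger=tok, event_type=etype)
            for ent in entities:
                role = argument_clf(tokens, i, ent, etype)  # e.g. 'Theme' or None
                if role is not None:
                    event.arguments.append((role, ent))
            events.append(event)
        return events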
Learning Statistical Models for Annotating Proteins with Function Information using Biomedical Text
Background: The BioCreative text mining evaluation investigated the application of text mining methods to the task of automatically extracting information from text in biomedical research articles. We participated in Task 2 of the evaluation. For this task, we built a system to automatically annotate a given protein with codes from the Gene Ontology (GO) using the text of an article from the biomedical literature as evidence.

Methods: Our system relies on simple statistical analyses of the full text article provided. We learn n-gram models for each GO code using statistical methods and use these models to hypothesize annotations. We also learn a set of Naïve Bayes models that identify textual clues of possible connections between the given protein and a hypothesized annotation. These models are used to filter and rank the predictions of the n-gram models.

Results: We report experiments evaluating the utility of various components of our system on a set of data held out during development, and experiments evaluating the utility of external data sources that we used to learn our models. Finally, we report our evaluation results from the BioCreative organizers.

Conclusion: We observe that, on the test data, our system performs quite well relative to the other systems submitted to the evaluation. From other experiments on the held-out data, we observe that (i) the Naïve Bayes models were effective in filtering and ranking the initially hypothesized annotations, and (ii) our learned models were significantly more accurate when external data sources were used during learning.
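A minimal sketch of the two-stage design described above, assuming toy bigram models and a pluggable Naïve Bayes filter; the class names and scoring are illustrative, not the system's actual implementation.

    # Sketch: score candidate GO codes with per-code n-gram models, then
    # filter/rank with a Naive Bayes step. The scoring below is a crude
    # add-one-style bigram affinity, chosen for brevity.
    from collections import Counter

    def bigrams(tokens):
        return list(zip(tokens, tokens[1:]))

    class CodeNgramModel:
        """Toy bigram model trained on texts annotated with one GO code."""
        def __init__(self, training_texts):
            self.counts = Counter(bg for t in training_texts for bg in bigrams(t))
            self.total = sum(self.counts.values())

        def score(self, article_tokens):
            # Higher = the article looks more like text carrying this code.
            return sum(self.counts[bg] + 1 for bg in bigrams(article_tokens)) / (
                self.total + 1)

    def hypothesize(article_tokens, models, nb_filter, top_k=10):
        ranked = sorted(models, key=lambda c: models[c].score(article_tokens),
                        reverse=True)[:top_k]
        # Keep only codes the Naive Bayes model links to the protein in context.
        return [c for c in ranked if nb_filter(article_tokens, c)]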
Feature Importance Explanations for Temporal Black-Box Models
Models in the supervised learning framework may capture rich and complex
representations over the features that are hard for humans to interpret.
Existing methods to explain such models are often specific to architectures and
data where the features do not have a time-varying component. In this work, we
propose TIME, a method to explain models that are inherently temporal in
nature. Our approach (i) uses a model-agnostic permutation-based approach to
analyze global feature importance, (ii) identifies the importance of salient
features with respect to their temporal ordering as well as localized windows
of influence, and (iii) uses hypothesis testing to provide statistical rigor.
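A hedged sketch of one way the permutation idea extends to temporal data, assuming a model that predicts from arrays of shape [samples, timesteps, features]; the function and its window convention are illustrative, not TIME's actual procedure.

    # Hypothetical sketch of window-localized permutation importance for a
    # temporal model. The predict-on-[n, T, d]-arrays API is an assumption.
    import numpy as np

    def temporal_importance(model, X, y, loss_fn, feature, window, n_perm=200):
        """Loss increase when `feature` is permuted inside time steps `window`.
        X has shape [n_samples, n_timesteps, n_features]."""
        rng = np.random.default_rng(0)
        base = loss_fn(y, model.predict(X))
        deltas = []
        for _ in range(n_perm):
            Xp = X.copy()
            # Shuffle across samples, only inside the chosen time window,
            # which breaks the feature's temporal alignment locally.
            idx = rng.permutation(len(X))
            Xp[:, window, feature] = X[idx][:, window, feature]
            deltas.append(loss_fn(y, model.predict(Xp)) - base)
        return float(np.mean(deltas))  # > 0 suggests the window matters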
Bacteria classification with an electronic nose employing artificial neural networks
This PhD thesis describes research for a medical application of electronic nose technology.
There is a need for early detection of bacterial infection in order to
improve treatment. At present, the clinical methods used to detect and classify bacteria
types (usually using samples of infected matter taken from patients) can take up to
two or three days. Many experienced medical staff, who treat bacterial infections, are
able to recognise some types of bacteria from their odours. Identification of pathogens
(i.e. bacteria responsible for disease) from their odours using an electronic nose could
provide a rapid measurement and therefore early treatment. This research project used
existing sensor technology in the form of an electronic nose in conjunction with data
pre-processing and classification methods to classify up to four bacteria types from
their odours. Research was performed mostly in the area of signal conditioning, data
pre-processing and classification. A major area of interest was the use of artificial neural
network classifiers. There were three main objectives. First, to classify successfully
a small range of bacteria types. Second, to identify issues relating to bacteria odour
that affect the ability of an artificially intelligent system to classify bacteria from odour
alone. And third, to establish optimal signal conditioning, data pre-processing and
classification methods.
The Electronic Nose consisted of a gas sensor array with temperature and humidity
sensors, signal conditioning circuits, and gas flow apparatus. The bacteria odour was
analysed using an automated sampling system, which used computer software to direct
gas flow through one of several vessels (which were used to contain the odour samples) into the Electronic Nose. The electrical resistances of the odour sensors were monitored and output as electronic signals to a computer. The purpose of the automated sampling system was to improve repeatability and reduce human error. Further improvements to the Electronic Nose were implemented: a temperature control system which controlled the ambient gas temperature, and a new gas sensor chamber which incorporated improved gas flow.
The odour data were collected and stored as numerical values within data files in
the computer system. Once the data were stored in a non-volatile manner, various classification
experiments were performed. Comparisons were made and conclusions were
drawn from the performance of various data pre-processing and classification methods.
Classification methods employed included artificial neural networks, discriminant
function analysis and multi-variate linear regression. For classifying one from four
types, the best accuracy achieved was 92.78%. This was achieved using a growth phase
compensated multiple layer perceptron. For identifying a single bacteria type from a
mixture of two different types, the best accuracy was 96.30%. This was achieved using
a standard multiple layer perceptron.
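For readers unfamiliar with the classifiers compared above, the following is an illustrative sketch of a multilayer perceptron applied to per-sensor odour features. The feature encoding and network size are assumptions about typical electronic-nose pre-processing, not the thesis's exact configuration.

    # Illustrative only: an MLP on electronic-nose sensor features, similar in
    # spirit to the classifiers compared in the thesis.
    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # X: one row per odour sample, one column per gas sensor, e.g. the
    # fractional resistance change (R_gas - R_baseline) / R_baseline.
    # y: bacteria type labels, e.g. 0..3 for four types.
    X = np.random.rand(120, 16)            # placeholder data: 120 samples, 16 sensors
    y = np.random.randint(0, 4, size=120)  # placeholder labels

    clf = make_pipeline(
        StandardScaler(),                  # sensor responses vary widely in scale
        MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
    )
    clf.fit(X, y)
    print("training accuracy:", clf.score(X, y))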
Classification of bacteria odours is a typical 'real world' application of the kind that electronic noses will have to be applied to if this technology is to be successful. The methods and principles researched here are one step towards the goal of introducing artificially intelligent sensor systems into everyday use. The results are promising and show that it is feasible to use Electronic Nose technology in this application and that, with further development, useful products could be created. The conclusion from this thesis is that an electronic nose can detect and classify different types of bacteria.
Illuminating new and known relations between knot invariants
We automate the process of machine learning correlations between knot
invariants. For nearly 200,000 distinct sets of input knot invariants together
with an output invariant, we attempt to learn the output invariant by training
a neural network on the input invariants. Correlation between invariants is
measured by the accuracy of the neural network prediction, and bipartite or
tripartite correlations are sequentially filtered from the input invariant sets
so that experiments with larger input sets are checking for true multipartite
correlation. We rediscover several known relationships between polynomial,
homological, and hyperbolic knot invariants, while also finding novel
correlations which are not explained by known results in knot theory. These
unexplained correlations strengthen previous observations concerning links
between Khovanov and knot Floer homology. Our results also point to a new
connection between quantum algebraic and hyperbolic invariants, similar to the
generalized volume conjecture.

Comment: 30 pages, 9 figures
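A minimal sketch of the experimental loop this abstract describes, assuming the invariants have been tabulated numerically; the regressor, architecture, and R² scoring are stand-ins for the paper's setup, not a reproduction of it.

    # Hedged sketch: train a small network to predict one knot invariant from a
    # set of others, and read held-out accuracy as a measure of correlation.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    def correlation_score(invariant_table, input_cols, output_col):
        """Held-out R^2 of a network predicting `output_col` from `input_cols`;
        a high score suggests the output invariant is determined by the inputs."""
        X = invariant_table[:, input_cols]
        y = invariant_table[:, output_col]
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000,
                           random_state=0).fit(X_tr, y_tr)
        return net.score(X_te, y_te)

    # e.g. columns 0-2 = polynomial invariants, column 3 = hyperbolic volume
    table = np.random.rand(500, 4)          # placeholder numeric invariant table
    print(correlation_score(table, [0, 1, 2], 3))

Filtering out input sets whose smaller subsets already predict the output well is what lets experiments with larger sets probe genuinely multipartite correlation, as the abstract notes.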