459 research outputs found

    Understanding Learned Models by Identifying Important Features at the Right Resolution

    In many application domains, it is important to characterize how complex learned models make their decisions across the distribution of instances. One way to do this is to identify the features and interactions among them that contribute to a model's predictive accuracy. We present a model-agnostic approach to this task that makes the following specific contributions. Our approach (i) tests feature groups, in addition to base features, and tries to determine the level of resolution at which important features can be determined, (ii) uses hypothesis testing to rigorously assess the effect of each feature on the model's loss, (iii) employs a hierarchical approach to control the false discovery rate when testing feature groups and individual base features for importance, and (iv) uses hypothesis testing to identify important interactions among features and feature groups. We evaluate our approach by analyzing random forest and LSTM neural network models learned in two challenging biomedical applications. Comment: First two authors contributed equally to this work. Accepted for presentation at the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).
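The idea of testing a feature group's effect on a model's loss can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `group_importance`, the squared-error loss, the joint shuffling of a group's columns, and the sign-flip randomization test are all assumptions chosen for brevity, and the hierarchical false discovery rate control described in the abstract is not implemented here.

```python
import numpy as np

def group_importance(predict, X, y, group, n_perm=50, n_flip=2000, seed=0):
    """Return (mean loss increase, one-sided p-value) for one feature group.

    Hypothetical helper: `predict` maps an (n, d) array to n predictions,
    and `group` is a list of column indices treated as a single unit.
    """
    rng = np.random.default_rng(seed)
    base_loss = (predict(X) - y) ** 2            # per-instance squared error
    diffs = np.zeros(len(y))
    for _ in range(n_perm):
        Xp = X.copy()
        idx = rng.permutation(len(y))
        Xp[:, group] = X[idx][:, group]          # shuffle the group's columns jointly
        diffs += (predict(Xp) - y) ** 2 - base_loss
    diffs /= n_perm                              # average paired loss increase
    observed = diffs.mean()
    # Sign-flip test: under the null hypothesis of no effect, the paired
    # differences are assumed symmetric about zero.
    signs = rng.choice([-1.0, 1.0], size=(n_flip, len(y)))
    null = (signs * diffs).mean(axis=1)
    p_value = (1 + np.sum(null >= observed)) / (1 + n_flip)
    return observed, p_value

# Toy model (an assumption for illustration) that uses only columns 0 and 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
predict = lambda A: 2.0 * A[:, 0] + A[:, 1]
y = predict(X) + 0.1 * rng.normal(size=200)
print(group_importance(predict, X, y, [0, 1]))   # group the model relies on
print(group_importance(predict, X, y, [2, 3]))   # group the model ignores
```

Shuffling a group's columns together keeps the within-group dependence intact while breaking its link to the target, which is one simple way to test a group as a unit before testing its individual features.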

    Biomedical event extraction from abstracts and full papers using search-based structured prediction.

    BACKGROUND: Biomedical event extraction has attracted substantial attention as it can assist researchers in understanding the plethora of interactions among genes that are described in publications in molecular biology. While most recent work has focused on abstracts, the BioNLP 2011 shared task evaluated the submitted systems on both abstracts and full papers. In this article, we describe our submission to the shared task which decomposes event extraction into a set of classification tasks that can be learned either independently or jointly using the search-based structured prediction framework. Our intention is to explore how these two learning paradigms compare in the context of the shared task. RESULTS: We report that models learned using search-based structured prediction exceed the accuracy of independently learned classifiers by 8.3 points in F-score, with the gains being more pronounced on the more complex Regulation events (13.23 points). Furthermore, we show how the trade-off between recall and precision can be adjusted in both learning paradigms and that search-based structured prediction achieves better recall at all precision points. Finally, we report on experiments with a simple domain-adaptation method, resulting in the second-best performance achieved by a single system. CONCLUSIONS: We demonstrate that joint inference using the search-based structured prediction framework can achieve better performance than independently learned classifiers, thus demonstrating the potential of this learning paradigm for event extraction and other similarly complex information-extraction tasks. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

    Learning Statistical Models for Annotating Proteins with Function Information using Biomedical Text

    BACKGROUND: The BioCreative text mining evaluation investigated the application of text mining methods to the task of automatically extracting information from text in biomedical research articles. We participated in Task 2 of the evaluation. For this task, we built a system to automatically annotate a given protein with codes from the Gene Ontology (GO) using the text of an article from the biomedical literature as evidence. METHODS: Our system relies on simple statistical analyses of the full text article provided. We learn n-gram models for each GO code using statistical methods and use these models to hypothesize annotations. We also learn a set of Naïve Bayes models that identify textual clues of possible connections between the given protein and a hypothesized annotation. These models are used to filter and rank the predictions of the n-gram models. RESULTS: We report experiments evaluating the utility of various components of our system on a set of data held out during development, and experiments evaluating the utility of external data sources that we used to learn our models. Finally, we report our evaluation results from the BioCreative organizers. CONCLUSION: We observe that, on the test data, our system performs quite well relative to the other systems submitted to the evaluation. From other experiments on the held-out data, we observe that (i) the Naïve Bayes models were effective in filtering and ranking the initially hypothesized annotations, and (ii) our learned models were significantly more accurate when external data sources were used during learning.

    Feature Importance Explanations for Temporal Black-Box Models

    Models in the supervised learning framework may capture rich and complex representations over the features that are hard for humans to interpret. Existing methods to explain such models are often specific to architectures and data where the features do not have a time-varying component. In this work, we propose TIME, a method to explain models that are inherently temporal in nature. Our approach (i) uses a model-agnostic permutation-based approach to analyze global feature importance, (ii) identifies the importance of salient features with respect to their temporal ordering as well as localized windows of influence, and (iii) uses hypothesis testing to provide statistical rigor.
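The notion of a localized window of influence can be sketched by permuting one feature only within a range of timesteps and measuring the change in loss. This is an illustrative sketch rather than the TIME method itself: the function name `window_importance`, the (instances, timesteps, features) array layout, and the squared-error loss are assumptions, and no hypothesis test is included.

```python
import numpy as np

def window_importance(predict, X, y, feature, start, stop, n_perm=20, seed=0):
    """Mean increase in squared-error loss when `feature` is shuffled across
    instances within timesteps [start, stop).

    Hypothetical helper: X has shape (instances, timesteps, features) and
    `predict` maps such an array to one prediction per instance.
    """
    rng = np.random.default_rng(seed)
    base = np.mean((predict(X) - y) ** 2)
    increases = []
    for _ in range(n_perm):
        Xp = X.copy()
        idx = rng.permutation(X.shape[0])
        # Replace the window with the same window taken from other instances.
        Xp[:, start:stop, feature] = X[idx][:, start:stop, feature]
        increases.append(np.mean((predict(Xp) - y) ** 2) - base)
    return float(np.mean(increases))

# Toy temporal model (an assumption) that reads feature 0 at timesteps 5..9 only.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 15, 3))
predict = lambda A: A[:, 5:10, 0].sum(axis=1)
y = predict(X)
for start in (0, 5, 10):
    print(start, window_importance(predict, X, y, feature=0, start=start, stop=start + 5))
```

Scanning the window across the sequence shows where in time the model depends on a feature; only the window that overlaps the timesteps the toy model reads produces a loss increase.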

    Bacteria classification with an electronic nose employing artificial neural networks

    This PhD thesis describes research for a medical application of electronic nose technology. There is a need at present for early detection of bacterial infection in order to improve treatment. At present, the clinical methods used to detect and classify bacteria types (usually using samples of infected matter taken from patients) can take up to two or three days. Many experienced medical staff who treat bacterial infections are able to recognise some types of bacteria from their odours. Identification of pathogens (i.e. bacteria responsible for disease) from their odours using an electronic nose could provide a rapid measurement and therefore early treatment. This research project used existing sensor technology in the form of an electronic nose in conjunction with data pre-processing and classification methods to classify up to four bacteria types from their odours. Research was performed mostly in the area of signal conditioning, data pre-processing and classification. A major area of interest was the use of artificial neural network classifiers. There were three main objectives. First, to classify successfully a small range of bacteria types. Second, to identify issues relating to bacteria odour that affect the ability of an artificially intelligent system to classify bacteria from odour alone. And third, to establish optimal signal conditioning, data pre-processing and classification methods. The Electronic Nose consisted of a gas sensor array with temperature and humidity sensors, signal conditioning circuits, and gas flow apparatus. The bacteria odour was analysed using an automated sampling system, which used computer software to direct gas flow through one of several vessels (which were used to contain the odour samples) into the Electronic Nose. The electrical resistances of the odour sensors were monitored and output as electronic signals to a computer. The purpose of the automated sampling system was to improve repeatability and reduce human error. Further improvements to the Electronic Nose were implemented: a temperature control system, which controlled the ambient gas temperature, and a new gas sensor chamber, which incorporated improved gas flow. The odour data were collected and stored as numerical values within data files in the computer system. Once the data were stored in a non-volatile manner, various classification experiments were performed. Comparisons were made and conclusions were drawn from the performance of various data pre-processing and classification methods. Classification methods employed included artificial neural networks, discriminant function analysis and multi-variate linear regression. For classifying one from four types, the best accuracy achieved was 92.78%. This was achieved using a growth phase compensated multiple layer perceptron. For identifying a single bacteria type from a mixture of two different types, the best accuracy was 96.30%. This was achieved using a standard multiple layer perceptron. Classification of bacteria odours is a typical 'real world' application of the kind that electronic noses will have to be applied to if this technology is to be successful. The methods and principles researched here are one step towards the goal of introducing artificially intelligent sensor systems into everyday use. The results are promising and showed that it is feasible to use Electronic Nose technology in this application and that, with further development, useful products could be developed. The conclusion from this thesis is that an electronic nose can detect and classify different types of bacteria.

    Illuminating new and known relations between knot invariants

    We automate the process of machine learning correlations between knot invariants. For nearly 200,000 distinct sets of input knot invariants together with an output invariant, we attempt to learn the output invariant by training a neural network on the input invariants. Correlation between invariants is measured by the accuracy of the neural network prediction, and bipartite or tripartite correlations are sequentially filtered from the input invariant sets so that experiments with larger input sets are checking for true multipartite correlation. We rediscover several known relationships between polynomial, homological, and hyperbolic knot invariants, while also finding novel correlations which are not explained by known results in knot theory. These unexplained correlations strengthen previous observations concerning links between Khovanov and knot Floer homology. Our results also point to a new connection between quantum algebraic and hyperbolic invariants, similar to the generalized volume conjecture. Comment: 30 pages, 9 figures