261,681 research outputs found
Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes
PURPOSE: The medical literature relevant to germline genetics is growing
exponentially. Clinicians need tools for monitoring and prioritizing the literature
to understand the clinical implications of the pathogenic genetic variants. We
developed and evaluated two machine learning models to classify abstracts as
relevant to the penetrance (risk of cancer for germline mutation carriers) or
prevalence of germline genetic mutations. METHODS: We conducted literature
searches in PubMed and retrieved paper titles and abstracts to create an
annotated dataset for training and evaluating the two machine learning
classification models. Our first model is a support vector machine (SVM) which
learns a linear decision rule based on the bag-of-ngrams representation of each
title and abstract. Our second model is a convolutional neural network (CNN)
which learns a complex nonlinear decision rule based on the raw title and
abstract. We evaluated the performance of the two models on the classification
of papers as relevant to penetrance or prevalence. RESULTS: For penetrance
classification, we annotated 3740 paper titles and abstracts and used 60% for
training the model, 20% for tuning the model, and 20% for evaluating the model.
The SVM model achieves 89.53% accuracy (percentage of papers that were
correctly classified) while the CNN model achieves 88.95% accuracy. For
prevalence classification, we annotated 3753 paper titles and abstracts. The
SVM model achieves 89.14% accuracy while the CNN model achieves 89.13%
accuracy. CONCLUSION: Our models achieve high accuracy in classifying abstracts
as relevant to penetrance or prevalence. By facilitating literature review,
this tool could help clinicians and researchers keep abreast of the burgeoning
knowledge of gene-cancer associations and keep the knowledge bases for clinical
decision support tools up to date.
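The bag-of-ngrams representation, the 60%/20%/20% split, and the accuracy measure described in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the whitespace tokenization, and the choice of unigrams plus bigrams are assumptions made for illustration, and the SVM training step itself is not shown.

```python
from collections import Counter
import random


def bag_of_ngrams(text, max_n=2):
    """Count word n-grams (n = 1..max_n) in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def split_60_20_20(items, seed=0):
    """Shuffle, then split into train/tune/test portions of 60%/20%/20%."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10  # integer arithmetic avoids float rounding
    n_tune = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    tune = shuffled[n_train:n_train + n_tune]
    test = shuffled[n_train + n_tune:]
    return train, tune, test


def accuracy(predicted, gold):
    """Percentage of items whose predicted label matches the gold label."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)
```

Applied to 3740 annotated items, this split would yield 2244 training, 748 tuning, and 748 evaluation items; the counts in the original study may differ depending on how the authors rounded.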
Abstract Rule Based Pattern Learning with Neural Networks
The ability to learn abstractions and generalise is seen as the essence of human intelligence [7]. Since the 1950s, there have been efforts to build systems that learn and think like humans [16]. Humans, including infants, tend to generalise well compared with machine learning models, in which the hypothesis is usually approximated and may be prone to errors. The examples proposed by Marcus [17, 18, 19], such as the failure to generalise equality, to distinguish between even and odd numbers, or to recognise ABA or ABB patterns of syllables, have attracted a significant amount of attention in psychology, particularly in the study of human language learning, but they have not been addressed systematically as problems of machine learning and neural networks.
In this article, the problem of learning abstract rules using neural networks is explained and a solution called "Relation Based Patterns" (RBP), which models abstract relationships based on equality, is proposed. RBP creates an inductive bias in the neural networks that leads to the learning of generalisable solutions. It is observed that integration of RBP leads to almost perfect generalisation in abstract rule learning tasks with synthetic data and to improvements in neural language modelling on real-world data.
The outline of the article is as follows: the problem is briefly introduced, followed by a section on what abstract pattern (rule) learning is, the need for inductive bias, and various ways of adding inductive bias to neural networks. The RBP method and its integration, along with the experiments on the tasks of abstract rule learning, character prediction and melody prediction, are summarized, followed by conclusions and future work.
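To make the idea of equality-based relations concrete, the sketch below computes pairwise equality features for a three-syllable sequence, so that ABA and ABB patterns become distinguishable regardless of which syllables are used. It is only an illustration under stated assumptions: the abstract does not specify how RBP is implemented, and the one-hot encoding, the absolute-difference comparison, the function names, and the syllable vocabulary are all hypothetical.

```python
from itertools import combinations


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


def equality_features(vectors):
    """For each pair of positions, return 1 if the two vectors are equal, else 0.

    Equality is derived from the element-wise absolute difference, which sums
    to zero only when the two vectors are identical.
    """
    features = []
    for a, b in combinations(vectors, 2):
        difference = sum(abs(x - y) for x, y in zip(a, b))
        features.append(1 if difference == 0 else 0)
    return features


# Hypothetical syllable vocabulary, used only for this illustration.
vocab = {"ga": 0, "ti": 1, "la": 2, "wo": 3}


def encode(syllables):
    """Map a list of syllables to a list of one-hot vectors."""
    return [one_hot(vocab[s], len(vocab)) for s in syllables]


# Feature order follows the position pairs (1,2), (1,3), (2,3).
aba = equality_features(encode(["ga", "ti", "ga"]))  # [0, 1, 0]
abb = equality_features(encode(["wo", "la", "la"]))  # [0, 0, 1]
```

Because the features depend only on which positions match, an ABA sequence built from different syllables produces the same feature vector, which is the kind of generalisation the abstract refers to.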
Machine learning algorithms for monitoring pavement performance
ABSTRACT: This work introduces the need to develop competitive, low-cost technologies applicable to real roads for detecting the asphalt condition by means of Machine Learning (ML) algorithms. Specifically, the most recent studies are described according to the data collection methods: images, ground penetrating radar (GPR), laser and optic fiber. The main models presented in these state-of-the-art studies are Support Vector Machine, Random Forest, Naïve Bayes, Artificial Neural Networks and Convolutional Neural Networks. For these analyses, the methodology, type of problem, data source, computational resources, discussion and future research are highlighted. Open data sources, programming frameworks, model comparisons and data collection technologies are illustrated to allow the research community to initiate future investigation. There is indeed research on ML-based pavement evaluation, but it is not yet widely applied by pavement management entities, so it is mandatory to work on the refinement of models and data collection methods.
Algorithms for Neural Prosthetic Applications
abstract: In the last 15 years, there has been a significant increase in the number of motor neural prostheses used for restoring limb function lost due to neurological disorders or accidents. The aim of this technology is to enable patients to control a motor prosthesis using their residual neural pathways (central or peripheral). Recent studies in non-human primates and humans have shown the possibility of controlling a prosthesis for accomplishing varied tasks such as self-feeding, typing, reaching, grasping, and performing fine dexterous movements. A neural decoding system consists mainly of three components: (i) sensors to record neural signals, (ii) an algorithm to map neural recordings to upper limb kinematics and (iii) a prosthetic arm actuated by control signals generated by the algorithm. Machine learning algorithms that map input neural activity to the output kinematics (like finger trajectory) form the core of the neural decoding system. The choice of the algorithm is thus mainly imposed by the neural signal of interest and the output parameter being decoded. The various parts of a neural decoding system are neural data, feature extraction, feature selection, and machine learning algorithm. There have been significant advances in the field of neural prosthetic applications, but there are challenges in translating a neural prosthesis from a laboratory setting to a clinical environment. To achieve a fully functional prosthetic device with maximum user compliance and acceptance, these factors need to be addressed and taken into consideration. Three challenges in developing robust neural decoding systems were addressed by exploring neural variability in the peripheral nervous system for dexterous finger movements, feature selection methods based on clinically relevant metrics and a novel method for decoding dexterous finger movements based on ensemble methods. Dissertation/Thesis, Doctoral Dissertation, Bioengineering 201
A Classification Algorithm for High-dimensional Data
abstract: With the advent of high-dimensional stored big data and streaming data, suddenly machine learning on a very large scale has become a critical need. Such machine learning should be extremely fast, should scale up easily with volume and dimension, should be able to learn from streaming data, should automatically perform dimension reduction for high-dimensional data, and should be deployable on hardware. Neural networks are well positioned to address these challenges of large scale machine learning. In this paper, we present a method that can effectively handle large scale, high-dimensional data. It is an online method that can be used for both streaming and large volumes of stored big data. It primarily uses Kohonen nets, although only a few selected neurons (nodes) from multiple Kohonen nets are actually retained in the end; we discard all Kohonen nets after training. We use Kohonen nets both for dimensionality reduction through feature selection and for building an ensemble of classifiers using single Kohonen neurons. The method is meant to exploit massive parallelism and should be easily deployable on hardware that implements Kohonen nets. Some initial computational results are presented
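For readers unfamiliar with Kohonen nets, the following is a minimal sketch of one online update step on a one-dimensional map: the neuron closest to the input is found and it, together with its neighbours, is moved toward the input. This is the generic self-organizing map rule, not the paper's method; the neuron selection, feature selection, and ensemble construction described in the abstract are not shown, and the function names and default parameters are assumptions.

```python
def nearest_neuron(weights, x):
    """Index of the neuron whose weight vector is closest to x (squared Euclidean)."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda j: squared_distance(weights[j]))


def online_update(weights, x, learning_rate=0.5, radius=1):
    """One online Kohonen step on a 1-D map.

    Moves the winning neuron and every neuron within `radius` map positions
    of it toward the input x, modifying `weights` in place.
    """
    winner = nearest_neuron(weights, x)
    for j in range(len(weights)):
        if abs(j - winner) <= radius:
            weights[j] = [wi + learning_rate * (xi - wi)
                          for wi, xi in zip(weights[j], x)]
    return winner
```

Because each step touches one input at a time, the same rule applies to streaming data and to stored data read sequentially, which matches the online setting the abstract describes.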
Text Classification
There is an abundance of text data in this world, but most of it is raw. We need to extract information from this data to make use of it. One way to extract this information from raw text is to apply informative labels drawn from a pre-defined fixed set, i.e. Text Classification. In this thesis, we focus on the general problem of text classification, and work towards solving challenges associated with binary/multi-class/multi-label classification. More specifically, we deal with the problem of (i) Zero-shot labels during testing; (ii) Active learning for text screening; (iii) Multi-label classification under low supervision; (iv) Structured label space; (v) Classifying pairs of words in raw text, i.e. Relation Extraction. For (i), we use a zero-shot classification model that utilizes independently learned semantic embeddings. Regarding (ii), we propose a novel active learning algorithm that reduces the problem of bias in naive active learning algorithms. For (iii), we propose a neural candidate-selector architecture that starts from a set of high-recall candidate labels to obtain high-precision predictions. In the case of (iv), we propose an attention-based neural tree decoder that recursively decodes an abstract into the ontology tree. For (v), we propose using second-order relations that are derived by explicitly connecting pairs of words via context token(s) for improved relation extraction. We use a wide variety of both traditional and deep machine learning tools. More specifically, we use traditional machine learning models like multi-valued linear regression and logistic regression for (i, ii), deep convolutional neural networks for (iii), recurrent neural networks for (iv) and transformer networks for (v).
An AI-based solution for wireless channel interference prediction and wireless remote control
Abstract. Most control systems rely on wired connectivity between controllers and plants due to their need for fast and reliable real-time control. Yet the demand for mobility, scalability, and low operational and maintenance costs calls for wireless networked control system designs. Naturally, over-the-air communication is susceptible to interference and fading, and therefore enabling low latency and high reliability is crucial for wireless control scenarios. In this view, the work of this thesis aims to enhance the reliability of the wireless communication and to optimize the energy consumption while maintaining low latency and the stability of the controller-plant system. To achieve this goal, two core abstractions have been used: a neural wireless channel interference predictor and a neural predictive controller. This neural predictor design is motivated by the capability of machine learning to assimilate underlying patterns and dynamics of systems using the observed data. The system model is composed of a controller-plant scheme in which the controller transmits control signals wirelessly. The neural wireless predictor and the neural controller predict wireless channel interference and plant states, respectively. This information is used to optimize energy consumption and prevent communication outages while controlling the plant. This thesis presents the development of the neural wireless predictor, the neural controller and a neural plant. Interaction and functionality of these elements are demonstrated using a Simulink simulation. Results of the simulation illustrate the effectiveness of neural networks in both the control and wireless domains. The proposed solution yields about a 17% reduction in energy consumption compared to state-of-the-art designs by minimizing the impact of interference in the control links while ensuring plant stability.
On a Neural Network to Extract Implied Information from American Options
[Abstract] Extracting implied information, like volatility and dividend, from observed option prices is a challenging task when dealing with American options, because of the complex-shaped early-exercise regions and the computational costs to solve the corresponding mathematical problem repeatedly. We will employ a data-driven machine learning approach to estimate the Black-Scholes implied volatility and the dividend yield for American options in a fast and robust way. To determine the implied volatility, the inverse function is approximated by an artificial neural network on the effective computational domain of interest, which decouples the offline (training) and online (prediction) stages and thus eliminates the need for an iterative process. In the case of an unknown dividend yield, we formulate the inverse problem as a calibration problem and determine simultaneously the implied volatility and dividend yield. For this, a generic and robust calibration framework, the Calibration Neural Network (CaNN), is introduced to estimate multiple parameters. It is shown that machine learning can be used as an efficient numerical technique to extract implied information from American options, particularly when considering multiple early-exercise regions due to negative interest rates. We would also like to thank Dr.ir Lech Grzelak for valuable suggestions, as well as Dr. Damien Ackerer for fruitful discussions. The author S. Liu would like to thank the China Scholarship Council (CSC) for the financial support.
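The "iterative process" that the neural network approach avoids can be illustrated with a classical root search. The sketch below deliberately uses the simpler European call under Black-Scholes with a continuous dividend yield, not the American options treated in the paper, because the European price has a closed form; bisection is one common choice of iteration and is an assumption here, as are the function names and default bounds.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot, strike, rate, dividend, vol, maturity):
    """Black-Scholes price of a European call with continuous dividend yield."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate - dividend + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return (spot * math.exp(-dividend * maturity) * norm_cdf(d1)
            - strike * math.exp(-rate * maturity) * norm_cdf(d2))


def implied_vol_bisection(price, spot, strike, rate, dividend, maturity,
                          low=1e-6, high=5.0, tol=1e-10):
    """Invert the pricing formula by bisection.

    Relies on the call price being increasing in volatility, so the
    bracket [low, high] can be halved on every iteration.
    """
    for _ in range(200):
        mid = 0.5 * (low + high)
        if bs_call_price(spot, strike, rate, dividend, mid, maturity) < price:
            low = mid
        else:
            high = mid
        if high - low < tol:
            break
    return 0.5 * (low + high)
```

Each quote requires dozens of pricing calls in this scheme; for American options every such call is itself a numerical solve, which is the cost that motivates replacing the search with a trained approximation of the inverse function.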
Neural Methods for Answer Passage Retrieval over Sparse Collections
Recent advances in machine learning have allowed information retrieval (IR) techniques to advance beyond the stage of handcrafting domain-specific features. Specifically, deep neural models incorporate varying levels of features to learn whether a document answers the information need of a query. However, these neural models rely on a large number of parameters to successfully learn a relation between a query and a relevant document. This reliance on a large number of parameters, combined with the current methods of optimization relying on small updates, necessitates numerous samples to allow the neural model to converge on an effective relevance function. This presents a significant obstacle in the realm of IR, as relevance judgements are often sparse or noisy and combined with a large class imbalance. This is especially true for short text retrieval, where there is often only one relevant passage. This problem is exacerbated when training these artificial neural networks, as excessive negative sampling can result in poor performance. Thus, we propose approaching this task through multiple avenues and examining their effectiveness on a non-factoid question answering (QA) task. We first propose learning local embeddings specific to the relevance information of the collection to improve performance of an upstream neural model. In doing so, we find significantly improved results over standard pre-trained embeddings, despite only developing the embeddings on a small collection which would not be sufficient for a full language model. Leveraging this local representation, and inspired by recent work in machine translation, we introduce a hybrid embedding-based model that incorporates pre-trained embeddings while dynamically constructing local representations from character embeddings.
The hybrid approach relies on pre-trained embeddings to achieve an effective retrieval model, and continually adjusts its character-level abstraction to fit a local representation. We next approach methods to adapt neural models to multiple IR collections, therefore reducing the collection-specific training required and alleviating the need to retrain a neural model's parameters for a new subdomain of a collection. First, we propose an adversarial retrieval model which achieves state-of-the-art performance on out-of-subdomain queries while maintaining in-domain performance. Second, we establish an informed negative sampling approach using a reinforcement learning agent. The agent is trained to directly maximize the performance of a neural IR model using a predefined IR metric by choosing the ranking function from which to sample negative documents. This policy-based sampling allows the neural model to be exposed to more of a collection and results in a more consistent neural retrieval model over multiple training instances. Lastly, we move towards a universal retrieval function. We initially introduce a probe-based inspection of neural relevance models through the lens of standard natural language processing tasks and establish that while seemingly similar QA collections require the same basic abstract information, the final layers that determine relevance differ significantly. We then introduce Universal Retrieval Functions, a method to incorporate new collections using a library of previously trained linear relevance models and a common neural representation.
Towards understanding the challenges faced by machine learning software developers and enabling automated solutions
Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To fill that gap, this thesis reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. Our findings reveal the urgent need for software engineering (SE) research in this area. The second part of the thesis focuses on understanding Deep Neural Network (DNN) bug characteristics. We study 2,716 high-quality posts from Stack Overflow and 500 bug fix commits from Github about five popular deep learning libraries, Caffe, Keras, Tensorflow, Theano, and Torch, to understand the types of bugs, their root causes and impacts, the bug-prone stages of the deep learning pipeline, and whether there are common antipatterns in this buggy software. While exploring the bug characteristics, our findings imply that repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand the challenges of repairing DNNs or the patterns that are used when repairing them manually. So, the third part of this thesis presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from Github for five popular deep learning libraries, Caffe, Keras, Tensorflow, Theano, and Torch, to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns, and that the most common bug fix patterns are fixing data dimension and neural network connectivity.
Finally, we propose an automatic technique to detect ML Application Programming Interface (API) misuses. We started with an empirical study to understand ML API misuses. Our study shows that ML API misuse is prevalent and distinct compared to non-ML API misuses. Inspired by these findings, we contributed Amimla (Api Misuse In Machine Learning Apis), an approach and a tool for ML API misuse detection. Amimla relies on several technical innovations. First, we proposed an abstract representation of ML pipelines to use in misuse detection. Second, we proposed an abstract representation of neural networks for deep learning related APIs. Third, we have developed a representation strategy for constraints on ML APIs. Finally, we have developed a misuse detection strategy for both single and multi-APIs. Our experimental evaluation shows that Amimla achieves a high average accuracy of approximately 80% on two benchmarks of misuses from Stack Overflow and Github.