1,335 research outputs found
Internal and collective interpretation for improving human interpretability of multi-layered neural networks
This paper proposes a new information-theoretic method for interpreting the inference mechanism of neural networks. We interpret the internal inference mechanism by itself, without external methods such as symbolic or fuzzy rules. In addition, we make the interpretation process as stable as possible: the inference mechanism is interpreted by considering all internal representations created under different conditions and patterns. To make internal interpretation possible, we compress multi-layered neural networks into the simplest ones, without hidden layers. The information naturally lost during compression is then complemented by a mutual information augmentation component. The method was applied to two data sets, the glass data set and the pregnancy data set. On both, information augmentation and compression improved generalization performance. In addition, the compressed, or collective, weights of the multi-layered networks tended, ironically, to resemble the linear correlation coefficients between inputs and targets, while conventional methods such as logistic regression analysis failed to do so.
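As a rough illustration of what compressing a multi-layered network into one without hidden layers can look like, the sketch below multiplies the layer weight matrices together to obtain one collective input-to-output weight per input. This is an assumption made for illustration: the abstract does not give the exact compression rule, the nonlinearities and the mutual information augmentation component are ignored, and the names `matmul` and `collapse_layers` and the toy weights are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def collapse_layers(weights):
    """Collapse successive layer weight matrices into a single matrix.

    Assumption: activations are ignored, so the result is only a linear
    summary of how each input connects to each output.
    """
    collective = weights[0]
    for w in weights[1:]:
        collective = matmul(collective, w)
    return collective


# Hypothetical toy network: 3 inputs -> 2 hidden units -> 1 output.
w1 = [[1.0, 0.0], [0.5, -1.0], [0.0, 2.0]]
w2 = [[2.0], [1.0]]

collective = collapse_layers([w1, w2])
print(collective)  # one collective weight per input
```

The collective weights obtained this way can then be set beside the input-target correlation coefficients, which is the comparison the abstract describes.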
Discovering gated recurrent neural network architectures
Reinforcement learning agent networks with memory are a key component in solving POMDP tasks. Gated recurrent networks such as those composed of Long Short-Term Memory (LSTM) nodes have recently been used to improve the state of the art in many supervised sequential processing tasks such as speech recognition and machine translation. However, scaling them to deep memory tasks in the reinforcement learning domain is challenging because of sparse and deceptive reward functions. To address this challenge, a new secondary optimization objective is first introduced that maximizes the information (Info-max) stored in the LSTM network. Results indicate that when combined with neuroevolution, Info-max can discover powerful LSTM-based memory solutions that outperform traditional RNNs. Next, for the supervised learning tasks, neuroevolution techniques are employed to design new LSTM architectures. Such architectural variations include discovering new pathways between the recurrent layers as well as designing new gated recurrent nodes. This dissertation proposes evolution of a tree-based encoding of the gated memory nodes, and shows that it makes it possible to explore new variations more effectively than other methods. The method discovers nodes with multiple recurrent paths and multiple memory cells, which lead to significant improvement in the standard language modeling benchmark task. The dissertation also shows how the search process can be sped up by training an LSTM network to estimate performance of candidate structures, and by encouraging exploration of novel solutions. Thus, evolutionary design of complex neural network structures promises to improve performance of deep learning architectures beyond human ability to do so.
Computer Science
Methodological challenges and analytic opportunities for modeling and interpreting Big Healthcare Data
Abstract
Managing, processing and understanding big healthcare data is challenging, costly and demanding. Without a robust fundamental theory for representation, analysis and inference, a roadmap for uniform handling and analyzing of such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic and healthcare data we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, new innovative technologies need to be developed that enhance, scale and optimize the management and processing of large, complex and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure and education will be critical to realize the huge potential of big data, to reap the expected information benefits and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable and efficient data-driven discovery and analytics. Big data will affect every sector of the economy and their hallmark will be ‘team science’.
http://deepblue.lib.umich.edu/bitstream/2027.42/134522/1/13742_2016_Article_117.pd
Systematic Review on Missing Data Imputation Techniques with Machine Learning Algorithms for Healthcare
Missing data is one of the most common issues encountered in the data cleaning process, especially when dealing with medical datasets. A real collected dataset is prone to be incomplete, inconsistent, noisy and redundant due to potential reasons such as human errors, instrumental failures, and adverse death. Therefore, to deal accurately with incomplete data, a sophisticated algorithm is proposed to impute the missing values. Many machine learning algorithms have been applied to impute missing data with plausible values. However, among all machine learning imputation algorithms, the KNN algorithm has been widely adopted for imputing missing data due to its robustness and simplicity, and it is also a promising method to outperform other machine learning methods. This paper provides a comprehensive review of different imputation techniques used to replace missing data. The goal of the review is to bring specific attention to potential improvements to existing methods and to provide readers with a better grasp of imputation technique trends.
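To make the KNN imputation idea concrete, here is a minimal sketch in plain Python. It is not taken from the review: the function name `knn_impute`, the use of `None` for missing values, the Euclidean distance over observed features, and the unweighted neighbour mean are all assumptions chosen for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest complete rows.

    Assumptions: numeric features, at least one complete row, and distance
    measured only over the features observed in the incomplete row.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row to impute from")

    imputed = []
    for row in rows:
        if None not in row:
            imputed.append(list(row))
            continue
        observed = [i for i, v in enumerate(row) if v is not None]

        def dist(other):
            # Euclidean distance restricted to the observed features.
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in observed))

        neighbours = sorted(complete, key=dist)[:k]
        imputed.append([
            v if v is not None else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ])
    return imputed


# Hypothetical toy data: the last row is missing its second feature.
data = [[1.0, 2.0], [2.0, 4.0], [10.0, 20.0], [1.5, None]]
result = knn_impute(data, k=2)
print(result[3])
```

In practice the choice of k, the distance metric, and feature scaling all affect the imputed values, so the settings here should be read as placeholders.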
Learning Behavior Models for Interpreting and Predicting Traffic Situations
In this thesis, we present Bayesian state estimation and machine learning methods for predicting traffic situations. The cognitive ability to assess situations and behaviors of traffic participants, and to anticipate possible developments is an essential requirement for several applications in the traffic domain, especially for self-driving cars. We present a method for learning behavior models from unlabeled traffic observations and develop improved learning methods for decision trees.
Application of Deep Learning to Brain Connectivity Classification in Large MRI Datasets
The use of machine learning for whole-brain classification of magnetic resonance imaging (MRI) data is of clear interest, both for understanding phenotypic differences in brain structure and function and for diagnostic applications. Developments of deep learning models in the past decade have revolutionized photographic image and speech recognition, bringing promise to do the same to other fields of science. However, there are many practical and theoretical challenges in the translation of such methods to the unique context of MRIs of the brain. This thesis presents a theoretical underpinning for whole-brain classification of extremely large datasets of multi-site MRIs, including machine learning model architecture, dataset curation methods, machine learning visualization methods, encoding of MRI data, and feature extraction. To replicate large sample sizes typically applied to deep learning models, a dataset of over 50,000 functional and structural MRIs was amassed from nine different databases, and the undertaken analyses were conducted on three covariates commonly found across these collections: sex, resting state/task, and autism spectrum disorder. I find that deep learning is not only a method that has promise for clinical application in the future, but also a powerful statistical tool for analyzing complex, nonlinear relationships in brain data where conventional statistics may fail. However, results are also dependent on factors such as dataset imbalances, confounding factors such as motion and head size, selected methods of encoding MRI data, variability of machine learning models and selected methods of visualizing the machine learning results. 
In this thesis, I present the following methodological innovations: (1) a method of balancing datasets as a means of regressing out measurable confounding factors; (2) a means of removing spatial biases from deep learning visualization methods; (3) methods of encoding functional and structural datasets as connectivity matrices; (4) the use of ensemble models and convolutional neural network architectures to improve classification accuracy and consistency; (5) adaptation of deep learning visualization methods to study brain connections utilized in the classification process. Additionally, I discuss interpretations, limitations, and future directions of this research.
Gates Cambridge Scholarshi
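For item (3), one common way to encode functional data as a connectivity matrix is to take the pairwise Pearson correlation between regional time series. The sketch below assumes that convention; the abstract does not say which encoding the thesis uses, and the names `pearson` and `connectivity_matrix` and the toy signals are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def connectivity_matrix(series):
    """Build a symmetric region-by-region correlation matrix."""
    return [[pearson(a, b) for b in series] for a in series]


# Hypothetical time series for three brain regions.
signals = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
conn = connectivity_matrix(signals)
for row in conn:
    print([round(v, 3) for v in row])
```

A matrix of this kind has a fixed size regardless of scan length, which is one reason such encodings are convenient inputs for classifiers.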