An Interpretable Deep Architecture for Similarity Learning Built Upon Hierarchical Concepts
In general, development of adequately complex mathematical models, such as deep neural networks, can be an effective way to improve the accuracy of learning models. However, this is achieved at the cost of reduced post-hoc model interpretability, because what is learned by the model can become less intelligible and tractable to humans as the model complexity increases. In this paper, we target a similarity learning task in the context of image retrieval, with a focus on the model interpretability issue. An effective similarity neural network (SNN) is proposed not only to seek robust retrieval performance but also to achieve satisfactory post-hoc interpretability. The network is designed by linking the neuron architecture with the organization of a concept tree and by formulating neuron operations to pass similarity information between concepts. Various ways of understanding and visualizing what is learned by the SNN neurons are proposed. We also exhaustively evaluate the proposed approach using a number of relevant datasets against a number of state-of-the-art approaches to demonstrate the effectiveness of the proposed network. Our results show that the proposed approach can offer superior performance when compared against state-of-the-art approaches. Neuron visualization results are demonstrated to support the understanding of the trained neurons.
Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis
The past decade has seen an explosion in the amount of digital information
stored in electronic health records (EHR). While primarily designed for
archiving patient clinical information and administrative healthcare tasks,
many researchers have found secondary use of these records for various clinical
informatics tasks. Over the same period, the machine learning community has
seen widespread advances in deep learning techniques, which also have been
successfully applied to the vast amount of EHR data. In this paper, we review
these deep EHR systems, examining architectures, technical aspects, and
clinical applications. We also identify shortcomings of current techniques and
discuss avenues of future research for EHR-based deep learning.
Comment: Accepted for publication with Journal of Biomedical and Health
Informatics: http://ieeexplore.ieee.org/abstract/document/8086133
Construction and Quality Evaluation of Heterogeneous Hierarchical Topic Models
In our work, we propose to represent a hierarchical topic model (HTM) as a set
of flat models, or layers, and a set of topical hierarchies, or edges. We
suggest several quality measures for edges of hierarchical models, resembling
those proposed for flat models. We conduct an assessment experiment and show a
strong correlation between the proposed measures and human judgement on topical
edge quality. We also introduce a heterogeneous algorithm to build hierarchical
topic models for heterogeneous data sources. We show how making certain
adjustments to the learning process helps to retain the original structure of
customized models while allowing for slight coherent modifications for new
documents. We evaluate this approach using the proposed measures and show that
the proposed heterogeneous algorithm significantly outperforms the baseline
concat approach. Finally, we implement our own ESE called Rysearch, which
demonstrates the potential of the ARTM approach for visualizing large
heterogeneous document collections.
Hierarchical Semantic Tree Concept Whitening for Interpretable Image Classification
With the popularity of deep neural networks (DNNs), model interpretability is
becoming a critical concern. Many approaches have been developed to tackle the
problem through post-hoc analysis, such as explaining how predictions are made
or understanding the meaning of neurons in middle layers. Nevertheless, these
methods can only discover the patterns or rules that naturally exist in models.
In this work, rather than relying on post-hoc schemes, we proactively instill
knowledge to alter the representation of human-understandable concepts in
hidden layers. Specifically, we use a hierarchical tree of semantic concepts to
store the knowledge, which is leveraged to regularize the representations of
image data instances while training deep models. The axes of the latent space
are aligned with the semantic concepts, where the hierarchical relations
between concepts are also preserved. Experiments on real-world image datasets
show that our method improves model interpretability, showing better
disentanglement of semantic concepts, without negatively affecting model
classification performance.
Legal Judgement Prediction for UK Courts
Legal Judgement Prediction (LJP) is the task of automatically predicting the outcome of a court case given only the case document. During the last five years, researchers have successfully attempted this task for the supreme courts of three jurisdictions: the European Union, France, and China. The task is motivated by its many real-world applications, including a prediction system that can be used at the judgement drafting stage and the identification of the most important words and phrases within a judgement. The aim of our research was to build, for the first time, an LJP model for UK court cases. This required the creation of a labelled data set of UK court judgements and the subsequent application of machine learning models. We evaluated different feature representations and different algorithms. Our best performing model achieved 69.05% accuracy and a 69.02 F1 score. We demonstrate that LJP is a promising area of further research for UK courts by achieving high model performance and the ability to easily extract useful features.
On the Semantic Interpretability of Artificial Intelligence Models
Artificial Intelligence models are becoming increasingly powerful and
accurate, supporting or even replacing humans' decision making. But with
increased power and accuracy also comes higher complexity, making it hard for
users to understand how the model works and what the reasons behind its
predictions are. Humans must explain and justify their decisions, and so must
the AI models supporting them in this process, making semantic interpretability
an emerging field of study. In this work, we look at interpretability from a
broader point of view, going beyond the machine learning scope and covering
different AI fields such as distributional semantics and fuzzy logic, among
others. We examine and classify the models according to their nature and also
based on how they introduce interpretability features, analyzing how each
approach affects the final users and pointing to gaps that still need to be
addressed to provide more human-centered interpretability solutions.
Comment: 17 pages, 4 figures. Submitted to AI Magazine on August, 201
Data Curation with Deep Learning [Vision]
Data curation - the process of discovering, integrating, and cleaning data -
is one of the oldest, hardest, yet inevitable data management problems. Despite
decades of efforts from both researchers and practitioners, it remains one of
the most time-consuming and least enjoyable tasks for data scientists. In most
organizations, data curation plays an important role in fully unlocking the
value of big data. Unfortunately, the current solutions are not keeping up with
the ever-changing data ecosystem, because they often incur substantial
human cost. Meanwhile, deep learning is making strides in achieving remarkable
successes in multiple areas, such as image recognition, natural language
processing, and speech recognition. In this vision paper, we explore how some
of the fundamental innovations in deep learning could be leveraged to improve
existing data curation solutions and to help build new ones. In particular, we
provide a thorough overview of the current deep learning landscape, and
identify interesting research opportunities and dispel common myths. We hope
that the synthesis of these important domains will unleash a series of research
activities that will lead to significantly improved solutions for many data
curation tasks.
A general approach for Explanations in terms of Middle Level Features
There is growing interest in making Machine Learning (ML) systems more
understandable to, and trusted by, general users. Thus, generating explanations
for ML system behaviours that are understandable to human beings is a central
scientific and technological issue addressed by the rapidly growing research
area of eXplainable Artificial Intelligence (XAI). It has recently become
increasingly evident that new directions to create better explanations should
take into account what a good explanation is to a human user and, consequently,
develop XAI solutions able to provide user-centred explanations. This paper
proposes a general XAI approach that produces explanations of an ML system's
behaviour in terms of different, user-selected input features, i.e.,
explanations composed of input properties that the human user can select
according to their background knowledge and goals. To this end, the proposed
approach is able 1) to construct explanations in terms of input features that
represent more salient and understandable input properties for a user, which we
call here Middle-Level input Features (MLFs), and 2) to be applied to different
types of MLFs. We experimentally tested our approach on two different datasets
and using three different types of MLFs. The results seem encouraging.
Hierarchical Semantic Correspondence Learning for Post-Discharge Patient Mortality Prediction
Predicting patient mortality is an important and challenging problem in the
healthcare domain, especially for intensive care unit (ICU) patients.
Electronic health notes serve as a rich source for learning patient
representations that can facilitate effective risk assessment. However, a
large portion of clinical notes is unstructured and also contains
domain-specific terminology, from which we need to extract structured
information. In this paper, we introduce an embedding framework to learn
semantically-plausible distributed representations of clinical notes that
exploits the semantic correspondence between the unstructured texts and their
corresponding structured knowledge, known as semantic frames, in a hierarchical
fashion. Our approach integrates text modeling and semantic correspondence
learning into a single model that comprises 1) an unstructured embedding module
that makes use of self-similarity matrix representations in order to inject
structural regularities of different segments inherent in clinical texts to
promote local coherence, 2) a structured embedding module to embed the semantic
frames (e.g., UMLS semantic types) with a deep ConvNet and 3) a hierarchical
semantic correspondence module that embeds by enhancing the interactions
between text-semantic frame embedding pairs at multiple levels (i.e., word,
sentence, note). Evaluations on multiple embedding benchmarks on post-discharge
intensive care patient mortality prediction tasks demonstrate its effectiveness
compared to approaches that do not exploit the semantic interactions between
structured and unstructured information present in clinical notes.
A Machine Learning Approach to Indoor Localization Data Mining
Indoor positioning systems are increasingly commonplace in various environments and
produce large quantities of data. They are used in industrial applications, robotics,
and asset and employee tracking, to name a few use cases. The growing amount of data
and the accelerating progress of machine learning open up many new possibilities for
analyzing this data in ways that were not conceivable or relevant before. This paper
introduces connected concepts and implementations to answer the question of how this data
can be utilized. Data gathered in this thesis originates from an indoor positioning system
deployed in a retail environment, but the discussed methods can be applied generally.
The issue is approached by first introducing the concept of machine learning
and, more generally, artificial intelligence, and how they work on a general level. A
deeper dive is taken into subfields and algorithms that are relevant to the data mining task
at hand. Indoor positioning system basics are also briefly discussed to establish a basic understanding
of the realistic capabilities and constraints that these kinds of systems have.
These methods and previous knowledge from the literature are put to the test with the
freshly gathered data. An algorithm based on an existing example from the literature was tested
and improved upon with the new data. A novel method to cluster and classify movement
patterns was introduced, utilizing deep learning to create embedded representations of the
trajectories in a more complex learning pipeline. This type of learning is often referred
to as deep clustering.
The results are promising, and both of the methods produce useful high-level representations
of the complex dataset that can help a human operator discern the
relevant patterns from raw data and can be used as input for subsequent supervised and
unsupervised learning steps. Several factors related to optimizing the learning pipeline,
such as regularization, were also researched and the results presented as visualizations.
The research found that a pipeline consisting of a CNN autoencoder followed by a classic
clustering algorithm such as DBSCAN produces useful results in the form of trajectory
clusters. Regularization such as L1 regression improves this performance.
The research done in this paper presents useful algorithms for processing raw, noisy
localization data from indoor environments that can be used for further implementations
in both industrial applications and academia.
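
The abstract above does not give implementation details, so the following is only a minimal sketch of the final clustering stage it names (DBSCAN applied to learned trajectory embeddings). The CNN autoencoder is not reproduced here: the `embeddings` list is made-up 2-D data standing in for encoder outputs, and the function name `dbscan` and the parameter values `eps=0.5` and `min_pts=3` are assumptions chosen for illustration, not values from the thesis.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            # Not a core point; may later be claimed as a border point.
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                # j is itself a core point, so the cluster grows through it.
                queue.extend(reachable)
    return labels


# Hypothetical embeddings: two dense groups and one isolated trajectory.
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0),
]
labels = dbscan(embeddings, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

A density-based method fits this setting because the number of movement patterns need not be fixed in advance, and isolated trajectories are labelled as noise rather than forced into a cluster.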