3,527 research outputs found

An Interpretable Deep Architecture for Similarity Learning Built Upon Hierarchical Concepts

Author: Gao Xinjian
Goulermas John
Mu Tingting
Thiyagalingam Jeyarajan
Wang Meng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/01/2020

In general, development of adequately complex mathematical models, such as deep neural networks, can be an effective way to improve the accuracy of learning models. However, this is achieved at the cost of reduced post-hoc model interpretability, because what is learned by the model can become less intelligible and tractable to humans as the model complexity increases. In this paper, we target a similarity learning task in the context of image retrieval, with a focus on the model interpretability issue. An effective similarity neural network (SNN) is proposed not only to seek robust retrieval performance but also to achieve satisfactory post-hoc interpretability. The network is designed by linking the neuron architecture with the organization of a concept tree and by formulating neuron operations to pass similarity information between concepts. Various ways of understanding and visualizing what is learned by the SNN neurons are proposed. We also exhaustively evaluate the proposed approach using a number of relevant datasets against a number of state-of-the-art approaches to demonstrate the effectiveness of the proposed network. Our results show that the proposed approach can offer superior performance when compared against state-of-the-art approaches. Neuron visualization results are demonstrated to support the understanding of the trained neurons

University of Liverpool Repository

Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis

Author: Bihorac Azra
Rashidi Parisa
Shickel Benjamin
Tighe Patrick
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/02/2018

The past decade has seen an explosion in the amount of digital information stored in electronic health records (EHR). While primarily designed for archiving patient clinical information and administrative healthcare tasks, many researchers have found secondary use of these records for various clinical informatics tasks. Over the same period, the machine learning community has seen widespread advances in deep learning techniques, which also have been successfully applied to the vast amount of EHR data. In this paper, we review these deep EHR systems, examining architectures, technical aspects, and clinical applications. We also identify shortcomings of current techniques and discuss avenues of future research for EHR-based deep learning.

Comment: Accepted for publication with Journal of Biomedical and Health Informatics: http://ieeexplore.ieee.org/abstract/document/8086133

arXiv.org e-Print Archive

Construction and Quality Evaluation of Heterogeneous Hierarchical Topic Models

Author: Belyy Anton
Publication date: 07/11/2018

In our work, we propose to represent a hierarchical topic model (HTM) as a set of flat models, or layers, and a set of topical hierarchies, or edges. We suggest several quality measures for edges of hierarchical models, resembling those proposed for flat models. We conduct an assessment experiment and show strong correlation between the proposed measures and human judgement on topical edge quality. We also introduce a heterogeneous algorithm to build hierarchical topic models for heterogeneous data sources. We show how making certain adjustments to the learning process helps to retain the original structure of customized models while allowing for slight coherent modifications for new documents. We evaluate this approach using the proposed measures and show that the proposed heterogeneous algorithm significantly outperforms the baseline concat approach. Finally, we implement our own ESE called Rysearch, which demonstrates the potential of the ARTM approach for visualizing large heterogeneous document collections.

arXiv.org e-Print Archive

Hierarchical Semantic Tree Concept Whitening for Interpretable Image Classification

Author: Dai Haixing
Li Changying
Liu David
Liu Ninghao
Liu Tianming
Liu Zhengliang
Lyu Yanjun
Wu Zihao
Yu Xiaowei
Zhang Lu
Zhao Lin
Zhu Dajiang
Publication date: 10/07/2023

With the popularity of deep neural networks (DNNs), model interpretability is becoming a critical concern. Many approaches have been developed to tackle the problem through post-hoc analysis, such as explaining how predictions are made or understanding the meaning of neurons in middle layers. Nevertheless, these methods can only discover the patterns or rules that naturally exist in models. In this work, rather than relying on post-hoc schemes, we proactively instill knowledge to alter the representation of human-understandable concepts in hidden layers. Specifically, we use a hierarchical tree of semantic concepts to store the knowledge, which is leveraged to regularize the representations of image data instances while training deep models. The axes of the latent space are aligned with the semantic concepts, where the hierarchical relations between concepts are also preserved. Experiments on real-world image datasets show that our method improves model interpretability, showing better disentanglement of semantic concepts, without negatively affecting model classification performance

arXiv.org e-Print Archive

Legal Judgement Prediction for UK Courts

Author: Bengio Yoshua
Blei David M
Edo-Osagie Oduwa
Joachims Thorsten
Lawlor Reed C
Le Quoc
Lee Sangno
Medvedeva Masha
Mikolov Tomas
Socher Richard
Wyner Adam
Zhang Xiang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/04/2020

Legal Judgement Prediction (LJP) is the task of automatically predicting the outcome of a court case given only the case document. During the last five years researchers have successfully attempted this task for the supreme courts of three jurisdictions: the European Union, France, and China. The motivation comes from its many real-world applications, including a prediction system that can be used at the judgement drafting stage and the identification of the most important words and phrases within a judgement. The aim of our research was to build, for the first time, an LJP model for UK court cases. This required the creation of a labelled data set of UK court judgements and the subsequent application of machine learning models. We evaluated different feature representations and different algorithms. Our best performing model achieved 69.05% accuracy and a 69.02 F1 score. We demonstrate that LJP is a promising area of further research for UK courts by achieving high model performance and the ability to easily extract useful features.

On the Semantic Interpretability of Artificial Intelligence Models

Author: Freitas André
Handschuh Siegfried
Silva Vivian S.
Publication date: 09/07/2019

Artificial Intelligence models are becoming increasingly more powerful and accurate, supporting or even replacing humans' decision making. But with increased power and accuracy also comes higher complexity, making it hard for users to understand how the model works and what the reasons behind its predictions are. Humans must explain and justify their decisions, and so do the AI models supporting them in this process, making semantic interpretability an emerging field of study. In this work, we look at interpretability from a broader point of view, going beyond the machine learning scope and covering different AI fields such as distributional semantics and fuzzy logic, among others. We examine and classify the models according to their nature and also based on how they introduce interpretability features, analyzing how each approach affects the final users and pointing to gaps that still need to be addressed to provide more human-centered interpretability solutions.

Comment: 17 pages, 4 figures. Submitted to AI Magazine on August, 201

arXiv.org e-Print Archive

Data Curation with Deep Learning [Vision]

Author: Doan AnHai
Ouzzani Mourad
Tang Nan
Thirumuruganathan Saravanan
Publication date: 24/03/2019

Data curation - the process of discovering, integrating, and cleaning data - is one of the oldest, hardest, yet inevitable data management problems. Despite decades of efforts from both researchers and practitioners, it is still one of the most time-consuming and least enjoyable tasks for data scientists. In most organizations, data curation plays an important role in fully unlocking the value of big data. Unfortunately, the current solutions are not keeping up with the ever-changing data ecosystem, because they often require substantially high human cost. Meanwhile, deep learning is making strides in achieving remarkable successes in multiple areas, such as image recognition, natural language processing, and speech recognition. In this vision paper, we explore how some of the fundamental innovations in deep learning could be leveraged to improve existing data curation solutions and to help build new ones. In particular, we provide a thorough overview of the current deep learning landscape, identify interesting research opportunities, and dispel common myths. We hope that the synthesis of these important domains will unleash a series of research activities that will lead to significantly improved solutions for many data curation tasks.

arXiv.org e-Print Archive

A general approach for Explanations in terms of Middle Level Features

Author: Apicella Andrea
Isgrò Francesco
Prevete Roberto
Publication date: 01/07/2021

Nowadays, there is growing interest in making Machine Learning (ML) systems more understandable and trustworthy to general users. Thus, generating explanations for ML system behaviours that are understandable to human beings is a central scientific and technological issue addressed by the rapidly growing research area of eXplainable Artificial Intelligence (XAI). Recently, it has become increasingly evident that new directions to create better explanations should take into account what a good explanation is to a human user and, consequently, develop XAI solutions able to provide user-centred explanations. This paper suggests developing a general XAI approach that produces explanations for an ML system's behaviour in terms of different, user-selected input features, i.e., explanations composed of input properties that human users can select according to their background knowledge and goals. To this end, we propose a general XAI approach that is able 1) to construct explanations in terms of input features that represent more salient and understandable input properties for a user, which we call here Middle-Level input Features (MLFs), and 2) to be applied to different types of MLFs. We experimentally tested our approach on two different datasets and using three different types of MLFs. The results seem encouraging.

arXiv.org e-Print Archive

Hierarchical Semantic Correspondence Learning for Post-Discharge Patient Mortality Prediction

Author: Chowdhury Shaika
Luo Yuan
Yu Philip S.
Zhang Chenwei
Publication date: 14/10/2019

Predicting patient mortality is an important and challenging problem in the healthcare domain, especially for intensive care unit (ICU) patients. Electronic health notes serve as a rich source for learning patient representations that can facilitate effective risk assessment. However, a large portion of clinical notes are unstructured and also contain domain-specific terminology, from which we need to extract structured information. In this paper, we introduce an embedding framework to learn semantically-plausible distributed representations of clinical notes that exploits the semantic correspondence between the unstructured texts and their corresponding structured knowledge, known as semantic frames, in a hierarchical fashion. Our approach integrates text modeling and semantic correspondence learning into a single model that comprises 1) an unstructured embedding module that makes use of self-similarity matrix representations in order to inject structural regularities of different segments inherent in clinical texts to promote local coherence, 2) a structured embedding module to embed the semantic frames (e.g., UMLS semantic types) with deep ConvNet and 3) a hierarchical semantic correspondence module that embeds by enhancing the interactions between text-semantic frame embedding pairs at multiple levels (i.e., words, sentence, note). Evaluations on multiple embedding benchmarks on post-discharge intensive care patient mortality prediction tasks demonstrate its effectiveness compared to approaches that do not exploit the semantic interactions between structured and unstructured information present in clinical notes.

arXiv.org e-Print Archive

A Machine Learning Approach to Indoor Localization Data Mining

Author: Lindqvist Samuel
Publication date: 25/11/2020

Indoor positioning systems are increasingly commonplace in various environments and produce large quantities of data. They are used in industrial applications, robotics, and asset and employee tracking, to name just a few use cases. The growing amount of data and the accelerating progress of machine learning open up many new possibilities for analyzing this data in ways that were not conceivable or relevant before. This paper introduces connected concepts and implementations to answer the question of how this data can be utilized. Data gathered in this thesis originates from an indoor positioning system deployed in a retail environment, but the discussed methods can be applied generally. The issue will be approached by first introducing the concept of machine learning and, more generally, artificial intelligence, and how they work on a general level. A deeper dive is taken into subfields and algorithms that are relevant to the data mining task at hand. Indoor positioning system basics are also briefly discussed to create a base understanding of the realistic capabilities and constraints of these kinds of systems. These methods and previous knowledge from the literature are put to the test with the freshly gathered data. An algorithm based on an existing example from the literature was tested and improved upon with the new data. A novel method to cluster and classify movement patterns was introduced, utilizing deep learning to create embedded representations of the trajectories in a more complex learning pipeline. This type of learning is often referred to as deep clustering. The results are promising, and both of the methods produce useful high-level representations of the complex dataset that can help a human operator discern the relevant patterns from raw data and that can be used as input for subsequent supervised and unsupervised learning steps.
Several factors related to optimizing the learning pipeline, such as regularization, were also researched and the results presented as visualizations. The research found that a pipeline consisting of a CNN autoencoder followed by a classic clustering algorithm such as DBSCAN produces useful results in the form of trajectory clusters. Regularization such as L1 regression improves this performance. The research done in this paper presents useful algorithms for processing raw, noisy localization data from indoor environments that can be used for further implementations in both industrial applications and academia.