145 research outputs found
Human-Interpretable Explanations for Black-Box Machine Learning Models: An Application to Fraud Detection
Machine Learning (ML) has been increasingly used to aid humans in making high-stakes
decisions in a wide range of areas, from public policy to criminal justice, education,
healthcare, and financial services. However, it is very hard for humans to grasp the rationale
behind every ML model's prediction, which hinders trust in the system. The field
of Explainable Artificial Intelligence (XAI) emerged to tackle this problem, aiming to
research and develop methods to make those "black boxes" more interpretable, but there
is still no major breakthrough. Additionally, the most popular explanation methods,
LIME and SHAP, produce very low-level feature-attribution explanations, which are of
limited usefulness to personas without any ML knowledge.
This work was developed at Feedzai, a fintech company that uses ML to prevent financial
crime. One of Feedzai's main products is a case-management application used
by fraud analysts to review suspicious financial transactions flagged by the ML models.
Fraud analysts are domain experts trained to look for suspicious evidence in transactions,
but they do not have ML knowledge; consequently, current XAI methods do not
suit their information needs. To address this, we present JOEL, a neural-network-based
framework to jointly learn a decision-making task and the associated domain-knowledge
explanations. JOEL is tailored to human-in-the-loop domain experts who lack deep technical
ML knowledge, providing high-level insights about the model's predictions that
closely resemble the experts' own reasoning. Moreover, by collecting domain
feedback from a pool of certified experts (human teaching), we promote seamless,
better-quality explanations. Lastly, we resort to semantic mappings between legacy expert
systems and domain taxonomies to automatically annotate a bootstrap training set, overcoming
the absence of concept-based human annotations. We validate JOEL empirically
on a real-world fraud detection dataset at Feedzai. We show that JOEL can generalize
the explanations learned from the bootstrap dataset. Furthermore, the obtained results indicate that
human teaching can further improve the quality of the predicted explanations.
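The joint-learning idea behind JOEL can be pictured as a shared encoder feeding two heads: one scoring the fraud decision and one predicting high-level, analyst-readable explanation concepts, trained under a combined loss. The sketch below is a minimal, hypothetical illustration in that spirit; the layer sizes, concept names, and random weights are ours, not JOEL's or Feedzai's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical explanation concepts a fraud analyst might recognise:
CONCEPTS = ["suspicious_amount", "risky_merchant", "velocity_anomaly"]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class JointExplainerNet:
    def __init__(self, n_features, n_hidden, n_concepts):
        self.W_enc = rng.normal(scale=0.1, size=(n_features, n_hidden))
        self.w_task = rng.normal(scale=0.1, size=(n_hidden, 1))
        self.W_conc = rng.normal(scale=0.1, size=(n_hidden, n_concepts))

    def forward(self, x):
        h = np.tanh(x @ self.W_enc)               # shared representation
        fraud_score = sigmoid(h @ self.w_task)    # decision head: P(fraud)
        concept_probs = sigmoid(h @ self.W_conc)  # explanation head: P(concept present)
        return fraud_score, concept_probs

def joint_loss(fraud_score, y, concept_probs, c):
    # Binary cross-entropy on the task plus multi-label BCE on the concepts.
    bce = lambda p, t: -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))
    return bce(fraud_score, y) + bce(concept_probs, c)

net = JointExplainerNet(n_features=8, n_hidden=16, n_concepts=len(CONCEPTS))
x = rng.normal(size=(4, 8))          # a mini-batch of transaction feature vectors
score, concepts = net.forward(x)
```

Training both heads against one loss is what ties the predicted concepts to the decision; human teaching would then refine the concept labels over time.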
Learning visual representations with neural networks for video captioning and image generation
The past decade has been marked as a golden era of neural network research. Not only have neural networks been successfully applied to solve more and more challenging real-world problems, but they have also become the dominant approach in many of the areas where they have been tested, such as language understanding, game playing, and computer vision, thanks to their superiority in computational efficiency and statistical capacity. This thesis applies neural networks to problems in computer vision where high-level, semantically meaningful representations play a fundamental role. It demonstrates, both in theory and in experiment, the ability to learn such representations from data, with and without supervision.
The main content of the thesis is divided into two parts. The first part studies neural networks in the context of learning visual representations for the task of video captioning. Models are developed to dynamically focus on different frames while generating a natural language description of a short video. Such a model is further improved by recurrent convolutional operations. The end of this part identifies fundamental challenges in video captioning and proposes a new type of evaluation metric that may be used experimentally as an oracle to benchmark performance.
The second part studies the family of models that generate images. While the first part is supervised, this part is unsupervised. Its focus is the popular family of Neural Autoregressive Density Estimators (NADEs), a tractable probabilistic model for natural images. This work first makes a connection between NADEs and Generative Stochastic Networks (GSNs). The standard NADE is then improved by introducing multiple iterations in its inference without increasing the number of parameters; this variant is dubbed the iterative NADE. Starting with a historical overview, the work ends with a summary of recent developments related to the contributions of the two main parts, around the central topic of learning visual representations for images and videos. A bright future is envisioned at the end.
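The dynamic focus on different frames described above is an instance of soft temporal attention: score each frame feature against the current decoder state, softmax the scores, and take the weighted average as the context for the next word. A minimal sketch, with illustrative dimensions and random weights rather than the thesis's actual captioning model:

```python
import numpy as np

rng = np.random.default_rng(1)

def temporal_attention(frame_feats, decoder_state, W):
    """Soft attention over T frame features (T, D) given a decoder state (H,)."""
    scores = frame_feats @ W @ decoder_state        # (T,) relevance of each frame
    scores -= scores.max()                          # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum() # (T,) attention distribution
    context = weights @ frame_feats                 # (D,) weighted frame summary
    return context, weights

T, D, H = 10, 64, 32                 # frames, frame-feature dim, decoder-state dim
frames = rng.normal(size=(T, D))
state = rng.normal(size=H)
W = rng.normal(scale=0.1, size=(D, H))  # learned bilinear scoring matrix (random here)
context, weights = temporal_attention(frames, state, W)
```

Because the weights form a distribution over frames, they can also be inspected to see which moments of the video the model attended to for each generated word.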
END-TO-END PERFORMANCE ANALYSIS OF A RESOURCE ALLOCATION SERVICE
This dissertation is centered around the monitoring and control platform of the company
Skyline Communications, DataMiner. This platform has a specific module called
SRM (Service and Resource Management). One of SRM's many features is the capacity
to make an advance reservation (booking) of resources on the client's network. When a
booking is created, there is a time interval/delay between the moment a booking
is requested and the moment all necessary configurations for this booking actually
start to be made at the Resource level. This delay is called the SyncTime.
The SyncTime is affected by the dynamics of the network (e.g., a sudden increase in
the number of bookings made at a given time). In order to guarantee the highest
possible quality of service to the client and ensure that the network dynamics have
minimal impact on the real-time delivery of the desired content, the SRM module must
be able to estimate whether a booking can be completed within an acceptable SyncTime. Given
this problem, the main goal of this dissertation is to develop a machine-learning-based
estimation/classification module that is capable of predicting or classifying the
SyncTime based on the temporal state of the SRM module.
Two approaches were considered: classifying the SyncTime into classes or estimating
its value. In order to test both approaches, we implemented several traditional
machine learning methods as well as several neural networks. Both approaches were
tested using a dataset collected, with software developed in this dissertation, from a
DataMiner cluster composed of three DataMiner agents. In order to collect the dataset,
we considered several setups that captured the cluster under different network conditions.
By comparing both approaches, the results suggest that classifying the SyncTime
with a classification model and classifying the SyncTime predicted by an estimation
model are equally good options. Furthermore, based on the results of all
implementations, a prototype application was also developed. This application was fully
developed in Python and uses a Multilayer Perceptron to perform the classification
of the SyncTime of a booking, based on several inputs given by the user.
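Both approaches ultimately produce a class label: the classification model predicts it directly, while the estimation model's predicted SyncTime value is mapped onto the same classes. A minimal sketch of that mapping; the thresholds and class names below are illustrative assumptions, not the dissertation's actual classes:

```python
# Illustrative SyncTime class boundaries (seconds) and labels; the real
# classes used in the dissertation are not reproduced here.
THRESHOLDS = [5.0, 15.0]
LABELS = ["fast", "acceptable", "slow"]

def bin_synctime(seconds):
    """Map a SyncTime value onto a class label.

    Used both to derive targets for a classification model and to bin
    the continuous output of an estimation (regression) model, so the
    two approaches can be compared on equal terms.
    """
    for bound, label in zip(THRESHOLDS, LABELS):
        if seconds < bound:
            return label
    return LABELS[-1]
```

Comparing the two approaches then reduces to comparing label accuracy: direct classification versus regression followed by this binning step.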
Conditional Invertible Generative Models for Supervised Problems
Invertible neural networks (INNs), in the setting of normalizing flows, are a type of unconditional generative likelihood model. Despite various attractive properties compared to other common generative model types, they are rarely useful for supervised tasks or real applications due to their unguided outputs. In this work, we therefore present three new methods that extend the standard INN setting, falling under a broader category we term generative invertible models. These new methods allow leveraging the theoretical and practical benefits of INNs to solve supervised problems in new ways, including real-world applications from different branches of science. The key finding is that our approaches enhance many aspects of trustworthiness in comparison to conventional feed-forward networks, such as uncertainty estimation and quantification, explainability, and proper handling of outlier data.
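The defining property separating INNs from ordinary feed-forward networks is exact invertibility, commonly obtained with affine coupling layers: half of the input passes through unchanged and conditions an affine transform of the other half, so the inverse exists in closed form regardless of what the learned scale and translation networks compute. A minimal sketch with toy stand-ins for those learned networks:

```python
import numpy as np

def coupling_forward(x, s_fn, t_fn):
    """One affine coupling layer: x1 passes through, x2 is transformed."""
    x1, x2 = np.split(x, 2)
    y2 = x2 * np.exp(s_fn(x1)) + t_fn(x1)     # conditioned on x1 only
    return np.concatenate([x1, y2])

def coupling_inverse(y, s_fn, t_fn):
    """Exact closed-form inverse, valid for arbitrary s_fn and t_fn."""
    y1, y2 = np.split(y, 2)
    x2 = (y2 - t_fn(y1)) * np.exp(-s_fn(y1))
    return np.concatenate([y1, x2])

# Toy "networks" standing in for the learned scale/translation subnets:
s = lambda h: np.tanh(h)
t = lambda h: 0.5 * h

x = np.array([0.3, -1.2, 0.7, 2.0])
y = coupling_forward(x, s, t)
x_rec = coupling_inverse(y, s, t)
```

Stacking such layers (with permutations between them) yields a fully invertible network whose likelihood is tractable via the change-of-variables formula; the conditional extensions discussed above additionally feed supervised information into `s_fn` and `t_fn`.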
A study of the temporal relationship between eye actions and facial expressions
A dissertation submitted in fulfillment of the requirements for the
degree of Master of Science
in the
School of Computer Science and Applied Mathematics
Faculty of Science
August 15, 2017

Facial expression recognition is one of the most common means of communication used
for complementing spoken word. However, people have grown to master ways of
exhibiting deceptive expressions. Hence, it is imperative to understand differences in
expressions, mostly for security purposes among others. Traditional methods employ
machine learning techniques to differentiate real and fake expressions. However, this
approach does not always work, as human subjects can easily mimic real expressions with
a bit of practice. This study presents an approach that evaluates the temporal distance
that exists between eye actions and an exhibited expression. The approach gives
insights into some of the most fundamental characteristics of expressions. The study
focuses on finding and understanding the temporal relationship that exists between eye
blinks and smiles. It further looks at the relationship that exists between eye closure and
pain expressions. The study incorporates active appearance models (AAM) for feature
extraction and support vector machines (SVM) for classification. It tests extreme
learning machines (ELM) in both the smile and pain studies, which attain better
results than predominant algorithms such as the SVM. The study shows that eye blinks
are highly correlated with the beginning of a smile in posed smiles, whereas eye blinks are
highly correlated with the end of a smile in spontaneous smiles. A high correlation is
also observed between eye closure and pain in spontaneous pain expressions. Furthermore,
this study brings about ideas that lead to potential applications such as lie detection
systems, robust health care monitoring systems and enhanced animation design systems,
among others.
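The temporal relationship between an eye action and an expression can be quantified as the lag at which the two event signals best align, for instance via cross-correlation. A minimal sketch, assuming binary per-frame onset signals; this is an illustration of the lag idea only, not the study's AAM/SVM/ELM pipeline:

```python
import numpy as np

def event_lag(blinks, smiles):
    """Lag (in frames) at which the smile signal best trails the blink signal.

    A positive return value means the smile event follows the blink; a
    negative one means it precedes it. Uses mean-centred cross-correlation.
    """
    a = smiles - smiles.mean()
    v = blinks - blinks.mean()
    corr = np.correlate(a, v, mode="full")
    lags = np.arange(-len(v) + 1, len(a))   # lag axis for 'full' mode
    return lags[np.argmax(corr)]

# Synthetic example: a blink at frame 10 and a smile onset 3 frames later.
blinks = np.zeros(50); blinks[10] = 1.0
smiles = np.zeros(50); smiles[13] = 1.0
```

On real sequences the peak of the correlation would indicate whether blinks cluster at the beginning of a smile (as in posed smiles) or at its end (as in spontaneous ones).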
Autonomous Drone Landings on an Unmanned Marine Vehicle using Deep Reinforcement Learning
This thesis describes the integration of an Unmanned Surface Vehicle (USV) and an Unmanned Aerial Vehicle (UAV, also commonly known as a drone) into a single Multi-Agent System (MAS). In marine robotics, the advantage offered by a MAS consists of exploiting the key features of one robot to compensate for the shortcomings of the other. In this way, a USV can serve as the landing platform to alleviate the need for a UAV to be airborne for long periods of time, whilst the latter can increase the overall environmental awareness thanks to its ability to cover large portions of the surrounding environment with one or more onboard cameras. There are numerous potential applications in which this system can be used, such as deployment in search and rescue missions, water and coastal monitoring, and reconnaissance and force protection, to name but a few.
The theory developed is of a general nature. The landing manoeuvre is accomplished mainly by identifying, through artificial-vision techniques, a fiducial marker placed on a flat surface serving as a landing platform. The raison d'être of the thesis was to propose a new solution for autonomous landing that relies solely on onboard sensors, with minimal or no communication between the vehicles. To this end, initial work solved the problem using only data from the cameras mounted on the in-flight drone. When tracking of the marker is interrupted, the current position of the USV is estimated and integrated into the control commands. The limitations of the classic control theory used in this approach suggested the need for a new solution that leverages the flexibility of intelligent methods, such as fuzzy logic or artificial neural networks. The recent achievements of deep reinforcement learning (DRL) techniques in end-to-end control, such as playing the Atari video-game suite, represented a fascinating yet challenging new way to see and address the landing problem. Therefore, novel architectures were designed for approximating the action-value function of a Q-learning algorithm and used to map raw input observations to high-level navigation actions. In this way, the UAV learnt how to land from high altitude without any human supervision, using only low-resolution grey-scale images, with a high level of accuracy and robustness. Both approaches were implemented on a simulated test-bed based on the Gazebo simulator and the model of the Parrot AR-Drone. The solution based on DRL was further verified experimentally using the Parrot Bebop 2 in a series of trials. The outcomes demonstrate that both of these innovative methods are feasible and practicable, not only in outdoor marine scenarios but also in indoor ones.
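At the core of the DRL approach is Q-learning's action-value update, which the thesis approximates with a neural network mapping raw grey-scale images to navigation actions. A minimal tabular sketch of that update on a toy landing scenario of our own invention (states, actions, and reward are illustrative, not the thesis's):

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the TD target
    r + gamma * max_a' Q(s', a'). DRL replaces this table with a network."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Toy landing MDP: states 0..2 ("high", "low", "landed"),
# actions 0/1 ("descend", "hover"). All values start at zero.
Q = np.zeros((3, 2))
Q = q_update(Q, s=1, a=0, r=1.0, s_next=2)   # descending from "low" lands: reward 1
```

Iterating such updates over many simulated episodes is what lets the agent learn the landing policy without human supervision; the Gazebo test-bed supplies the transitions.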