
    Human-Interpretable Explanations for Black-Box Machine Learning Models: An Application to Fraud Detection

    Machine Learning (ML) has been increasingly used to aid humans in making high-stakes decisions in a wide range of areas, from public policy and criminal justice to education, healthcare, and financial services. However, it is very hard for humans to grasp the rationale behind every ML model’s prediction, hindering trust in the system. The field of Explainable Artificial Intelligence (XAI) emerged to tackle this problem, aiming to research and develop methods to make those “black boxes” more interpretable, but no major breakthrough has yet been achieved. Additionally, the most popular explanation methods — LIME and SHAP — produce very low-level feature attribution explanations, which are of limited usefulness to personas without ML knowledge. This work was developed at Feedzai, a fintech company that uses ML to prevent financial crime. One of Feedzai's main products is a case management application used by fraud analysts to review suspicious financial transactions flagged by the ML models. Fraud analysts are domain experts trained to look for suspicious evidence in transactions, but they do not have ML knowledge; consequently, current XAI methods do not suit their information needs. To address this, we present JOEL, a neural network-based framework to jointly learn a decision-making task and associated domain knowledge explanations. JOEL is tailored to human-in-the-loop domain experts who lack deep technical ML knowledge, providing high-level insights about the model's predictions that closely resemble the experts' own reasoning. Moreover, by collecting domain feedback from a pool of certified experts (human teaching), we promote seamless and better-quality explanations. Lastly, we resort to semantic mappings between legacy expert systems and domain taxonomies to automatically annotate a bootstrap training set, overcoming the absence of concept-based human annotations. We validate JOEL empirically on a real-world fraud detection dataset at Feedzai.
We show that JOEL can generalize the explanations learned from the bootstrap dataset. Furthermore, the results indicate that human teaching is able to further improve the quality of the predicted explanations.
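
The abstract does not specify JOEL's architecture, only that a decision task and concept-level explanations are learned jointly. The sketch below is a minimal, hypothetical illustration of that idea: a shared trunk with two heads trained under a weighted joint loss. All dimensions, weights, and the concept names are assumptions, not the actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 16 transaction features, 5 domain concepts
# (e.g. "suspicious amount", "new device") and one fraud decision.
D, H, C = 16, 32, 5

# Shared trunk plus two heads, mirroring the joint-learning idea:
# one head predicts the decision, the other the concept explanations.
W1 = rng.normal(0, 0.1, (D, H))
Wd = rng.normal(0, 0.1, (H, 1))    # decision head
Wc = rng.normal(0, 0.1, (H, C))    # concept (explanation) head

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = np.tanh(X @ W1)            # shared representation
    return h, sigmoid(h @ Wd), sigmoid(h @ Wc)

def joint_loss(y_dec, y_con, p_dec, p_con, alpha=0.5):
    eps = 1e-9
    bce = lambda y, p: -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    # Weighted sum of the decision loss and the concept-explanation loss.
    return alpha * bce(y_dec, p_dec) + (1 - alpha) * bce(y_con, p_con)

# Toy batch of "transactions" with concept annotations.
X = rng.normal(size=(8, D))
y_dec = rng.integers(0, 2, (8, 1)).astype(float)
y_con = rng.integers(0, 2, (8, C)).astype(float)

h, p_dec, p_con = forward(X)
print(p_dec.shape, p_con.shape)  # (8, 1) (8, 5)
```

Training both heads against one shared representation is what lets the concept predictions act as explanations of the decision, since both are computed from the same internal features.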

    Learning visual representations with neural networks for video captioning and image generation

    The past decade has been marked as a golden era of neural network research. Not only have neural networks been successfully applied to solve more and more challenging real-world problems, but they have also become the dominant approach in many of the places where they have been tested. These places include, for instance, language understanding, game playing, and computer vision, thanks to neural networks' superiority in computational efficiency and statistical capacity. This thesis applies neural networks to problems in computer vision where high-level, semantically meaningful representations play a fundamental role. It demonstrates, both in theory and in experiment, the ability to learn such representations from data with and without supervision. The main content of the thesis is divided into two parts. The first part studies neural networks in the context of learning visual representations for the task of video captioning. Models are developed to dynamically focus on different frames while generating a natural language description of a short video. Such a model is further improved by recurrent convolutional operations.
The end of this part identifies fundamental challenges in video captioning and proposes a new type of evaluation metric that may be used experimentally as an oracle to benchmark performance. The second part studies a family of models that generate images. While the first part is supervised, this part is unsupervised. Its focus is the popular family of Neural Autoregressive Density Estimators (NADEs), a tractable probabilistic model for natural images. This work first makes a connection between NADEs and Generative Stochastic Networks (GSNs). The standard NADE is then improved by introducing multiple iterations in its inference without increasing the number of parameters, a variant dubbed the iterative NADE. Opening with a historical view, the work ends with a summary of recent developments related to the contributions of the two main parts, around the central topic of learning visual representations for images and videos, and envisions promising directions for future research.
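
The "dynamically focus on different frames" mechanism is a soft temporal attention. The thesis details are not given in the abstract, so the snippet below is only a generic sketch of that mechanism: at each decoding step, a context vector is formed as a softmax-weighted average of per-frame features. The dimensions and the projection matrix are illustrative placeholders, not trained values.

```python
import numpy as np

rng = np.random.default_rng(0)
T, F, H = 8, 64, 32                # frames, frame-feature size, decoder state size

frames = rng.normal(size=(T, F))   # per-frame visual features
h = rng.normal(size=(H,))          # current decoder (language model) state
Wa = rng.normal(0, 0.1, (H, F))    # untrained projection, for illustration only

# Relevance score of each frame given the decoder state, then a softmax.
scores = frames @ (Wa.T @ h)
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector: attention-weighted average of the frame features,
# fed to the decoder when emitting the next caption word.
context = weights @ frames
print(weights.shape, context.shape)  # (8,) (64,)
```

Because the weights depend on the decoder state, the model can attend to different frames for different words of the caption, which is the core of the approach described above.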

    END-TO-END PERFORMANCE ANALYSIS OF A RESOURCE ALLOCATION SERVICE

    This dissertation is centered on the monitoring and control platform of the company Skyline Communications, the DataMiner. This platform has a specific module called SRM (Service and Resource Management). One of SRM's many features is the capacity to make an advance reservation (booking) of resources on the client's network. When a booking is created, there is a time interval/delay between the moment the booking is requested and the moment all the necessary configurations for it actually start to be made at the Resource level. This delay is called the SyncTime. The SyncTime is affected by the dynamics of the network (e.g., a sudden increase in the number of bookings made at a given time). In order to guarantee the maximum possible quality of service to the client and ensure that the network dynamics have minimal impact on the real-time delivery of the desired content, the SRM module must be able to estimate whether a booking can be completed within an acceptable SyncTime. Given this problem, the main goal of this dissertation is to develop a machine-learning-based estimation/classification module capable of predicting or classifying the SyncTime based on the temporal state of the SRM module. Two approaches were considered: classifying the SyncTime into classes, or estimating its value. In order to test both approaches, we implemented several traditional machine learning methods as well as several neural networks. Both approaches were tested using a dataset collected from a DataMiner cluster composed of three DataMiner agents, using software developed in this dissertation. To collect the dataset, we considered several setups that captured the cluster under different network conditions. Comparing the two approaches, the results suggest that classifying the SyncTime directly with a classification model and binning the SyncTime predicted by an estimation model are equally good options.
Furthermore, based on the results of all the implementations, a prototype application was also developed. This application was fully developed in Python and uses a Multilayer Perceptron to classify the SyncTime of a booking based on several inputs given by the user.
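
The equivalence the results point to — a classifier predicting a class directly versus a regressor whose predicted value is then binned — rests on both routes ending in the same discrete label space. The class boundaries below are invented for illustration; the dissertation's actual SyncTime thresholds are not given in the abstract.

```python
# Hypothetical SyncTime class boundaries, in seconds (assumed for illustration).
BINS = [5.0, 15.0, 60.0]
LABELS = ["fast", "normal", "slow", "critical"]

def classify(sync_time: float) -> str:
    """Map a measured or predicted SyncTime onto a discrete class."""
    for bound, label in zip(BINS, LABELS):
        if sync_time <= bound:
            return label
    return LABELS[-1]

# Approach 1: a classification model predicts one of LABELS directly.
# Approach 2: an estimation (regression) model predicts a value in seconds,
# which is then binned — e.g. a predicted 12.4 s:
print(classify(12.4))  # normal
```

Either way, the operator sees the same class, which is why the two approaches can be compared on equal footing.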

    Conditional Invertible Generative Models for Supervised Problems

    Invertible neural networks (INNs), in the setting of normalizing flows, are a type of unconditional generative likelihood model. Despite various attractive properties compared to other common types of generative model, they are rarely useful for supervised tasks or real applications due to their unguided outputs. In this work, we therefore present three new methods that extend the standard INN setting, falling under the broader category we term generative invertible models. These new methods make it possible to leverage the theoretical and practical benefits of INNs to solve supervised problems in new ways, including real-world applications from different branches of science. The key finding is that our approaches enhance many aspects of trustworthiness in comparison to conventional feed-forward networks, such as uncertainty estimation and quantification, explainability, and proper handling of outlier data.
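
The standard INN building block in the normalizing-flow setting is the affine coupling layer, which is invertible in closed form: half of the input passes through unchanged and conditions an affine transform of the other half. The sketch below demonstrates only this generic building block (with random, untrained weights), not any of the three methods the work itself proposes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random placeholder weights for the scale and translation networks
# (here just linear maps; real flows use small neural networks).
W_s = rng.normal(0, 0.1, (2, 2))
W_t = rng.normal(0, 0.1, (2, 2))

def forward(x):
    x1, x2 = x[:, :2], x[:, 2:]
    s, t = np.tanh(x1 @ W_s), x1 @ W_t
    # x1 is passed through untouched, so the inverse can recompute s and t.
    return np.concatenate([x1, x2 * np.exp(s) + t], axis=1)

def inverse(y):
    y1, y2 = y[:, :2], y[:, 2:]
    s, t = np.tanh(y1 @ W_s), y1 @ W_t
    return np.concatenate([y1, (y2 - t) * np.exp(-s)], axis=1)

x = rng.normal(size=(4, 4))
print(np.allclose(inverse(forward(x)), x))  # True: exact invertibility
```

This exact, cheap invertibility is what gives INNs their tractable likelihood and is the property the extensions above carry over to supervised problems.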

    A study of the temporal relationship between eye actions and facial expressions

    A dissertation submitted in fulfillment of the requirements for the degree of Master of Science in the School of Computer Science and Applied Mathematics, Faculty of Science, August 15, 2017. Facial expression recognition is one of the most common means of communication used to complement the spoken word. However, people have grown to master ways of exhibiting deceptive expressions. Hence, it is imperative to understand differences in expressions, mostly for security purposes among others. Traditional methods employ machine learning techniques to differentiate real and fake expressions. However, this approach does not always work, as human subjects can easily mimic real expressions with a bit of practice. This study presents an approach that evaluates the time-related distance that exists between eye actions and an exhibited expression. The approach gives insights into some of the most fundamental characteristics of expressions. The study focuses on finding and understanding the temporal relationship that exists between eye blinks and smiles. It further looks at the relationship that exists between eye closure and pain expressions. The study incorporates active appearance models (AAM) for feature extraction and support vector machines (SVM) for classification. It tests extreme learning machines (ELM) in both the smile and pain studies, which attain better results than predominant algorithms such as the SVM. The study shows that eye blinks are highly correlated with the beginning of a smile in posed smiles, whereas eye blinks are highly correlated with the end of a smile in spontaneous smiles. A high correlation is also observed between eye closure and pain in spontaneous pain expressions. Furthermore, this study brings about ideas that lead to potential applications such as lie detection systems, robust health care monitoring systems, and enhanced animation design systems, among others.
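
The temporal-relationship findings boil down to correlating event times: blink times against smile onsets (posed) or offsets (spontaneous). The snippet below illustrates the posed-smile finding on entirely synthetic event times; the clustering of blinks near the onset is assumed for the sake of the example and is not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic event times (seconds into a clip), purely illustrative.
smile_onsets = np.sort(rng.uniform(0, 60, 20))
# Posed-smile pattern reported above: blinks cluster near the smile onset.
blink_times = smile_onsets + rng.normal(0.0, 0.3, 20)

# Pearson correlation between the two event-time series.
r = np.corrcoef(smile_onsets, blink_times)[0, 1]
print(r > 0.9)  # True: blinks track smile onsets almost perfectly
```

For spontaneous smiles, the same computation would instead pair blink times with smile offsets, matching the study's second finding.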

    Autonomous Drone Landings on an Unmanned Marine Vehicle using Deep Reinforcement Learning

    This thesis describes the integration of an Unmanned Surface Vehicle (USV) and an Unmanned Aerial Vehicle (UAV, also commonly known as a drone) into a single Multi-Agent System (MAS). In marine robotics, the advantage offered by a MAS consists of exploiting the key features of one robot to compensate for the shortcomings of the other. In this way, a USV can serve as a landing platform to alleviate the need for a UAV to be airborne for long periods of time, whilst the latter can increase the overall environmental awareness thanks to its ability to cover large portions of the surrounding environment with one or more onboard cameras. There are numerous potential applications in which this system can be used, such as deployment in search and rescue missions, water and coastal monitoring, and reconnaissance and force protection, to name but a few. The theory developed is of a general nature. The landing manoeuvre is accomplished mainly by identifying, through artificial vision techniques, a fiducial marker placed on a flat surface serving as the landing platform. The raison d'etre of the thesis was to propose a new solution for autonomous landing that relies solely on onboard sensors, with minimal or no communication between the vehicles. To this end, initial work solved the problem using only data from the cameras mounted on the in-flight drone. When tracking of the marker is interrupted, the current position of the USV is estimated and integrated into the control commands. The limitations of the classic control theory used in this approach suggested the need for a new solution that leveraged the flexibility of intelligent methods, such as fuzzy logic or artificial neural networks.
The recent achievements of deep reinforcement learning (DRL) techniques in end-to-end control, demonstrated on the Atari video-game suite, represented a fascinating yet challenging new way to see and address the landing problem. Therefore, novel architectures were designed to approximate the action-value function of a Q-learning algorithm and used to map raw input observations to high-level navigation actions. In this way, the UAV learnt how to land from high altitude without any human supervision, using only low-resolution grey-scale images, with a high level of accuracy and robustness. Both approaches were implemented on a simulated test-bed based on the Gazebo simulator and a model of the Parrot AR-Drone. The solution based on DRL was further verified experimentally using the Parrot Bebop 2 in a series of trials. The outcomes demonstrate that both of these innovative methods are feasible and practicable, not only in outdoor marine scenarios but also in indoor ones.
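
The action-value function at the heart of the DRL approach obeys the standard Q-learning update. The thesis approximates it with a deep network over grey-scale images; the sketch below instead uses a tiny tabular toy (a hypothetical discretized horizontal offset from the marker and three navigation actions) purely to illustrate the update rule, not the thesis architecture.

```python
import random

ACTIONS = [-1, 0, +1]          # move left, descend, move right
STATES = range(-3, 4)          # discretized offset from the landing marker

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2
random.seed(0)

def step(s, a):
    s2 = max(-3, min(3, s + a))
    # Reward descending only when centred over the marker.
    r = 1.0 if (s == 0 and a == 0) else -0.1
    return s2, r

for _ in range(2000):
    s = random.choice(list(STATES))
    for _ in range(20):
        # Epsilon-greedy exploration over the current Q estimates.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r = step(s, a)
        # Q-learning target: r + gamma * max_a' Q(s', a')
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in ACTIONS) - Q[(s, a)])
        s = s2

print(max(ACTIONS, key=lambda a: Q[(0, a)]))  # 0: descend when over the marker
```

A DQN-style agent replaces the table `Q` with a convolutional network over raw images, but the learning target it regresses toward is exactly the one computed in the update line above.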