1,535 research outputs found

    explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning

    We propose a framework for interactive and explainable machine learning that enables users to (1) understand machine learning models; (2) diagnose model limitations using different explainable AI methods; as well as (3) refine and optimize the models. Our framework combines an iterative XAI pipeline with eight global monitoring and steering mechanisms, including quality monitoring, provenance tracking, model comparison, and trust building. To operationalize the framework, we present explAIner, a visual analytics system for interactive and explainable machine learning that instantiates all phases of the suggested pipeline within the commonly used TensorBoard environment. We performed a user study with nine participants across different expertise levels to examine their perception of our workflow and to collect suggestions to fill the gap between our system and framework. The evaluation confirms that our tightly integrated system leads to an informed machine learning process while disclosing opportunities for further extensions. Comment: 9 pages paper, 2 pages references, 5 pages supplementary material (ancillary files

    Understanding Video Transformers for Segmentation: A Survey of Application and Interpretability

    Video segmentation encompasses a wide range of categories of problem formulation, e.g., object, scene, actor-action and multimodal video segmentation, for delineating task-specific scene components with pixel-level masks. Recently, approaches in this research area shifted from concentrating on ConvNet-based to transformer-based models. In addition, various interpretability approaches have appeared for transformer models and video temporal dynamics, motivated by the growing interest in basic scientific understanding, model diagnostics and societal implications of real-world deployment. Previous surveys mainly focused on ConvNet models on a subset of video segmentation tasks or transformers for classification tasks. Moreover, component-wise discussion of transformer-based video segmentation models has not yet received due focus. In addition, previous reviews of interpretability methods focused on transformers for classification, while analysis of video temporal dynamics modelling capabilities of video models received less attention. In this survey, we address the above with a thorough discussion of various categories of video segmentation, a component-wise discussion of the state-of-the-art transformer-based models, and a review of related interpretability methods. We first present an introduction to the different video segmentation task categories, their objectives, specific challenges and benchmark datasets. Next, we provide a component-wise review of recent transformer-based models and document the state of the art on different video segmentation tasks. Subsequently, we discuss post-hoc and ante-hoc interpretability methods for transformer models and interpretability methods for understanding the role of the temporal dimension in video models. Finally, we conclude our discussion with future research directions

    Detection of Power Line Supporting Towers via Interpretable Semantic Segmentation of 3D Point Clouds

    The inspection and maintenance of energy transmission networks are demanding and crucial tasks for any transmission system operator. They rely on a combination of on-the-ground staff and costly low-flying helicopters to visually inspect the power grid structure. Recently, LiDAR-based inspections have shown the potential to accelerate and increase inspection precision. These high-resolution sensors allow one to scan an environment and store it in a 3D point cloud format for further processing and analysis by maintenance specialists to prevent fires and damage to the electrical system. However, this task is especially demanding to handle on time when we consider the extensive area that the transmission network covers. Nonetheless, the transition to point cloud data allows us to take advantage of Deep Learning to automate these inspections, by detecting collisions between the grid and the revolving scene. Deep Learning is a recent and powerful tool that has been successfully applied to a myriad of real-life problems, such as image recognition and speech generation. With the introduction of affordable LiDAR sensors, the application of Deep Learning on 3D data emerged, with numerous methods being proposed every day to address difficult problems, from 3D object detection to 3D point cloud segmentation. Alas, state-of-the-art methods are remarkably complex, composed of millions of trainable parameters, and take several weeks, if not months, to train on specific hardware, which makes it difficult for traditional companies, like utilities, to employ them. Therefore, we explore a novel mathematical framework that allows us to define tailored operators that incorporate prior knowledge regarding our problem. These operators are then integrated into a learning agent, called SCENE-Net, that detects power line supporting towers in 3D point clouds. 
SCENE-Net allows for the interpretability of its results, which is not possible in conventional models, and it shows efficient training and inference times of 85 min and 20 ms on a regular laptop. Our model is composed of 11 trainable geometrical parameters, like the height of a cylinder, and has a Precision gain of 24% against a comparable CNN with 2190 parameters.

The inspection and maintenance of energy transmission networks are crucial tasks for network operators. Recently, inspections using LiDAR sensors have been adopted to accelerate this process and increase its precision. These sensors are high-precision devices that can scan environments and store them as 3D point clouds, to be analyzed later by specialists seeking to prevent forest fires and damage to the electrical structure. However, this task becomes quite difficult to complete in a timely manner because the transmission network is very large. We can therefore take advantage of the transition to LiDAR data and use deep learning to automate network inspections. Deep learning is a recent and rapidly developing field that is applied to many everyday problems and easily achieves better-than-human performance, for example in image recognition and speech generation. With the development of affordable LiDAR sensors, the use of deep learning on 3D data developed rapidly, with new methodologies presented every day that address complex problems such as 3D object detection. However, state-of-the-art models are incredibly complex, composed of millions of parameters, and take several weeks, if not months, to train on powerful GPUs, which hinders their use in traditional companies such as EDP. Therefore, we explore a new mathematical theory that allows us to define specific operators that incorporate knowledge about our problem. 
These operators are integrated into a deep learning model, called SCENE-Net, that detects power line supporting towers in point clouds. SCENE-Net allows its results to be interpreted, which is not possible with conventional models, and shows efficient training in 85 minutes and an inference time of 20 milliseconds on a regular computer. Our model contains only 11 geometric parameters, such as the height of a cylinder, and shows a Precision gain of 24% compared with a CNN with 2190 parameters

    Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

    Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem-solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner. Comment: Accepted for publication in Nature Communication

    A Learning Health System for Radiation Oncology

    The proposed research aims to address the challenges faced by clinical data science researchers in radiation oncology accessing, integrating, and analyzing heterogeneous data from various sources. The research presents a scalable intelligent infrastructure, called the Health Information Gateway and Exchange (HINGE), which captures and structures data from multiple sources into a knowledge base with semantically interlinked entities. This infrastructure enables researchers to mine novel associations and gather relevant knowledge for personalized clinical outcomes. The dissertation discusses the design framework and implementation of HINGE, which abstracts structured data from treatment planning systems, treatment management systems, and electronic health records. It utilizes disease-specific smart templates for capturing clinical information in a discrete manner. HINGE performs data extraction, aggregation, and quality and outcome assessment functions automatically, connecting seamlessly with local IT/medical infrastructure. Furthermore, the research presents a knowledge graph-based approach to map radiotherapy data to an ontology-based data repository using FAIR (Findable, Accessible, Interoperable, Reusable) concepts. This approach ensures that the data is easily discoverable and accessible for clinical decision support systems. The dissertation explores the ETL (Extract, Transform, Load) process, data model frameworks, and ontologies, and provides a real-world clinical use case for this data mapping. To improve the efficiency of retrieving information from large clinical datasets, a search engine based on ontology-based keyword searching and a synonym-based term-matching tool was developed. The hierarchical nature of ontologies is leveraged to retrieve patient records based on parent and child classes. 
Additionally, patient similarity analysis is conducted using vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) to identify similar patients based on text corpus creation methods. Results from the analysis using these models are presented. The implementation of a learning health system for predicting radiation pneumonitis following stereotactic body radiotherapy is also discussed. 3D convolutional neural networks (CNNs) are utilized with radiographic and dosimetric datasets to predict the likelihood of radiation pneumonitis. DenseNet-121 and ResNet-50 models are employed for this study, along with integrated gradient techniques to identify salient regions within the input 3D image dataset. The predictive performance of the 3D CNN models is evaluated based on clinical outcomes. Overall, the proposed Learning Health System provides a comprehensive solution for capturing, integrating, and analyzing heterogeneous data in a knowledge base. It offers researchers the ability to extract valuable insights and associations from diverse sources, ultimately leading to improved clinical outcomes. This work can serve as a model for implementing LHS in other medical specialties, advancing personalized and data-driven medicine