
    This Looks Like That, Because ... Explaining Prototypes for Interpretable Image Recognition

    Image recognition with prototypes is considered an interpretable alternative to black-box deep learning models. Classification depends on the extent to which a test image "looks like" a prototype. However, perceptual similarity for humans can differ from the similarity learned by the classification model. Hence, merely visualising prototypes can be insufficient for a user to understand what exactly a prototype represents, and why the model considers a prototype and an image to be similar. We address this ambiguity and argue that prototypes should be explained. We improve interpretability by automatically enhancing visual prototypes with quantitative textual information about the visual characteristics deemed important by the classification model. Specifically, our method clarifies the meaning of a prototype by quantifying the influence of colour hue, shape, texture, contrast and saturation, and can generate both global and local explanations. Because of the generality of our approach, it can improve the interpretability of any similarity-based method for prototypical image recognition. In our experiments, we apply our method to the existing Prototypical Part Network (ProtoPNet). Our analysis confirms that the global explanations are generalisable and often correspond to the visually perceptible properties of a prototype. Our explanations are especially relevant for prototypes that might otherwise have been interpreted incorrectly. By explaining such 'misleading' prototypes, we improve the interpretability and simulatability of a prototype-based classification model. We also use our method to check whether visually similar prototypes have similar explanations, and are able to discover redundancy. Code is available at https://github.com/M-Nauta/Explaining_Prototypes. Comment: 10 pages, 9 figures
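    The quantification described above can be illustrated by suppressing one visual characteristic at a time and measuring how a prototype's similarity score changes. Below is a minimal Python sketch of that idea; similarity_to_prototype is a hypothetical placeholder for the model's patch-prototype similarity function, and the specific perturbations are illustrative assumptions rather than the paper's exact implementation.

    # Sketch: estimate how strongly one visual characteristic influences
    # a prototype by suppressing that characteristic and measuring the
    # drop in the prototype's similarity score.
    import numpy as np
    from PIL import Image, ImageEnhance

    def similarity_to_prototype(img):
        """Hypothetical placeholder for the model's similarity score
        between an image (patch) and one learned prototype."""
        raise NotImplementedError

    def shift_hue(img, offset=64):
        """Rotate hue by a fixed offset (PIL stores hue in 0..255)."""
        hsv = np.asarray(img.convert("HSV")).copy()
        hsv[..., 0] = ((hsv[..., 0].astype(np.int16) + offset) % 256).astype(np.uint8)
        return Image.fromarray(hsv, "HSV").convert("RGB")

    PERTURBATIONS = {
        "hue":        shift_hue,
        "saturation": lambda im: ImageEnhance.Color(im).enhance(0.2),
        "contrast":   lambda im: ImageEnhance.Contrast(im).enhance(0.2),
    }

    def characteristic_importance(img):
        """A large score drop after suppressing a characteristic suggests
        the prototype relies on that characteristic."""
        base = similarity_to_prototype(img)
        return {name: base - similarity_to_prototype(perturb(img))
                for name, perturb in PERTURBATIONS.items()}

    Averaging these score changes over many images would give a global explanation of a prototype, while computing them for a single test image gives a local one.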

    Explainable AI and Interpretable Computer Vision:From Oversight to Insight

    The increasing availability of big data and computational power has facilitated unprecedented progress in Artificial Intelligence (AI) and Machine Learning (ML). However, complex model architectures have resulted in high-performing yet uninterpretable ‘black boxes’. This prevents users from verifying that the reasoning process aligns with expectations and intentions. This thesis posits that a sole focus on predictive performance is an unsustainable trajectory, since a model can make the right predictions for the wrong reasons. The research field of Explainable AI (XAI) addresses the black-box nature of AI by generating explanations that present (aspects of) a model's behaviour in human-understandable terms. This thesis supports the transition from oversight to insight, and shows that explainability can give users more insight into every part of the machine learning pipeline: from the training data to the prediction model and the resulting explanations. When relying on explanations to judge a model's reasoning process, it is important that the explanations are truthful, relevant and understandable. Part I of this thesis reflects upon explanation quality and identifies 12 desirable properties, including compactness, completeness and correctness. Additionally, it provides an extensive collection of quantitative XAI evaluation methods, and analyses their availability in open-source toolkits. As an alternative to common post-model explainability, which reverse-engineers an already trained prediction model, Part II of this thesis presents in-model explainability for interpretable computer vision. These image classifiers learn prototypical parts, which are used in an interpretable decision tree or scoring sheet. The models are explainable by design since their reasoning depends on the extent to which an image patch “looks like” a learned part-prototype. Part III of this thesis shows that ML can also explain characteristics of a dataset. Because a model can analyse large amounts of data in little time, extracting hidden patterns can contribute to the validation and potential discovery of domain knowledge, and makes it possible to detect sources of bias and shortcuts early on. Concluding, neither the prediction model nor the data nor the explanation method should be handled as a black box. The way forward? AI with a human touch: developing powerful models that learn interpretable features, and using these meaningful features in a decision process that users can understand, validate and adapt. This in-model explainability, such as the part-prototype models from Part II, opens up the opportunity to ‘re-educate’ models with our desired norms, values and reasoning. Enabling human decision-makers to detect and correct undesired model behaviour will contribute towards an effective but also reliable and responsible usage of AI.
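    The "looks like" step that makes these models explainable by design can be sketched compactly: each prototype is compared against every spatial patch of a convolutional feature map, and the best match becomes that prototype's score. The numpy sketch below assumes a precomputed feature map and prototype matrix; the log-ratio similarity follows ProtoPNet, while the thesis's models learn the prototypes end to end.

    # Sketch of part-prototype scoring: compare each learned prototype
    # against every patch of a conv feature map; the best-matching patch
    # determines how much the image 'looks like' that prototype.
    import numpy as np

    def prototype_scores(features, prototypes):
        """features: (H, W, D) conv feature map of one image.
        prototypes: (P, D) learned part-prototype vectors.
        Returns (P,) scores, high when some patch resembles a prototype."""
        patches = features.reshape(-1, features.shape[-1])                  # (H*W, D)
        d2 = ((patches[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (H*W, P)
        sim = np.log((d2 + 1.0) / (d2 + 1e-4))  # ProtoPNet-style log-ratio similarity
        return sim.max(axis=0)                   # best patch per prototype

    In a scoring-sheet model, a weighted sum of these prototype scores then yields the class scores, while a prototype tree routes each image through decision nodes according to individual scores; either way, the evidence for a prediction remains a set of human-inspectable part matches.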

    Spirit calls Nature: Bridging Science and Spirituality, Consciousness and Evolution in a Synthesis of Knowledge

    This is a technical treatise for scientifically minded readers trying to expand their intellectual horizon beyond the straitjacket of materialism. It is dedicated to those scientists and philosophers who feel there is something more, but struggle to connect the dots into a more coherent picture supported by a way of seeing that allows us to overcome the present paradigm while maintaining scientific and conceptual rigor, without falling into oversimplifications. Most of the topics discussed are unknown even to neuroscientists, biologists and philosophers, and yet are based on findings published in their own mainstream peer-reviewed literature or on deep insights of the scientific, philosophical and spiritual giants of the past. A scientific, philosophical, and spiritual overview of the relationship between science and spirituality, neuroscience and the mystery of consciousness, mind and the nature of reality, evolution and life. A plea for a science that goes beyond the curve of reason and embraces a new synthesis of knowledge. The overcoming of the limitations of the intellect into an extended vision of ourselves and Nature. A critique of physicalism, the still-dominant doctrine which holds that all reality can be reduced to matter and the laws of physics alone. A review and reassessment of old and new philosophical and metaphysical ideas that attempts to bring Western and Eastern traditions closer together, where science, philosophy, consciousness, Spirit and Nature are united in a grand vision that transcends the limited conventional scientific and philosophical paradigm. A possible answer to the questions of purpose and meaning and the future evolution of humankind, beyond a conception that posits a priori a purposeless and meaningless universe. A report on the new scientific discoveries of a basal intelligence in cells and plants, on the question of whether mind is computational, the issue of free will, the mind-body problem, and the so-called ‘hard problem of consciousness’. An essay on ancient and modern philosophical conceptions, from the One of Plotinus and the God of Spinoza to the recent revival of panpsychism and universal consciousness. A journey into quantum physics from the perspective of philosophical idealism, and an invitation to adopt new ways of seeing that might help us transform our present understanding, expanding it into an integral cosmology, with special emphasis on the spiritual and evolutionary cosmology of the Indian seer Sri Aurobindo. Not just a philosophical and metaphysical meditation but, rather, an appeal to work towards a change of consciousness, a widening of our perspective towards a new way of seeing beyond a purely mechanistic worldview, to avoid social, environmental and economic collapse. Humans are transitional beings that will have to make a choice: relapse into a pre-rational state, or evolve towards a new trans-rational species supported by an ideal of human unity in diversity as the expression of a spiritual evolutionary process, the call of the Spirit on Nature.

    Computer Vision on Web Pages: A Study of Man-Made Images

    This thesis is focused on the development of computer vision techniques for parsing web pages using an image of the rendered page as evidence, and on understanding this under-explored class of images from the perspective of computer vision. The project is divided into two tracks, applied and theoretical, which complement each other. Our practical motivation is the application of improved web page parsing to assistive technology, such as screen readers for visually impaired users or the ability to declutter the presentation of a web page for those with cognitive deficits. From a more theoretical standpoint, images of rendered web pages have interesting properties from a computer vision perspective; in particular, low-level assumptions can be made in this domain, but the most important cues are often subtle and can be highly non-local. The parsing system developed in this thesis is a principled Bayesian segmentation-classification pipeline, using innovative techniques to produce valuable results in this challenging domain. The thesis addresses both implementation and evaluation. Segmentation of a web page is the problem of dividing it into semantically significant, visually coherent regions. We use a hierarchical segmentation method based on the detection of semantically significant lines (possibly broken lines) which divide regions. The Bayesian design allows sophisticated probability models to be applied to the segmentation process, and our method produces segmentation trees that achieve good performance on a variety of measures. Classification, for our purposes, is identifying the semantic role of regions in the segmentation tree of a page. We achieve promising results with a Bayesian classification algorithm based on the novel use of a hidden Markov tree model, in which the structure of the model is adapted to reflect the structure of the segmentation tree. This allows the algorithm to make effective use of the context in which regions appear as well as the features of each individual region. The methods used to evaluate our page parsing system include qualitative and quantitative evaluation of algorithm performance (using manually prepared ground truth data) as well as a user study of an assistive interface based on our page segmentation algorithm. We also performed a separate user study to investigate users' perceptions of web page organization and to generate ground truth segmentations, leading to important insights about consistency. Taken as a whole, this thesis presents innovative work in computer vision which contributes both to addressing the problem of web accessibility and to the understanding of semantic cues in images.
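    As a rough illustration of segmentation by dividing lines, the toy recursion below splits a grayscale page image wherever it finds a near-uniform horizontal or vertical band and records the cuts as a tree, in the spirit of the classic XY-cut algorithm. The thesis's Bayesian detector of semantically significant (and possibly broken) lines is far more sophisticated, so this is only an analogue; the uniformity threshold and depth limit are arbitrary assumptions.

    # Toy analogue of hierarchical page segmentation: recursively split a
    # rendered-page image along near-uniform bands (XY-cut style) and
    # return the resulting segmentation tree as nested dicts.
    import numpy as np

    def find_cut(gray, axis, thresh=1.0):
        """Index of the most uniform row (axis=0) or column (axis=1),
        or None if no band is uniform enough to act as a separator."""
        spread = gray.std(axis=1 - axis)      # per-row or per-column std
        idx = int(spread.argmin())
        return idx if spread[idx] < thresh else None

    def segment(gray, depth=0, max_depth=4):
        """gray: 2D float array, e.g. np.asarray(img.convert('L'), float)."""
        if depth < max_depth:
            for axis in (0, 1):               # try horizontal, then vertical cuts
                cut = find_cut(gray, axis)
                if cut is not None and 0 < cut < gray.shape[axis] - 1:
                    before, _, after = np.split(gray, [cut, cut + 1], axis=axis)
                    return {"cut_axis": axis,
                            "children": [segment(before, depth + 1, max_depth),
                                         segment(after, depth + 1, max_depth)]}
        return {"region": gray.shape}         # leaf: no further division

    Assigning a semantic role to each resulting region would then proceed over this tree, which is where the thesis's hidden Markov tree model comes in.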