5 research outputs found

    Informed Machine Learning: Integrating Prior Knowledge into Data-Driven Learning Systems

    Machine Learning is an important method in Artificial Intelligence (AI). It has shown great success in building models for tasks like prediction or image recognition by learning from patterns in large amounts of data. However, it reaches its limits when training data is insufficient. A potential solution is the additional integration of prior knowledge, such as physical laws, logic rules, or knowledge graphs. This leads to the notion of Informed Machine Learning (Informed ML). However, the field is so application-driven that general analyses are rare. The goal of this PhD thesis is the unification of Informed ML through general, systematic frameworks. In particular, the following research questions are answered: 1) What is the fundamental concept of Informed ML, and how can existing approaches be structurally classified? 2) Is it possible to integrate prior knowledge in a universal way? 3) How can the benefits of Informed ML be quantified, and what are the requirements for the injected knowledge? First, a concept for Informed ML is proposed, which defines it as learning from a hybrid information source that consists of data and prior knowledge. A taxonomy that serves as a structured classification framework for existing or potential approaches is presented. It considers the knowledge source, its representation type, and the integration stage into the ML pipeline. The concept of Informed ML is further extended to the combination of ML and simulation towards Hybrid AI. Then, two new methods for universal knowledge integration are developed. The first method, Informed Pre-Training, allows neural networks to be initialized with prototypes from prior knowledge. Experiments show that it improves generalization, especially for small data, and increases robustness. An analysis of the individual neural network layers shows that the improvements come from transferring the deeper layers, which confirms the transfer of semantic knowledge (Informed Transfer Learning). The second method, Geo-Informed Validation, checks models for their conformity with knowledge from street maps. It is developed in the application context of autonomous driving, where it can help to prevent potential prediction errors, e.g., in semantic segmentations of traffic scenes. Finally, a catalogue of relevant metrics for quantifying the benefits of knowledge injection is defined. Among others, it includes in-distribution accuracy, out-of-distribution robustness, and knowledge conformity. In addition, a new metric that combines performance improvement and data reduction is introduced. Furthermore, a theoretical framework that represents prior knowledge in a function space and relates it to data representations is presented. It reveals that the distances between knowledge and data influence potential model improvements, which is confirmed in a systematic experimental study. All in all, these frameworks support the unification of Informed ML, which makes it more accessible and usable – and helps to achieve trustworthy AI.

    Factors associated with non-use of condoms among heterosexually-active single people in Germany: Results from the first representative, population-based German health and sexuality survey (GeSiD)

    BACKGROUND: Against the backdrop of rising STI incidence among the heterosexual population, sexually active single people are at particularly high STI transmission risk. Gaining insight into circumstances related to condom non-use in this population is therefore important for developing effective health interventions. METHODS: The nationally-representative survey, GeSiD (German Health and Sexuality Survey), undertaken 2018–2019, interviewed 4,955 people aged 18–75 years. A total of 343 heterosexually-active single participants answered a question about condom use at last sex. Data on sociodemographic characteristics, sexual behaviours and circumstances of last sex were analysed to identify independently associated factors. RESULTS: Condom non-use at last sex was reported more commonly by participants aged >35 years than by younger participants (48.5 vs 33.7%, respectively) and was more likely in longer relationships (adjusted odds ratio [AOR]: 2.43) or early loving relationships (AOR: 3.59) than in one-night-stands. It was also associated with not discussing using condoms before sex (AOR: 6.50) and with reporting non-use of condoms at sexual debut (AOR: 4.75). CONCLUSIONS: Non-use of condoms is a common STI risk behaviour among heterosexually-active single people in Germany, so condom use needs promoting from sexual debut throughout the life course, regardless of relationship type and age, but particularly among middle-aged and older people.

    Separating the wheat from the chaff

    Performance-analysis tools are indispensable for understanding and optimizing the behavior of parallel programs running on increasingly powerful supercomputers. However, with the size and complexity of hardware and software on the rise, performance data sets are becoming so voluminous that their analysis poses serious challenges. In particular, the search space that must be traversed and the number of individual performance views that must be explored to identify phenomena of interest become too large. To mitigate this problem, we use visual analytics. Specifically, we accelerate the analysis of performance profiles by automatically identifying (1) relevant and (2) similar data subsets and their performance views. We focus on views of the virtual-process topology, showing that their relevance can be well captured with visual-quality metrics and that they can be further assigned to topical groups according to their visual features. A case study demonstrates that our approach helps reduce the search space by up to 80%.

    Magnostics: Image-based Search of Interesting Matrix Views for Guided Network Exploration

    In this work we address the problem of retrieving potentially interesting matrix views to support the exploration of networks. We introduce Matrix Diagnostics (or MAGNOSTICS), following in spirit related approaches for rating and ranking other visualization techniques, such as Scagnostics for scatter plots. Our approach ranks matrix views according to the appearance of specific visual patterns, such as blocks and lines, indicating the existence of topological motifs in the data, such as clusters, bi-graphs, or central nodes. MAGNOSTICS can be used to analyze, query, or search for visually similar matrices in large collections, or to assess the quality of matrix reordering algorithms. While many feature descriptors for image analysis exist, there is no evidence of how they perform for detecting patterns in matrices. In order to make an informed choice of feature descriptors for matrix diagnostics, we evaluate 30 feature descriptors (27 existing ones and three new descriptors that we designed specifically for MAGNOSTICS) with respect to four criteria: pattern response, pattern variability, pattern sensibility, and pattern discrimination. We conclude with an informed set of six descriptors as most appropriate for MAGNOSTICS and demonstrate their application in two scenarios: exploring a large collection of matrices and analyzing temporal networks.