1,130 research outputs found

    Recalibrating machine learning for social biases: demonstrating a new methodology through a case study classifying gender biases in archival documentation

    This thesis proposes a recalibration of Machine Learning for social biases to minimize harms from existing approaches and practices in the field. Prioritizing quality over quantity, accuracy over efficiency, representativeness over convenience, and situated thinking over universal thinking, the thesis demonstrates an alternative approach to creating Machine Learning models. Drawing on GLAM, the Humanities, the Social Sciences, and Design, the thesis focuses on understanding and communicating biases in a specific use case. 11,888 metadata descriptions from the University of Edinburgh Heritage Collections' Archives catalog were manually annotated for gender biases and text classification models were then trained on the resulting dataset of 55,260 annotations. Evaluations of the models' performance demonstrate that annotating gender biases can be automated; however, the subjectivity of bias as a concept complicates the generalizability of any one approach. The contributions are: (1) an interdisciplinary and participatory Bias-Aware Methodology, (2) a Taxonomy of Gendered and Gender Biased Language, (3) data annotated for gender biased language, (4) gender biased text classification models, and (5) a human-centered approach to model evaluation. The contributions have implications for Machine Learning, demonstrating how bias is inherent to all data and models; more specifically for Natural Language Processing, providing an annotation taxonomy, annotated datasets and classification models for analyzing gender biased language at scale; for the Gallery, Library, Archives, and Museum sector, offering guidance to institutions seeking to reconcile with histories of marginalizing communities through their documentation practices; and for historians, who utilize cultural heritage documentation to study and interpret the past. Through a real-world application of the Bias-Aware Methodology in a case study, the thesis illustrates the need to shift away from removing social biases and towards acknowledging them, creating data and models that surface the uncertainty and multiplicity characteristic of human societies.

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics, and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Sound Event Detection by Exploring Audio Sequence Modelling

    Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life, which include security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing a sound recognition system are which portion of a sound event the system should analyse, and what proportion of a sound event the system should process in order to claim a confident detection of that particular sound event. While the classification of sound events has improved considerably in recent years, the temporal segmentation of sound events is considered not to have improved to the same extent. The aim of this thesis is to propose and develop methods to improve the segmentation and classification of everyday sound events in SED models. In particular, this thesis explores the segmentation of sound events by investigating audio sequence encoding-based and audio sequence modelling-based methods, in an effort to improve the overall sound event detection performance. In the first phase of this thesis, efforts are put towards improving sound event detection by explicitly conditioning the audio sequence representations of an SED model using sound activity detection (SAD) and onset detection. To achieve this, we propose multi-task learning-based SED models in which SAD and onset detection are used as auxiliary tasks for the SED task. The next part of this thesis explores self-attention-based audio sequence modelling, which aggregates audio representations based on temporal relations within and between sound events, scored on the basis of the similarity of sound event portions in audio event sequences. We propose SED models that include memory-controlled, adaptive, dynamic, and source separation-induced self-attention variants, with the aim of improving overall sound recognition.
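    To make the notion of "sound event tags with onset and offset time values" concrete, the following is a minimal sketch of one common post-processing step, not the thesis's own method: thresholding frame-wise activity probabilities for a single event class into (onset, offset) segments. The function name `frames_to_events`, the threshold of 0.5, and the hop size are illustrative assumptions.

```python
from typing import List, Tuple


def frames_to_events(
    probs: List[float],
    threshold: float = 0.5,      # assumed decision threshold
    hop_seconds: float = 0.02,   # assumed time step between frames
) -> List[Tuple[float, float]]:
    """Turn frame-wise probabilities for one event class into
    (onset, offset) pairs in seconds by simple thresholding."""
    events = []
    onset = None  # index of the frame where the current event started
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and onset is None:
            onset = i  # event starts at this frame
        elif not active and onset is not None:
            events.append((onset * hop_seconds, i * hop_seconds))
            onset = None  # event ended just before this frame
    if onset is not None:
        # event still active at the end of the recording
        events.append((onset * hop_seconds, len(probs) * hop_seconds))
    return events


if __name__ == "__main__":
    print(frames_to_events([0.1, 0.9, 0.8, 0.2, 0.7], hop_seconds=1.0))
```

    Segmentation quality in this sketch depends entirely on how sharply the frame-wise probabilities rise and fall at event boundaries, which is the aspect the abstract says has lagged behind classification.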

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Linguistically inspired roadmap for building biologically reliable protein language models

    Deep neural-network-based language models (LMs) are increasingly applied to large-scale protein sequence data to predict protein function. However, being largely black-box models and thus challenging to interpret, current protein LM approaches do not contribute to a fundamental understanding of sequence-function mappings, hindering rule-based biotherapeutic drug development. We argue that guidance drawn from linguistics, a field specialized in analytical rule extraction from natural language data, can aid with building more interpretable protein LMs that are more likely to learn relevant domain-specific rules. Differences between protein sequence data and linguistic sequence data require the integration of more domain-specific knowledge in protein LMs compared to natural language LMs. Here, we provide a linguistics-based roadmap for protein LM pipeline choices with regard to training data, tokenization, token embedding, sequence embedding, and model interpretation. Incorporating linguistic ideas into protein LMs enables the development of next-generation interpretable machine-learning models with the potential of uncovering the biological mechanisms underlying sequence-function relationships. Comment: 27 pages, 4 figures

    Rethink Digital Health Innovation: Understanding Socio-Technical Interoperability as Guiding Concept

    This dissertation seeks a theoretical framework for developing complex digital health innovations in such a way that they have better prospects of actually reaching everyday care practice. Although there is no shortage of either demand or ideas for digital health innovations, the expected flood of solutions successfully established in practice has unfortunately failed to materialize. This insufficient diffusion success of a developed solution, often pathologized as "pilotitis", becomes particularly apparent when the planned innovation involves greater ambition and complexity. The seasoned critic will immediately think of heretical counter-questions: for example, what is meant by complex digital health innovations, and whether it is possible at all to find a universal formula that can guarantee the successful diffusion of digital health innovations. Both questions are not only justified but ultimately lead to two research streams, to which I explicitly devote myself in this dissertation. In a first block, I delineate those digital health innovations that currently receive particular attention in literature and practice because of their high potential to improve care and their resulting complexity. More precisely, I examine dominant objectives and the challenges that accompany them. Four objectives emerge from the work in this research stream: (1) supporting continuous, collaborative care processes across diverse care providers (also known as inter-organizational care pathways); (2) actively involving patients in their care processes (also known as patient empowerment or patient engagement); (3) strengthening cross-sector collaboration between research and care practice, up to and including learning health systems; and (4) establishing data-centric value creation for healthcare, driven by the increasing availability of valid data, new processing methods (keyword: artificial intelligence), and the numerous possible uses. The focus of this dissertation is therefore less on self-contained, clearly delimitable innovations (e.g., a symptom diary app for documenting complaints). Rather, this doctoral thesis addresses those innovation projects that pursue one or more of the above objectives, add a further technological puzzle piece to complex information system landscapes, and thus, in interplay with various other IT systems, contribute to improving healthcare and/or its organization. In examining these objectives and the associated challenges of system development, the problem of fragmented IT system landscapes in healthcare moved to the center. This refers to the unwelcome state in which different information and application systems cannot interact with one another as desired. The result is interruptions of information flows and care processes, which must otherwise be absorbed by error-prone additional effort (e.g., duplicate documentation). To counter these limitations on effectiveness and efficiency, precisely these IT system silos must be dismantled. All of the above objectives serve this defragmenting effect in that they seek to bring together (1) different care providers, (2) care teams and patients, (3) research and care, or (4) diverse data sources and modern analysis technologies. Yet this leads to a complex circularity. On the one hand, the digital health innovations addressed in this work seek ways to defragment information system landscapes. On the other hand, their limited success rate is rooted, among other things, in the very fragmentation they seek to resolve. This insight opens the second research stream of this work, which deals intensively with the property of "interoperability". It examines how this property should take on a central role for innovation projects in the digital health domain. Put simply, interoperability describes the ability of two or more systems to fulfill shared tasks together. It thus represents the core concern of the identified objectives and is the linchpin when a developed solution is to be integrated into a concrete target environment. From a technically dominated point of view, this is a matter of ensuring valid, performant, and secure communication scenarios so that the above-mentioned breaks in information flow between technical subsystems are eliminated. A purely technical understanding of interoperability, however, is not sufficient to cover the variety of diffusion barriers facing digital health innovations. For example, the lack of adequate reimbursement options within the legal framework, or a poor fit with the particular care process, are not purely technical problems. Rather, a basic stance of information systems research comes into play here, one that understands information systems, including those of healthcare, as socio-technical systems and always considers technology in connection with the people who use it, are affected by it, or organize it. If a digital health innovation that promises added value in line with the above objectives is to be integrated into an existing information system landscape of healthcare, it must be "interoperable" from both technical and non-technical points of view. The need for interoperability is indeed well known in research, politics, and practice, and positive movement of the domain towards more interoperability can be felt. However, a technical understanding dominates, and the potential of this property as a guiding principle for innovation management has so far remained largely untapped. This is exactly where the main contribution of this doctoral thesis comes in, by proposing a socio-technical conceptualization and contextualization of interoperability for future digital health innovations. Based on literature and expert input, a framework is developed, the Digital Health Innovation Interoperability Framework, which is intended in particular to support innovators and innovation funders in increasing the likelihood of diffusion into practice. Many insights and messages are associated with this framework, which I would like to summarize for this prologue as follows: 1. To align the development of digital health innovations as well as possible with successful integration into a particular target environment, the realization of a novel value proposition and the assurance of socio-technical interoperability are the two interrelated main tasks of an innovation process. 2. Ensuring interoperability is a management task that requires active responsibility and is influenced by project-specific conditions as well as by external and internal dynamics. 3. Socio-technical interoperability in the context of digital health innovations can be defined across seven interdependent levels: political and regulatory conditions; contractual conditions; care and business processes; use; information; applications; IT infrastructure. 4. To ensure interoperability at each of these levels, strategies must be defined in a differentiated manner, located on a continuum between compatibility requirements on the side of the innovation and the motivation of adaptations on the side of the target environment. 5. Striving for more interoperability promotes both the sustainable success of the individual digital health innovation and the defragmentation of existing information system landscapes, and thus contributes to improving healthcare. Admittedly, the last of these five messages carries the coloring of a conviction rather than being the result of scientific proof. Nevertheless, I regard this insight, personal as it may be, as a maxim of the domain to which I feel I belong: IT systems development in healthcare.

    Improving Prediction Performance and Model Interpretability through Attention Mechanisms from Basic and Applied Research Perspectives

    With the dramatic advances in deep learning technology, machine learning research is focusing on improving the interpretability of model predictions as well as prediction performance in both basic and applied research. While deep learning models have much higher prediction performance than conventional machine learning models, the specific prediction process is still difficult to interpret and/or explain. This is known as the black-boxing of machine learning models and is recognized as a particularly important problem in a wide range of research fields, including manufacturing, commerce, robotics, and other industries where the use of such technology has become commonplace, as well as the medical field, where mistakes are not tolerated. Focusing on natural language processing tasks, we consider interpretability as the presentation of the contribution of each input word to a prediction in a recurrent neural network. In interpreting predictions from deep learning models, much work has been done on visualizing importance, mainly based on attention weights and gradients for the inference results. However, it has become clear in recent years that attention mechanisms and gradient-based techniques have non-negligible problems. The first is that the attention weight learns which parts to focus on, but depending on the task or problem setting, its relationship with gradient-based importance may be strong or weak, so the two are not always strongly related. Furthermore, it is often unclear how to integrate both interpretations. From another perspective, there are several unclear aspects regarding the appropriate application of attention mechanisms to real-world problems with large datasets, as well as the properties and characteristics of their effects when applied. This dissertation discusses both basic and applied research on how attention mechanisms improve the performance and interpretability of machine learning models. From the basic research perspective, we proposed a new learning method that focuses on the vulnerability of the attention mechanism to perturbations, which contributes significantly to prediction performance and interpretability. Deep learning models are known to respond to small perturbations that humans cannot perceive and may exhibit unintended behaviors and predictions. Attention mechanisms used to interpret predictions are no exception. This is a very serious problem because current deep learning models rely heavily on this mechanism. We focused on training techniques using adversarial perturbations, i.e., perturbations designed to deceive the attention mechanism. We demonstrated that such an adversarial training technique makes the perturbation-sensitive attention mechanism robust and enables the presentation of highly interpretable predictive evidence. By further extending the proposed technique to semi-supervised learning, a general-purpose learning model with a more robust and interpretable attention mechanism was achieved. From the applied research perspective, we investigated how effective the deep learning models with attention mechanisms validated in the basic research are in real-world applications. Since deep learning models with attention mechanisms have mainly been evaluated using basic tasks in natural language processing and computer vision, their performance when used as core components of applications and services has often been unclear. We confirm the effectiveness of the proposed framework with an attention mechanism by focusing on real-world applications, particularly in the field of computational advertising, where the amount of data is large and the interpretation of predictions is necessary. The proposed frameworks are new attempts to support operations by predicting the nature of digital advertisements with high serving effectiveness, and their effectiveness has been confirmed using large-scale ad-serving data. In light of the above, the research summarized in this dissertation focuses on the attention mechanism, which has been the focus of much attention in recent years, and discusses its potential for both basic research in terms of improving prediction performance and interpretability, and applied research in terms of evaluating it for real-world applications using large data sets beyond the laboratory environment. The dissertation also concludes with a summary of the implications of these findings for subsequent research and future prospects in the field. Doctor of Engineering, Hosei University

    Defining Safe Training Datasets for Machine Learning Models Using Ontologies

    Machine Learning (ML) models have been gaining popularity in recent years in a wide variety of domains, including safety-critical domains. While ML models have shown high accuracy in their predictions, they are still considered black boxes, meaning that developers and users do not know how the models make their decisions. While this is simply a nuisance in some domains, in safety-critical domains, this makes ML models difficult to trust. To fully utilize ML models in safety-critical domains, there needs to be a method to improve trust in their safety and accuracy without human experts checking each decision. This research proposes a method to increase trust in ML models used in safety-critical domains by ensuring the safety and completeness of the model’s training dataset. Since most of the complexity of the model is built through training, ensuring the safety of the training dataset could help to increase the trust in the safety of the model. The method proposed in this research uses a domain ontology and an image quality characteristic ontology to validate the domain completeness and image quality robustness of a training dataset. This research also presents an experiment as a proof of concept for this method where ontologies are built for the emergency road vehicle domain.
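    As a rough illustration of what validating a training dataset against two ontologies could look like, the sketch below flattens each ontology to a plain list of leaf terms and reports which (domain class, image quality condition) pairs have no training sample. This is a simplifying assumption and not the method from the abstract; the function `missing_combinations`, the sample keys `label` and `condition`, and the example terms are all hypothetical.

```python
from itertools import product
from typing import Dict, Iterable, List, Tuple


def missing_combinations(
    domain_classes: Iterable[str],      # assumed: leaf terms of a domain ontology
    quality_conditions: Iterable[str],  # assumed: leaf terms of an image quality ontology
    samples: Iterable[Dict[str, str]],  # assumed: one dict per annotated training image
) -> List[Tuple[str, str]]:
    """Return every (class, condition) pair that no sample covers."""
    required = set(product(domain_classes, quality_conditions))
    covered = {(s["label"], s["condition"]) for s in samples}
    return sorted(required - covered)


if __name__ == "__main__":
    gaps = missing_combinations(
        ["ambulance", "fire_truck"],
        ["clear", "blurred"],
        [
            {"label": "ambulance", "condition": "clear"},
            {"label": "ambulance", "condition": "blurred"},
            {"label": "fire_truck", "condition": "clear"},
        ],
    )
    print(gaps)
```

    An empty result would suggest the dataset touches every required pair at least once; it says nothing about whether each pair has enough samples, which a fuller check would also need to consider.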

    AI: Limits and Prospects of Artificial Intelligence

    The emergence of artificial intelligence has triggered enthusiasm and promise of boundless opportunities as much as uncertainty about its limits. The contributions to this volume explore the limits of AI, describe the necessary conditions for its functionality, reveal its attendant technical and social problems, and present some existing and potential solutions. At the same time, the contributors highlight the societal and attendant economic hopes and fears, utopias and dystopias that are associated with the current and future development of artificial intelligence.