453 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Asymptotics of stochastic learning in structured networks

    Get PDF

    Asymptotics of stochastic learning in structured networks

    Get PDF

    Conditional Invertible Generative Models for Supervised Problems

    Get PDF
    Invertible neural networks (INNs), in the setting of normalizing flows, are a type of unconditional generative likelihood model. Despite various attractive properties compared to other common generative model types, they are rarely useful for supervised tasks or real applications due to their unguided outputs. In this work, we therefore present three new methods that extend the standard INN setting, falling under a broader category we term generative invertible models. These new methods allow leveraging the theoretical and practical benefits of INNs to solve supervised problems in new ways, including real-world applications from different branches of science. The key finding is that our approaches enhance many aspects of trustworthiness in comparison to conventional feed-forward networks, such as uncertainty estimation and quantification, explainability, and proper handling of outlier data

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    Deep Neural Networks and Tabular Data: Inference, Generation, and Explainability

    Get PDF
    Over the last decade, deep neural networks have enabled remarkable technological advancements, potentially transforming a wide range of aspects of our lives in the future. It is becoming increasingly common for deep-learning models to be used in a variety of situations in the modern life, ranging from search and recommendations to financial and healthcare solutions, and the number of applications utilizing deep neural networks is still on the rise. However, a lot of recent research efforts in deep learning have focused primarily on neural networks and domains in which they excel. This includes computer vision, audio processing, and natural language processing. It is a general tendency for data in these areas to be homogeneous, whereas heterogeneous tabular datasets have received relatively scant attention despite the fact that they are extremely prevalent. In fact, more than half of the datasets on the Google dataset platform are structured and can be represented in a tabular form. The first aim of this study is to provide a thoughtful and comprehensive analysis of deep neural networks' application to modeling and generating tabular data. Apart from that, an open-source performance benchmark on tabular data is presented, where we thoroughly compare over twenty machine and deep learning models on heterogeneous tabular datasets. The second contribution relates to synthetic tabular data generation. Inspired by their success in other homogeneous data modalities, deep generative models such as variational autoencoders and generative adversarial networks are also commonly applied for tabular data generation. However, the use of Transformer-based large language models (which are also generative) for tabular data generation have been received scant research attention. Our contribution to this literature consists of the development of a novel method for generating tabular data based on this family of autoregressive generative models that, on multiple challenging benchmarks, outperformed the current state-of-the-art methods for tabular data generation. Another crucial aspect for a deep-learning data system is that it needs to be reliable and trustworthy to gain broader acceptance in practice, especially in life-critical fields. One of the possible ways to bring trust into a data-driven system is to use explainable machine-learning methods. In spite of this, the current explanation methods often fail to provide robust explanations due to their high sensitivity to the hyperparameter selection or even changes of the random seed. Furthermore, most of these methods are based on feature-wise importance, ignoring the crucial relationship between variables in a sample. The third aim of this work is to address both of these issues by offering more robust and stable explanations, as well as taking into account the relationships between variables using a graph structure. In summary, this thesis made a significant contribution that touched many areas related to deep neural networks and heterogeneous tabular data as well as the usage of explainable machine learning methods

    The text classification pipeline: Starting shallow, going deeper

    Get PDF
    An increasingly relevant and crucial subfield of Natural Language Processing (NLP), tackled in this PhD thesis from a computer science and engineering perspective, is the Text Classification (TC). Also in this field, the exceptional success of deep learning has sparked a boom over the past ten years. Text retrieval and categorization, information extraction and summarization all rely heavily on TC. The literature has presented numerous datasets, models, and evaluation criteria. Even if languages as Arabic, Chinese, Hindi and others are employed in several works, from a computer science perspective the most used and referred language in the literature concerning TC is English. This is also the language mainly referenced in the rest of this PhD thesis. Even if numerous machine learning techniques have shown outstanding results, the classifier effectiveness depends on the capability to comprehend intricate relations and non-linear correlations in texts. In order to achieve this level of understanding, it is necessary to pay attention not only to the architecture of a model but also to other stages of the TC pipeline. In an NLP framework, a range of text representation techniques and model designs have emerged, including the large language models. These models are capable of turning massive amounts of text into useful vector representations that effectively capture semantically significant information. The fact that this field has been investigated by numerous communities, including data mining, linguistics, and information retrieval, is an aspect of crucial interest. These communities frequently have some overlap, but are mostly separate and do their research on their own. Bringing researchers from other groups together to improve the multidisciplinary comprehension of this field is one of the objectives of this dissertation. Additionally, this dissertation makes an effort to examine text mining from both a traditional and modern perspective. This thesis covers the whole TC pipeline in detail. However, the main contribution is to investigate the impact of every element in the TC pipeline to evaluate the impact on the final performance of a TC model. It is discussed the TC pipeline, including the traditional and the most recent deep learning-based models. This pipeline consists of State-Of-The-Art (SOTA) datasets used in the literature as benchmark, text preprocessing, text representation, machine learning models for TC, evaluation metrics and current SOTA results. In each chapter of this dissertation, I go over each of these steps, covering both the technical advancements and my most significant and recent findings while performing experiments and introducing novel models. The advantages and disadvantages of various options are also listed, along with a thorough comparison of the various approaches. At the end of each chapter, there are my contributions with experimental evaluations and discussions on the results that I have obtained during my three years PhD course. The experiments and the analysis related to each chapter (i.e., each element of the TC pipeline) are the main contributions that I provide, extending the basic knowledge of a regular survey on the matter of TC.An increasingly relevant and crucial subfield of Natural Language Processing (NLP), tackled in this PhD thesis from a computer science and engineering perspective, is the Text Classification (TC). Also in this field, the exceptional success of deep learning has sparked a boom over the past ten years. Text retrieval and categorization, information extraction and summarization all rely heavily on TC. The literature has presented numerous datasets, models, and evaluation criteria. Even if languages as Arabic, Chinese, Hindi and others are employed in several works, from a computer science perspective the most used and referred language in the literature concerning TC is English. This is also the language mainly referenced in the rest of this PhD thesis. Even if numerous machine learning techniques have shown outstanding results, the classifier effectiveness depends on the capability to comprehend intricate relations and non-linear correlations in texts. In order to achieve this level of understanding, it is necessary to pay attention not only to the architecture of a model but also to other stages of the TC pipeline. In an NLP framework, a range of text representation techniques and model designs have emerged, including the large language models. These models are capable of turning massive amounts of text into useful vector representations that effectively capture semantically significant information. The fact that this field has been investigated by numerous communities, including data mining, linguistics, and information retrieval, is an aspect of crucial interest. These communities frequently have some overlap, but are mostly separate and do their research on their own. Bringing researchers from other groups together to improve the multidisciplinary comprehension of this field is one of the objectives of this dissertation. Additionally, this dissertation makes an effort to examine text mining from both a traditional and modern perspective. This thesis covers the whole TC pipeline in detail. However, the main contribution is to investigate the impact of every element in the TC pipeline to evaluate the impact on the final performance of a TC model. It is discussed the TC pipeline, including the traditional and the most recent deep learning-based models. This pipeline consists of State-Of-The-Art (SOTA) datasets used in the literature as benchmark, text preprocessing, text representation, machine learning models for TC, evaluation metrics and current SOTA results. In each chapter of this dissertation, I go over each of these steps, covering both the technical advancements and my most significant and recent findings while performing experiments and introducing novel models. The advantages and disadvantages of various options are also listed, along with a thorough comparison of the various approaches. At the end of each chapter, there are my contributions with experimental evaluations and discussions on the results that I have obtained during my three years PhD course. The experiments and the analysis related to each chapter (i.e., each element of the TC pipeline) are the main contributions that I provide, extending the basic knowledge of a regular survey on the matter of TC

    Specificity of the innate immune responses to different classes of non-tuberculous mycobacteria

    Get PDF
    Mycobacterium avium is the most common nontuberculous mycobacterium (NTM) species causing infectious disease. Here, we characterized a M. avium infection model in zebrafish larvae, and compared it to M. marinum infection, a model of tuberculosis. M. avium bacteria are efficiently phagocytosed and frequently induce granuloma-like structures in zebrafish larvae. Although macrophages can respond to both mycobacterial infections, their migration speed is faster in infections caused by M. marinum. Tlr2 is conservatively involved in most aspects of the defense against both mycobacterial infections. However, Tlr2 has a function in the migration speed of macrophages and neutrophils to infection sites with M. marinum that is not observed with M. avium. Using RNAseq analysis, we found a distinct transcriptome response in cytokine-cytokine receptor interaction for M. avium and M. marinum infection. In addition, we found differences in gene expression in metabolic pathways, phagosome formation, matrix remodeling, and apoptosis in response to these mycobacterial infections. In conclusion, we characterized a new M. avium infection model in zebrafish that can be further used in studying pathological mechanisms for NTM-caused diseases

    Interdisciplinarity in the Age of the Triple Helix: a Film Practitioner's Perspective

    Get PDF
    This integrative chapter contextualises my research including articles I have published as well as one of the creative artefacts developed from it, the feature film The Knife That Killed Me. I review my work considering the ways in which technology, industry methods and academic practice have evolved as well as how attitudes to interdisciplinarity have changed, linking these to Etzkowitz and Leydesdorff’s ‘Triple Helix’ model (1995). I explore my own experiences and observations of opportunities and challenges that have been posed by the intersection of different stakeholder needs and expectations, both from industry and academic perspectives, and argue that my work provides novel examples of the applicability of the ‘Triple Helix’ to the creative industries. The chapter concludes with a reflection on the evolution and direction of my work, the relevance of the ‘Triple Helix’ to creative practice, and ways in which this relationship could be investigated further

    Brain Computations and Connectivity [2nd edition]

    Get PDF
    This is an open access title available under the terms of a CC BY-NC-ND 4.0 International licence. It is free to read on the Oxford Academic platform and offered as a free PDF download from OUP and selected open access locations. Brain Computations and Connectivity is about how the brain works. In order to understand this, it is essential to know what is computed by different brain systems; and how the computations are performed. The aim of this book is to elucidate what is computed in different brain systems; and to describe current biologically plausible computational approaches and models of how each of these brain systems computes. Understanding the brain in this way has enormous potential for understanding ourselves better in health and in disease. Potential applications of this understanding are to the treatment of the brain in disease; and to artificial intelligence which will benefit from knowledge of how the brain performs many of its extraordinarily impressive functions. This book is pioneering in taking this approach to brain function: to consider what is computed by many of our brain systems; and how it is computed, and updates by much new evidence including the connectivity of the human brain the earlier book: Rolls (2021) Brain Computations: What and How, Oxford University Press. Brain Computations and Connectivity will be of interest to all scientists interested in brain function and how the brain works, whether they are from neuroscience, or from medical sciences including neurology and psychiatry, or from the area of computational science including machine learning and artificial intelligence, or from areas such as theoretical physics
    • …
    corecore