149 research outputs found

    Improved neural machine translation systems for low resource correction tasks

    Get PDF
    Recent advances in Neural Machine Translation (NMT) systems have achieved impressive results on language translation tasks. However, the success of these systems has been limited when applied to similar low-resource tasks, such as language correction. In these cases, datasets are often small whilst still containing long sequences, leading to significant overfitting and poor generalization. In this thesis we study issues preventing widespread adoption of NMT systems in low-resource tasks, with a special focus on sequence correction for both code and language. We propose two novel techniques for handling these low-resource tasks. The first uses Generative Adversarial Networks to handle datasets without paired data. This technique allows the use of available unpaired datasets, which are typically much larger than paired datasets since they do not require manual annotation. We first develop a methodology for generating discrete sequences using a Wasserstein Generative Adversarial Network, and then use this methodology to train an NMT system on unpaired data. Our second technique converts sequences into a tree-structured representation and performs translation from tree to tree. This improves the handling of very long sequences, since it reduces the distance between nodes in the network, and allows the network to exploit information contained in the tree structure to reduce overfitting.
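    The abstract gives no implementation details, so the following is only a minimal sketch of the core trick behind the first technique, under the common assumption that discrete tokens are made differentiable for the Wasserstein critic via a Gumbel-softmax relaxation (PyTorch; all sizes and architectures below are hypothetical, not the thesis's actual models):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, SEQ_LEN, NOISE = 100, 20, 64  # hypothetical sizes

    class Generator(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(NOISE, 256), nn.ReLU(),
                nn.Linear(256, SEQ_LEN * VOCAB),
            )
        def forward(self, z, tau=1.0):
            logits = self.net(z).view(-1, SEQ_LEN, VOCAB)
            # Gumbel-softmax: a differentiable relaxation of sampling
            # discrete tokens, so critic gradients can reach the generator.
            return F.gumbel_softmax(logits, tau=tau, hard=False)

    class Critic(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(SEQ_LEN * VOCAB, 256), nn.ReLU(),
                nn.Linear(256, 1),
            )
        def forward(self, x):  # x: (batch, SEQ_LEN, VOCAB) soft one-hot
            return self.net(x.flatten(1))

    G, D = Generator(), Critic()
    z = torch.randn(8, NOISE)
    real = F.one_hot(torch.randint(0, VOCAB, (8, SEQ_LEN)), VOCAB).float()
    fake = G(z)
    # Wasserstein objective for the critic: maximize D(real) - D(fake).
    # A full WGAN would also add a gradient penalty or weight clipping
    # to enforce the critic's Lipschitz constraint.
    w_loss = D(real).mean() - D(fake).mean()
    print(w_loss.item())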

    Ultra high-density hybrid pixel sensors for the detection of charged particles

    Get PDF
    The abstract is in the attachment.

    Bias and Fairness in Large Language Models: A Survey

    Full text link
    Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several desiderata to operationalize fairness for LLMs. We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation, namely metrics and datasets, and one for mitigation. Our first taxonomy, of metrics for bias evaluation, disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the level at which they operate in a model: embeddings, probabilities, and generated text. Our second taxonomy, of datasets for bias evaluation, categorizes datasets by their structure as counterfactual inputs or prompts, and identifies the targeted harms and social groups; we also release a consolidation of publicly available datasets for improved access. Our third taxonomy, of techniques for bias mitigation, classifies methods by their stage of intervention: pre-processing, in-training, intra-processing, and post-processing, with granular subcategories that elucidate research trends. Finally, we identify open problems and challenges for future work. Synthesizing a wide range of recent research, we aim to provide a clear guide to the existing literature that empowers researchers and practitioners to better understand and prevent the propagation of bias in LLMs.
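    To make the metrics taxonomy concrete, here is a minimal sketch of one probability-level evaluation: scoring a counterfactual sentence pair with a masked language model's pseudo-log-likelihood, where a large gap suggests the model prefers one social group's phrasing. The model, sentences, and scoring details are illustrative assumptions, not a metric prescribed by the survey:

    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def pseudo_log_likelihood(sentence: str) -> float:
        """Sum of log-probs of each token when it alone is masked."""
        ids = tok(sentence, return_tensors="pt")["input_ids"][0]
        total = 0.0
        for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
            masked = ids.clone()
            masked[i] = tok.mask_token_id
            with torch.no_grad():
                logits = model(masked.unsqueeze(0)).logits[0, i]
            total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
        return total

    # A counterfactual pair differing only in the social-group term:
    s1 = "The doctor said he would be late."
    s2 = "The doctor said she would be late."
    print(pseudo_log_likelihood(s1) - pseudo_log_likelihood(s2))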

    Towards Debugging and Testing Deep Learning Systems

    Get PDF
    Over the past few years, Deep Learning (DL) has made tremendous progress, achieving or surpassing human-level performance on tasks such as image classification and speech recognition. Thanks to these advances, we are witnessing wide adoption of DL in safety-critical applications such as autonomous driving, crime prevention and detection, and medical treatment. However, despite their spectacular progress, DL systems, just like traditional software systems, often exhibit erroneous corner-case behaviors due to latent defects or inefficiencies, which can lead to catastrophic accidents. Thus, software quality assurance (SQA), including reliability and robustness, becomes a major concern for DL systems. Traditional testing for DL models consists of measuring their performance on manually collected data; it therefore depends heavily on the quality of the test data, which often fails to include rare inputs, as evidenced by recent autonomous-driving accidents (e.g., Tesla/Uber). Advanced testing techniques are in high demand to improve the trustworthiness of DL systems. Nevertheless, DL testing poses significant challenges stemming from the non-deterministic nature of DL systems (they follow a data-driven paradigm in which the target task is learned statistically) and from their lack of an oracle (they are designed principally to provide the answer). Recently, software engineering researchers have started adapting concepts from the software testing domain, such as test coverage and pseudo-oracles, to tackle these difficulties. Despite some promising results obtained from adapting existing software testing methods, current testing techniques for DL systems are still quite immature. In this thesis, we examine existing testing techniques for DL systems and propose some new ones. We achieve this by following a systematic approach consisting of: (1) investigating DL software issues and testing challenges; (2) outlining the strengths and weaknesses of software testing techniques adapted for DL systems; and (3) proposing novel testing solutions that fill some of the gaps identified in the literature and can potentially help improve the SQA of DL systems.
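    As an illustration of the pseudo-oracle idea discussed above, the following is a minimal metamorphic-testing sketch: it asserts that a classifier's prediction should be invariant under a small brightness change, so a disagreement flags suspicious behavior without needing ground-truth labels. The classifier here is a trivial placeholder, not a model or technique from the thesis:

    import numpy as np

    def classify(image: np.ndarray) -> int:
        """Stand-in for a trained DL model; returns a class index."""
        return int(image.mean() > 0.5)  # deliberately simplistic placeholder

    def metamorphic_brightness_test(image: np.ndarray, delta: float = 0.05) -> bool:
        # Metamorphic relation: slightly brighter input, same predicted class.
        original = classify(image)
        perturbed = classify(np.clip(image + delta, 0.0, 1.0))
        # Agreement acts as a pseudo-oracle: no labeled expected output needed.
        return original == perturbed

    rng = np.random.default_rng(0)
    failures = sum(
        not metamorphic_brightness_test(rng.random((32, 32))) for _ in range(100)
    )
    print(f"{failures} metamorphic violations out of 100 inputs")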

    Deep Representation Learning with Limited Data for Biomedical Image Synthesis, Segmentation, and Detection

    Get PDF
    Biomedical imaging requires accurate expert annotation and interpretation to aid medical staff and clinicians in automating differential diagnosis and addressing underlying health conditions. With the advent of deep learning, training with large image datasets has become the standard route to expert-level performance on non-invasive biomedical imaging tasks. However, in the absence of large publicly available datasets, training a deep learning model to learn intrinsic representations becomes harder. Representation learning with limited data has introduced new learning techniques, such as Generative Adversarial Networks, semi-supervised learning, and self-supervised learning, that can be applied to various biomedical applications. For example, ophthalmologists use color funduscopy (CF) and fluorescein angiography (FA) to diagnose retinal degenerative diseases. However, fluorescein angiography requires injecting a dye, which can cause adverse reactions in patients. To alleviate this, a non-invasive technique is needed that can synthesize fluorescein angiography from fundus images. Similarly, color funduscopy and optical coherence tomography (OCT) are used to semantically segment the vasculature and fluid build-up in spatial and volumetric retinal imaging, which can help with the future prognosis of diseases. Although many automated techniques have been proposed for medical image segmentation, their main drawback is the model's precision in pixel-wise predictions. Another critical challenge in the biomedical imaging field is accurately segmenting and quantifying the dynamic behavior of calcium signals in cells. Calcium imaging is a widely utilized approach to studying subcellular calcium activity and cell function; however, large datasets have created a profound need for fast, accurate, and standardized analyses of calcium signals. For example, image sequences of calcium signals in colonic pacemaker cells, the interstitial cells of Cajal (ICC), suffer from motion artifacts and high periodic and sensor noise, making it difficult to accurately segment and quantify calcium signal events. Moreover, it is time-consuming and tedious to annotate such a large volume of calcium image stacks or videos and extract the associated spatiotemporal maps. To address these problems, we propose various deep representation learning architectures that utilize limited labels and annotations to tackle the critical challenges in these biomedical applications. To this end, we detail our proposed semi-supervised, generative adversarial, and transformer-based architectures for individual learning tasks such as retinal image-to-image translation, vessel and fluid segmentation from fundus and OCT images, breast micro-mass segmentation, and sub-cellular calcium event tracking from videos with spatiotemporal map quantification. We also illustrate two multi-modal multi-task learning frameworks whose applications can be extended to other biomedical domains. The main idea is to incorporate each of these as individual modules into our proposed multi-modal frameworks to solve the existing challenges with (1) fluorescein angiography synthesis, (2) retinal vessel and fluid segmentation, (3) breast micro-mass segmentation, and (4) dynamic quantification of calcium imaging datasets.
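    As one concrete illustration of the synthesis task, here is a minimal pix2pix-style training step for paired fundus-to-angiography translation: an adversarial term plus an L1 reconstruction term (the weight of 100 follows the original pix2pix paper). This is PyTorch, and the toy networks, shapes, and data are assumptions for illustration, not the architectures proposed in the thesis:

    import torch
    import torch.nn as nn

    G = nn.Sequential(  # toy generator: fundus (3 ch) -> angiogram (1 ch)
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 3, padding=1), nn.Tanh(),
    )
    D = nn.Sequential(  # toy discriminator over (fundus, angiogram) pairs
        nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 3, padding=1),
    )
    bce = nn.BCEWithLogitsLoss()

    fundus = torch.randn(2, 3, 64, 64)   # stand-in batch of fundus images
    real_fa = torch.randn(2, 1, 64, 64)  # stand-in paired angiograms

    fake_fa = G(fundus)
    pred = D(torch.cat([fundus, fake_fa], dim=1))
    # Generator loss: fool D on the conditioned pair, and stay close to
    # the paired ground truth via an L1 penalty.
    g_loss = bce(pred, torch.ones_like(pred)) + 100.0 * (fake_fa - real_fa).abs().mean()
    g_loss.backward()
    print(g_loss.item())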

    Cybersecurity: Past, Present and Future

    Full text link
    The digital transformation has created a new digital space known as cyberspace. This new cyberspace has improved the workings of businesses, organizations, and governments, society as a whole, and the day-to-day life of individuals. With these improvements come new challenges, and one of the main challenges is security. The security of this new cyberspace is called cybersecurity. Cyberspace has created new technologies and environments such as cloud computing, smart devices, the IoT, and several others. To keep pace with these advancements in cyber technologies, there is a need to expand research and develop new cybersecurity methods and tools to secure these domains and environments. This book is an effort to introduce the reader to the field of cybersecurity, highlight current issues and challenges, and provide future directions to mitigate or resolve them. The main specializations of cybersecurity covered in this book are software security, hardware security, the evolution of malware, biometrics, cyber intelligence, and cyber forensics. We must learn from the past, evolve our present, and improve the future. Based on this objective, the book covers the past, present, and future of these main specializations of cybersecurity. The book also examines upcoming areas of research in cyber intelligence, such as hybrid augmented and explainable artificial intelligence (AI). Human-AI collaboration can significantly increase the performance of a cybersecurity system. Interpreting and explaining machine learning models, i.e., explainable AI, is an emerging field of study and has great potential to improve the role of AI in cybersecurity.
    Comment: Author's copy of the book published under ISBN: 978-620-4-74421-