Improved neural machine translation systems for low resource correction tasks
Recent advances in Neural Machine Translation (NMT) systems have achieved impressive results on language translation tasks. However, the success of these systems has been limited when applied to similar low-resource tasks, such as language correction. In these cases, datasets are often small whilst still containing long sequences, leading to significant overfitting and poor generalization. In this thesis we study issues preventing widespread adoption of NMT systems in low-resource tasks, with a special focus on sequence correction for both code and language. We propose two novel techniques for handling these low-resource tasks. The first uses Generative Adversarial Networks to handle datasets without paired data. This technique allows the use of available unpaired datasets, which are typically much larger than paired datasets since they do not require manual annotation. We first develop a methodology for generation of discrete sequences using a Wasserstein Generative Adversarial Network, and then use this methodology to train an NMT system on unpaired data. Our second technique converts sequences into a tree-structured representation and performs tree-to-tree translation. This improves the handling of very long sequences since it reduces the distance between nodes in the network, and allows the network to take advantage of information contained in the tree structure to reduce overfitting.
Ultra high-density hybrid pixel sensors for the detection of charge particles
The abstract is in the attachment.
Bias and Fairness in Large Language Models: A Survey
Rapid advancements of large language models (LLMs) have enabled the
processing, understanding, and generation of human-like text, with increasing
integration into systems that touch our social sphere. Despite this success,
these models can learn, perpetuate, and amplify harmful social biases. In this
paper, we present a comprehensive survey of bias evaluation and mitigation
techniques for LLMs. We first consolidate, formalize, and expand notions of
social bias and fairness in natural language processing, defining distinct
facets of harm and introducing several desiderata to operationalize fairness
for LLMs. We then unify the literature by proposing three intuitive taxonomies,
two for bias evaluation, namely metrics and datasets, and one for mitigation.
Our first taxonomy of metrics for bias evaluation disambiguates the
relationship between metrics and evaluation datasets, and organizes metrics by
the different levels at which they operate in a model: embeddings,
probabilities, and generated text. Our second taxonomy of datasets for bias
evaluation categorizes datasets by their structure as counterfactual inputs or
prompts, and identifies the targeted harms and social groups; we also release a
consolidation of publicly-available datasets for improved access. Our third
taxonomy of techniques for bias mitigation classifies methods by their
intervention during pre-processing, in-training, intra-processing, and
post-processing, with granular subcategories that elucidate research trends.
Finally, we identify open problems and challenges for future work. Synthesizing
a wide range of recent research, we aim to provide a clear guide of the
existing literature that empowers researchers and practitioners to better
understand and prevent the propagation of bias in LLMs.
Towards Debugging and Testing Deep Learning Systems
Over the past few years, Deep Learning (DL) has made tremendous progress, achieving or surpassing human-level performance on tasks such as image classification and speech recognition. Thanks to these advances, we are witnessing wide adoption of DL in safety-critical applications such as autonomous driving, crime prevention and detection, and medical treatment. However, despite their spectacular progress, DL systems, just like traditional software systems, often exhibit erroneous corner-case behaviors due to latent defects or inefficiencies, which can lead to catastrophic accidents. Thus, software quality assurance (SQA), including reliability and robustness, for DL systems has become a major concern. Traditional testing of DL models consists of measuring their performance on manually collected data, so it depends heavily on the quality of the test data, which often fails to include rare inputs, as evidenced by recent autonomous-driving car accidents (e.g., Tesla/Uber). Advanced testing techniques are in high demand to improve the trustworthiness of DL systems. Nevertheless, DL testing poses significant challenges stemming from the non-deterministic nature of DL systems (they follow a data-driven paradigm in which the target task is learned statistically) and their lack of an oracle (they are designed principally to provide the answer). Recently, software engineering researchers have started adapting concepts from the software testing domain, such as test coverage and pseudo-oracles, to tackle these difficulties. Despite some promising results obtained from adapting existing software testing methods, current software testing techniques for DL systems are still quite immature. In this thesis, we examine existing testing techniques for DL systems and propose some new techniques. We achieve this by following a systematic approach consisting of: (1) investigating DL software issues and testing challenges; (2) outlining the strengths and weaknesses of the software-based testing techniques adapted for DL systems; and (3) proposing novel testing solutions to fill some of the identified literature gaps and potentially help improve the SQA of DL systems.
Deep Representation Learning with Limited Data for Biomedical Image Synthesis, Segmentation, and Detection
Biomedical imaging requires accurate expert annotation and interpretation that can aid medical staff and clinicians in automating differential diagnosis and identifying underlying health conditions. Since the advent of deep learning, training on large image datasets has become the standard way to reach expert-level performance in non-invasive biomedical imaging tasks. However, with the need for large publicly available datasets, training a deep learning model to learn intrinsic representations becomes harder. Representation learning with limited data has introduced new learning techniques, such as Generative Adversarial Networks, Semi-supervised Learning, and Self-supervised Learning, that can be applied to various biomedical applications. For example, ophthalmologists use color funduscopy (CF) and fluorescein angiography (FA) to diagnose retinal degenerative diseases. However, fluorescein angiography requires injecting a dye, which can cause adverse reactions in patients. To alleviate this, a non-invasive technique needs to be developed that can synthesize fluorescein angiography images from fundus images. Similarly, color funduscopy and optical coherence tomography (OCT) are also utilized to semantically segment the vasculature and fluid build-up in spatial and volumetric retinal imaging, which can help with the future prognosis of diseases. Although many automated techniques have been proposed for medical image segmentation, their main drawback is limited precision in pixel-wise predictions. Another critical challenge in the biomedical imaging field is accurately segmenting and quantifying dynamic behaviors of calcium signals in cells. Calcium imaging is a widely utilized approach to studying subcellular calcium activity and cell function; however, large datasets have created a profound need for fast, accurate, and standardized analyses of calcium signals. For example, image sequences of calcium signals in colonic pacemaker cells, the interstitial cells of Cajal (ICC), suffer from motion artifacts and high periodic and sensor noise, making it difficult to accurately segment and quantify calcium signal events. Moreover, it is time-consuming and tedious to annotate such a large volume of calcium image stacks or videos and extract their associated spatiotemporal maps. To address these problems, we propose various deep representation learning architectures that utilize limited labels and annotations. To this end, we detail our proposed semi-supervised, generative adversarial network, and transformer-based architectures for individual learning tasks such as retinal image-to-image translation, vessel and fluid segmentation from fundus and OCT images, breast micro-mass segmentation, and sub-cellular calcium event tracking from videos and spatiotemporal map quantification. We also illustrate two multi-modal multi-task learning frameworks with applications that can be extended to other biomedical domains. The main idea is to incorporate each of these as individual modules into our proposed multi-modal frameworks to solve the existing challenges with 1) fluorescein angiography synthesis, 2) retinal vessel and fluid segmentation, 3) breast micro-mass segmentation, and 4) dynamic quantification of calcium imaging datasets.
Cybersecurity: Past, Present and Future
The digital transformation has created a new digital space known as
cyberspace. This new cyberspace has improved the workings of businesses,
organizations, governments, society as a whole, and the day-to-day life of
individuals. With these improvements come new challenges, and one of the main
challenges is security. The security of the new cyberspace is called
cybersecurity. Cyberspace has created new technologies and environments such as
cloud computing, smart devices, the IoT, and several others. To keep pace with
these advancements in cyber technologies there is a need to expand research and
develop new cybersecurity methods and tools to secure these domains and
environments. This book is an effort to introduce the reader to the field of
cybersecurity, highlight current issues and challenges, and provide future
directions to mitigate or resolve them. The main specializations of
cybersecurity covered in this book are software security, hardware security,
the evolution of malware, biometrics, cyber intelligence, and cyber forensics.
We must learn from the past, evolve our present and improve the future. Based
on this objective, the book covers the past, present, and future of these main
specializations of cybersecurity. The book also examines the upcoming areas of
research in cyber intelligence, such as hybrid augmented and explainable
artificial intelligence (AI). Human and AI collaboration can significantly
increase the performance of a cybersecurity system. Interpreting and explaining
machine learning models, i.e., explainable AI, is an emerging field of study and
has great potential to improve the role of AI in cybersecurity.
Comment: Author's copy of the book published under ISBN: 978-620-4-74421-
- …