5,067 research outputs found

    Undergraduate Catalog of Studies, 2023-2024


    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that rely solely on raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thereafter is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performance of self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with the downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
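
    To make the transfer benchmark concrete, the sketch below illustrates a standard linear-probe evaluation protocol, in which a pre-trained backbone is frozen and only a linear classifier is trained on the downstream task. The specific backbone (a torchvision ResNet-50 with supervised weights as a stand-in for a self-supervised checkpoint), the downstream dataset (CIFAR-10), and the hyperparameters are illustrative assumptions, not the thesis's actual benchmark configuration.

    # Minimal linear-probe sketch: freeze a pre-trained backbone and train only a
    # linear classifier on a downstream dataset. Model and dataset choices are
    # illustrative, not the thesis's benchmark setup.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Frozen backbone exposing 2048-d features.
    backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    feat_dim = backbone.fc.in_features
    backbone.fc = nn.Identity()
    backbone.requires_grad_(False)
    backbone.eval().to(device)

    # Downstream task: CIFAR-10 as an illustrative classification target.
    tfm = transforms.Compose([
        transforms.Resize(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    train_set = datasets.CIFAR10("data", train=True, download=True, transform=tfm)
    loader = DataLoader(train_set, batch_size=256, shuffle=True, num_workers=4)

    # Linear probe: the only trainable parameters.
    probe = nn.Linear(feat_dim, 10).to(device)
    opt = torch.optim.SGD(probe.parameters(), lr=0.1, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    for epoch in range(10):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():
                feats = backbone(images)      # frozen features
            loss = criterion(probe(feats), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()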

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Sound Event Detection by Exploring Audio Sequence Modelling

    Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life, including security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing a sound recognition system are which portion of a sound event the system should analyse, and what proportion of a sound event the system should process in order to claim a confident detection of that particular sound event. While the classification of sound events has improved considerably in recent years, the temporal segmentation of sound events has not improved to the same extent. The aim of this thesis is to propose and develop methods to improve the segmentation and classification of everyday sound events in SED models. In particular, this thesis explores the segmentation of sound events by investigating audio sequence encoding-based and audio sequence modelling-based methods, in an effort to improve overall sound event detection performance. The first phase of this thesis aims to improve sound event detection by explicitly conditioning the audio sequence representations of an SED model using sound activity detection (SAD) and onset detection. To achieve this, we propose multi-task learning-based SED models in which SAD and onset detection are used as auxiliary tasks for the SED task. The next part of this thesis explores self-attention-based audio sequence modelling, which aggregates audio representations based on temporal relations within and between sound events, scored on the basis of the similarity of sound event portions in audio event sequences. We propose SED models that include memory-controlled, adaptive, dynamic, and source separation-induced self-attention variants, with the aim of improving overall sound recognition.
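
    As a rough illustration of the multi-task idea described above, the sketch below shows a shared CRNN encoder with three frame-level heads, where sound activity detection and onset detection serve as auxiliary tasks alongside the main SED task. The layer sizes, loss weights, and overall architecture are assumptions for illustration, not the thesis's exact models.

    # Hedged sketch of a multi-task SED model: a shared CRNN encoder feeding three
    # frame-level heads (sound event detection, sound activity detection, onset
    # detection). Sizes and loss weights are illustrative assumptions.
    import torch
    import torch.nn as nn

    class MultiTaskSED(nn.Module):
        def __init__(self, n_mels=64, n_classes=10, hidden=128):
            super().__init__()
            # CNN front-end over (batch, 1, time, mel) log-mel inputs.
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
                nn.MaxPool2d((1, 4)),                       # pool frequency only
                nn.Conv2d(64, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
                nn.MaxPool2d((1, 4)),
            )
            self.rnn = nn.GRU(128 * (n_mels // 16), hidden,
                              batch_first=True, bidirectional=True)
            self.sed_head = nn.Linear(2 * hidden, n_classes)    # per-class activity
            self.sad_head = nn.Linear(2 * hidden, 1)            # any-sound activity
            self.onset_head = nn.Linear(2 * hidden, n_classes)  # per-class onsets

        def forward(self, x):                  # x: (batch, 1, time, mel)
            h = self.cnn(x)                    # (batch, ch, time, mel')
            b, c, t, f = h.shape
            h = h.permute(0, 2, 1, 3).reshape(b, t, c * f)
            h, _ = self.rnn(h)                 # (batch, time, 2 * hidden)
            return (torch.sigmoid(self.sed_head(h)),
                    torch.sigmoid(self.sad_head(h)),
                    torch.sigmoid(self.onset_head(h)))

    # Joint objective: SED is the main task; SAD and onset detection are auxiliary.
    def multitask_loss(preds, targets, weights=(1.0, 0.5, 0.5)):
        bce = nn.functional.binary_cross_entropy
        return sum(w * bce(p, t) for w, p, t in zip(weights, preds, targets))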

    Whole-genome sequencing of chronic lymphocytic leukemia identifies subgroups with distinct biological and clinical features.

    The value of genome-wide over targeted driver analyses for predicting clinical outcomes of cancer patients is debated. Here, we report the whole-genome sequencing of 485 chronic lymphocytic leukemia patients enrolled in clinical trials as part of the United Kingdom's 100,000 Genomes Project. We identify an extended catalog of recurrent coding and noncoding genetic mutations that represents a source for future studies and provide the most complete high-resolution map of structural variants, copy number changes and global genome features including telomere length, mutational signatures and genomic complexity. We demonstrate the relationship of these features with clinical outcome and show that integration of 186 distinct recurrent genomic alterations defines five genomic subgroups that associate with response to therapy, refining conventional outcome prediction. While requiring independent validation, our findings highlight the potential of whole-genome sequencing to inform future risk stratification in chronic lymphocytic leukemia.

    Social learning across symbolic cultural barriers in non-human cultures

    Social learning is key in the development of both human and non-human animal societies. Here, we provide quantitative evidence that supports the existence of social learning in sperm whales across socio-cultural barriers, based on acoustic data from locations in the Pacific and Atlantic Oceans. Sperm whale populations have traditionally been partitioned into clans based on their vocal repertoire (what they say) of rhythmically patterned clicks (codas), and in particular their identity codas, which serve as symbolic markers for each clan. However, identity codas account for between 35% and 60% of all codas vocalized, depending on the clan. We introduce a computational modeling approach that recovers clan structure and shows new evidence of social learning across clans from the internal temporal structure of non-identity codas, the remaining fraction of codas. The proposed method is based on vocal style, which encodes how sperm whales assemble individual clicks into codas. Specifically, we modeled clicking pattern data using generative models based on variable length Markov chains, producing what we term "subcoda trees". Based on our results, we propose a new concept of vocal identity, which consists of both vocal repertoire and style. We show that (i) style-delimited clans are similar to repertoire-delimited clans, and that (ii) sympatry increases vocal style similarity between clans for non-identity codas, but has no significant effect on identity codas. This implies that different clans that geographically overlap have similar styles for most codas, which in turn implies social learning across cultural boundaries. More broadly, the proposed method provides a new framework for comparing the communication systems of other animal species, with potential implications for our understanding of cultural transmission in animal societies.
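
    As a purely illustrative sketch of the variable-length Markov chain ("subcoda tree") idea, the code below counts next-symbol distributions conditioned on variable-length contexts of preceding symbols and compares two clans' resulting trees with a simple divergence measure. The symbol alphabet, maximum context depth, and distance measure are assumptions for illustration and do not reproduce the paper's exact procedure.

    # Illustrative subcoda-tree sketch: codas are treated as sequences of
    # discretised inter-click intervals; for every context (suffix of preceding
    # symbols up to a maximum depth) we count the distribution of the next symbol.
    from collections import Counter, defaultdict
    import math

    def build_subcoda_tree(codas, max_depth=3):
        """codas: list of symbol sequences. Returns {context tuple: Counter of next symbols}."""
        tree = defaultdict(Counter)
        for coda in codas:
            for i, sym in enumerate(coda):
                for d in range(0, min(max_depth, i) + 1):
                    context = tuple(coda[i - d:i])   # last d symbols before position i
                    tree[context][sym] += 1
        return tree

    def next_symbol_dist(tree, context, alphabet):
        """Add-one-smoothed next-symbol distribution for a context."""
        counts = tree.get(context, Counter())
        total = sum(counts.values()) + len(alphabet)
        return {s: (counts[s] + 1) / total for s in alphabet}

    def style_distance(tree_a, tree_b, alphabet):
        """Mean Jensen-Shannon divergence over shared contexts: one rough way to
        compare the vocal 'style' encoded by two subcoda trees."""
        def jsd(p, q):
            m = {s: 0.5 * (p[s] + q[s]) for s in alphabet}
            kl = lambda x, y: sum(x[s] * math.log(x[s] / y[s]) for s in alphabet)
            return 0.5 * kl(p, m) + 0.5 * kl(q, m)
        contexts = set(tree_a) & set(tree_b)
        return sum(jsd(next_symbol_dist(tree_a, c, alphabet),
                       next_symbol_dist(tree_b, c, alphabet))
                   for c in contexts) / max(len(contexts), 1)

    # Toy usage: two clans' codas as sequences over a two-symbol interval alphabet
    # (S = short gap, L = long gap); real data would use finer discretisation.
    clan_a = [["S", "S", "L", "S"], ["S", "S", "L", "L"]]
    clan_b = [["L", "S", "S", "S"], ["L", "L", "S", "S"]]
    alphabet = ["S", "L"]
    print(style_distance(build_subcoda_tree(clan_a), build_subcoda_tree(clan_b), alphabet))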

    AI: Limits and Prospects of Artificial Intelligence

    The emergence of artificial intelligence has triggered enthusiasm and the promise of boundless opportunities as much as uncertainty about its limits. The contributions to this volume explore the limits of AI, describe the necessary conditions for its functionality, reveal its attendant technical and social problems, and present some existing and potential solutions. At the same time, the contributors highlight the accompanying societal and economic hopes and fears, utopias and dystopias associated with the current and future development of artificial intelligence.

    Automated Detection of COVID-19 Cough Sound using Mel-Spectrogram Images and Convolutional Neural Network

    COVID-19 is a new disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). The initial symptoms of the disease commonly include fever (83-98%), fatigue or myalgia, dry cough (76-82%), and shortness of breath (31-55%). Given the prevalence of coughing as a symptom, artificial intelligence has been employed to detect COVID-19 from cough sounds. This study compares the performance of six Convolutional Neural Network (CNN) models (VGG-16, VGG-19, LeNet-5, AlexNet, ResNet-50, and ResNet-152) in detecting COVID-19 using mel-spectrogram images derived from cough sounds. The training and validation of these CNN models were conducted using the Virufy dataset, consisting of 121 cough audio recordings, each with a sample rate of 48,000 Hz and a duration of 1 second. The audio data were processed to generate mel-spectrogram images, which were subsequently used as inputs to the CNN models. Accuracy, area under the curve (AUC), precision, recall, and F1 score served as evaluation metrics. The AlexNet model, using an input size of 227×227, exhibited the best performance, with the highest AUC value of 0.930. This study provides compelling evidence of the efficacy of CNN models in detecting COVID-19 from cough sounds via mel-spectrogram images, and it underscores the impact of input size on model performance. By exploring the effectiveness of CNN models with different mel-spectrogram image sizes, the study offers novel insights into an optimal and fast audio-based method for early detection of COVID-19 and establishes the groundwork for selecting an appropriate CNN methodology for this task.
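
    To illustrate the pipeline described above, the sketch below converts a single cough recording into a log-mel-spectrogram image, resizes it to the 227×227 AlexNet input, and runs a binary classifier. The file name, mel parameters, and use of an untrained torchvision AlexNet are assumptions for illustration, not the study's exact preprocessing or trained model.

    # Hedged sketch of the mel-spectrogram + CNN pipeline: cough audio -> log-mel
    # "image" -> 227x227 input -> binary COVID / non-COVID prediction.
    import librosa
    import numpy as np
    import torch
    import torch.nn as nn
    from torchvision import models

    # 1. Audio -> log-mel-spectrogram (Virufy clips are 48 kHz, ~1 s long).
    y, sr = librosa.load("cough.wav", sr=48000, duration=1.0)   # file name assumed
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
    log_mel = librosa.power_to_db(mel, ref=np.max)

    # 2. Normalise to [0, 1] and resize to a 3-channel 227x227 "image".
    img = (log_mel - log_mel.min()) / (log_mel.max() - log_mel.min() + 1e-8)
    img = torch.tensor(img, dtype=torch.float32)[None, None]    # (1, 1, H, W)
    img = nn.functional.interpolate(img, size=(227, 227), mode="bilinear",
                                    align_corners=False)
    img = img.repeat(1, 3, 1, 1)                                # (1, 3, 227, 227)

    # 3. AlexNet adapted for a binary COVID-19 / non-COVID-19 output.
    model = models.alexnet(weights=None)
    model.classifier[6] = nn.Linear(4096, 2)
    model.eval()

    # Untrained here; in practice the network would be trained on labelled
    # mel-spectrogram images before use.
    with torch.no_grad():
        logits = model(img)
        prob_covid = torch.softmax(logits, dim=1)[0, 1].item()
    print(f"P(COVID-19) = {prob_covid:.3f}")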