Self-supervised learning for transferable representations
Machine learning has achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process: it is not feasible to produce extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable recent shift toward approaches that leverage raw data alone. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thereafter is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models across many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, although correlations weaken as tasks move beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalises to real-world transformations. This begins to explain the differing empirical performance achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation.
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
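The contrastive learners whose augmentation-induced invariances the thesis studies typically optimise an instance-discrimination objective. As an illustration only (the thesis's own models and augmentation pipelines are not reproduced here), the following is a minimal NumPy sketch of the NT-Xent loss used by SimCLR-style methods, where two augmented views of the same image form the positive pair:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalised temperature-scaled cross-entropy) loss.
    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    View i of z1 is the positive for view i of z2, and vice versa."""
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalise embeddings
    sim = z @ z.T / temperature                       # scaled cosine similarities
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    # positives: row i (view 1) pairs with row i+n (view 2), and vice versa
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos_idx].mean()
```

Stronger augmentation makes the positive pairs harder to match, which is the lever behind the spatial/appearance invariance trade-off the thesis investigates.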
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
Sound Event Detection by Exploring Audio Sequence Modelling
Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life, including security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing a sound recognition system are which portion of a sound event the system should analyse, and what proportion of a sound event the system should process in order to claim a confident detection of that particular sound event. While the classification of sound events has improved considerably in recent years, the temporal segmentation of sound events is generally considered not to have improved to the same extent. The aim of this thesis is to propose and develop methods to improve the segmentation and classification of everyday sound events in SED models. In particular, this thesis explores the segmentation of sound events by investigating audio sequence encoding-based and audio sequence modelling-based methods, in an effort to improve the overall sound event detection performance. In the first phase of this thesis, efforts are put towards improving sound event detection by explicitly conditioning the audio sequence representations of an SED model using sound activity detection (SAD) and onset detection.
To achieve this, we propose multi-task learning-based SED models in which SAD and onset detection are used as auxiliary tasks for the SED task. The next part of this thesis explores self-attention-based audio sequence modelling, which aggregates audio representations based on temporal relations within and between sound events, scored on the basis of the similarity of sound event portions in audio event sequences. We propose SED models that include memory-controlled, adaptive, dynamic, and source separation-induced self-attention variants, with the aim of improving overall sound recognition performance.
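The self-attention aggregation described above can be sketched in its basic single-head, scaled dot-product form, where each audio frame is re-represented as a similarity-weighted mixture of all frames in the sequence. This is a generic illustration only; the memory-controlled, adaptive, dynamic, and source-separation-induced variants proposed in the thesis are not reproduced here:

```python
import numpy as np

def self_attention(frames, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence
    of audio frame embeddings.
    frames: (T, D) frame features; w_q, w_k, w_v: (D, D_h) projections."""
    q, k, v = frames @ w_q, frames @ w_k, frames @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])       # (T, T) frame-pair similarities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over the time axis
    return attn @ v, attn                        # contextualised frames, weights
```

Each row of the attention matrix sums to one, so a frame inside a sound event can draw context from acoustically similar frames of the same event, which is the aggregation behaviour the SED models above exploit.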
Whole-genome sequencing of chronic lymphocytic leukemia identifies subgroups with distinct biological and clinical features.
The value of genome-wide over targeted driver analyses for predicting clinical outcomes of cancer patients is debated. Here, we report the whole-genome sequencing of 485 chronic lymphocytic leukemia patients enrolled in clinical trials as part of the United Kingdom's 100,000 Genomes Project. We identify an extended catalog of recurrent coding and noncoding genetic mutations that represents a source for future studies and provide the most complete high-resolution map of structural variants, copy number changes and global genome features including telomere length, mutational signatures and genomic complexity. We demonstrate the relationship of these features with clinical outcome and show that integration of 186 distinct recurrent genomic alterations defines five genomic subgroups that associate with response to therapy, refining conventional outcome prediction. While requiring independent validation, our findings highlight the potential of whole-genome sequencing to inform future risk stratification in chronic lymphocytic leukemia.
The impact of enterprise social networking on knowledge sharing between academic staff in higher education
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Higher education institutions have always considered knowledge sharing critical for research excellence, and finding proper methods for sharing knowledge across academic staff has therefore been a major issue for universities and knowledge management research. Recent evidence shows that many universities have embraced enterprise social networking tools to improve communication, relationships, partnerships, and knowledge sharing. To date, there is little understanding of the critical factors for online knowledge sharing behaviour between academic staff, and of the impact of these factors on work benefits for academic staff, which differ between consumptive users and contributive users in higher education. This study employed the extended unified theory of acceptance and use of technology (UTAUT) to examine factors affecting knowledge sharing through the consumptive and contributive use of enterprise social networks (ESNs). The study adopts a critical realism philosophical approach and employed a grounded theory mixed-methods design. The conceptual model was validated through structural equation modelling based on an online survey of 254 academic staff using enterprise social networking as a part of their work in the United Kingdom. The findings have significant theoretical and practical implications for researchers and policy makers. The research has developed a cohesive ESN use model by extending and modifying the unified theory of acceptance and use of technology. The findings indicate significant differences around factors affecting consumptive and contributive usage patterns within ESNs. Due to advances in communication technologies, this research argues that a previous model suggested by Venkatesh et al. (2003) is no longer fit for purpose and that the new communication tools can lead to improved knowledge sharing in higher education.
This research also makes valuable contributions to universities from a managerial viewpoint, suggesting that universities could help their scholars find a more comprehensive range of funding sources matching their ideas.
Social learning across symbolic cultural barriers in non-human cultures
Social learning is key in the development of both human and non-human animal societies. Here, we provide quantitative evidence that supports the existence of social learning in sperm whales across socio-cultural barriers, based on acoustic data from locations in the Pacific and Atlantic Oceans. Sperm whale populations have traditionally been partitioned into clans based on their vocal repertoire (what they say) of rhythmically patterned clicks (codas), and in particular their identity codas, which serve as symbolic markers for each clan. However, identity codas account for only 35% to 60% of all codas vocalized, depending on the clan. We introduce a computational modeling approach that recovers clan structure and shows new evidence of social learning across clans from the internal temporal structure of non-identity codas, the remaining fraction of codas. The proposed method is based on vocal style, which encodes how sperm whales assemble individual clicks into codas. Specifically, we modeled clicking pattern data using generative models based on variable-length Markov chains, producing what we term "subcoda trees". Based on our results, we propose a new concept of vocal identity, which consists of both vocal repertoire and style. We show that (i) style-delimited clans are similar to repertoire-delimited clans, and that (ii) sympatry increases vocal style similarity between clans for non-identity codas, but has no significant effect on identity codas. This implies that different clans that geographically overlap have similar styles for most codas, which in turn implies social learning across cultural boundaries. More broadly, the proposed method provides a new framework for comparing the communication systems of other animal species, with potential implications for our understanding of cultural transmission in animal societies.
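A variable-length Markov chain of the kind underlying the "subcoda trees" above can be sketched as a count-based context tree over a discretised click alphabet: prediction backs off from the longest observed context to shorter ones. This is an illustrative reconstruction of the general technique, not the authors' implementation; the symbol alphabet and depth are assumptions:

```python
from collections import defaultdict

class SubcodaTree:
    """Count-based variable-length Markov chain over a sequence of
    discretised click symbols. Stores counts for every context up to
    max_depth symbols; prediction backs off to the longest context
    actually seen during training."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, sequences):
        for seq in sequences:
            for i, sym in enumerate(seq):
                # record this symbol under every suffix context of length 0..max_depth
                for d in range(min(i, self.max_depth) + 1):
                    self.counts[tuple(seq[i - d:i])][sym] += 1
        return self

    def next_prob(self, history, sym):
        """P(sym | history), backing off from the longest matching context."""
        for d in range(min(len(history), self.max_depth), -1, -1):
            context = tuple(history[len(history) - d:])
            if context in self.counts:
                total = sum(self.counts[context].values())
                return self.counts[context][sym] / total
        return 0.0
```

Comparing the conditional distributions such trees assign to click patterns gives a distance between vocal styles, which is the kind of quantity the clan-comparison analysis relies on.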
AI: Limits and Prospects of Artificial Intelligence
The emergence of artificial intelligence has triggered enthusiasm and promise of boundless opportunities as much as uncertainty about its limits. The contributions to this volume explore the limits of AI, describe the necessary conditions for its functionality, reveal its attendant technical and social problems, and present some existing and potential solutions. At the same time, the contributors highlight the societal and attending economic hopes and fears, utopias and dystopias that are associated with the current and future development of artificial intelligence.
Automated Detection of COVID-19 Cough Sound using Mel-Spectrogram Images and Convolutional Neural Network
COVID-19 is a new disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). The initial symptoms of the disease commonly include fever (83-98%), fatigue or myalgia, dry cough (76-82%), and shortness of breath (31-55%). Given the prevalence of coughing as a symptom, artificial intelligence has been employed to detect COVID-19 based on cough sounds. This study aims to compare the performance of six different Convolutional Neural Network (CNN) models (VGG-16, VGG-19, LeNet-5, AlexNet, ResNet-50, and ResNet-152) in detecting COVID-19 using mel-spectrogram images derived from cough sounds. The training and validation of these CNN models were conducted using the Virufy dataset, consisting of 121 cough audio recordings with a sample rate of 48,000 Hz and a duration of 1 second for all audio data. The audio data were processed to generate mel-spectrogram images, which were subsequently employed as inputs for the CNN models. This study used accuracy, area under the curve (AUC), precision, recall, and F1 score as evaluation metrics. The AlexNet model, utilizing an input size of 227×227, exhibited the best performance with the highest AUC value of 0.930. This study provides compelling evidence of the efficacy of CNN models in detecting COVID-19 based on cough sounds through mel-spectrogram images. Furthermore, the study underscores the impact of input size on model performance. This research contributes to identifying the CNN model that demonstrates the best performance in COVID-19 detection based on cough sounds. By exploring the effectiveness of CNN models with different mel-spectrogram image sizes, this study offers novel insights into an optimal and fast audio-based method for early detection of COVID-19. Additionally, this study establishes the fundamental groundwork for selecting an appropriate CNN methodology for early detection of COVID-19.
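The mel-spectrogram images that serve as CNN inputs above can be computed from a raw waveform with an STFT followed by a triangular mel filterbank. The study reports 48,000 Hz sampling and 1-second clips; the FFT size, hop length, and number of mel bands below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(audio, sr=48000, n_fft=1024, hop=512, n_mels=64):
    """Log-mel spectrogram (dB) of a mono waveform, as a 2-D image."""
    # Short-time Fourier transform with a Hann window
    window = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(audio[s:s + n_fft] * window)) ** 2
              for s in range(0, len(audio) - n_fft + 1, hop)]
    power = np.array(frames).T                      # (n_fft//2 + 1, n_frames)
    # Triangular filters spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, ctr, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, ctr):
            fbank[m - 1, k] = (k - lo) / max(ctr - lo, 1)   # rising slope
        for k in range(ctr, hi):
            fbank[m - 1, k] = (hi - k) / max(hi - ctr, 1)   # falling slope
    return 10.0 * np.log10(fbank @ power + 1e-10)   # dB-scaled mel image
```

The resulting (n_mels × n_frames) array would then be rendered or resized to each network's expected input resolution, e.g. 227×227 for AlexNet.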