61 research outputs found
Self-supervised learning for transferable representations
Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thenceforth is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks
On the Utility of Representation Learning Algorithms for Myoelectric Interfacing
Electrical activity produced by muscles during voluntary movement is a reflection of the firing patterns of relevant motor neurons and, by extension, the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control apparent. Whereas myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer—a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites is still an active area of inquiry.This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture with an appertaining training framework from which simultaneous and proportional control emerges. Paper Iv introduce a dataset of HD-sEMG signals for use with learning algorithms. Paper v applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper vI introduces a Transformer model for myoelectric interfacing that do not need additional training data to function with previously unseen users. Paper vII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, paper vIII describes a framework for synthesizing EMG from multi-articulate gestures intended to reduce training burden
Advanced analytical methods for fraud detection: a systematic literature review
The developments of the digital era demand new ways of producing goods and rendering
services. This fast-paced evolution in the companies implies a new approach from the
auditors, who must keep up with the constant transformation. With the dynamic
dimensions of data, it is important to seize the opportunity to add value to the companies.
The need to apply more robust methods to detect fraud is evident.
In this thesis the use of advanced analytical methods for fraud detection will be
investigated, through the analysis of the existent literature on this topic.
Both a systematic review of the literature and a bibliometric approach will be applied to
the most appropriate database to measure the scientific production and current trends.
This study intends to contribute to the academic research that have been conducted, in
order to centralize the existing information on this topic
Application of a generative adversarial network for multi-featured fermentation data synthesis and artificial neural network (ANN) modeling of bitter gourd–grape beverage production.
Artificial neural networks (ANNs) have in recent times found increasing application in predictive modelling of various food processing operations including fermentation, as they have the ability to learn nonlinear complex relationships in high dimensional datasets, which might otherwise be outside the scope of conventional regression models. Nonetheless, a major limiting factor of ANNs is that they require quite a large amount of training data for better performance. Obtaining such an amount of data from biological processes is usually difficult for many reasons. To resolve this problem, methods are proposed to inflate existing data by artificially synthesizing additional valid data samples. In this paper, we present a generative adversarial network (GAN) able to synthesize an infinite amount of realistic multi-dimensional regression data from limited experimental data (n = 20). Rigorous testing showed that the synthesized data (n = 200) significantly conserved the variances and distribution patterns of the real data. Further, the synthetic data was used to generalize a deep neural network. The model trained on the artificial data showed a lower loss (2.029 ± 0.124) and converged to a solution faster than its counterpart trained on real data (2.1614 ± 0.117)
Data-Driven Methods for Data Center Operations Support
During the last decade, cloud technologies have been evolving at
an impressive pace, such that we are now living in a cloud-native
era where developers can leverage on an unprecedented landscape
of (possibly managed) services for orchestration, compute, storage,
load-balancing, monitoring, etc. The possibility to have on-demand
access to a diverse set of configurable virtualized resources allows
for building more elastic, flexible and highly-resilient distributed
applications. Behind the scenes, cloud providers sustain the heavy
burden of maintaining the underlying infrastructures, consisting in
large-scale distributed systems, partitioned and replicated among
many geographically dislocated data centers to guarantee scalability,
robustness to failures, high availability and low latency. The larger the
scale, the more cloud providers have to deal with complex interactions
among the various components, such that monitoring, diagnosing and
troubleshooting issues become incredibly daunting tasks.
To keep up with these challenges, development and operations
practices have undergone significant transformations, especially in
terms of improving the automations that make releasing new software,
and responding to unforeseen issues, faster and sustainable at scale.
The resulting paradigm is nowadays referred to as DevOps. However,
while such automations can be very sophisticated, traditional DevOps
practices fundamentally rely on reactive mechanisms, that typically
require careful manual tuning and supervision from human experts.
To minimize the risk of outages—and the related costs—it is crucial to
provide DevOps teams with suitable tools that can enable a proactive
approach to data center operations.
This work presents a comprehensive data-driven framework to address
the most relevant problems that can be experienced in large-scale
distributed cloud infrastructures. These environments are indeed characterized
by a very large availability of diverse data, collected at each
level of the stack, such as: time-series (e.g., physical host measurements,
virtual machine or container metrics, networking components
logs, application KPIs); graphs (e.g., network topologies, fault graphs
reporting dependencies among hardware and software components,
performance issues propagation networks); and text (e.g., source code,
system logs, version control system history, code review feedbacks).
Such data are also typically updated with relatively high frequency,
and subject to distribution drifts caused by continuous configuration
changes to the underlying infrastructure. In such a highly dynamic scenario,
traditional model-driven approaches alone may be inadequate
at capturing the complexity of the interactions among system components. DevOps teams would certainly benefit from having robust
data-driven methods to support their decisions based on historical
information. For instance, effective anomaly detection capabilities may
also help in conducting more precise and efficient root-cause analysis.
Also, leveraging on accurate forecasting and intelligent control
strategies would improve resource management.
Given their ability to deal with high-dimensional, complex data,
Deep Learning-based methods are the most straightforward option for
the realization of the aforementioned support tools. On the other hand,
because of their complexity, this kind of models often requires huge
processing power, and suitable hardware, to be operated effectively
at scale. These aspects must be carefully addressed when applying
such methods in the context of data center operations. Automated
operations approaches must be dependable and cost-efficient, not to
degrade the services they are built to improve.
i
Advances in Artificial Intelligence: Models, Optimization, and Machine Learning
The present book contains all the articles accepted and published in the Special Issue “Advances in Artificial Intelligence: Models, Optimization, and Machine Learning” of the MDPI Mathematics journal, which covers a wide range of topics connected to the theory and applications of artificial intelligence and its subfields. These topics include, among others, deep learning and classic machine learning algorithms, neural modelling, architectures and learning algorithms, biologically inspired optimization algorithms, algorithms for autonomous driving, probabilistic models and Bayesian reasoning, intelligent agents and multiagent systems. We hope that the scientific results presented in this book will serve as valuable sources of documentation and inspiration for anyone willing to pursue research in artificial intelligence, machine learning and their widespread applications
Data-driven Disease Surveillance
The recent and still ongoing pandemic of SARS-CoV-2 has shown that an infectious disease outbreak can have serious consequences on public health and economy. In this situation, public health officials constantly aim to control and reduce the number of infections in order to avoid overburdening health care system. Besides minimizing personal contact through political measures, a fundamental approach to contain the spread of diseases is to isolate infected individuals. The effectiveness of the latter approach strongly depends on a timely detection of the outbreak as the tracking of individuals can quickly become infeasible when the number of cases increases. Hence, a key factor in the containment of an infectious disease is the early detection of a potential larger outbreak, commonly known as outbreak detection.
For this purpose, epidemiologists rely on a variety of statistical surveillance methods in order to maintain an overview of the current situation of infections by either monitoring confirmed cases or cases with early symptoms. Mainly based on statistical hypothesis testing, these methods automatically raise an alarm if an unexpected increase in the number of infections is observed. The practical usefulness of such methods highly depends on the trade-off between the ability to detect outbreaks and the chances of raising a false alarm. However, this hypothesis-based approach to disease surveillance has several limitations. On the one hand, it is a hand-crafted approach which requires domain knowledge to set up the statistical methods, especially if early symptoms are monitored. On the other hand, outbreaks of emerging infectious diseases with different symptom patterns are likely to be missed by such a surveillance system.
In this thesis, we focus on data-driven disease surveillance and address these challenges in the following ways. To support epidemiologists in the process of defining reliable disease patterns for monitoring cases with early symptoms, we present a novel approach to discover such patterns in historic data. With respect to supervised learning, we propose a fusion classifier which can combine the output of multiple statistical methods using the univariate time series of infection counts as the only source of information. In addition, we develop algorithms based on unsupervised learning which frame the task of outbreak detection as a general anomaly detection task. This even includes the surveillance of emerging infectious diseases. Therefore, we contribute a novel framework and propose a new approach based on sum-product networks to monitor multiple disease patterns simultaneously. Our results show that data-driven approaches are ideal to assist epidemiologists by processing large amounts of data that cannot fully be understood and analyzed by humans. Most significantly, the incorporation of additional information into the surveillance through machine learning techniques shows reliable and promising results
Emotion Recognition for Affective Computing: Computer Vision and Machine Learning Approach
The purpose of affective computing is to develop reliable and intelligent models that computers can use to interact more naturally with humans. The critical requirements for such models are that they enable computers to recognise, understand and interpret the emotional states expressed by humans. The emotion recognition has been a research topic of interest for decades, not only in relation to developments in the affective computing field but also due to its other potential applications.
A particularly challenging problem that has emerged from this body of work, however, is the task of recognising facial expressions and emotions from still images or videos in real-time. This thesis aimed to solve this challenging problem by developing new techniques involving computer vision, machine learning and different levels of information fusion.
Firstly, an efficient and effective algorithm was developed to improve the performance of the Viola-Jones algorithm. The proposed method achieved significantly higher detection accuracy (95%) than the standard Viola-Jones method (90%) in face detection from thermal images, while also doubling the detection speed. Secondly, an automatic subsystem for detecting eyeglasses, Shallow-GlassNet, was proposed to address the facial occlusion problem by designing a shallow convolutional neural network capable of detecting eyeglasses rapidly and accurately. Thirdly, a novel neural network model for decision fusion was proposed in order to make use of multiple classifier systems, which can increase the classification accuracy by up to 10%. Finally, a high-speed approach to emotion recognition from videos, called One-Shot Only (OSO), was developed based on a novel spatio-temporal data fusion method for representing video frames. The OSO method tackled video classification as a single image classification problem, which not only made it extremely fast but also reduced the overfitting problem
Models and Analysis of Vocal Emissions for Biomedical Applications
The MAVEBA Workshop proceedings, held on a biannual basis, collect the scientific papers presented both as oral and poster contributions, during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of main phonatory dysfunctions, as well as the biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies
Audio-Visual Automatic Speech Recognition Using PZM, MFCC and Statistical Analysis
Audio-Visual Automatic Speech Recognition (AV-ASR) has become the most promising research area when the audio signal gets corrupted by noise. The main objective of this paper is to select the important and discriminative audio and visual speech features to recognize audio-visual speech. This paper proposes Pseudo Zernike Moment (PZM) and feature selection method for audio-visual speech recognition. Visual information is captured from the lip contour and computes the moments for lip reading. We have extracted 19th order of Mel Frequency Cepstral Coefficients (MFCC) as speech features from audio. Since all the 19 speech features are not equally important, therefore, feature selection algorithms are used to select the most efficient features. The various statistical algorithm such as Analysis of Variance (ANOVA), Kruskal-wallis, and Friedman test are employed to analyze the significance of features along with Incremental Feature Selection (IFS) technique. Statistical analysis is used to analyze the statistical significance of the speech features and after that IFS is used to select the speech feature subset. Furthermore, multiclass Support Vector Machine (SVM), Artificial Neural Network (ANN) and Naive Bayes (NB) machine learning techniques are used to recognize the speech for both the audio and visual modalities. Based on the recognition rate combined decision is taken from the two individual recognition systems. This paper compares the result achieved by the proposed model and the existing model for both audio and visual speech recognition. Zernike Moment (ZM) is compared with PZM and shows that our proposed model using PZM extracts better discriminative features for visual speech recognition. This study also proves that audio feature selection using statistical analysis outperforms methods without any feature selection technique
- …