Recent years have seen an extremely fast evolution of the Computer Vision and
Machine Learning fields: several application domains benefit from the newly developed
technologies, and industries are investing a growing amount of money in Artificial Intelligence.
Convolutional Neural Networks and Deep Learning have substantially contributed to the rise and
diffusion of AI-based solutions, creating the potential for many disruptive new businesses.
The effectiveness of Deep Learning models depends on the availability of a huge
amount of training data. Unfortunately, data collection and labeling are extremely expensive
in terms of both time and cost; moreover, they frequently require the collaboration of
domain experts.
In the first part of the thesis, I will investigate some methods for reducing the cost
of data acquisition for Deep Learning applications in the relatively constrained industrial
scenarios related to visual inspection. I will first assess the effectiveness of Deep Neural
Networks in comparison with several classical Machine Learning algorithms that require a
smaller amount of training data. I will then introduce a hardware-based data
augmentation approach, which yields a considerable performance boost by taking advantage of
a novel illumination setup designed for this purpose. Finally, I will investigate the situation in
which acquiring a sufficient number of training samples is not possible, in particular the most
extreme situation: zero-shot learning (ZSL), which is the problem of multi-class classification
when no training data is available for some of the classes. Visual features designed for image
classification and trained offline have been shown to be useful for ZSL to generalize towards
classes not seen during training. Nevertheless, I will show that recognition performance
on unseen classes can be sharply improved by learning ad hoc semantic embeddings (the
pre-defined lists of present and absent attributes that represent a class) and visual features, to
increase the correlation between the two geometrical spaces and ease the metric learning
process for ZSL.
In the second part of the thesis, I will present some successful applications of state-of-the-
art Computer Vision, Data Analysis and Artificial Intelligence methods. I will illustrate
some solutions developed during the 2020 Coronavirus Pandemic for controlling the disease
evolution and for reducing virus spreading. I will describe the first publicly available
dataset for the analysis of face-touching behavior that we annotated and distributed, and
I will illustrate an extensive evaluation of several computer vision methods applied to the
produced dataset. Moreover, I will describe the privacy-preserving solution we developed
for estimating the “Social Distance” and its violations, given a single uncalibrated image
in unconstrained scenarios. I will conclude the thesis with a Computer Vision solution
developed in collaboration with the Egyptian Museum of Turin for digitally unwrapping
mummies by analyzing their CT scans, to support archaeologists during mummy analysis
and to avoid the devastating and irreversible process of physically unwrapping the bandages
to remove amulets and jewels from the body.