Providing theoretical learning guarantees to Deep Learning Networks
Deep Learning (DL) is one of the most common subjects when Machine Learning
and Data Science approaches are considered. There are clearly two movements
related to DL: the first gathers researchers in a quest to outperform other
algorithms from the literature, trying to win contests on the basis of often
small decreases in the empirical risk; and the second investigates evidence of
overfitting, questioning the learning capabilities of DL classifiers. Motivated
by these opposing points of view, this paper employs Statistical Learning
Theory (SLT) to study the convergence of Deep Neural Networks, with particular
interest in Convolutional Neural Networks. In order to draw theoretical
conclusions, we propose an approach to estimate the Shattering coefficient of
those classification algorithms, providing a lower bound for the complexity of
their space of admissible functions, a.k.a. algorithm bias. Based on this
estimator, we generalize the complexity of network biases and then study the
AlexNet and VGG16 architectures from the point of view of their Shattering
coefficients and the number of training examples required to provide
theoretical learning guarantees. From our theoretical formulation, we show the
conditions under which Deep Neural Networks learn and point out another issue:
DL benchmarks may be strictly driven by empirical risks, disregarding the
complexity of algorithm biases.
Comment: Submitted to JML
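To illustrate how a Shattering coefficient translates into a number of training examples, the following is a minimal sketch and not the paper's estimator. It assumes the classical SLT generalization bound P(sup |R_emp(f) - R(f)| > eps) <= 2 N(F, 2n) exp(-n eps^2 / 4) and a hypothetical polynomial coefficient N(F, n) = n^d. The function names and the values of d, eps, and delta are illustrative assumptions, not figures from the abstract.

```python
import math


def log_bound(n, log_shatter, eps):
    """Log of the assumed bound 2 * N(F, 2n) * exp(-n * eps**2 / 4)."""
    return math.log(2.0) + log_shatter(2 * n) - n * eps ** 2 / 4.0


def min_samples(log_shatter, eps, delta, n_max=10 ** 15):
    """Smallest n for which the assumed bound is at most delta.

    Assumes log_shatter is concave in n (true for a polynomial
    coefficient), so once the bound drops below delta it stays below.
    """
    target = math.log(delta)
    hi = 1
    # Double n until the bound is satisfied.
    while log_bound(hi, log_shatter, eps) > target:
        hi *= 2
        if hi > n_max:
            raise ValueError("bound not reached below n_max")
    if hi == 1:
        return 1
    lo = hi // 2  # the bound is still violated at lo
    # Bisect between a violating n (lo) and a satisfying n (hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if log_bound(mid, log_shatter, eps) <= target:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical coefficient N(F, n) = n**d with d = 2 (illustrative only).
d = 2
poly_log_shatter = lambda m: d * math.log(m)
n_required = min_samples(poly_log_shatter, eps=0.1, delta=0.05)
print(n_required)
```

Working in log space avoids overflow when the coefficient grows quickly, which would likely be the case for any realistic network bias.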
Generalization of feature embeddings transferred from different video anomaly detection domains
Detecting anomalous activity in video surveillance often involves using only
normal activity data in order to learn an accurate detector. Due to the lack of
annotated data for a specific target domain, one could employ existing data
from a source domain to produce better predictions. Hence, transfer learning
presents itself as an important tool. But how should the resulting data space
be analyzed? This paper investigates video anomaly detection, in particular
feature embeddings of pre-trained CNNs that can be used with non-fully
supervised data.
By proposing novel cross-domain generalization measures, we study how source
features can generalize for different target video domains, as well as analyze
unsupervised transfer learning. The proposed generalization measures are not
only a theoretical approach, but also prove useful in practice as a way to
understand which datasets can be used or transferred to describe video frames,
making it possible to better discriminate between normal and anomalous
activity.
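As a rough illustration of learning a detector from normal activity only, the sketch below scores a frame by its mean distance to the k nearest normal embeddings. It does not implement the cross-domain generalization measures proposed in the paper, which the abstract does not specify. The function name and the toy two-dimensional vectors are hypothetical stand-ins for embeddings from a pre-trained CNN.

```python
import math


def knn_anomaly_score(query, normal_feats, k=3):
    """Mean Euclidean distance from query to its k nearest normal embeddings."""
    dists = sorted(math.dist(query, feat) for feat in normal_feats)
    k = min(k, len(dists))
    return sum(dists[:k]) / k


# Toy embeddings of normal frames (hypothetical values).
normal_feats = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]

score_typical = knn_anomaly_score((0.05, 0.05), normal_feats)
score_unusual = knn_anomaly_score((3.0, 3.0), normal_feats)
print(score_typical, score_unusual)
```

A frame whose embedding lies far from every normal embedding receives a higher score. In a transfer setting, the normal embeddings would come from the source domain and the queries from the target domain.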