Detecting Overfitting of Deep Generative Networks via Latent Recovery
State of the art deep generative networks are capable of producing images
with such incredible realism that they can be suspected of memorizing training
images. This is why it is not uncommon to include visualizations of training set
nearest neighbors, to suggest generated images are not simply memorized. We
demonstrate that this is not sufficient, which motivates the need to study
memorization/overfitting of deep generators with more scrutiny. This paper
addresses this question by i) showing how simple losses are highly effective at
reconstructing images for deep generators and ii) analyzing the statistics of
reconstruction errors when reconstructing training and validation images, which
is the standard way to analyze overfitting in machine learning. Using this
methodology, this paper shows that overfitting is not detectable in the pure
GAN models proposed in the literature, in contrast with those using hybrid
adversarial losses, which are amongst the most widely applied generative
methods. The paper also shows that standard GAN evaluation metrics fail to
capture memorization for some deep generators. Finally, the paper also shows
how off-the-shelf GAN generators can be successfully applied to face inpainting
and face super-resolution using the proposed reconstruction method, without
hybrid adversarial losses.
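The latent-recovery idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the generator here is a hypothetical linear map `W` standing in for a deep generator, the loss is a plain squared Euclidean distance, and the "training" and "validation" sets are synthetic. The sketch optimizes a latent code for each image and compares the reconstruction errors of the two sets, where a large gap would hint at memorization.

```python
import numpy as np

def recover_latent(W, x, steps=2000):
    """Gradient descent on z for the toy generator G(z) = W @ z,
    minimising the squared Euclidean loss ||G(z) - x||^2."""
    # Step size 1/L, where L = 2 * sigma_max(W)^2 bounds the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        residual = W @ z - x
        z -= lr * 2.0 * (W.T @ residual)
    return z, float(np.sum((W @ z - x) ** 2))

def recovery_errors(W, images):
    """Reconstruction error of the best latent code found for each image."""
    return np.array([recover_latent(W, x)[1] for x in images])

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))               # hypothetical generator: 4-d latent -> 16-d "image"
train = (W @ rng.normal(size=(4, 20))).T   # images the toy generator reproduces exactly
val = rng.normal(size=(20, 16))            # held-out images outside the generator's range

train_err = recovery_errors(W, train)
val_err = recovery_errors(W, val)
gap = np.median(val_err) - np.median(train_err)
print(f"median train error {np.median(train_err):.2e}, "
      f"median validation error {np.median(val_err):.2e}")
```

In this toy setup the training images are recovered almost perfectly while the validation images are not, which is the kind of train/validation discrepancy the abstract describes as a sign of overfitting. With a real generator one would compare the two error distributions statistically rather than through a single median gap.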
Feature Likelihood Score: Evaluating Generalization of Generative Models Using Samples
The past few years have seen impressive progress in the development of deep
generative models capable of producing high-dimensional, complex, and
photo-realistic data. However, current methods for evaluating such models
remain incomplete: standard likelihood-based metrics do not always apply and
rarely correlate with perceptual fidelity, while sample-based metrics, such as
FID, are insensitive to overfitting, i.e., inability to generalize beyond the
training set. To address these limitations, we propose a new metric called the
Feature Likelihood Score (FLS), a parametric sample-based score that uses
density estimation to provide a comprehensive trichotomic evaluation accounting
for novelty (i.e., different from the training samples), fidelity, and
diversity of generated samples. We empirically demonstrate the ability of FLS
to identify specific overfitting problem cases, where previously proposed
metrics fail. We also extensively evaluate FLS on various image datasets and
model classes, demonstrating its ability to match intuitions of previous
metrics like FID while offering a more comprehensive evaluation of generative
models.
Generating Private Data Surrogates for Vision Related Tasks
With the widespread application of deep networks in industry, membership inference attacks, i.e. the ability to discern training data from a model, become more and more problematic for data privacy. Recent work suggests that generative networks may be robust against membership attacks. In this work, we build on this observation, offering a general-purpose solution to the membership privacy problem. As the primary contribution, we demonstrate how to construct surrogate datasets, using images from GAN generators, labelled with a classifier trained on the private dataset. Next, we show this surrogate data can further be used for a variety of downstream tasks (here classification and regression), while being resistant to membership attacks. We study a variety of different GANs proposed in the literature, concluding that higher quality GANs result in better surrogate data with respect to the task at hand.
A Very Brief Introduction to Machine Learning With Applications to Communication Systems
Given the unprecedented availability of data and computing resources, there
is widespread renewed interest in applying data-driven machine learning methods
to problems for which the development of conventional engineering solutions is
challenged by modelling or algorithmic deficiencies. This tutorial-style paper
starts by addressing the questions of why and when such techniques can be
useful. It then provides a high-level introduction to the basics of supervised
and unsupervised learning. For both supervised and unsupervised learning,
exemplifying applications to communication networks are discussed by
distinguishing tasks carried out at the edge and at the cloud segments of the
network at different layers of the protocol stack.
Validation of machine learning based scenario generators
Machine learning methods are getting more and more important in the
development of internal models using scenario generation. As internal models
under Solvency 2 have to be validated, an important question is in which
aspects the validation of these data-driven models differs from a classical
theory-based model. On the specific example of market risk, we discuss the
necessity of two additional validation tasks: one to check the dependencies
between the risk factors used and one to detect the unwanted memorizing effect.
The first one is necessary because in this new method, the dependencies are not
derived from a financial-mathematical theory. The latter one arises when the
machine learning model only repeats empirical data instead of generating new
scenarios. These measures are then applied for an machine learning based
economic scenario generator. It is shown that those measures lead to reasonable
results in this context and are able to be used for validation as well as for
model optimization
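One simple way to probe the memorizing effect mentioned in this abstract is a nearest-neighbour distance check. The sketch below is an assumption for illustration, not the measure used in the paper: the function names, the ratio statistic, and the synthetic three-factor data are all hypothetical. It compares how close generated scenarios lie to the training data with how close held-out real scenarios lie to it.

```python
import numpy as np

def nn_distances(queries, reference):
    """Euclidean distance from each query row to its nearest reference row."""
    diffs = queries[:, None, :] - reference[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

def memorization_ratio(generated, train, holdout):
    """Median nearest-neighbour distance of generated scenarios to the training
    data, divided by the same quantity for held-out real scenarios.
    Values far below 1 suggest the generator is repeating training data."""
    return float(np.median(nn_distances(generated, train))
                 / np.median(nn_distances(holdout, train)))

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 3))      # synthetic "empirical" risk-factor data
holdout = rng.normal(size=(200, 3))    # real data not seen by the generator
# A generator that copies training rows with tiny noise, versus one that samples afresh.
copied = train[rng.integers(0, 200, size=200)] + 1e-3 * rng.normal(size=(200, 3))
fresh = rng.normal(size=(200, 3))

ratio_copied = memorization_ratio(copied, train, holdout)
ratio_fresh = memorization_ratio(fresh, train, holdout)
print(f"copied: {ratio_copied:.3f}, fresh: {ratio_fresh:.3f}")
```

A ratio near 1 means generated scenarios are about as far from the training data as genuinely new observations are, while a ratio close to 0 flags scenarios that are near-copies. Any threshold for "too close" would have to be chosen for the application at hand.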
Segmentation of surgical tools from laparoscopy images
Master's project report in Biomedical Engineering. Robotic-assisted surgeries have been replacing open surgeries with a significant
impact on patient recovery time, and consequently, on various aspects such as
healthcare resource savings and the early resumption of the patient's work activities.
This type of surgery, assisted by a robotic system, is guided by a laparoscopic camera,
providing the surgeon with a view of the patient's anatomical structures. To operate this
equipment, surgeons must undergo numerous hours of training, making the process
exhaustive and costly. In addition, manipulating surgical instruments in coordination
with the laparoscopic camera is not an intuitive process, meaning errors of a subjective
nature are not eliminated. The objective of this thesis is the development of an
automated system capable of segmenting surgical instruments, thereby enabling
constant monitoring of their positions. Various machine learning models were explored
to address this issue. In a second phase, methods that could be incorporated into the
base model were considered. Once a solution was found, a comparison was made
between the previously selected models, the base model, and the optimized model. In
a third approach, with the aim of improving the comparison metrics, alternative
solutions were sought, including the generation of synthetic data. At this point, two
possibilities were encountered, one based on autonomous learning systems through
competition and the other on image synthesis learning systems from progressively
increasing noise spectral density. Both approaches expanded the available database,
and their effectiveness was evaluated by comparing the impact of data augmentation
on segmentation systems. The proposed system can potentially be implemented in
robotic-assisted surgeries with minimal modifications.