The Missing Data Encoder: Cross-Channel Image Completion with Hide-And-Seek Adversarial Network
Image completion is the problem of generating whole images from fragments
only. It encompasses inpainting (generating a patch given its surroundings),
reverse inpainting/extrapolation (generating the periphery given the central
patch) as well as colorization (generating one or several channels given other
ones). In this paper, we employ a deep network to perform image completion,
with adversarial training as well as perceptual and completion losses, and call
it the "missing data encoder" (MDE). We consider several configurations based
on how the seed fragments are chosen. We show that training MDE for "random
extrapolation and colorization" (MDE-REC), i.e. using random,
channel-independent fragments, better captures image semantics and geometry.
MDE training makes use of a novel "hide-and-seek" adversarial
loss, where the discriminator seeks the original non-masked regions, while the
generator tries to hide them. We validate our models both qualitatively and
quantitatively on several datasets, demonstrating their usefulness for image
completion, unsupervised representation learning, and face occlusion handling.
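To make the hide-and-seek objective concrete, here is a minimal PyTorch-style sketch (our illustration under stated assumptions, not the authors' code): we assume a discriminator D that outputs a per-pixel logit map predicting which regions were kept from the original image, and the helper name hide_and_seek_losses is hypothetical.

```python
import torch
import torch.nn.functional as F

def hide_and_seek_losses(D, completed, mask):
    """Sketch of a hide-and-seek adversarial objective (illustrative).

    D         -- discriminator returning per-pixel logits that predict
                 which regions were kept from the original image
    completed -- generator output: original pixels where mask == 1,
                 generated pixels where mask == 0
    mask      -- float tensor, 1.0 for kept (non-masked) regions
    """
    pred = D(completed)  # (B, 1, H, W) logits
    # Discriminator "seeks": recover the true kept/generated map
    # (in a training loop, `completed` would be detached for this update).
    d_loss = F.binary_cross_entropy_with_logits(pred, mask)
    # Generator "hides": push the discriminator to label every pixel
    # as original, so generated regions become indistinguishable.
    g_loss = F.binary_cross_entropy_with_logits(pred, torch.ones_like(mask))
    return d_loss, g_loss
```

In practice, the generator term would be combined with the perceptual and completion losses mentioned above.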
MultIOD: Rehearsal-free Multihead Incremental Object Detector
Class-incremental learning (CIL) is the ability of artificial agents to
accommodate new classes as they appear in a stream. It is particularly
relevant in evolving environments where agents have limited access to memory
and computational resources. The main challenge of class-incremental learning
is catastrophic forgetting: the inability of neural networks to retain past
knowledge when learning new classes. Unfortunately, most existing
class-incremental object detectors are built on two-stage algorithms such as
Faster-RCNN and rely on rehearsal memory to retain past knowledge. We believe
that the current benchmarks are not realistic, and more effort should be
dedicated to anchor-free and rehearsal-free object detection. In this context,
we propose MultIOD, a class-incremental object detector based on CenterNet. Our
main contributions are: (1) we propose a multihead feature pyramid and
multihead detection architecture to efficiently separate class representations,
(2) we employ transfer learning between classes learned initially and those
learned incrementally to tackle catastrophic forgetting, and (3) we use a
class-wise non-max-suppression as a post-processing technique to remove
redundant boxes. Without bells and whistles, our method outperforms a range of
state-of-the-art methods on two Pascal VOC datasets.
Comment: Under review at the WACV 2024 conference
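The class-wise non-max-suppression in contribution (3) can be sketched with torchvision's batched_nms, which only lets boxes of the same class suppress each other; the wrapper below is a minimal illustration (function and parameter names are ours, not the paper's API).

```python
import torch
from torchvision.ops import batched_nms

def classwise_nms(boxes, scores, labels, iou_thr=0.5):
    """Remove redundant boxes per class (illustrative sketch).

    boxes  -- (N, 4) tensor in (x1, y1, x2, y2) format
    scores -- (N,) confidence scores
    labels -- (N,) integer class ids
    """
    # batched_nms only compares boxes sharing the same label, so
    # detections of different classes never suppress each other --
    # relevant when separate incremental heads fire on the same object.
    keep = batched_nms(boxes, scores, labels, iou_thr)
    return boxes[keep], scores[keep], labels[keep]
```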
Gradient-Based Post-Training Quantization: Challenging the Status Quo
Quantization has become a crucial step for the efficient deployment of deep
neural networks, where floating point operations are converted to simpler fixed
point operations. In its most naive form, it simply consists of a combination
of scaling and rounding transformations, leading to either a limited
compression rate or a significant accuracy drop. Recently, gradient-based
post-training quantization (GPTQ) methods have emerged as a suitable
trade-off between such simple methods and more powerful, yet expensive,
Quantization-Aware Training (QAT) approaches, particularly when quantizing
LLMs, where scalability of the quantization process is of paramount
importance. GPTQ essentially consists of learning the rounding operation using
a small calibration set. In this work, we challenge common choices in GPTQ
methods. In particular, we show that the process is, to a certain extent,
robust to a number of variables (weight selection, feature augmentation, choice
of calibration set). More importantly, we derive a number of best practices for
designing more efficient and scalable GPTQ methods, regarding the problem
formulation (loss, degrees of freedom, use of non-uniform quantization schemes)
or optimization process (choice of variable and optimizer). Lastly, we propose
a novel importance-based mixed-precision technique. Those guidelines lead to
significant performance improvements on all the tested state-of-the-art GPTQ
methods and networks (e.g. +6.819 points on ViT for 4-bit quantization), paving
the way for the design of scalable yet effective quantization methods.
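As a rough illustration of "learning the rounding operation using a small calibration set", here is an AdaRound-style sketch in PyTorch (our assumption of the general recipe, not necessarily the exact formulation benchmarked in the paper): a continuous rounding variable per weight is optimized so the quantized linear layer reproduces the full-precision outputs on calibration data, then hardened to an up/down decision.

```python
import torch

def learn_rounding(weight, scale, calib_x, steps=500, lr=1e-2):
    """Gradient-based PTQ sketch: learn per-weight rounding (illustrative).

    weight  -- (out, in) full-precision weights of a linear layer
    scale   -- quantization step size (scalar, or per-channel (out, 1))
    calib_x -- (batch, in) small calibration set
    """
    w_floor = torch.floor(weight / scale)
    # Continuous rounding offset in [0, 1], starting at the fractional
    # part so optimization begins at round-to-nearest.
    v = (weight / scale - w_floor).detach().clone().requires_grad_(True)
    target = calib_x @ weight.t()  # full-precision layer outputs
    opt = torch.optim.Adam([v], lr=lr)
    for _ in range(steps):
        w_q = (w_floor + v.clamp(0.0, 1.0)) * scale
        loss = ((calib_x @ w_q.t()) - target).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Harden: round up where the learned offset exceeds 0.5.
    return (w_floor + (v.detach() > 0.5).float()) * scale
```

The degrees of freedom discussed in the abstract (which variables to optimize, which loss, uniform vs. non-uniform quantization grids) all correspond to choices made inside such a loop.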