33 research outputs found
Shot-based object retrieval from video with compressed Fisher vectors
This paper addresses the problem of retrieving, from a database of video sequences, the shots that match a query image. Existing architectures are mainly based on the Bag of Words model, which consists of matching the query image against a high-level representation of local features extracted from the video database. However, such architectures lack the capability to scale up to very large databases. Recently, Fisher vectors have shown promising results in large-scale image retrieval problems, but it is still not clear how they can best be exploited in video-related applications. In our work, we use compressed Fisher vectors to represent the video shots, and we show that the inherent correlation between video frames can be profitably exploited. Experiments show that our proposal achieves better performance at lower computational cost than similar architectures.
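As a rough illustration of the Fisher-vector encoding this abstract builds on (a standard simplification, not the paper's implementation: a pre-trained diagonal GMM is assumed, and only the gradient with respect to the means is used):

```python
import numpy as np

def fisher_vector_means(descriptors, weights, means, sigmas):
    """Encode local descriptors as a Fisher vector over a diagonal GMM,
    keeping only the gradient w.r.t. the means (a common simplification).
    descriptors: (N, D); weights: (K,); means, sigmas: (K, D)."""
    N, D = descriptors.shape
    # Soft assignment (posterior) of each descriptor to each Gaussian
    diff = descriptors[:, None, :] - means[None, :, :]            # (N, K, D)
    log_prob = -0.5 * np.sum((diff / sigmas[None]) ** 2
                             + np.log(2 * np.pi * sigmas[None] ** 2), axis=2)
    log_prob += np.log(weights)[None, :]
    post = np.exp(log_prob - log_prob.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)                       # (N, K)
    # Accumulate the normalized gradient w.r.t. each component mean
    fv = (post[:, :, None] * diff / sigmas[None]).sum(axis=0)     # (K, D)
    fv /= (N * np.sqrt(weights))[:, None]
    fv = fv.ravel()
    # Power- and L2-normalization, standard for Fisher vectors
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)
```

The resulting fixed-length vector can be compared with a dot product regardless of how many local features each shot contains, which is what makes compression and large-scale indexing practical.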
Learnable Descriptors for Visual Search
This work proposes LDVS, a learnable binary local descriptor devised for matching natural images within the MPEG CDVS framework. LDVS descriptors are learned so that they can be sign-quantized and compared using the Hamming distance. The underlying convolutional architecture has a moderate parameter count, suitable for operation on mobile devices. Our experiments show that LDVS descriptors perform favorably against comparable learned binary descriptors at patch matching on two different datasets. A complete pair-wise image matching pipeline is then designed around LDVS descriptors, integrating them into the reference CDVS evaluation framework. Experiments show that LDVS descriptors outperform the compressed CDVS SIFT-like descriptors at pair-wise image matching over the challenging CDVS image dataset.
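The sign-quantization and Hamming comparison mentioned above can be sketched as follows (a minimal illustration of the matching step only; the learned convolutional network that produces the real-valued descriptors is out of scope here):

```python
import numpy as np

def sign_quantize(descriptor):
    """Binarize a real-valued descriptor: one bit per dimension,
    set when the component is non-negative."""
    return (np.asarray(descriptor) >= 0).astype(np.uint8)

def hamming_distance(a, b):
    """Number of differing bits between two binary descriptors."""
    return int(np.count_nonzero(a != b))
```

Because the comparison reduces to counting bit differences, matching can be implemented with XOR and popcount instructions, which is what makes binary descriptors attractive on mobile hardware.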
Capsule Networks with Routing Annealing
HEMP: High-order entropy minimization for neural network compression
We formulate the entropy of a quantized artificial neural network as a differentiable function that can be plugged as a regularization term into the cost function minimized by gradient descent. Our formulation scales efficiently beyond the first order and is agnostic to the quantization scheme. The network can then be trained to minimize the entropy of the quantized parameters, so that they can be optimally compressed via entropy coding. We experiment with our entropy formulation at quantizing and compressing well-known network architectures over multiple datasets. Our approach compares favorably with similar methods, enjoying the benefits of a higher-order entropy estimate, flexibility towards non-uniform quantization (we use Lloyd-Max quantization), scalability to any entropy order to be minimized, and efficiency in terms of compression. We show that HEMP works in synergy with other approaches aimed at pruning or quantizing the model itself, delivering significant gains in storage-size compressibility without harming the model's performance.
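The core idea of a differentiable entropy regularizer can be sketched as below. This is a hypothetical first-order simplification, not HEMP's higher-order formulation: weights are soft-assigned to quantization centers via a softmax over negative squared distances, and the entropy of the resulting soft histogram is computed, so gradients flow back to the weights.

```python
import torch

def soft_entropy(weights, centers, temperature=0.1):
    """First-order differentiable entropy estimate of quantized weights
    (illustrative simplification; HEMP's actual estimator is higher-order).
    weights: 1-D tensor of parameters; centers: 1-D tensor of K centers."""
    # Squared distance of every weight to every quantization center
    d = (weights.view(-1, 1) - centers.view(1, -1)) ** 2   # (N, K)
    # Soft (differentiable) assignment of weights to centers
    assign = torch.softmax(-d / temperature, dim=1)        # rows sum to 1
    # Soft histogram: probability mass of each quantization bin
    p = assign.mean(dim=0)                                 # (K,)
    # Shannon entropy in bits; small epsilon avoids log(0)
    return -(p * torch.log2(p + 1e-12)).sum()
```

Adding this term to the task loss pushes the parameter distribution toward a few dominant bins, which an entropy coder can then compress more tightly.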