Bamboo: A fast descriptor based on AsymMetric pairwise BOOsting
A robust hash, or content-based fingerprint, is a succinct representation of the perceptually most relevant parts of a multimedia object. A key requirement of fingerprinting is that elements with perceptually similar content should map to the same fingerprint, even if their bit-level representations differ. In this work we propose BAMBOO (Binary descriptor based on AsymMetric pairwise BOOsting), a binary local descriptor that combines content-based fingerprinting techniques with computationally efficient filters (box filters, Haar-like features, etc.) applied to image patches. In particular, we define a possibly large set of filters and iteratively select the most discriminative ones using an asymmetric pairwise boosting technique. The output values of the filtering process are quantized to one bit each, leading to a very compact binary descriptor. Results show that the descriptor significantly outperforms binary descriptors of comparable complexity (e.g., BRISK) and approaches the discriminative power of state-of-the-art descriptors that are significantly more complex (e.g., SIFT and BinBoost).
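The filter-then-quantize step described in this abstract can be illustrated with a minimal sketch. It assumes Haar-like filters given as pairs of boxes (a positive and a negative region) and one threshold per filter; the filter set, the thresholds, and all function names here are hypothetical, and the boosting-based filter selection from the paper is not reproduced.

```python
from typing import List, Sequence, Tuple

# (top, left, bottom, right); bottom and right are exclusive. Hypothetical layout.
Box = Tuple[int, int, int, int]


def integral_image(patch: Sequence[Sequence[float]]) -> List[List[float]]:
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(patch), len(patch[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0.0
        for c in range(w):
            row_sum += patch[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii: List[List[float]], box: Box) -> float:
    """Sum of the pixels inside `box`, read from four table entries."""
    top, left, bottom, right = box
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def describe(patch: Sequence[Sequence[float]],
             filters: Sequence[Tuple[Box, Box]],
             thresholds: Sequence[float]) -> int:
    """Quantize each Haar-like response to one bit and pack the bits into an int."""
    ii = integral_image(patch)
    code = 0
    for (pos, neg), thr in zip(filters, thresholds):
        response = box_sum(ii, pos) - box_sum(ii, neg)
        code = (code << 1) | (1 if response > thr else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed descriptors."""
    return bin(a ^ b).count("1")
```

Packing the bits into an integer keeps descriptor comparison down to an XOR and a bit count, which is the usual reason binary descriptors are cheap to match.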
Real Adaboost for Content Identification
This paper proposes a machine learning method based on Real Adaboost that jointly optimizes the content ID codes and the decoding metric. Significant performance gains over prior art are demonstrated for audio fingerprinting.
Perceptual Video Hashing for Content Identification and Authentication
Perceptual hashing has been widely used to identify similar content for video copy detection. It has also been adopted to detect malicious manipulations for video authentication. Serving both applications with a single system and a single hash would be highly desirable, as it saves storage space and reduces computational complexity. This paper proposes a perceptual video hashing system for content identification and authentication. The objective is to design a hash extraction technique that withstands signal processing operations on the one hand and detects malicious attacks on the other. The proposed system relies on a new signal calibration technique for extracting the hash using the discrete cosine transform (DCT) and the discrete sine transform (DST). The technique determines the number of samples, called the normalizing shift, by which a digital signal must be shifted so that the shifted version matches a certain pattern according to its DCT/DST coefficients. The rationale for the calibration idea is that the normalizing shift resists signal processing operations while remaining sensitive to local tampering (i.e., replacing a small portion of the signal with a different one). While the same hash serves both applications, two different similarity measures are proposed, one for video identification and one for authentication. In extensive experiments with various types of video distortions and manipulations, the proposed system outperforms related state-of-the-art video hashing techniques in both identification and authentication, with the added ability to locate tampered regions.
Learning compact hashing codes for large-scale similarity search
Retrieval of similar objects is a key component in many applications. As databases grow larger, learning compact representations for efficient storage and fast search becomes increasingly important. Moreover, these representations should preserve similarity, i.e., similar objects should have similar representations. Hashing algorithms, which encode objects into compact binary codes that preserve similarity, have demonstrated promising results in addressing these challenges. This dissertation studies the problem of learning compact hashing codes for large-scale similarity search. Specifically, we investigate two classes of approaches: regularized Adaboost and signal-to-noise ratio (SNR) maximization. Regularized Adaboost builds on the classical boosting framework for hashing, while SNR maximization is a novel hashing framework with a theoretical guarantee and great flexibility in designing hashing algorithms for various scenarios.
The regularized Adaboost algorithm learns and extracts binary hash codes (fingerprints) of time-varying content by filtering and quantizing perceptually significant features. The proposed algorithm extends the recent symmetric pairwise boosting (SPB) algorithm by taking feature sequence correlation into account. An information-theoretic analysis of the SPB algorithm is given, showing that each iteration of SPB maximizes a lower bound on the mutual information between matching fingerprint pairs. Based on this analysis, two practical regularizers are proposed to penalize filters that generate highly correlated responses. A learning-theoretic analysis of the regularized Adaboost algorithm is also given. The proposed algorithm demonstrates significant performance gains over SPB for both audio and video content identification (ID) systems.
SNR maximization hashing (SNR-MH) uses the SNR metric to select a set of uncorrelated projection directions, and one hash bit is extracted from each projection direction. We first motivate this approach under a Gaussian model for the underlying signals, in which case maximizing SNR is equivalent to minimizing the hashing error probability. This theoretical guarantee differentiates SNR-MH from other hashing algorithms, where learning has to be carried out with a continuous relaxation of the quantization functions. A globally optimal solution can be obtained by solving a generalized eigenvalue problem. Experiments on both synthetic and real datasets demonstrate the power of SNR-MH to learn compact codes.
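One common way to write an SNR objective of this kind is as a generalized Rayleigh quotient. The formulation below is an illustrative assumption, not necessarily the dissertation's exact one: the additive model, the covariance symbols $\Sigma_s$ and $\Sigma_n$, and the sign quantizer are introduced here only to show why a generalized eigenvalue problem yields uncorrelated projections.

```latex
% Assumed model: x = s + n, with zero-mean, mutually uncorrelated signal s
% and noise n, and covariances \Sigma_s and \Sigma_n (\Sigma_n positive definite).
\[
\mathrm{SNR}(w) = \frac{w^\top \Sigma_s w}{w^\top \Sigma_n w},
\qquad
\Sigma_s w_k = \lambda_k \Sigma_n w_k,
\qquad
h_k(x) = \operatorname{sign}\!\left(w_k^\top x\right).
\]
% The SNR attained by w_k equals \lambda_k, so the top eigenvectors are kept.
% For eigenvectors with distinct eigenvalues (j \neq k):
\[
w_j^\top \Sigma_n w_k = 0
\quad\text{and}\quad
w_j^\top \Sigma_s w_k = 0
\;\Longrightarrow\;
w_j^\top \left(\Sigma_s + \Sigma_n\right) w_k = 0 .
\]
```

Under these assumptions the projections $w_j^\top x$ and $w_k^\top x$ are uncorrelated, so each retained direction contributes a bit that is not linearly redundant with the others.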
We extend SNR-MH to two different scenarios in large-scale similarity search. The first extension targets applications with a larger bit budget. We propose a multi-bit-per-projection algorithm, called SNR multi-bit hashing (SNR-MBH), to learn longer hash codes when the number of high-SNR projections is limited. Extensive experiments demonstrate the superior performance of SNR-MBH. The second extension targets a multi-feature setting, where more than one feature vector is available for each object. We propose two multi-feature hashing methods, SNR joint hashing (SNR-JH) and SNR selection hashing (SNR-SH). SNR-JH jointly considers all feature correlations and learns uncorrelated hash functions that maximize SNR, while SNR-SH learns hash functions separately on each individual feature and selects the final hash functions based on the SNR associated with each one. The proposed methods perform favorably compared to other state-of-the-art multi-feature hashing algorithms on several benchmark datasets.
Probabilistic modelling and inference of human behaviour from mobile phone time series
With an estimated 4.1 billion subscribers around the world, the mobile phone offers a unique opportunity to sense and understand human behaviour from location, co-presence and communication data. While the benefit of modelling this unprecedented amount of data is widely recognised, a number of challenges impede the development of accurate behaviour models. In this thesis, we identify and address two modelling problems and show that their consideration improves the accuracy of behaviour inference.
We first examine the modelling of long-range dependencies in human behaviour. Human behaviour models only take into account short-range dependencies in mobile phone time series. Using information theory, we quantify long-range dependencies in mobile phone time series for the first time, demonstrate that they exhibit periodic oscillations and introduce novel tools to analyse them. We further show that considering what the user did 24 hours earlier improves accuracy when predicting user behaviour five hours or longer in advance.
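One information-theoretic way to quantify such dependencies is the empirical mutual information between a behaviour label and the label observed a fixed number of time steps later. The sketch below is a minimal illustration under that assumption; the function name and the plug-in estimator are choices made here, not necessarily the measure used in the thesis.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def lagged_mutual_information(series: Sequence[Hashable], lag: int) -> float:
    """Plug-in estimate (in bits) of I(series[t]; series[t + lag])."""
    if lag <= 0 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    pairs = list(zip(series[:-lag], series[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    past = Counter(x for x, _ in pairs)
    future = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * log2(count * n / (past[x] * future[y]))
    return mi
```

Evaluating the function over a range of lags on an hourly label sequence would show a peak at multiples of 24 whenever the behaviour follows a daily routine, which is the kind of periodic oscillation the abstract refers to.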
The second problem that we address is the modelling of temporal variations in human behaviour. The time spent by a user on an activity varies from one day to the next. In order to recognise behaviour patterns despite temporal variations, we establish a methodological connection between human behaviour modelling and biological sequence alignment. This connection allows us to compare, cluster and model behaviour sequences and introduce novel features for behaviour recognition which improve its accuracy.
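To illustrate how sequence alignment tolerates temporal variation, the sketch below scores two activity sequences with Needleman-Wunsch global alignment, a standard technique from biological sequence analysis. It is an assumption that this particular algorithm and these scoring values resemble what the thesis uses; the labels and parameters are hypothetical.

```python
from typing import Sequence


def alignment_score(a: Sequence[str], b: Sequence[str],
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch global alignment score between two label sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per label.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]
```

With one label per hour (say `H` for home and `W` for work), a day with one extra hour at work still scores close to the original day, because the extra hour is absorbed by a single gap instead of shifting every later label out of step.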
The experiments presented in this thesis have been conducted on the largest publicly available mobile phone dataset labelled in an unsupervised fashion and are entirely repeatable. Furthermore, our techniques only require cellular data which can easily be recorded by today's mobile phones and could benefit a wide range of applications including life logging, health monitoring, customer profiling and large-scale surveillance.