Ensemble of Different Approaches for a Reliable Person Re-identification System
An ensemble of approaches for reliable person re-identification is proposed in this paper. The ensemble is built by combining widely used person re-identification systems operating in different color spaces with several variants of state-of-the-art approaches proposed in this paper. Different descriptors are tested, and both texture and color features are extracted from the images; the descriptors are then compared using different distance measures (e.g., the Euclidean distance, the angle, and the Jeffrey distance). To improve performance, a method based on skeleton detection, extracted from the depth map, is also applied when a depth map is available. The proposed ensemble is validated on three widely used datasets (CAVIAR4REID, IAS, and VIPeR), keeping the parameter set of each approach constant across all tests to avoid overfitting and to demonstrate that the proposed system can be considered a general-purpose person re-identification system. Our experimental results show that the proposed system offers significant improvements over baseline approaches. The source code for the approaches tested in this paper will be available at https://www.dei.unipd.it/node/2357 and http://robotics.dei.unipd.it/reid/
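As a minimal illustration of the three distance measures named above, the sketch below compares two descriptor histograms under the Euclidean, angle, and Jeffrey measures; the details (NumPy, epsilon smoothing for the Jeffrey divergence) are our own assumptions, not the paper's code.

```python
import numpy as np

def euclidean(p, q):
    """Euclidean (L2) distance between two descriptor vectors."""
    return np.linalg.norm(p - q)

def angle(p, q):
    """Angle between two descriptors (arccos of the cosine similarity)."""
    cos = np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def jeffrey(p, q, eps=1e-12):
    """Jeffrey (symmetrised Kullback-Leibler) divergence between two
    histograms; each is normalised to sum to 1, with eps to avoid log(0)."""
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return np.sum((p - q) * np.log(p / q))

# Example: compare two random colour histograms under each measure.
rng = np.random.default_rng(0)
h1, h2 = rng.random(64), rng.random(64)
for d in (euclidean, angle, jeffrey):
    print(d.__name__, d(h1, h2))
```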
Bispectrum Inversion with Application to Multireference Alignment
We consider the problem of estimating a signal from noisy circularly-translated versions of itself, called multireference alignment (MRA). One natural approach to MRA could be to estimate the shifts of the observations first, and infer the signal by aligning and averaging the data. In contrast, we consider a method based on estimating the signal directly, using features of the signal that are invariant under translations. Specifically, we estimate the power spectrum and the bispectrum of the signal from the observations. Under mild assumptions, these invariant features contain enough information to infer the signal. In particular, the bispectrum can be used to estimate the Fourier phases. To this end, we propose and analyze a few algorithms. Our main methods consist of non-convex optimization over the smooth manifold of phases. Empirically, in the absence of noise, these non-convex algorithms appear to converge to the target signal with random initialization. The algorithms are also robust to noise. We then suggest three additional methods, based on frequency marching, semidefinite relaxation, and integer programming. The first two methods provably recover the phases exactly in the absence of noise. In the high noise level regime, the invariant features approach for MRA results in stable estimation if the number of measurements scales like the cube of the noise variance, which is the information-theoretic rate. Additionally, it requires only one pass over the data, which is important at low signal-to-noise ratio, when the number of observations must be large.
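The following toy sketch (not the paper's recovery algorithms) illustrates why the averaged power spectrum and bispectrum are useful invariants: the random circular shifts cancel out of both statistics, so averaging over many noisy observations estimates the invariants of the underlying signal in a single pass over the data.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, sigma = 32, 1000, 0.5
x = rng.standard_normal(N)                      # ground-truth signal

# M circularly shifted, noisy observations of x.
obs = np.stack([np.roll(x, rng.integers(N)) + sigma * rng.standard_normal(N)
                for _ in range(M)])
F = np.fft.fft(obs, axis=1)

# Averaged power spectrum, debiased by the expected noise power N*sigma^2.
power = (np.abs(F) ** 2).mean(axis=0) - sigma ** 2 * N

# Bispectrum B[k1, k2] = E[ F[k1] F[k2] conj(F[k1+k2]) ]: the shift phases
# exp(-2*pi*i*k*s/N) cancel, so B estimates the bispectrum of x itself.
k = np.arange(N)
idx = (k[:, None] + k[None, :]) % N
B = np.einsum('mi,mj,mij->ij', F, F, np.conj(F[:, idx])) / M

# Compare against the clean signal's invariants.
Fx = np.fft.fft(x)
Bx = Fx[:, None] * Fx[None, :] * np.conj(Fx[idx])
print('relative power error     :',
      np.linalg.norm(power - np.abs(Fx) ** 2) / np.linalg.norm(np.abs(Fx) ** 2))
print('relative bispectrum error:',
      np.linalg.norm(B - Bx) / np.linalg.norm(Bx))
```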
Intelligent Garbage Classifier
IGC (Intelligent Garbage Classifier) is a system for visual classification and separation of solid waste products. Currently, an important part of the separation effort is based on manual work, from household separation to industrial waste management. Taking advantage of the technologies currently available, a system has been built that can analyze images from a camera and control a robot arm and conveyor belt to automatically separate different kinds of waste.
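A hypothetical sketch of such a pipeline is shown below; the class labels, classifier stub, and actuator interfaces are placeholders of our own, not the actual IGC implementation.

```python
# Hypothetical IGC-style control loop: classify a camera frame, then route
# the item to a bin. Everything below is a stand-in, not the real system.
import random

BINS = ['plastic', 'glass', 'metal', 'paper']    # assumed waste classes

def classify(frame):
    """Stand-in for the trained visual classifier (e.g. a CNN)."""
    return random.choice(BINS)                   # dummy prediction

class Belt:
    def stop(self):  print('belt: stop')
    def start(self): print('belt: start')

class Arm:
    def drop_into(self, bin_name): print(f'arm: drop item into {bin_name} bin')

def sort_item(frame, arm, belt):
    label = classify(frame)                      # analyse the camera image
    belt.stop()                                  # hold the item under the arm
    arm.drop_into(label)                         # separate it by class
    belt.start()                                 # resume the conveyor

sort_item(frame=None, arm=Arm(), belt=Belt())
```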
Adaptive Representations for Image Restoration
In the field of image processing, building good representation models for natural images is crucial for various applications, such as image restoration, sampling, and segmentation. Adaptive image representation models are designed to describe the intrinsic structures of natural images. In classical Bayesian inference, this representation is often known as the prior on the intensity distribution of the input image. Early image priors took forms such as the total variation norm, Markov Random Fields (MRF), and wavelets. More recently, image priors obtained from machine learning techniques have tended to be more adaptive, aiming to capture natural image models by learning from larger databases. In this thesis, we study adaptive representations of natural images for image restoration.
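As a concrete example of one of the early priors listed above, the sketch below applies total variation denoising via scikit-image (assumed installed); the test image and weight are illustrative choices, not values from the thesis.

```python
# Total variation (TV) denoising: a classical prior favouring
# piecewise-constant images, minimising ||u - noisy||^2 + weight * TV(u).
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_chambolle
from skimage.util import random_noise

clean = img_as_float(data.camera())
noisy = random_noise(clean, var=0.01)            # additive Gaussian noise

denoised = denoise_tv_chambolle(noisy, weight=0.1)
print('noisy MSE   :', np.mean((noisy - clean) ** 2))
print('denoised MSE:', np.mean((denoised - clean) ** 2))
```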
The purpose of image restoration is to remove the artifacts that degrade an image. The degradation comes in many forms, such as image blur, noise, and codec artifacts. Take image denoising as an example: there are several classic representation methods that can generate state-of-the-art results. The first is based on the assumption of image self-similarity; however, this assumption can fail at high noise levels or for unique image content. The second is the wavelet-based nonlocal representation, whose fixed basis functions are not adaptive enough for arbitrary types of input images. The third is sparse coding with overcomplete dictionaries, which lacks the hierarchical structure found in the human visual system and is therefore prone to denoising artifacts.
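A brief sketch of the third representation, sparse coding of image patches over a learned overcomplete dictionary, is given below using scikit-learn and scikit-image (assumed installed); the patch size, dictionary size, and sparsity level are illustrative assumptions.

```python
# Sparse coding over an overcomplete dictionary: 100 atoms for
# 64-dimensional (8x8) patches, with OMP for the sparse codes.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)
from skimage import data, img_as_float

img = img_as_float(data.camera())[:128, :128]
patches = extract_patches_2d(img, (8, 8)).reshape(-1, 64)
mean = patches.mean(axis=1, keepdims=True)
patches -= mean                                   # learn on zero-mean patches

dico = MiniBatchDictionaryLearning(n_components=100, alpha=1.0,
                                   transform_algorithm='omp',
                                   transform_n_nonzero_coefs=5,
                                   random_state=0)
codes = dico.fit(patches).transform(patches)      # sparse code per patch
recon = codes @ dico.components_ + mean           # patch reconstructions
recon_img = reconstruct_from_patches_2d(recon.reshape(-1, 8, 8), img.shape)
print('reconstruction MSE:', np.mean((recon_img - img) ** 2))
```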
My research started with image denoising. Through a thorough review and evaluation of state-of-the-art denoising methods, it was found that the representation of images is of substantial importance to the denoising technique. At the same time, an improvement on one of the nonlocal denoising methods was proposed, which improves the representation of images by integrating Gaussian blur, clustering, and Rotationally Invariant Block Matching. Inspired by the successful application of sparse coding in compressive sensing, we exploited image self-similarity by using a sparse representation based on wavelet coefficients in a nonlocal and hierarchical way, which generates competitive results compared to state-of-the-art denoising algorithms. Meanwhile, another adaptive local filter, learned by Genetic Programming (GP), was proposed for efficient image denoising. In this work, we employed GP to find optimal representations for local image patches through training on massive datasets, which yields competitive results compared to state-of-the-art local denoising filters. After successfully dealing with denoising, we moved on to parameter estimation for image degradation models; for instance, image blur identification using deep learning, which has recently emerged as a popular image representation approach. This work was also extended to blur estimation by replacing the second step of the framework with a general regression neural network. In summary, this thesis explores spatial correlations, sparse coding, genetic programming, and deep learning as adaptive image representation models for both image restoration and parameter estimation.
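To make the self-similarity idea underlying the nonlocal methods concrete, the toy filter below denoises each pixel by weighted averaging over pixels whose surrounding patches resemble its own; this is plain nonlocal means on a synthetic image, not the rotationally invariant variant proposed in the thesis.

```python
import numpy as np

def nl_means_pixel(padded, i, j, half, search_half, h):
    """Denoise one pixel by weighted averaging of pixels whose
    surrounding patches resemble the patch around (i, j)."""
    ref = padded[i - half:i + half + 1, j - half:j + half + 1]
    num, den = 0.0, 0.0
    for di in range(-search_half, search_half + 1):
        for dj in range(-search_half, search_half + 1):
            cand = padded[i + di - half:i + di + half + 1,
                          j + dj - half:j + dj + half + 1]
            w = np.exp(-np.mean((ref - cand) ** 2) / h ** 2)
            num += w * padded[i + di, j + dj]
            den += w
    return num / den

def nl_means(img, half=2, search_half=5, h=0.15):
    pad = half + search_half
    padded = np.pad(img, pad, mode='reflect')
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = nl_means_pixel(padded, i + pad, j + pad,
                                       half, search_half, h)
    return out

rng = np.random.default_rng(0)
clean = np.kron(rng.random((4, 4)), np.ones((8, 8)))  # piecewise-constant image
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
print('noisy MSE   :', np.mean((noisy - clean) ** 2))
print('denoised MSE:', np.mean((nl_means(noisy) - clean) ** 2))
```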
We conclude this thesis by considering machine learning based methods to be the best adaptive representations for natural images. We have shown that they can generate better results than conventional representation models for the tasks of image denoising and deblurring.
Automated framework for robust content-based verification of print-scan degraded text documents
Fraudulent documents frequently cause severe financial damage and pose security breaches to civil and government organizations. The rapid advances in technology and the widespread availability of personal computers have not reduced the use of printed documents. While digital documents can be verified by many robust and secure methods, such as digital signatures and digital watermarks, verification of printed documents still relies on manual inspection of embedded physical security mechanisms. The objective of this thesis is to propose an efficient automated framework for robust content-based verification of printed documents. The principal issue is to achieve robustness with respect to the degradations and increased levels of noise that occur over multiple cycles of printing and scanning. It is shown that classic OCR systems fail under such conditions; moreover, OCR systems typically rely heavily on high-level linguistic structures to improve recognition rates. However, inferring knowledge about the contents of the document image from a priori statistics is contrary to the nature of document verification. Instead, a system is proposed that utilizes specific knowledge of the document to perform highly accurate content verification based on a print-scan degradation model and character shape recognition. Such specific knowledge of the document is a reasonable choice for the verification domain, since the document contents must already be known in order to verify them. The system analyses digital multi-font PDF documents to generate a descriptive summary of the document, referred to as the "Document Description Map" (DDM). The DDM is later used for verifying the content of printed and scanned copies of the original documents. The system utilizes features based on the 2-D Discrete Cosine Transform and an adaptive hierarchical classifier trained with synthetic data generated by a print-scan degradation model. The system is tested with varying degrees of print-scan channel corruption on a variety of documents, with corruption produced by repetitive printing and scanning of the test documents. Results show the approach achieves excellent accuracy and robustness despite the high level of noise.
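As a sketch of the feature extraction step, the snippet below computes low-frequency 2-D DCT coefficients of a (synthetic) character patch with SciPy; keeping a k-by-k low-frequency block is a common choice and an assumption here, not necessarily the thesis's exact coefficient selection.

```python
import numpy as np
from scipy.fft import dctn

def dct_features(glyph: np.ndarray, k: int = 8) -> np.ndarray:
    """Return the k*k low-frequency 2-D DCT coefficients of a glyph image
    (assumed normalised to a fixed size beforehand) as a feature vector."""
    coeffs = dctn(glyph.astype(float), norm='ortho')
    return coeffs[:k, :k].ravel()

# Example: a synthetic 32x32 'glyph' (random here; a real input would be
# a segmented character from the scanned page).
rng = np.random.default_rng(0)
glyph = rng.random((32, 32))
print(dct_features(glyph).shape)   # (64,)
```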
Elliptical higher-order-spectra periocular code
The periocular region has recently emerged as a standalone biometric trait, promising an attractive trade-off between the iris alone and the entire face, especially for cases where neither the iris nor a full facial image can be acquired. This advantage provides another dimension for implementing a robust biometric system that operates in non-ideal conditions. Global features (LBP, HOG) and local features (SIFT) have been introduced; however, the performance of these features can deteriorate for images captured in unconstrained and less-cooperative conditions. A particular set of Higher Order Spectral (HOS) features has been proven to be invariant to translation, scale, rotation, brightness-level shift, and contrast change. These properties are desirable in the periocular recognition problem for dealing with non-ideal imaging conditions. This paper investigates HOS features in different configurations for the periocular recognition problem under non-ideal conditions. In particular, we introduce a new sampling approach for the periocular region based on an elliptical coordinate system. This non-linear sampling approach is then combined with the robustness of the HOS features for encoding the periocular region. In addition, we propose a new technique for combining the left and right periocular regions. The proposed feature-level fusion approach is based on the state-of-the-art bilinear pooling technique, allowing efficient interaction between the features of both periocular regions. We show the validity of the proposed approach in encoding discriminant features, outperforming or comparing favorably with state-of-the-art features on two popular datasets: FRGC and JAFFE.
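One plausible reading of the elliptical sampling step is sketched below: the periocular image is resampled on concentric ellipses around the eye center, producing an "unrolled" patch on which features can then be computed. The grid resolution and ellipse parameters are assumptions, not the paper's values.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def elliptical_sample(img, center, a, b, n_r=16, n_theta=64):
    """Resample img on concentric ellipses around `center`, with semi-axes
    growing to (a, b); returns an (n_r, n_theta) 'unrolled' patch."""
    r = np.linspace(0.1, 1.0, n_r)                  # normalised radius
    t = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    R, T = np.meshgrid(r, t, indexing='ij')
    rows = center[0] + b * R * np.sin(T)
    cols = center[1] + a * R * np.cos(T)
    return map_coordinates(img, [rows, cols], order=1, mode='nearest')

rng = np.random.default_rng(0)
eye = rng.random((120, 160))                        # stand-in periocular image
patch = elliptical_sample(eye, center=(60, 80), a=70, b=45)
print(patch.shape)                                  # (16, 64)
```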