Quality Aware Network for Set to Set Recognition
This paper targets the problem of set to set recognition, which learns the
metric between two image sets. Images in each set belong to the same identity.
Since images in a set can be complementary, they hopefully lead to higher
accuracy in practical applications. However, the quality of each sample cannot
be guaranteed, and samples with poor quality will hurt the metric. In this
paper, the quality aware network (QAN) is proposed to confront this problem,
where the quality of each sample can be automatically learned although such
information is not explicitly provided in the training stage. The network has
two branches, where the first branch extracts appearance feature embedding for
each sample and the other branch predicts quality score for each sample.
Features and quality scores of all samples in a set are then aggregated to
generate the final feature embedding. We show that the two branches can be
trained in an end-to-end manner given only the set-level identity annotation.
Analysis on gradient spread of this mechanism indicates that the quality
learned by the network is beneficial to set-to-set recognition and simplifies
the distribution that the network needs to fit. Experiments on both face
verification and person re-identification show advantages of the proposed QAN.
The source code and network structure can be downloaded at
https://github.com/sciencefans/Quality-Aware-Network.
Comment: Accepted at CVPR 201
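The aggregation step described in this abstract can be sketched as follows. This is a minimal illustration, assuming the set-level embedding is a quality-weighted average of the per-image features; the function name `aggregate_set`, the sum normalization, and the toy numbers are assumptions made for illustration, not the authors' released implementation.

```python
import numpy as np


def aggregate_set(features, quality_scores, eps=1e-12):
    """Combine per-image features into one set embedding.

    Hypothetical helper: weights each feature vector by its (non-negative)
    quality score and normalizes the weights so they sum to about one.
    """
    features = np.asarray(features, dtype=float)   # shape (n_images, dim)
    q = np.asarray(quality_scores, dtype=float)    # shape (n_images,)
    if features.shape[0] != q.shape[0]:
        raise ValueError("need exactly one quality score per image")
    weights = q / (q.sum() + eps)                  # eps guards against all-zero scores
    return weights @ features                      # shape (dim,)


# Toy set of three images with 2-D features; the first image is scored highest.
feats = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
scores = np.array([0.8, 0.1, 0.1])
set_embedding = aggregate_set(feats, scores)       # approximately [0.9, 0.2]
```

Under this assumption, a low-quality image contributes little to the set embedding, so it also has little influence on the metric computed between two sets.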
A Pattern Classification Based approach for Blur Classification
Blur type identification is one of the most crucial steps of image restoration. In blind restoration, it is generally assumed that the blur type is known before restoration begins. However, this assumption is not practical in real applications, so identifying the blur type is highly desirable before applying a blind restoration technique to a blurred image. This paper presents an approach that categorizes blur into three classes: motion, defocus, and combined blur. Curvelet transform based energy features are used to describe the blur patterns, and a neural network is designed for classification. The simulation results show the precision of the proposed approach.
Automatic Scaffolding Productivity Measurement through Deep Learning
This study developed a method to automatically measure scaffolding productivity by extracting and analysing semantic information from onsite vision data.
Adaptive Representations for Image Restoration
In the field of image processing, building good representation models for
natural images is crucial for various applications, such as image
restoration, sampling, segmentation, etc. Adaptive image representation
models are designed for describing the intrinsic structures of natural
images. In classical Bayesian inference, this representation is often known
as the prior of the intensity distribution of the input image. Early image
priors have forms such as total variation norm, Markov Random Fields (MRF),
and wavelets. Recently, image priors obtained from machine learning
techniques tend to be more adaptive, aiming to capture natural image models
by learning from larger databases. In this thesis, we study adaptive
representations of natural images for image restoration.
The purpose of image restoration is to remove the artifacts which degrade
an image. The degradation comes in many forms, such as image blur, noise,
and codec artifacts. Take image denoising as an example. There are several
classic representation methods which can generate state-of-the-art results.
The first one is the assumption of image self-similarity. However, this
assumption sometimes fails because of high noise levels or unique image
contents. The second one is the wavelet based nonlocal representation,
whose fixed basis functions are not adaptive enough for arbitrary types of
input images. The third is sparse coding using overcomplete dictionaries,
which lacks a hierarchical structure similar to the one in the human visual
system and is therefore prone to denoising artifacts.
My research started from image denoising. Through a thorough review and
evaluation of state-of-the-art denoising methods, it was found that the
representation of images is substantially important for the denoising
technique. At the same time, an improvement on one of the nonlocal
denoising methods was proposed, which improves the representation of images
by integrating Gaussian blur, clustering, and Rotationally Invariant Block
Matching. Enlightened by the successful application of sparse coding in
compressive sensing, we exploited image self-similarity by using a sparse
representation based on wavelet coefficients in a nonlocal and hierarchical
way, which generates competitive results compared to state-of-the-art
denoising algorithms. Meanwhile, another adaptive local filter learned by
Genetic Programming (GP) was proposed for efficient image denoising. In
this work, we employed GP to find the optimal representations for local
image patches through training on massive datasets, which yields
competitive results compared to state-of-the-art local denoising filters.
After successfully dealing with the denoising part, we moved to parameter
estimation for image degradation models. For instance, image blur
identification uses deep learning, which has recently been proposed as a
popular image representation approach. This work has also been extended to
blur estimation by replacing the second step of the framework with a
general regression neural network. In summary, this thesis explores spatial
correlations, sparse coding, genetic programming, and deep learning as
adaptive image representation models for both image restoration and
parameter estimation.
We conclude this thesis by considering methods based on machine learning
to be the best adaptive representations for natural images. We have shown
that they can generate better results than conventional representation
models for the tasks of image denoising and deblurring.
Blur Classification Using Segmentation Based Fractal Texture Analysis
The objective of vision based gesture recognition is to design a system that can understand human actions and convey the acquired information with the help of captured images. Image restoration is required whenever an image is blurred during acquisition, since blurred images can severely degrade the performance of such systems. Image restoration recovers a true image from a degraded version; it is referred to as blind restoration when the blur information is unknown. Blur identification is essential before applying any blind restoration algorithm. This paper presents a blur identification approach which categorizes a hand gesture image into one of four categories: sharp, motion blurred, defocus blurred, and combined blurred. A segmentation based fractal texture analysis extraction algorithm supplies the features for a neural network based classification system. The simulation results demonstrate the precision of the proposed method.
A Multi-view Impartial Decision Network for Frontotemporal Dementia Diagnosis
Frontotemporal Dementia (FTD) diagnosis has progressed successfully using
deep learning techniques. However, current FTD identification methods suffer
from two limitations. Firstly, they do not exploit the potential of multi-view
functional magnetic resonance imaging (fMRI) for classifying FTD. Secondly,
they do not consider the reliability of the multi-view FTD diagnosis. To
address these limitations, we propose a reliable multi-view impartial decision
network (MID-Net) for FTD diagnosis in fMRI. Our MID-Net provides confidence
for each view and generates a reliable prediction without any conflict. To
achieve this, we employ multiple expert models to extract evidence from the
abundant neural network information contained in fMRI images. We then introduce
the Dirichlet Distribution to characterize the expert class probability
distribution from an evidence level. Additionally, a novel Impartial Decision
Maker (IDer) is proposed to combine the different opinions inductively to
arrive at an unbiased prediction without additional computation cost. Overall,
our MID-Net dynamically integrates the decisions of different experts on FTD
disease, especially when dealing with multi-view high-conflict cases. Extensive
experiments on a high-quality FTD fMRI dataset demonstrate that our model
outperforms previous methods and provides high uncertainty for hard-to-classify
examples. We believe that our approach represents a significant step toward the
deployment of reliable FTD decision-making under multi-expert conditions. We
will release the code for reproduction after acceptance.
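The evidence-level use of the Dirichlet distribution mentioned in this abstract can be illustrated with the standard subjective-logic mapping from non-negative evidence to belief masses and an uncertainty mass. This is a sketch under that assumption: the function name `dirichlet_opinion` and the toy evidence values are hypothetical, and the paper's Impartial Decision Maker (IDer) is not reproduced here.

```python
import numpy as np


def dirichlet_opinion(evidence):
    """Map one expert's non-negative class evidence to a Dirichlet opinion.

    Assumed standard mapping: alpha_k = e_k + 1, S = sum(alpha),
    belief_k = e_k / S, uncertainty = K / S, so that
    sum(belief) + uncertainty == 1.
    """
    evidence = np.asarray(evidence, dtype=float)
    if np.any(evidence < 0):
        raise ValueError("evidence must be non-negative")
    num_classes = evidence.size
    alpha = evidence + 1.0               # Dirichlet concentration parameters
    strength = alpha.sum()               # Dirichlet strength S
    belief = evidence / strength         # belief mass per class
    uncertainty = num_classes / strength # mass not assigned to any class
    expected_prob = alpha / strength     # mean of the Dirichlet distribution
    return belief, uncertainty, expected_prob


# Two hypothetical experts (views) on a binary task.
confident = dirichlet_opinion([18.0, 0.0])  # strong evidence for class 0
unsure = dirichlet_opinion([0.5, 0.5])      # little evidence for either class
```

With this mapping, an expert that collects little evidence reports a large uncertainty mass, which is the kind of per-view confidence signal a multi-view fusion rule can take into account.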
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attribute
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have been widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attribute group and part-based
methods, etc. Fifthly, we show some applications which take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
at the following website:
https://sites.google.com/view/ahu-pedestrianattributes/.
Comment: Check our project page for a high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes