3,292 research outputs found

    Quality Aware Network for Set to Set Recognition

    Full text link
    This paper targets the problem of set-to-set recognition, which learns the metric between two image sets. Images in each set belong to the same identity. Since images in a set can be complementary, they are expected to lead to higher accuracy in practical applications. However, the quality of each sample cannot be guaranteed, and samples of poor quality will hurt the metric. In this paper, the quality aware network (QAN) is proposed to address this problem, where the quality of each sample is learned automatically even though such information is not explicitly provided in the training stage. The network has two branches: the first branch extracts an appearance feature embedding for each sample and the other branch predicts a quality score for each sample. Features and quality scores of all samples in a set are then aggregated to generate the final feature embedding. We show that the two branches can be trained in an end-to-end manner given only the set-level identity annotation. Analysis of the gradient spread of this mechanism indicates that the quality learned by the network is beneficial to set-to-set recognition and simplifies the distribution that the network needs to fit. Experiments on both face verification and person re-identification show the advantages of the proposed QAN. The source code and network structure can be downloaded at https://github.com/sciencefans/Quality-Aware-Network. Comment: Accepted at CVPR 201
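    As a rough illustration of the quality-weighted aggregation described above, the sketch below softmax-normalizes per-sample quality scores and uses them to pool per-sample embeddings into one set-level embedding. This is a minimal Python example under stated assumptions, not the authors' released code; the function names, feature dimension, and normalization choice are all assumptions.

```python
import numpy as np

def aggregate_set(embeddings: np.ndarray, quality_logits: np.ndarray) -> np.ndarray:
    """Pool per-sample embeddings into one set-level embedding.

    embeddings     : (n_samples, dim) appearance features from the first branch.
    quality_logits : (n_samples,) raw quality scores from the second branch.
    Weights are softmax-normalized so low-quality samples contribute less.
    """
    w = np.exp(quality_logits - quality_logits.max())
    w /= w.sum()
    set_embedding = (w[:, None] * embeddings).sum(axis=0)
    # L2-normalize so set embeddings can be compared with cosine similarity.
    return set_embedding / np.linalg.norm(set_embedding)

# Toy usage: a set of 5 images with 128-D features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 128))
quality = rng.normal(size=5)
print(aggregate_set(feats, quality).shape)  # (128,)
```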

    A Pattern Classification Based approach for Blur Classification

    Get PDF
    Blur type identification is one of the most crucial steps of image restoration. In blind restoration, the blur type is generally assumed to be known prior to restoring the image; however, this is not practical in real applications. Blur type identification is therefore highly desirable before applying a blind restoration technique to a blurred image. An approach to categorize blur into three classes, namely motion, defocus, and combined blur, is presented in this paper. Curvelet-transform-based energy features are used to characterize blur patterns, and a neural network is designed for classification. The simulation results show the preciseness of the proposed approach.
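    A hedged sketch of the classification stage described above: it assumes curvelet subband coefficients have already been computed by an external toolbox, reduces each subband to a single energy value, and trains a small scikit-learn neural network. The feature dimension, network size, and training data are placeholders, not the paper's configuration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def energy_features(subbands):
    """Energy of each curvelet subband, given as a list of coefficient arrays
    (produced by a curvelet toolbox; the transform itself is not shown here)."""
    return np.array([np.sum(np.abs(c) ** 2) / c.size for c in subbands])

# Hypothetical training data: one energy-feature vector per blurred image,
# with labels 0 = motion, 1 = defocus, 2 = combined blur.
X_train = np.random.rand(60, 16)              # placeholder for real curvelet energies
y_train = np.random.randint(0, 3, size=60)    # placeholder labels

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)
print(clf.predict(X_train[:5]))
```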

    Automatic Scaffolding Productivity Measurement through Deep Learning

    Get PDF
    This study developed a method to automatically measure scaffolding productivity by extracting and analysing semantic information from onsite vision data.

    Adaptive Representations for Image Restoration

    Get PDF
    In the field of image processing, building good representation models for natural images is crucial for various applications, such as image restoration, sampling, and segmentation. Adaptive image representation models are designed to describe the intrinsic structures of natural images. In classical Bayesian inference, this representation is often known as the prior on the intensity distribution of the input image. Early image priors took forms such as the total variation norm, Markov Random Fields (MRF), and wavelets. Recently, image priors obtained from machine learning techniques have tended to be more adaptive, aiming to capture natural image models by learning from larger databases. In this thesis, we study adaptive representations of natural images for image restoration.
    The purpose of image restoration is to remove the artifacts which degrade an image. The degradation comes in many forms, such as image blur, noise, and artifacts from the codec. Take image denoising as an example: there are several classic representation methods which can generate state-of-the-art results. The first is the assumption of image self-similarity; however, this representation has the issue that the self-similarity assumption can fail because of high noise levels or unique image contents. The second is the wavelet-based nonlocal representation, which also has a problem in that the fixed basis function is not adaptive enough for arbitrary types of input images. The third is sparse coding using over-complete dictionaries, which lacks the hierarchical structure found in the human visual system and is therefore prone to denoising artifacts.
    My research started from image denoising. Through a thorough review and evaluation of state-of-the-art denoising methods, it was found that the representation of images is substantially important for the denoising technique. At the same time, an improvement on one of the nonlocal denoising methods was proposed, which improves the representation of images through the integration of Gaussian blur, clustering, and Rotationally Invariant Block Matching. Enlightened by the successful application of sparse coding in compressive sensing, we exploited image self-similarity by using a sparse representation based on wavelet coefficients in a nonlocal and hierarchical way, which generates competitive results compared to state-of-the-art denoising algorithms. Meanwhile, another adaptive local filter learned by Genetic Programming (GP) was proposed for efficient image denoising. In this work, we employed GP to find the optimal representations for local image patches through training on massive datasets, which yields competitive results compared to state-of-the-art local denoising filters. After successfully dealing with the denoising part, we moved to parameter estimation for image degradation models: for instance, image blur identification using deep learning, which has recently been proposed as a popular image representation approach. This work was also extended to blur estimation by replacing the second step of the framework with a general regression neural network. In summary, in this thesis, spatial correlations, sparse coding, genetic programming, and deep learning are explored as adaptive image representation models for both image restoration and parameter estimation.
    We conclude this thesis by considering methods based on machine learning to be the best adaptive representations for natural images. We have shown that they can generate better results than conventional representation models for the tasks of image denoising and deblurring.
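    To make the sparse-coding idea mentioned above concrete, here is a minimal patch-based sparse-coding denoiser built on scikit-learn's dictionary learning (assuming a recent scikit-learn). It is a generic illustration of sparse representation over a learned dictionary, not the thesis's wavelet-domain nonlocal method or its GP-learned filters; patch size, atom count, and regularization are illustrative choices.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d

def denoise_sparse(noisy: np.ndarray, patch_size=(8, 8), n_atoms=64) -> np.ndarray:
    """Denoise a grayscale image with patch-wise sparse coding over a learned dictionary."""
    patches = extract_patches_2d(noisy, patch_size)
    flat = patches.reshape(len(patches), -1)
    mean = flat.mean(axis=1, keepdims=True)
    flat = flat - mean                                   # work on zero-mean patches

    dico = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=1.0,
                                       max_iter=50, random_state=0)
    code = dico.fit(flat).transform(flat)                # sparse code per patch
    recon = (code @ dico.components_) + mean             # reconstruct and restore the mean
    return reconstruct_from_patches_2d(recon.reshape(patches.shape), noisy.shape)

# Toy usage on a synthetic noisy image.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0, 1, 64), (64, 1))
noisy = clean + 0.1 * rng.normal(size=clean.shape)
print(denoise_sparse(noisy).shape)  # (64, 64)
```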

    Blur Classification Using Segmentation Based Fractal Texture Analysis

    Get PDF
    The objective of vision-based gesture recognition is to design a system which can understand human actions and convey the acquired information with the help of captured images. An image restoration approach is required whenever an image becomes blurred during the acquisition process, since blurred images can severely degrade the performance of such systems. Image restoration recovers the true image from a degraded version; it is referred to as blind restoration if the blur information is unknown. Blur identification is essential before applying any blind restoration algorithm. This paper presents a blur identification approach which categorizes a hand gesture image into one of four categories: sharp, motion blur, defocus blur, and combined blur. A segmentation-based fractal texture analysis (SFTA) feature extraction algorithm is used to provide features for the neural-network-based classification system. The simulation results demonstrate the preciseness of the proposed method.
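    The sketch below gives a rough, assumption-heavy impression of SFTA-style feature extraction: multi-level Otsu thresholding followed by a box-counting fractal dimension, mean intensity, and region size per thresholded region. It loosely follows the spirit of SFTA rather than reproducing the published algorithm; the threshold count and feature set are illustrative.

```python
import numpy as np
from skimage.filters import threshold_multiotsu

def box_count_dimension(mask: np.ndarray) -> float:
    """Estimate the box-counting (fractal) dimension of a binary mask."""
    sizes, counts = [2, 4, 8, 16], []
    for s in sizes:
        h, w = mask.shape[0] // s * s, mask.shape[1] // s * s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))
    slope, _ = np.polyfit(np.log(sizes), np.log(np.maximum(counts, 1)), 1)
    return -slope

def sfta_like_features(gray: np.ndarray, n_thresholds: int = 3) -> np.ndarray:
    """For each multi-Otsu threshold: fractal dimension, mean intensity, region size."""
    feats = []
    for t in threshold_multiotsu(gray, classes=n_thresholds + 1):
        mask = gray > t
        feats += [box_count_dimension(mask),
                  gray[mask].mean() if mask.any() else 0.0,
                  mask.mean()]
    return np.array(feats)

# Toy usage on a random grayscale image.
rng = np.random.default_rng(0)
print(sfta_like_features(rng.random((64, 64))).shape)  # (9,) for 3 thresholds
```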

    A Multi-view Impartial Decision Network for Frontotemporal Dementia Diagnosis

    Full text link
    Frontotemporal Dementia (FTD) diagnosis has progressed successfully using deep learning techniques. However, current FTD identification methods suffer from two limitations. Firstly, they do not exploit the potential of multi-view functional magnetic resonance imaging (fMRI) for classifying FTD. Secondly, they do not consider the reliability of the multi-view FTD diagnosis. To address these limitations, we propose a reliable multi-view impartial decision network (MID-Net) for FTD diagnosis in fMRI. Our MID-Net provides confidence for each view and generates a reliable prediction without any conflict. To achieve this, we employ multiple expert models to extract evidence from the abundant neural network information contained in fMRI images. We then introduce the Dirichlet distribution to characterize the expert class probability distribution at the evidence level. Additionally, a novel Impartial Decision Maker (IDer) is proposed to combine the different opinions inductively to arrive at an unbiased prediction without additional computational cost. Overall, our MID-Net dynamically integrates the decisions of different experts on FTD disease, especially when dealing with multi-view high-conflict cases. Extensive experiments on a high-quality FTD fMRI dataset demonstrate that our model outperforms previous methods and provides high uncertainty for hard-to-classify examples. We believe that our approach represents a significant step toward the deployment of reliable FTD decision-making under multi-expert conditions. We will release the code for reproduction after acceptance.
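    The paper's IDer is not specified in the abstract, so the sketch below only illustrates the general evidential ingredients mentioned above: mapping per-view evidence to a Dirichlet opinion (per-class beliefs plus an explicit uncertainty mass) and combining two views with a Dempster-style rule, as is common in evidential multi-view classification. Treat it as an assumption-labeled illustration, not MID-Net's actual fusion.

```python
import numpy as np

def dirichlet_opinion(evidence: np.ndarray):
    """Map non-negative per-class evidence e to an opinion (beliefs, uncertainty).

    With alpha = e + 1 and S = sum(alpha): belief b_k = e_k / S, uncertainty u = K / S.
    """
    alpha = evidence + 1.0
    S = alpha.sum()
    return evidence / S, len(alpha) / S

def combine_two_views(b1, u1, b2, u2):
    """Dempster-style reduced combination of two evidential opinions."""
    conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)   # mass on disagreeing classes
    norm = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / norm
    u = (u1 * u2) / norm
    return b, u

# Toy usage: two views, three diagnosis classes, deliberately in conflict.
b1, u1 = dirichlet_opinion(np.array([4.0, 1.0, 0.5]))
b2, u2 = dirichlet_opinion(np.array([0.5, 0.5, 6.0]))
b, u = combine_two_views(b1, u1, b2, u2)
print(b, u)   # combined beliefs and the remaining uncertainty mass
```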

    Pedestrian Attribute Recognition: A Survey

    Full text link
    Recognizing pedestrian attributes is an important task in the computer vision community because it plays an important role in video surveillance. Many algorithms have been proposed to handle this task. The goal of this paper is to review existing works, whether based on traditional methods or on deep learning networks. Firstly, we introduce the background of pedestrian attribute recognition (PAR, for short), including the fundamental concepts of pedestrian attributes and the corresponding challenges. Secondly, we introduce existing benchmarks, including popular datasets and evaluation criteria. Thirdly, we analyse the concepts of multi-task learning and multi-label learning, and explain the relations between these two learning paradigms and pedestrian attribute recognition. We also review some popular network architectures which have been widely applied in the deep learning community. Fourthly, we analyse popular solutions for this task, such as attribute grouping, part-based models, \emph{etc}. Fifthly, we show some applications which take pedestrian attributes into consideration and achieve better performance. Finally, we summarize this paper and give several possible research directions for pedestrian attribute recognition. The project page of this paper can be found at the following website: \url{https://sites.google.com/view/ahu-pedestrianattributes/}. Comment: Check our project page for the high-resolution version of this survey: https://sites.google.com/view/ahu-pedestrianattributes
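    As a small illustration of the multi-label formulation the survey relates to pedestrian attribute recognition, the sketch below attaches a sigmoid-per-attribute head to an assumed backbone feature vector and trains it with binary cross-entropy; the attribute count and feature dimension are placeholders, not taken from any specific benchmark.

```python
import torch
import torch.nn as nn

N_ATTRIBUTES = 26          # hypothetical: gender, backpack, hat, ... (dataset-dependent)
BACKBONE_DIM = 512         # assumed feature size from any CNN backbone

# Multi-label head: one independent sigmoid output per attribute.
head = nn.Linear(BACKBONE_DIM, N_ATTRIBUTES)
criterion = nn.BCEWithLogitsLoss()

features = torch.randn(8, BACKBONE_DIM)                  # a batch of pedestrian features
targets = torch.randint(0, 2, (8, N_ATTRIBUTES)).float() # binary attribute labels
loss = criterion(head(features), targets)
predictions = torch.sigmoid(head(features)) > 0.5        # independent decision per attribute
print(loss.item(), predictions.shape)
```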