Construction of Latent Descriptor Space and Inference Model of Hand-Object Interactions
Appearance-based generic object recognition is a challenging problem because
all possible appearances of objects cannot be registered, especially as new
objects are produced every day. Function of objects, however, has a
comparatively small number of prototypes. Therefore, function-based
classification of new objects could be a valuable tool for generic object
recognition. Object functions are closely related to hand-object interactions
during handling of a functional object; i.e., how the hand approaches the
object, which parts of the object contact the hand, and the shape of the
hand during interaction. Hand-object interactions are helpful for modeling
object functions. However, it is difficult to assign discrete labels to
interactions because object shapes and grasping hand postures vary
continuously. To describe these interactions, we propose an interaction
descriptor space that is acquired from unlabeled appearances of
human hand-object interactions. By using interaction descriptors, we can
numerically describe the relation between an object's appearance and its
possible interaction with the hand. The model infers the quantitative state of
the interaction from the object image alone. It also identifies the parts of
objects designed for hand interactions, such as grips and handles. We
demonstrate that the proposed method can generate, without supervision,
interaction descriptors that form clusters corresponding to interaction types.
We also demonstrate that the model can infer possible hand-object interactions
Face Centered Image Analysis Using Saliency and Deep Learning Based Techniques
Image analysis aims to build vision machines that perceive like humans, intelligently inferring general principles and sensing the surrounding situation from imagery. This dissertation studies face-centered image analysis as a core problem in high-level computer vision research and addresses it by tackling three challenging subjects: Is there anything interesting in the image? If there is, what is it? If a person is present, who is he/she? What expression is he/she showing? Can we know his/her age? Answering these questions leads to saliency-based object detection, deep-learning-based object categorization and recognition, human facial landmark detection, and multi-task biometrics.
To implement object detection, a three-level saliency detection method based on the self-similarity technique (SMAP) is first proposed. The first level of SMAP uses statistical methods to generate proto-background patches. The second level computes local contrast based on image self-similarity characteristics. Finally, a spatial color distribution constraint is applied to produce the saliency map. The output of the algorithm is a full-resolution image with highlighted salient objects and well-defined edges.
In object recognition, the Adaptive Deconvolution Network (ADN) is used to categorize the objects extracted by saliency detection. To improve system performance, an L1/2 norm regularized ADN is proposed and tested in different applications. The results demonstrate the efficiency and significance of the new structure.
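As a rough illustration of why an L1/2 penalty might be chosen, the sketch below compares it with an L1 penalty on two weight vectors. It assumes that "L1/2 norm" refers to the common penalty formed by summing the square roots of the absolute weights; the abstract does not give the exact formulation, and the function names and the `lam` parameter are hypothetical.

```python
import math


def l_half_penalty(weights, lam=1.0):
    """Hypothetical helper: lam * sum(|w|^(1/2)), assumed form of the L1/2 penalty."""
    return lam * sum(math.sqrt(abs(w)) for w in weights)


def l1_penalty(weights, lam=1.0):
    """Hypothetical helper: lam * sum(|w|), the usual L1 penalty, for comparison."""
    return lam * sum(abs(w) for w in weights)


# Two weight vectors with the same L1 norm but different sparsity.
dense = [0.25, 0.25, 0.25, 0.25]
sparse = [1.0, 0.0, 0.0, 0.0]

if __name__ == "__main__":
    print("L1   dense/sparse:", l1_penalty(dense), l1_penalty(sparse))
    print("L1/2 dense/sparse:", l_half_penalty(dense), l_half_penalty(sparse))
```

Under this assumed form, the L1 penalty cannot tell the two vectors apart, while the L1/2 penalty is lower for the sparse one, which is the usual motivation for using it to encourage sparser representations.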
To fully understand the facial-biometrics-related activity contained in the image, low-rank matrix decomposition is introduced to help locate landmark points in face images. A natural extension of this work would benefit research on human facial expression recognition and facial feature parsing.
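The general idea of a low-rank decomposition can be sketched with a truncated SVD, which splits a matrix into a low-rank part and a residual. This is only a generic illustration on synthetic data, not the landmark-localization method of the dissertation; the function name `low_rank_approx` and the test matrix are assumptions made for the example.

```python
import numpy as np


def low_rank_approx(matrix, rank):
    """Best rank-`rank` approximation in the Frobenius norm via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the leading `rank` singular triplets.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Synthetic data: an exactly rank-2 matrix plus small dense noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 10))
noisy = base + 0.01 * rng.normal(size=base.shape)

approx = low_rank_approx(noisy, rank=2)   # low-rank component
residual = noisy - approx                 # what the low-rank part does not explain

if __name__ == "__main__":
    print("rank of approximation:", np.linalg.matrix_rank(approx))
    print("residual norm:", np.linalg.norm(residual))
```

The low-rank component captures the shared structure across columns, while the residual collects deviations from it; methods built on this idea differ mainly in how they constrain the residual.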
To facilitate understanding of the detected facial image, automatic facial image analysis becomes essential. We present a novel deeply learnt tree-structured face representation that uniformly models the human face with different semantic meanings. We show that the proposed feature yields a unified representation for multi-task facial biometrics and that the multi-task learning framework is applicable to many other computer vision tasks
A survey of face recognition techniques under occlusion
The limited capacity to recognize faces under occlusions is a long-standing
problem that presents a unique challenge for face recognition systems and even
for humans. The occlusion problem has received less research attention than
other challenges such as pose variation and varying expressions.
Nevertheless, occluded face recognition is imperative to exploit the full
potential of face recognition for real-world applications. In this paper, we
restrict the scope to occluded face recognition. First, we explore what the
occlusion problem is and what inherent difficulties can arise. As a part of
this review, we introduce face detection under occlusion, a preliminary step in
face recognition. Second, we present how existing face recognition methods cope
with the occlusion problem and classify them into three categories, which are
1) occlusion robust feature extraction approaches, 2) occlusion aware face
recognition approaches, and 3) occlusion recovery based face recognition
approaches. Furthermore, we analyze the motivations, innovations, pros and
cons, and the performance of representative approaches for comparison. Finally,
future challenges and method trends of occluded face recognition are thoroughly
discussed