F-formation Detection: Individuating Free-standing Conversational Groups in Images
Detection of groups of interacting people is a very interesting and useful
task in many modern technologies, with application fields spanning from
video-surveillance to social robotics. In this paper we first furnish a
rigorous definition of group considering the background of the social sciences:
this allows us to specify many kinds of group, so far neglected in the Computer
Vision literature. On top of this taxonomy, we present a detailed state of the
art on the group detection algorithms. Then, as a main contribution, we present
a brand new method for the automatic detection of groups in still images, which
is based on a graph-cuts framework for clustering individuals; in particular we
are able to codify in a computational sense the sociological definition of
F-formation, which is very useful for encoding a group from proxemic
information alone: the position and orientation of people. We call the proposed method
Graph-Cuts for F-formation (GCFF). We show that GCFF clearly outperforms all
the state of the art methods in terms of different accuracy measures (some of
them are brand new), demonstrating also a strong robustness to noise and
versatility in recognizing groups of various cardinality. Comment: 32 pages, submitted to PLOS On
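As a rough illustration of how position and orientation alone can suggest a group, the sketch below lets each person "vote" for a point a fixed distance in front of them and greedily merges people whose votes nearly coincide. This is not the GCFF graph-cuts method: the voting idea, the greedy merge, and the names `o_space_vote`, `group_by_votes`, `stride` and `tol` are all assumptions made for this example.

```python
import math

def o_space_vote(x, y, theta, stride=1.0):
    """Point `stride` units in front of a person at (x, y) facing `theta` radians."""
    return (x + stride * math.cos(theta), y + stride * math.sin(theta))

def group_by_votes(people, stride=1.0, tol=0.5):
    """Greedy stand-in for clustering (NOT graph cuts).

    `people` is a list of (x, y, theta) tuples. A person joins the first
    group whose mean vote lies within `tol` of their own vote; otherwise
    they start a new group. Returns a list of lists of person indices.
    """
    groups = []   # person indices per group
    centers = []  # running mean vote per group
    for i, (x, y, theta) in enumerate(people):
        vx, vy = o_space_vote(x, y, theta, stride)
        for g, (cx, cy) in enumerate(centers):
            if math.hypot(vx - cx, vy - cy) <= tol:
                groups[g].append(i)
                n = len(groups[g])
                # Incrementally update the mean vote of the group.
                centers[g] = (cx + (vx - cx) / n, cy + (vy - cy) / n)
                break
        else:
            groups.append([i])
            centers.append((vx, vy))
    return groups

# Two people facing each other two units apart, plus one person far away.
people = [(0.0, 0.0, 0.0), (2.0, 0.0, math.pi), (10.0, 10.0, 0.0)]
groups = group_by_votes(people)
```

With these toy inputs the first two people vote for almost the same point and end up together, while the third person forms a singleton group. The result of a greedy merge depends on input order, which is one reason a global formulation is attractive.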
View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation
The primate brain contains a hierarchy of visual areas, dubbed the ventral
stream, which rapidly computes object representations that are both specific
for object identity and relatively robust against identity-preserving
transformations like depth-rotations. Current computational models of object
recognition, including recent deep learning networks, generate these properties
through a hierarchy of alternating selectivity-increasing filtering and
tolerance-increasing pooling operations, similar to simple-complex cells
operations. While simulations of these models recapitulate the ventral stream's
progression from early view-specific to late view-tolerant representations,
they fail to generate the most salient property of the intermediate
representation for faces found in the brain: mirror-symmetric tuning of the
neural population to head orientation. Here we prove that a class of
hierarchical architectures and a broad set of biologically plausible learning
rules can provide approximate invariance at the top level of the network. While
most of the learning rules do not yield mirror-symmetry in the mid-level
representations, we characterize a specific biologically-plausible Hebb-type
learning rule that is guaranteed to generate mirror-symmetric tuning to faces
at intermediate levels of the architecture.
Face recognition technologies for evidential evaluation of video traces
Human recognition from video traces is an important task in forensic investigations and evidence evaluations. Compared with other biometric traits, the face is one of the most widely used modalities for human recognition because its collection is non-intrusive and requires less cooperation from the subjects. Moreover, face images taken at a long distance can still provide reasonable resolution, while most biometric modalities, such as iris and fingerprint, do not have this merit. In this chapter, we discuss automatic face recognition technologies for evidential evaluations of video traces. We first introduce the general concepts in both forensic and automatic face recognition, then analyse the difficulties in face recognition from videos. We summarise and categorise the approaches for handling different uncontrollable factors in difficult recognition conditions. Finally, we discuss some challenges and trends in face recognition research in both forensics and biometrics. Given its merits tested in many deployed systems and great potential in other emerging applications, considerable research and development efforts are expected to be devoted to face recognition in the near future.
Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation
This paper proposes a new hybrid architecture that consists of a deep
Convolutional Network and a Markov Random Field. We show how this architecture
is successfully applied to the challenging problem of articulated human pose
estimation in monocular images. The architecture can exploit structural domain
constraints such as geometric relationships between body joint locations. We
show that joint training of these two model paradigms improves performance and
allows us to significantly outperform existing state-of-the-art techniques.
Reference face graph for face recognition
Face recognition has been studied extensively; however, real-world face recognition still remains a challenging task. The demand for unconstrained practical face recognition is rising with the explosion of online multimedia such as social networks, and video surveillance footage where face analysis is of significant importance. In this paper, we approach face recognition in the context of graph theory. We recognize an unknown face using an external reference face graph (RFG). An RFG is generated and recognition of a given face is achieved by comparing it to the faces in the constructed RFG. Centrality measures are utilized to identify distinctive faces in the reference face graph. The proposed RFG-based face recognition algorithm is robust to changes in pose and is also alignment-free. The RFG recognition is used in conjunction with DCT locality sensitive hashing for efficient retrieval to ensure scalability. Experiments are conducted on several publicly available databases and the results show that the proposed approach outperforms the state-of-the-art methods without requiring any preprocessing such as face alignment. Due to the richness in the reference set construction, the proposed method can also handle illumination and expression variation.
Robust Adaptive Median Binary Pattern for noisy texture classification and retrieval
Texture is an important cue for different computer vision tasks and
applications. Local Binary Pattern (LBP) is considered one of the best yet
efficient texture descriptors. However, LBP has some notable limitations,
chiefly its sensitivity to noise. In this paper, we address these limitations by
introducing a novel texture descriptor, Robust Adaptive Median Binary Pattern
(RAMBP). RAMBP is based on a classification process for noisy pixels, an adaptive
analysis window, scale analysis and image-region median comparison. The
proposed method handles images with highly noisy textures and increases the
discriminative properties by capturing microstructure and macrostructure
texture information. The proposed method has been evaluated on popular texture
datasets for classification and retrieval tasks, and under different high noise
conditions. Without any training or prior knowledge of the noise type, RAMBP achieved
the best classification compared to state-of-the-art techniques. It scored more
than under impulse noise densities, more than under
Gaussian noised textures with standard deviation , and more than
under Gaussian blurred textures with standard deviation .
The proposed method yielded competitive results and high performance as one of
the best descriptors in noise-free texture classification. Furthermore, RAMBP
also showed high performance on the problem of noisy texture retrieval,
providing high recall and precision for textures with high
levels of noise.
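To make the noise sensitivity of LBP concrete, here is a minimal sketch of the classic 8-neighbour binary pattern on a 3x3 patch, with an optional switch that thresholds against the patch median instead of the centre pixel. It is not RAMBP itself (there is no noisy-pixel classification, adaptive window or scale analysis); the function name `binary_pattern`, the bit ordering and the toy images are assumptions for illustration.

```python
from statistics import median

# Clockwise 8-neighbour offsets (row, col), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def binary_pattern(img, r, c, use_median=False):
    """8-bit code for the interior pixel (r, c) of a 2D list `img`.

    Each neighbour contributes one bit, set when the neighbour is >= the
    threshold. The threshold is the centre pixel (classic LBP) or, if
    `use_median` is True, the median of the whole 3x3 patch.
    """
    neighbours = [img[r + dr][c + dc] for dr, dc in OFFSETS]
    threshold = median(neighbours + [img[r][c]]) if use_median else img[r][c]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= threshold:
            code |= 1 << bit
    return code

clean = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
# Same patch with the centre pixel corrupted by an impulse (salt) value.
noisy = [[10, 20, 30],
         [40, 255, 60],
         [70, 80, 90]]

clean_code = binary_pattern(clean, 1, 1)
noisy_code = binary_pattern(noisy, 1, 1)
noisy_median_code = binary_pattern(noisy, 1, 1, use_median=True)
```

In this toy case a single corrupted centre pixel wipes out the classic code, while the median-thresholded code matches the clean one, which hints at why median comparisons help under impulse noise.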
Gradient-orientation-based PCA subspace for novel face recognition
This article has been made available through the Brunel Open Access Publishing Fund. Face recognition is an interesting and challenging problem that has been widely studied in the field of pattern recognition and computer vision. It has many applications such as biometric authentication, video surveillance, and others. In the past decade, several methods for face recognition were proposed. However, these methods suffer from pose and illumination variations. In order to address these problems, this paper proposes a novel methodology to recognize face images. Since image gradients are invariant to illumination and pose variations, the proposed approach uses gradient orientation to handle these effects. The Schur decomposition is used for matrix decomposition and then Schurvalues and Schurvectors are extracted for subspace projection. We call this subspace projection of face features Schurfaces, which are numerically stable and able to handle defective matrices. The Hausdorff distance is used with the nearest neighbor classifier to measure the similarity between different faces. Experiments are conducted with the Yale face database and the ORL face database. The results show that the proposed approach is highly discriminant and achieves promising face recognition accuracy compared with the state-of-the-art approaches.
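The matching step described above, a Hausdorff distance combined with a nearest neighbor classifier, can be sketched in a few lines. The example below works on small 2D point sets rather than Schurfaces features; the function names, the gallery layout and the toy data are assumptions, not the paper's implementation.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its closest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def nearest_neighbour(query, gallery):
    """Return the label in `gallery` (dict: label -> point set) closest to `query`."""
    return min(gallery, key=lambda label: hausdorff(query, gallery[label]))

# Toy gallery with hypothetical labels and 2D feature points.
gallery = {
    "alice": [(0.0, 0.0), (1.0, 0.0)],
    "bob": [(5.0, 5.0), (6.0, 5.0)],
}
query = [(0.0, 0.1), (1.0, 0.0)]
label = nearest_neighbour(query, gallery)
d = hausdorff([(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)])
```

Because the Hausdorff distance takes a maximum over points, a single outlying point (such as `(1.0, 3.0)` above) dominates the result, so in practice the choice of feature points matters a great deal.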