Unsupervised learning of clutter-resistant visual representations from natural videos
Populations of neurons in inferotemporal cortex (IT) maintain an explicit
code for object identity that also tolerates transformations of object
appearance, e.g., position, scale, viewing angle [1, 2, 3]. Though the learning
rules are not known, recent results [4, 5, 6] suggest the operation of an
unsupervised temporal-association-based method, e.g., Foldiak's trace rule [7].
Such methods exploit the temporal continuity of the visual world by assuming
that visual experience over short timescales will tend to have invariant
identity content. Thus, by associating representations of frames from nearby
times, a representation that tolerates whatever transformations occurred in the
video may be achieved. Many previous studies verified that such rules can work
in simple situations without background clutter, but the presence of visual
clutter has remained problematic for this approach. Here we show that temporal
association based on large class-specific filters (templates) avoids the
problem of clutter. Our system learns in an unsupervised way from natural
videos gathered from the internet, and is able to perform a difficult
unconstrained face recognition task on natural images: Labeled Faces in the
Wild [8].
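The temporal-association idea in the abstract above can be illustrated with a minimal sketch of a trace-rule update for a single linear unit. This is not the paper's system (which uses large class-specific templates); the function name `trace_rule_update`, the learning rate `alpha`, and the trace decay `delta` are hypothetical, and the update form (a running average of activity gating a Hebbian-style weight change) is one common textbook statement of the trace rule.

```python
def trace_rule_update(w, frames, alpha=0.1, delta=0.2):
    """Sketch of a trace-rule pass over consecutive video frames.

    w      : initial weights of one linear unit (list of floats)
    frames : consecutive input vectors, assumed to show the same object
    alpha  : learning rate (illustrative value)
    delta  : how strongly the current activity enters the trace
    """
    w = list(w)
    trace = 0.0  # running average of the unit's recent activity
    for x in frames:
        y = sum(wi * xi for wi, xi in zip(w, x))      # instantaneous response
        trace = (1.0 - delta) * trace + delta * y     # temporal trace of activity
        # The weight change is gated by the trace rather than by y alone,
        # so a frame that follows a strongly driven frame is also learned.
        w = [wi + alpha * trace * (xi - wi) for wi, xi in zip(w, x)]
    return w


# Two consecutive frames: the unit initially responds only to the first one.
w = trace_rule_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], alpha=0.5, delta=0.5)
print(w)
```

In this toy run the second frame drives no instantaneous response, yet its weight still grows because the trace carries over activity from the first frame; a purely instantaneous Hebbian rule would leave the weights unchanged on that frame.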
Automatic landmark annotation and dense correspondence registration for 3D human facial images
Dense surface registration of three-dimensional (3D) human facial images
holds great potential for studies of human trait diversity, disease genetics,
and forensics. Non-rigid registration is particularly useful for establishing
dense anatomical correspondences between faces. Here we describe a novel
non-rigid registration method for fully automatic 3D facial image mapping. This
method comprises two steps: first, seventeen facial landmarks are automatically
annotated, mainly via PCA-based feature recognition following 3D-to-2D data
transformation. Second, an efficient thin-plate spline (TPS) protocol is used
to establish the dense anatomical correspondence between facial images, under
the guidance of the predefined landmarks. We demonstrate that this method is
robust and highly accurate, even for different ethnicities. The average face is
calculated for individuals of Han Chinese and Uyghur origins. While fully
automatic and computationally efficient, this method enables high-throughput
analysis of human facial feature variation.
Comment: 33 pages, 6 figures, 1 tabl
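As a rough illustration of landmark-guided thin-plate spline warping, the sketch below fits a TPS that sends each source landmark exactly onto its target landmark and then applies the same mapping to any other point. It is a simplification under stated assumptions, not the paper's protocol: it works in 2D with the kernel U(r) = r^2 log r (the paper registers 3D surfaces), uses five made-up landmarks rather than seventeen, and the names `fit_tps`, `_kernel`, and `_solve` are hypothetical.

```python
import math


def _kernel(r2):
    """2D thin-plate kernel U(r) = r^2 log r, given the squared distance r2."""
    return 0.0 if r2 == 0.0 else 0.5 * r2 * math.log(r2)


def _solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def fit_tps(src, dst):
    """Return warp(x, y) mapping every src landmark onto its dst landmark.

    Assumes the landmarks are distinct and not all collinear.
    """
    n = len(src)
    A = [[0.0] * (n + 3) for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            A[i][j] = _kernel((xi - xj) ** 2 + (yi - yj) ** 2)
        for k, v in enumerate((1.0, xi, yi)):  # affine part and side conditions
            A[i][n + k] = v
            A[n + k][i] = v
    # One coefficient vector per output coordinate (x, then y).
    coeffs = [_solve(A, [p[d] for p in dst] + [0.0, 0.0, 0.0]) for d in range(2)]

    def warp(x, y):
        out = []
        for c in coeffs:
            bend = sum(c[i] * _kernel((x - sx) ** 2 + (y - sy) ** 2)
                       for i, (sx, sy) in enumerate(src))
            out.append(bend + c[n] + c[n + 1] * x + c[n + 2] * y)
        return tuple(out)

    return warp


# Illustrative landmarks: four fixed corners and one displaced interior point.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
dst = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.6, 0.4)]
warp = fit_tps(src, dst)
print(warp(0.5, 0.5))  # lands on the target landmark, up to rounding
```

Points between landmarks are carried along smoothly by the fitted spline, which is the sense in which a sparse set of landmarks can guide a dense correspondence.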
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetic
powered by deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers on object detection in light of its technical evolution, spanning
over a quarter-century (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, and text detection, and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio