Iris Indexing and Ear Classification
To identify an individual using a biometric system, the input biometric data typically has to be compared against that of every identity in the existing database during the matching stage. The response time of the system grows with the number of enrolled individuals (i.e., the database size), which is not acceptable for real-time monitoring or for large-scale data. This thesis addresses the problem of reducing the number of database candidates to be considered during matching in the context of iris and ear recognition. In the case of iris, an indexing mechanism based on the Burrows-Wheeler Transform (BWT) is proposed. Experiments on the CASIA version 3 iris database show a significant reduction in both search time and search space, suggesting the potential of this scheme for indexing iris databases. The ear classification scheme proposed in the thesis is based on parameterizing the shape of the ear and assigning it to one of four classes: round, rectangle, oval and triangle. Experiments on the MAGNA database suggest the potential of this scheme for classifying ear databases.
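For readers unfamiliar with the transform named above, the following is a minimal sketch of a naive Burrows-Wheeler Transform and its inverse. It is not the thesis's indexing mechanism, which the abstract does not describe; the function names, the `$` sentinel, and the toy binary string standing in for an iris code are all assumptions made for illustration.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column.

    Assumption: `sentinel` does not occur in `text` and sorts before every
    other character (true for "$" versus digits and ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        table = sorted(transformed[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    toy_iris_code = "0110100110"  # hypothetical stand-in for a binary iris code
    transformed = bwt(toy_iris_code)
    print(transformed)
    print(inverse_bwt(transformed) == toy_iris_code)
```

The sorted-rotations construction is quadratic in memory and only suitable for short strings; a practical index would typically build the transform from a suffix array instead.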
End-to-end Recovery of Human Shape and Pose
We describe Human Mesh Recovery (HMR), an end-to-end framework for
reconstructing a full 3D mesh of a human body from a single RGB image. In
contrast to most current methods that compute 2D or 3D joint locations, we
produce a richer and more useful mesh representation that is parameterized by
shape and 3D joint angles. The main objective is to minimize the reprojection
loss of keypoints, which allows our model to be trained using images in-the-wild
that only have ground truth 2D annotations. However, the reprojection loss
alone leaves the model highly underconstrained. In this work we address this
problem by introducing an adversary trained to tell whether a human body
parameter is real or not using a large database of 3D human meshes. We show
that HMR can be trained with and without using any paired 2D-to-3D supervision.
We do not rely on intermediate 2D keypoint detections and infer 3D pose and
shape parameters directly from image pixels. Our model runs in real-time given
a bounding box containing the person. We demonstrate our approach on various
images in-the-wild and outperform previous optimization-based methods that
output 3D meshes and show competitive results on tasks such as 3D joint
location estimation and part segmentation.
Comment: CVPR 2018, Project page with code: https://akanazawa.github.io/hmr
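The keypoint reprojection loss mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the camera model or the norm, so the weak-perspective projection (a scale plus a 2D translation), the L1 distance, and the per-joint visibility flags below are assumptions, and all function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def project_weak_perspective(
    joints3d: Sequence[Point3], scale: float, trans: Point2
) -> List[Point2]:
    """Drop the depth coordinate, then apply a scale and a 2D translation."""
    return [(scale * x + trans[0], scale * y + trans[1]) for x, y, _ in joints3d]


def reprojection_loss(
    joints3d: Sequence[Point3],
    keypoints2d: Sequence[Point2],
    visible: Sequence[int],
    scale: float,
    trans: Point2,
) -> float:
    """Sum of L1 distances between projected joints and annotated 2D keypoints.

    Joints whose visibility flag is 0 contribute nothing to the loss.
    """
    projected = project_weak_perspective(joints3d, scale, trans)
    total = 0.0
    for (px, py), (gx, gy), v in zip(projected, keypoints2d, visible):
        if v:
            total += abs(px - gx) + abs(py - gy)
    return total


if __name__ == "__main__":
    joints = [(0.0, 0.0, 1.0), (1.0, 2.0, 5.0)]
    annotated = [(1.0, 1.0), (4.0, 5.0)]
    print(reprojection_loss(joints, annotated, [1, 1], scale=2.0, trans=(1.0, 1.0)))
```

Because the projection discards depth, many different 3D poses yield the same loss value, which is one way to see why the abstract calls this objective underconstrained on its own.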
Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models
We present a new deep learning architecture (called Kd-network) that is
designed for 3D model recognition tasks and works with unstructured point
clouds. The new architecture performs multiplicative transformations and shares
parameters of these transformations according to the subdivisions of the point
clouds imposed onto them by Kd-trees. Unlike the currently dominant
convolutional architectures that usually require rasterization on uniform
two-dimensional or three-dimensional grids, Kd-networks do not rely on such
grids in any way and therefore avoid poor scaling behaviour. In a series of
experiments with popular shape recognition benchmarks, Kd-networks demonstrate
competitive performance in a number of shape recognition tasks such as shape
classification, shape retrieval and shape part segmentation.
Comment: Spotlight at ICCV'1
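As a rough illustration of the Kd-tree subdivision that the abstract refers to, the sketch below builds a balanced Kd-tree over a small point cloud. The split rule (the axis with the largest coordinate range, cut at the median) and the dictionary-based node layout are assumptions chosen for the example. The network itself, including how parameters are shared across nodes, is not shown.

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[float, ...]


def build_kd_tree(points: Sequence[Point], leaf_size: int = 1) -> Dict:
    """Recursively split a point cloud into a balanced Kd-tree.

    Assumed split rule: pick the axis with the largest coordinate range
    and cut the points at the median along that axis.
    """
    if len(points) <= leaf_size:
        return {"leaf": True, "points": list(points)}
    dims = len(points[0])
    axis = max(
        range(dims),
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "leaf": False,
        "axis": axis,                      # split direction at this node
        "threshold": ordered[mid][axis],   # first coordinate of the right half
        "left": build_kd_tree(ordered[:mid], leaf_size),
        "right": build_kd_tree(ordered[mid:], leaf_size),
    }


def count_leaves(node: Dict) -> int:
    """Count the leaf nodes of a tree produced by build_kd_tree."""
    if node["leaf"]:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (10.0, 1.0, 0.0), (2.0, 0.0, 5.0), (9.0, 1.0, 4.0)]
    tree = build_kd_tree(cloud)
    print(tree["axis"], tree["left"]["axis"], tree["right"]["axis"], count_leaves(tree))
```

The split axis recorded at each internal node is the kind of structural information a tree-based architecture could use to decide which transformation to apply at that node.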
Video browsing interfaces and applications: a review
We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data (which, if presented in its raw format, is rather unwieldy and costly) have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other.