3,207 research outputs found
Boosted Random ferns for object detection
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.In this paper we introduce the Boosted Random Ferns (BRFs) to rapidly build discriminative classifiers for learning and detecting object categories. At the core of our approach we use standard random ferns, but we introduce four main innovations that let us bring ferns from an instance to a category level, and still retain efficiency. First, we define binary features on the histogram of oriented gradients-domain (as opposed to intensity-), allowing for a better representation of intra-class variability. Second, both the positions where ferns are evaluated within the sliding window, and the location of the binary features for each fern are not chosen completely at random, but instead we use a boosting strategy to pick the most discriminative combination of them. This is further enhanced by our third contribution, that is to adapt the boosting strategy to enable sharing of binary features among different ferns, yielding high recognition rates at a low computational cost. And finally, we show that training can be performed online, for sequentially arriving images. Overall, the resulting classifier can be very efficiently trained, densely evaluated for all image locations in about 0.1 seconds, and provides detection rates similar to competing approaches that require expensive and significantly slower processing times. We demonstrate the effectiveness of our approach by thorough experimentation in publicly available datasets in which we compare against state-of-the-art, and for tasks of both 2D detection and 3D multi-view estimation.Peer ReviewedPostprint (author's final draft
Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device
Currently, most designers face a daunting task to
research different design flows and learn the intricacies of
specific software from various manufacturers in
hardware/software co-design. An urgent need of creating a
scalable hardware/software co-design platform has become a key
strategic element for developing hardware/software integrated
systems. In this paper, we propose a new design flow for building
a scalable co-design platform on FPGA-based system-on-chip.
We employ an integrated approach to implement a histogram
oriented gradients (HOG) and a support vector machine (SVM)
classification on a programmable device for pedestrian tracking.
Not only was hardware resource analysis reported, but the
precision and success rates of pedestrian tracking on nine open
access image data sets are also analysed. Finally, our proposed
design flow can be used for any real-time image processingrelated
products on programmable ZYNQ-based embedded
systems, which benefits from a reduced design time and provide a
scalable solution for embedded image processing products
Manipulating Highly Deformable Materials Using a Visual Feedback Dictionary
The complex physical properties of highly deformable materials such as
clothes pose significant challenges fanipulation systems. We present a novel
visual feedback dictionary-based method for manipulating defoor autonomous
robotic mrmable objects towards a desired configuration. Our approach is based
on visual servoing and we use an efficient technique to extract key features
from the RGB sensor stream in the form of a histogram of deformable model
features. These histogram features serve as high-level representations of the
state of the deformable material. Next, we collect manipulation data and use a
visual feedback dictionary that maps the velocity in the high-dimensional
feature space to the velocity of the robotic end-effectors for manipulation. We
have evaluated our approach on a set of complex manipulation tasks and
human-robot manipulation tasks on different cloth pieces with varying material
characteristics.Comment: The video is available at goo.gl/mDSC4
Articulated Clinician Detection Using 3D Pictorial Structures on RGB-D Data
Reliable human pose estimation (HPE) is essential to many clinical
applications, such as surgical workflow analysis, radiation safety monitoring
and human-robot cooperation. Proposed methods for the operating room (OR) rely
either on foreground estimation using a multi-camera system, which is a
challenge in real ORs due to color similarities and frequent illumination
changes, or on wearable sensors or markers, which are invasive and therefore
difficult to introduce in the room. Instead, we propose a novel approach based
on Pictorial Structures (PS) and on RGB-D data, which can be easily deployed in
real ORs. We extend the PS framework in two ways. First, we build robust and
discriminative part detectors using both color and depth images. We also
present a novel descriptor for depth images, called histogram of depth
differences (HDD). Second, we extend PS to 3D by proposing 3D pairwise
constraints and a new method that makes exact inference tractable. Our approach
is evaluated for pose estimation and clinician detection on a challenging RGB-D
dataset recorded in a busy operating room during live surgeries. We conduct
series of experiments to study the different part detectors in conjunction with
the various 2D or 3D pairwise constraints. Our comparisons demonstrate that 3D
PS with RGB-D part detectors significantly improves the results in a visually
challenging operating environment.Comment: The supplementary video is available at https://youtu.be/iabbGSqRSg
Real-time food intake classification and energy expenditure estimation on a mobile device
© 2015 IEEE.Assessment of food intake has a wide range of applications in public health and life-style related chronic disease management. In this paper, we propose a real-time food recognition platform combined with daily activity and energy expenditure estimation. In the proposed method, food recognition is based on hierarchical classification using multiple visual cues, supported by efficient software implementation suitable for realtime mobile device execution. A Fischer Vector representation together with a set of linear classifiers are used to categorize food intake. Daily energy expenditure estimation is achieved by using the built-in inertial motion sensors of the mobile device. The performance of the vision-based food recognition algorithm is compared to the current state-of-the-art, showing improved accuracy and high computational efficiency suitable for realtime feedback. Detailed user studies have also been performed to demonstrate the practical value of the software environment
See the Difference: Direct Pre-Image Reconstruction and Pose Estimation by Differentiating HOG
The Histogram of Oriented Gradient (HOG) descriptor has led to many advances
in computer vision over the last decade and is still part of many state of the
art approaches. We realize that the associated feature computation is piecewise
differentiable and therefore many pipelines which build on HOG can be made
differentiable. This lends to advanced introspection as well as opportunities
for end-to-end optimization. We present our implementation of HOG based
on the auto-differentiation toolbox Chumpy and show applications to pre-image
visualization and pose estimation which extends the existing differentiable
renderer OpenDR pipeline. Both applications improve on the respective
state-of-the-art HOG approaches
Asymmetric Pruning for Learning Cascade Detectors
Cascade classifiers are one of the most important contributions to real-time
object detection. Nonetheless, there are many challenging problems arising in
training cascade detectors. One common issue is that the node classifier is
trained with a symmetric classifier. Having a low misclassification error rate
does not guarantee an optimal node learning goal in cascade classifiers, i.e.,
an extremely high detection rate with a moderate false positive rate. In this
work, we present a new approach to train an effective node classifier in a
cascade detector. The algorithm is based on two key observations: 1) Redundant
weak classifiers can be safely discarded; 2) The final detector should satisfy
the asymmetric learning objective of the cascade architecture. To achieve this,
we separate the classifier training into two steps: finding a pool of
discriminative weak classifiers/features and training the final classifier by
pruning weak classifiers which contribute little to the asymmetric learning
criterion (asymmetric classifier construction). Our model reduction approach
helps accelerate the learning time while achieving the pre-determined learning
objective. Experimental results on both face and car data sets verify the
effectiveness of the proposed algorithm. On the FDDB face data sets, our
approach achieves the state-of-the-art performance, which demonstrates the
advantage of our approach.Comment: 14 page
- …