5,330 research outputs found
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performances on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook, to constructing a discriminant codebook in a one-pass design procedure that slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade with their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and datasets that were used in evaluating the proposed methods
PCA-RECT: An Energy-efficient Object Detection Approach for Event Cameras
We present the first purely event-based, energy-efficient approach for object
detection and categorization using an event camera. Compared to traditional
frame-based cameras, choosing event cameras results in high temporal resolution
(order of microseconds), low power consumption (few hundred mW) and wide
dynamic range (120 dB) as attractive properties. However, event-based object
recognition systems are far behind their frame-based counterparts in terms of
accuracy. To this end, this paper presents an event-based feature extraction
method devised by accumulating local activity across the image frame and then
applying principal component analysis (PCA) to the normalized neighborhood
region. Subsequently, we propose a backtracking-free k-d tree mechanism for
efficient feature matching by taking advantage of the low-dimensionality of the
feature representation. Additionally, the proposed k-d tree mechanism allows
for feature selection to obtain a lower-dimensional dictionary representation
when hardware resources are limited to implement dimensionality reduction.
Consequently, the proposed system can be realized on a field-programmable gate
array (FPGA) device leading to high performance over resource ratio. The
proposed system is tested on real-world event-based datasets for object
categorization, showing superior classification performance and relevance to
state-of-the-art algorithms. Additionally, we verified the object detection
method and real-time FPGA performance in lab settings under non-controlled
illumination conditions with limited training data and ground truth
annotations.Comment: Accepted in ACCV 2018 Workshops, to appea
How to Find More Supernovae with Less Work: Object Classification Techniques for Difference Imaging
We present the results of applying new object classification techniques to
difference images in the context of the Nearby Supernova Factory supernova
search. Most current supernova searches subtract reference images from new
images, identify objects in these difference images, and apply simple threshold
cuts on parameters such as statistical significance, shape, and motion to
reject objects such as cosmic rays, asteroids, and subtraction artifacts.
Although most static objects subtract cleanly, even a very low false positive
detection rate can lead to hundreds of non-supernova candidates which must be
vetted by human inspection before triggering additional followup. In comparison
to simple threshold cuts, more sophisticated methods such as Boosted Decision
Trees, Random Forests, and Support Vector Machines provide dramatically better
object discrimination. At the Nearby Supernova Factory, we reduced the number
of non-supernova candidates by a factor of 10 while increasing our supernova
identification efficiency. Methods such as these will be crucial for
maintaining a reasonable false positive rate in the automated transient alert
pipelines of upcoming projects such as PanSTARRS and LSST.Comment: 25 pages; 6 figures; submitted to Ap
Visually Indicated Sounds
Objects make distinctive sounds when they are hit or scratched. These sounds
reveal aspects of an object's material properties, as well as the actions that
produced them. In this paper, we propose the task of predicting what sound an
object makes when struck as a way of studying physical interactions within a
visual scene. We present an algorithm that synthesizes sound from silent videos
of people hitting and scratching objects with a drumstick. This algorithm uses
a recurrent neural network to predict sound features from videos and then
produces a waveform from these features with an example-based synthesis
procedure. We show that the sounds predicted by our model are realistic enough
to fool participants in a "real or fake" psychophysical experiment, and that
they convey significant information about material properties and physical
interactions
Human gesture classification by brute-force machine learning for exergaming in physiotherapy
In this paper, a novel approach for human gesture classification on skeletal data is proposed for the application of exergaming in physiotherapy. Unlike existing methods, we propose to use a general classifier like Random Forests to recognize dynamic gestures. The temporal dimension is handled afterwards by majority voting in a sliding window over the consecutive predictions of the classifier. The gestures can have partially similar postures, such that the classifier will decide on the dissimilar postures. This brute-force classification strategy is permitted, because dynamic human gestures show sufficient dissimilar postures. Online continuous human gesture recognition can classify dynamic gestures in an early stage, which is a crucial advantage when controlling a game by automatic gesture recognition. Also, ground truth can be easily obtained, since all postures in a gesture get the same label, without any discretization into consecutive postures. This way, new gestures can be easily added, which is advantageous in adaptive game development. We evaluate our strategy by a leave-one-subject-out cross-validation on a self-captured stealth game gesture dataset and the publicly available Microsoft Research Cambridge-12 Kinect (MSRC-12) dataset. On the first dataset we achieve an excellent accuracy rate of 96.72%. Furthermore, we show that Random Forests perform better than Support Vector Machines. On the second dataset we achieve an accuracy rate of 98.37%, which is on average 3.57% better then existing methods
- …