Generalized Boundaries from Multiple Image Interpretations
Boundary detection is essential for a variety of computer vision tasks such
as segmentation and recognition. In this paper we propose a unified formulation
and a novel algorithm that are applicable to the detection of different types
of boundaries, such as intensity edges, occlusion boundaries or object category
specific boundaries. Our formulation leads to a simple method with
state-of-the-art performance and significantly lower computational cost than
existing methods. We evaluate our algorithm on different types of boundaries,
from low-level boundaries extracted in natural images, to occlusion boundaries
obtained using motion cues and RGB-D cameras, to boundaries from
soft-segmentation. We also propose a novel method for figure/ground
soft-segmentation that can be used in conjunction with our boundary detection
method and improve its accuracy at almost no extra computational cost.
Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision
Feature selection is essential for effective visual recognition. We propose
an efficient joint classifier learning and feature selection method that
discovers sparse, compact representations of input features from a vast sea of
candidates, with an almost unsupervised formulation. Our method requires only
the following knowledge, which we call the feature sign: whether or not
a particular feature has on average stronger values over positive samples than
over negatives. We show how this can be estimated using as few as a single
labeled training sample per class. Then, using these feature signs, we extend
an initial supervised learning problem into an (almost) unsupervised clustering
formulation that can incorporate new data without requiring ground truth
labels. Our method works both as a feature selection mechanism and as a fully
competitive classifier. Its key properties are low computational cost and
excellent accuracy, especially in difficult cases of very limited training
data. We experiment on large-scale recognition in video and show superior speed
and performance to established feature selection approaches such as AdaBoost,
Lasso, greedy forward-backward selection, and powerful classifiers such as SVM.
Comment: arXiv admin note: text overlap with arXiv:1411.771
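The feature sign defined in this abstract can be illustrated with a minimal sketch. It assumes a NumPy feature matrix with one row per sample and binary labels. The function name `estimate_feature_signs` and the toy data are hypothetical, and this shows only the sign estimate, not the authors' full learning method.

```python
import numpy as np

def estimate_feature_signs(X, y):
    """Return +1 where a feature is, on average, stronger over positive
    samples than over negative samples, and -1 otherwise."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=bool)
    pos_mean = X[y].mean(axis=0)    # per-feature mean over positives
    neg_mean = X[~y].mean(axis=0)   # per-feature mean over negatives
    return np.where(pos_mean > neg_mean, 1, -1)

# Toy data (hypothetical): one labeled sample per class, three features.
X = np.array([[0.9, 0.1, 0.5],
              [0.2, 0.8, 0.4]])
y = np.array([1, 0])
signs = estimate_feature_signs(X, y)
```

With a single labeled sample per class, the estimate reduces to comparing the two samples feature by feature, which is the minimal-supervision regime the abstract refers to.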
PCA-SIFT: A more distinctive representation for local image descriptors
Stable local feature detection and representation is a fundamental component of many image registration and object recognition algorithms. Mikolajczyk and Schmid [14] recently evaluated a variety of approaches and identified the SIFT [11] algorithm as being the most resistant to common image deformations. This paper examines (and improves upon) the local image descriptor used by SIFT. Like SIFT, our descriptors encode the salient aspects of the image gradient in the feature point's neighborhood; however, instead of using SIFT's smoothed weighted histograms, we apply Principal Components Analysis (PCA) to the normalized gradient patch. Our experiments demonstrate that the PCA-based local descriptors are more distinctive, more robust to image deformations, and more compact than the standard SIFT representation. We also present results showing that using these descriptors in an image retrieval application results in increased accuracy and faster matching.
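The idea of applying PCA to a normalized gradient patch can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the 16x16 patch size, the 20 retained components, the random training patches, and the helper names `gradient_vectors`, `fit_pca`, and `project` are all hypothetical choices made for the example.

```python
import numpy as np

def gradient_vectors(patches):
    """Flatten the horizontal and vertical gradients of each patch into one
    vector per patch, then normalize each vector to unit length."""
    patches = np.asarray(patches, dtype=float)          # shape (N, H, W)
    gy, gx = np.gradient(patches, axis=(1, 2))          # row and column gradients
    n = len(patches)
    vecs = np.concatenate([gx.reshape(n, -1), gy.reshape(n, -1)], axis=1)
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-12)              # guard against flat patches

def fit_pca(vecs, n_components):
    """Learn a PCA basis (mean and top principal directions) via SVD."""
    mean = vecs.mean(axis=0)
    _, _, vt = np.linalg.svd(vecs - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(vecs, mean, basis):
    """Project gradient vectors onto the learned basis to get compact descriptors."""
    return (vecs - mean) @ basis.T

# Hypothetical training set: random patches stand in for real keypoint patches.
rng = np.random.default_rng(0)
train_patches = rng.random((200, 16, 16))
mean, basis = fit_pca(gradient_vectors(train_patches), n_components=20)
descriptors = project(gradient_vectors(train_patches[:5]), mean, basis)
```

Each descriptor here has 20 dimensions rather than the 512 of the raw gradient vector, which illustrates the compactness claim. Matching would then compare descriptors by a distance such as Euclidean distance.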
Pornography Detection in Video Using Characteristic Motion Patterns
A mechanism for detecting pornographic content in a video based on an analysis of characteristic motions in the video. The motion-based mechanism is complementary to appearance-based and audio-based approaches, and therefore can be employed in combination with such approaches, or by itself. In one implementation, the mechanism employs a first phase of unsupervised learning and a second phase of supervised learning. In the first phase, characteristic motion patterns are discovered and clustering is performed. In one implementation, characteristic motion patterns are discovered by analyzing the relative displacements of large numbers of ordered trajectory pairs over time. In the second phase, a classifier is trained to associate the clusters identified in the first phase with pornographic content. A motion pattern descriptor (e.g., a feature vector, etc.) for a video (e.g., a newly-uploaded video, etc.) is obtained and is provided to the trained classifier to obtain a pornography score for the video.
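The relative displacements of ordered trajectory pairs mentioned above can be illustrated with a short sketch. The abstract does not define the computation, so this is one plausible interpretation: the per-frame motion of one tracked point minus that of another. The function name `pairwise_relative_displacements` and the toy trajectories are hypothetical, and the clustering and classifier phases are not shown.

```python
import numpy as np
from itertools import permutations

def pairwise_relative_displacements(trajectories):
    """For every ordered pair (i, j) of trajectories, return the per-frame
    displacement of j minus the per-frame displacement of i."""
    traj = np.asarray(trajectories, dtype=float)      # shape (N, T, 2)
    steps = np.diff(traj, axis=1)                     # frame-to-frame motion, (N, T-1, 2)
    pairs = list(permutations(range(len(traj)), 2))   # ordered pairs with i != j
    rel = np.stack([steps[j] - steps[i] for i, j in pairs])
    return pairs, rel

# Toy trajectories (hypothetical): one static point, one moving 1 pixel right per frame.
trajectories = [
    [[0, 0], [0, 0], [0, 0]],
    [[5, 5], [6, 5], [7, 5]],
]
pairs, rel = pairwise_relative_displacements(trajectories)
```

Because the pairs are ordered, the pair (1, 0) yields the negation of the pair (0, 1). The number of pairs grows quadratically with the number of trajectories, so a practical system would likely sample pairs.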