2,835 research outputs found
A Similarity Measure for Material Appearance
We present a model to measure the similarity in appearance between different
materials, which correlates with human similarity judgments. We first create a
database of 9,000 rendered images depicting objects with varying materials,
shape and illumination. We then gather data on perceived similarity from
crowdsourced experiments; our analysis of over 114,840 answers suggests that
indeed a shared perception of appearance similarity exists. We feed this data
to a deep learning architecture with a novel loss function, which learns a
feature space for materials that correlates with such perceived appearance
similarity. Our evaluation shows that our model outperforms existing metrics.
Last, we demonstrate several applications enabled by our metric, including
appearance-based search for material suggestions, database visualization,
clustering and summarization, and gamut mapping.Comment: 12 pages, 17 figure
Hallucinating optimal high-dimensional subspaces
Linear subspace representations of appearance variation are pervasive in
computer vision. This paper addresses the problem of robustly matching such
subspaces (computing the similarity between them) when they are used to
describe the scope of variations within sets of images of different (possibly
greatly so) scales. A naive solution of projecting the low-scale subspace into
the high-scale image space is described first and subsequently shown to be
inadequate, especially at large scale discrepancies. A successful approach is
proposed instead. It consists of (i) an interpolated projection of the
low-scale subspace into the high-scale space, which is followed by (ii) a
rotation of this initial estimate within the bounds of the imposed
``downsampling constraint''. The optimal rotation is found in the closed-form
which best aligns the high-scale reconstruction of the low-scale subspace with
the reference it is compared to. The method is evaluated on the problem of
matching sets of (i) face appearances under varying illumination and (ii)
object appearances under varying viewpoint, using two large data sets. In
comparison to the naive matching, the proposed algorithm is shown to greatly
increase the separation of between-class and within-class similarities, as well
as produce far more meaningful modes of common appearance on which the match
score is based.Comment: Pattern Recognition, 201
Activity-driven content adaptation for effective video summarisation
In this paper, we present a novel method for content adaptation and video summarization fully implemented in compressed-domain. Firstly, summarization of generic videos is modeled as the process of extracted human objects under various activities/events. Accordingly, frames are classified into five categories via fuzzy decision including shot changes (cut and gradual transitions), motion activities (camera motion and object motion) and others by using two inter-frame measurements. Secondly, human objects are detected using Haar-like features. With the detected human objects and attained frame categories, activity levels for each frame are determined to adapt with video contents. Continuous frames belonging to same category are grouped to form one activity entry as content of interest (COI) which will convert the original video into a series of activities. An overall adjustable quota is used to control the size of generated summarization for efficient streaming purpose. Upon this quota, the frames selected for summarization are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations have proved the effectiveness and efficiency of our proposed approach, which provides a more flexible and general solution for this topic as domain-specific tasks such as accurate recognition of objects can be avoided
Pedestrian detection in uncontrolled environments using stereo and biometric information
A method for pedestrian detection from challenging real world outdoor scenes is presented in this paper. This technique is able to extract multiple pedestrians, of varying orientations and appearances, from a scene even when faced with large and multiple occlusions. The technique is also robust to changing background lighting conditions and effects, such as shadows. The technique applies an enhanced method from which reliable disparity information can be obtained even from untextured homogeneous areas within a scene. This is used in conjunction with ground plane estimation and biometric information,to obtain reliable pedestrian regions. These regions are robust to erroneous areas of disparity data and also to severe pedestrian occlusion, which often occurs in unconstrained scenarios
Multimedia information technology and the annotation of video
The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning
Noisy Subspace Clustering via Thresholding
We consider the problem of clustering noisy high-dimensional data points into
a union of low-dimensional subspaces and a set of outliers. The number of
subspaces, their dimensions, and their orientations are unknown. A
probabilistic performance analysis of the thresholding-based subspace
clustering (TSC) algorithm introduced recently in [1] shows that TSC succeeds
in the noisy case, even when the subspaces intersect. Our results reveal an
explicit tradeoff between the allowed noise level and the affinity of the
subspaces. We furthermore find that the simple outlier detection scheme
introduced in [1] provably succeeds in the noisy case.Comment: Presented at the IEEE Int. Symp. Inf. Theory (ISIT) 2013, Istanbul,
Turkey. The version posted here corrects a minor error in the published
version. Specifically, the exponent -c n_l in the success probability of
Theorem 1 and in the corresponding proof outline has been corrected to
-c(n_l-1
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher-accuracy rate and lowercomputational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared.Comment: Published 201
- …