Detecting human heads with their orientations
We propose a two-step method for detecting human heads and estimating their orientations. In the first step, the method uses an ellipse as a contour model of human-head appearance, which copes with a wide variety of appearances, and evaluates the ellipse to detect candidate heads. In the second step, the method focuses on features inside the ellipse, such as the eyes, the mouth or the cheeks, to model facial components. It evaluates not only these components themselves but also their geometric configuration, both to eliminate false positives from the first step and to estimate face orientation. Our intensive experiments show that the method detects human heads and their orientations correctly and stably.
SALSA: A Novel Dataset for Multimodal Group Behavior Analysis
Studying free-standing conversational groups (FCGs) in unstructured social
settings (e.g., a cocktail party) is gratifying due to the wealth of information
available at the group (mining social networks) and individual (recognizing
native behavioral and personality traits) levels. However, analyzing social
scenes involving FCGs is also highly challenging, because crowdedness and
extreme occlusions make it difficult to extract behavioral cues such as target
locations, speaking activity and head/body pose. To
this end, we propose SALSA, a novel dataset facilitating multimodal and
Synergetic sociAL Scene Analysis, and make two main contributions to research
on automated social interaction analysis: (1) SALSA records social interactions
among 18 participants in a natural, indoor environment for over 60 minutes,
under the poster presentation and cocktail party contexts presenting
difficulties in the form of low-resolution images, lighting variations,
numerous occlusions, reverberations and interfering sound sources; (2) To
alleviate these problems we facilitate multimodal analysis by recording the
social interplay using four static surveillance cameras and sociometric badges
worn by each participant, comprising a microphone, an accelerometer, Bluetooth
and infrared sensors. In addition to raw data, we also provide annotations
concerning individuals' personality as well as their position, head and body
orientation, and F-formation information over the entire event duration. Through
extensive experiments with state-of-the-art approaches, we show (a) the
limitations of current methods and (b) how the recorded multiple cues
synergetically aid automatic analysis of social interactions. SALSA is
available at http://tev.fbk.eu/salsa.
Comment: 14 pages, 11 figures
Adaptive multiscale detection of filamentary structures in a background of uniform random points
We are given a set of $n$ points that might be uniformly distributed in the
unit square $[0,1]^2$. We wish to test whether the set, although mostly
consisting of uniformly scattered points, also contains a small fraction of
points sampled from some (a priori unknown) curve with $C^{\alpha}$-norm
bounded by $\beta$. An asymptotic detection threshold exists in this problem;
for a constant $T_-(\alpha,\beta)>0$, if the number of points sampled from the
curve is smaller than $T_-(\alpha,\beta)n^{1/(1+\alpha)}$, reliable detection
is not possible for large $n$. We describe a multiscale significant-runs
algorithm that can reliably detect concentration of data near a smooth curve,
without knowing the smoothness information $\alpha$ or $\beta$ in advance,
provided that the number of points on the curve exceeds
$T_*(\alpha,\beta)n^{1/(1+\alpha)}$. This algorithm therefore has an optimal
detection threshold, up to a factor $T_*/T_-$. At the heart of our approach is
an analysis of the data by counting membership in multiscale multianisotropic
strips. The strips will have area $2/n$ and exhibit a variety of lengths,
orientations and anisotropies. The strips are partitioned into anisotropy
classes; each class is organized as a directed graph whose vertices all are
strips of the same anisotropy and whose edges link such strips to their ``good
continuations.'' The point-cloud data are reduced to counts that measure
membership in strips. Each anisotropy graph is reduced to a subgraph that
consists of strips with significant counts. The algorithm rejects the null hypothesis of uniformity
whenever some such subgraph contains a path that connects many consecutive
significant counts.
Comment: Published at http://dx.doi.org/10.1214/009053605000000787 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
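The strip-counting idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it only counts how many points fall in one oriented rectangular strip and compares that count with a binomial tail under the uniform model. The function names, the `alpha` cutoff, and the assumption that the strip lies entirely inside the unit square (so that the hit probability of a uniform point equals the strip area) are illustrative choices; the anisotropy graphs and the search for runs of significant strips are omitted.

```python
import math

def in_strip(point, center, theta, length, width):
    """True if `point` lies in the rectangle centred at `center` whose long
    axis makes angle `theta` (radians) with the x-axis."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    along = dx * math.cos(theta) + dy * math.sin(theta)    # coordinate along the strip
    across = -dx * math.sin(theta) + dy * math.cos(theta)  # coordinate across the strip
    return abs(along) <= length / 2 and abs(across) <= width / 2

def strip_count(points, center, theta, length, width):
    """Number of points that are members of the strip."""
    return sum(in_strip(p, center, theta, length, width) for p in points)

def binomial_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def is_significant(count, n, area, alpha=0.01):
    """Hypothetical rule: a count is 'significant' if it would be unlikely
    for n uniform points, each landing in the strip with probability `area`."""
    return binomial_tail(n, area, count) < alpha

# Ten points on a horizontal segment, probed with two strips of area 0.02.
points = [(0.1 * k + 0.05, 0.5) for k in range(10)]
horizontal = strip_count(points, (0.5, 0.5), 0.0, 1.0, 0.02)
vertical = strip_count(points, (0.5, 0.5), math.pi / 2, 1.0, 0.02)
print(horizontal, is_significant(horizontal, len(points), 0.02))
print(vertical, is_significant(vertical, len(points), 0.02))
```

A strip aligned with the segment collects every point, while a perpendicular strip of the same area collects none, which is why a bank of strips at many orientations and lengths can reveal concentration along an unknown curve.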
Automated Video Analysis of Animal Movements Using Gabor Orientation Filters
To quantify locomotory behavior, tools for determining the location and shape of an animal’s body are a first requirement. Video recording is a convenient technology for storing raw movement data, but extracting body coordinates from video recordings is a nontrivial task. The algorithm described in this paper solves this task for videos of leeches or other quasi-linear animals in a manner inspired by the mammalian visual processing system: the video frames are fed through a bank of Gabor filters, which locally detect segments of the animal at a particular orientation. The algorithm assumes that the image location with maximal filter output lies on the animal’s body and traces its shape out in both directions from there. The algorithm successfully extracted location and shape information from video clips of swimming leeches, as well as from still photographs of swimming and crawling snakes. A Matlab implementation with a graphical user interface is available online, and should make this algorithm conveniently usable in many other contexts.
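The first stage described above (a bank of oriented Gabor filters, followed by picking the location with maximal filter output) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's Matlab implementation: the kernel size, `sigma`, `wavelength`, the number of orientations and all function names are hypothetical, and the tracing of the body in both directions from the starting point is not shown.

```python
import math

def gabor_kernel(theta, size=9, sigma=2.0, wavelength=4.0):
    """Even (cosine-phase) Gabor kernel tuned to a bright bar at angle theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Signed distance from the bar's axis (perpendicular coordinate).
            perp = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * perp / wavelength))
        kernel.append(row)
    # Subtract the mean so that uniform image regions give zero response.
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def response(image, kernel, cy, cx):
    """Filter output at pixel (cy, cx); pixels outside the image count as 0."""
    half = len(kernel) // 2
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            y, x = cy + dy, cx + dx
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                total += image[y][x] * kernel[dy + half][dx + half]
    return total

def strongest_point(image, n_orient=8):
    """Return (response, y, x, theta) maximising the output over the bank."""
    bank = [(k * math.pi / n_orient, gabor_kernel(k * math.pi / n_orient))
            for k in range(n_orient)]
    best = None
    for y in range(len(image)):
        for x in range(len(image[0])):
            for theta, kernel in bank:
                r = response(image, kernel, y, x)
                if best is None or r > best[0]:
                    best = (r, y, x, theta)
    return best

# Synthetic frame: a bright horizontal bar on a dark background.
frame = [[1.0 if y == 10 else 0.0 for x in range(21)] for y in range(21)]
_, y0, x0, theta0 = strongest_point(frame)
print(y0, x0, theta0)
```

The returned point and orientation would serve as the seed from which a tracing step follows the body; a production version would use a vectorised convolution rather than nested Python loops.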
A Diagram Is Worth A Dozen Images
Diagrams are common tools for representing complex concepts, relationships
and events, often when it would be difficult to portray the same information
with natural images. Understanding natural images has been extensively studied
in computer vision, while diagram understanding has received little attention.
In this paper, we study the problem of diagram interpretation and reasoning,
the challenging task of identifying the structure of a diagram and the
semantics of its constituents and their relationships. We introduce Diagram
Parse Graphs (DPG) as our representation to model the structure of diagrams. We
define syntactic parsing of diagrams as learning to infer DPGs for diagrams and
study semantic interpretation and reasoning of diagrams in the context of
diagram question answering. We devise an LSTM-based method for syntactic
parsing of diagrams and introduce a DPG-based attention model for diagram
question answering. We compile a new dataset of diagrams with exhaustive
annotations of constituents and relationships for over 5,000 diagrams and
15,000 questions and answers. Our results show the significance of our models
for syntactic parsing and question answering in diagrams using DPGs.
A Multi-Phase Anglo-Saxon Site in Ewelme
New evidence is presented for a middle Anglo-Saxon ‘productive’ site on hilly ground north-west of Ewelme in south Oxfordshire. Coins and other finds from metal-detecting activity suggest the existence of an eighth- to ninth-century meeting or trading point located close to the Icknield Way. This place takes on an added significance because of its proximity to an early Anglo-Saxon cemetery and probably a late Anglo-Saxon meeting place. The authors provide an initial assessment of the site, its likely chronological development and its relationship with wider Anglo-Saxon activity in the upper Thames region and beyond. Some suggestions are made about the implications of the existence of such a long-lasting or recurring centre of activity for early medieval inhabitants’ perceptions of landscape.