13,375 research outputs found

Detecting human heads with their orientations

Author: Kimura Mitsuhiro
Matsuyama Takashi
Sugimoto Akihiro
Publication venue: Universitat Autonoma de Barcelona
Publication date: 01/01/2005

We propose a two-step method for detecting human heads with their orientations. In the first step, the method employs an ellipse as the contour model of human-head appearances to deal with a wide variety of appearances, and evaluates the ellipse to detect possible human heads. In the second step, the method focuses on features inside the ellipse, such as the eyes, the mouth or the cheeks, to model facial components. It evaluates not only these components themselves but also their geometric configuration, both to eliminate false positives from the first step and to estimate face orientations. Our intensive experiments show that the method can correctly and stably detect human heads with their orientations.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Revistes Catalanes amb Accés Obert

Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)

Diposit Digital de Documents de la UAB

Secretaría de Estado de Cultura

SALSA: A Novel Dataset for Multimodal Group Behavior Analysis

Author: Alameda-Pineda Xavier
Batrinca Ligia
Lanz Oswald
Lepri Bruno
Ricci Elisa
Sebe Nicu
Staiano Jacopo
Subramanian Ramanathan
Publication date: 23/06/2015

Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., a cocktail party) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels. However, analyzing social scenes involving FCGs is also highly challenging because crowdedness and extreme occlusions make it difficult to extract behavioral cues such as target locations, speaking activity and head/body pose. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under the poster presentation and cocktail party contexts, presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations and interfering sound sources; (2) to alleviate these problems, we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising microphone, accelerometer, Bluetooth and infrared sensors. In addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head and body orientation, and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa.

Comment: 14 pages, 11 figures

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

University of Canberra Research Repository

Adaptive multiscale detection of filamentary structures in a background of uniform random points

Author: Arias-Castro Ery
Donoho David L.
Huo Xiaoming
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2003

We are given a set of $n$ points that might be uniformly distributed in the unit square $[0,1]^2$. We wish to test whether the set, although mostly consisting of uniformly scattered points, also contains a small fraction of points sampled from some (a priori unknown) curve with $C^{\alpha}$-norm bounded by $\beta$. An asymptotic detection threshold exists in this problem; for a constant $T_-(\alpha,\beta)>0$, if the number of points sampled from the curve is smaller than $T_-(\alpha,\beta)n^{1/(1+\alpha)}$, reliable detection is not possible for large $n$. We describe a multiscale significant-runs algorithm that can reliably detect concentration of data near a smooth curve, without knowing the smoothness information $\alpha$ or $\beta$ in advance, provided that the number of points on the curve exceeds $T_*(\alpha,\beta)n^{1/(1+\alpha)}$. This algorithm therefore has an optimal detection threshold, up to a factor $T_*/T_-$. At the heart of our approach is an analysis of the data by counting membership in multiscale multianisotropic strips. The strips will have area $2/n$ and exhibit a variety of lengths, orientations and anisotropies. The strips are partitioned into anisotropy classes; each class is organized as a directed graph whose vertices all are strips of the same anisotropy and whose edges link such strips to their "good continuations." The point-cloud data are reduced to counts that measure membership in strips. Each anisotropy graph is reduced to a subgraph that consists of strips with significant counts. The algorithm rejects $\mathbf{H}_0$ whenever some such subgraph contains a path that connects many consecutive significant counts.

Comment: Published at http://dx.doi.org/10.1214/009053605000000787 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
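The strip-counting step in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `strip_count`, the rectangular strip shape and the parametrization by center, length and angle are assumptions made for illustration. The only quantity taken from the abstract is the strip area $2/n$, which means a strip lying inside the unit square holds 2 points on average when all $n$ points are uniform.

```python
import math

def strip_count(points, center, length, theta):
    """Count points inside a rotated rectangular strip of area 2/n.

    Hypothetical helper: the rectangle shape and its parametrization
    (center, length, angle theta) are illustrative assumptions.
    """
    n = len(points)
    width = (2.0 / n) / length           # area = length * width = 2/n
    cx, cy = center
    c, s = math.cos(theta), math.sin(theta)
    count = 0
    for x, y in points:
        dx, dy = x - cx, y - cy
        along = dx * c + dy * s          # coordinate along the strip axis
        across = -dx * s + dy * c        # coordinate across the strip
        if abs(along) <= length / 2 and abs(across) <= width / 2:
            count += 1
    return count

# Three points on a horizontal segment plus five scattered points.
points = [(0.3, 0.5), (0.5, 0.5), (0.7, 0.5), (0.5, 0.9),
          (0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)]
horizontal = strip_count(points, (0.5, 0.5), 1.0, 0.0)
vertical = strip_count(points, (0.5, 0.5), 1.0, math.pi / 2)
```

In this toy data the strip aligned with the segment collects more points than the perpendicular strip through the same center. Deciding which counts are "significant" and linking strips into paths, as the abstract describes, is not covered by this sketch.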

arXiv.org e-Print Archive

CiteSeerX

Crossref

Automated Video Analysis of Animal Movements Using Gabor Orientation Filters

Author: Kristan William B., Jr.
Wagenaar Daniel A.
Publication venue: Humana Press Inc.
Publication date: 01/01/2010

To quantify locomotory behavior, tools for determining the location and shape of an animal’s body are a first requirement. Video recording is a convenient technology to store raw movement data, but extracting body coordinates from video recordings is a nontrivial task. The algorithm described in this paper solves this task for videos of leeches or other quasi-linear animals in a manner inspired by the mammalian visual processing system: the video frames are fed through a bank of Gabor filters, which locally detect segments of the animal at a particular orientation. The algorithm assumes that the image location with maximal filter output lies on the animal’s body and traces its shape out in both directions from there. The algorithm successfully extracted location and shape information from video clips of swimming leeches, as well as from still photographs of swimming and crawling snakes. A Matlab implementation with a graphical user interface is available online, and should make this algorithm conveniently usable in many other contexts.
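As a rough illustration of the filter-bank idea in the abstract above, the following sketch builds a small bank of even (cosine-phase) Gabor kernels and picks the orientation with the strongest response at one pixel. It is not the paper's Matlab code: the function names, the kernel size, the wavelength and the two Gaussian widths are all assumed values chosen for the example, and the tracing of the body outline is left out.

```python
import math

def gabor_kernel(theta, half=4, wavelength=6.0,
                 sigma_along=3.0, sigma_across=1.5):
    """Zero-mean Gabor kernel tuned to a bright bar at angle theta.

    All parameter values are illustrative assumptions.
    """
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * math.cos(theta) + y * math.sin(theta)    # along the bar
            v = -x * math.sin(theta) + y * math.cos(theta)   # across the bar
            envelope = math.exp(-(u * u / (2 * sigma_along ** 2)
                                  + v * v / (2 * sigma_across ** 2)))
            row.append(envelope * math.cos(2 * math.pi * v / wavelength))
        kernel.append(row)
    # Subtract the mean so uniform brightness gives zero response.
    mean = sum(map(sum, kernel)) / (2 * half + 1) ** 2
    return [[w - mean for w in row] for row in kernel]

def response_at(image, kernel, cx, cy):
    """Filter response at pixel (cx, cy); image is indexed image[y][x]."""
    half = len(kernel) // 2
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += kernel[dy + half][dx + half] * image[cy + dy][cx + dx]
    return total

def best_orientation(image, cx, cy, n_orientations=8):
    """Angle in [0, pi) whose kernel responds most strongly at (cx, cy)."""
    thetas = [k * math.pi / n_orientations for k in range(n_orientations)]
    return max(thetas,
               key=lambda t: response_at(image, gabor_kernel(t), cx, cy))

# Synthetic 21x21 frames containing one bright horizontal or vertical line.
horizontal = [[1.0 if r == 10 else 0.0 for c in range(21)] for r in range(21)]
vertical = [[1.0 if c == 10 else 0.0 for c in range(21)] for r in range(21)]
angle_h = best_orientation(horizontal, 10, 10)
angle_v = best_orientation(vertical, 10, 10)
```

On these synthetic frames the selected angle follows the direction of the line, which is the local orientation cue the abstract says is used to trace the animal's shape outward from the point of maximal filter output.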

CiteSeerX

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Caltech Authors

A Diagram Is Worth A Dozen Images

Author: B Alexe
CL Zitnick
F Pedregosa
J von Engelhardt
JRR Uijlings
M Twyman
R Horn
R Koncel-Kedziorski
RK Srihari
RW Ferguson
S Antol
S Hochreiter
SC Zhu
SK Card
Publication date: 23/03/2016

Diagrams are common tools for representing complex concepts, relationships and events, often when it would be difficult to portray the same information with natural images. Understanding natural images has been extensively studied in computer vision, while diagram understanding has received little attention. In this paper, we study the problem of diagram interpretation and reasoning, the challenging task of identifying the structure of a diagram and the semantics of its constituents and their relationships. We introduce Diagram Parse Graphs (DPG) as our representation to model the structure of diagrams. We define syntactic parsing of diagrams as learning to infer DPGs for diagrams and study semantic interpretation and reasoning of diagrams in the context of diagram question answering. We devise an LSTM-based method for syntactic parsing of diagrams and introduce a DPG-based attention model for diagram question answering. We compile a new dataset of diagrams with exhaustive annotations of constituents and relationships for over 5,000 diagrams and 15,000 questions and answers. Our results show the significance of our models for syntactic parsing and question answering in diagrams using DPGs.

arXiv.org e-Print Archive

Crossref

A Multi-Phase Anglo-Saxon Site in Ewelme

Author: Brookes SJ
Mileson S
Publication date: 01/01/2014

New evidence is presented for a middle Anglo-Saxon ‘productive’ site on hilly ground north-west of Ewelme in south Oxfordshire. Coins and other finds from metal-detecting activity suggest the existence of an eighth- to ninth-century meeting or trading point located close to the Icknield Way. This place takes on an added significance because of its proximity to an early Anglo-Saxon cemetery and probably a late Anglo-Saxon meeting place. The authors provide an initial assessment of the site, its likely chronological development and its relationship with wider Anglo-Saxon activity in the upper Thames region and beyond. Some suggestions are made about the implications of the existence of such a long-lasting or recurring centre of activity for early medieval inhabitants’ perceptions of landscape.

UCL Discovery