ChimpACT: A Longitudinal Dataset for Understanding Chimpanzee Behaviors
Understanding the behavior of non-human primates is crucial for improving
animal welfare, modeling social behavior, and gaining insights into
distinctively human and phylogenetically shared behaviors. However, the lack of
datasets on non-human primate behavior hinders in-depth exploration of primate
social interactions, posing challenges to research on our closest living
relatives. To address these limitations, we present ChimpACT, a comprehensive
dataset for quantifying the longitudinal behavior and social relations of
chimpanzees within a social group. Spanning from 2015 to 2018, ChimpACT
features videos of a group of over 20 chimpanzees residing at the Leipzig Zoo,
Germany, with a particular focus on documenting the developmental trajectory of
one young male, Azibo. ChimpACT is both comprehensive and challenging,
consisting of 163 videos with a cumulative 160,500 frames, each richly
annotated with detection, identification, pose estimation, and fine-grained
spatiotemporal behavior labels. We benchmark representative methods of three
tracks on ChimpACT: (i) tracking and identification, (ii) pose estimation, and
(iii) spatiotemporal action detection of the chimpanzees. Our experiments
reveal that ChimpACT offers ample opportunities for both devising new methods
and adapting existing ones to solve fundamental computer vision tasks applied
to chimpanzee groups, such as detection, pose estimation, and behavior
analysis, ultimately deepening our comprehension of communication and sociality
to chimpanzee groups, such as detection, pose estimation, and behavior
analysis, ultimately deepening our comprehension of communication and sociality
in non-human primates.
Comment: NeurIPS 202
Beat-Event Detection in Action Movie Franchises
While important advances were recently made towards temporally localizing and
recognizing specific human actions or activities in videos, efficient detection
and classification of long video chunks belonging to semantically defined
categories such as "pursuit" or "romance" remains challenging. We introduce a
new dataset, Action Movie Franchises, consisting of a collection of Hollywood
action movie franchises. We define 11 non-exclusive semantic categories -
called beat-categories - that are broad enough to cover most of the movie
footage. The corresponding beat-events are annotated as groups of video shots,
possibly overlapping. We propose an approach for localizing beat-events based on
classifying shots into beat-categories and learning the temporal constraints
between shots. We show that temporal constraints significantly improve the
classification performance. We set up an evaluation protocol for beat-event
localization as well as for shot classification, depending on whether movies
from the same franchise are present or not in the training data.
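The idea of combining per-shot beat-category scores with learned temporal constraints between shots could be sketched as a Viterbi decoding over the shot sequence. This is an illustrative assumption, not the paper's exact model; the category names, scores, and transition probabilities below are made up for the example.

```python
import math

def viterbi(shot_scores, trans):
    """shot_scores: list of {label: log-score} per shot;
    trans: {(prev_label, label): log transition weight}.
    Returns the highest-scoring label sequence."""
    labels = list(shot_scores[0])
    best = {l: shot_scores[0][l] for l in labels}
    back = []
    for scores in shot_scores[1:]:
        new_best, ptr = {}, {}
        for l in labels:
            # Best predecessor under the temporal-constraint weights.
            prev, val = max(((p, best[p] + trans[(p, l)]) for p in labels),
                            key=lambda x: x[1])
            new_best[l] = val + scores[l]
            ptr[l] = prev
        best = new_best
        back.append(ptr)
    # Backtrack from the best final label.
    last = max(best, key=best.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

log = math.log
scores = [
    {"pursuit": log(0.9), "romance": log(0.1)},
    {"pursuit": log(0.4), "romance": log(0.6)},  # noisy middle shot
    {"pursuit": log(0.9), "romance": log(0.1)},
]
# Transitions favor staying in the same beat-category.
trans = {(a, b): log(0.8 if a == b else 0.2)
         for a in ("pursuit", "romance") for b in ("pursuit", "romance")}
smoothed = viterbi(scores, trans)
```

With the temporal constraints, the noisy middle shot is relabeled to agree with its neighbors, which is the kind of improvement the abstract reports from modeling shot-to-shot transitions.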
A Finite State Machine Fall Detection Using Quadrilateral Shape Features
A video-based fall detection system is presented, consisting of data acquisition, image processing, feature extraction, feature selection, classification, and a finite state machine. A two-dimensional human posture image was represented by 12 features extracted from the generalisation of a silhouette shape to a quadrilateral. The corresponding feature vectors for three groups of human pose were statistically analysed using a non-parametric Kruskal-Wallis test to assess the significance of the differences between them, and non-significant features were discarded. Four kernel-based Support Vector Machine classifiers (linear, quadratic, cubic, and Radial Basis Function) were trained to classify the three human posture groups. Among the four classifiers, the Radial Basis Function kernel performed best in terms of performance metrics on the testing set, achieving an average sensitivity, precision, and F-score of 99.19%, 99.25%, and 99.22%, respectively. The pose classification output was then fed into a simple finite state machine to trigger fall-event alarms. The fall detection system was tested on different fall video sets and was able to detect the presence of falling events in video frame sequences with an accuracy of 97.32% and low computational time.
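The finite-state-machine layer on top of per-frame pose labels could be sketched as follows. This is a minimal illustration, not the paper's actual state machine: the pose labels ("standing", "bending", "lying") and the lying-frame alarm threshold are hypothetical, and in the real system the labels would come from the trained SVM classifier rather than being hard-coded.

```python
class FallFSM:
    """Raises a fall alarm when the subject stays in the 'lying' pose for
    several consecutive frames (a hypothetical simplification)."""

    def __init__(self, lying_frames_to_alarm=3):
        self.state = "standing"
        self.lying_count = 0
        self.lying_frames_to_alarm = lying_frames_to_alarm

    def step(self, pose):
        # Count consecutive 'lying' frames; reset on any other pose.
        if pose == "lying":
            self.lying_count += 1
        else:
            self.lying_count = 0
        self.state = pose
        # Alarm only after a sustained lying period, which filters out
        # momentary classifier errors on single frames.
        return self.lying_count >= self.lying_frames_to_alarm

fsm = FallFSM()
frames = ["standing", "bending", "lying", "lying", "lying", "lying"]
alarms = [fsm.step(p) for p in frames]
```

Requiring several consecutive lying frames before alarming is one simple way an FSM can suppress spurious single-frame misclassifications from the pose classifier.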
Dublin City University video track experiments for TREC 2002
Dublin City University participated in the Feature Extraction task and the Search task of the TREC-2002 Video
Track. In the Feature Extraction task, we submitted 3 features: Face, Speech, and Music. In the Search task, we
developed an interactive video retrieval system, which incorporated the 40 hours of the video search test collection and supported user searching using our own feature extraction data along with the donated feature data and ASR transcript from other Video Track groups. This video retrieval system allows a user to specify a query based on the 10 features and ASR transcript, and the query result is a ranked list of videos that can be further browsed at the shot level. To evaluate the usefulness of the feature-based query, we have developed a second system interface that
provides only ASR transcript-based querying, and we conducted an experiment with 12 test users to compare the two systems. Results were submitted to NIST and we are currently conducting further analysis of user performance with the two systems.
Discovery and recognition of motion primitives in human activities
We present a novel framework for the automatic discovery and recognition of
motion primitives in videos of human activities. Given the 3D pose of a human
in a video, human motion primitives are discovered by optimizing the 'motion
flux', a quantity that captures the motion variation of a group of skeletal
joints. A normalization of the primitives is proposed to make them invariant
with respect to a subject's anatomical variations and the data sampling rate.
The discovered primitives are unknown and unlabeled, and are grouped into
classes without supervision via a hierarchical non-parametric Bayesian mixture
model. Once classes are determined and labeled, they are further analyzed to
establish models for recognizing the discovered primitives. Each
primitive model is defined by a set of learned parameters.
Given new video data and the estimated pose of the subject appearing in the
video, the motion is segmented into primitives, which are recognized with a
probability determined by the parameters of the learned models.
Using our framework we build a publicly available dataset of human motion
primitives, using sequences taken from well-known motion capture datasets. We
expect that our framework, by providing an objective way for discovering and
categorizing human motion, will be a useful tool in numerous research fields
including video analysis, human inspired motion generation, learning by
demonstration, intuitive human-robot interaction, and human behavior analysis.
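The flux-based segmentation step could be sketched as below. This is an illustrative reading, not the authors' exact formulation: here 'motion flux' is simplified to the summed per-frame speed of the tracked joints, and primitives are cut wherever the flux drops below a threshold (i.e., where the motion pauses). Joint positions and the threshold are made-up example values.

```python
def motion_flux(poses):
    """poses: list of frames, each a list of (x, y) joint positions.
    Returns the per-frame summed joint displacement (a simplified flux)."""
    flux = []
    for prev, cur in zip(poses, poses[1:]):
        flux.append(sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
                        for (x1, y1), (x2, y2) in zip(prev, cur)))
    return flux

def segment(flux, thresh):
    """Return (start, end) index pairs of contiguous high-flux runs,
    i.e., candidate motion primitives separated by pauses."""
    segs, start = [], None
    for i, f in enumerate(flux):
        if f > thresh and start is None:
            start = i
        elif f <= thresh and start is not None:
            segs.append((start, i))
            start = None
    if start is not None:
        segs.append((start, len(flux)))
    return segs

# One tracked joint: moves, pauses for two frames, then moves again.
poses = [[(0, 0)], [(1, 0)], [(2, 0)], [(2, 0)], [(2, 0)], [(3, 0)], [(4, 0)]]
flux = motion_flux(poses)
primitives = segment(flux, thresh=0.5)
```

Each returned index pair marks one candidate primitive; in the full framework these segments would then be normalized and clustered by the hierarchical non-parametric Bayesian mixture model.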
Towards robots reasoning about group behavior of museum visitors: leader detection and group tracking
The final publication is available at IOS Press through http://dx.doi.org/10.3233/AIS-170467. Peer reviewed. Postprint (author's final draft).
The TREC-2002 video track report
TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open
Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930s and the 1970s by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. 17 teams representing 5 companies and 12 universities - 4 from Asia, 9 from Europe, and 4 from the US - participated in one or more of three tasks in the 2002 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview of, the track framework - the tasks, data, and measures - the approaches taken by the participating groups, the results, and issues regarding the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings.