2,369 research outputs found
Sparsity in Dynamics of Spontaneous Subtle Emotions: Analysis \& Application
Spontaneous subtle emotions are expressed through micro-expressions, which
are tiny, sudden and short-lived dynamics of facial muscles; thus poses a great
challenge for visual recognition. The abrupt but significant dynamics for the
recognition task are temporally sparse while the rest, irrelevant dynamics, are
temporally redundant. In this work, we analyze and enforce sparsity constrains
to learn significant temporal and spectral structures while eliminate
irrelevant facial dynamics of micro-expressions, which would ease the challenge
in the visual recognition of spontaneous subtle emotions. The hypothesis is
confirmed through experimental results of automatic spontaneous subtle emotion
recognition with several sparsity levels on CASME II and SMIC, the only two
publicly available spontaneous subtle emotion databases. The overall
performances of the automatic subtle emotion recognition are boosted when only
significant dynamics are preserved from the original sequences.Comment: IEEE Transaction of Affective Computing (2016
Survey on video anomaly detection in dynamic scenes with moving cameras
The increasing popularity of compact and inexpensive cameras, e.g.~dash
cameras, body cameras, and cameras equipped on robots, has sparked a growing
interest in detecting anomalies within dynamic scenes recorded by moving
cameras. However, existing reviews primarily concentrate on Video Anomaly
Detection (VAD) methods assuming static cameras. The VAD literature with moving
cameras remains fragmented, lacking comprehensive reviews to date. To address
this gap, we endeavor to present the first comprehensive survey on Moving
Camera Video Anomaly Detection (MC-VAD). We delve into the research papers
related to MC-VAD, critically assessing their limitations and highlighting
associated challenges. Our exploration encompasses three application domains:
security, urban transportation, and marine environments, which in turn cover
six specific tasks. We compile an extensive list of 25 publicly-available
datasets spanning four distinct environments: underwater, water surface,
ground, and aerial. We summarize the types of anomalies these datasets
correspond to or contain, and present five main categories of approaches for
detecting such anomalies. Lastly, we identify future research directions and
discuss novel contributions that could advance the field of MC-VAD. With this
survey, we aim to offer a valuable reference for researchers and practitioners
striving to develop and advance state-of-the-art MC-VAD methods.Comment: Under revie
The Evolution of First Person Vision Methods: A Survey
The emergence of new wearable technologies such as action cameras and
smart-glasses has increased the interest of computer vision scientists in the
First Person perspective. Nowadays, this field is attracting attention and
investments of companies aiming to develop commercial devices with First Person
Vision recording capabilities. Due to this interest, an increasing demand of
methods to process these videos, possibly in real-time, is expected. Current
approaches present a particular combinations of different image features and
quantitative methods to accomplish specific objectives like object detection,
activity recognition, user machine interaction and so on. This paper summarizes
the evolution of the state of the art in First Person Vision video analysis
between 1997 and 2014, highlighting, among others, most commonly used features,
methods, challenges and opportunities within the field.Comment: First Person Vision, Egocentric Vision, Wearable Devices, Smart
Glasses, Computer Vision, Video Analytics, Human-machine Interactio
Spatio-temporal action localization with Deep Learning
Dissertação de mestrado em Engenharia InformáticaThe system that detects and identifies human activities are named human action recognition.
On the video approach, human activity is classified into four different categories, depending
on the complexity of the steps and the number of body parts involved in the action, namely
gestures, actions, interactions, and activities, which is challenging for video Human action
recognition to capture valuable and discriminative features because of the human body’s
variations. So, deep learning techniques have provided practical applications in multiple fields
of signal processing, usually surpassing traditional signal processing on a large scale.
Recently, several applications, namely surveillance, human-computer interaction, and video
recovery based on its content, have studied violence’s detection and recognition. In recent
years there has been a rapid growth in the production and consumption of a wide variety of
video data due to the popularization of high quality and relatively low-price video devices.
Smartphones and digital cameras contributed a lot to this factor. At the same time, there are
about 300 hours of video data updates every minute on YouTube. Along with the growing
production of video data, new technologies such as video captioning, answering video surveys,
and video-based activity/event detection are emerging every day. From the video input data,
the detection of human activity indicates which activity is contained in the video and locates
the regions in the video where the activity occurs.
This dissertation has conducted an experiment to identify and detect violence with spatial action localization, adapting a public dataset for effect. The idea was used an annotated
dataset of general action recognition and adapted only for violence detection.O sistema que deteta e identifica as atividades humanas é denominado reconhecimento da
ação humana. Na abordagem por vídeo, a atividade humana é classificada em quatro
categorias diferentes, dependendo da complexidade das etapas e do número de partes do
corpo envolvidas na ação, a saber, gestos, ações, interações e atividades, o que é desafiador
para o reconhecimento da ação humana do vídeo para capturar características valiosas e
discriminativas devido às variações do corpo humano. Portanto, as técnicas de deep learning
forneceram aplicações práticas em vários campos de processamento de sinal, geralmente
superando o processamento de sinal tradicional em grande escala.
Recentemente, várias aplicações, nomeadamente na vigilância, interação humano computador e recuperação de vídeo com base no seu conteúdo, estudaram a deteção e o
reconhecimento da violência. Nos últimos anos, tem havido um rápido crescimento na
produção e consumo de uma ampla variedade de dados de vídeo devido à popularização de
dispositivos de vídeo de alta qualidade e preços relativamente baixos. Smartphones e cameras
digitais contribuíram muito para esse fator. Ao mesmo tempo, há cerca de 300 horas de
atualizações de dados de vídeo a cada minuto no YouTube. Junto com a produção crescente
de dados de vídeo, novas tecnologias, como legendagem de vídeo, respostas a pesquisas de
vídeo e deteção de eventos / atividades baseadas em vídeo estão surgindo todos os dias. A
partir dos dados de entrada de vídeo, a deteção de atividade humana indica qual atividade
está contida no vídeo e localiza as regiões no vídeo onde a atividade ocorre.
Esta dissertação conduziu uma experiência para identificar e detetar violência com localização
espacial, adaptando um dataset público para efeito. A ideia foi usada um conjunto de dados
anotado de reconhecimento de ações gerais e adaptá-la apenas para deteção de violência
Background Subtraction in Video Surveillance
The aim of thesis is the real-time detection of moving and unconstrained surveillance environments monitored with static cameras. This is achieved based on the results provided by background subtraction. For this task, Gaussian Mixture Models (GMMs) and Kernel density estimation (KDE) are used. A thorough review of state-of-the-art formulations for the use of GMMs and KDE in the task of background subtraction reveals some further development opportunities, which are tackled in a novel GMM-based approach incorporating a variance controlling scheme. The proposed approach method is for parametric and non-parametric and gives us the better method for background subtraction, with more accuracy and easier parametrization of the models, for different environments. It also converges to more accurate models of the scenes. The detection of moving objects is achieved by using the results of background subtraction. For the detection of new static objects, two background models, learning at different rates, are used. This allows for a multi-class pixel classification, which follows the temporality of the changes detected by means of background subtraction. In a first approach, the subtraction of background models is done for parametric model and their results are shown. The second approach is for non-parametric models, where background subtraction is done using KDE non-parametric model. Furthermore, we have done some video engineering, where the background subtraction algorithm was employed so that, the background from one video and the foreground from another video are merged to form a new video. By doing this way, we can also do more complex video engineering with multiple videos. Finally, the results provided by region analysis can be used to improve the quality of the background models, therefore, considerably improving the detection results
Applications of Factorization Theorem and Ontologies for Activity ModelingRecognition and Anomaly Detection
In this thesis two approaches for activity modeling and suspicious activity detection are
examined. First is application of factorization theorem extension for deformable models in two
dierent contexts. First is human activity detection from joint position information, and second
is suspicious activity detection for tarmac security. It is shown that the first basis vector from
factorization theorem is good enough to dierentiate activities for human data and to distinguish
suspicious activities for tarmac security data.
Second approach dierentiates individual components of those activities using semantic methodol-
ogy. Although currently mainly used for improving search and information retrieval, we show that
ontologies are applicable to video surveillance. We evaluate the domain ontologies from Challenge
Project on Video Event Taxonomy sponsored by ARDA from the perspective of general ontology
design principles. We also focused on the eect of the domain on the granularity of the ontology
for suspicious activity detection
- …