Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates competitive
performance compared to the state of the art. In particular, binary versions
of our dynamic texture descriptors are shown to achieve improved performance
compared to a large range of similar methods using different primitives,
either handcrafted or learned from data. Further, our qualitative and
quantitative investigation into parameter choices and the use of different
sets of receptive fields highlights the robustness and flexibility of our
approach. Together, these results support the descriptive power of this family
of time-causal spatio-temporal receptive fields, validate our approach for
dynamic texture recognition, and point towards the possibility of designing a
range of video analysis methods based on these new time-causal spatio-temporal
primitives.
Comment: 29 pages, 16 figures
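The descriptor construction lends itself to a compact sketch. The Python sketch below builds a regional joint histogram of quantised derivative responses over a video volume. It is a rough illustration only: scipy's Gaussian derivative filters stand in for the paper's time-causal, time-recursive receptive fields, and the scales and bin counts are arbitrary assumptions rather than the authors' settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def joint_histogram_descriptor(video, sigma=(1.0, 2.0, 2.0), n_bins=4):
    """Regional joint histogram of spatio-temporal derivative responses.

    video : ndarray of shape (T, H, W), grayscale.
    NOTE: plain Gaussian derivatives stand in for the paper's
    time-causal receptive fields (an assumption for illustration).
    """
    # First-order derivative responses along t, y and x.
    responses = [
        gaussian_filter(video, sigma, order=o)
        for o in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    ]
    # Quantise each response channel into n_bins levels (by quantiles).
    codes = []
    for r in responses:
        edges = np.quantile(r, np.linspace(0, 1, n_bins + 1)[1:-1])
        codes.append(np.digitize(r, edges))
    # Joint code: one integer per voxel combining all channels.
    joint = codes[0]
    for c in codes[1:]:
        joint = joint * n_bins + c
    hist = np.bincount(joint.ravel(), minlength=n_bins ** len(codes))
    return hist / hist.sum()  # normalised histogram descriptor

video = np.random.rand(16, 64, 64)  # stand-in clip
print(joint_histogram_descriptor(video).shape)  # (64,)
```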
Convolutional Neural Network on Three Orthogonal Planes for Dynamic Texture Classification
Dynamic Textures (DTs) are sequences of images of moving scenes, such as
smoke, vegetation and fire, that exhibit certain stationarity properties in
time. The analysis of DTs is important for recognition, segmentation, synthesis or
retrieval for a range of applications including surveillance, medical imaging
and remote sensing. Deep learning methods have shown impressive results and are
now the new state of the art for a wide range of computer vision tasks
including image and video recognition and segmentation. In particular,
Convolutional Neural Networks (CNNs) have recently proven to be well suited for
texture analysis with a design similar to a filter bank approach. In this
paper, we develop a new approach to DT analysis based on a CNN applied
on the three orthogonal planes xy, xt and yt. We train CNNs on spatial frames
and temporal slices extracted from the DT sequences and combine their outputs
to obtain a competitive DT classifier. Our results on a wide range of commonly
used DT classification benchmark datasets demonstrate the robustness of our
approach. Significant improvement over the state of the art is shown on the
larger datasets.
Comment: 19 pages, 10 figures
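The plane decomposition at the core of the method is easy to reproduce. Below is a minimal numpy sketch, not the authors' pipeline: it extracts the xy, xt and yt slices from a video volume and fuses per-plane class scores by simple averaging, where the averaging rule and the toy score vectors are illustrative assumptions.

```python
import numpy as np

def orthogonal_plane_slices(video):
    """Split a (T, H, W) video volume into slices on the three
    orthogonal planes used for per-plane CNN training."""
    xy = [video[t, :, :] for t in range(video.shape[0])]  # spatial frames
    xt = [video[:, y, :] for y in range(video.shape[1])]  # rows over time
    yt = [video[:, :, x] for x in range(video.shape[2])]  # columns over time
    return xy, xt, yt

def fuse_plane_scores(score_xy, score_xt, score_yt):
    """Late fusion of per-plane class scores (a simple average here;
    the paper's exact combination rule may differ)."""
    return (score_xy + score_xt + score_yt) / 3.0

video = np.random.rand(32, 48, 48)
xy, xt, yt = orthogonal_plane_slices(video)
print(len(xy), xy[0].shape, len(xt), xt[0].shape)  # 32 (48, 48) 48 (32, 48)

# Dummy per-plane softmax outputs for a 5-class problem.
scores = [np.random.dirichlet(np.ones(5)) for _ in range(3)]
print(fuse_plane_scores(*scores).argmax())
```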
Algorithm for Dynamic Texture Analysis (in Russian)
Recognizing dynamic patterns from visual input is significant for many applications, such as remote monitoring for the prevention of natural disasters (e.g. forest fires), various types of surveillance (e.g. traffic monitoring), background subtraction in challenging environments (e.g. outdoor scenes with vegetation), homeland security, and scientific studies of animal behavior. In the context of surveillance, recognizing dynamic patterns makes it possible to isolate activities of interest (e.g. fire) from distracting background (e.g. windblown vegetation and changes in scene illumination).
Methods: pattern recognition, computer vision.
Results: This paper presents a video-based image processing algorithm for samples that typically contain a cluttered background. Based on their spatio-temporal features, dynamic textures are divided into four categories, by type of motion (periodic or chaotic) and by type of object of interest (natural or artificial), and the recognition algorithm assigns image objects to one of these categories. Motion, color, fractal, Laws energy and ELBP features are extracted for dynamic texture categorization. Classification is based on a boosted random forest.
Practical relevance: Experimental results show that the proposed method is feasible and effective for video-based dynamic texture categorization. The averaged classification accuracy over all video sequences is 95.2%.
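A rough outline of such a pipeline can be sketched with scikit-image and scikit-learn. This is only a sketch of the general approach: plain uniform LBP histograms stand in for the ELBP features, single frames stand in for the full spatio-temporal feature set, and GradientBoostingClassifier stands in for the boosted random forest; none of these are the paper's exact components.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.ensemble import GradientBoostingClassifier

def lbp_histogram(frame, P=8, R=1.0):
    """Uniform LBP histogram of one grayscale frame (a stand-in
    for the ELBP features used in the paper)."""
    lbp = local_binary_pattern(frame, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2))
    return hist / hist.sum()

# Toy data: 40 random uint8 "frames" in 4 texture categories.
rng = np.random.default_rng(0)
X = np.array([
    lbp_histogram(rng.integers(0, 256, size=(64, 64), dtype=np.uint8))
    for _ in range(40)
])
y = rng.integers(0, 4, size=40)  # category labels

clf = GradientBoostingClassifier().fit(X, y)  # boosted trees
print(clf.predict(X[:5]))
```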
Unleashing the Power of VGG16: Advancements in Facial Emotion Recognition
In facial emotion detection, researchers are actively exploring effective methods to identify and understand facial expressions. This study introduces a novel mechanism for emotion identification using diverse facial photos captured under varying lighting conditions. A meticulously pre-processed dataset ensures data consistency and quality. Leveraging deep learning architectures, the study utilizes feature extraction techniques to capture subtle emotive cues and builds an emotion classification model using convolutional neural networks (CNNs). The proposed methodology achieves an impressive 97% accuracy on the validation set, outperforming previous methods in accuracy and robustness. Challenges such as lighting variations, head posture, and occlusions are acknowledged, and multimodal approaches incorporating additional modalities such as auditory or physiological data are suggested for further improvement. The outcomes of this research have wide-ranging implications for affective computing, human-computer interaction, and mental health diagnosis, advancing the field of facial emotion identification and paving the way for sophisticated technology capable of understanding and responding to human emotions across diverse domains.
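A transfer-learning setup of the kind described is easy to sketch with Keras. The sketch below is a plausible reconstruction, not the study's actual model: the frozen ImageNet VGG16 backbone, the pooling head, the seven emotion classes, and all hyperparameters are illustrative assumptions.

```python
import tensorflow as tf

NUM_CLASSES = 7  # assumed emotion set, e.g. FER-style labels

# Pre-trained VGG16 backbone, frozen for feature extraction.
base = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets assumed
```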
Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g. onset and offset phases for "smile", running and jumping for
"highjump"). The proposed model is inspired by recent work on Multiple
Instance Learning and latent SVM/HCRF; it extends such frameworks to
approximately model the ordinal structure of the videos. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video-based facial analysis datasets for the prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text overlap with arXiv:1604.0150
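The ordinal constraint at the heart of such a model can be illustrated in isolation. The sketch below is a simplified reconstruction, not the paper's learning procedure: given per-frame scores for K sub-event templates (which the full method would learn discriminatively), a small dynamic program finds the best temporally ordered assignment of sub-events to frames.

```python
import numpy as np

def best_ordered_assignment(S):
    """Max-score assignment of K sub-events to frames t_1 < ... < t_K.

    S : (K, T) array, S[k, t] = score of sub-event k at frame t.
    Returns (total score, frame indices), via dynamic programming.
    """
    K, T = S.shape
    dp = np.full((K, T), -np.inf)      # dp[k, t]: best score, event k at t
    parent = np.full((K, T), -1, dtype=int)
    dp[0] = S[0]
    for k in range(1, K):
        best_val, best_idx = -np.inf, -1
        for t in range(1, T):
            # Running max over earlier frames t' < t for event k-1.
            if dp[k - 1, t - 1] > best_val:
                best_val, best_idx = dp[k - 1, t - 1], t - 1
            if best_idx >= 0:
                dp[k, t] = S[k, t] + best_val
                parent[k, t] = best_idx
    # Backtrack from the best final-event frame.
    t = int(np.argmax(dp[-1]))
    path = [t]
    for k in range(K - 1, 0, -1):
        t = parent[k, t]
        path.append(t)
    return float(dp[-1].max()), path[::-1]

S = np.random.rand(3, 10)  # 3 sub-events scored over 10 frames
print(best_ordered_assignment(S))
```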
Efficient Human Activity Recognition in Large Image and Video Databases
Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research deals with building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions such as the use of motion cues to determine video descriptors, application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we deal with important but overlooked issues such as efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g. indoor surveillance), unconstrained videos (e.g. YouTube), depth or skeletal data (e.g. captured by Kinect), and person images (e.g. Flickr). In particular, we are interested in answering questions like: (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of high-dimension, low-sample-size (HDLSS) nature, how can human actions be efficiently recognized in such data? (c) considering the rich 3D motion information available from depth or motion capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of the underlying activities? and (d) can motion information from monocular videos be used to automatically determine saliency regions for recognizing actions in still images?
- …