89 research outputs found

Action recognition from RGB-D data

Author: Wang Pichao
Publication venue: School of Computing and Information Technology
Publication date: 01/01/2017

In recent years, action recognition based on RGB-D data has attracted increasing attention. Unlike traditional 2D action recognition, RGB-D data contains additional depth and skeleton modalities, and each modality has its own characteristics. This thesis presents seven novel methods that take advantage of the three modalities for action recognition. First, effective handcrafted features are designed, and a frequent pattern mining method is employed to mine the most discriminative, representative and non-redundant features for skeleton-based action recognition. Second, to take advantage of powerful Convolutional Neural Networks (ConvNets), it is proposed to represent the spatio-temporal information carried in 3D skeleton sequences in three 2D images by encoding the joint trajectories and their dynamics into the color distribution of the images, and ConvNets are adopted to learn discriminative features for human action recognition. Third, for depth-based action recognition, three data augmentation strategies are proposed so that ConvNets can be applied to small training datasets. Fourth, to take full advantage of the 3D structural information offered by the depth modality and its insensitivity to illumination variations, three simple, compact yet effective image-based representations are proposed, and ConvNets are adopted for feature extraction and classification. However, both of the previous two methods are sensitive to noise and cannot differentiate fine-grained actions well. Fifth, to deal with this issue, it is proposed to represent a depth map sequence as three pairs of structured dynamic images at the body, part and joint levels, respectively, through bidirectional rank pooling. The structured dynamic image preserves the spatio-temporal information, enhances the structural information across both body parts/joints and different temporal scales, and takes advantage of ConvNets for action recognition.
Sixth, it is proposed to extract and use scene flow for action recognition from RGB and depth data. Last, to exploit the joint information in multi-modal features arising from heterogeneous sources (RGB, depth), it is proposed to cooperatively train a single ConvNet (referred to as c-ConvNet) on both RGB features and depth features, and to deeply aggregate the two modalities to achieve robust action recognition.

Research Online

Detection of interpersonal physical violence in videos using deep learning techniques (Detección de situaciones de violencia física interpersonal en videos usando técnicas de aprendizaje profundo)

Author: Cuadros Ramos Kent Jhunior
Publication venue: 'Baishideng Publishing Group Inc.'
Publication date: 01/01/2023

This work designs an architecture that combines the Xception convolutional neural network model with an LSTM to detect interpersonal physical violence in videos from surveillance systems. Owing to rising insecurity in the country, and as a preventive measure, the aim was to strengthen the video surveillance system, focusing on the need to integrate new technologies, such as computer vision, for monitoring public safety. The Hockey Fight Dataset and the Real Life Violence Situations Dataset were used to train, validate and test the proposed model architecture. On the Hockey Fight Dataset, the accuracy of the proposed approach surpassed that of all other methods. On the Real Life Violence Situations Dataset, which contains 2000 videos, in contrast to other datasets used for violence detection, good results were obtained, with accuracy above 90%. Peru. Universidad Nacional Mayor de San Marcos. Vicerrectorado de Investigación y Posgrado. Research Projects with Funding for Research Groups. PCONFIGI. Code: C21201361. Resolution: 005753-2021-R/UNMS

Cybertesis UNMSM

Understanding egocentric human actions with temporal decision forests

Author: Garcia-Hernando Guillermo
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/09/2018

Understanding human actions is a fundamental task in computer vision with a wide range of applications including pervasive health-care, robotics and game control. This thesis focuses on the problem of egocentric action recognition from RGB-D data, wherein the world is viewed through the eyes of the actor whose hands describe the actions. The main contributions of this work are its findings regarding egocentric actions as described by hands in two application scenarios and a proposal of a new technique that is based on temporal decision forests. The thesis first introduces a novel framework to recognise fingertip writing in mid-air in the context of human-computer interaction. This framework detects whether the user is writing and tracks the fingertip over time to generate spatio-temporal trajectories that are recognised by using a Hough forest variant that encourages temporal consistency in prediction. A problem with using such a forest approach for action recognition is that the learning of temporal dynamics is limited to hand-crafted temporal features and temporal regression, which may break the temporal continuity and lead to inconsistent predictions. To overcome this limitation, the thesis proposes transition forests. In addition to any temporal information encoded in the feature space, the forest automatically learns the temporal dynamics during training, and these are exploited at inference in an online and efficient manner, achieving state-of-the-art results. The last contribution of this thesis is its introduction of the first RGB-D benchmark to allow for the study of egocentric hand-object actions with both hand and object pose annotations. This study conducts an extensive evaluation of different baselines, state-of-the-art approaches and temporal decision forest models using colour, depth and hand pose features.
Furthermore, it extends the transition forest model to incorporate data from different modalities and demonstrates the benefit of using hand pose features to recognise egocentric human actions. The thesis concludes by discussing and analysing the contributions and proposing a few ideas for future work. Open Access

Spiral - Imperial College Digital Repository

Contextual Understanding of Sequential Data Across Multiple Modalities

Author: Cho Sangwoo
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2021

In recent years, progress in computing and networking has made it possible to collect large volumes of data for various applications in data mining and data analytics using machine learning methods. Data may come from different sources and in different shapes and forms depending on their inherent nature and the acquisition process. In this dissertation, we focus specifically on sequential data, which have been growing exponentially in recent years on platforms such as YouTube, social media and news agency sites. An important characteristic of sequential data is the inherent causal structure with latent patterns that can be discovered and learned from samples of the dataset. With this in mind, we target problems in two different domains, Computer Vision and Natural Language Processing, that deal with sequential data and share the common characteristics of such data. The first one is action recognition based on video data, which is a fundamental problem in computer vision. This problem aims to find generalized patterns from videos to recognize or predict human actions. A video contains two important sets of information, i.e. appearance and motion. These two kinds of information are complementary, and therefore accurate recognition or prediction of activities or actions in video data depends significantly on our ability to extract them both. However, effective extraction of this information is a non-trivial task due to several challenges, such as viewpoint changes, camera motions, and scale variations, to name a few. It is thus crucial to design effective and generalized representations of video data that learn these variations and/or are invariant to them. We propose different models that learn and extract spatio-temporal correlations from video frames by using deep networks that overcome these challenges.
The second problem that we study in this dissertation in the context of sequential data analysis is text summarization in multi-document processing. Sentences consist of sequences of words that imply context. The summarization task requires learning and understanding the contextual information from each sentence in order to determine which subset of sentences best represents a given article. With the progress made by deep learning, better representations of words have been achieved, leading in turn to better contextual representations of sentences. We propose summarization methods that combine mathematical optimization, Determinantal Point Processes (DPPs), and deep learning models that outperform the state of the art in multi-document text summarization.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Drones Detection Using Smart Sensors

Author: Moafa Aishah
Publication venue: Scholarly Commons
Publication date: 01/04/2020

Drones are a modern and sophisticated technology that has been used in numerous fields. Nowadays, many countries use them in exploration, reconnaissance operations, and espionage in military operations. Drones also have many uses in daily life, such as home delivery and safety monitoring, among others. However, the use of drones is a double-edged sword. Drones can be used for positive purposes to improve the quality of human lives, but they can also be used for criminal and other detrimental purposes. In fact, many countries have been attacked by terrorists using smart drones. Hence, drone detection is an active area of research that receives the attention of many scholars. Advanced drones are often difficult to detect and can therefore sometimes be life-threatening. Currently, most detection methods are based on video, sound, radar, temperature, radio frequency (RF), or Wi-Fi techniques. However, each detection method has several flaws that make it an imperfect choice for drone detection in sensitive areas. Our aim is to overcome the challenges that most existing drone detection techniques face. In this thesis, we propose two modeling techniques and compare them to produce an efficient system for drone detection. Specifically, we compare the two proposed models by investigating the risk assessments and the probability of success for each model.

Embry-Riddle Aeronautical University

Predicting Intentions from Motion: The Subject-Adversarial Adaptation Approach

Author: Becchio Cristina
Cavallo Andrea
Cavazza Jacopo
Morerio Pietro
Murino Vittorio
Volpi Riccardo
Zunino Andrea
Publication venue
Publication date: 18/09/2019

Catalogo dei prodotti della ricerca

Open Access Repository

Institutional Research Information System University of Turin

Recent advances in video analytics for rail network surveillance for security, trespass and suicide prevention— a survey

Author: Aftab W.
Bosworth G.
Fletcher D.
Langran-Wheeler C.
Maddock S.
Mihaylova L.
Rigby S.
Zhang T.
Publication venue: 'MDPI AG'
Publication date: 01/06/2022

Railway network systems are by design open and accessible to people, but this presents challenges in the prevention of events such as terrorism, trespass, and suicide fatalities. With the rapid advancement of machine learning, numerous computer vision methods have been developed in closed-circuit television (CCTV) surveillance systems for the purposes of managing public spaces. These methods are built on multiple types of sensors and are designed to automatically detect static objects and unexpected events, monitor people, and prevent potential dangers. This survey focuses on recently developed CCTV surveillance methods for rail networks, discusses the challenges they face and their advantages and disadvantages, and presents a vision for future railway surveillance systems. State-of-the-art methods for object detection and behaviour recognition applied to rail network surveillance systems are introduced, and the ethics of handling personal data and the use of automated systems are also considered.

Directory of Open Access Journals

PubMed Central

White Rose Research Online

Towards a Data-Driven Military: A multidisciplinary perspective

Author
Publication venue
Publication date: 01/01/2023

International Migration, Integration and Social Cohesion online publications

Data-driven maintenance of military systems: Potential and challenges

Author: Homborg A.M.
Rijsdijk Chris
Tinga Tiedo
Publication venue: Leiden University Press
Publication date: 01/04/2023

The success of military missions is largely dependent on the reliability and availability of the systems that are used. In modern warfare, data is considered an important weapon, both in offence and defence. However, collection and analysis of the proper data can also play a crucial role in reducing the number of system failures, and thus considerably increase system availability and military performance. In this chapter, the concept of data-driven maintenance is introduced. First, the various maturity levels, ranging from detection of failures and automated diagnostics to advanced condition monitoring and predictive maintenance, are introduced. Then, the different types of data and associated decisions are discussed. Finally, six practical cases from the Dutch MoD are used to demonstrate the benefits of this concept and to discuss the challenges encountered in applying it in military practice.

University of Twente Research Information