223 research outputs found

    Local and Global Contextual Features Fusion for Pedestrian Intention Prediction

    Autonomous vehicles (AVs) are becoming an indispensable part of future transportation. However, safety challenges and lack of reliability limit their real-world deployment. Towards increasing the presence of AVs on the roads, the interaction of AVs with pedestrians, including "prediction of the pedestrian crossing intention", deserves extensive research. This is a highly challenging task as it involves multiple non-linear parameters. In this direction, we extract and analyse spatio-temporal visual features of both pedestrian and traffic contexts. The pedestrian features include body pose and local context features that represent the pedestrian's behaviour. Additionally, to understand the global context, we utilise location, motion, and environmental information using scene parsing technology that represents the pedestrian's surroundings, which may affect the pedestrian's intention. Finally, these multi-modality features are intelligently fused for effective intention prediction learning. The experimental results of the proposed model on the JAAD dataset show a superior result on the combined AUC and F1-score compared to the state-of-the-art.

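    PENGENALAN MANUSIA BERBASIS PADA SINGLE-GAIT MENGGUNAKAN METODE MODIFIKASI LATENT CONDITIONAL RANDOM FIELD (L-CRF) [Single-Gait-Based Human Recognition Using a Modified Latent Conditional Random Field (L-CRF) Method]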

    Gait recognition is a branch of computer vision that identifies subjects (humans) at a distance without relying on biometric traits such as the iris, face, and fingerprints. Latent Conditional Random Field (L-CRF) is one of the single-gait recognition algorithms with better results. Although its accuracy for subjects walking normally (#NM) is better, accuracy remains a problem for other walking conditions such as carrying a bag (#BG) and wearing a coat (#CL). Modified Latent Conditional Random Field (mL-CRF) is a method related to L-CRF that differs in its pairwise parameters. Its advantage is better results when training and testing on data from an identical domain. This study uses silhouette frames from the CASIA gait database B dataset, which contains 124 subjects with 110 sequences per subject. The mL-CRF data processing is carried out for each training sample (LT74 & MT62) and 11 viewing angles, and the results are compared with unmodified L-CRF as well as with previous studies. In this study, LT74 with mL-CRF is the best training sample, yielding accuracy improvements of 0.89% (#NM), 1.32% (#BG), and 1.54% (#CL) over unmodified L-CRF.

    Behavioral Intention Prediction in Driving Scenes: A Survey

    In the driving scene, road agents frequently interact with and interpret the intentions of their surroundings. The ego-agent (each road agent itself) continually predicts what behavior other road users will engage in and expects a shared and consistent understanding for safe movement. Behavioral Intention Prediction (BIP) simulates such a human consideration process and fulfills the early prediction of specific behaviors. Similar to other prediction tasks, such as trajectory prediction, data-driven deep learning methods have become the primary pipeline in research. The rapid development of BIP inevitably leads to new issues and challenges. To catalyze future research, this work provides a comprehensive review of BIP from the available datasets, key factors and challenges, pedestrian-centric and vehicle-centric BIP approaches, and BIP-aware applications. Based on the investigation, data-driven deep learning approaches have become the primary pipelines. The behavioral intention types are still monotonous in most current datasets and methods (e.g., Crossing (C) and Not Crossing (NC) for pedestrians and Lane Changing (LC) for vehicles) in this field. In addition, for safety-critical scenarios (e.g., near-crash situations), current research is limited. Through this investigation, we identify open issues in behavioral intention prediction and suggest possible insights for future research.

    Predicting pedestrian crossing intentions using contextual information

    The urban environment is one of the most complex scenarios for an autonomous vehicle, as it is shared with other types of users known as vulnerable road users, with pedestrians as their principal representative. These users are characterized by their great dynamicity. Despite a large number of interactions between vehicles and pedestrians, the safety of pedestrians has not increased at the same rate as that of vehicle occupants. For this reason, it is necessary to address this problem. One possible strategy would be anticipating pedestrian behavior to minimize risky situations, especially during the crossing. The objective of this doctoral thesis is to achieve such anticipation through the development of pedestrian crossing action prediction techniques based on deep learning. Before the design and implementation of the prediction systems, a classification system has been developed to discern the pedestrians involved in the road scene. The system, based on convolutional neural networks, has been trained and validated with a customized dataset. This set has been built from several existing sets and augmented by including images obtained from the Internet. This pre-anticipation step would reduce unnecessary processing within the vehicle perception system. After this step, two systems have been developed as a proposal to solve the prediction problem. The first system is composed of convolutional and recurrent encoder networks. It obtains a short-term prediction of the crossing action performed one second in the future. The input information to the model is mainly image-based, which provides additional pedestrian context. In addition, the use of pedestrian-related variables and architectural improvements allows better results on the JAAD dataset. The second system is an end-to-end architecture based on the combination of three-dimensional convolutional neural networks and/or the Transformer architecture encoder. In this model, most of the proposed and investigated improvements are focused on transformations of the input data. After an extensive set of individual tests, several models have been trained, evaluated, and compared with other methods using both JAAD and PIE datasets. Obtained results are among the best state-of-the-art models, validating the proposed architecture.

    Recognising activities by jointly modelling actions and their effects

    With the rapid increase in adoption of consumer technologies, including inexpensive but powerful hardware, robotics appears poised at the cusp of widespread deployment in human environments. A key barrier that still prevents this is the machine understanding and interpretation of human activity, through a perceptual medium such as computer vision, or RGB-D sensing such as with the Microsoft Kinect sensor. This thesis contributes novel video-based methods for activity recognition. Specifically, the focus is on activities that involve interactions between the human user and objects in the environment. Based on streams of poses and object tracking, machine learning models are provided to recognize a variety of these interactions. The main contributions of the thesis are (1) a new model for interactions that explicitly learns the human-object relationships through a latent distributed representation, (2) a practical framework for labeling chains of manipulation actions in temporally extended activities and (3) an unsupervised sequence segmentation technique that relies on slow feature analysis and spectral clustering. These techniques are validated by experiments with publicly available data sets, such as the Cornell CAD-120 activity corpus, one of the most extensive publicly available data sets of this kind that is also annotated with ground truth information. Our experiments demonstrate the advantages of the proposed methods over state-of-the-art alternatives from the recent literature on sequence classifiers.

    Recent Advances in Social Data and Artificial Intelligence 2019

    The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts of, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace.

    Robust localization with wearable sensors

    Measuring physical movements of humans and understanding human behaviour is useful in a variety of areas and disciplines. Human inertial tracking is a method that can be leveraged for monitoring complex actions that emerge from interactions between human actors and their environment. An accurate estimation of motion trajectories can support new approaches to pedestrian navigation, emergency rescue, athlete management, and medicine. However, tracking with wearable inertial sensors has several problems that need to be overcome, such as the low accuracy of consumer-grade inertial measurement units (IMUs), the error accumulation problem in long-term tracking, and the artefacts generated by movements that are less common. This thesis focusses on measuring human movements with wearable head-mounted sensors to accurately estimate the physical location of a person over time. The research consisted of (i) providing an overview of the current state of research for inertial tracking with wearable sensors, (ii) investigating the performance of new tracking algorithms that combine sensor fusion and data-driven machine learning, (iii) eliminating the effect of random head motion during tracking, (iv) creating robust long-term tracking systems with a Bayesian neural network and sequential Monte Carlo method, and (v) verifying that the system can be applied with changing modes of behaviour, defined as natural transitions from walking to running and vice versa. This research introduces a new system for inertial tracking with head-mounted sensors (which can be placed in, e.g., helmets, caps, or glasses). This technology can be used for long-term positional tracking to explore complex behaviours.