104 research outputs found
Abordagem Computacional para Detecção Automatizada por Imagem do Uso de Cinto de Segurança em Condutores baseado em Redes Neurais Convolucionais [A Computational Approach for Automated Image-Based Detection of Driver Seat-Belt Use Based on Convolutional Neural Networks]
Undergraduate thesis (TCC), Universidade Federal de Santa Catarina, Campus Araranguá, Computer Engineering.
The sustainable development of any smart city depends directly on several factors, among them good urban planning that makes optimized and safe use of vehicle flow routes. However, a higher flow capacity implies a larger number of vehicles circulating on the streets, which in turn requires significantly better maintenance and signage services, as well as driver awareness of traffic rules and seat-belt use. Moreover, a larger technical staff is required to respond to anomalous situations, inspect the road network, and punish offenders. Regarding the use of the seat belt, an essential mechanism for reducing mortality in traffic accidents, enforcement by human traffic inspectors is laborious, slow, and error-prone. In this context, the present study proposes a computational approach for the automated, image-based detection of seat-belt use, from images captured on traffic routes, as an auxiliary monitoring tool that discourages violations and reduces costs. In the proposed approach, Convolutional Neural Networks were used as recognizers, trained on a dataset built specifically from vehicle images under what is considered the ideal inspection scenario. Experimental results show a mean average precision (mAP) of 90.87% on a set of 3,000 images from the dataset used, demonstrating the technical feasibility of building decision-support systems for traffic enforcement.
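As a hedged sketch of how a reported mAP figure of this kind is typically computed, the following pure-Python function derives all-points-interpolated average precision from a confidence-ranked list of detections; the function name and the IoU-matching convention are assumptions for illustration, not taken from the thesis. mAP is the mean of this value across classes.

```python
def average_precision(ranked_hits, num_gt):
    """All-points-interpolated average precision for one class.

    ranked_hits: booleans, one per detection sorted by descending
    confidence; True where the detection matched a ground-truth box
    (e.g. IoU >= 0.5 -- the matching rule is an assumption here).
    num_gt: total number of ground-truth objects for this class.
    """
    tp = fp = 0
    precisions, recalls = [], []
    for hit in ranked_hits:
        tp += 1 if hit else 0
        fp += 0 if hit else 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    ap, prev_recall = 0.0, 0.0
    for i, r in enumerate(recalls):
        # Precision envelope: best precision achievable at recall >= r.
        ap += (r - prev_recall) * max(precisions[i:])
        prev_recall = r
    return ap
```

A ranking that recovers three of four ground-truth seat belts, with one false positive mid-list, yields an AP of 0.6875 under this scheme.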
Vision-based Driver State Monitoring Using Deep Learning
Road accidents cause thousands of injuries and loss of life every year, ranking among the leading causes of death by lifetime odds. More than 90% of traffic accidents are caused by human error [1], including sight obstruction, failure to spot danger through inattention, speeding, expectation errors, and other reasons. In recent years, driver monitoring systems (DMS) have been rapidly studied and developed for use in commercial vehicles to prevent car crashes caused by human error. A DMS is a vehicle safety system that monitors the driver’s attention and issues warnings when necessary. Such a system may contain multiple modules that detect the human factors most associated with accidents, such as drowsiness and distraction. Typical DMS approaches seek driver distraction cues from vehicle acceleration and steering (vehicle-based approach), driver physiological signals (physiological approach), or driver behaviours (behavioural approach). Behavioural driver state monitoring has numerous advantages over its vehicle-based and physiological counterparts, including fast responsiveness and non-intrusiveness. In addition, recent breakthroughs in deep learning enable high-level action and face recognition, expanding driver monitoring coverage and improving model performance. This thesis presents CareDMS, a behavioural driver monitoring system using deep learning methods. CareDMS consists of driver anomaly detection and classification, gaze estimation, and emotion recognition. Each approach is developed with state-of-the-art deep learning solutions to address the shortcomings of current DMS functionalities. Combined with a classic drowsiness detection method, CareDMS thoroughly covers three major types of distraction: physical (hands off the steering wheel), visual (eyes off the road ahead), and cognitive (mind off driving).
There are numerous challenges in behavioural driver state monitoring. Current driver distraction detection methods either lack detailed distraction classification or fail to generalize to unknown driver anomalies. This thesis introduces a novel two-phase proposal and classification network architecture. It can flag all forms of distracted driving and recognize driver actions simultaneously, which provides the downstream DMS with important information for warning-level customization. Next, gaze estimation for driver monitoring is difficult, as drivers tend to make large head movements while driving. This thesis proposes a video-based neural network that jointly learns head pose and gaze dynamics. The design significantly reduces the per-head-pose variance of gaze estimation performance compared to benchmarks. Furthermore, emotional driving, such as road rage and sadness, can seriously impair driving performance. However, individuals vary widely in their emotional expressions, which makes vision-based emotion recognition a challenging task. This work proposes an efficient and versatile multimodal fusion module that effectively fuses facial expression and human voice for emotion recognition; visible advantages are demonstrated over using a single modality. Finally, the driver state monitoring system, CareDMS, converts the output of each functionality into a specific driver-status measurement and integrates the various measurements into the driver’s level of alertness.
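The two-phase proposal-and-classification idea described above can be illustrated with a minimal sketch: phase one proposes whether a frame shows any distraction at all, and phase two classifies it among known types, with an open-set fallback for behaviours outside the training classes. The function name, thresholds, score layout, and labels below are hypothetical, not taken from the thesis.

```python
def two_phase_dms(anomaly_score, class_scores,
                  anomaly_threshold=0.5, known_threshold=0.5):
    """Two-phase distraction handling (illustrative sketch).

    anomaly_score: phase-1 score that the frame shows *any* distraction.
    class_scores: phase-2 per-class scores over known distraction types.
    """
    # Phase 1: binary proposal -- normal driving vs. any anomaly.
    if anomaly_score < anomaly_threshold:
        return "normal driving"
    # Phase 2: classify among known types; a weak best score suggests
    # a driver anomaly not seen during training (open-set fallback).
    label, confidence = max(class_scores.items(), key=lambda kv: kv[1])
    return label if confidence >= known_threshold else "unknown distraction"
```

A downstream warning module could then customize alert levels per returned label, which is the kind of information the two-phase design is meant to supply.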
A Context Aware Classification System for Monitoring Driver’s Distraction Levels
Understanding the safety measures involved in developing futuristic self-driving cars is a concern for decision-makers, civil society, consumer groups, and manufacturers. Researchers are trying to thoroughly test and simulate various driving contexts to make these cars fully secure for road users. Including the vehicle’s surroundings offers an ideal way to monitor context-aware situations and incorporate the various hazards. In this regard, different studies have analysed drivers’ behaviour under different scenarios and scrutinised the external environment to obtain a holistic view of vehicles and the environment. Studies show that the primary cause of road accidents is driver distraction, and there is a thin line that separates the transition from careless to dangerous driving. While there has been significant improvement in advanced driver assistance systems, current measures neither detect the severity of distraction nor account for its context, both of which can aid in preventing accidents. Also, no compact study provides a complete model for transitioning control from the driver to the vehicle when a high degree of distraction is detected.
The current study proposes a context-aware severity model to detect safety issues related to driver distraction, considering physiological attributes, activities, and context-aware factors such as the environment and the vehicle. First, a novel three-phase Fast Recurrent Convolutional Neural Network (Fast-RCNN) architecture addresses the physiological attributes. Secondly, a novel two-tier FRCNN-LSTM framework is devised to classify the severity of driver distraction. Thirdly, a Dynamic Bayesian Network (DBN) is employed to predict driver distraction. The study further proposes the Multiclass Driver Distraction Risk Assessment (MDDRA) model, which can be adopted in a context-aware driving distraction scenario. Then, a three-way hybrid CNN-DBN-LSTM model is developed to classify the degree of driver distraction according to severity level. In addition, a Hidden Markov Driver Distraction Severity Model (HMDDSM) governs the transition of control from the driver to the vehicle when a high degree of distraction is detected.
This work tests and evaluates the proposed models using the multi-view TeleFOT naturalistic driving study data and the American University of Cairo dataset (AUCD). The developed models were evaluated using cross-correlation, hybrid cross-correlation, and K-fold cross-validation. The results show that the technique effectively learns and adopts safety measures related to the severity of driver distraction. The results also show that while a driver is in a dangerously distracted state, control can be shifted from the driver to the vehicle in a systematic manner.
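A hidden Markov severity model of the kind the HMDDSM describes can be sketched with standard forward filtering: the posterior over hidden severity states is updated after each observed distraction cue, and control is handed over once the dangerous-state posterior is high enough. All state names, probabilities, and the hand-over threshold below are illustrative assumptions, not the thesis's actual parameters.

```python
def forward_filter(prior, transition, emission, observations):
    """HMM forward filtering: posterior over hidden states after each
    observation, renormalized at every step."""
    states = list(prior)
    belief = dict(prior)
    for obs in observations:
        # Predict: propagate the belief through the transition model.
        predicted = {s: sum(belief[t] * transition[t][s] for t in states)
                     for s in states}
        # Update: weight by the likelihood of the observed cue.
        unnorm = {s: predicted[s] * emission[s][obs] for s in states}
        z = sum(unnorm.values())
        belief = {s: unnorm[s] / z for s in states}
    return belief

# Illustrative (made-up) parameters: two severity states, one cue stream.
prior = {"careless": 0.8, "dangerous": 0.2}
transition = {"careless": {"careless": 0.9, "dangerous": 0.1},
              "dangerous": {"careless": 0.2, "dangerous": 0.8}}
emission = {"careless": {"eyes_on_road": 0.7, "eyes_off_road": 0.3},
            "dangerous": {"eyes_on_road": 0.1, "eyes_off_road": 0.9}}

belief = forward_filter(prior, transition, emission, ["eyes_off_road"] * 3)
# Hand control to the vehicle once the dangerous posterior is high.
hand_over = belief["dangerous"] > 0.75
```

Repeated eyes-off-road cues steadily raise the dangerous-state posterior, which is the systematic driver-to-vehicle transition behaviour the abstract describes.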
Intelligent Transportation Related Complex Systems and Sensors
Building around innovative services related to different modes of transport and traffic management, intelligent transport systems (ITS) are being widely adopted worldwide to improve the efficiency and safety of transportation. They enable users to be better informed and to make safer, more coordinated, and smarter decisions on the use of transport networks. Current ITSs are complex systems made up of several components/sub-systems characterized by time-dependent interactions among themselves. Examples of these transportation-related complex systems include road traffic sensors, autonomous/automated cars, smart cities, smart sensors, virtual sensors, traffic control systems, smart roads, logistics systems, smart mobility systems, and many others emerging from niche areas. The efficient operation of these complex systems requires: i) efficient solutions to the issues of the sensors/actuators used to capture and control the physical parameters of these systems, as well as to the quality of the data collected from them; ii) tackling complexity using simulation and analytical modelling techniques; and iii) applying optimization techniques to improve their performance. This collection includes twenty-four papers, which cover scientific concepts, frameworks, architectures, and various other ideas on analytics, trends, and applications of transportation-related data.
Driver lane change intention inference using machine learning methods.
The lane change manoeuvre on highways is a highly interactive task for human drivers. Intelligent vehicles and advanced driver assistance systems (ADAS) need to have proper awareness of the traffic context as well as of the driver. An ADAS also needs to understand the driver’s potential intent correctly, since it shares control authority with the human driver. This study presents research on driver intention inference, with a particular focus on the lane change manoeuvre on highways.
This report is organised on a paper basis, with each chapter corresponding to a publication that has been submitted or is to be submitted. Part Ⅰ introduces the motivation and the general methodological framework of this thesis. Part Ⅱ includes the literature survey and the state of the art of driver intention inference. Part Ⅲ covers techniques for traffic context perception, focusing on lane detection: a literature review of lane detection techniques and their integration with a parallel driving framework is presented, and a novel integrated lane detection system is then designed. Part Ⅳ has two parts, providing driver behaviour monitoring for normal driving and for secondary-task detection; the first is based on conventional feature selection methods, while the second introduces an end-to-end deep learning framework. The design and analysis of the driver lane change intention inference system is presented in Part Ⅴ.
Finally, discussions and conclusions are presented in Part Ⅵ.
A major contribution of this project is to propose novel algorithms that accurately model the driver intention inference process. Lane change intention is recognised using machine learning (ML) methods owing to their good reasoning and generalization characteristics. Sensors in the vehicle capture traffic context, vehicle dynamics, and driver behaviour information, and machine learning and image processing techniques are used to recognise human driver behaviour.
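As an illustrative sketch of ML-based intention inference over fused cues (not the thesis's actual models), a logistic scorer maps a feature vector of driver and vehicle signals to a lane-change probability. The feature names, weights, and bias below are hypothetical.

```python
import math

def lane_change_probability(features, weights, bias):
    """Logistic model scoring lane change intention from fused cues
    (context traffic, vehicle dynamics, driver behaviour)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical cues: [mirror checks per 10 s, lateral offset (m),
# turn signal on (0/1)]; hypothetical trained parameters.
weights, bias = [2.0, 1.5, 3.0], -2.0
p_intending = lane_change_probability([1.0, 0.5, 1.0], weights, bias)
p_keeping = lane_change_probability([0.0, 0.0, 0.0], weights, bias)
```

In a real system these scalar cues would come from the perception and driver-monitoring modules described above, and the parameters would be learned from naturalistic driving data.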
Predicting pedestrian crossing intentions using contextual information
The urban environment is one of the most complex scenarios for an autonomous vehicle, as it is shared with other types of users known as vulnerable road users, with pedestrians as their principal representative. These users are characterized by their great dynamicity. Despite the large number of interactions between vehicles and pedestrians, the safety of pedestrians has not increased at the same rate as that of vehicle occupants. For this reason, it is necessary to address this problem. One possible strategy would be anticipating pedestrian behavior to minimize risky situations, especially during the crossing.
The objective of this doctoral thesis is to achieve such anticipation through the development of pedestrian crossing-action prediction techniques based on deep learning.
Before the design and implementation of the prediction systems, a classification system was developed to discern the pedestrians involved in the road scene. The system, based on convolutional neural networks, was trained and validated with a customized dataset. This dataset was built from several existing sets and augmented with images obtained from the Internet. This pre-anticipation step reduces unnecessary processing within the vehicle’s perception system.
After this step, two systems were developed to address the prediction problem.
The first system is composed of convolutional and recurrent encoder networks. It obtains a short-term prediction of the crossing action performed one second into the future. The input to the model is mainly image-based, which provides additional pedestrian context. In addition, the use of pedestrian-related variables and architectural improvements yields considerably better results on the JAAD dataset.
The second system is an end-to-end architecture based on the combination of three-dimensional convolutional neural networks and/or the encoder of the Transformer architecture. In this model, most of the proposed and investigated improvements focus on transformations of the input data. After an extensive set of individual tests, several models were trained, evaluated, and compared with other methods using both the JAAD and PIE datasets. The obtained results are among the best in the state of the art, validating the proposed architecture.
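The short-term prediction idea can be pictured with a toy stand-in for the convolutional-recurrent encoder: a single-unit recurrent accumulator that integrates per-frame crossing cues and outputs a probability of crossing about one second ahead. The weights, the scalar cue representation, and the output squashing are all made-up assumptions for illustration.

```python
import math

def predict_crossing(frame_cues, w_in=1.0, w_rec=0.5, bias=-1.0):
    """Toy single-unit recurrent accumulator over per-frame cues.

    frame_cues: scalar evidence of crossing intent per frame, oldest
    first (e.g. lateral speed towards the curb; in the real system this
    role is played by learned image and pedestrian-variable features).
    Returns an estimated probability of crossing ~1 s into the future.
    """
    h = 0.0
    for x in frame_cues:
        h = math.tanh(w_in * x + w_rec * h + bias)  # recurrent state update
    return 1.0 / (1.0 + math.exp(-4.0 * h))  # map state to a probability

p_toward_curb = predict_crossing([1.5] * 8)  # sustained motion toward curb
p_standing = predict_crossing([0.0] * 8)     # pedestrian standing still
```

The actual thesis models replace this scalar recurrence with convolutional image encoders feeding recurrent or Transformer layers, but the temporal-accumulation shape of the computation is the same.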
Emotion and Stress Recognition Related Sensors and Machine Learning Technologies
This book includes impactful chapters which present scientific concepts, frameworks, architectures and ideas on sensing technologies and machine learning techniques. These are relevant in tackling the following challenges: (i) the field readiness and use of intrusive sensor systems and devices for capturing biosignals, including EEG sensor systems, ECG sensor systems and electrodermal activity sensor systems; (ii) the quality assessment and management of sensor data; (iii) data preprocessing, noise filtering and calibration concepts for biosignals; (iv) the field readiness and use of nonintrusive sensor technologies, including visual sensors, acoustic sensors, vibration sensors and piezoelectric sensors; (v) emotion recognition using mobile phones and smartwatches; (vi) body area sensor networks for emotion and stress studies; (vii) the use of experimental datasets in emotion recognition, including dataset generation principles and concepts, quality assurance and emotion elicitation material and concepts; (viii) machine learning techniques for robust emotion recognition, including graphical models, neural network methods, deep learning methods, statistical learning and multivariate empirical mode decomposition; (ix) subject-independent emotion and stress recognition concepts and systems, including facial expression-based systems, speech-based systems, EEG-based systems, ECG-based systems, electrodermal activity-based systems, multimodal recognition systems and sensor fusion concepts; and (x) emotion and stress estimation and forecasting from a nonlinear dynamical system perspective.
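The subject-independent recognition mentioned in item (ix) is commonly evaluated with leave-one-subject-out splitting, so that the test subject never appears in training. A minimal sketch follows; the record layout (subject id, features, label) is an assumption for illustration.

```python
def leave_one_subject_out(samples):
    """Yield (held_out_subject, train, test) splits in which every
    sample from the held-out subject is excluded from training.

    samples: list of (subject_id, features, label) records.
    """
    subjects = sorted({subject for subject, *_ in samples})
    for held_out in subjects:
        train = [r for r in samples if r[0] != held_out]
        test = [r for r in samples if r[0] == held_out]
        yield held_out, train, test

# Tiny illustrative dataset: two subjects, hypothetical feature/label values.
samples = [("s1", [0.1], "calm"), ("s1", [0.9], "stressed"),
           ("s2", [0.4], "calm")]
splits = list(leave_one_subject_out(samples))
```

Averaging a classifier's score over these splits estimates how well the model generalizes to people it has never seen, which is the point of subject-independent evaluation.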
Study on Object Detection using Computer Vision by Artificial Neural Network (thesis abstract and examination summary)
Shibaura Institute of Technology, 2018
State of the art of audio- and video based solutions for AAL
Working Group 3. Audio- and Video-based AAL Applications
It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help facing these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairment. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness.
Indeed, cameras and microphones are far less obtrusive with respect to the hindrance other wearable sensors may cause to one’s activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals’ activities and health status can derive from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.
This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users.
The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted.
The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake in real world settings of AAL technologies. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the opportunities arising from the silver economy are overviewed.