
    Pedestrian and Vehicle Detection in Autonomous Vehicle Perception Systems—A Review

    Autonomous Vehicles (AVs) have the potential to solve many traffic problems, such as accidents, congestion and pollution. However, there are still challenges to overcome; for instance, AVs need to accurately perceive their environment to navigate safely in busy urban scenarios. The aim of this paper is to review recent articles on computer vision techniques that can be used to build an AV perception system. AV perception systems need to accurately detect non-static objects and predict their behaviour, as well as detect static objects and recognise the information they provide. This paper focuses in particular on the computer vision techniques used to detect pedestrians and vehicles. There have been many papers and reviews on pedestrian and vehicle detection so far; however, most past papers reviewed pedestrian or vehicle detection separately. This review aims to present an overview of AV systems in general, and then review and investigate several computer vision detection techniques for pedestrians and vehicles. The review concludes that both traditional and Deep Learning (DL) techniques have been used for pedestrian and vehicle detection; however, DL techniques have shown the best results. Although good detection results have been achieved for pedestrians and vehicles, current algorithms still struggle to detect small, occluded, and truncated objects. In addition, there is limited research on how to improve detection performance in difficult light and weather conditions. Most algorithms have been tested on well-recognised datasets such as Caltech and KITTI; however, these datasets have their own limitations. Therefore, this paper recommends that future work be evaluated on newer, more challenging datasets, such as PIE and BDD100K. This work was supported by an EPSRC DTP PhD studentship.

    Robust object representation by boosting-like deep learning architecture

    This paper presents a new deep learning architecture for robust object representation, aiming to efficiently combine the proposed synchronized multi-stage feature (SMF) with a boosting-like algorithm. The SMF structure can capture a variety of characteristics from the input object based on the fusion of handcrafted features and deep-learned features. With the proposed boosting-like algorithm, we obtain greater convergence stability when training the multi-layer network by using boosted samples. We show the generalization of our object representation architecture by applying it to various tasks, i.e. pedestrian detection and action recognition. Our approach achieves 15.89% and 3.85% reductions in average miss rate compared with ACF and JointDeep on the largest Caltech dataset, and acquires competitive results on the MSRAction3D dataset.
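    The abstract does not give the boosting-like update itself, but the general pattern it describes (retrain stages on samples re-weighted by previous errors, over fused handcrafted-plus-deep features) can be sketched as follows. Everything here, from the synthetic feature blocks to the least-squares weak stage, is an illustrative assumption rather than the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for fused SMF features: a "handcrafted" block and a
# "deep" block concatenated per sample (both synthetic here).
X = np.concatenate([rng.normal(size=(200, 16)),        # handcrafted part
                    rng.normal(size=(200, 32))], axis=1)  # learned part
y = rng.integers(0, 2, size=200)

def train_stage(X, y):
    """Train one weak stage: a least-squares linear classifier (illustrative)."""
    w, *_ = np.linalg.lstsq(X, 2.0 * y - 1.0, rcond=None)
    return lambda X: (X @ w > 0).astype(int)

stage = train_stage(X, y)
for _ in range(3):                          # a few boosting-like rounds
    errors = (stage(X) != y).astype(float)
    weights = np.exp(errors)                # up-weight misclassified samples
    weights /= weights.sum()
    idx = rng.choice(len(X), size=len(X), replace=True, p=weights)
    stage = train_stage(X[idx], y[idx])     # retrain next stage on boosted samples
```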

    Database Optimisation for Real-Time Pedestrian Detection

    This paper tackles data selection for training-set generation in the context of near-real-time pedestrian detection through the introduction of a training methodology: FairTrain. After highlighting the impact of poorly chosen data on detector performance, we introduce a new data selection technique utilizing the expectation-maximization algorithm for data weighting. FairTrain also features a version of the cascade-of-rejectors enhanced with data selection principles. Experiments on the INRIA and PETS2009 datasets prove that, when properly trained, a simple HOG-based detector can perform on par with most of its near-real-time competitors.
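    As a hypothetical illustration of EM-based data weighting, the sketch below fits a two-component Gaussian mixture to a 1-D per-sample score and uses the posterior responsibility of the "hard" component as a training weight. The scores, the component roles, and the choice to weight by the hard component are assumptions, not details taken from the paper:

```python
import numpy as np

# Toy 1-D "difficulty scores" for candidate training windows (synthetic).
rng = np.random.default_rng(1)
s = np.concatenate([rng.normal(-1.0, 0.5, 300),   # easy samples
                    rng.normal(+1.5, 0.7, 100)])  # hard/informative samples

# Two-component Gaussian mixture fitted by EM; the posterior responsibility
# of the "hard" component then serves as a per-sample training weight.
mu, sd, pi = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(50):                                # EM iterations
    # E-step: responsibilities r[i, k] = P(component k | sample i)
    pdf = np.exp(-0.5 * ((s[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    r = pi * pdf
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate mixture parameters from the responsibilities
    nk = r.sum(axis=0)
    mu = (r * s[:, None]).sum(axis=0) / nk
    sd = np.sqrt((r * (s[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(s)

sample_weights = r[:, np.argmax(mu)]   # responsibility of the "hard" component
```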

    Cooperation of ambient camera networks and on-board vision on a mobile robot for the surveillance of public places

    This thesis deals with the detection and tracking of people in a surveilled public place. It proposes to include a mobile robot in classical surveillance systems that are based on environment-fixed sensors. The mobile robot brings two important benefits: (1) it acts as a mobile sensor with perception capabilities, and (2) it can be used as a means of action for service provision. In this context, as a first contribution, the thesis presents an optimised visual people detector based on Binary Integer Programming that explicitly takes the stipulated computational demand into consideration. A set of homogeneous and heterogeneous pools of features is investigated under this framework, thoroughly tested, and compared with state-of-the-art detectors. The experimental results clearly highlight the improvements that detectors learned with this framework bring, including the effect on the robot's reactivity during on-line missions. As a second contribution, the thesis proposes and validates a cooperative framework to fuse information from wall-mounted cameras and sensors on the mobile robot to better track people in the vicinity. The same framework is also validated by fusing data from the different sensors on the mobile robot in the absence of external perception. Finally, we demonstrate the improvements brought by the developed perceptual modalities by deploying them on our robotic platform, illustrating the robot's ability to perceive people in public places and respect their personal space during navigation.
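    As a purely illustrative counterpart to the cooperative fusion framework, the sketch below feeds person-position measurements from two sources (a wall-mounted camera and a robot-mounted sensor, each with its own noise covariance) into a single constant-velocity Kalman filter. This is a common fusion pattern, not the thesis' actual formulation:

```python
import numpy as np

dt = 0.1
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1.]])
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0.]])          # both sensors measure (x, y)
Q = 0.01 * np.eye(4)                                  # process noise
R_wall, R_robot = 0.05 * np.eye(2), 0.20 * np.eye(2)  # per-sensor measurement noise

x, P = np.zeros(4), np.eye(4)                         # state: (x, y, vx, vy)

def kf_step(x, P, z, R):
    """One predict + update cycle for measurement z with noise covariance R."""
    x, P = F @ x, F @ P @ F.T + Q                     # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                    # Kalman gain
    x = x + K @ (z - H @ x)                           # update with the innovation
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Measurements from either source enter the same filter, weighted by their R:
x, P = kf_step(x, P, np.array([1.0, 2.0]), R_wall)    # ambient camera reading
x, P = kf_step(x, P, np.array([1.1, 2.1]), R_robot)   # robot sensor reading
```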

    From pixels to people: recovering location, shape and pose of humans in images

    Humans are at the centre of a significant amount of research in computer vision. Endowing machines with the ability to perceive people from visual data is an immense scientific challenge with a high degree of direct practical relevance. Success in automatic perception can be measured at different levels of abstraction, depending on which intelligent behaviour we are trying to replicate: the ability to localise persons in an image or in the environment, understanding how persons are moving at the skeleton and at the surface level, interpreting their interactions with the environment including with other people, and perhaps even anticipating future actions. In this thesis we tackle different sub-problems of the broad research area referred to as "looking at people", aiming to perceive humans in images at different levels of granularity. We start with bounding box-level pedestrian detection: we present a retrospective analysis of methods published in the decade preceding our work, identifying various strands of research that have advanced the state of the art. With quantitative experiments, we demonstrate the critical role of developing better feature representations and having the right training distribution. We then contribute two methods based on the insights derived from our analysis: one that combines the strongest aspects of past detectors and another that focuses purely on learning representations. The latter method outperforms more complicated approaches, especially those based on handcrafted features. We conclude our work on pedestrian detection with a forward-looking analysis that maps out potential avenues for future research. We then turn to pixel-level methods: perceiving humans requires us to both separate them precisely from the background and identify their surroundings. To this end, we introduce Cityscapes, a large-scale dataset for street scene understanding, which has since established itself as a go-to benchmark for segmentation and detection. We additionally develop methods that relax the requirement for expensive pixel-level annotations, focusing on the task of boundary detection, i.e. identifying the outlines of relevant objects and surfaces. Next, we make the jump from pixels to 3D surfaces, from localising and labelling to fine-grained spatial understanding. We contribute a method for recovering 3D human shape and pose, which marries the advantages of learning-based and model-based approaches. We conclude the thesis with a detailed discussion of benchmarking practices in computer vision. Among other things, we argue that the design of future datasets should be driven by the general goal of combinatorial robustness besides task-specific considerations.

    Pedestrian Movement Direction Recognition Using Convolutional Neural Networks

    Pedestrian movement direction recognition is an important factor in autonomous driver-assistance and security surveillance systems. Pedestrians are the most crucial and fragile moving objects in streets, roads, and events where thousands of people may gather on a regular basis. People-flow analysis on zebra crossings and in shopping centers or at events such as demonstrations is a key element for improving safety and enabling autonomous cars to drive in real-life environments. This paper focuses on deep learning techniques such as convolutional neural networks (CNN) to achieve reliable detection of pedestrians moving in a particular direction. We propose a CNN-based technique that leverages current pedestrian detection techniques (histograms of oriented gradients with linear SVM) to generate a sum of subtracted frames (a flow estimation around the detected pedestrian), which is used as input to the proposed modified versions of various state-of-the-art CNN networks, such as AlexNet, GoogLeNet, and ResNet. Moreover, we have created a new dataset for this purpose and analyzed the importance of training on a known dataset for the neural networks to achieve reliable results. This work was supported by FEDER funds and the Spanish Government through the COMBAHO project under Grant TIN2016-76515-R, and in part by the University of Alicante project under Grant GRE16-19.
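    The "sum of subtracted frames" input can be sketched directly: accumulate absolute differences of consecutive grayscale frames inside the detected pedestrian box and normalise the result before feeding it to the CNN. The frame data, box coordinates, and normalisation below are placeholders, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
frames = rng.random((5, 240, 320)).astype(np.float32)   # T grayscale frames (synthetic)
box = (100, 60, 164, 188)                               # detected pedestrian bbox (assumed)

def motion_image(frames, box):
    """Accumulate |frame[t+1] - frame[t]| inside the pedestrian box."""
    x0, y0, x1, y1 = box
    crops = frames[:, y0:y1, x0:x1]
    diffs = np.abs(np.diff(crops, axis=0))   # frame-to-frame subtraction
    acc = diffs.sum(axis=0)                  # sum of subtracted frames
    return acc / max(acc.max(), 1e-8)        # normalise to [0, 1] for the CNN

cnn_input = motion_image(frames, box)
print(cnn_input.shape)                       # (128, 64) pedestrian-shaped crop
```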

    Deeply Smile Detection Based on Discriminative Features with Modified LeNet-5 Network

    Facial expressions are caused by specific movements of the face muscles; they are regarded as a visible manifestation of a person's inner thought processes, internal emotional states, and intentions. A smile is a facial expression that often indicates happiness, satisfaction, or agreement. Many applications use smile detection, such as automatic image capture, distance learning systems, interactive systems, video conferencing, patient monitoring, and product rating. A smile detection system is divided into two stages, feature extraction and classification; as a result, the accuracy of smile detection depends on both phases. In recent years, numerous researchers have proposed various approaches to smile detection; however, their accuracy is still below the desired level. To this end, we propose an effective Convolutional Neural Network (CNN) architecture based on a modified LeNet-5 network (MLeNet-5) for detecting smiles in images. The proposed system generates low-level face identifiers and detects smiles using a strong binary classifier. In our experiments, the proposed MLeNet-5 system used the SMILEsmileD and GENKI-4K databases; the proposed method improves accuracy by 2% on the SMILEsmileD database and 5% on the GENKI-4K database relative to the LeNet-5-based CNN network. In addition, the proposed system decreases the number of parameters compared to the LeNet-5-based CNN network and most existing models while maintaining the robustness and effectiveness of the results.
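    The abstract does not spell out the exact MLeNet-5 modifications, so the sketch below is just a generic LeNet-5-style binary smile classifier in PyTorch, with assumed input size and layer widths:

```python
import torch
import torch.nn as nn

class SmileNet(nn.Module):
    """LeNet-5-style binary smile classifier (layer sizes are assumptions)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),   # 32 -> 14
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 84), nn.ReLU(),
            nn.Linear(84, 2),            # smile / no-smile
        )

    def forward(self, x):
        return self.classifier(self.features(x))

logits = SmileNet()(torch.randn(1, 1, 32, 32))   # one dummy 32x32 face crop
print(logits.shape)                               # torch.Size([1, 2])
```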

    Prototype to Increase Crosswalk Safety by Integrating Computer Vision with ITS-G5 Technologies

    Human error is probably the main cause of car accidents, and the car is one of the most dangerous forms of transport for people. The danger comes from the fact that on public roads there are simultaneously different types of actors (drivers, pedestrians, or cyclists) and many objects that change their position over time, making it difficult to predict their immediate movements. The intelligent transport system (ITS-G5) standard specifies the European communication technologies and protocols to assist public road users by providing them with relevant information. The scientific community is developing ITS-G5 applications for various purposes, among which is increasing pedestrian safety. This paper describes the work carried out to implement an ITS-G5 prototype that aims to increase pedestrian and driver safety in the vicinity of a pedestrian crosswalk by sending ITS-G5 decentralized environmental notification messages (DENM) to vehicles. These messages are analyzed and, if relevant, presented to the driver through the car's onboard infotainment system. This alert allows the driver to take safety precautions to prevent accidents. The implemented prototype was tested at a pedestrian crosswalk in a controlled environment. The results showed the prototype's capacity for detecting pedestrians, suitable message sending, reception and processing on a vehicle onboard unit (OBU) module, and presentation on the car's onboard infotainment system.
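    To make the message flow concrete, here is a hypothetical sketch of the detection-to-alert path. Real DENMs are ASN.1-encoded per ETSI EN 302 637-3 and carried over ITS-G5 radio; this sketch mimics only the flow, using JSON over UDP, and the OBU address and coordinates are placeholders:

```python
import json
import socket
import time

OBU_ADDR = ("192.0.2.10", 40000)   # placeholder OBU endpoint (TEST-NET address)

def send_pedestrian_alert(lat, lon, sock):
    """Push a DENM-like 'pedestrian on crosswalk' event towards the OBU."""
    event = {
        "messageType": "DENM-like",          # illustrative, not the ASN.1 schema
        "cause": "pedestrianOnCrosswalk",
        "position": {"lat": lat, "lon": lon},
        "detectionTime": time.time(),
    }
    sock.sendto(json.dumps(event).encode(), OBU_ADDR)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
pedestrian_detected = True                    # stand-in for the vision module
if pedestrian_detected:
    send_pedestrian_alert(40.6405, -8.6538, sock)   # example crosswalk position
```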