6,163 research outputs found

    An Overview about Emerging Technologies of Autonomous Driving

    Full text link
    Since DARPA started Grand Challenges in 2004 and Urban Challenges in 2007, autonomous driving has been the most active field of AI applications. This paper gives an overview about technical aspects of autonomous driving technologies and open problems. We investigate the major fields of self-driving systems, such as perception, mapping and localization, prediction, planning and control, simulation, V2X and safety etc. Especially we elaborate on all these issues in a framework of data closed loop, a popular platform to solve the long tailed autonomous driving problems

    VIENA2: A Driving Anticipation Dataset

    Full text link
    Action anticipation is critical in scenarios where one needs to react before the action is finalized. This is, for instance, the case in automated driving, where a car needs to, e.g., avoid hitting pedestrians and respect traffic lights. While solutions have been proposed to tackle subsets of the driving anticipation tasks, by making use of diverse, task-specific sensors, there is no single dataset or framework that addresses them all in a consistent manner. In this paper, we therefore introduce a new, large-scale dataset, called VIENA2, covering 5 generic driving scenarios, with a total of 25 distinct action classes. It contains more than 15K full HD, 5s long videos acquired in various driving conditions, weathers, daytimes and environments, complemented with a common and realistic set of sensor measurements. This amounts to more than 2.25M frames, each annotated with an action label, corresponding to 600 samples per action class. We discuss our data acquisition strategy and the statistics of our dataset, and benchmark state-of-the-art action anticipation techniques, including a new multi-modal LSTM architecture with an effective loss function for action anticipation in driving scenarios.Comment: Accepted in ACCV 201

    A Routine and Post-disaster Road Corridor Monitoring Framework for the Increased Resilience of Road Infrastructures

    Get PDF

    Automatic object classification for surveillance videos.

    Get PDF
    PhDThe recent popularity of surveillance video systems, specially located in urban scenarios, demands the development of visual techniques for monitoring purposes. A primary step towards intelligent surveillance video systems consists on automatic object classification, which still remains an open research problem and the keystone for the development of more specific applications. Typically, object representation is based on the inherent visual features. However, psychological studies have demonstrated that human beings can routinely categorise objects according to their behaviour. The existing gap in the understanding between the features automatically extracted by a computer, such as appearance-based features, and the concepts unconsciously perceived by human beings but unattainable for machines, or the behaviour features, is most commonly known as semantic gap. Consequently, this thesis proposes to narrow the semantic gap and bring together machine and human understanding towards object classification. Thus, a Surveillance Media Management is proposed to automatically detect and classify objects by analysing the physical properties inherent in their appearance (machine understanding) and the behaviour patterns which require a higher level of understanding (human understanding). Finally, a probabilistic multimodal fusion algorithm bridges the gap performing an automatic classification considering both machine and human understanding. The performance of the proposed Surveillance Media Management framework has been thoroughly evaluated on outdoor surveillance datasets. The experiments conducted demonstrated that the combination of machine and human understanding substantially enhanced the object classification performance. Finally, the inclusion of human reasoning and understanding provides the essential information to bridge the semantic gap towards smart surveillance video systems

    A real time vehicles detection algorithm for vision based sensors

    Full text link
    A vehicle detection plays an important role in the traffic control at signalised intersections. This paper introduces a vision-based algorithm for vehicles presence recognition in detection zones. The algorithm uses linguistic variables to evaluate local attributes of an input image. The image attributes are categorised as vehicle, background or unknown features. Experimental results on complex traffic scenes show that the proposed algorithm is effective for a real-time vehicles detection.Comment: The final publication is available at http://www.springerlink.co

    Human robot interaction in a crowded environment

    No full text
    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered to be laborious, unsafe, or repetitive. Vision based human robot interaction is a major component of HRI, with which visual information is used to interpret how human interaction takes place. Common tasks of HRI include finding pre-trained static or dynamic gestures in an image, which involves localising different key parts of the human body such as the face and hands. This information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3]. Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications as difficulties may arise from people‟s movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting the navigation commands. To this end, it is necessary to associate the gesture to the correct person and automatic reasoning is required to extract the most probable location of the person who has initiated the gesture. In this thesis, we have proposed a practical framework for addressing the above issues. It attempts to achieve a coarse level understanding about a given environment before engaging in active communication. This includes recognizing human robot interaction, where a person has the intention to communicate with the robot. In this regard, it is necessary to differentiate if people present are engaged with each other or their surrounding environment. The basic task is to detect and reason about the environmental context and different interactions so as to respond accordingly. For example, if individuals are engaged in conversation, the robot should realize it is best not to disturb or, if an individual is receptive to the robot‟s interaction, it may approach the person. Finally, if the user is moving in the environment, it can analyse further to understand if any help can be offered in assisting this user. The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine potential intentions. For improving system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7]

    Deep learning based 3D object detection for automotive radar and camera fusion

    Get PDF
    La percepción en el dominio de los vehículos autónomos es una disciplina clave para lograr la automatización de los Sistemas Inteligentes de Transporte. Por ello, este Trabajo Fin de Máster tiene como objetivo el desarrollo de una técnica de fusión sensorial para RADAR y cámara que permita crear una representación del entorno enriquecida para la Detección de Objetos 3D mediante algoritmos Deep Learning. Para ello, se parte de la idea de PointPainting [1] y se adapta a un sensor en auge, el RADAR 3+1D, donde nube de puntos RADAR e información semántica de la cámara son agregadas para generar una representación enriquecida del entorno.Perception in the domain of autonomous vehicles is a key discipline to achieve the au tomation of Intelligent Transport Systems. Therefore, this Master Thesis aims to develop a sensor fusion technique for RADAR and camera to create an enriched representation of the environment for 3D Object Detection using Deep Learning algorithms. To this end, the idea of PointPainting [1] is used as a starting point and is adapted to a growing sensor, the 3+1D RADAR, in which the radar point cloud is aggregated with the semantic information from the camera.Máster Universitario en Ingeniería Industrial (M141
    corecore