
    Proceedings of the 2009 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    The joint workshop of the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, Karlsruhe, and the Vision and Fusion Laboratory (Institute for Anthropomatics, Karlsruhe Institute of Technology (KIT)) has been organized annually since 2005 with the aim of reporting the latest research and development findings of the doctoral students of both institutions. This book provides a collection of 16 technical reports on the research results presented at the 2009 workshop.

    An investigation into common challenges of 3D scene understanding in visual surveillance

    Nowadays, video surveillance systems are ubiquitous. Most installations simply consist of CCTV cameras connected to a central control room and rely on human operators to interpret what they see on screen in order to, for example, detect a crime (either during or after an event). Some modern computer vision systems aim to automate the process, at least to some degree, and various algorithms have been somewhat successful in certain limited areas. However, such systems remain inefficient in general circumstances and present real challenges yet to be solved. These challenges include the ability to recognise and ultimately predict and prevent abnormal behaviour, or even to reliably recognise objects, for example in order to detect left luggage or suspicious items. This thesis first aims to survey the state of the art and identify the major challenges and likely requirements of future automated and semi-automated CCTV technology in the field. It then presents the application of a suite of 2D and highly novel 3D methodologies that go some way towards overcoming current limitations.

    The methods presented here are based on the analysis of object features extracted directly from the geometry of the scene. They start with a consideration of mainly existing techniques, such as the use of lines, vanishing points (VPs) and planes, applied to real scenes, and then move on to an investigation of richer 2.5D/3D surface normal data. In all cases the aim is to combine 2D and 3D data to obtain a better understanding of the scene, ultimately capturing what is happening within it in order to move towards automated scene analysis. Although this thesis addresses video surveillance broadly, the railway station environment is used as an example case representing typical real-world challenges; the principles extend readily to other settings such as airports, motorways, households and shopping malls. The context of this research work, together with an overview of existing methods used in video surveillance and their challenges, is described in chapter 1.

    Common computer vision techniques such as VP detection, camera calibration, 3D reconstruction and segmentation can be applied in an effort to extract meaning from video surveillance footage. According to the literature these methods are well researched, and their use is assessed against current surveillance requirements in chapter 2. While existing techniques can perform well in some contexts, such as architectural environments composed of simple geometrical elements, their robustness and performance in feature extraction and object recognition are not sufficient to solve the key challenges of the general video surveillance context. This is largely due to issues such as variable lighting, weather conditions, shadows and the general complexity of real-world environments. Chapter 3 presents the research and contribution on these topics – methods to extract optimal features for a specific CCTV application – together with their strengths and weaknesses, showing that the proposed algorithm obtains better results than most owing to its specific design. Nevertheless, a comparison of current surveillance systems and methods from the literature shows that 2D data are still used almost universally across applications.
Indeed, both industry and the research community have been intensively improving 2D feature extraction methods for as long as image analysis and scene understanding have been of interest. This constant progress, and the large variety of available techniques, make 2D feature extraction almost effortless nowadays. Moreover, even if 2D data cannot solve every challenge in video surveillance or other applications, they remain the starting point for scene understanding and image analysis. Chapter 4 therefore explores 2D feature extraction via vanishing point detection and segmentation methods. A combination of the most common techniques and a novel approach is proposed to extract vanishing points from video surveillance environments, and segmentation techniques are explored with the aim of determining how they can complement vanishing point detection and lead towards 3D data extraction and analysis. Despite this contribution, 2D data are insufficient for all but the simplest scene understanding applications when the aim is robust detection of, say, left luggage or abnormal behaviour without significant a priori information about the scene geometry. More information is required to design a more automated and intelligent algorithm that obtains richer information from the scene geometry, and thus a better understanding of what is happening within it. This can be achieved with 3D data (in addition to 2D data), which open the way to object “classification” and, from this, to inferring a map of functionality describing feasible and unfeasible object functionality in a given environment. Chapter 5 presents how 3D data can benefit this task and the various solutions investigated to recover 3D data, as well as some preliminary work towards plane extraction.

    It is apparent that VPs and planes give useful information about a scene’s perspective and can assist 3D data recovery. However, neither VPs nor plane detection alone allow the recovery of more complex generic object shapes – for example those composed of spheres, cylinders, etc. – and any simple model will suffer in the presence of non-Manhattan features, e.g. those introduced by an escalator. For this reason, a novel photometric stereo-based surface normal retrieval methodology is introduced to capture the 3D geometry of the whole scene or part of it. Chapter 6 describes how photometric stereo allows the recovery of 3D information for a better understanding of a scene, while also partially overcoming some current surveillance challenges, such as the difficulty of resolving fine detail, particularly at large standoff distances, and of isolating and recognising more complex objects in real scenes, where items of interest may be obscured by complex, rapidly changing environmental factors that make the detection of suspicious objects and behaviour highly problematic. Here, innovative use is made of an untapped latent capability of modern surveillance environments to introduce a form of environmental structuring, and so achieve a richer form of data acquisition.
    The chapter also explores the novel application of photometric stereo across these diverse settings, shows how the algorithm can be incorporated into an existing surveillance system, and considers a typical real commercial application.

    One of the most important aspects of this research is its applicability. While most of the research literature is based on relatively simple structured environments, the approach here is designed for real surveillance environments such as railway stations, airports and waiting rooms, where surveillance cameras may be fixed or may in future form part of a free-roaming mobile robotic surveillance device that must continually reinterpret its changing environment. So, as mentioned previously, while the main focus has been on railway station environments, the work has been approached in a way that allows adaptation to many other applications, such as autonomous robotics and motorway, shopping centre, street and home environments, all of which require a better understanding of the scene for security or safety purposes. Finally, chapter 7 presents an overall conclusion and outlines future work.
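    The central chapter 6 technique is classical Lambertian photometric stereo: given several images of the same static scene taken under different known light directions, per-pixel surface normals can be recovered. The sketch below is a minimal illustration rather than the thesis's actual pipeline; it assumes three grayscale images and three known, non-coplanar unit light directions, and the file names and directions are placeholders.

```python
# Minimal Lambertian photometric stereo sketch (illustrative only).
# Assumes three grayscale images of the same static scene, each lit from a
# known direction; file names and light directions are placeholders.
import numpy as np
import cv2

L = np.array([                 # rows: assumed unit light directions
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.87],
    [0.0, 0.5, 0.87],
])
frames = [cv2.imread(f, cv2.IMREAD_GRAYSCALE).astype(np.float64)
          for f in ("lit_front.png", "lit_right.png", "lit_above.png")]
I = np.stack([im.ravel() for im in frames])        # 3 x N intensity matrix

# Lambertian model: I = L @ g, with g = albedo * normal at each pixel.
G = np.linalg.solve(L, I)                          # 3 x N
albedo = np.linalg.norm(G, axis=0)                 # per-pixel albedo
normals = G / np.maximum(albedo, 1e-8)             # unit surface normals
h, w = frames[0].shape
normal_map = normals.T.reshape(h, w, 3)            # H x W x 3 normal map
```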

    Resilient Infrastructure and Building Security


    Cooperation between ambient camera networks and on-board vision on a mobile robot for the surveillance of public places

    Currently, there is a growing demand for the deployment of mobile robots in public places. To meet this demand, several researchers have deployed prototype robotic systems in public places such as hospitals, supermarkets, museums and office environments. A key concern that must not be neglected, as robots leave their isolated industrial settings and begin to interact with humans in a shared workspace, is safe interaction. For a mobile robot to behave in a safe and acceptable interactive manner, it needs to know the presence, location and movement of people so as to better understand and anticipate their intentions and actions. This thesis aims to contribute in this direction by focusing on perception modalities for detecting and tracking people in the vicinity of a mobile robot. As a first contribution, it presents an optimized automated visual people detector that explicitly takes into account the computational demand expected on the robot. Various comparative experiments are conducted to clearly highlight the improvements this detector brings, including its effect on the robot's reactivity during on-line missions. As a second contribution, the thesis proposes and validates a cooperative framework for fusing information from ambient wall-mounted cameras and sensors mounted on the mobile robot in order to better track people in the vicinity. The same framework is also validated by fusing data from the robot's own sensors when external perception is unavailable. Finally, we demonstrate the improvements brought by the developed perceptual modalities by deploying them on our robotic platform, illustrating the robot's ability to perceive people in public places and respect their personal space during navigation.

    This thesis deals with the detection and tracking of people in a surveilled public place. It proposes to include a mobile robot in classical surveillance systems that are based on environment-fixed sensors. The mobile robot brings two important benefits: (1) it acts as a mobile sensor with perception capabilities, and (2) it can be used as a means of action for service provision. In this context, as a first contribution, the thesis presents an optimized visual people detector based on Binary Integer Programming that explicitly takes the stipulated computational demand into consideration. Homogeneous and heterogeneous pools of features are investigated under this framework, thoroughly tested and compared with state-of-the-art detectors. The experimental results clearly highlight the improvements that detectors learned with this framework bring, including their effect on the robot's reactivity during on-line missions. As a second contribution, the thesis proposes and validates a cooperative framework to fuse information from wall-mounted cameras and sensors on the mobile robot to better track people in the vicinity. Finally, we demonstrate the improvements brought by the developed perceptual modalities by deploying them on our robotic platform and illustrating the robot's ability to perceive people in the considered public areas and respect their personal space during navigation.
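    The cooperative fusion framework itself is not detailed in this abstract. Purely as an illustration of one common building block for combining a wall-camera track with an on-robot detection of the same person, the sketch below fuses two Gaussian position estimates by inverse-covariance weighting; it assumes the detections are already associated and expressed in a common ground-plane frame, and all names and numbers are placeholders rather than the thesis's actual method.

```python
# Illustrative fusion of two 2D person-position estimates: one from an ambient
# wall-mounted camera, one from the robot's on-board sensors.  Assumes the two
# detections are already associated and expressed in the same ground-plane
# frame; generic inverse-covariance weighting, not the thesis's exact method.
import numpy as np

def fuse_gaussian(x_cam, P_cam, x_rob, P_rob):
    """Combine two Gaussian estimates (mean, covariance) of one person's position."""
    P_cam_inv, P_rob_inv = np.linalg.inv(P_cam), np.linalg.inv(P_rob)
    P_fused = np.linalg.inv(P_cam_inv + P_rob_inv)               # fused covariance
    x_fused = P_fused @ (P_cam_inv @ x_cam + P_rob_inv @ x_rob)  # fused mean
    return x_fused, P_fused

# Placeholder example: camera accurate laterally, robot lidar accurate in depth.
x_cam, P_cam = np.array([2.0, 4.1]), np.diag([0.05, 0.40])
x_rob, P_rob = np.array([2.2, 4.0]), np.diag([0.30, 0.05])
x, P = fuse_gaussian(x_cam, P_cam, x_rob, P_rob)
print(x, np.diag(P))
```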

    Super-resolution in low-quality videos for forensic, surveillance and mobile applications

    Advisors: Siome Klein Goldenstein, Anderson de Rezende Rocha. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação.

    Resumo: Super-resolution (SR) algorithms are methods for increasing the resolution of pixel-based images. In multi-frame super-resolution, a set of low-resolution images of a scene is combined to construct an image of higher resolution. Super-resolution is an inexpensive way to overcome the limitations of image acquisition systems, and can be useful in many cases where the device cannot be improved or replaced but several captures of the same scene can be obtained. This work explores multi-frame super-resolution of natural images in scenarios where several images of a scene are available. Five variations are proposed of a method that exploits geometric properties of multiple low-resolution images to combine them into a higher-resolution image; two variations of a method that combines inpainting and super-resolution techniques; and three further variations of a method that uses adaptive filters and regularization to solve a least-squares problem. Multi-frame super-resolution is possible when there is motion and non-redundant information among the low-resolution images; however, combining them into a higher-resolution image may not be computationally feasible with complex super-resolution techniques. The first application of the proposed methods is to sets of images captured by recent mobile devices, an environment that requires effective algorithms that run quickly and with low memory consumption. The second application is in forensic science: surveillance cameras spread throughout cities could provide important cues to identify a suspect, for example at a crime scene, but recognition of license-plate characters is especially difficult when image resolution is low. This work therefore also proposes a framework that super-resolves license plates in real surveillance videos, captured by low-quality cameras not designed specifically for this task, helping the forensic expert to understand an event of interest. The framework performs all the steps needed to automatically track, align, reconstruct and recognize the characters of a suspect plate. As output, the user receives the reconstructed high-resolution image, richer in detail, along with the character sequence automatically recognized in that image. Quantitative and qualitative validations of the proposed algorithms and their applications are presented. The experiments show, for example, that it is possible to increase the number of correctly recognized characters, establishing the proposed framework as an important tool for providing experts with a solution to license-plate recognition under adverse acquisition conditions. Finally, the minimum number of images to use as input in each application is also suggested.

    Abstract: Super-resolution (SR) algorithms are methods for achieving high-resolution (HR) enlargements of pixel-based images. In multi-frame super-resolution, a set of low-resolution (LR) images of a scene is combined to construct an image with higher resolution.
Super-resolution is an inexpensive solution to overcome the limitations of image acquisition hardware, and can be useful in several cases in which the device cannot be upgraded or replaced but multiple frames of the same scene can be obtained. In this work, we explore SR possibilities for natural images in scenarios wherein we have multiple frames of the same scene. We design and develop five variations of an algorithm that explores geometric properties in order to combine pixels from LR observations into an HR grid; two variations of a method that combines inpainting techniques with multi-frame super-resolution; and three variations of an algorithm that uses adaptive filtering and Tikhonov regularization to solve a least-squares problem. Multi-frame super-resolution is possible when there is motion and non-redundant information across the LR observations. However, combining a large number of frames into a higher-resolution image may not be computationally feasible with complex super-resolution techniques. The first application of the proposed methods is in consumer-grade photography, with a setup in which several low-resolution images gathered by recent mobile devices are combined to create a much higher-resolution image. Such an always-on, low-power environment requires effective high-performance algorithms that run fast and have a low memory footprint. The second application is in digital forensics, with a setup in which low-quality surveillance cameras throughout a city could provide important cues to identify a suspect vehicle, for example at a crime scene. However, license-plate recognition is especially difficult at poor image resolutions. Hence, we design and develop a novel, free and open-source framework underpinned by SR and Automatic License-Plate Recognition (ALPR) techniques to identify license-plate characters in low-quality real-world traffic videos, captured by cameras not designed for the ALPR task, aiding forensic analysts in understanding an event of interest. The framework handles the necessary conditions to identify a target license plate, using a novel methodology to locate, track, align, super-resolve and recognize its alphanumerics. The user receives as output the rectified, super-resolved license plate, richer in details, and also the sequence of license-plate characters automatically recognized in the super-resolved image. We present quantitative and qualitative validations of the proposed algorithms and their applications. Our experiments show, for example, that SR can increase the number of correctly recognized characters, positioning the framework as an important step towards providing forensic experts and practitioners with a solution for the license-plate recognition problem under difficult acquisition conditions. Finally, we also suggest a minimum number of images to use as input in each application.
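    As a rough illustration of the least-squares formulation mentioned above, the sketch below solves a Tikhonov-regularized reconstruction x = argmin ||Ax - b||^2 + lambda ||x||^2, where A would stack the warp, blur and decimation operators mapping the unknown HR image to each LR frame. Building A for real data is the hard part and is omitted; all names here are placeholders, so this shows only the solver step, not the thesis's full pipeline.

```python
# Tikhonov-regularized least-squares step for multi-frame super-resolution
# (illustrative sketch).  A maps the unknown HR image x to the stacked LR
# observations b; constructing A (warp + blur + decimation per frame) is the
# application-specific part and is replaced by a random placeholder below.
import numpy as np

def solve_sr(A, b, lam):
    """Solve min_x ||A x - b||^2 + lam * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

rng = np.random.default_rng(0)
x_true = rng.random(64)                      # unknown "HR" signal (toy size)
A = rng.random((48, 64))                     # placeholder observation operator
b = A @ x_true + 0.01 * rng.normal(size=48)  # noisy stacked LR observations
x_hat = solve_sr(A, b, lam=0.1)              # regularized HR estimate
```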

    Development of situation recognition, environment monitoring and patient condition monitoring service modules for hospital robots

    An aging society and economic pressure have caused an increase in the patient-to-staff ratio, leading to a reduction in healthcare quality. In order to combat deficiencies in the delivery of patient healthcare, the European Commission approved, under the FP6 scheme, the financing of a research project for the development of an Intelligent Robot Swarm for Attendance, Recognition, Cleaning and Delivery (iWARD). Each iWARD robot contained a mobile, self-navigating platform and several modules attached to it to perform specific tasks. As part of the iWARD project, the research described in this thesis develops hospital robot modules able to perform surveillance and patient monitoring tasks in a hospital environment for four scenarios: intruder detection, patient behavioural analysis, patient physical condition monitoring, and environment monitoring. Since the intruder detection and patient behavioural analysis scenarios require the same equipment, they are combined into one common physical module, the Situation recognition module. The other two scenarios are served by separate modules: the Environment monitoring module and the Patient condition monitoring module. The Situation recognition module uses non-intrusive machine vision-based concepts: the system includes an RGB video camera and a 3D laser sensor, which monitor the environment in order to detect an intruder or a patient lying on the floor, and employs various image-processing and sensor fusion techniques. The Environment monitoring module monitors several parameters of the hospital environment: temperature, humidity and smoke. The Patient condition monitoring system remotely measures body conditions such as body temperature, heart rate and respiratory rate using sensors attached to the patient's body. The system algorithms and module software are implemented in C/C++ using the OpenCV image analysis and processing library, and have been successfully tested on the Linux (Ubuntu) platform. The outcome of this research makes a significant contribution to robotics applications in the hospital environment.
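    The abstract describes the vision pipeline only at a high level, and the real modules are written in C/C++ on OpenCV. Purely to illustrate the kind of step involved in the Situation recognition module, the Python sketch below flags large moving foreground regions using OpenCV background subtraction; it is a generic example, not the iWARD implementation, and the video source and area threshold are placeholders.

```python
# Illustrative motion flagging with OpenCV background subtraction.
# The iWARD module is described as C/C++; this Python sketch only shows the
# general idea.  Video source and area threshold are placeholders.
import cv2

cap = cv2.VideoCapture(0)                            # placeholder camera index
bg = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = bg.apply(frame)                           # raw foreground mask
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)   # drop shadows
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if any(cv2.contourArea(c) > 5000 for c in contours):  # placeholder threshold
        print("possible intruder / person detected in monitored area")

cap.release()
```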

    Taking the Temperature of Sports Arenas: Automatic Analysis of People
