67 research outputs found

    3D Motion Analysis via Energy Minimization

    Get PDF
    This work deals with 3D motion analysis from stereo image sequences for driver assistance systems. It consists of two parts: the estimation of motion from the image data and the segmentation of moving objects in the input images. The content can be summarized by the technical term machine visual kinesthesia: the sensation or perception and cognition of motion. In the first three chapters, the importance of motion information is discussed for driver assistance systems, for machine vision in general, and for the estimation of ego-motion. The next two chapters focus on motion perception, analyzing the apparent movement of pixels in image sequences for both monocular and binocular camera setups. The obtained motion information is then used to segment moving objects in the input video, so one can clearly follow the thread from analyzing the input images to describing them in terms of stationary and moving objects. Finally, I present possibilities for future applications based on the contents of this thesis. Previous work is presented in the respective chapters.

    Although the overarching issue of motion estimation from image sequences is tied to practice, there is nothing as practical as a good theory (Kurt Lewin). Several problems in computer vision are formulated as intricate energy minimization problems. In this thesis, motion analysis in image sequences is thoroughly investigated, showing that splitting an originally complex problem into simplified sub-problems yields improved accuracy, increased robustness, and a clear and accessible approach to state-of-the-art motion estimation techniques.

    In Chapter 4, optical flow is considered. Optical flow is commonly estimated by minimizing a combined energy consisting of a data term and a smoothness term. These two parts are decoupled, yielding a novel, iterative approach to optical flow. The derived Refinement Optical Flow framework is a clear and straightforward approach to computing the apparent image motion vector field, and it currently yields the most accurate motion estimation techniques in the literature. While this is an engineering approach of fine-tuning precision to the last detail, it also gives better insight into the problem of motion estimation. This profoundly contributes to state-of-the-art research in motion analysis, in particular facilitating the use of motion estimation in a wide range of applications.

    In Chapter 5, scene flow is rethought. Scene flow denotes the three-dimensional motion vector field for every image pixel, computed from a stereo image sequence. Again, decoupling the commonly coupled estimation of three-dimensional position and three-dimensional motion yields a scene flow approach with more accurate results and a considerably lower computational load. It produces a dense scene flow field and enables additional applications based on the dense three-dimensional motion vector field, which are to be investigated in the future. One such application is the segmentation of moving objects in an image sequence. Detecting moving objects within the scene is one of the most important cues to extract from image sequences of a dynamic environment; this is presented in Chapter 6.

    Scene flow and the segmentation of independently moving objects are only first steps towards machine visual kinesthesia. Throughout this work, I point out possible future work to improve the estimation of optical flow and scene flow. Chapter 7 additionally presents an outlook on future research for driver assistance applications. But there is much more to the full understanding of the three-dimensional dynamic scene. This work is meant to inspire the reader to think outside the box and contribute to the vision of building perceiving machines.
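
    The abstract leaves the thesis's exact energy unstated; purely as an illustrative sketch of the kind of decoupling described above, a generic variational optical flow energy combines a data term and a smoothness term (the robust penalty \Psi, the weight \alpha, and the coupling weight \theta below are assumed notation, not taken from the thesis):

        E(u) = \int_\Omega \Psi\big( I_1(\mathbf{x} + u(\mathbf{x})) - I_0(\mathbf{x}) \big)\, d\mathbf{x}
             + \alpha \int_\Omega \Psi\big( \|\nabla u(\mathbf{x})\|^2 \big)\, d\mathbf{x}

    Introducing an auxiliary flow field v with a quadratic coupling term decouples the two parts:

        E(u, v) = \int_\Omega \Psi\big( I_1(\mathbf{x} + u) - I_0 \big)\, d\mathbf{x}
                + \frac{1}{2\theta} \int_\Omega \|u - v\|^2\, d\mathbf{x}
                + \alpha \int_\Omega \Psi\big( \|\nabla v\|^2 \big)\, d\mathbf{x}

    so that minimization can alternate between a pointwise data-term step in u and a pure smoothing (denoising) step in v.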

    Combined Learned and Classical Methods for Real-Time Visual Perception in Autonomous Driving

    Full text link
    Autonomy, robotics, and Artificial Intelligence (AI) are among the main defining themes of next-generation societies. Among the most important applications of these technologies is driving automation, which spans from different Advanced Driver Assistance Systems (ADAS) to fully self-driving vehicles. Driving automation promises to reduce accidents, increase safety, and increase access to mobility for more people, such as the elderly and the handicapped. However, one of the main challenges facing autonomous vehicles is robust perception, which enables safe interaction and decision making. Of the many sensors used to perceive the environment, each with its own capabilities and limitations, vision is by far one of the main sensing modalities: cameras are cheap and provide rich information about the observed scene. Therefore, this dissertation develops a set of visual perception algorithms with a focus on autonomous driving as the target application area. The dissertation starts by addressing the problem of real-time motion estimation of an agent using only the visual input from a camera attached to it, a problem known as visual odometry. The visual odometry algorithm achieves low drift rates over long traveled distances, made possible by the innovative local mapping approach used. This visual odometry algorithm was then combined with my multi-object detection and tracking system. The tracking system operates in a tracking-by-detection paradigm, using an object detector based on convolutional neural networks (CNNs). The combined system can therefore detect and track other traffic participants both in the image domain and in the 3D world frame while simultaneously estimating vehicle motion, a necessary requirement for obstacle avoidance and safe navigation. Finally, the operational range of traditional monocular cameras was expanded with the capability to infer depth, and thus to replace stereo and RGB-D cameras. This is accomplished through a single-stream convolutional neural network which outputs both depth prediction and semantic segmentation. Semantic segmentation is the process of classifying each pixel in an image and is an important step toward scene understanding. A literature survey, algorithm descriptions, and comprehensive evaluations on real-world datasets are presented.
    Ph.D. dissertation, College of Engineering & Computer Science, University of Michigan. https://deepblue.lib.umich.edu/bitstream/2027.42/153989/1/Mohamed Aladem Final Dissertation.pdf
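
    As a minimal sketch of the single-stream idea described above (the layer sizes, class count, and names below are illustrative assumptions, not the dissertation's architecture), a shared encoder can feed two task heads so that depth and segmentation reuse the same features:

        import torch
        import torch.nn as nn

        class SharedEncoderNet(nn.Module):
            """Single-stream CNN: one shared encoder, two per-pixel heads
            (depth regression and semantic segmentation)."""
            def __init__(self, num_classes=19):
                super().__init__()
                self.encoder = nn.Sequential(  # shared features, computed once
                    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                )
                self.depth_head = nn.Sequential(
                    nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(32, 1, 1),            # one channel: depth
                    nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
                )
                self.seg_head = nn.Sequential(
                    nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(32, num_classes, 1),  # per-class logits
                    nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
                )

            def forward(self, x):
                f = self.encoder(x)
                return self.depth_head(f), self.seg_head(f)

        depth, seg = SharedEncoderNet()(torch.randn(1, 3, 128, 256))
        # depth: (1, 1, 128, 256), seg: (1, 19, 128, 256)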

    Real-time stereo semi-global matching for video processing using previous incremental information

    Get PDF
    This paper presents an incremental stereo algorithm designed to compute a real-time disparity image. The algorithm is designed for stereo video sequences and uses information from previous frames to reduce computation time and improve disparity image quality. It is based on the semi-global matching stereo algorithm, modified to reuse previously calculated information. Storing and reusing this information not only reduces computation time but also improves accuracy in a cost filtering scheme. Experiments comparing the computation time and results of the algorithm show that it can achieve better results in terms of quality and time than standard algorithms in some scenarios.
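
    The paper's incremental reuse scheme is not reproduced here, but the baseline it modifies is the standard semi-global matching recurrence. Below is a sketch of one aggregation path over a precomputed cost volume (the penalties P1 and P2 and the left-to-right direction are illustrative; full SGM sums 8-16 such paths, and the incremental variant additionally reuses stored path costs from the previous frame):

        import numpy as np

        def sgm_aggregate_left_to_right(cost, P1=10, P2=120):
            """Aggregate a (H, W, D) matching-cost volume along one path:
            L(p,d) = C(p,d) + min(L(q,d), L(q,d-1)+P1, L(q,d+1)+P1,
                                  min_k L(q,k) + P2) - min_k L(q,k)
            where q is the previous pixel on the path."""
            H, W, D = cost.shape
            L = np.empty((H, W, D), dtype=np.float64)
            L[:, 0] = cost[:, 0]
            for x in range(1, W):
                prev = L[:, x - 1]                                  # (H, D)
                prev_min = prev.min(axis=1, keepdims=True)          # min_k L(q,k)
                dm1 = np.pad(prev[:, :-1], ((0, 0), (1, 0)),
                             constant_values=np.inf) + P1           # L(q,d-1)+P1
                dp1 = np.pad(prev[:, 1:], ((0, 0), (0, 1)),
                             constant_values=np.inf) + P1           # L(q,d+1)+P1
                best = np.minimum(np.minimum(prev, dm1),
                                  np.minimum(dp1, prev_min + P2))
                L[:, x] = cost[:, x] + best - prev_min
            return L

    The disparity map is then the per-pixel argmin over d of the path costs summed across all directions.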

    Event-based neuromorphic stereo vision

    Full text link

    Event-based Vision: A Survey

    Get PDF
    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location, and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low-latency, high-speed, and high-dynamic-range settings. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
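
    To make the event representation concrete, here is a minimal sketch (the record layout and the window-based accumulation are generic illustrations, not tied to any particular sensor or event-camera library):

        import numpy as np

        # One event = (timestamp [s], x, y, polarity in {-1, +1}).
        events = np.array([(12e-6, 10, 5, 1), (31e-6, 11, 5, -1)],
                          dtype=[('t', 'f8'), ('x', 'i4'), ('y', 'i4'), ('p', 'i4')])

        def accumulate(events, shape, t0, t1):
            """Sum signed polarities per pixel over the window [t0, t1):
            one simple way to build a frame-like image from an event stream."""
            frame = np.zeros(shape, dtype=np.int32)
            win = events[(events['t'] >= t0) & (events['t'] < t1)]
            np.add.at(frame, (win['y'], win['x']), win['p'])  # handles repeated pixels
            return frame

        img = accumulate(events, (128, 128), 0.0, 1e-4)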

    Stereoscopic depth estimation for online vision systems

    Get PDF
    A lot of work has been done in the area of machine stereo vision, but a severe drawback of today's algorithms is that they either achieve high accuracy and robustness by sacrificing real-time speed, or they are real-time capable but have major deficiencies in quality. To tackle this problem, this thesis presents two new methods which exhibit a very good balance between computational effort and depth accuracy. First, the summed normalized cross-correlation (SNCC) is proposed, a new cost function for block-matching stereo processing. In contrast to most standard cost functions, it hardly suffers from the fattening effect while being computationally very efficient. Second, direct surface fitting, a new algorithm for fitting parametric surface models to stereo images, is introduced. This algorithm is inspired by homography-constrained gradient descent methods but, in contrast to these, also allows the estimation of non-planar surfaces. Experimental evaluations demonstrate that both newly introduced algorithms are competitive with the state of the art in terms of accuracy while requiring much less computation time.

    Human visual perception is strongly influenced by stereoscopic vision, where three-dimensional perception arises from the slightly different viewpoints of the two eyes. It is a natural assumption that machine vision systems can likewise benefit from a comparable sense. Although there is already a large body of work in the field of machine stereoscopic vision, today's algorithms either do not meet the requirements for efficient computation or offer only low accuracy and robustness.

    The goal of this thesis is the development of stereoscopic algorithms that are real-time capable and suited to real-world use. In particular, the computation should be lightweight enough to run on mobile platforms. To this end, two new methods are presented, both characterized by a good balance between speed and accuracy.

    First, the Summed Normalized Cross-Correlation (SNCC) is introduced, a new cost function for block-matching stereoscopic depth estimation. In contrast to most other cost functions, SNCC is not susceptible to the quality-degrading fattening effect, yet can still be computed very efficiently. Accuracy evaluations on standard benchmarks show that SNCC brings local block-matching stereo computation close to the accuracy of globally optimizing methods based on graph cuts or belief propagation.

    The second proposed method is direct surface fitting, a new algorithm for fitting parametric surface models to stereo images. This algorithm is inspired by homography-constrained gradient descent, which is frequently used to estimate the pose of planar surfaces in space. By replacing the gradient descent with the direct search method of Hooke and Jeeves, the planar estimation is extended to arbitrary parametric surface models and arbitrary cost functions. A comparison on standard benchmarks shows that direct surface fitting achieves accuracy comparable to the state of the art while being more robust in challenging situations.

    To substantiate the real-world suitability and efficiency of the proposed methods, they were integrated into an automotive and a robotic system. The experiments carried out with these mobile systems demonstrate the high robustness and stability of the introduced methods.
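
    The key idea behind SNCC can be sketched in a few lines: normalized cross-correlation over small patches, followed by a second summation stage over a larger window (the window sizes and the single-disparity slice below are illustrative assumptions, not the thesis's exact implementation):

        import numpy as np
        from scipy.ndimage import uniform_filter

        def sncc_score(left, right, d, ncc_size=3, sum_size=9):
            """SNCC matching score for one disparity hypothesis d (float images):
            stage 1 computes NCC over small ncc_size x ncc_size patches,
            stage 2 averages those scores over a larger summation window,
            which avoids the fattening effect of large correlation windows."""
            shifted = np.roll(right, d, axis=1)          # align right image by d
            mu_l = uniform_filter(left, ncc_size)
            mu_r = uniform_filter(shifted, ncc_size)
            var_l = uniform_filter(left * left, ncc_size) - mu_l**2
            var_r = uniform_filter(shifted * shifted, ncc_size) - mu_r**2
            cov = uniform_filter(left * shifted, ncc_size) - mu_l * mu_r
            ncc = cov / np.sqrt(np.maximum(var_l * var_r, 1e-10))
            return uniform_filter(ncc, sum_size)         # stage 2: summation

    A block-matching pipeline evaluates sncc_score for every candidate disparity and picks the per-pixel argmax.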

    Towards a Common Software/Hardware Methodology for Future Advanced Driver Assistance Systems

    Get PDF
    The European research project DESERVE (DEvelopment platform for Safe and Efficient dRiVE, 2012-2015) had the aim of designing and developing a platform tool to cope with the continuously increasing complexity and the simultaneous need to reduce cost for future embedded Advanced Driver Assistance Systems (ADAS). For this purpose, the DESERVE platform profits from cross-domain software reuse, standardization of automotive software component interfaces, and easy but safety-compliant integration of heterogeneous modules. This enables the development of a new generation of ADAS applications, which challengingly combine different functions, sensors, actuators, hardware platforms, and Human Machine Interfaces (HMI). This book presents the different results of the DESERVE project concerning the ADAS development platform, test case functions, and validation and evaluation of different approaches. The reader is invited to substantiate the content of this book with the deliverables published during the DESERVE project. Technical topics discussed in this book include:Modern ADAS development platforms;Design space exploration;Driving modelling;Video-based and Radar-based ADAS functions;HMI for ADAS;Vehicle-hardware-in-the-loop validation system

    Irish Machine Vision and Image Processing Conference Proceedings 2017

    Get PDF