162 research outputs found

    Stereo vision-based road estimation assisted by efficient planar patch calculation

    Get PDF

    Design and modeling of a stair climber smart mobile robot (MSRox)

    Full text link

    Computational Modeling of Human Dorsal Pathway for Motion Processing

    Get PDF
    Reliable motion estimation in videos is of crucial importance for background identification, object tracking, action recognition, event analysis, self-navigation, and related tasks. Reconstructing the motion field in the 2D image plane is very challenging due to variations in image quality, scene geometry, and lighting conditions, and, most importantly, camera jitter. Traditional optical flow models assume consistent image brightness and a smooth motion field, assumptions violated by the unstable illumination and motion discontinuities common in real-world videos.

    To recognize observer (camera) motion robustly in complex, realistic scenarios, we propose a biologically inspired motion estimation system that overcomes the issues posed by real-world videos. The bottom-up model is inspired by the infrastructure and functionality of the human dorsal pathway, and its hierarchical processing stream can be divided into three stages: 1) spatio-temporal processing for local motion, 2) recognition of global motion patterns (camera motion), and 3) preemptive estimation of object motion. To extract effective and meaningful motion features, we apply a series of steerable spatio-temporal filters that detect local motion at different speeds and directions in a velocity-selective way. The intermediate response maps are calibrated and combined to estimate dense motion fields in local regions, and local motions along two orthogonal axes are then aggregated to recognize planar, radial, and circular patterns of global motion. We evaluate the model on an extensive, realistic video database collected by hand with a mobile device (an iPad), whose content varies in scene geometry, lighting conditions, view perspective, and depth. We achieve high-quality results and demonstrate that this bottom-up model is capable of extracting high-level semantic knowledge about self-motion in realistic scenes.

    Once the global motion is known, we segment objects from moving backgrounds by compensating for camera motion. For videos captured with non-stationary cameras, we consider global motion as a combination of camera motion (background) and object motion (foreground). To estimate foreground motion, we exploit the corollary discharge mechanism of biological systems and estimate motion preemptively. Since the background motion at each pixel is collectively introduced by camera movement, we apply spatio-temporal averaging to estimate the background motion at the pixel level, and an initial estimate of foreground motion is derived by comparing global motion and background motion at multiple spatial levels. The actual frame signals are then compared with those derived by forward prediction, refining the object motion estimates. This motion detection system is applied to detecting objects against cluttered, moving backgrounds and proves efficient at locating independently moving, non-rigid regions.

    The core contribution of this thesis is a robust motion estimation system for complicated real-world videos, with challenges posed by real sensor noise, complex natural scenes, variations in illumination and depth, and motion discontinuities. The overall system demonstrates biological plausibility and holds great potential for other applications, such as camera motion removal, heading estimation, obstacle avoidance, route planning, and vision-based navigational assistance.
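    The abstract gives no implementation details; as a rough illustration of the first stage, the sketch below builds a small bank of velocity-tuned spatio-temporal quadrature filters and computes per-pixel motion energy, in the spirit of Adelson-Bergen energy models. The filter sizes, speeds, and energy combination are illustrative assumptions, not the dissertation's calibrated filter bank.

```python
import numpy as np
from scipy.ndimage import convolve

def st_gabor_pair(direction, speed, size=9, sigma=2.0, freq=0.25):
    """Quadrature pair of spatio-temporal (t, y, x) Gabor filters tuned to
    motion along `direction` (radians) at `speed` (pixels per frame): a
    drifting grating cos(2*pi*f*(x*cosA + y*sinA - speed*t)) moves at
    exactly that velocity, so the pair responds selectively to it."""
    r = np.arange(size) - size // 2
    t, y, x = np.meshgrid(r, r, r, indexing="ij")
    phase = 2 * np.pi * freq * (x * np.cos(direction)
                                + y * np.sin(direction) - speed * t)
    envelope = np.exp(-(x**2 + y**2 + t**2) / (2 * sigma**2))
    return envelope * np.cos(phase), envelope * np.sin(phase)

def motion_energy_maps(clip, n_directions=8, speeds=(0.5, 1.0, 2.0)):
    """Per-pixel motion energy of a grayscale float clip shaped (T, H, W),
    one map per (direction, speed) channel. Squaring and summing the
    quadrature pair gives a phase-invariant response, a standard model of
    complex cells along the dorsal pathway."""
    channels = {}
    for k in range(n_directions):
        angle = 2 * np.pi * k / n_directions
        for s in speeds:
            even, odd = st_gabor_pair(angle, s)
            channels[(k, s)] = convolve(clip, even) ** 2 + convolve(clip, odd) ** 2
    return channels
```

    A dense local motion field could then be read out per pixel, e.g. as the energy-weighted average of the channel velocities, and the global pattern (planar, radial, or circular) recognized by pooling local motions along two orthogonal axes, as the abstract describes.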

    Multiple Lane Detection Algorithm Based on Novel Dense Vanishing Point Estimation

    Get PDF

    ëŹŽìž ìžìœšìŁŒí–‰ 찚량을 위한 닚안 ìčŽë©”띌 êž°ë°˜ 싀시간 ìŁŒí–‰ 환êČœ 읞식 êž°ëČ•ì— ꎀ한 ì—°ê”Ź

    Get PDF
    Doctoral dissertation (Ph.D.), Graduate School of Seoul National University, Department of Electrical Engineering, February 2014. Advisor: Seung-Woo Seo.

    Homo faber refers to humans as beings who control their environment through tools. From the beginning, humans have created tools in pursuit of a more convenient life. The desire for rapid movement led humans to ride on horseback, to build the wagon, and finally to build the automobile. The vehicle made it possible for humans to travel long distances quickly and conveniently. However, since human beings are imperfect, many people have died in car accidents, and people are dying at this very moment. Research on autonomous vehicles has been conducted as the best alternative for satisfying the human desire for safety, and the dream of the autonomous vehicle will come true in the near future.

    Implementing an autonomous vehicle requires many kinds of techniques, among which recognition of the environment around the vehicle is one of the most fundamental and important problems. Many kinds of sensors can be used to recognize surrounding objects, but the monocular camera gathers the most information among them, can be used for a wide variety of purposes, and can be adopted for various vehicle types because of its price competitiveness. I therefore expect research using the monocular camera for autonomous vehicles to be practical and useful. In this dissertation, I cover three important recognition problems for autonomous driving using a monocular camera in the vehicular environment.

    Firstly, to drive autonomously, the vehicle has to recognize lanes and keep to its lane. Detecting lane markings under varying illumination is a very difficult image processing problem, yet it must be solved for autonomous driving. The first research topic is therefore robust lane marking extraction under illumination variations for multilane detection. I propose a new lane marking extraction filter that can detect imperfect lane markings, together with a new false positive cancelling algorithm that eliminates noise markings. This approach extracts lane markings successfully even under bad illumination conditions.

    Secondly, if there is no lane marking on the road, how can the autonomous vehicle recognize the road on which to drive? And which lane of the road is the vehicle currently in? The latter question matters because the decision to change or keep a lane depends on the current lane position. The second research topic handles these two problems: I propose an approach that fuses road detection with lane position estimation.

    Thirdly, to drive more safely, keeping a safe distance is very important, and much driving safety equipment requires distance information. Measuring accurate inter-vehicle distance using a monocular camera and a line laser is the third research topic. To measure the inter-vehicle distance, I project the line laser onto the front of the vehicle and measure the length of the laser line and the lane width in the image. Based on the imaging geometry, the distance calculation problem can then be solved accurately.

    Many important problems remain, and I propose several monocular-camera-based approaches to handle them.
    I expect that very active research will continue to be conducted and that, based on this research, the era of the autonomous vehicle will arrive in the near future.

    Table of contents:
    1 Introduction
      1.1 Background and Motivations
      1.2 Contributions and Outline of the Dissertation
        1.2.1 Illumination-Tolerant Lane Marking Extraction for Multilane Detection
        1.2.2 Fusing Road Detection and Lane Position Estimation for the Robust Road Boundary Estimation
        1.2.3 Accurate Inter-Vehicle Distance Measurement based on Monocular Camera and Line Laser
    2 Illumination-Tolerant Lane Marking Extraction for Multilane Detection
      2.1 Introduction
      2.2 Lane Marking Candidate Extraction Filter
        2.2.1 Requirements of the Filter
        2.2.2 A Comparison of Filter Characteristics
        2.2.3 Cone Hat Filter
      2.3 Overview of the Proposed Algorithm
        2.3.1 Filter Width Estimation
        2.3.2 Top Hat (Cone Hat) Filtering
        2.3.3 Reiterated Extraction
        2.3.4 False Positive Cancelling
          2.3.4.1 Lane Marking Center Point Extraction
          2.3.4.2 Fast Center Point Segmentation
          2.3.4.3 Vanishing Point Detection
          2.3.4.4 Segment Extraction
          2.3.4.5 False Positive Filtering
      2.4 Experiments and Evaluation
        2.4.1 Experimental Set-up
        2.4.2 Conventional Algorithms for Evaluation
          2.4.2.1 Global Threshold
          2.4.2.2 Positive Negative Gradient
          2.4.2.3 Local Threshold
          2.4.2.4 Symmetry Local Threshold
          2.4.2.5 Double Extraction using Symmetry Local Threshold
          2.4.2.6 Gaussian Filter
        2.4.3 Experimental Results
        2.4.4 Summary
    3 Fusing Road Detection and Lane Position Estimation for the Robust Road Boundary Estimation
      3.1 Introduction
      3.2 Chromaticity-based Flood-fill Method
        3.2.1 Illuminant-Invariant Space
        3.2.2 Road Pixel Selection
        3.2.3 Flood-fill Algorithm
      3.3 Lane Position Estimation
        3.3.1 Lane Marking Extraction
        3.3.2 Proposed Lane Position Detection Algorithm
        3.3.3 Bird's-eye View Transformation using the Proposed Dynamic Homography Matrix Generation
        3.3.4 Next Lane Position Estimation based on the Cross-ratio
        3.3.5 Forward-looking View Transformation
      3.4 Information Fusion Between Road Detection and Lane Position Estimation
        3.4.1 The Case of Detection Failures
        3.4.2 The Benefit of Information Fusion
      3.5 Experiments and Evaluation
      3.6 Summary
    4 Accurate Inter-Vehicle Distance Measurement based on Monocular Camera and Line Laser
      4.1 Introduction
      4.2 Proposed Distance Measurement Algorithm
      4.3 Experiments and Evaluation
        4.3.1 Experimental System Set-up
        4.3.2 Experimental Results
      4.4 Summary
    5 Conclusion
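    The cone-hat filter itself is not detailed in this record, but the row-wise top-hat idea it refines (a bright bar of roughly known width flanked by darker pavement on both sides, with the expected width estimated from perspective) can be sketched as follows. The linear width model, the 0.5 threshold, and the helper names are illustrative assumptions, not the dissertation's exact formulation.

```python
import numpy as np

def top_hat_response(row, width):
    """Top-hat-style response along one image row: a lane marking is a
    bright bar of about `width` pixels that outshines the pavement on
    BOTH sides, which suppresses one-sided edges such as shadow borders."""
    n = row.size
    resp = np.zeros(n)
    for x in range(width, n - 2 * width):
        center = row[x:x + width].mean()
        left = row[x - width:x].mean()
        right = row[x + width:x + 2 * width].mean()
        resp[x] = min(center - left, center - right)
    return resp

def extract_lane_candidates(gray, horizon, base_width=20):
    """Scan rows below the horizon; the expected marking width shrinks
    linearly toward the horizon (a simple perspective model, assumed here
    in place of the dissertation's filter width estimation)."""
    h, _ = gray.shape
    mask = np.zeros(gray.shape, dtype=bool)
    for y in range(horizon + 1, h):
        width = max(2, int(base_width * (y - horizon) / (h - horizon)))
        resp = top_hat_response(gray[y].astype(float), width)
        if resp.max() > 0:
            mask[y] = resp > 0.5 * resp.max()
    return mask
```

    Taking the minimum of the two side contrasts is what makes the response "hat-shaped": a candidate must be brighter than the road on both sides, which a plain gradient filter does not enforce; the surviving candidates would then feed the false positive cancelling stage outlined in the table of contents.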

    Visual Attention in Dynamic Environments and its Application to Playing Online Games

    Get PDF
    In this thesis we present a prototype of Cognitive Programs (CPs) - an executive controller built on top of the Selective Tuning (ST) model of attention. CPs enable top-down control of the visual system and interaction between low-level vision and higher-level task demands.

    We implement a subset of CPs for playing online video games in real time using only visual input. Two commercial closed-source games - Canabalt and Robot Unicorn Attack - are used for evaluation. Their simple gameplay and minimal controls put the emphasis on reaction speed and attention rather than planning.

    Our implementation of Cognitive Programs plays both games at human expert level, which experimentally demonstrates the validity of the concept. Additionally, we resolved multiple theoretical and engineering issues, e.g. extending CPs to dynamic environments, finding suitable data structures for describing the task and the information flow within the network, and determining the correct timing for each process.
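    The abstract does not describe the controller's internals; purely as an illustration of the kind of see-attend-act loop such an agent runs on an endless-runner game, here is a minimal sketch. The frame grabber, the jump action, the change-based attention stub, and all the numbers are placeholders and assumptions, not the thesis's Cognitive Programs or Selective Tuning machinery.

```python
import time
import numpy as np

def grab_frame():
    """Placeholder frame grabber; a real agent would capture the game
    window with a screen-capture library."""
    return np.random.rand(240, 320)  # stand-in for a grayscale frame

def press_jump():
    """Placeholder for issuing the game's single control (jump)."""
    print("jump")

def attend(frame, prev):
    """Crude attention stub: fixate the pixel of largest frame-to-frame
    change, loosely standing in for attentional selection of a salient
    proto-object."""
    diff = np.abs(frame - prev)
    y, x = np.unravel_index(np.argmax(diff), diff.shape)
    return x, y, diff[y, x]

def run(avatar_x=60, lookahead=80, threshold=0.5, fps=30):
    """See-attend-act loop; reacts when something salient appears just
    ahead of the avatar (in an endless runner, usually a gap or obstacle)."""
    prev = grab_frame()
    while True:
        frame = grab_frame()
        x, y, strength = attend(frame, prev)
        if strength > threshold and avatar_x < x < avatar_x + lookahead:
            press_jump()
        prev = frame
        time.sleep(1.0 / fps)  # crude pacing; the thesis treats timing carefully
```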

    Obstacle detection of 3D imaging depth images by supervised Laplacian eigenmap dimension reduction

    Get PDF
    In this paper, we propose an obstacle detection method for 3D imaging sensors based on supervised Laplacian eigenmap manifold learning. The paper first analyses the depth ambiguity problem of 3D depth images; the ambiguity boundary line and intensity images are then used to eliminate the ambiguity and to extract the non-ambiguous regions of the depth image. The unambiguous 3D information is applied directly to the manifold learning stage, where we use a biased distance in a supervised Laplacian eigenmap to realize a non-linear dimensionality reduction of the depth data. In the experiments, the 3D coordinate information of obstacles and non-obstacles is used as the training data for manifold learning. Experimental results show that our model can effectively eliminate the depth ambiguity of 3D images and realize obstacle detection and identification; the method also shows good stability with respect to 3D imaging noise.
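    The biased distance is not spelled out in this record; a common supervised variant shrinks distances between same-class training points and stretches them across classes before the usual graph construction and eigen-decomposition. The sketch below follows that generic recipe, so the bias factor, kernel width, and neighbourhood size are assumptions rather than the paper's parameters.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def supervised_laplacian_eigenmap(X, labels, n_components=2,
                                  beta=0.5, sigma=1.0, k=10):
    """Embed points X (n, d) using a label-biased distance: same-class
    distances are shrunk by beta, different-class ones stretched by 1/beta,
    then a standard Laplacian eigenmap embedding is computed."""
    labels = np.asarray(labels)
    d = cdist(X, X)
    same = labels[:, None] == labels[None, :]
    d_biased = np.where(same, beta * d, d / beta)   # beta in (0, 1]
    W = np.exp(-d_biased**2 / (2 * sigma**2))       # heat-kernel weights
    np.fill_diagonal(W, 0.0)                        # no self-loops
    # Keep only each point's k nearest neighbours under the biased distance.
    far = np.argsort(d_biased, axis=1)[:, k + 1:]
    np.put_along_axis(W, far, 0.0, axis=1)
    W = np.maximum(W, W.T)                          # symmetrize the graph
    D = np.diag(W.sum(axis=1) + 1e-12)              # ridge keeps D invertible
    L = D - W                                       # unnormalized Laplacian
    # Generalized eigenproblem L y = lambda D y; drop the constant eigenvector.
    vals, vecs = eigh(L, D)
    return vecs[:, 1:n_components + 1]
```

    With beta = 1 this reduces to the ordinary unsupervised Laplacian eigenmap; a smaller beta pulls same-class (e.g. obstacle) points together in the embedding, which is what makes the reduced coordinates easier to separate for detection.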

    Binocular Ego-Motion Estimation for Driver Assistance Applications (Binokulare EigenbewegungsschĂ€tzung fĂŒr Fahrerassistenzanwendungen)

    Get PDF
    Driving can be dangerous. Humans become inattentive when performing a monotonous task like driving, and the risk implied by multitasking, such as using a cellular phone while driving, can break the driver's concentration and increase the risk of accidents. Other factors like exhaustion, nervousness, and excitement affect the driver's performance and response time. Consequently, car manufacturers have developed systems over the last decades that assist the driver under various circumstances. These systems are called driver assistance systems. Driver assistance systems are meant to support the task of driving, and their field of action ranges from alerting the driver with acoustic or optical warnings to taking control of the car, for example keeping the vehicle in the traffic lane until the driver resumes control. For this purpose, the vehicle is equipped with on-board sensors which allow perception of the environment and/or the state of the vehicle. Cameras are sensors which extract useful information about the visual appearance of the environment, and a binocular system additionally allows the extraction of 3D information.

    One of the main requirements for most camera-based driver assistance systems is accurate knowledge of the motion of the vehicle. Some sources of such information, like velocimeters and GPS, are in common use in vehicles today; nevertheless, the resolution and accuracy usually achieved with these systems are not sufficient for many real-time applications. The computation of ego-motion from sequences of stereo images for the implementation of intelligent driving systems, such as autonomous navigation or collision avoidance, constitutes the core of this thesis.

    This dissertation proposes a framework for the simultaneous computation of the 6 degrees of freedom of ego-motion (rotation and translation in 3D Euclidean space), the estimation of the scene structure, and the detection and estimation of independently moving objects. The input is provided exclusively by a binocular system, and the framework does not call for any data acquisition strategy, i.e. the stereo images are processed just as they are provided. Stereo allows one to establish correspondences between left and right images, estimating 3D points of the environment via triangulation. Likewise, feature tracking establishes correspondences between images acquired at different time instances. When both are used together for a large number of points, the result is a set of clouds of 3D points with point-to-point correspondences between clouds.

    The apparent motion of the 3D points between consecutive frames has a variety of causes. The most dominant motion for most of the points is caused by the ego-motion of the vehicle: as the vehicle moves and images are acquired, the relative position of the world points with respect to the vehicle changes. Motion is also caused by objects moving in the environment; they move independently of the vehicle, so the observed motion for these points is the sum of the ego-vehicle motion and the independent motion of the object. A third cause, of paramount importance in vision applications, is correspondence errors, i.e. the incorrect spatial or temporal assignment of point-to-point correspondences. Furthermore, all the points in the clouds are actually noisy measurements of the real, unknown 3D points of the environment.
    Solving for ego-motion and scene structure from the clouds of points requires a prior analysis of the noise involved in the imaging process and of how it propagates as the data is processed. Therefore, this dissertation analyzes the noise properties of the 3D points obtained through stereo triangulation. This leads to the detection of a bias in the estimation of 3D position, which is corrected with a reformulation of the projection equation. Ego-motion is obtained by finding the rotation and translation between the two clouds of points. This problem is known as absolute orientation, and many solutions based on least squares have been proposed in the literature; this thesis reviews the available closed-form solutions to the problem.

    The proposed framework is divided into three main blocks: 1) stereo and feature tracking computation, 2) ego-motion estimation, and 3) estimation of 3D point position and 3D velocity. The first block solves the correspondence problem, providing the clouds of points as output; no special implementation of this block is required in this thesis. The ego-motion block computes the motion of the cameras by finding the absolute orientation between the clouds of static points in the environment. Since the clouds might contain independently moving objects and outliers generated by false correspondences, a direct least squares computation can lead to an erroneous solution. The first contribution of this thesis is an effective rejection rule, the Smoothness Motion Constraint (SMC), which detects outliers based on the distance between predicted and measured quantities and reduces the effect of noisy measurements by assigning appropriate weights to the data. The ego-motion of the camera between two frames is obtained by finding the absolute orientation between consecutive clouds of weighted 3D points. The complete ego-motion since initialization is obtained by concatenating the individual motion estimates, which leads to a super-linear propagation of the error, since noise is integrated. A second contribution of this dissertation is a predictor/corrector iterative method which integrates the clouds of 3D points of multiple time instances into the computation of ego-motion; it considerably reduces the accumulation of errors in the estimated ego-position of the camera. Another contribution is a method which recursively estimates the 3D world position of a point and its velocity by fusing stereo, feature tracking, and the estimated ego-motion in a Kalman filter system. The improved position estimates obtained this way are used in the subsequent system cycle, resulting in an improved computation of ego-motion. The general contribution of this dissertation is a single framework for the real-time computation of scene structure, independently moving objects, and ego-motion for automotive applications.

    German abstract: Driving can be dangerous. Driving performance is influenced by the driver's physical and mental limits and by external factors such as the weather. Driver assistance systems increase driving comfort and support the driver in order to reduce the number of accidents. They range from warning the driver with optical or acoustic signals to the system taking over control of the car. One of the main prerequisites for most driver assistance systems is accurate knowledge of the motion of the ego-vehicle.
    Today various sensors are available to measure the motion of the vehicle, for example GPS and the speedometer, but the resolution and accuracy of these systems are not sufficient for many real-time applications. The computation of ego-motion from stereo image sequences for driver assistance systems, e.g. for autonomous navigation or collision avoidance, forms the core of this work. This dissertation presents a system for the real-time evaluation of a scene, including the detection and evaluation of independently moving objects as well as the accurate estimation of the six degrees of freedom of ego-motion. These fundamental components are required to develop many intelligent automotive applications that support the driver in different traffic situations. The system works exclusively with a stereo camera platform as its sensor. Computing ego-motion and scene structure requires an analysis of the noise and error propagation in the image processing chain, so this dissertation analyzes the noise properties of the 3D points obtained by stereo triangulation. This leads to the discovery of a systematic error in the estimation of the 3D position, which can be corrected with a reformulation of the projection equation. Simulation results show that a significant reduction of the error in the estimated 3D point position is possible. The ego-motion estimate is obtained by estimating the rotation and translation between point clouds. This problem is known as "absolute orientation", and many least-squares solutions have been proposed in the literature; this work reviews the available closed-form solutions. The presented system is structured into three essential building blocks: 1) registration of image features, 2) ego-motion estimation, and 3) iterative estimation of the 3D position and 3D velocity of world points. The first block receives a sequence of rectified images as input and produces a list of tracked image features with their corresponding 3D positions. The ego-motion block consists of four main steps in a loop: 1) motion prediction, 2) application of the Smoothness Motion Constraint (SMC), 3) computation of the absolute orientation, and 4) motion integration. The SMC proposed in this dissertation is a powerful criterion for rejecting outliers and for assigning weights to the measured 3D points. Simulations are carried out with Gaussian and slash-distributed noise; the results show the superiority of the SMC over standard weighting methods. The stability of the results with respect to outliers was analyzed, with the result that the breakdown point is greater than 50%. Executing the four steps iteratively yields a predictor-corrector method. We call this estimation multi-frame estimation, in contrast to two-frame estimation, which considers only the current and previous image pairs for the computation of ego-motion. The first iteration is carried out between the current and the previous cloud of points; each further iteration integrates an additional point cloud from an earlier time instant. This method reduces the error accumulation when integrating several estimates into a single global estimate.
    Simulation results show that, although the error still grows super-linearly over time, its magnitude is reduced by several orders of magnitude. The third block consists of the iterative estimation of the 3D position and 3D velocity of world points. Here a Kalman-filter-based method is used which fuses stereo, feature tracking, and ego-motion data. Measurements of a world point's position are obtained through the stereo camera system, and differentiating the estimated position additionally allows the estimation of its velocity. The measurements enter through a measurement model that fuses stereo and motion data; simulation results validate the model, and the reduction of the position uncertainty over time is demonstrated with a Monte Carlo simulation. Experimental results are obtained on long image sequences. Additional tests, including a 3D reconstruction of a forest scene and the computation of unconstrained camera motion in an indoor scenario, were carried out. The method shows good results in all cases, and the algorithm also delivers acceptable results when estimating the pose of small objects, such as the heads and legs of real crash test dummies.
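    The absolute orientation step has well-known closed-form solutions; the sketch below uses the SVD-based (Kabsch/Umeyama-style) weighted solution, together with a simple residual gate standing in for the Smoothness Motion Constraint, whose exact form is not given in this record. The threshold and weighting function are illustrative assumptions.

```python
import numpy as np

def weighted_absolute_orientation(P, Q, w):
    """Closed-form rigid alignment: find R, t minimizing
    sum_i w_i * ||R @ P[i] + t - Q[i]||^2 for corresponded point clouds
    P, Q of shape (n, 3) and non-negative weights w of shape (n,)."""
    w = w / w.sum()
    p0 = (w[:, None] * P).sum(axis=0)            # weighted centroids
    q0 = (w[:, None] * Q).sum(axis=0)
    H = (w[:, None] * (P - p0)).T @ (Q - q0)     # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T                           # proper rotation, det = +1
    t = q0 - R @ p0
    return R, t

def gate_outliers(P, Q, R_pred, t_pred, tau=0.5):
    """Residual gate standing in for the SMC: predict each point with the
    motion prior, then zero out points that disagree (independently moving
    objects, false correspondences) and down-weight noisy ones."""
    residuals = np.linalg.norm(P @ R_pred.T + t_pred - Q, axis=1)
    return np.where(residuals < tau, 1.0 / (1.0 + residuals**2), 0.0)
```

    Iterating the two functions (predict with the last motion estimate, re-weight, re-solve) mirrors the predictor/corrector loop described above; concatenating the per-frame (R, t) estimates yields the ego-position over time.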
    • 

    corecore