11 research outputs found

    Motion Estimation from Disparity Images

    A new method for 3D rigid motion estimation from stereo is proposed in this paper. The appealing feature of this method is that it directly uses the disparity images obtained from stereo matching. We assume that the stereo rig has parallel cameras and show, in that case, the geometric and topological properties of the disparity images. Then we introduce a rigid transformation (called d-motion) that maps two disparity images of a rigidly moving object. We show how it is related to the Euclidean rigid motion, and a motion estimation algorithm is derived. We show with experiments that our approach is simple and more accurate than standard approaches.
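
    As context for that comparison, here is a minimal numpy sketch of the standard baseline such a method improves on: triangulating disparities into 3D points via the parallel-rig relation Z = f*b/d, then aligning the two clouds with a closed-form least-squares fit. The d-motion formulation itself is not reproduced here; the function names and the SVD-based alignment are illustrative choices, not the paper's algorithm.

```python
import numpy as np

def disparity_to_points(u, v, d, f, b, cu, cv):
    """Triangulate pixels (u, v) with disparities d from a parallel stereo rig.

    Parallel cameras give Z = f*b/d, X = (u - cu)*Z/f, Y = (v - cv)*Z/f.
    """
    Z = f * b / d
    return np.stack([(u - cu) * Z / f, (v - cv) * Z / f, Z], axis=-1)

def align_clouds(P, Q):
    """Closed-form least squares for R, t such that Q ~ (R @ P.T).T + t."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cp).T @ (Q - cq))    # 3x3 cross-covariance
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp
```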

    Range Segmentation Using Visibility Constraints

    Visibility constraints can aid the segmentation of foreground objects observed with multiple range images. In our approach, points are defined as foreground if they can be determined to occlude some empty space in the scene. We present an efficient algorithm to estimate foreground points in each range view using explicit epipolar search. In cases where the background pattern is stationary, we show how visibility constraints from other views can generate virtual background values at points with no valid depth in the primary view. We demonstrate the performance of both algorithms for detecting people in indoor office environments.
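
    A sketch of the core visibility test, under assumptions the abstract leaves open (a calibrated second range view with rotation R_b, translation t_b, intrinsics K_b, and a dense depth map): a point is flagged foreground when the other view measures a surface noticeably behind it, i.e. the point occludes space that view observed as empty. The explicit epipolar search and the virtual-background machinery of the actual algorithm are omitted.

```python
import numpy as np

def occludes_empty_space(p_world, K_b, R_b, t_b, depth_b, margin=0.05):
    """Return True if view B sees *past* the 3D point p_world, i.e. the point
    floats in front of B's measured surface (foreground evidence).

    A toy sketch of the test, not the paper's exact algorithm.
    """
    pc = R_b @ p_world + t_b              # point in view B's camera frame
    if pc[2] <= 0:
        return False                      # behind camera B: no evidence
    u, v, _ = K_b @ (pc / pc[2])          # pinhole projection into B
    ui, vi = int(round(u)), int(round(v))
    h, w = depth_b.shape
    if not (0 <= ui < w and 0 <= vi < h):
        return False                      # projects outside B's image
    z_b = depth_b[vi, ui]                 # depth B measured along this ray
    return np.isfinite(z_b) and z_b > pc[2] + margin
```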

    Estimation and prediction of the vehicle's motion based on visual odometry and Kalman filter

    Proceedings of the 14th International Conference, ACIVS 2012, Brno, Czech Republic, September 4-7, 2012. The movement of the vehicle is useful information for different applications, such as driver assistance systems or autonomous vehicles. This information can be obtained by different methods, for instance by using a GPS or by means of visual odometry. However, there are situations where neither method works correctly. For example, there are areas in urban environments where the GPS signal is not available, such as tunnels or streets with high buildings. On the other hand, computer vision algorithms are affected by outdoor environments, the main source of difficulty being variation in the lighting conditions. A method to estimate and predict the movement of the vehicle based on visual odometry and a Kalman filter is explained in this paper. The Kalman filter allows both filtering and prediction of vehicle motion, using the results of the visual odometry estimation. This work was supported by the Spanish Government through the CICYT projects FEDORA (Grant TRA2010-20255-C03-01) and Driver Distraction Detector System (Grant TRA2011-29454-C03-02), and by CAM through the SEGVAUTO-II project.
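
    A minimal sketch of the filtering-and-prediction idea, assuming a simple constant-velocity model and position-only measurements from visual odometry; the paper's actual state vector and noise models are not given in the abstract. When visual odometry fails (e.g. under harsh lighting), calling predict alone propagates the motion estimate until measurements return.

```python
import numpy as np

class ConstantVelocityKF:
    """Toy Kalman filter over planar position: state x = [px, py, vx, vy]."""

    def __init__(self, q=0.1, r=0.5):
        self.x = np.zeros(4)      # state estimate
        self.P = np.eye(4)        # state covariance
        self.q, self.r = q, r     # process / measurement noise levels

    def predict(self, dt):
        F = np.eye(4)
        F[0, 2] = F[1, 3] = dt                    # position += velocity * dt
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + self.q * np.eye(4)
        return self.x[:2]                         # predicted position

    def update(self, z):
        H = np.eye(2, 4)                          # observe position only
        S = H @ self.P @ H.T + self.r * np.eye(2)
        K = self.P @ H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ (z - H @ self.x)
        self.P = (np.eye(4) - K @ H) @ self.P
```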

    Detection of dominant planar surfaces in disparity images based on random sampling

    In this paper, the practical applicability of the RANSAC approach to planar surface detection in disparity images obtained by stereo vision is investigated. The study focuses on application in indoor environments, where many of the dominant surfaces are uniformly colored, which poses additional difficulties for stereo vision. Several simple modifications to the basic RANSAC algorithm are examined, and the improvements achieved by these modifications are evaluated. Two simple performance measures for evaluating the accuracy of planar surface detection are proposed. An experimental study is performed using images acquired by a stereo vision system mounted on a mobile robot moving in an indoor environment.
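
    A baseline sketch of the idea: for a parallel stereo rig, a planar 3D surface is also planar in (u, v, d) space, so a dominant plane d = a*u + b*v + c can be fitted to valid disparity pixels directly with plain RANSAC. The paper's modifications and its two accuracy measures are not reproduced; parameter values are illustrative.

```python
import numpy as np

def ransac_disparity_plane(u, v, d, iters=500, tol=1.0, rng=None):
    """Fit d = a*u + b*v + c by basic RANSAC; return (a, b, c) and inliers."""
    rng = rng or np.random.default_rng()
    A = np.column_stack([u, v, np.ones_like(u)])
    best, best_inl = None, np.zeros(len(d), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(d), size=3, replace=False)
        try:
            p = np.linalg.solve(A[idx], d[idx])   # plane through 3 samples
        except np.linalg.LinAlgError:
            continue                              # degenerate (collinear) sample
        inl = np.abs(A @ p - d) < tol
        if inl.sum() > best_inl.sum():
            best, best_inl = p, inl
    if best is None:
        return None, best_inl
    # Least-squares refinement on the consensus set.
    best = np.linalg.lstsq(A[best_inl], d[best_inl], rcond=None)[0]
    return best, best_inl
```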

    Egomotion estimation using binocular spatiotemporal oriented energy

    Camera egomotion estimation is concerned with the recovery of a camera's motion (e.g., instantaneous translation and rotation) as it moves through its environment. It has been demonstrated to be of both theoretical and practical interest. This thesis documents a novel algorithm for egomotion estimation based on binocularly matched spatiotemporal oriented energy distributions. Basing the estimation on oriented energy measurements makes it possible to recover egomotion without the need to establish temporal correspondences or convert disparity into 3D world coordinates. The resulting algorithm has been realized in software and evaluated quantitatively on a novel laboratory dataset with ground truth, as well as qualitatively on both indoor and outdoor real-world datasets. Performance is evaluated relative to comparable alternative algorithms and shown to exhibit the best overall performance.
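
    A toy sketch of the basic measurement, assuming a Gabor quadrature pair as the oriented filter (the thesis' exact filter bank is not specified in the abstract): squaring and summing the even and odd responses along a spatiotemporal direction yields phase-insensitive oriented energy, which is what makes explicit temporal correspondences unnecessary.

```python
import numpy as np
from scipy.ndimage import convolve

def oriented_energy(volume, direction, sigma=2.0, freq=0.25, size=9):
    """Oriented energy of a (t, y, x) volume along a unit 3-vector direction."""
    ax = np.arange(size) - size // 2
    T, Y, X = np.meshgrid(ax, ax, ax, indexing="ij")
    phase = 2 * np.pi * freq * (direction[0] * T + direction[1] * Y + direction[2] * X)
    gauss = np.exp(-(T**2 + Y**2 + X**2) / (2 * sigma**2))
    even = convolve(volume, gauss * np.cos(phase))   # cosine-phase response
    odd = convolve(volume, gauss * np.sin(phase))    # sine-phase response
    return even**2 + odd**2                          # phase-insensitive energy
```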

    Recognition of Holoscopic 3D Video Hand Gesture Using Convolutional Neural Networks

    Copyright © 2020 by the authors. The convolutional neural network (CNN) algorithm is one of the most efficient techniques for recognizing hand gestures. In human-computer interaction, a human gesture is a non-verbal communication mode, as users communicate with a computer via input devices. In this article, 3D micro hand gesture recognition experiments on disparity data are proposed using a CNN. The study includes twelve 3D micro hand motions recorded for three different subjects. The system is validated by an experiment implemented on twenty different subjects of different ages. The results are analysed and evaluated based on execution time, training, testing, sensitivity, specificity, positive and negative predictive value, and likelihood. The CNN training results show an accuracy as high as 100%, representing superior performance on all factors, while the validation results average about 99% accuracy. The CNN algorithm has proven to be the most accurate classification tool for micro gesture recognition. Imam Abdulrahman bin Faisal University.
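
    A small PyTorch sketch of a CNN classifier of the general kind described; the layer sizes and the 64x64 single-channel disparity input are assumptions for illustration, not the paper's reported network.

```python
import torch.nn as nn

class MicroGestureCNN(nn.Module):
    """Tiny CNN for 12-class micro-gesture recognition from disparity frames."""

    def __init__(self, n_classes=12):
        super().__init__()
        self.features = nn.Sequential(                 # 1x64x64 -> 64x8x8
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(128, n_classes),
        )

    def forward(self, x):                              # x: (batch, 1, 64, 64)
        return self.classifier(self.features(x))
```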

    Binokulare EigenbewegungsschĂ€tzung fĂŒr Fahrerassistenzanwendungen (Binocular Egomotion Estimation for Driver Assistance Applications)

    Driving can be dangerous. Humans become inattentive when performing a monotonous task like driving. The risk implied by multi-tasking, like using a cellular phone while driving, can also break the driver's concentration and increase the risk of accidents. Other factors like exhaustion, nervousness and excitement affect the performance and the response time of the driver. Consequently, car manufacturers have developed systems in the last decades which assist the driver under various circumstances. These systems are called driver assistance systems. They are meant to support the task of driving, and their field of action varies from alerting the driver with acoustical or optical warnings to taking control of the car, such as keeping the vehicle in the traffic lane until the driver resumes control. For such purposes, the vehicle is equipped with on-board sensors which allow the perception of the environment and/or the state of the vehicle. Cameras are sensors which extract useful information about the visual appearance of the environment, and a binocular system additionally allows the extraction of 3D information. One of the main requirements for most camera-based driver assistance systems is accurate knowledge of the motion of the vehicle. Some sources of information, like velocimeters and GPS, are in common use in vehicles today; nevertheless, the resolution and accuracy usually achieved with these systems are not sufficient for many real-time applications. The computation of ego-motion from sequences of stereo images for the implementation of intelligent driving systems, like autonomous navigation or collision avoidance, constitutes the core of this thesis.

    This dissertation proposes a framework for the simultaneous computation of the 6 degrees of freedom of ego-motion (rotation and translation in 3D Euclidean space), the estimation of the scene structure, and the detection and estimation of independently moving objects. The input is provided exclusively by a binocular system, and the framework does not call for any particular data acquisition strategy, i.e. the stereo images are processed just as they are provided. Stereo allows one to establish correspondences between left and right images, estimating 3D points of the environment via triangulation. Likewise, feature tracking establishes correspondences between the images acquired at different time instances. When both are used together for a large number of points, the result is a set of clouds of 3D points with point-to-point correspondences between clouds.

    The apparent motion of the 3D points between consecutive frames has a variety of causes. The most dominant motion for most of the points in the clouds is caused by the ego-motion of the vehicle: as the vehicle moves and images are acquired, the relative position of the world points with respect to the vehicle changes. Motion is also caused by objects moving in the environment; they move independently of the vehicle, so the observed motion for these points is the sum of the ego-vehicle motion and the independent motion of the object. A third cause, of paramount importance in vision applications, is correspondence problems, i.e. the incorrect spatial or temporal assignment of the point-to-point correspondences. Furthermore, all the points in the clouds are actually noisy measurements of the real unknown 3D points of the environment.
    Solving ego-motion and scene structure from the clouds of points requires some prior analysis of the noise involved in the imaging process and of how it propagates as the data are processed. Therefore, this dissertation analyzes the noise properties of the 3D points obtained through stereo triangulation. This leads to the detection of a bias in the estimation of the 3D position, which is corrected with a reformulation of the projection equation. Ego-motion is obtained by finding the rotation and translation between the two clouds of points. This problem is known as absolute orientation, and many solutions based on least squares have been proposed in the literature. This thesis reviews the available closed-form solutions to the problem.

    The proposed framework is divided into three main blocks: 1) stereo and feature tracking computation, 2) ego-motion estimation, and 3) estimation of 3D point position and 3D velocity. The first block solves the correspondence problem, providing the clouds of points as output; no special implementation of this block is required in this thesis. The ego-motion block computes the motion of the cameras by finding the absolute orientation between the clouds of static points in the environment. Since the cloud of points might contain independently moving objects and outliers generated by false correspondences, the direct least-squares computation might lead to an erroneous solution. The first contribution of this thesis is an effective rejection rule that detects outliers based on the distance between predicted and measured quantities, and reduces the effects of noisy measurements by assigning appropriate weights to the data. This method is called the Smoothness Motion Constraint (SMC). The ego-motion of the camera between two frames is obtained by finding the absolute orientation between consecutive clouds of weighted 3D points. The complete ego-motion since initialization is obtained by concatenating the individual motion estimates. This leads to a super-linear propagation of the error, since noise is integrated. A second contribution of this dissertation is a predictor/corrector iterative method which integrates the clouds of 3D points of multiple time instances for the computation of ego-motion. The presented method considerably reduces the accumulation of errors in the estimated ego-position of the camera. Another contribution is a method which recursively estimates the 3D world position of a point and its velocity by fusing stereo, feature tracking and the estimated ego-motion in a Kalman filter system. An improved estimate of the point position is obtained this way and used in the subsequent system cycle, resulting in an improved computation of ego-motion. The general contribution of this dissertation is a single framework for the real-time computation of scene structure, independently moving objects and ego-motion for automotive applications.

    Driving can be dangerous. Driving performance is influenced by the physical and psychological limits of the driver and by external factors such as the weather. Driver assistance systems increase driving comfort and support the driver in order to reduce the number of accidents; they support the driver with warnings by optical or acoustic signals up to the system taking over control of the car. One of the main prerequisites for most driver assistance systems is accurate knowledge of the motion of the ego-vehicle.
    Today various sensors are available to measure the motion of the vehicle, for example GPS and speedometers, but the resolution and accuracy of these systems are not sufficient for many real-time applications. The computation of ego-motion from stereo image sequences for driver assistance systems, e.g. for autonomous navigation or collision avoidance, forms the core of this work. This dissertation presents a system for the real-time evaluation of a scene, including the detection and evaluation of independently moving objects as well as the accurate estimation of the six degrees of freedom of ego-motion. These basic components are required in order to develop many intelligent automotive applications that support the driver in different traffic situations. The system works exclusively with a stereo camera platform as its sensor.

    Computing ego-motion and scene structure requires an analysis of the noise and of the error propagation in the image processing chain. This dissertation therefore analyzes the noise properties of the 3D points obtained by stereo triangulation. This leads to the discovery of a systematic error in the estimation of the 3D position, which can be corrected with a reformulation of the projection equation. Simulation results show that a substantial reduction of the error in the estimated 3D point position is possible. The ego-motion estimate is obtained by estimating the rotation and translation between point clouds. This problem is known as "absolute orientation", and many least-squares solutions have been proposed in the literature; this work reviews the available closed-form solutions.

    The presented system is divided into three main building blocks: 1. registration of image features, 2. ego-motion estimation, and 3. iterative estimation of the 3D position and 3D velocity of world points. The first block receives a sequence of rectified images as input and delivers a list of tracked image features together with their corresponding 3D positions. The ego-motion block consists of four main steps in a loop: 1. motion prediction, 2. application of the smoothness motion constraint (SMC), 3. computation of the absolute orientation, and 4. motion integration. The SMC proposed in this dissertation is a powerful condition for rejecting outliers and for assigning weights to the measured 3D points. Simulations were carried out with Gaussian and slash noise; the results show the superiority of the SMC version over standard weighting methods. The stability of the results with respect to outliers was analyzed, with the result that the breakdown point is greater than 50%. Executing the four steps iteratively yields a predictor-corrector method. We call this multi-frame estimation, in contrast to two-frame estimation, which considers only the current and previous image pairs for the computation of ego-motion. The first iteration is carried out between the current and the previous cloud of points; each further iteration integrates an additional point cloud from an earlier time instance. This method reduces the error accumulation when multiple estimates are integrated into a single global estimate.
    Simulation results show that although the error still grows super-linearly over time, its magnitude is reduced by several orders of magnitude. The third block consists of the iterative estimation of the 3D position and 3D velocity of world points. Here a method based on a Kalman filter is used which fuses stereo, feature tracking and ego-motion data. Measurements of the position of a world point are obtained with the stereo camera system, and differentiating the estimated point position additionally allows the estimation of its velocity. The measurements are obtained through a measurement model that fuses stereo and motion data; simulation results validate the model. The reduction of the position uncertainty over time is demonstrated with a Monte Carlo simulation. Experimental results are obtained with long image sequences. Additional tests, including a 3D reconstruction of a forest scene and the computation of free camera motion in an indoor scenario, were carried out. The method shows good results in all cases, and the algorithm also delivers acceptable results when estimating the pose of small objects such as the heads and legs of real crash-test dummies.
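
    A sketch of the closed-form step at the heart of the ego-motion block: weighted absolute orientation between corresponding point clouds, with the weights standing in for SMC-style down-weighting of suspect points. This is the textbook SVD solution, not the dissertation's full predictor/corrector pipeline.

```python
import numpy as np

def weighted_absolute_orientation(P, Q, w):
    """Closed-form R, t with Q ~ (R @ P.T).T + t, given per-point weights w.

    P, Q: (n, 3) corresponding 3D points from consecutive clouds.
    """
    w = w / w.sum()
    cp, cq = w @ P, w @ Q                      # weighted centroids
    H = (P - cp).T @ ((Q - cq) * w[:, None])   # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                         # proper rotation (det = +1)
    return R, cq - R @ cp
```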

    Visual Tracking and Motion Estimation for an On-orbit Servicing of a Satellite

    This thesis addresses visual tracking of a non-cooperative as well as a partially cooperative satellite, to enable close-range rendezvous between a servicer and a target satellite. Visual tracking and estimation of the relative motion between a servicer and a target satellite are critical abilities for rendezvous and proximity operations such as repairing and deorbiting. For this purpose, Lidar has been widely employed in cooperative rendezvous and docking missions. Despite its robustness to harsh space illumination, Lidar is heavy, has rotating parts and consumes considerable power, which conflicts with the stringent requirements of a satellite design. Inexpensive on-board cameras, on the other hand, can provide an effective solution and work at a wide range of distances. However, space lighting conditions are particularly challenging for image-based tracking algorithms, because of direct sunlight exposure and because the glossy surface of the satellite creates strong reflections and image saturation, which complicate the tracking procedures. In order to address these difficulties, the relevant literature in the fields of computer vision and satellite rendezvous and docking is examined. Two classes of problems are identified, and solutions implemented on a standard computer are provided. Firstly, in the absence of a geometric model of the satellite, the thesis presents a robust feature-based method with prediction capability in case of insufficient features, relying on a point-wise motion model. Secondly, a robust model-based hierarchical position localization method is employed to handle the change of image features along a range of distances and to localize an attitude-controlled (partially cooperative) satellite. Moreover, the thesis presents a pose tracking method addressing ambiguities in edge matching, and a pose detection algorithm based on appearance model learning. For the validation of the methods, real camera images and ground-truth data generated with a laboratory test bed that reproduces space conditions are used. The experimental results indicate that camera-based methods provide robust and accurate tracking for the approach of malfunctioning satellites in spite of the difficulties associated with specularities and direct sunlight. Exceptional lighting conditions associated with the sun angle are also discussed, with the aim of achieving a fully reliable localization system for a given mission.
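
    As a hedged illustration of model-based localization from 2D-3D correspondences, the sketch below uses OpenCV's RANSAC PnP as a generic stand-in for the hierarchical method described above; RANSAC is one common way to tolerate the outliers that reflections and image saturation produce. The function name and parameter values are illustrative, and at least four correspondences are required.

```python
import numpy as np
import cv2

def localize_target(model_pts, image_pts, K):
    """Pose of the target from n >= 4 matched 3D model points and 2D features.

    model_pts: (n, 3) points on the satellite's geometric model;
    image_pts: (n, 2) matched image features; K: 3x3 camera intrinsics.
    A generic stand-in for the thesis' method, not its actual algorithm.
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        model_pts.astype(np.float32), image_pts.astype(np.float32),
        K, None, iterationsCount=200, reprojectionError=2.0)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)        # rotation vector -> rotation matrix
    return R, tvec.reshape(3), inliers
```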