Foreground detection of video through the integration of novel multiple detection algorithms
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The main outcome of this research is the design of a foreground detection algorithm that is more accurate and less time-consuming than existing algorithms. By accuracy we mean an exact mask of the foreground object(s), one that matches the respective ground truth. Motion detection, the first stage of the foreground detection process, can be achieved via pixel-based or block-based methods, each with its own merits and drawbacks. Pixel-based methods are accurate but time-consuming, so they cannot be recommended for real-time applications. Block-based motion estimation, on the other hand, is less accurate but faster and thus better suited to real-time use. The first proposed algorithm therefore adopts block-based motion estimation for timely execution. To recover the lost accuracy, a morphological technique called opening-and-closing by reconstruction is applied; being a pixel-based operation, it yields higher accuracy while remaining fast to execute. Opening-and-closing by reconstruction locates the maxima and minima inside the foreground object(s), so running it alongside block-based motion estimation compensates for the latter's lower accuracy. To verify the efficiency of this algorithm, a complex video containing multiple colours and both fast and slow motion in various places was selected. Across 11 different performance measures, the proposed algorithm achieved, on average, more than 24.73% higher accuracy than four well-established algorithms. Background subtraction, the most cited algorithm for foreground detection, faces the major problem of choosing a proper threshold value at run time.
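The opening-by-reconstruction idea can be illustrated with a minimal binary sketch: erosion removes small specks, then geodesic dilation regrows the surviving components to their exact original shape. This is illustrative only; the thesis applies the grayscale variant, closing by reconstruction is the dual operation, and the 3x3 structuring element and helper names below are assumptions.

```python
# Minimal binary sketch of opening by reconstruction (illustrative;
# the thesis uses the grayscale variant).

def _neighbourhood(img, y, x):
    """3x3 neighbourhood values, clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]

def dilate(img):
    """3x3 binary dilation."""
    return [[1 if any(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]

def erode(img):
    """3x3 binary erosion (border neighbourhoods are clipped)."""
    return [[1 if all(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]

def reconstruct(marker, mask):
    """Geodesic reconstruction by dilation: grow marker inside mask."""
    cur = marker
    while True:
        grown = dilate(cur)
        nxt = [[grown[y][x] & mask[y][x] for x in range(len(mask[0]))]
               for y in range(len(mask))]
        if nxt == cur:
            return cur
        cur = nxt

def opening_by_reconstruction(img):
    """Erode away small specks, then restore survivors exactly."""
    return reconstruct(erode(img), img)
```

The key property, in contrast to a plain opening, is that surviving components keep their exact original outline, which is why the operation preserves an accurate foreground mask.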
The next proposed algorithm derives an effective run-time threshold for background subtraction from motion, the primary component of the foreground detection process. The smoothed histogram peaks and valleys of the motion were analysed; these reflect the fast- and slow-motion areas of the moving object(s) in a given frame, and the threshold value is generated at run time by exploiting the peak and valley values. This algorithm was tested on four recommended video sequences, including indoor and outdoor shoots, and compared with five highly ranked algorithms. Based on the values of standard performance measures, it achieved, on average, more than 12.30% higher accuracy.
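The peak-and-valley idea can be sketched as follows: build a smoothed histogram of the absolute frame difference, locate the two dominant (well-separated) peaks, and place the threshold at the valley between them. The smoothing radius, peak-separation gate, and fallback value below are illustrative assumptions, not values from the thesis.

```python
# Hedged sketch of run-time threshold selection from smoothed
# histogram peaks and valleys (parameter values are illustrative).

def smooth_histogram(hist, radius=2):
    """Moving-average smoothing of a 256-bin histogram."""
    out = []
    for i in range(len(hist)):
        lo, hi = max(0, i - radius), min(len(hist), i + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out

def valley_threshold(diff_pixels, min_separation=10):
    """Threshold = deepest valley between the two dominant peaks."""
    hist = [0] * 256
    for v in diff_pixels:
        hist[min(255, v)] += 1
    s = smooth_histogram(hist)
    # local maxima of the smoothed histogram
    peaks = [i for i in range(1, 255)
             if s[i] >= s[i - 1] and s[i] >= s[i + 1] and s[i] > 0]
    peaks.sort(key=lambda i: s[i], reverse=True)
    if not peaks:
        return 128  # fallback: no structure in the histogram
    p1 = peaks[0]
    p2 = next((p for p in peaks[1:] if abs(p - p1) > min_separation), None)
    if p2 is None:
        return 128  # fallback: only one motion mode present
    a, b = sorted((p1, p2))
    return min(range(a, b + 1), key=lambda i: s[i])

def foreground_mask(prev, curr):
    """1 where the absolute frame difference exceeds the threshold."""
    diff = [abs(c - p) for c, p in zip(curr, prev)]
    t = valley_threshold(diff)
    return [1 if d > t else 0 for d in diff]
```

On flattened pixel lists with a low-difference background mode and a high-difference motion mode, the valley lands between the two modes, so the mask isolates the moving pixels without a hand-tuned constant.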
Robust Modular Feature-Based Terrain-Aided Visual Navigation and Mapping
The visual feature-based Terrain-Aided Navigation (TAN) system presented in this thesis addresses the problem of constraining the inertial drift introduced into the location estimate of Unmanned Aerial Vehicles (UAVs) in GPS-denied environments. The system uses salient visual features representing semantic, human-interpretable objects (roads, forest and water boundaries) detected in onboard aerial imagery and associates them with a database of reference features created a priori by applying the same feature detection algorithms to satellite imagery. Correlating the detected features with the reference features through a series of robust data association steps yields a localisation solution whose absolute error is bounded by the certainty of the reference dataset. The feature-based Visual Navigation System (VNS) presented in this thesis was originally developed for a navigation application using simulated multi-year satellite image datasets; its extension into the mapping domain, in turn, is based on real (not simulated) flight data and imagery. The mapping study demonstrates the full potential of the system as a versatile tool for enhancing the accuracy of information derived from aerial imagery. Visual features such as road networks, shorelines and water bodies are used not only to obtain a position "fix" but also, in reverse, for accurate mapping of vehicles detected on the roads into inertial space with improved precision. Combined correction of geo-coding errors and improved aircraft localisation forms a robust solution for the defence mapping application. A system of the proposed design would provide a complete independent navigation solution to an autonomous UAV while additionally giving it object tracking capability.
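The data-association step can be sketched as a gated nearest-neighbour match between detected and reference features, with the mean residual giving a position correction. This is a toy 2-D illustration; the gate value, greedy matching, and simple averaging are simplifying assumptions, not the thesis's actual robust association pipeline.

```python
# Toy sketch: gated nearest-neighbour feature association and the
# resulting position correction (values and names are illustrative).

def associate_features(detections, reference, gate):
    """Greedily match each detected feature to its nearest unused
    reference feature, rejecting matches beyond the gating distance."""
    matches, used = [], set()
    for i, (dx, dy) in enumerate(detections):
        best, best_d = None, gate
        for j, (rx, ry) in enumerate(reference):
            if j in used:
                continue
            d = ((dx - rx) ** 2 + (dy - ry) ** 2) ** 0.5
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            matches.append((i, best))
            used.add(best)
    return matches

def position_fix(detections, reference, matches):
    """Mean offset over matched pairs = estimated position correction."""
    if not matches:
        return (0.0, 0.0)
    ex = sum(reference[j][0] - detections[i][0] for i, j in matches) / len(matches)
    ey = sum(reference[j][1] - detections[i][1] for i, j in matches) / len(matches)
    return (ex, ey)
```

The gate is what keeps a spurious detection (e.g. a misdetected road segment) from corrupting the fix: features with no reference neighbour inside the gate simply contribute nothing.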
Extraction of Unfoliaged Trees from Terrestrial Image Sequences
This thesis presents a generative statistical approach for the fully automatic three-dimensional (3D) extraction and reconstruction of unfoliaged deciduous trees from wide-baseline image sequences. Tree models improve the realism of 3D geoinformation systems (GIS) by adding a natural touch. Unfoliaged trees are, however, difficult to reconstruct from images due to partially weak contrast, background clutter, occlusions, and particularly the possibly varying order of branches in images from different viewpoints. The proposed approach combines generative modeling by L-systems with statistical maximum a posteriori (MAP) estimation to extract the 3D branching structure of trees. Background estimation is conducted by means of mathematical (gray-scale) morphology as a basis for the generative modeling. A Gaussian likelihood function based on intensity differences is employed to evaluate the hypotheses. A mechanism has been devised to control the sampling sequence of multiple parameters in the Markov chain, considering their characteristics and their performance in the previous step. After extraction of the first level of branches, a tree is classified into one of three typical branching types, and more specific production rules of L-systems are used accordingly. Generic prior distributions for the parameters are refined based on already extracted branches in a Bayesian framework and integrated into the MAP estimation. By these means most of the branching structure, apart from tiny twigs, can be reconstructed. Results are presented in the form of VRML (Virtual Reality Modeling Language) models, demonstrating the potential of the approach as well as its current shortcomings.
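The generative side can be illustrated with a toy deterministic L-system: a production rule rewrites each trunk segment into a segment plus two side branches, and repeated rewriting yields a branching skeleton. The rule below is illustrative only; it is not one of the thesis's actual productions, which are chosen per branching type and sampled statistically within the MAP framework.

```python
# Toy deterministic L-system rewriting (the production rule is an
# illustrative example, not a rule from the thesis).
# Symbols: F = branch segment, [ ] = push/pop a branch, + - = turns.

def rewrite(axiom, rules, depth):
    """Apply the production rules `depth` times to the axiom string."""
    s = axiom
    for _ in range(depth):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s
```

Each rewriting pass triples the number of segments here, mirroring how a handful of compact productions can generate the combinatorially rich branching structures that the statistical estimation then fits to the images.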
Extracting semantic video objects
Dagan Feng, 2000-2001. Academic research: refereed journal publication. Version of Record published.
Video object segmentation and tracking.
Thesis (M.Sc.Eng.), University of KwaZulu-Natal, 2005.

One of the more complex video processing problems currently vexing researchers is that of object segmentation. This involves identifying semantically meaningful objects in a scene and separating them from the background. While the human visual system performs this task with minimal effort, research in machine vision has yet to yield techniques that perform it as effectively and efficiently. The problem is difficult not only because of the complexity of the mechanisms involved but also because it is ill-posed: no unique segmentation of a scene exists, since what counts as a segmented object of interest depends very much on the application and the scene content. In most situations a priori knowledge of the nature of the problem is required, often specific to the application in which the segmentation tool is to be used.

This research presents an automatic method of segmenting objects from a video sequence. The intent is to extract and maintain both the shape and the contour information as the object changes dynamically over time in the sequence. A priori information is incorporated by asking the user to tune a set of input parameters prior to execution of the algorithm.

Motion is used as the semantic cue for video object extraction, subject to the assumptions that there is only one moving object in the scene, that the only motion in the video sequence is that of the object of interest, that illumination is constant, and that the object is never occluded.

A change detection mask is used to detect the moving object, followed by morphological operators to refine the result. The change detection mask yields a model of the moving components; this is compared with a contour map of the frame to extract a more accurate contour of the moving object, which is then used to extract the object of interest itself. Since the video object moves as the sequence progresses, the object must be updated over time. To accomplish this, an object tracker has been implemented based on the Hausdorff object-matching algorithm.

The dissertation begins with an overview of segmentation techniques and a discussion of the approach used in this research. This is followed by a detailed description of the algorithm, covering initial segmentation, object tracking across frames, and video object extraction. Finally, the semantic object extraction results for a variety of video sequences are presented and evaluated.
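The Hausdorff matching step can be sketched as follows: the symmetric Hausdorff distance compares the object model with candidate contours in the next frame, and the tracker keeps the closest candidate within a tolerance. This is a minimal point-set illustration; the `max_dist` gate and the candidate-selection loop are assumptions, not the dissertation's exact tracker.

```python
# Minimal sketch of Hausdorff-distance object matching on 2-D point
# sets (the tolerance gate is an illustrative assumption).

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two 2-D point sets."""
    def directed(P, Q):
        return max(min(((px - qx) ** 2 + (py - qy) ** 2) ** 0.5
                       for qx, qy in Q)
                   for px, py in P)
    return max(directed(A, B), directed(B, A))

def track(model, candidates, max_dist):
    """Return the candidate contour closest to the model, or None if
    even the best candidate lies beyond the tolerance."""
    best = min(candidates, key=lambda c: hausdorff(model, c))
    return best if hausdorff(model, best) <= max_dist else None
```

Because the Hausdorff distance measures the worst-case mismatch between the two point sets, a small value guarantees that every model point lies near the candidate contour and vice versa, which makes it a natural fit for updating a moving object's model from frame to frame.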
Self-Supervised Object-in-Gripper Segmentation from Robotic Motions
Accurate object segmentation is a crucial task in the context of robotic manipulation. However, creating sufficient annotated training data for neural networks is particularly time-consuming and often requires manual labeling. To this end, we propose a simple yet robust solution for learning to segment unknown objects grasped by a robot. Specifically, we exploit motion and temporal cues in RGB video sequences. Using optical flow estimation, we first learn to predict segmentation masks of our given manipulator. These annotations are then used in combination with motion cues to automatically distinguish between the background, the manipulator, and the unknown grasped object. In contrast to existing systems, our approach is fully self-supervised and independent of precise camera calibration, 3D models, or potentially imperfect depth data. We perform a thorough comparison with alternative baselines and approaches from the literature. The object masks and views are shown to be suitable training data for segmentation networks that generalize to novel environments and also allow for watertight 3D reconstruction.

Comment: 15 pages, 11 figures. Video: https://www.youtube.com/watch?v=srEwuuIIgz
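The motion-cue step can be sketched as follows: threshold the optical-flow magnitude to obtain a motion mask, then label moving pixels not explained by the predicted manipulator mask as the grasped object. This is a toy illustration; the fixed threshold and the three-way labeling scheme are simplified stand-ins for the paper's learned, self-supervised pipeline.

```python
# Toy sketch: motion mask from flow magnitude, then a three-way
# background / manipulator / grasped-object labeling (the threshold
# and label codes are illustrative assumptions).

def motion_mask(flow, thresh):
    """1 where the optical-flow magnitude exceeds thresh.
    `flow` is a grid of (u, v) displacement vectors per pixel."""
    return [[1 if (u * u + v * v) ** 0.5 > thresh else 0
             for (u, v) in row] for row in flow]

def split_regions(motion, manipulator):
    """Label pixels 0 = background, 1 = manipulator, 2 = grasped
    object (moving pixels not explained by the manipulator mask)."""
    h, w = len(motion), len(motion[0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if manipulator[y][x]:
                labels[y][x] = 1
            elif motion[y][x]:
                labels[y][x] = 2
    return labels
```

The key observation the paper exploits is that, once the manipulator's own mask is known, any remaining coherent motion in the gripper region must belong to the grasped object, so no manual annotation of the object is ever needed.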
- …