10 research outputs found

    Strategy for Foreground Movement Identification Adaptive to Background Variations

    Video processing has gained significance because of its applications in many areas of research, including the monitoring of movements in public places for surveillance. Video sequences from standard datasets such as I2R, CAVIAR and UCSD are commonly used in video processing research. A central task is identifying the foreground movement of actors and objects in video sequences, which can be done against a static or a dynamic background; the task becomes considerably harder when the background is dynamic. For identifying foreground movement in video sequences with a dynamic background, this article proposes two algorithms, termed Frame Difference between Neighboring Frames using Hue, Saturation and Value (FDNF-HSV) and Frame Difference between Neighboring Frames using Greyscale (FDNF-G). Evaluated against state-of-the-art techniques in terms of F-measure, recall and precision, the proposed algorithms show enhanced performance.
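    A minimal sketch of the neighboring-frame differencing these two algorithms build on, in Python with OpenCV. The thresholds, the per-channel combination rule, and the absence of post-processing are illustrative assumptions; the published FDNF-HSV and FDNF-G may differ in those details.

        import cv2
        import numpy as np

        def fdnf_g(prev_frame, curr_frame, thresh=25):
            """Foreground mask from the absolute greyscale difference of two neighboring frames."""
            prev_g = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
            curr_g = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
            diff = cv2.absdiff(prev_g, curr_g)
            _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
            return mask

        def fdnf_hsv(prev_frame, curr_frame, thresh=25):
            """Difference hue, saturation and value separately and flag a pixel as
            foreground if any channel changes strongly between neighboring frames
            (an assumed combination rule)."""
            prev_hsv = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2HSV)
            curr_hsv = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2HSV)
            diff = cv2.absdiff(prev_hsv, curr_hsv)
            return (diff.max(axis=2) > thresh).astype(np.uint8) * 255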

    Precise foreground detection algorithm using motion estimation, minima and maxima inside the foreground object

    In this paper, a precise foreground mask is obtained in a complex environment by applying simple and effective methods to video sequences containing multiple, multi-coloured foreground objects. To detect moving objects we use a simple algorithm based on block-based motion estimation, which requires little computational time. To obtain a full and improved mask of the moving object, we use an opening- and closing-by-reconstruction mechanism to identify the minima and maxima inside the foreground object by applying a set of morphological operations. This further enhances the outlines of foreground objects at various stages of image processing. The algorithm therefore does not require knowledge of the background image, so it can be used on real-world video sequences to detect the foreground when no background model is available in advance. Comparative performance results demonstrate the effectiveness of the proposed algorithm. Funding: The Institute of Management Sciences Peshawar (http://imsciences.edu.pk/) through the Higher Education Commission Islamabad, Pakistan (http://hec.gov.pk/).
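    A minimal Python sketch of the two stages described above. Block-wise mean absolute difference stands in for full displacement-searching block matching, and scikit-image's morphological reconstruction supplies the opening- and closing-by-reconstruction; block size, thresholds and structuring-element sizes are illustrative assumptions.

        import numpy as np
        from scipy.ndimage import grey_dilation, grey_erosion
        from skimage.morphology import reconstruction

        def block_motion_mask(prev_g, curr_g, block=16, thresh=10.0):
            """Mark a block as moving when its mean absolute difference exceeds
            thresh (a simplified stand-in for block-based motion estimation)."""
            h, w = curr_g.shape
            mask = np.zeros((h, w), dtype=np.uint8)
            for y in range(0, h - block + 1, block):
                for x in range(0, w - block + 1, block):
                    mad = np.abs(curr_g[y:y+block, x:x+block].astype(float)
                                 - prev_g[y:y+block, x:x+block].astype(float)).mean()
                    if mad > thresh:
                        mask[y:y+block, x:x+block] = 255
            return mask

        def refine_mask(mask, size=5):
            """Opening-by-reconstruction removes small bright noise, then
            closing-by-reconstruction fills dark holes (regional minima)
            inside the foreground object."""
            eroded = grey_erosion(mask, size=(size, size))
            opened = reconstruction(eroded, mask, method='dilation')
            dilated = grey_dilation(opened, size=(size, size))
            return reconstruction(dilated, opened, method='erosion').astype(np.uint8)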

    Motion Segmentation by New Three-View Constraint from a Moving Camera

    We propose a new method for motion segmentation using a moving camera. The proposed method classifies each pixel in the image sequence as belonging to the background or to motion regions by applying a novel three-view constraint called the "parallax-based multiplanar constraint." This constraint, the main contribution of this paper, is derived from the relative projective structure of two points in three different views and is implemented within the "Plane + Parallax" framework. The parallax-based multiplanar constraint overcomes a limitation of previous geometric constraints in that it does not require the reference plane to be constant across multiple views. Unlike the epipolar constraint, it reduces the degenerate case from a surface to a line, which allows it to detect moving objects followed by a camera moving in the same direction. We evaluate the proposed method on several video sequences to demonstrate the effectiveness and robustness of the parallax-based multiplanar constraint.
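    For intuition, a small Python sketch of the two-view "Plane + Parallax" residual that the constraint builds on: warp points through the homography of a reference plane and measure the leftover (parallax) displacement. The full three-view parallax-based multiplanar constraint additionally relates the relative projective structure of point pairs across three views, which is not reproduced here.

        import numpy as np
        import cv2

        def parallax_residuals(pts1, pts2, H):
            """pts1, pts2: Nx2 matched points in views 1 and 2; H: 3x3 homography
            of the reference plane from view 1 to view 2. The residual is ~zero
            for points on the plane, epipolar-directed for static off-plane
            points, and unconstrained for independently moving points."""
            warped = cv2.perspectiveTransform(
                pts1.reshape(-1, 1, 2).astype(np.float32), H)
            return pts2.astype(np.float32) - warped.reshape(-1, 2)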

    Biometric fusion methods for adaptive face recognition in computer vision

    PhD thesis. Face recognition is a biometric method that uses different techniques to identify individuals based on facial information extracted from digital image data. Face recognition systems are widely used for security purposes but face challenging problems; solutions to some of the most important of these are proposed in this study. The aim of this thesis is to investigate the face recognition across-pose problem based on the image parameters of camera calibration. Three novel methods are derived to address the challenges of face recognition and to infer the camera parameters from images using a geometric approach based on perspective projection. Two techniques, the camera measurement technique (CMT) for calibration and Face Quadtree Decomposition (FQD), are combined to develop the face camera measurement technique (FCMT) for human facial recognition, together with a feature-extraction and identity-matching algorithm. The success and efficacy of the proposed algorithm are analysed in terms of robustness to noise, accuracy of distance measurement, and face recognition. To recover the intrinsic and extrinsic camera calibration parameters, a novel technique has been developed based on perspective projection, which uses different geometrical shapes to calibrate the camera. The parameters obtained by CMT enable the system to infer the real distance to regular and irregular objects from 2-D images. The proposed CMT feeds into FQD to measure the distances between facial points. Quadtree decomposition enhances the representation of edges and other singularities along curves of the face, and thus improves directional features for face detection across pose. The proposed FCMT system, the new combination of CMT and FQD, recognises faces in various poses. The theoretical foundation of the proposed solutions is developed and discussed in detail. The results show that the proposed algorithms outperform existing algorithms in face recognition, with a 2.5% improvement in recognition error rate compared with recent studies.
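    A hedged Python sketch of two ingredients the abstract describes: the perspective-projection relation behind a CMT-style distance measurement, and a variance-driven quadtree decomposition. Function names, the split criterion and parameter values are illustrative assumptions, not the thesis's exact procedure.

        import numpy as np

        def depth_from_size(focal_px, real_size_m, pixel_size_px):
            """Pinhole similar triangles: distance Z = f * S / s for an object of
            known real size S that spans s pixels under focal length f (pixels)."""
            return focal_px * real_size_m / pixel_size_px

        def quadtree(img, thresh=10.0, min_size=8, x=0, y=0):
            """Recursively split a greyscale patch until its intensity variation
            is small, yielding (x, y, w, h) leaves that concentrate around edges
            and facial features."""
            h, w = img.shape
            if max(h, w) <= min_size or img.std() <= thresh:
                return [(x, y, w, h)]
            hw, hh = w // 2, h // 2
            return (quadtree(img[:hh, :hw], thresh, min_size, x, y)
                    + quadtree(img[:hh, hw:], thresh, min_size, x + hw, y)
                    + quadtree(img[hh:, :hw], thresh, min_size, x, y + hh)
                    + quadtree(img[hh:, hw:], thresh, min_size, x + hw, y + hh))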

    Vision-based detection of aircrafts and UAVs

    Unmanned Aerial Vehicles are becoming increasingly popular for a broad variety of tasks ranging from aerial imagery to object delivery. As the areas where drones can be used efficiently expand, the risk of collision with other flying objects increases. Avoiding such collisions would be a relatively easy task if all aircraft in the neighboring airspace could communicate with each other and share their location information. However, it is often the case that either location information is unavailable (e.g. flying in GPS-denied environments) or communication is not possible (e.g. different communication channels or a non-cooperative flight scenario). To ensure flight safety in such situations, drones need a way to autonomously detect other objects intruding into the neighboring airspace. Vision-based collision avoidance is of particular interest, as cameras generally consume less power and are more lightweight than active sensor alternatives such as radars and lasers. We have therefore developed a set of increasingly sophisticated algorithms to provide drones with a visual collision avoidance capability.

    First, we present a novel method for detecting flying objects such as drones and planes that occupy a small part of the camera field of view, possibly move in front of complex backgrounds, and are filmed by a moving camera. Solving this problem requires combining motion and appearance information, as neither alone provides reliable enough detections. We therefore propose a machine learning technique that operates on spatio-temporal cubes of image intensities, in which individual patches are aligned using an object-centric, regression-based motion stabilization algorithm.

    Second, to reduce the need to collect a large training dataset and to annotate it manually, we introduce a way to generate realistic synthetic images. Given only a small set of real examples and a coarse 3D model of the object, synthetic data can be generated in arbitrary quantities and used to supplement real examples when training a detector. The key ingredient of our method is that the synthetically generated images need to be as close as possible to the real ones not in terms of image quality, but in terms of the features used by the machine learning algorithm.

    Third, although the aforementioned approach yields a substantial increase in performance with AdaBoost and DPM detectors, it does not generalize well to Convolutional Neural Networks, which have become the state of the art. This happens because, as more and more synthetic data is added, the CNNs begin to overfit to the synthetic images at the expense of the real ones. We therefore propose a novel deep domain adaptation technique that efficiently combines real and synthetic images without overfitting to either. Most adaptation techniques aim at learning features that are invariant to the differences between images coming from different sources (real and synthetic); unlike those methods, we model this difference with a special two-stream architecture. We evaluate our approach on three different datasets and show its effectiveness for various classification and regression tasks.
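    A minimal PyTorch sketch of the two-stream idea in the last paragraph: stream-specific lower layers let the network model the real/synthetic difference rather than forcing invariance, while the upper classifier is shared. Layer sizes, depths and the absence of any weight-coupling loss are simplifying assumptions; the paper's exact architecture is not reproduced.

        import torch
        import torch.nn as nn

        class TwoStreamNet(nn.Module):
            def __init__(self, n_classes=2):
                super().__init__()
                def stem():  # domain-specific low-level features
                    return nn.Sequential(
                        nn.Conv2d(3, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
                        nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2))
                self.real_stem, self.synth_stem = stem(), stem()
                self.head = nn.Sequential(  # shared classifier
                    nn.Flatten(), nn.Linear(32 * 8 * 8, 128), nn.ReLU(),
                    nn.Linear(128, n_classes))

            def forward(self, x, is_synthetic):
                feats = self.synth_stem(x) if is_synthetic else self.real_stem(x)
                return self.head(feats)

        # 32x32 crops: two 2x poolings leave a 32-channel 8x8 map before the head
        logits = TwoStreamNet()(torch.randn(4, 3, 32, 32), is_synthetic=True)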

    Bio-Mimetic Models for Moving Object Detection and Tracking

    Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2014. Advisor: Jin Young Choi. In this thesis, we propose bio-mimetic models for motion detection and visual tracking to overcome the limitations of existing methods in actual environments. The models are inspired by the theory that there are four different forms of visual memory in human visual perception when representing a scene: visible persistence, informational persistence, visual short-term memory (VSTM), and visual long-term memory (VLTM). We view our problem as one of modeling and representing an observed scene with temporary short-term models (TSTM) and conservative long-term models (CLTM). We study how to build efficient and effective models for TSTM and CLTM, and how to utilize them together to obtain robust detection and tracking results under occlusions, clumsy initializations, background clutter, drifting, and non-rigid deformations encountered in actual environments.

    First, we propose an efficient representation of TSTM for moving object detection on non-stationary cameras, which runs within 5.8 milliseconds (ms) on a PC and in real-time on mobile devices. To achieve real-time capability with robust performance, our method models the background through the proposed dual-mode kernel model (DMKM) and compensates for the motion of the camera by mixing neighboring models. Modeling through DMKM prevents the background model from being contaminated by foreground pixels, while still allowing the model to adapt to changes of the background. Mixing neighboring models reduces the errors arising from motion compensation, and their influence is further reduced by keeping the age of the model. Also, to decrease the computational load, the proposed method applies one DMKM to multiple pixels without performance degradation. Experimental results show the computational lightness and the real-time capability of our method on a smartphone, with robust detection performance.

    Second, using concepts from both TSTM and CLTM, a new visual tracking method based on a novel tri-model is proposed. The proposed method aims to solve the problems of occlusions, background clutter, and drifting simultaneously with the new tri-model, which is composed of three models that learn the target object, the background, and other non-target moving objects online. The proposed scheme performs tracking by finding the best explanation of the scene with the three learned models. By utilizing the information in the background and foreground models as well as the target object model, our method obtains robust results under occlusions and background clutter. Also, the target object model is updated in a conservative way to prevent drifting. Furthermore, our method is not restricted to bounding boxes when representing the target object, and is able to give pixel-wise tracking results.

    Third, we go beyond pixel-wise modeling and propose a local-feature-based tracking model using both TSTM and CLTM to track objects under uncertain initializations and severe occlusions. To track objects accurately in such situations, the proposed scheme uses the "motion saliency" and "descriptor saliency" of local features and performs tracking based on the generalized Hough transform (GHT). The proposed motion saliency of a local feature utilizes the instantaneous velocity of features to form the TSTM, emphasizing features whose motions are distinctive compared to the motions of local features that do not belong to the object. The descriptor saliency models local features as the CLTM, emphasizing features that are likely to belong to the object in terms of their feature descriptors. Through these saliencies, the proposed method tries to "learn and find" the target object rather than looking for what was given at initialization, becoming robust to initialization problems. Also, our tracking result is obtained by combining the results of each local feature of the target and the surroundings, and is thus robust against severe occlusions as well. The proposed method is compared against eight other methods, on nine image sequences, with a hundred random initializations. The experimental results show that our method outperforms all the compared methods.

    Fourth and last, we focus on building a robust CLTM with local patches and their neighboring structures. The proposed method is based on sequential Bayesian inference and focuses on solving both the problem of tracking under partial occlusions and the problem of non-rigid object tracking in real-time on desktop personal computers (PCs). The proposed scheme is mainly composed of two parts: (1) modeling the target object using an elastic structure of local patches for robust performance, and (2) an efficient hierarchical diffusion method that performs the tracking process in real-time. The elastic structure of local patches allows the proposed scheme to handle partial occlusions and non-rigid deformations through the relationships among neighboring patches. The proposed hierarchical diffusion generates samples from the region where the posterior is concentrated, to reduce computation time. The method is extensively tested on a number of challenging image sequences with occlusion and non-rigid deformation. The experimental results show the real-time capability and the robustness of the proposed scheme under various situations.

    Contents:
    1 Introduction
        1.1 Background and Research Issues
            1.1.1 Issues in Motion Detection
            1.1.2 Issues in Object Tracking
        1.2 The Human Visual Memory
            1.2.1 Sensory Memory
            1.2.2 Visual Short-Term Memory
            1.2.3 Visual Long-Term Memory
        1.3 Bio-mimetic Framework for Detection and Tracking
        1.4 Contents of the Research
    2 Detection by Pixel-wise Dual-Mode Kernel Model
        2.1 Proposed Method
            2.1.1 Approximated Gaussian Kernel Model
            2.1.2 Dual-Mode Kernel Model (DMKM)
            2.1.3 Motion Compensation by Mixing Models
            2.1.4 Detection of Foreground Pixels
        2.2 Experimental Results
            2.2.1 Runtime Comparison
            2.2.2 Qualitative Comparison
            2.2.3 Quantitative Comparison
            2.2.4 Effects of Dual-Mode Kernel Model
            2.2.5 Effects of Motion Compensation
            2.2.6 Mobile Results
        2.3 Remarks and Discussion
    3 Tracking by Pixel-wise Tri-Model Representation
        3.1 Tri-Model Framework
            3.1.1 Overall Scheme
            3.1.2 Advantages
            3.1.3 Practical Approximation
        3.2 Tracking with the Tri-Model
            3.2.1 Likelihood of the Tri-Model
            3.2.2 Likelihood Maximization
            3.2.3 Estimating Pixel-Wise Labels
        3.3 Learning the Tri-Model
            3.3.1 Target Model
            3.3.2 Background Model
            3.3.3 Foreground Model
        3.4 Experimental Results
            3.4.1 Experimental Settings
            3.4.2 Tracking Accuracy: Bounding Box
            3.4.3 Tracking Accuracy: Pixel-Wise
        3.5 Remarks and Discussion
    4 Tracking by Feature-point-wise Saliency Model
        4.1 Proposed Method
            4.1.1 Tracking based on GHT
            4.1.2 Descriptor Saliency and Feature DB Update
            4.1.3 Motion Saliency
        4.2 Experimental Results
            4.2.1 Tracking with Inaccurate Initializations
            4.2.2 Tracking Under Occlusions
        4.3 Remarks and Discussion
    5 Tracking by Patch-wise Elastic Structure Model
        5.1 Tracking with Elastic Structure of Local Patches
            5.1.1 Sequential Bayesian Inference Framework
            5.1.2 Elastic Structure of Local Patches
            5.1.3 Modeling a Single Patch
            5.1.4 Modeling the Relationship between Patches
            5.1.5 Model Update
            5.1.6 Hierarchical Diffusion
            5.1.7 Summary of the Proposed Method
        5.2 Experiments
            5.2.1 Parameter Effects
            5.2.2 Performance Evaluation
            5.2.3 Discussion on Translation, Rotation, Illumination Changes
            5.2.4 Discussion on Partial Occlusions
            5.2.5 Discussion on Non-Rigid Deformations
            5.2.6 Discussion on Additional Cases
            5.2.7 Summary of Tracking Results
            5.2.8 Effectiveness of Hierarchical Diffusion
            5.2.9 Limitations
        5.3 Remarks and Discussion
    6 Concluding Remarks and Future Works
    Bibliography
    Abstract in Korean
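    A hedged Python sketch of the dual-mode background idea from the first part of the thesis: an "apparent" per-pixel Gaussian is used for detection, a "candidate" Gaussian absorbs the observations the apparent mode rejects, and the two swap once the candidate accumulates more support, which keeps foreground pixels from contaminating the background. The learning rate, thresholds and swap rule are illustrative assumptions rather than the thesis's exact DMKM, and camera-motion compensation by mixing neighboring models is omitted.

        import numpy as np

        class DualModeGaussian:
            def __init__(self, first_frame, alpha=0.02, var0=400.0, k=2.5):
                f = first_frame.astype(float)
                self.mu = [f.copy(), f.copy()]           # apparent, candidate means
                self.var = [np.full(f.shape, var0), np.full(f.shape, var0)]
                self.age = [np.ones(f.shape), np.zeros(f.shape)]
                self.alpha, self.k = alpha, k

            def update(self, frame):
                f = frame.astype(float)
                fits0 = (f - self.mu[0]) ** 2 < (self.k ** 2) * self.var[0]
                for i, fits in ((0, fits0), (1, ~fits0)):  # route pixels to one mode
                    mu, var = self.mu[i], self.var[i]
                    mu[fits] += self.alpha * (f - mu)[fits]
                    var[fits] += self.alpha * ((f - mu) ** 2 - var)[fits]
                    self.age[i][fits] += 1.0
                swap = self.age[1] > self.age[0]           # candidate became dominant
                for pair in (self.mu, self.var, self.age):
                    tmp = pair[0][swap].copy()
                    pair[0][swap] = pair[1][swap]
                    pair[1][swap] = tmp
                return ~fits0                              # foreground mask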