457 research outputs found

    BEMDEC: An Adaptive and Robust Methodology for Digital Image Feature Extraction

    Get PDF
    The intriguing study of feature extraction, and edge detection in particular, has, as a result of the increased use of imagery, drawn even more attention not just from the field of computer science but also from a variety of scientific fields. However, various challenges surrounding the formulation of feature extraction operator, particularly of edges, which is capable of satisfying the necessary properties of low probability of error (i.e., failure of marking true edges), accuracy, and consistent response to a single edge, continue to persist. Moreover, it should be pointed out that most of the work in the area of feature extraction has been focused on improving many of the existing approaches rather than devising or adopting new ones. In the image processing subfield, where the needs constantly change, we must equally change the way we think. In this digital world where the use of images, for variety of purposes, continues to increase, researchers, if they are serious about addressing the aforementioned limitations, must be able to think outside the box and step away from the usual in order to overcome these challenges. In this dissertation, we propose an adaptive and robust, yet simple, digital image features detection methodology using bidimensional empirical mode decomposition (BEMD), a sifting process that decomposes a signal into its two-dimensional (2D) bidimensional intrinsic mode functions (BIMFs). The method is further extended to detect corners and curves, and as such, dubbed as BEMDEC, indicating its ability to detect edges, corners and curves. In addition to the application of BEMD, a unique combination of a flexible envelope estimation algorithm, stopping criteria and boundary adjustment made the realization of this multi-feature detector possible. Further application of two morphological operators of binarization and thinning adds to the quality of the operator

    Video Deinterlacing using Control Grid Interpolation Frameworks

    Get PDF
    abstract: Video deinterlacing is a key technique in digital video processing, particularly with the widespread usage of LCD and plasma TVs. This thesis proposes a novel spatio-temporal, non-linear video deinterlacing technique that adaptively chooses between the results from one dimensional control grid interpolation (1DCGI), vertical temporal filter (VTF) and temporal line averaging (LA). The proposed method performs better than several popular benchmarking methods in terms of both visual quality and peak signal to noise ratio (PSNR). The algorithm performs better than existing approaches like edge-based line averaging (ELA) and spatio-temporal edge-based median filtering (STELA) on fine moving edges and semi-static regions of videos, which are recognized as particularly challenging deinterlacing cases. The proposed approach also performs better than the state-of-the-art content adaptive vertical temporal filtering (CAVTF) approach. Along with the main approach several spin-off approaches are also proposed each with its own characteristics.Dissertation/ThesisM.S. Electrical Engineering 201

    Digital Image Processing

    Get PDF
    This book presents several recent advances that are related or fall under the umbrella of 'digital image processing', with the purpose of providing an insight into the possibilities offered by digital image processing algorithms in various fields. The presented mathematical algorithms are accompanied by graphical representations and illustrative examples for an enhanced readability. The chapters are written in a manner that allows even a reader with basic experience and knowledge in the digital image processing field to properly understand the presented algorithms. Concurrently, the structure of the information in this book is such that fellow scientists will be able to use it to push the development of the presented subjects even further

    A Comparative Study of Feature Detection Methods for AUV Localization

    Get PDF
    Underwater localization is a difficult task when it comes to making the system autonomous due to the unpredictable environment. The fact that radio signals such as GPS cannot be transmitted through water makes autonomous movement and localization underwater even more challenging. One specific method that is widely used for autonomous underwater navigation applications is Simultaneous Localization and Mapping (SLAM), a technique in which a map is created and updated while localizing the vehicle within the map. In SLAM, feature detection is used in landmark extraction and data association by examining each pixel and differentiating landmarks pixels from those of the background. Previous research on the performance of different feature detection methods have been done in environments such as cisterns and caverns where the effects of the ocean are reduced. Our objective, however, is to achieves robust localization in the open ocean environment of the Cal Poly pier. This thesis performs a comparative study between different feature detection methods including Scale Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and Oriented FAST and Rotated BRIEF (ORB) on different sensors. We evaluate the feature detection and matching performance of these algorithms in a simulated open-ocean environment

    Adaptive Methods for Robust Document Image Understanding

    Get PDF
    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document types available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current defficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typesetted prints, a theoretically optimal solution for the document binarization problem from both computational complexity- and threshold selection point of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy

    Valvekaameratel põhineva inimseire täiustamine pildi resolutsiooni parandamise ning näotuvastuse abil

    Get PDF
    Due to importance of security in the society, monitoring activities and recognizing specific people through surveillance video camera is playing an important role. One of the main issues in such activity rises from the fact that cameras do not meet the resolution requirement for many face recognition algorithms. In order to solve this issue, in this work we are proposing a new system which super resolve the image. First, we are using sparse representation with the specific dictionary involving many natural and facial images to super resolve images. As a second method, we are using deep learning convulutional network. Image super resolution is followed by Hidden Markov Model and Singular Value Decomposition based face recognition. The proposed system has been tested on many well-known face databases such as FERET, HeadPose, and Essex University databases as well as our recently introduced iCV Face Recognition database (iCV-F). The experimental results shows that the recognition rate is increasing considerably after applying the super resolution by using facial and natural image dictionary. In addition, we are also proposing a system for analysing people movement on surveillance video. People including faces are detected by using Histogram of Oriented Gradient features and Viola-jones algorithm. Multi-target tracking system with discrete-continuouos energy minimization tracking system is then used to track people. The tracking data is then in turn used to get information about visited and passed locations and face recognition results for tracked people

    Video content analysis for intelligent forensics

    Get PDF
    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely; 1. Moving object detection and recognition, 2. Correction of colours in the video frames and recognition of colours of moving objects, 3. Make and model recognition of vehicles and identification of their type, 4. Detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex background. The object detection part of the framework relies on background modelling technique and a novel post processing step where the contours of the foreground regions (i.e. moving object) are refined by the classification of edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfection and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The capability of the proposed framework can be enhanced to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive up to date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of proposed algorithms. The results show that the proposed moving object detection and recognition technique superseded well-know baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used within various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild

    Spatial and temporal background modelling of non-stationary visual scenes

    Get PDF
    PhDThe prevalence of electronic imaging systems in everyday life has become increasingly apparent in recent years. Applications are to be found in medical scanning, automated manufacture, and perhaps most significantly, surveillance. Metropolitan areas, shopping malls, and road traffic management all employ and benefit from an unprecedented quantity of video cameras for monitoring purposes. But the high cost and limited effectiveness of employing humans as the final link in the monitoring chain has driven scientists to seek solutions based on machine vision techniques. Whilst the field of machine vision has enjoyed consistent rapid development in the last 20 years, some of the most fundamental issues still remain to be solved in a satisfactory manner. Central to a great many vision applications is the concept of segmentation, and in particular, most practical systems perform background subtraction as one of the first stages of video processing. This involves separation of ‘interesting foreground’ from the less informative but persistent background. But the definition of what is ‘interesting’ is somewhat subjective, and liable to be application specific. Furthermore, the background may be interpreted as including the visual appearance of normal activity of any agents present in the scene, human or otherwise. Thus a background model might be called upon to absorb lighting changes, moving trees and foliage, or normal traffic flow and pedestrian activity, in order to effect what might be termed in ‘biologically-inspired’ vision as pre-attentive selection. This challenge is one of the Holy Grails of the computer vision field, and consequently the subject has received considerable attention. This thesis sets out to address some of the limitations of contemporary methods of background segmentation by investigating methods of inducing local mutual support amongst pixels in three starkly contrasting paradigms: (1) locality in the spatial domain, (2) locality in the shortterm time domain, and (3) locality in the domain of cyclic repetition frequency. Conventional per pixel models, such as those based on Gaussian Mixture Models, offer no spatial support between adjacent pixels at all. At the other extreme, eigenspace models impose a structure in which every image pixel bears the same relation to every other pixel. But Markov Random Fields permit definition of arbitrary local cliques by construction of a suitable graph, and 3 are used here to facilitate a novel structure capable of exploiting probabilistic local cooccurrence of adjacent Local Binary Patterns. The result is a method exhibiting strong sensitivity to multiple learned local pattern hypotheses, whilst relying solely on monochrome image data. Many background models enforce temporal consistency constraints on a pixel in attempt to confirm background membership before being accepted as part of the model, and typically some control over this process is exercised by a learning rate parameter. But in busy scenes, a true background pixel may be visible for a relatively small fraction of the time and in a temporally fragmented fashion, thus hindering such background acquisition. However, support in terms of temporal locality may still be achieved by using Combinatorial Optimization to derive shortterm background estimates which induce a similar consistency, but are considerably more robust to disturbance. A novel technique is presented here in which the short-term estimates act as ‘pre-filtered’ data from which a far more compact eigen-background may be constructed. Many scenes entail elements exhibiting repetitive periodic behaviour. Some road junctions employing traffic signals are among these, yet little is to be found amongst the literature regarding the explicit modelling of such periodic processes in a scene. Previous work focussing on gait recognition has demonstrated approaches based on recurrence of self-similarity by which local periodicity may be identified. The present work harnesses and extends this method in order to characterize scenes displaying multiple distinct periodicities by building a spatio-temporal model. The model may then be used to highlight abnormality in scene activity. Furthermore, a Phase Locked Loop technique with a novel phase detector is detailed, enabling such a model to maintain correct synchronization with scene activity in spite of noise and drift of periodicity. This thesis contends that these three approaches are all manifestations of the same broad underlying concept: local support in each of the space, time and frequency domains, and furthermore, that the support can be harnessed practically, as will be demonstrated experimentally
    corecore