
    Fusion of monocular cues to detect man-made structures in aerial imagery

    The extraction of buildings from aerial imagery is a complex problem for automated computer vision. It requires locating regions in a scene that possess properties distinguishing them as man-made objects as opposed to naturally occurring terrain features. It is reasonable to assume that no single detection method can correctly delineate or verify buildings in every scene. A cooperative-methods paradigm is useful in approaching the building extraction problem. Using this paradigm, each extraction technique provides information which can be added or assimilated into an overall interpretation of the scene. Thus, the main objective is to explore the development of a computer vision system that integrates the results of various scene analysis techniques into an accurate and robust interpretation of the underlying three-dimensional scene. The problem of building hypothesis fusion in aerial imagery is discussed. Building extraction techniques are briefly surveyed, including four building extraction, verification, and clustering systems. A method for fusing the symbolic data generated by these systems is described, and applied to monocular image and stereo image data sets. Evaluation methods for the fusion results are described, and the fusion results are analyzed using these methods

    Wrapper Maintenance: A Machine Learning Approach

    The proliferation of online information sources has led to an increased use of wrappers for extracting data from Web sources. While most of the previous research has focused on quick and efficient generation of wrappers, the development of tools for wrapper maintenance has received less attention. This is an important research problem because Web sources often change in ways that prevent the wrappers from extracting data correctly. We present an efficient algorithm that learns structural information about data from positive examples alone. We describe how this information can be used for two wrapper maintenance applications: wrapper verification and reinduction. The wrapper verification system detects when a wrapper is not extracting correct data, usually because the Web source has changed its format. The reinduction algorithm automatically recovers from changes in the Web source by identifying data on Web pages so that a new wrapper may be generated for this source. To validate our approach, we monitored 27 wrappers over a period of a year. The verification algorithm correctly discovered 35 of the 37 wrapper changes, and made 16 mistakes, resulting in precision of 0.73 and recall of 0.95. We validated the reinduction algorithm on ten Web sources. We were able to successfully reinduce the wrappers, obtaining precision and recall values of 0.90 and 0.80 on the data extraction task
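
The reported recall can be checked directly from the counts in the abstract. The sketch below is only illustrative: the function names are hypothetical, and precision is not recomputed because the abstract does not say how the 16 mistakes divide between false alarms and missed changes.

```python
def recall(true_positives: int, actual_positives: int) -> float:
    """Fraction of real wrapper changes that the verifier detected."""
    return true_positives / actual_positives


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of raised alarms that correspond to real changes."""
    return true_positives / (true_positives + false_positives)


# Counts taken from the abstract: 35 of 37 wrapper changes were discovered.
detected, total_changes = 35, 37
print(round(recall(detected, total_changes), 2))  # 0.95
```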

    Segmenting Hand-Drawn Strokes

    Pen-based interfaces utilize sketch recognition so users can create and interact with complex, graphical systems via drawn input. In order for people to freely draw within these systems, users' drawing styles should not be constrained. The low-level techniques involved with sketch recognition must then be perfected, because poor low-level accuracy can impair a user's interaction experience. Corner finding, also known as stroke segmentation, is one of the first steps to free-form sketch recognition. Corner finding breaks a drawn stroke into a set of primitive symbols such as lines, arcs, and circles, so that the original stroke data can be transformed into a more machine-friendly format. By working with sketched primitives, drawn objects can then be described in a visual language, noting what primitive shapes have been drawn and the shapes' geometric relationships to each other. We present three new corner finding techniques that improve segmentation accuracy. Our first technique, MergeCF, is a multi-primitive segmenter that splits drawn strokes into primitive lines and arcs. MergeCF eliminates extraneous primitives by merging them with their neighboring segments. Our second technique, ShortStraw, works with polyline-only data. Polyline segments are important since many domains use simple polyline symbols formed with squares, triangles, and arrows. Our ShortStraw algorithm is simple to implement, yet more powerful than previous polyline work in the corner finding literature. Lastly, we demonstrate how a combination technique can be used to pull the best corner finding results from multiple segmentation algorithms. This combination segmenter utilizes the best corners found from other segmentation techniques, eliminating many false negatives (missed primitive segmentations) from the final, low-level results.
We will present the implementation and results from our new segmentation techniques, showing how they perform better than related work in the corner finding field. We will also discuss limitations of each technique, how we have sought to overcome those limitations, and where we believe the sketch recognition subfield of corner finding is headed
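
The abstract does not spell out how polyline corners are found, so the following is only a simplified sketch of a straw-style corner detector, not the authors' implementation. It assumes the stroke has already been resampled to evenly spaced points; the function name `straw_corners`, the window of 3 points, and the 0.95 threshold factor are all assumptions made for illustration, and any post-processing of the detected corners is omitted.

```python
import math
from statistics import median


def straw_corners(points, window=3, factor=0.95):
    """Return indices of likely corners in an evenly spaced polyline."""
    n = len(points)
    if n <= 2 * window:
        # Too short to measure any straw: report only the endpoints.
        return [0, n - 1] if n > 1 else list(range(n))
    # A "straw" is the chord between the points `window` steps before and
    # after index i; it gets shorter where the stroke bends.
    straws = {i: math.dist(points[i - window], points[i + window])
              for i in range(window, n - window)}
    threshold = median(straws.values()) * factor
    corners = [0]
    i = window
    while i < n - window:
        if straws[i] < threshold:
            # Scan the run of short straws and keep its minimum as the corner.
            best = i
            while i < n - window and straws[i] < threshold:
                if straws[i] < straws[best]:
                    best = i
                i += 1
            corners.append(best)
        else:
            i += 1
    corners.append(n - 1)
    return corners


# An L-shaped stroke: 11 points along the x-axis, then 10 points going up.
stroke = [(x, 0) for x in range(11)] + [(10, y) for y in range(1, 11)]
print(straw_corners(stroke))  # [0, 10, 20]
```

Index 10 is the point (10, 0), where the horizontal and vertical segments of the example stroke meet.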

    Video foreground extraction for mobile camera platforms

    Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, it is often the case that the algorithm needs to operate under the following challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles. The first problem addresses passenger detection and tracking tasks for public transport buses, investigating the problem of changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Monte Carlo Markov Chain tracking algorithm. Using the SVM classifier, the appearance transformation models capture changes in the appearance of the foreground objects across two consecutive frames under low frame rate conditions. In the second problem, we present a system for pedestrian detection involving scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two stage clustering of the video data. In the first stage, SIFT Homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination.
This produces clusters of images that are differential in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which is further used to detect candidate foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be detected. In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradient) techniques (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches include: a) a new histogram feature that is formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook based HOG feature with branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and c) the codebook based HOGB approach. In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations. The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis
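
To illustrate the idea of a KDE background model, here is a minimal sketch for a single pixel. It is not the thesis's method: it assumes one grey-level intensity per pixel rather than colour and gradient features, uses a Gaussian kernel, and the names `kde_density` and `is_foreground`, the bandwidth, and the threshold are all hypothetical choices.

```python
import math


def kde_density(value, samples, bandwidth=10.0):
    """Gaussian kernel density estimate of `value` given background samples."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    return sum(norm * math.exp(-0.5 * ((value - s) / bandwidth) ** 2)
               for s in samples) / len(samples)


def is_foreground(value, samples, threshold=1e-3, bandwidth=10.0):
    """Flag a pixel as candidate foreground when the background model
    assigns its observed value a density below `threshold`."""
    return kde_density(value, samples, bandwidth) < threshold


# Intensities observed at one pixel across the frames of an image cluster.
background = [100, 102, 98, 101, 99]
print(is_foreground(100, background))  # False: close to the background values
print(is_foreground(200, background))  # True: far from every background sample
```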

    SeaVipers - Computer Vision and Inertial Position Reference Sensor System (CVIPRSS)

    This work describes the design and development of an optical, Computer Vision (CV) based sensor for use as a Position Reference System (PRS) in Dynamic Positioning (DP). Using a combination of robotics and CV techniques, the sensor provides range and heading information to a selected reference object. The proposed optical system is superior to existing ones because it does not depend upon special reflectors nor does it require a lengthy set-up time. This system, the Computer Vision and Inertial Position Reference Sensor System (CVIPRSS, pronounced SeaVipers), combines a laser rangefinder, infrared camera, and a pan-tilt unit with the robust TLD (Tracking-Learning-Detection) object tracker. In this work, a SeaVipers prototype is evaluated, showing promising results as a viable PRS with research, commercial, and industrial applications

    User-directed sketch interpretation

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 91-92). I present a novel approach to creating structured diagrams (such as flow charts and object diagrams) by combining an off-line sketch recognition system with the user interface of a traditional structured graphics editor. The system, called UDSI (user-directed sketch interpretation), aims to provide drawing freedom by allowing the user to sketch entirely off-line using a pure pen-and-paper interface. The results of the drawing can then be presented to UDSI, which recognizes shapes and lines and text areas that the user can then polish as desired. The system can infer multiple interpretations for a given sketch, to aid during the user's polishing stage. The UDSI program offers three novel features. First, it implements a greedy algorithm for determining alternative interpretations of the user's original pen drawing. Second, it introduces a user interface for selecting from these multiple candidate interpretations. Third, it implements a circle recognizer using a novel circle-detection algorithm and combines it with other hand-coded recognizers to provide a robust sketch recognition system. By Matthew J. Notowidigdo. M.Eng

    Image segmentation and pattern classification using support vector machines

    Image segmentation and pattern classification have long been important topics in computer science research. Image segmentation is one of the basic and challenging lower-level image processing tasks. Feature extraction, feature reduction, and classifier design based on selected features are the three essential issues for the pattern classification problem. In this dissertation, an automatic Seeded Region Growing (SRG) algorithm for color image segmentation is developed. In the SRG algorithm, the initial seeds are automatically determined. An adaptive morphological edge-linking algorithm to fill in the gaps between edge segments is designed. Broken edges are extended along their slope directions by using the adaptive dilation operation with suitably sized elliptical structuring elements. The size and orientation of the structuring element are adjusted according to local properties. For feature reduction, an improved feature reduction method in input and feature spaces using Support Vector Machines (SVMs) is developed. In the input space, a subset of input features is selected by the ranking of their contributions to the decision function. In the feature space, features are ranked according to the weighted support vectors in each dimension. For object detection, a fast face detection system using SVMs is designed. Two-eye patterns are first detected using a linear SVM, so that most of the background can be eliminated quickly. Two-layer 2nd-degree polynomial SVMs are trained for further face verification. The detection process is implemented directly in feature space, which leads to a faster SVM. By training a two-layer SVM, higher classification rates can be achieved. For active learning, an improved incremental training algorithm for SVMs is developed. Instead of selecting training samples randomly, the k-means clustering algorithm is applied to collect the initial set of training samples.
In active query, a weight is assigned to each sample according to its distance to the current separating hyperplane and the confidence factor. The confidence factor, calculated from the upper bounds of SVM errors, is used to indicate the degree of closeness of the current separating hyperplane to the optimal solution
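
The geometric quantity behind the active query step, the distance from a sample to a linear separating hyperplane w·x + b = 0, can be sketched as below. The abstract does not give the weighting formula or the confidence factor, so this example only computes the distance and ranks candidates closest-first, which is an assumed (though common) active-learning heuristic; the function names are hypothetical.

```python
import math


def distance_to_hyperplane(x, w, b):
    """Euclidean distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return abs(dot + b) / math.sqrt(sum(wi * wi for wi in w))


def rank_queries(samples, w, b):
    """Order candidate samples so those nearest the hyperplane come first."""
    return sorted(samples, key=lambda x: distance_to_hyperplane(x, w, b))


w, b = (3.0, 4.0), -5.0
candidates = [(3.0, 4.0), (1.0, 0.5), (0.0, 0.0)]
print(distance_to_hyperplane((3.0, 4.0), w, b))  # 4.0
print(rank_queries(candidates, w, b)[0])         # (1.0, 0.5)
```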

    Intelligent summarization of sports videos using automatic saliency detection

    The aim of this thesis is to present an efficient and intelligent way of creating sports summary videos by automatically identifying the highlights or salient events from one or more pieces of video footage using computer vision techniques and combining them to form a video summary of the game. The thesis presents a twofold solution: (1) identification of salient parts from single or multiple video footage of a certain sports event; and (2) remixing of video by extracting and merging various segments, with effects (such as slow replay) and mixing audio. This project involves applying methods of machine learning and computer vision to identify regions of interest in the video frames and detect action areas and scoring attempts. These methods were developed for the sport of basketball. However, the methods may be tweaked or enhanced for other sports such as football, hockey etc. For creating summary videos, various video processing techniques have been tried in order to add visual effects that improve the quality of the summary videos. The goal has been to deliver a fully automated, fast and robust system that could work with large high definition video files

    Model-based Curvilinear Network Extraction and Tracking toward Quantitative Analysis of Biopolymer Networks

    Curvilinear biopolymer networks pervade living systems. They are routinely imaged by fluorescence microscopy to gain insight into their structural, mechanical, and dynamic properties. Image analysis can facilitate understanding the mechanisms of their formation and their biological functions from a quantitative viewpoint. Due to the variability in network geometry, topology and dynamics as well as often low resolution and low signal-to-noise ratio in images, segmenting and tracking networks in these images is challenging. In this dissertation, we propose a complete framework for extracting the geometry and topology of curvilinear biopolymer networks, and also tracking their dynamics from multi-dimensional images. The proposed multiple Stretching Open Active Contours (SOACs) can identify network centerlines and junctions, and infer plausible network topology. Combined with a k-partite matching algorithm, temporal correspondences among all the detected filaments can be established. This work enables statistical analysis of structural parameters of biopolymer networks as well as their dynamics. Quantitative evaluation using simulated and experimental images demonstrates its effectiveness and efficiency. Moreover, a principled method of optimizing key parameters without ground truth is proposed for attaining the best extraction result for any type of image. The proposed methods are implemented in usable open-source software, "SOAX". Besides network extraction and tracking, SOAX provides a user-friendly cross-platform GUI for interactive visualization, manual editing and quantitative analysis. Using SOAX to analyze several types of biopolymer networks demonstrates the potential of the proposed methods to help answer key questions in cell biology and biophysics from a quantitative viewpoint

    Coronal loop detection from solar images and extraction of salient contour groups from cluttered images.

    This dissertation addresses two different problems: 1) coronal loop detection from solar images; and 2) salient contour group extraction from cluttered images. In the first part, we propose two different solutions to the coronal loop detection problem. The first solution is a block-based coronal loop mining method that detects coronal loops from solar images by dividing the solar image into fixed sized blocks, labeling the blocks as Loop or Non-Loop, extracting features from the labeled blocks, and finally training classifiers to generate learning models that can classify new image blocks. The block-based approach achieves 64% accuracy in 10-fold cross validation experiments. To improve the accuracy and scalability, we propose a contour-based coronal loop detection method that extracts contours from cluttered regions, then labels the contours as Loop and Non-Loop, and extracts geometric features from the labeled contours. The contour-based approach achieves 85% accuracy in 10-fold cross validation experiments, which is a 20% increase compared to the block-based approach. In the second part, we propose a method to extract semi-elliptical open curves from cluttered regions. Our method consists of the following steps: obtaining individual smooth contours along with their saliency measures; then starting from the most salient contour, searching for possible grouping options for each contour; and continuing the grouping until an optimum solution is reached. Our work involved the design and development of a complete system for coronal loop mining in solar images, which required the formulation of new Gestalt perceptual rules and a systematic methodology to select and combine them in a fully automated judicious manner using machine learning techniques that eliminate the need to manually set various weight and threshold values to define an effective cost function.
After finding salient contour groups, we close the gaps within the contours in each group and perform B-spline fitting to obtain smooth curves. Our methods were successfully applied on cluttered solar images from TRACE and STEREO/SECCHI to discern coronal loops. Aerial road images were also used to demonstrate the applicability of our grouping techniques to other contour-types in other real applications