
    Substructure Discovery Using Minimum Description Length and Background Knowledge

    The ability to identify interesting and repetitive substructures is an essential component of discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate SUBDUE's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain. Description of Online Appendix: This is a compressed tar file containing the SUBDUE discovery system, written in C. The program accepts as input databases represented in graph form, and will output discovered substructures with their corresponding value. Comment: See http://www.jair.org/ for an online appendix and other files accompanying this article.
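
    As a rough illustration of the compression measure described above, here is a minimal Python sketch: it scores a candidate substructure S by DL(G) / (DL(S) + DL(G|S)), using a deliberately crude bits-per-symbol graph encoding. The function names and toy numbers are ours, not SUBDUE's actual encoding.

```python
import math

def description_length(num_vertices, num_edges, num_labels):
    """Crude description length of a labeled graph, in bits: every vertex
    and edge stores one label drawn from an alphabet of num_labels symbols.
    (A stand-in for SUBDUE's more careful graph encoding.)"""
    label_bits = math.log2(max(num_labels, 2))
    return (num_vertices + num_edges) * label_bits

def compression_value(dl_graph, dl_substructure, dl_graph_given_sub):
    """MDL value of substructure S for graph G:
    value(S, G) = DL(G) / (DL(S) + DL(G|S)); values above 1 mean S
    compresses G, and higher-valued substructures are preferred."""
    return dl_graph / (dl_substructure + dl_graph_given_sub)

# Toy numbers: a 100-vertex, 150-edge graph in which rewriting a recurring
# 4-vertex pattern as a single vertex leaves a 40-vertex, 55-edge graph.
dl_g = description_length(100, 150, 5)
dl_s = description_length(4, 4, 5)
dl_g_given_s = description_length(40, 55, 6)  # one extra label for "S" vertices
print(compression_value(dl_g, dl_s, dl_g_given_s))  # > 1: S compresses G
```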

    Multi-Scale Vector-Ridge-Detection for Perceptual Organization Without Edges

    We present a novel ridge detector that finds ridges on vector fields. It is designed to automatically find the right scale of a ridge even in the presence of noise, multiple steps, and narrow valleys. One of the key features of this ridge detector is that it has a zero response at discontinuities. The ridge detector can be applied to scalar and vector quantities such as color. We also present a parallel perceptual organization scheme based on this ridge detector that works without edges; in addition to perceptual groups, the scheme computes potential focus-of-attention points at which to direct future processing. The relation to human perception and several theoretical findings supporting the scheme are presented. We also show results of a Connection Machine implementation of the scheme for perceptual organization (without edges) using color.
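
    The paper's vector ridge operator is not reproduced here, but the standard scale-space machinery it builds on is easy to sketch. The following Python fragment, under that assumption, scores bright ridges at several scales via the Hessian's most negative eigenvalue, normalized by sigma squared so the best-responding scale can be selected per pixel; it omits the vector-field extension and the zero-response-at-discontinuities property.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ridge_strength(image, sigma):
    """Single-scale ridge strength: magnitude of the most negative Hessian
    eigenvalue of the Gaussian-smoothed image (responds to bright ridges),
    multiplied by sigma**2 so responses are comparable across scales."""
    Iyy = gaussian_filter(image, sigma, order=(2, 0))  # d2/dy2 (rows)
    Ixx = gaussian_filter(image, sigma, order=(0, 2))  # d2/dx2 (cols)
    Ixy = gaussian_filter(image, sigma, order=(1, 1))
    half_gap = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lam_min = (Ixx + Iyy) / 2.0 - half_gap  # smaller Hessian eigenvalue
    return sigma ** 2 * np.maximum(-lam_min, 0.0)

def multiscale_ridges(image, sigmas=(1.0, 2.0, 4.0, 8.0)):
    """Per pixel, keep the strongest normalized response over all scales
    and remember which scale won (automatic scale selection)."""
    stack = np.stack([ridge_strength(image, s) for s in sigmas])
    best = stack.argmax(axis=0)
    return stack.max(axis=0), np.asarray(sigmas)[best]

# Demo on a synthetic image containing one horizontal bright ridge.
img = np.zeros((64, 64))
img[30:34, :] = 1.0
strength, scale = multiscale_ridges(img)
r, c = np.unravel_index(strength.argmax(), strength.shape)
print(strength[r, c], scale[r, c])
```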

    Higher-Order Statistics in Visual Object Recognition

    In this paper, I develop a higher-order statistical theory of matching models against images. The basic idea is to take into account not only how much of an object can be seen in the image, but also what parts of it are jointly present. I show that this additional information can improve the specificity (i.e., reduce the probability of false positive matches) of a recognition algorithm. I demonstrate formally that most commonly used quality-of-match measures employed by recognition algorithms are based on an independence assumption. Using the Minimum Description Length (MDL) principle and a simple scene-description language as a guide, I show that this independence assumption is not satisfied for common scenes, and propose several important higher-order statistical properties of matches that approximate some aspects of these statistical dependencies. I have implemented a recognition system that takes advantage of this additional statistical information and demonstrate its efficacy in comparisons with a standard recognition system based on bounded-error matching. We also observe that the existing use of grouping and segmentation methods has significant effects on the performance of recognition systems that are similar to those resulting from the use of higher-order statistical information. Our analysis provides a statistical framework in which to understand the effects of grouping and segmentation on recognition and suggests ways to take better advantage of such information.
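
    A toy sketch of the independence point, with hypothetical inputs: under the independence assumption, a match score simply sums per-feature log-probabilities of visibility, while a second-order correction adds a pointwise-mutual-information term per feature pair. This is our illustration of the idea, not the paper's actual quality-of-match measures.

```python
import math
from itertools import combinations

def independent_score(visible, p_visible):
    """Quality of match under the independence assumption: each model
    feature i is seen with probability p_visible[i], independently."""
    return sum(math.log(p_visible[i]) for i in visible)

def pairwise_score(visible, p_visible, p_joint):
    """Second-order correction: for every visible pair (i, j), add
    log p_ij - log p_i - log p_j (their pointwise mutual information),
    so jointly plausible part combinations score higher."""
    score = independent_score(visible, p_visible)
    for i, j in combinations(sorted(visible), 2):
        score += (math.log(p_joint[(i, j)])
                  - math.log(p_visible[i]) - math.log(p_visible[j]))
    return score

# Hypothetical statistics: wheels of a car tend to be visible together.
p_vis = {"wheel_front": 0.6, "wheel_rear": 0.6}
p_joint = {("wheel_front", "wheel_rear"): 0.5}  # > 0.36 under independence
match = {"wheel_front", "wheel_rear"}
print(independent_score(match, p_vis), pairwise_score(match, p_vis, p_joint))
```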

    Integrated descriptions for vision

    Thesis (M.S.), Massachusetts Institute of Technology, Dept. of Architecture, 1990. Includes bibliographical references (leaves 51-54). By Trevor Darrell.

    Part decomposition of 3D surfaces

    This dissertation describes a general algorithm that automatically decomposes real-world scenes and objects into visual parts. The input to the algorithm is a 3D triangle mesh that approximates the surfaces of a scene or object. This geometric mesh completely specifies the shape of interest. The output of the algorithm is a set of boundary contours that dissect the mesh into parts where these parts agree with human perception. In this algorithm, shape alone defines the location of a boundary contour for a part. The algorithm leverages a human vision theory known as the minima rule, which states that human visual perception tends to decompose shapes into parts along lines of negative curvature minima. Specifically, the minima rule governs the location of part boundaries, and as a result the algorithm is known as the Minima Rule Algorithm. Previous computer vision methods have attempted to implement this rule but have used pseudo-measures of surface curvature; thus, these prior methods are not true implementations of the rule. The Minima Rule Algorithm is a three-step process consisting of curvature estimation, mesh segmentation, and quality evaluation. These steps have led to three novel algorithms known as Normal Vector Voting, Fast Marching Watersheds, and the Part Saliency Metric, respectively. For each algorithm, this dissertation presents both the supporting theory and experimental results. The results demonstrate the effectiveness of the algorithm using both synthetic and real data and include comparisons with previous methods from the research literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art.
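
    A minimal sketch of the minima-rule idea, under strong simplifying assumptions: given precomputed minimum principal curvatures per vertex, mark vertices below a negative threshold as part-boundary candidates and flood-fill the remaining vertices into parts. The actual Fast Marching Watersheds algorithm orders the growth by curvature; this toy version only shows the overall structure.

```python
import numpy as np
from collections import deque

def decompose(adjacency, kappa_min, boundary_thresh=-0.5):
    """Toy minima-rule segmentation: vertices whose minimum principal
    curvature falls below a (negative) threshold become boundary
    candidates; the rest are grown into parts by flood fill over the
    mesh graph. adjacency[v] lists the neighbors of vertex v."""
    n = len(adjacency)
    labels = np.full(n, -1)
    boundary = kappa_min < boundary_thresh
    part = 0
    for seed in range(n):
        if boundary[seed] or labels[seed] != -1:
            continue
        queue = deque([seed])
        labels[seed] = part
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if not boundary[w] and labels[w] == -1:
                    labels[w] = part
                    queue.append(w)
        part += 1
    return labels  # -1 marks boundary vertices

adj = [[1], [0, 2], [1, 3], [2]]          # a 4-vertex chain
kappa = np.array([0.2, -1.0, 0.1, 0.3])   # vertex 1 sits in a deep concavity
print(decompose(adj, kappa))              # [0 -1 1 1]: two parts, split at v1
```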

    Superquadric representation of scenes from multi-view range data

    Object representation denotes representing three-dimensional (3D) real-world objects with known graphic or mathematic primitives recognizable to computers. This research has numerous applications for object-related tasks in areas including computer vision, computer graphics, reverse engineering, etc. Superquadrics, as volumetric and parametric models, have been selected to be the representation primitives throughout this research. Superquadrics are able to represent a large family of solid shapes by a single equation with only a few parameters. This dissertation addresses superquadric representation of multi-part objects and multi-object scenes. Two issues motivate this research. First, superquadric representation of multi-part objects or multi-object scenes has been an unsolved problem due to the complex geometry of objects. Second, superquadrics recovered from single-view range data tend to have low confidence and accuracy due to partially scanned object surfaces caused by inherent occlusions. To address these two problems, this dissertation proposes a multi-view superquadric representation algorithm. By incorporating both part decomposition and multi-view range data, the proposed algorithm is able to not only represent multi-part objects or multi-object scenes, but also achieve high confidence and accuracy of recovered superquadrics. The multi-view superquadric representation algorithm consists of (i) initial superquadric model recovery from single-view range data, (ii) pairwise view registration based on recovered superquadric models, (iii) view integration, (iv) part decomposition, and (v) final superquadric fitting for each decomposed part. Within the multi-view superquadric representation framework, this dissertation proposes a 3D part decomposition algorithm to automatically decompose multi-part objects or multi-object scenes into their constituent single parts consistent with human visual perception. Superquadrics can then be recovered for each decomposed single-part object. The proposed part decomposition algorithm is based on curvature analysis, and includes (i) Gaussian curvature estimation, (ii) boundary labeling, (iii) part growing and labeling, and (iv) post-processing. In addition, this dissertation proposes an extended view registration algorithm based on superquadrics. The proposed view registration algorithm is able to handle deformable superquadrics as well as 3D unstructured data sets. For superquadric fitting, two objective functions primarily used in the literature have been comprehensively investigated with respect to noise, viewpoints, sample resolutions, etc. The objective function proved to have better performance has been used throughout this dissertation. In summary, the three algorithms (contributions) proposed in this dissertation are generic and flexible in the sense of handling triangle meshes, which are standard surface primitives in computer vision and graphics. For each proposed algorithm, the dissertation presents both theory and experimental results. The results demonstrate the efficiency of the algorithms using both synthetic and real range data of a large variety of objects and scenes. In addition, the experimental results include comparisons with previous methods from the literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art in superquadric representation, and presents possible future extensions to this research.
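
    The superquadric inside-outside function is standard, so a fitting step is easy to sketch: F(x, y, z) equals 1 exactly on the surface, and a residual in the Solina-Bajcsy style (one plausible form of the objective functions the dissertation compares) can be minimized with an off-the-shelf least-squares solver. Pose recovery and part decomposition are omitted; this is a sketch, not the dissertation's implementation.

```python
import numpy as np
from scipy.optimize import least_squares

def inside_outside(points, a1, a2, a3, e1, e2):
    """Superquadric inside-outside function F; F == 1 on the surface.
    Points are assumed already expressed in the superquadric's own frame."""
    x, y, z = np.abs(points).T + 1e-12  # avoid 0 raised to small powers
    return ((x / a1) ** (2 / e2) + (y / a2) ** (2 / e2)) ** (e2 / e1) \
           + (z / a3) ** (2 / e1)

def fit_superquadric(points):
    """Least-squares recovery of the five shape parameters from range
    points, using the residual sqrt(a1*a2*a3) * (F**e1 - 1), which
    penalizes inflated fits. Starting guess and bounds are arbitrary."""
    def residuals(theta):
        a1, a2, a3, e1, e2 = theta
        F = inside_outside(points, a1, a2, a3, e1, e2)
        return np.sqrt(a1 * a2 * a3) * (F ** e1 - 1.0)
    x0 = np.array([1.5, 1.5, 1.5, 1.0, 1.0])
    bounds = ([1e-3, 1e-3, 1e-3, 0.1, 0.1], [10.0, 10.0, 10.0, 2.0, 2.0])
    return least_squares(residuals, x0, bounds=bounds).x

# Demo: fit points sampled on the unit sphere (a1 = a2 = a3 = e1 = e2 = 1).
u = np.linspace(-1.3, 1.3, 12)
v = np.linspace(-2.8, 2.8, 24)
uu, vv = np.meshgrid(u, v)
pts = np.stack([np.cos(uu.ravel()) * np.cos(vv.ravel()),
                np.cos(uu.ravel()) * np.sin(vv.ravel()),
                np.sin(uu.ravel())], axis=1)
print(fit_superquadric(pts).round(2))  # approximately [1, 1, 1, 1, 1]
```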

    A comparison of different approaches to target differentiation with sonar

    Ankara: The Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 2001. Thesis (Ph.D.), Bilkent University, 2001. Includes bibliographical references (leaves 180-197). This study compares the performances of different classification schemes and fusion techniques for target differentiation and localization of commonly encountered features in indoor robot environments using sonar sensing. Differentiation of such features is of interest for intelligent systems in a variety of applications such as system control based on acoustic signal detection and identification, map building, navigation, obstacle avoidance, and target tracking. The classification schemes employed include the target differentiation algorithm developed by Ayrulu and Barshan, statistical pattern recognition techniques, the fuzzy c-means clustering algorithm, and artificial neural networks. The fusion techniques used are Dempster-Shafer evidential reasoning and different voting schemes. To solve the consistency problem arising in simple majority voting, different voting schemes including preference ordering and reliability measures are proposed and verified experimentally. To improve the performance of neural network classifiers, different input signal representations, two different training algorithms, and both modular and non-modular network structures are considered. The best classification and localization scheme is found to be the neural network classifier trained with the wavelet transform of the sonar signals. This method is applied to map building in mobile robot environments. Physically different sensors, such as infrared sensors and structured-light systems besides sonar sensors, are also considered to improve the performance in target classification and localization. By Birsel Ayrulu (Erdem). Ph.D.
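
    One of the lighter fusion schemes mentioned above, reliability-weighted voting, fits in a few lines; the sensor names and weights below are invented for illustration, and Dempster-Shafer combination would replace the simple tally with belief-mass products.

```python
from collections import defaultdict

def weighted_vote(decisions, reliabilities):
    """Fuse per-sensor target decisions by reliability-weighted voting:
    each sensor's vote for a target class counts in proportion to its
    reliability, and the class with the largest tally wins.
    decisions: sensor -> label; reliabilities: sensor -> weight in [0, 1]."""
    tally = defaultdict(float)
    for sensor, label in decisions.items():
        tally[label] += reliabilities.get(sensor, 1.0)
    return max(tally, key=tally.get)

print(weighted_vote(
    {"s1": "plane", "s2": "corner", "s3": "plane"},
    {"s1": 0.9, "s2": 0.6, "s3": 0.4},
))  # -> "plane" (1.3 vs 0.6)
```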

    Dynamic Trees: A Hierarchical Probabilistic Approach to Image Modelling

    Institute for Adaptive and Neural Computation. This work introduces a new class of image model which we call dynamic trees or DTs. A dynamic tree model specifies a prior over structures of trees, each of which is a forest of one or more tree-structured belief networks (TSBNs). In the literature, standard tree-structured belief network models were found to produce "blocky" segmentations when naturally occurring boundaries within an image did not coincide with those of the subtrees in the rigid fixed structure of the network. Dynamic trees have a flexible architecture which allows the structure to vary to accommodate configurations where the subtree and image boundaries align, and experimentation with the model showed significant improvements. They are also hierarchical in nature, allowing a multi-scale representation, and are constructed within a well-founded Bayesian framework. For large models the number of tree configurations quickly becomes intractable to enumerate over, presenting a problem for exact inference. Techniques such as Gibbs sampling over trees are considered, and search using simulated annealing finds high-posterior-probability trees on synthetic 2-D images generated from the model. However, simulated annealing and sampling techniques are rather slow. Variational methods are applied to the model in an attempt to approximate the posterior by a simpler tractable distribution, and the simplest of these techniques, mean field, found solutions comparable to simulated annealing on the order of 100 times faster. This increase in speed goes a long way towards making real-time inference in the dynamic tree viable. Variational methods have the further advantage that, by attempting to model the full posterior distribution, it is possible to gain an indication of the quality of the solutions found. An EM-style update based upon mean-field inference is derived, and the learned conditional probability tables (describing state transitions between a node and its parent) are compared with exact EM on small tractable fixed-architecture models. The mean-field approximation, by virtue of its form, is biased towards fully factorised solutions, which tends to create degenerate CPTs, but despite this mean-field learning still produces solutions whose log likelihood rivals exact EM. Development of algorithms for learning the probabilities of the prior over tree structures completes the dynamic tree picture. After discussion of the relative merits of certain representations for the disconnection probabilities and initial investigation on small model structures, the full dynamic tree model is applied to a database of images of outdoor scenes where all of its parameters are learned. DTs are seen to offer significant improvement in performance over the fixed-architecture TSBN, and in a coding comparison the DT achieves 0.294 bits per pixel (bpp) compression compared to 0.378 bpp for lossless JPEG on images of 7 colours.
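
    To make the prior over tree structures concrete, here is a toy sampler: each node either disconnects (rooting a new tree in the forest) or picks a parent in the layer above, and states then propagate root-to-leaf through a shared conditional probability table. In the actual model, parent choice is restricted to nearby nodes and the disconnection probabilities are learned; this sketch assumes uniform choices throughout.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dynamic_tree(layer_sizes, p_disconnect=0.1, n_states=3):
    """Draw one structure-plus-states sample from a toy dynamic-tree prior.
    Each node either disconnects (starting a new tree in the forest) or
    inherits from a uniformly random parent in the layer above, with the
    child state drawn from a shared CPT row indexed by the parent state."""
    cpt = rng.dirichlet(np.ones(n_states), size=n_states)  # row per parent state
    prior = np.ones(n_states) / n_states
    states = []
    parent_states = None
    for size in layer_sizes:  # coarsest layer first
        layer = np.empty(size, dtype=int)
        for i in range(size):
            if parent_states is None or rng.random() < p_disconnect:
                layer[i] = rng.choice(n_states, p=prior)       # new root
            else:
                p = parent_states[rng.integers(len(parent_states))]
                layer[i] = rng.choice(n_states, p=cpt[p])      # inherit
        states.append(layer)
        parent_states = layer
    return states  # the finest layer plays the role of pixel labels

print(sample_dynamic_tree([1, 4, 16]))
```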

    Mid-Level Vision and Recognition of Non-Rigid Objects

    We address mid-level vision for the recognition of non-rigid objects. We align model and image using frame curves, which are object or "figure/ground" skeletons. Frame curves are computed, without discontinuities, using Curved Inertia Frames, a provably global scheme implemented on the Connection Machine, based on non-Cartesian networks, a definition of curved axis of inertia, and a ridge detector. I present evidence against frame alignment in human perception. This suggests that frame curves have a role in figure/ground segregation and in fuzzy boundaries; that their outside/near/top/incoming regions are more salient; and that perception begins by setting a reference frame (prior to early vision) and proceeds by processing convex structures.
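
    Curved Inertia Frames generalizes the classical straight axis of least inertia, which is simple enough to sketch: the axis direction is the dominant eigenvector of the point set's second-moment matrix. The curved, network-based computation of the paper is not reproduced here; this shows only the classical special case.

```python
import numpy as np

def axis_of_inertia(points):
    """Straight axis of inertia of a 2-D point set: the direction of
    largest variance, i.e. the dominant eigenvector of the covariance
    (second central moment) matrix, anchored at the centroid."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    cov = np.cov((pts - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    direction = eigvecs[:, eigvals.argmax()]  # largest-variance direction
    return centroid, direction

# Nearly collinear points: the axis should point roughly along +x.
c, d = axis_of_inertia([(0, 0), (1, 0.1), (2, -0.1), (3, 0.05)])
print(c, d)
```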