
    A graph theoretic approach to scene matching

    The ability to match two scenes is a fundamental requirement in a variety of computer vision tasks. A graph theoretic approach to inexact scene matching is presented which is useful in dealing with problems due to imperfect image segmentation. A scene is described by a set of graphs, with nodes representing objects and arcs representing relationships between objects. Each node has a set of values representing the relations between pairs of objects, such as angle, adjacency, or distance. With this method of scene representation, the task in scene matching is to match two sets of graphs. Because of segmentation errors, variations in camera angle, illumination, and other conditions, an exact match between the sets of observed and stored graphs is usually not possible. In the developed approach, the problem is represented as an association graph, in which each node represents a possible mapping of an observed region to a stored object, and each arc represents the compatibility of two mappings. Nodes and arcs have weights indicating the merit of a region-object mapping and the degree of compatibility between two mappings. A match between the two graphs corresponds to a clique, or fully connected subgraph, in the association graph. The task is to find the clique that represents the best match. Fuzzy relaxation is used to update the node weights using the contextual information contained in the arcs and neighboring nodes. This simplifies the evaluation of cliques. A method of handling oversegmentation and undersegmentation problems is also presented. The approach is tested with a set of realistic images which exhibit many types of segmentation errors.
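The association-graph formulation above can be sketched in a few lines. This is an illustrative toy, not the paper's algorithm or data: the region/object names, the weights, and the particular relaxation update (averaging a node's weight with the support from its most compatible neighbor) are all assumptions.

```python
# Association graph: each node pairs an observed region with a stored object;
# its weight is the initial merit of that region-object mapping.
node_weight = {
    ("r1", "door"): 0.9, ("r1", "window"): 0.4,
    ("r2", "window"): 0.8, ("r2", "door"): 0.3,
}
# Arc weights encode how compatible two mappings are (e.g. whether the
# observed spatial relation between r1 and r2 matches the stored one).
arc_weight = {
    (("r1", "door"), ("r2", "window")): 0.9,
    (("r1", "window"), ("r2", "door")): 0.2,
}

def compat(a, b):
    return arc_weight.get((a, b)) or arc_weight.get((b, a)) or 0.0

def relax(weights, n_iter=5):
    """One fuzzy-relaxation-style update: pull each node weight toward the
    contextual support it receives from its most compatible neighbor."""
    w = dict(weights)
    for _ in range(n_iter):
        w = {node: 0.5 * wn + 0.5 * max(
                 (compat(node, m) * wm for m, wm in w.items() if m != node),
                 default=0.0)
             for node, wn in w.items()}
    return w

w = relax(node_weight)
# Mutually compatible mappings retain high weight, so the clique
# {(r1, door), (r2, window)} stands out and clique evaluation is simplified.
```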

    A personal identification biometric system based on back-of-hand vein patterns

    This report describes research on the use of back-of-hand vein patterns as a means of uniquely identifying people. In particular it describes a prototype biometric system developed by the Australian Institute of Security and Applied Technology (AISAT). This system comprises an infrared cold source, a monochrome CCD camera, a monochrome frame-grabber, a personal computer, and custom image acquisition, processing, registration, and matching software. The image processing algorithms are based on Mathematical Morphology. Registration is performed using rotation and translation with respect to the centroid of the two-dimensional domain of a hand. Vein patterns are stored as medial axis representations. Matching involves comparing a given medial axis pattern against a library of patterns using constrained sequential correlation. The matching is two-fold: a newly acquired signature is matched against a dilated library signature, and then the library signature is matched against the dilated acquired signature; this is necessary because of the positional noise exhibited by the back-of-hand veins. The results of a cross-matching experiment for a sample of 20 adults and more than 100 hand images are detailed. In addition, preliminary estimates of the false acceptance rate (FAR) and false rejection rate (FRR) for the prototype system are given. Fuzzy relaxation on an association graph is discussed as an alternative to sequential correlation for the matching of vein signatures. An example is provided (including a C program) illustrating the matching process for a pair of signatures obtained from the same hand. The example demonstrates the ability of the fuzzy relaxation method to deal with segmentation errors.
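The two-fold dilated matching described above can be sketched as follows. The representation (a signature as a set of medial-axis pixel coordinates), the square structuring element, and the tolerance radius are assumptions for illustration; the actual system uses constrained sequential correlation on images.

```python
def dilate(points, radius=1):
    """Morphological dilation of a pixel set with a square structuring element."""
    return {(x + dx, y + dy)
            for (x, y) in points
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)}

def directed_score(a, b, radius=1):
    """Fraction of points of signature a that fall inside the dilated b."""
    grown = dilate(b, radius)
    return sum(1 for p in a if p in grown) / len(a)

def match(acquired, library, radius=1):
    """Two-fold match: acquired vs dilated library AND library vs dilated
    acquired, which tolerates the positional noise of the veins."""
    return min(directed_score(acquired, library, radius),
               directed_score(library, acquired, radius))

vein_a = {(0, 0), (1, 0), (2, 1), (3, 1)}
vein_b = {(0, 1), (1, 1), (2, 2), (3, 2)}   # same pattern, shifted by noise
score = match(vein_a, vein_b)               # high despite the pixel offset
```

Matching in both directions matters: a short spurious branch in one signature lowers only one of the two directed scores, and taking the minimum keeps the overall match honest.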

    Image Understanding by Hierarchical Symbolic Representation and Inexact Matching of Attributed Graphs

    We study the symbolic representation of imagery information by a powerful global representation scheme in the form of Attributed Relational Graph (ARG), and propose new techniques for the extraction of such representation from spatial-domain images, and for performing the task of image understanding through the analysis of the extracted ARG representation. To achieve practical image understanding tasks, the system needs to comprehend the imagery information in a global form. Therefore, we propose a multi-layer hierarchical scheme for the extraction of global symbolic representation from spatial-domain images. The proposed scheme produces a symbolic mapping of the input data in terms of an output alphabet, whose elements are defined over global subimages. The proposed scheme uses a combination of model-driven and data-driven concepts. The model-driven principle is represented by a graph transducer, which is used to specify the alphabet at each layer in the scheme. A symbolic mapping is driven by the input data to map the input local alphabet into the output global alphabet. Through the iterative application of the symbolic transformational mapping at different levels of hierarchy, the system extracts a global representation from the image in the form of attributed relational graphs. Further processing and interpretation of the imagery information can then be performed on their ARG representation. We also propose an efficient approach for calculating a distance measure and finding the best inexact matching configuration between attributed relational graphs. For two ARGs, we define sequences of weighted error-transformations which, when performed on one ARG (or a subgraph of it), will produce the other ARG. A distance measure between two ARGs is defined as the weight of the sequence which possesses minimum total-weight. Moreover, this minimum-total-weight sequence defines the best inexact matching configuration between the two ARGs.
The global minimization over the possible sequences is performed by a dynamic programming technique; the approach shows good results for ARGs of practical sizes. The proposed system possesses the capability to infer the alphabets of the ARG representation which it uses. In the inference phase, the hierarchical scheme is usually driven by the input data only, which normally consist of images of model objects. It extracts the global alphabet of the ARG representation of the models. The extracted model representation is then used in the operation phase of the system to perform the mapping in the multi-layer scheme. We present our experimental results for utilizing the proposed system for locating objects in complex scenes.
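The minimum-weight transformation-sequence idea can be illustrated in a drastically simplified setting: attributed nodes flattened to a sequence, with weighted insert/delete/substitute transformations and the usual dynamic program. Full ARG matching also handles arcs and subgraphs; the class labels and transformation weights below are assumptions.

```python
def transformation_distance(a, b, w_ins=1.0, w_del=1.0, w_sub=1.5):
    """Weight of the minimum-total-weight sequence of error-transformations
    turning sequence a into sequence b (edit-distance dynamic program)."""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * w_del                       # delete all of a's prefix
    for j in range(1, m + 1):
        d[0][j] = j * w_ins                       # insert all of b's prefix
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = d[i - 1][j - 1] + (0.0 if a[i - 1] == b[j - 1] else w_sub)
            d[i][j] = min(sub,
                          d[i - 1][j] + w_del,
                          d[i][j - 1] + w_ins)
    return d[n][m]

# Identical "graphs" are at distance 0; one substituted attribute costs w_sub,
# which here is cheaper than a delete followed by an insert.
```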

    Semantic Similarity of Spatial Scenes

    The formalization of similarity in spatial information systems can unleash their functionality and contribute technology not only useful, but also desirable by broad groups of users. As a paradigm for information retrieval, similarity supersedes tedious querying techniques and unveils novel ways for user-system interaction by naturally supporting modalities such as speech and sketching. As a tool within the scope of a broader objective, it can facilitate such diverse tasks as data integration, landmark determination, and prediction making. This potential motivated the development of several similarity models within the geospatial and computer science communities. Despite the merit of these studies, their cognitive plausibility can be limited due to neglect of well-established psychological principles about properties and behaviors of similarity. Moreover, such approaches are typically guided by experience, intuition, and observation, thereby often relying on more narrow perspectives or restrictive assumptions that produce inflexible and incompatible measures. This thesis consolidates such fragmentary efforts and integrates them along with novel formalisms into a scalable, comprehensive, and cognitively-sensitive framework for similarity queries in spatial information systems. Three conceptually different similarity queries at the levels of attributes, objects, and scenes are distinguished. An analysis of the relationship between similarity and change provides a unifying basis for the approach and a theoretical foundation for measures satisfying important similarity properties such as asymmetry and context dependence. The classification of attributes into categories with common structural and cognitive characteristics drives the implementation of a small core of generic functions, able to perform any type of attribute value assessment. 
Appropriate techniques combine such atomic assessments to compute similarities at the object level and to handle more complex inquiries with multiple constraints. These techniques, along with a solid graph-theoretical methodology adapted to the particularities of the geospatial domain, provide the foundation for reasoning about scene similarity queries. Provisions are made so that all methods comply with major psychological findings about people’s perceptions of similarity. An experimental evaluation supplies the main result of this thesis, which separates psychological findings with a major impact on the results from those that can be safely incorporated into the framework through computationally simpler alternatives.
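The asymmetry property mentioned above is classically captured by Tversky's ratio model, in which shared and distinctive features are weighted unequally so that sim(a, b) ≠ sim(b, a). The feature sets and weights below are illustrative assumptions, not the thesis's actual measures.

```python
def tversky_similarity(a, b, alpha=0.8, beta=0.2):
    """Asymmetric set-based similarity: alpha weights the features of a
    missing from b, beta the reverse, so the measure is directional."""
    common = len(a & b)
    return common / (common + alpha * len(a - b) + beta * len(b - a))

city = {"buildings", "roads", "park", "river"}
village = {"buildings", "roads"}

# Comparing the sparser scene to the richer one yields higher similarity
# than the reverse, matching the directionality people exhibit when they
# judge a variant against a prototype.
forward = tversky_similarity(village, city)
backward = tversky_similarity(city, village)
```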

    Valued constraint satisfaction problems: Hard and easy problems

    In order to deal with over-constrained Constraint Satisfaction Problems, various extensions of the CSP framework have been considered by taking into account costs, uncertainties, preferences, priorities... Each extension uses a specific mathematical operator (+, max, ...) to aggregate constraint violations. In this paper, we consider a simple algebraic framework, related to Partial Constraint Satisfaction, which subsumes most of these proposals, and use it to characterize existing proposals in terms of rationality and computational complexity. We exhibit simple relationships between these proposals.
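The role of the aggregation operator can be shown on a toy example: the same assignment and the same violation costs yield different overall valuations depending on whether costs are summed (classic weighted CSP) or combined with max (fuzzy/possibilistic CSP). The toy constraints and costs are assumptions.

```python
def evaluate(assignment, constraints, aggregate):
    """Valuation of an assignment: aggregate the violation costs of all
    constraints with the framework's operator (sum, max, ...)."""
    return aggregate(c(assignment) for c in constraints)

# Two soft constraints: x and y should differ; x should be small.
constraints = [
    lambda a: 0 if a["x"] != a["y"] else 3,   # violation cost 3
    lambda a: 0 if a["x"] <= 1 else 1,        # violation cost 1
]

bad = {"x": 2, "y": 2}                        # violates both constraints
additive = evaluate(bad, constraints, sum)    # additive aggregation: 3 + 1
fuzzy = evaluate(bad, constraints, max)       # max aggregation: worst violation
```

The choice of operator is not cosmetic: with max, improving any violation below the worst one changes nothing (the "drowning" effect), which is one of the rationality issues such a unifying framework makes explicit.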

    LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning

    Current high-performance semantic segmentation models are purely data-driven sub-symbolic approaches and blind to the structured nature of the visual world. This is in stark contrast to human cognition, which abstracts visual perceptions at multiple levels and conducts symbolic reasoning with such structured abstraction. To fill these fundamental gaps, we devise LOGICSEG, a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge. In particular, the semantic concepts of interest are structured as a hierarchy, from which a set of constraints are derived for describing the symbolic relations and formalized as first-order logic rules. After fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training. During inference, logical constraints are packaged into an iterative process and injected into the network in the form of several matrix multiplications, so as to achieve hierarchy-coherent prediction with logic reasoning. These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models. Extensive experiments over four datasets with various segmentation models and backbones verify the effectiveness and generality of LOGICSEG. We believe this study opens a new avenue for visual semantic parsing. Comment: ICCV 2023 (Oral). Code: https://github.com/lingorX/LogicSeg
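A minimal sketch in the spirit of the hierarchy-coherent inference described above: per-class scores are mixed with their parents' and children's scores through a hierarchy matrix, applied as a plain matrix multiplication and iterated. The three-class hierarchy, the mixing weights, and the renormalization are assumptions, not LOGICSEG's actual message-passing rules.

```python
classes = ["animal", "cat", "dog"]   # "cat" and "dog" are children of "animal"

# M[i][j] = weight with which class j's score flows into class i.
M = [
    [1.0, 0.5, 0.5],   # animal absorbs evidence from its children
    [0.5, 1.0, 0.0],   # cat absorbs evidence from its parent
    [0.5, 0.0, 1.0],   # dog absorbs evidence from its parent
]

def refine(scores, n_iter=2):
    """Iteratively inject the hierarchy constraint as a matrix multiply,
    renormalizing so the scores stay a distribution."""
    s = list(scores)
    for _ in range(n_iter):
        s = [sum(M[i][j] * s[j] for j in range(len(s))) for i in range(len(s))]
        total = sum(s)
        s = [v / total for v in s]
    return s

raw = [0.2, 0.7, 0.1]    # network fairly confident the pixel is a cat
out = refine(raw)
# The parent class "animal" is pulled up alongside "cat", so the final
# prediction is coherent with the hierarchy rather than contradicting it.
```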

    Recognizing Partially Occluded Objects Using Information Extracted From Polygonal Approximation.

    This thesis addresses the problem of recognizing partially occluded two-dimensional objects. The goal is to develop a system which is able to identify and locate several overlapping objects in the scene. To achieve this goal, the system must perform the following specific tasks: (1) storing useful information about objects in some format, which is often referred to as the process of object representation or model formation; (2) matching based on the object representation; and (3) efficiently searching for the best match. This thesis presents a new approach to accomplish these tasks. Polygonal approximation is used to represent an object in this research. The accumulated lengths of line segments, s, and the accumulated sizes of turning angles, θ, along the boundary from some starting point are extracted. The boundary of an object is then described as an equation θ = f(s). As the algorithm shows, matching objects in s-θ space is simple and effective. To avoid exhaustive matching in the recognition process, index diagrams of the features characterizing the boundary are established. Once the features of some unknown object are detected, the possible objects which might produce the best matching can be efficiently retrieved from this scheme.
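The s-θ description can be sketched directly: walk the polygonal approximation, accumulating arc length s and signed turning angle θ at each vertex. The unit square below is an illustrative example; for any simple closed polygon the total turning comes out to ±2π.

```python
import math

def s_theta(polygon):
    """Return (s, theta) pairs along a closed polygon given as a vertex list:
    s is accumulated boundary length, theta is accumulated turning angle."""
    pairs = []
    s = 0.0
    theta = 0.0
    n = len(polygon)
    for i in range(n):
        p, q, r = polygon[i], polygon[(i + 1) % n], polygon[(i + 2) % n]
        s += math.dist(p, q)                       # length of edge p -> q
        a1 = math.atan2(q[1] - p[1], q[0] - p[0])  # heading into vertex q
        a2 = math.atan2(r[1] - q[1], r[0] - q[0])  # heading out of vertex q
        turn = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi  # signed turn
        theta += turn
        pairs.append((s, theta))
    return pairs

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rep = s_theta(square)   # perimeter 4, total turning 2*pi (counter-clockwise)
```

Because θ = f(s) is built from lengths and turns only, the representation is invariant to translation and rotation, which is what makes matching in s-θ space simple.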

    Modeling of remote sensing image content using attributed relational graphs

    Automatic content modeling and retrieval in remote sensing image databases are important and challenging problems. Statistical pattern recognition and computer vision algorithms concentrate on feature-based analysis and representations in pixel or region levels whereas syntactic and structural techniques focus on modeling symbolic representations for interpreting scenes. We describe a hybrid hierarchical approach for image content modeling and retrieval. First, scenes are decomposed into regions using pixel-based classifiers and an iterative split-and-merge algorithm. Next, spatial relationships of regions are computed using boundary, distance and orientation information based on different region representations. Finally, scenes are modeled using attributed relational graphs that combine region class information and spatial arrangements. We demonstrate the effectiveness of this approach in query scenarios that cannot be expressed by traditional approaches but where the proposed models can capture both feature and spatial characteristics of scenes and can retrieve similar areas according to their high-level semantic content. © Springer-Verlag Berlin Heidelberg 2006
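A scene model of the kind described above can be sketched as a small attributed relational graph: nodes carry region class information, arcs carry spatial relationships. The region names, classes, and relations are illustrative assumptions, not the paper's data.

```python
# Attributed relational graph for one segmented scene.
arg = {
    "nodes": {
        "r1": {"class": "water", "area": 1200},
        "r2": {"class": "city",  "area": 800},
        "r3": {"class": "field", "area": 2000},
    },
    "arcs": {
        ("r2", "r1"): {"relation": "bordering", "orientation": "south"},
        ("r3", "r2"): {"relation": "near", "distance": 340.0},
    },
}

def regions_with(arg, relation, to_class):
    """Retrieve regions holding a given spatial relation to a region class,
    e.g. 'cities bordering water' -- the kind of high-level semantic query
    that pixel- or region-feature retrieval alone cannot express."""
    return [a for (a, b), attrs in arg["arcs"].items()
            if attrs["relation"] == relation
            and arg["nodes"][b]["class"] == to_class]
```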

    Automatic Landmarking for Non-cooperative 3D Face Recognition

    This thesis describes a new framework for 3D surface landmarking and evaluates its performance for feature localisation on human faces. This framework has two main parts that can be designed and optimised independently. The first one is a keypoint detection system that returns positions of interest for a given mesh surface by using a learnt dictionary of local shapes. The second one is a labelling system, using model fitting approaches that establish a one-to-one correspondence between the set of unlabelled input points and a learnt representation of the class of object to detect. Our keypoint detection system returns local maxima over score maps that are generated from an arbitrarily large set of local shape descriptors. The distributions of these descriptors (scalars or histograms) are learnt for known landmark positions on a training dataset in order to generate a model. The similarity between the input descriptor value for a given vertex and a model shape is used as a descriptor-related score. Our labelling system can make use of both hypergraph matching techniques and rigid registration techniques to reduce the ambiguity attached to unlabelled input keypoints for which a list of model landmark candidates have been seeded. The soft matching techniques use multi-attributed hyperedges to reduce ambiguity, while the registration techniques use a scale-adapted rigid transformation computed from 3 or more points in order to obtain one-to-one correspondences. Our final system achieves results better than or comparable to the state of the art (depending on the metric) while being more generic. It does not require pre-processing such as cropping, spike removal and hole filling, and is more robust to occlusion of salient local regions, such as those near the nose tip and inner eye corners. It is also fully pose invariant and can be used with kinds of objects other than faces, provided that labelled training data is available.
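The keypoint-detection step, reduced to its core, keeps the vertices whose combined descriptor score is a local maximum over their mesh neighbourhood. The mesh is abstracted here to an adjacency dict; the scores and graph are assumptions for illustration.

```python
def local_maxima(scores, neighbours):
    """Return vertices scoring strictly higher than all their neighbours;
    these become the positions of interest passed to the labelling stage."""
    return [v for v, s in scores.items()
            if all(s > scores[n] for n in neighbours[v])]

# Toy score map over a 4-vertex path a - b - c - d.
scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.2}
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
keypoints = local_maxima(scores, neighbours)
```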