
    A graph theoretic approach to scene matching

    The ability to match two scenes is a fundamental requirement in a variety of computer vision tasks. A graph theoretic approach to inexact scene matching is presented which is useful in dealing with problems due to imperfect image segmentation. A scene is described by a set of graphs, with nodes representing objects and arcs representing relationships between objects. Each node has a set of values representing the relations between pairs of objects, such as angle, adjacency, or distance. With this method of scene representation, the task in scene matching is to match two sets of graphs. Because of segmentation errors, variations in camera angle, illumination, and other conditions, an exact match between the sets of observed and stored graphs is usually not possible. In the developed approach, the problem is represented as an association graph, in which each node represents a possible mapping of an observed region to a stored object, and each arc represents the compatibility of two mappings. Nodes and arcs have weights indicating the merit of a region-object mapping and the degree of compatibility between two mappings. A match between the two graphs corresponds to a clique, or fully connected subgraph, in the association graph. The task is to find the clique that represents the best match. Fuzzy relaxation is used to update the node weights using the contextual information contained in the arcs and neighboring nodes. This simplifies the evaluation of cliques. A method of handling oversegmentation and undersegmentation problems is also presented. The approach is tested with a set of realistic images which exhibit many types of segmentation errors.
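    A minimal sketch of the association-graph idea described above, assuming illustrative merit and compatibility functions (the data structures and the relaxation rule below are not taken from the paper): each node is a candidate region-to-object mapping with a merit weight, and one fuzzy-relaxation pass blends each node's weight with the support it receives from compatible neighbors.

```python
# Sketch of an association graph for inexact scene matching (illustrative only).
# Nodes are (region, object) mappings with merit weights; arcs carry compatibility
# weights between non-conflicting mappings.
import itertools

def build_association_graph(region_object_merit, compatibility):
    """region_object_merit: dict mapping (region, obj) -> merit in [0, 1].
    compatibility: function ((r1, o1), (r2, o2)) -> compatibility in [0, 1]."""
    nodes = dict(region_object_merit)                 # node -> weight
    arcs = {}                                         # (node_a, node_b) -> weight
    for a, b in itertools.combinations(nodes, 2):
        if a[0] != b[0] and a[1] != b[1]:             # mappings must not conflict
            arcs[(a, b)] = compatibility(a, b)
    return nodes, arcs

def fuzzy_relaxation_step(nodes, arcs, alpha=0.5):
    """Blend each node's weight with the average support of compatible neighbors."""
    support = {n: [] for n in nodes}
    for (a, b), w in arcs.items():
        support[a].append(w * nodes[b])
        support[b].append(w * nodes[a])
    updated = {}
    for n, w in nodes.items():
        ctx = sum(support[n]) / len(support[n]) if support[n] else 0.0
        updated[n] = (1 - alpha) * w + alpha * ctx
    return updated

# Tiny illustrative usage with made-up merits and a constant compatibility.
merits = {("r1", "door"): 0.9, ("r1", "window"): 0.4, ("r2", "window"): 0.8}
nodes, arcs = build_association_graph(merits, lambda a, b: 0.7)
print(fuzzy_relaxation_step(nodes, arcs))
```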

    Efficient Subgraph Isomorphism using Graph Topology

    Subgraph isomorphism or subgraph matching is generally considered an NP-complete problem, made more complex in practical applications where the edge weights take real values and are subject to measurement noise and possible anomalies. To the best of our knowledge, almost all subgraph matching methods utilize node labels to perform node-node matching. In the absence of such labels (in applications such as image matching and map matching, among others), these subgraph matching methods do not work. We propose a method for identifying the node correspondence between a subgraph and a full graph in the inexact case without node labels in two steps: (a) extract the minimal unique topology-preserving subset from the subgraph and find its feasible matching in the full graph, and (b) implement a consensus-based algorithm to expand the matched node set by pairing unique paths based on boundary commutativity. Going beyond the existing subgraph matching approaches, the proposed method is shown to have realistically sub-linear computational efficiency, robustness to random measurement noise, and good statistical properties. Our method is also readily applicable to the exact matching case without loss of generality. To demonstrate the effectiveness of the proposed method, a simulation and a case study are performed on Erdos-Renyi random graphs and the image-based affine covariant features dataset, respectively. (Comment: authors contributed equally; names listed in alphabetical order.)
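    For orientation only, a sketch of the exact, label-free case that the authors note their approach also covers, using NetworkX's VF2 matcher; this is a standard baseline, not the paper's two-step consensus method, and the graph sizes are arbitrary.

```python
# Baseline sketch: exact, label-free subgraph isomorphism with NetworkX's VF2 matcher.
import networkx as nx
from networkx.algorithms import isomorphism

full = nx.erdos_renyi_graph(50, 0.1, seed=1)       # stand-in "full" graph
sub = full.subgraph([0, 1, 2, 3, 4]).copy()        # a known subgraph to recover

matcher = isomorphism.GraphMatcher(full, sub)      # no node labels are used
if matcher.subgraph_is_isomorphic():
    # Mapping from full-graph nodes to subgraph nodes for one feasible match.
    print(matcher.mapping)
```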

    A word image coding technique and its applications in information retrieval from imaged documents

    Master's thesis (Master of Science).

    A Digitization and Conversion Tool for Imaged Drawings to Intelligent Piping and Instrumentation Diagrams (P&ID)

    In the Fourth Industrial Revolution, artificial intelligence technology and big data science are emerging rapidly. To apply these informational technologies to the engineering industries, it is essential to digitize the data that are currently archived in image or hard-copy format. For previously created design drawings, consistency between design products is lost in the digitization process, and the accuracy and reliability of equipment and material estimates derived from the digitized drawings are remarkably low. In this paper, we propose a method and system for automatically recognizing and extracting design information from imaged piping and instrumentation diagram (P&ID) drawings and automatically generating digitized drawings based on the extracted data, using digital image processing techniques such as template matching and the sliding window method. First, the symbols are recognized by template matching, extracted from the imaged P&ID drawing, and registered automatically in the database. Then, lines and text are recognized and extracted from the imaged P&ID drawing using the sliding window method and aspect ratio calculation, respectively. The extracted symbols for equipment and lines are associated with the attributes of the closest text and are stored in the database in a neutral format. The extracted data are mapped to the predefined intelligent P&ID information and transformed to commercial P&ID tool formats with the associated information stored. As illustrated through the validation case studies, with the intelligent digitized drawings generated by the above automatic conversion system the consistency of the design product is maintained, and the problems experienced with the traditional and manual P&ID input method by engineering companies, such as time consumption, missing items, and misspellings, are solved through the final fine-tune validation process.
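    A rough sketch of the symbol-recognition step described above, using OpenCV template matching; the file names and the score threshold are illustrative assumptions, and the paper's full pipeline additionally handles line and text extraction and database registration.

```python
# Sketch: candidate symbol detection on an imaged P&ID via template matching.
import cv2
import numpy as np

drawing = cv2.imread("pid_drawing.png", cv2.IMREAD_GRAYSCALE)    # imaged P&ID (assumed file)
template = cv2.imread("valve_symbol.png", cv2.IMREAD_GRAYSCALE)  # one symbol template (assumed file)
h, w = template.shape

scores = cv2.matchTemplate(drawing, template, cv2.TM_CCOEFF_NORMED)
ys, xs = np.where(scores >= 0.8)                                 # assumed similarity threshold

detections = [(x, y, w, h) for x, y in zip(xs, ys)]              # candidate symbol boxes
print(f"{len(detections)} candidate symbol locations")
```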

    Some methods of encoding simple visual images for use with a sparse distributed memory, with applications to character recognition

    To study the problems of encoding visual images for use with a Sparse Distributed Memory (SDM), I consider a specific class of images: those that consist of several pieces, each of which is a line segment or an arc of a circle. This class includes line drawings of characters such as letters of the alphabet. I give a method of representing a segment or an arc by five numbers in a continuous way; that is, similar arcs have similar representations. I also give methods for encoding these numbers as bit strings in an approximately continuous way. The set of possible segments and arcs may be viewed as a five-dimensional manifold M, whose structure is like a Möbius strip. An image, considered to be an unordered set of segments and arcs, is therefore represented by a set of points in M, one for each piece. I then discuss the problem of constructing a preprocessor to find the segments and arcs in these images, although a preprocessor has not been developed. I also describe a possible extension of the representation.
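    To make the "approximately continuous" bit-string encoding concrete, here is a sketch using a thermometer code for a single scalar parameter; the thesis's actual five-number parametrization and encoding scheme are not reproduced here, this only illustrates the property that nearby values share most of their bits.

```python
# Sketch: approximately continuous bit-string encoding of a scalar (thermometer code),
# in the spirit of encoding arc/segment parameters for an SDM.
def thermometer_encode(value, lo, hi, n_bits=32):
    """Map value in [lo, hi] to a bit string whose Hamming distance grows with |difference|."""
    value = min(max(value, lo), hi)
    k = round((value - lo) / (hi - lo) * n_bits)
    return [1] * k + [0] * (n_bits - k)

a = thermometer_encode(0.50, 0.0, 1.0)
b = thermometer_encode(0.55, 0.0, 1.0)
print(sum(x != y for x, y in zip(a, b)))   # small Hamming distance for similar values
```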

    Elastic shape analysis of geometric objects with complex structures and partial correspondences

    In this dissertation, we address the development of elastic shape analysis frameworks for the registration, comparison and statistical shape analysis of geometric objects with complex topological structures and partial correspondences. In particular, we introduce a variational framework and several numerical algorithms for the estimation of geodesics and distances induced by higher-order elastic Sobolev metrics on the space of parametrized and unparametrized curves and surfaces. We extend our framework to the setting of shape graphs (i.e., geometric objects with branching structures where each branch is a curve) and surfaces with complex topological structures and partial correspondences. To do so, we leverage the flexibility of varifold fidelity metrics in order to augment our geometric objects with a spatially-varying weight function, which in turn enables us to indirectly model topological changes and handle partial matching constraints via the estimation of vanishing weights within the registration process. In the setting of shape graphs, we prove the existence of solutions to the relaxed registration problem with weights, which is the main theoretical contribution of this thesis. In the setting of surfaces, we leverage our surface matching algorithms to develop a comprehensive collection of numerical routines for the statistical shape analysis of sets of 3D surfaces, which includes algorithms to compute Karcher means, perform dimensionality reduction via multidimensional scaling and tangent principal component analysis, and estimate parallel transport across surfaces (possibly with partial matching constraints). Moreover, we also address the development of numerical shape analysis pipelines for large-scale data-driven applications with geometric objects. Towards this end, we introduce a supervised deep learning framework to compute the square-root velocity (SRV) distance for curves. Our trained network provides fast and accurate estimates of the SRV distance between pairs of geometric curves, without the need to find optimal reparametrizations. As a proof of concept for the suitability of such approaches in practical contexts, we use it to perform optical character recognition (OCR), achieving comparable performance in terms of computational speed and accuracy to other existing OCR methods. Lastly, we address the difficulty of extracting high quality shape structures from imaging data in the field of astronomy. To do so, we present a state-of-the-art expectation-maximization approach for the challenging task of multi-frame astronomical image deconvolution and super-resolution. We leverage our approach to obtain a high-fidelity reconstruction of the night sky, from which high quality shape data can be extracted using appropriate segmentation and photometric techniques.
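    For reference, a minimal numerical sketch of the square-root velocity (SRV) representation and the resulting L2 distance for fixed parametrizations; the dissertation's network predicts this distance without the optimal reparametrization search, which is omitted here, and the sample curves are arbitrary.

```python
# Sketch: discrete SRV representation q = c' / sqrt(|c'|) and L2 distance between two curves.
import numpy as np

def srv(curve):
    """curve: (n, d) array of sampled points; returns the discrete SRV transform."""
    v = np.gradient(curve, axis=0)                     # discrete derivative along the curve
    speed = np.linalg.norm(v, axis=1, keepdims=True)
    return v / np.sqrt(np.maximum(speed, 1e-12))

def srv_distance(c1, c2):
    """L2 distance between SRV representations, for fixed (matching) parametrizations."""
    q1, q2 = srv(c1), srv(c2)
    return np.sqrt(np.sum((q1 - q2) ** 2) / len(q1))

t = np.linspace(0, 1, 100)[:, None]
line = np.hstack([t, t])                               # straight segment
arc = np.hstack([np.sin(t), 1 - np.cos(t)])            # quarter-circle-like curve
print(srv_distance(line, arc))
```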

    Information Preserving Processing of Noisy Handwritten Document Images

    Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines impact people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error of counting them from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.
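    As a simplified stand-in for the ruling-line detection step (not the paper's multi-line linear regression or the HMM baseline), the sketch below counts ruling lines in a binarized form image from peaks in the horizontal ink projection and estimates the uniform spacing with a least-squares fit; the thresholds and the synthetic test page are assumptions.

```python
# Sketch: simplified ruling-line counting on a binarized form image.
import numpy as np

def count_ruling_lines(binary_image, min_ink_fraction=0.5, min_gap=10):
    """binary_image: 2D array with 1 = ink. Returns (line_count, estimated_spacing)."""
    profile = binary_image.sum(axis=1) / binary_image.shape[1]   # ink fraction per row
    candidate_rows = np.flatnonzero(profile >= min_ink_fraction)
    lines = []
    for r in candidate_rows:                                     # collapse adjacent rows
        if not lines or r - lines[-1] > min_gap:
            lines.append(r)
    if len(lines) < 2:
        return len(lines), None
    idx = np.arange(len(lines))
    spacing, _ = np.polyfit(idx, lines, 1)                       # slope = average line spacing
    return len(lines), spacing

page = np.zeros((200, 100), dtype=int)
page[40::30, :] = 1                                              # synthetic ruling lines every 30 rows
print(count_ruling_lines(page))
```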