2,008 research outputs found

    Edge Potential Functions (EPF) and Genetic Algorithms (GA) for Edge-Based Matching of Visual Objects

    Edges are known to be a semantically rich representation of the contents of a digital image. Nevertheless, their use in practical applications is sometimes limited by computation and complexity constraints. In this paper, a new approach is presented that addresses the problem of matching visual objects in digital images by combining the concept of Edge Potential Functions (EPF) with a powerful matching tool based on Genetic Algorithms (GA). EPFs can be easily calculated starting from an edge map and provide a kind of attractive pattern for a matching contour, which is conveniently exploited by GAs. Several tests were performed in the framework of different image matching applications. The results achieved clearly outline the potential of the proposed method as compared to state-of-the-art methodologies. (c) 2007 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
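    The idea of deriving an attractive potential from an edge map can be illustrated with a minimal sketch. It assumes an inverse-distance potential by analogy with electrostatics, with every edge pixel acting as a unit charge; the function names, the `eps` constant, and the distance clamp are illustrative assumptions, not details taken from the paper. The mean potential along a candidate contour could then serve as a fitness value for a search procedure such as a GA, which is not implemented here.

```python
import math


def edge_potential(edge_map, eps=1.0):
    """Return an inverse-distance potential field for a binary edge map.

    Assumption: each edge pixel contributes 1 / (4 * pi * eps * distance),
    mimicking the potential of a unit point charge.
    """
    rows, cols = len(edge_map), len(edge_map[0])
    edges = [(r, c) for r in range(rows) for c in range(cols) if edge_map[r][c]]
    field = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for er, ec in edges:
                # Clamp the distance so the value on an edge pixel stays finite.
                dist = max(math.hypot(r - er, c - ec), 1.0)
                total += 1.0 / (4.0 * math.pi * eps * dist)
            field[r][c] = total
    return field


def contour_score(field, contour):
    """Mean potential along a contour given as (row, col) points.

    Points outside the image contribute zero, so contours that leave the
    image are penalised.
    """
    if not contour:
        return 0.0
    rows, cols = len(field), len(field[0])
    total = sum(field[r][c] for r, c in contour if 0 <= r < rows and 0 <= c < cols)
    return total / len(contour)


# Usage: a vertical edge in column 2 of a 5x5 image.
edge_map = [[0, 0, 1, 0, 0] for _ in range(5)]
field = edge_potential(edge_map)
on_edge = [(r, 2) for r in range(5)]
off_edge = [(r, 0) for r in range(5)]
print(contour_score(field, on_edge) > contour_score(field, off_edge))
```

    A contour lying on the edge collects a higher mean potential than one placed away from it, which is the "attraction" the abstract refers to.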

    Partial shape matching using CCP map and weighted graph transformation matching

    Matching images and detecting similarity or dissimilarity between them is a fundamental problem in image processing. Different matching algorithms are used in the literature to solve this problem. Despite their novelty, these algorithms are mostly inefficient and cannot perform properly in noisy situations. In this thesis, we solve most of the problems of previous methods by using a reliable algorithm for segmenting the image contour map, called the CCP map, and a new matching method. In our algorithm, we use a local shape descriptor that is very fast to compute, invariant to affine transforms, and robust for non-rigid objects and occlusion. After finding the best match for each contour, we need to verify whether the contours are correctly matched. For this purpose, we use the Weighted Graph Transformation Matching (WGTM) approach, which is capable of removing outliers based on their adjacency and geometrical relationships. WGTM works properly for both rigid and non-rigid objects and is robust to high order distortions. To evaluate our method, the ETHZ dataset, which includes five diverse classes of objects (bottles, swans, mugs, giraffes, Apple logos), is used. Finally, our method is compared to several well-known methods proposed by other researchers in the literature. While our method shows results comparable to the other benchmarks in terms of recall and the precision of boundary localization, it significantly improves the average precision for all of the categories in the ETHZ dataset.

    Object detection and activity recognition in digital image and video libraries

    This thesis is a comprehensive study of object-based image and video retrieval, specifically for car and human detection and activity recognition purposes. The thesis focuses on the problem of connecting low level features to high level semantics by developing relational object and activity presentations. With the rapid growth of multimedia information in the form of digital image and video libraries, there is an increasing need for intelligent database management tools. The traditional text-based query systems based on a manual annotation process are impractical for today's large libraries, which require an efficient information retrieval system. For this purpose, a hierarchical information retrieval system is proposed where shape, color and motion characteristics of objects of interest are captured in compressed and uncompressed domains. The proposed retrieval method provides object detection and activity recognition at different resolution levels, from low complexity to low false rates. The thesis first examines extraction of low level features from images and videos using intensity, color and motion of pixels and blocks. Local consistency based on these features and geometrical characteristics of the regions is used to group object parts. The problem of managing the segmentation process is solved by a new approach that uses object-based knowledge in order to group the regions according to a global consistency. A new model-based segmentation algorithm is introduced that uses feedback from the relational representation of the object. The selected unary and binary attributes are further extended for application-specific algorithms. Object detection is achieved by matching the relational graphs of objects with the reference model. The major advantages of the algorithm can be summarized as improving the object extraction by reducing the dependence on the low level segmentation process and combining the boundary and region properties.
    The thesis then addresses the problem of object detection and activity recognition in the compressed domain in order to reduce computational complexity. New algorithms for object detection and activity recognition in JPEG images and MPEG videos are developed. It is shown that significant information can be obtained from the compressed domain in order to connect to high level semantics. Since our aim is to retrieve information from images and videos compressed using standard algorithms such as JPEG and MPEG, our approach differs from previous compressed domain object detection techniques, where the compression algorithms are governed by characteristics of the object of interest to be retrieved. An algorithm is developed that uses the principal component analysis of MPEG motion vectors to detect human activities, namely walking, running, and kicking. Object detection in JPEG compressed still images and MPEG I frames is achieved by using DC-DCT coefficients of the luminance and chrominance values in the graph-based object detection algorithm. The thesis finally addresses the problem of object detection in lower resolution and monochrome images. Specifically, it is demonstrated that the structural information of human silhouettes can be captured from AC-DCT coefficients.

    Review of Person Re-identification Techniques

    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher accuracy rate and lower computational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared. Comment: Published 201

    An Overview of Advances of Pattern Recognition Systems in Computer Vision

    26 pages. First of all, let's give a tentative answer to the following question: what is pattern recognition (PR)? Among all the possible answers, the one we consider best suited to the situation and to the concern of this chapter is: "pattern recognition is the scientific discipline of machine learning (or artificial intelligence) that aims at classifying data (patterns) into a number of categories or classes". But what is a pattern? A pattern recognition system (PRS) is an automatic system that aims at classifying the input pattern into a specific class. It proceeds in two successive tasks: (1) the analysis (or description), which extracts the characteristics from the pattern being studied, and (2) the classification (or recognition), which enables us to recognise an object (or a pattern) by using some characteristics derived from the first task.
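    The two successive tasks can be sketched in a few lines. The example below is only an illustration under stated assumptions: the features (mean and value range of a 1D signal) and the nearest-centroid classifier are arbitrary choices made for brevity, and all function names are hypothetical rather than taken from the chapter.

```python
def extract_features(signal):
    """Task 1 (analysis): describe a pattern by its mean and value range."""
    mean = sum(signal) / len(signal)
    return (mean, max(signal) - min(signal))


def train_centroids(labelled_patterns):
    """Average the feature vectors of each class to get one centroid per class."""
    sums = {}
    for signal, label in labelled_patterns:
        f = extract_features(signal)
        acc, count = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = ((acc[0] + f[0], acc[1] + f[1]), count + 1)
    return {label: (acc[0] / n, acc[1] / n) for label, (acc, n) in sums.items()}


def classify(signal, centroids):
    """Task 2 (classification): pick the class whose centroid is nearest."""
    f = extract_features(signal)

    def squared_distance(label):
        cx, cy = centroids[label]
        return (f[0] - cx) ** 2 + (f[1] - cy) ** 2

    return min(centroids, key=squared_distance)


# Usage with two toy classes of 1D patterns.
training = [
    ([1, 1, 1, 1], "flat"),
    ([2, 2, 2, 2], "flat"),
    ([0, 5, 0, 5], "spiky"),
    ([0, 6, 0, 6], "spiky"),
]
centroids = train_centroids(training)
print(classify([2, 2, 2, 3], centroids))
print(classify([0, 4, 0, 4], centroids))
```

    Keeping the analysis step in its own function mirrors the split described above: the classifier only ever sees the extracted characteristics, never the raw pattern.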

    STV-based Video Feature Processing for Action Recognition

    Compared with still-image-based processes, video features can provide rich and intuitive information about dynamic events that occur over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progress has been made in image processing over the last decade, with successful applications in face matching and object recognition, video-based event detection remains one of the most difficult challenges in computer vision research because of its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and often ambiguous analytical methods. In this paper, a 3D shape-matching method based on Spatio-Temporal Volumes (STV) and region intersection (RI) is proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stem from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. The paper also reports an investigation into techniques for efficient STV data filtering to reduce the number of voxels (volumetric pixels) that need to be processed in each operational cycle of the implemented system. The encouraging features and the improvements in operational performance registered in the experiments are discussed at the end.
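    As a rough illustration of region intersection between spatio-temporal volumes, the sketch below stacks per-frame binary silhouettes into a set of (x, y, t) voxels and measures the overlap of two such sets. It assumes a plain intersection-over-union ratio; the coefficient factor boosting described in the abstract is not reproduced, and the function names are hypothetical.

```python
def build_stv(frames):
    """Stack per-frame binary silhouettes into a set of (x, y, t) voxels."""
    return {
        (x, y, t)
        for t, frame in enumerate(frames)
        for y, row in enumerate(frame)
        for x, value in enumerate(row)
        if value
    }


def region_intersection(stv_a, stv_b):
    """Overlap of two voxel sets as intersection over union, in [0, 1]."""
    union = stv_a | stv_b
    if not union:
        return 0.0
    return len(stv_a & stv_b) / len(union)


# Usage: two tiny 2-frame clips of 2x2 silhouettes.
clip_a = [[[1, 1], [0, 0]], [[1, 1], [0, 0]]]
clip_b = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
stv_a, stv_b = build_stv(clip_a), build_stv(clip_b)
print(region_intersection(stv_a, stv_b))  # 0.75
```

    Representing the volume as a sparse set of occupied voxels means that only foreground voxels are stored and compared, which is in the spirit of the voxel reduction mentioned above, although the paper's own filtering techniques are not specified here.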

    Shape description and matching using integral invariants on eccentricity transformed images

    Matching occluded and noisy shapes is a problem frequently encountered in medical image analysis and more generally in computer vision. To keep track of changes inside the breast, for example, it is important for a computer-aided detection system to establish correspondences between regions of interest. Shape transformations, computed both with integral invariants (II) and with geodesic distance, yield signatures that are invariant to isometric deformations, such as bending and articulations. Integral invariants describe the boundaries of planar shapes. However, they provide no information about where a particular feature lies on the boundary with regard to the overall shape structure. Conversely, eccentricity transforms (Ecc) can match shapes by signatures of geodesic distance histograms based on information from inside the shape, but they ignore the boundary information. We describe a method that combines the boundary signature of a shape obtained from II and structural information from the Ecc to yield results that improve on either method used alone.
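    To make the eccentricity transform concrete, here is a minimal sketch that assigns to each pixel of a binary shape the largest geodesic distance to any other pixel reachable inside the shape. It assumes that geodesic distance is approximated by 4-connected breadth-first search steps on the pixel grid, which is a simplification chosen for brevity and not necessarily the distance computation used by the authors; the function names are hypothetical.

```python
from collections import Counter, deque


def eccentricity_transform(mask):
    """Map each shape pixel to its largest in-shape path length (in steps)."""
    rows, cols = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    ecc = {}
    for start in pixels:
        # Breadth-first search restricted to pixels inside the shape.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and mask[nr][nc] and (nr, nc) not in dist:
                    dist[(nr, nc)] = dist[(r, c)] + 1
                    queue.append((nr, nc))
        ecc[start] = max(dist.values())
    return ecc


def eccentricity_histogram(ecc):
    """Count how many pixels share each eccentricity value."""
    return Counter(ecc.values())


# Usage: an L-shaped region of three pixels.
mask = [[1, 0],
        [1, 1]]
ecc = eccentricity_transform(mask)
print(ecc)
print(eccentricity_histogram(ecc))
```

    Because the distances are measured along paths inside the shape, bending an articulated shape leaves the histogram largely unchanged, which is the invariance property the abstract relies on.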

    Using Raster Sketches for Digital Image Retrieval

    This research addresses the problem of content-based image retrieval using queries on image-object shape, completely in the raster domain. It focuses on the particularities of image databases encountered in typical topographic applications and presents the development of an environment for visual information management that enables such queries. The query consists of a user-provided raster sketch of the shape of an imaged object. The objective of the search is to retrieve images that contain an object sufficiently similar to the one specified in the query. The contribution of this work is the design of a comprehensive on-line query access strategy for digital image databases, built on a feature library, an image library, a metadata library, and the necessary matching tools. The matching algorithm is inspired by least-squares matching (lsm) and extends lsm to function with a variety of raster representations. The image retrieval strategy makes use of a hierarchical organization of linked feature (image-object) shapes within the feature library. The query results are ranked according to statistical scores, and the user can subsequently narrow or broaden the search according to the previously obtained results and the purpose of the search.