936 research outputs found

    Content Recognition and Context Modeling for Document Analysis and Retrieval

    Get PDF
    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (\eg, signatures and logos) provides a practical and reliable supplement to the OCR recognition of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance

    Pattern Recognition

    Get PDF
    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional, the processing is done in real- time or takes hours and days, some systems look for one narrow object class, others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition

    Multimodal Image Fusion and Its Applications.

    Full text link
    Image fusion integrates different modality images to provide comprehensive information of the image content, increasing interpretation capabilities and producing more reliable results. There are several advantages of combining multi-modal images, including improving geometric corrections, complementing data for improved classification, and enhancing features for analysis...etc. This thesis develops the image fusion idea in the context of two domains: material microscopy and biomedical imaging. The proposed methods include image modeling, image indexing, image segmentation, and image registration. The common theme behind all proposed methods is the use of complementary information from multi-modal images to achieve better registration, feature extraction, and detection performances. In material microscopy, we propose an anomaly-driven image fusion framework to perform the task of material microscopy image analysis and anomaly detection. This framework is based on a probabilistic model that enables us to index, process and characterize the data with systematic and well-developed statistical tools. In biomedical imaging, we focus on the multi-modal registration problem for functional MRI (fMRI) brain images which improves the performance of brain activation detection.PhDElectrical Engineering: SystemsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120701/1/yuhuic_1.pd

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    A comprehensive review of fruit and vegetable classification techniques

    Get PDF
    Recent advancements in computer vision have enabled wide-ranging applications in every field of life. One such application area is fresh produce classification, but the classification of fruit and vegetable has proven to be a complex problem and needs to be further developed. Fruit and vegetable classification presents significant challenges due to interclass similarities and irregular intraclass characteristics. Selection of appropriate data acquisition sensors and feature representation approach is also crucial due to the huge diversity of the field. Fruit and vegetable classification methods have been developed for quality assessment and robotic harvesting but the current state-of-the-art has been developed for limited classes and small datasets. The problem is of a multi-dimensional nature and offers significantly hyperdimensional features, which is one of the major challenges with current machine learning approaches. Substantial research has been conducted for the design and analysis of classifiers for hyperdimensional features which require significant computational power to optimise with such features. In recent years numerous machine learning techniques for example, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Decision Trees, Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) have been exploited with many different feature description methods for fruit and vegetable classification in many real-life applications. This paper presents a critical comparison of different state-of-the-art computer vision methods proposed by researchers for classifying fruit and vegetable

    Disentangling Geometric Deformation Spaces in Generative Latent Shape Models

    Full text link
    A complete representation of 3D objects requires characterizing the space of deformations in an interpretable manner, from articulations of a single instance to changes in shape across categories. In this work, we improve on a prior generative model of geometric disentanglement for 3D shapes, wherein the space of object geometry is factorized into rigid orientation, non-rigid pose, and intrinsic shape. The resulting model can be trained from raw 3D shapes, without correspondences, labels, or even rigid alignment, using a combination of classical spectral geometry and probabilistic disentanglement of a structured latent representation space. Our improvements include more sophisticated handling of rotational invariance and the use of a diffeomorphic flow network to bridge latent and spectral space. The geometric structuring of the latent space imparts an interpretable characterization of the deformation space of an object. Furthermore, it enables tasks like pose transfer and pose-aware retrieval without requiring supervision. We evaluate our model on its generative modelling, representation learning, and disentanglement performance, showing improved rotation invariance and intrinsic-extrinsic factorization quality over the prior model.Comment: 22 page

    Analysis of textural image features for content based retrieval

    Get PDF
    Digital archaelogy and virtual reality with archaeological artefacts have been quite hot research topics in the last years 55,56 . This thesis is a preperation study to build the background knowledge required for the research projects, which aim to computerize the reconstruction of the archaelogical data like pots, marbles or mosaic pieces by shape and ex ural features. Digitalization of the cultural heritage may shorten the reconstruction time which takes tens of years currently 61 ; it will improve the reconstruction robustness by incorporating with the literally available machine vision algorithms and experiences from remote experts working on a no-cost virtual object together. Digitalization can also ease the exhibition of the results for regular people, by multiuser media applications like internet based virtual museums or virtual tours. And finally, it will make possible to archive values with their original texture and shapes for long years far away from the physical risks that the artefacts currently face. On the literature 1,2,3,5,8,11,14,15,16 , texture analysis techniques have been throughly studied and implemented for the purpose of defect analysis purposes by image processing and machine vision scientists. In the last years, these algorithms have been started to be used for similarity analysis of content based image retrieval 1,4,10 . For retrieval systems, the concurrent problems seem to be building efficient and fast systems, therefore, robust image features haven't been focused enough yet. This document is the first performance review of the texture algorithms developed for retrieval and defect analysis together. The results and experiences gained during the thesis study will be used to support the studies aiming to solve the 2D puzzle problem using textural continuity methods on archaelogical artifects, Appendix A for more detail. The first chapter is devoted to learn how the medicine and psychology try to explain the solutions of similiarity and continuity analysis, which our biological model, the human vision, accomplishes daily. In the second chapter, content based image retrieval systems, their performance criterias, similiarity distance metrics and the systems available have been summarized. For the thesis work, a rich texture database has been built, including over 1000 images in total. For the ease of the users, a GUI and a platform that is used for content based retrieval has been designed; The first version of a content based search engine has been coded which takes the source of the internet pages, parses the metatags of images and downloads the files in a loop controlled by our texture algorithms. The preprocessing algorithms and the pattern analysis algorithms required for the robustness of the textural feature processing have been implemented. In the last section, the most important textural feature extraction methods have been studied in detail with the performance results of the codes written in Matlab and run on different databases developed

    Using contour information and segmentation for object registration, modeling and retrieval

    Get PDF
    This thesis considers different aspects of the utilization of contour information and syntactic and semantic image segmentation for object registration, modeling and retrieval in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic w ord recognition in handwritten historical manuscripts and shape registration. Also, the thesis explores the feasibility of contour-based syntactic features for improving the correspondence of the output of bottom-up segmentation to semantic objects present in the scene and discusses the feasibility of different strategies for image analysis utilizing contour information, e.g. segmentation driven by visual features versus segmentation driven by shape models or semi-automatic in selected application scenarios. There are three contributions in this thesis. The first contribution considers structure analysis based on the shape and spatial configuration of image regions (socalled syntactic visual features) and their utilization for automatic image segmentation. The second contribution is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis providing the basis for the third contribution which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach has been analyzed based on extensive rigorous experimentation using as large as possible test collections

    Engineering order-disorder transitions at the surface of topological insulators to manipulate electronic transport

    Get PDF
    The theoretical discovery and experimental realisation of topological insulators, a new state of matter, has brought enormous research interest in the field of condensed matter physics. The novel and fascinating phenomenon in topological insulators (TIs) is the existence of gapless surface states (or edge states) while the bulk interior has an energy gap. These states are intriguing because they enable the transport of charges with no backscattering, which means less heat generation than ordinary conductors. The unique transport in TIs arises because of the spin-momentum locking of the surface states. Thus, the promising properties of the TIs have the potential for applications in low-energy quantum electronics and spintronics. The research is motivated by a recent series of theories discussing the existence of amorphous topological materials and exploring the practical applications of the TIs in the semiconductor industry. The current research has extensively studied the feasibility of ion-beam patterning of a 3D strong topological insulator Sb2Te3. Subsequently, both in experiment and theory, this thesis proposes a novel method of engineering topological surface and edge states based on controlling the topological phase transition of Sb2Te3 with a focused ion-beam
    • 

    corecore