2,071 research outputs found

    Word matching using single closed contours for indexing handwritten historical documents

    Effective indexing is crucial for providing convenient access to scanned versions of large collections of historically valuable handwritten manuscripts. Since traditional handwriting recognizers based on optical character recognition (OCR) do not perform well on historical documents, a holistic word recognition approach has recently gained popularity as an attractive and more straightforward solution (Lavrenko et al. in Proc. Document Image Analysis for Libraries (DIAL’04), pp. 278–287, 2004). Such techniques attempt to recognize words based on scalar and profile-based features extracted from whole word images. In this paper, we propose a new approach to holistic word recognition for historical handwritten manuscripts based on matching word contours instead of whole images or word profiles. The new method consists of robust extraction of closed word contours and the application of an elastic contour matching technique proposed originally for general shapes (Adamek and O’Connor in IEEE Trans Circuits Syst Video Technol 5:2004). We demonstrate that multiscale contour-based descriptors can effectively capture intrinsic word features while avoiding any segmentation of words into smaller subunits. Our experiments show a recognition accuracy of 83%, which considerably exceeds the performance of other systems reported in the literature.

    Maximum Energy Subsampling: A General Scheme For Multi-resolution Image Representation And Analysis

    Image descriptors play an important role in image representation and analysis. Multi-resolution image descriptors can effectively characterize complex images and extract their hidden information. Wavelet descriptors have been widely used in multi-resolution image analysis. However, making the wavelet transform shift and rotation invariant produces redundancy and requires complex matching processes. Other multi-resolution descriptors usually depend on additional theories or information, such as a filtering function or prior domain knowledge, which not only increases the computational complexity but also introduces errors. We propose a novel multi-resolution scheme that is capable of transforming any kind of image descriptor into its multi-resolution structure with high computational accuracy and efficiency. Our multi-resolution scheme is based on sub-sampling an image into an odd-even image tree. By applying image descriptors to the odd-even image tree, we obtain the corresponding multi-resolution image descriptors. Multi-resolution analysis is based on downsampling expansion with maximum energy extraction followed by upsampling reconstruction. Since the maximum energy is usually retained in the lowest-frequency coefficients, we perform maximum energy extraction by keeping the lowest coefficients from each resolution level. Our multi-resolution scheme can analyze images recursively and effectively without introducing artifacts or changes to the original images, produce multi-resolution representations, obtain higher-resolution images using only information from lower resolutions, compress data, filter noise, extract effective image features, and be implemented with parallel processing.
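The abstract does not spell out how the odd-even image tree is built. The following is only a minimal sketch, assuming each level splits an image into four sub-images by even or odd row and column index; the function names `odd_even_split`, `odd_even_merge` and `build_tree` are hypothetical, and the maximum energy extraction step is not covered.

```python
import numpy as np

def odd_even_split(image):
    """Split a 2D array into four sub-images by even/odd row and column index."""
    return {
        "ee": image[0::2, 0::2],  # even rows, even columns
        "eo": image[0::2, 1::2],  # even rows, odd columns
        "oe": image[1::2, 0::2],  # odd rows, even columns
        "oo": image[1::2, 1::2],  # odd rows, odd columns
    }

def odd_even_merge(parts):
    """Interleave the four sub-images back into the full-resolution image."""
    rows = parts["ee"].shape[0] + parts["oe"].shape[0]
    cols = parts["ee"].shape[1] + parts["eo"].shape[1]
    image = np.empty((rows, cols), dtype=parts["ee"].dtype)
    image[0::2, 0::2] = parts["ee"]
    image[0::2, 1::2] = parts["eo"]
    image[1::2, 0::2] = parts["oe"]
    image[1::2, 1::2] = parts["oo"]
    return image

def build_tree(image, levels):
    """Recursively split an image; each node stores its image and its children."""
    node = {"image": image, "children": {}}
    if levels > 0 and min(image.shape) >= 2:
        for key, sub in odd_even_split(image).items():
            node["children"][key] = build_tree(sub, levels - 1)
    return node

image = np.arange(64).reshape(8, 8)
parts = odd_even_split(image)
restored = odd_even_merge(parts)
tree = build_tree(image, levels=2)
```

Because the split only rearranges pixels, merging the four sub-images restores the original exactly, which is in line with the abstract's claim that the scheme introduces no changes to the original image. Any image descriptor could then be applied to each node of `tree`.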

    Graph Spectral Image Processing

    The recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in the graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image and video processing. The topics covered include image compression, image restoration, image filtering and image segmentation.
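To make the idea of treating an image patch as a graph signal concrete, here is a minimal sketch. It assumes a 4-connected pixel graph with Gaussian weights on intensity differences and the combinatorial Laplacian; these choices, the parameter `sigma`, and the function names are illustrative assumptions, not taken from the article.

```python
import numpy as np

def grid_graph_laplacian(patch, sigma=0.1):
    """Combinatorial Laplacian L = D - W of a 4-connected pixel graph.

    Assumed weighting: w_ij = exp(-(x_i - x_j)^2 / (2 * sigma^2)), so edges
    across strong intensity changes get small weights.
    """
    patch = np.asarray(patch, dtype=float)
    h, w = patch.shape
    n = h * w
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    diff = patch[r, c] - patch[rr, cc]
                    W[i, j] = W[j, i] = np.exp(-diff**2 / (2 * sigma**2))
    return np.diag(W.sum(axis=1)) - W

def graph_lowpass(patch, keep, sigma=0.1):
    """Keep only the `keep` lowest graph-frequency components of the patch."""
    patch = np.asarray(patch, dtype=float)
    L = grid_graph_laplacian(patch, sigma)
    _, eigvecs = np.linalg.eigh(L)      # eigenvalues ascend: low to high frequency
    coeffs = eigvecs.T @ patch.ravel()  # graph Fourier transform
    coeffs[keep:] = 0.0                 # discard high graph frequencies
    return (eigvecs @ coeffs).reshape(patch.shape)

rng = np.random.default_rng(0)
patch = rng.random((4, 4))
smoothed = graph_lowpass(patch, keep=4)
```

The dense eigendecomposition is only practical for small patches; it is used here for clarity rather than efficiency.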

    A comparison of score, rank and probability-based fusion methods for video shot retrieval

    It is now accepted that the most effective video shot retrieval is based on indexing and retrieving clips using multiple, parallel modalities such as text matching, image matching and feature matching, and then combining or fusing these parallel retrieval streams in some way. In this paper we investigate a range of fusion methods for combining results based on multiple visual features (colour, edge and texture), for combining results based on multiple visual examples in the query, and for combining multiple modalities (text and visual). Using three TRECVid collections and the TRECVid search task, we specifically compare fusion methods based on normalised score and rank that use either the average, weighted average or maximum of retrieval results from a discrete Jelinek-Mercer smoothed language model. We also compare these results with a simple probability-based combination of the language model results that assumes all features and visual examples are fully independent.
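The score-based and rank-based fusion variants named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: min-max normalisation, the rank-to-score mapping, and the treatment of shots missing from a run (contribution 0) are all choices made here, and the function names are hypothetical.

```python
def min_max_normalise(run):
    """Map raw scores to [0, 1]; a run with identical scores maps to 1.0."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {shot: 1.0 for shot in run}
    return {shot: (s - lo) / (hi - lo) for shot, s in run.items()}

def rank_normalise(run):
    """Replace scores by rank-based values: best shot 1.0, worst 1/n."""
    ordered = sorted(run, key=lambda shot: run[shot], reverse=True)
    n = len(ordered)
    return {shot: (n - position) / n for position, shot in enumerate(ordered)}

def fuse(runs, mode="average", weights=None):
    """Fuse normalised runs by (weighted) average or maximum.

    Shots absent from a run contribute 0.0 for that run (an assumption).
    Returns (shot, fused_score) pairs, best first, ties broken by shot id.
    """
    weights = weights or [1.0] * len(runs)
    shots = set().union(*runs)
    fused = {}
    for shot in shots:
        values = [run.get(shot, 0.0) for run in runs]
        if mode == "max":
            fused[shot] = max(values)
        elif mode == "average":
            fused[shot] = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        else:
            raise ValueError(f"unknown fusion mode: {mode}")
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

# Two hypothetical retrieval runs (for example, colour and edge features).
colour = {"s1": 0.9, "s2": 0.5, "s3": 0.1}
edge = {"s1": 0.2, "s2": 0.8}

score_runs = [min_max_normalise(colour), min_max_normalise(edge)]
rank_runs = [rank_normalise(colour), rank_normalise(edge)]

by_average = fuse(score_runs, mode="average")
by_weighted = fuse(score_runs, mode="average", weights=[3.0, 1.0])
by_max = fuse(score_runs, mode="max")
by_rank = fuse(rank_runs, mode="average")
```

With equal weights the shot `s2` comes first because both runs rate it reasonably well; weighting the colour run three times as heavily moves `s1` to the top.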

    Spread spectrum-based video watermarking algorithms for copyright protection

    Merged with duplicate record 10026.1/2263 on 14.03.2017 by CS (TIS). Digital technologies have seen unprecedented expansion in recent years. The consumer can now benefit from hardware and software that was considered state-of-the-art several years ago. The advantages offered by digital technologies are major, but the same digital technology opens the door to unlimited piracy. Copying an analogue VCR tape was certainly possible and relatively easy, in spite of various forms of protection, but due to the analogue environment, the subsequent copies had an inherent loss in quality. This was a natural way of limiting the multiple copying of video material. With digital technology, this barrier disappears, making it possible to produce as many copies as desired without any loss in quality whatsoever. Digital watermarking is one of the best available tools for fighting this threat. The aim of the present work was to develop a digital watermarking system compliant with the recommendations drawn up by the EBU for video broadcast monitoring. Since the watermark can be inserted in either the spatial domain or a transform domain, this aspect was investigated and led to the conclusion that the wavelet transform is one of the best solutions available. Since watermarking is not an easy task, especially considering robustness under various attacks, several techniques were employed in order to increase the capacity and robustness of the system: spread-spectrum and modulation techniques to cast the watermark, powerful error correction to protect the mark, and human visual models to insert a robust mark and to ensure its invisibility. The combination of these methods led to a major improvement, but the system was still not robust to several important geometrical attacks. In order to achieve this last milestone, the system uses two distinct watermarks: a spatial domain reference watermark and the main watermark embedded in the wavelet domain.
By using this reference watermark and techniques specific to image registration, the system is able to determine the parameters of the attack and revert it. Once the attack is reverted, the main watermark is recovered. The final result is a high-capacity, blind DWT-based video watermarking system, robust to a wide range of attacks. BBC Research & Developmen
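The spread-spectrum step mentioned in the abstract can be illustrated with a deliberately simplified sketch. It assumes a single bit spread over a keyed pseudo-random ±1 sequence and added to a vector standing in for wavelet coefficients; the error correction, human visual model, wavelet transform and reference watermark of the described system are all omitted, and the names `embed_bit`, `detect_bit`, `key` and `alpha` are hypothetical.

```python
import numpy as np

def pn_sequence(key, size):
    """Keyed pseudo-random sequence of +1/-1 chips."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=size)

def embed_bit(coeffs, bit, key, alpha=2.0):
    """Add the bit, spread over the PN sequence, to the host coefficients."""
    sign = 1.0 if bit else -1.0
    return coeffs + alpha * sign * pn_sequence(key, coeffs.shape)

def detect_bit(marked, key):
    """Blind detection: correlate with the PN sequence, no original needed."""
    correlation = float(np.mean(marked * pn_sequence(key, marked.shape)))
    return correlation > 0

# Stand-in for a band of wavelet coefficients (synthetic data, an assumption).
host = np.random.default_rng(0).normal(0.0, 10.0, size=4096)
marked_one = embed_bit(host, 1, key=42)
marked_zero = embed_bit(host, 0, key=42)
```

Detection works without the original because the host coefficients are nearly uncorrelated with the keyed sequence, so the correlation is dominated by the embedded term `alpha * sign`. Longer sequences make the decision more reliable at the cost of capacity.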

    2D Face Recognition System Based on Selected Gabor Filters and Linear Discriminant Analysis LDA

    We present a new approach to face recognition. The method is based on 2D face image features extracted with a subset of non-correlated, orthogonal Gabor filters instead of the whole Gabor filter bank; the output feature vector is then compressed using Linear Discriminant Analysis (LDA). The face image is enhanced using a multi-stage image processing technique to normalize it and compensate for illumination variation. Experimental results show that the proposed system is effective for both dimension reduction and recognition performance when compared to the complete Gabor filter bank. The system has been tested on the CASIA, ORL and Cropped YaleB 2D face image databases and achieved an average recognition rate of 98.9%.

    Segmentation for Image Indexing and Retrieval on Discrete Cosines Domain

    This paper uses a region growing segmentation technique to segment the Discrete Cosines (DC) image. The classic problem of content-based image retrieval (CBIR) is the lack of accuracy in matching the query image against images in the database. Applying the region growing technique to the DC image reduces the number of image regions indexed. Recursive region growing is not a new technique, but its application to DC images to build indexing keys is quite new and has not yet been presented by many authors. The experimental results show that the proposed methods on segmented images achieve good precision, higher than 0.60 on all classes. It could therefore be concluded that region growing segmentation-based CBIR is more efficient compared to DC images in terms of their precision, 0.59 and 0.75, respectively. Moreover, DC-based CBIR can save time and simplify the algorithm compared to DCT images. The most significant finding of this work is that, instead of using 64 DCT coefficients, this research used only 1/64 of the coefficients, namely the DC coefficient.
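As an illustration of keeping only the DC coefficient of each block and then growing regions on the resulting image, here is a minimal sketch. It assumes 8×8 blocks, an orthonormal 2D DCT-II (whose DC coefficient is the block sum divided by 8), no level shift, and a simple queue-based region growing with a fixed tolerance relative to the seed value; the paper describes a recursive variant whose details are not given in the abstract, and the function names are hypothetical.

```python
from collections import deque

import numpy as np

def dc_image(image, block=8):
    """One DC coefficient per block: for an orthonormal 2D DCT-II on an
    N x N block the DC term equals (sum of pixels) / N."""
    h, w = image.shape
    h, w = h - h % block, w - w % block  # crop to a whole number of blocks
    blocks = image[:h, :w].astype(float).reshape(h // block, block, w // block, block)
    return blocks.sum(axis=(1, 3)) / block

def region_grow(img, tol):
    """Label 4-connected regions whose values stay within `tol` of the seed."""
    labels = np.zeros(img.shape, dtype=int)
    next_label = 0
    for seed in np.ndindex(img.shape):
        if labels[seed]:
            continue  # already assigned to a region
        next_label += 1
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= rr < img.shape[0] and 0 <= cc < img.shape[1]
                if inside and not labels[rr, cc] and abs(img[rr, cc] - img[seed]) <= tol:
                    labels[rr, cc] = next_label
                    queue.append((rr, cc))
    return labels

# Synthetic 16 x 16 image: dark left half, bright right half.
image = np.zeros((16, 16))
image[:, :8] = 10.0
image[:, 8:] = 200.0

dc = dc_image(image)              # 2 x 2 DC image
labels = region_grow(dc, tol=5.0)
```

The DC image has 1/64 as many values as the source image, so region growing and any subsequent index construction operate on far less data.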