
    Character Recognition

    Character recognition is one of the most widely used pattern recognition technologies in practical applications. This book presents recent advances relevant to character recognition, from technical topics such as image processing, feature extraction and classification to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

    Code-division multiplexing

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 395-404).

    We study forward link performance of a multi-user cellular wireless network. In our proposed cellular broadcast model, the receiver population is partitioned into smaller mutually exclusive subsets called cells. In each cell an autonomous transmitter with an average transmit power constraint communicates to all receivers in its cell by broadcasting. The broadcast signal is a multiplex of independent information from many remotely located sources. Each receiver extracts its desired information from the composite signal, which consists of a distorted version of the desired signal, interference from neighboring cells, and additive white Gaussian noise. Waveform distortion is caused by the time- and frequency-selective linear time-variant channel that exists between every transmitter-receiver pair. Under such system and design constraints, and a fixed bandwidth for the entire network, we show that the most efficient resource allocation policy for each transmitter, based on information-theoretic measures such as channel capacity, simultaneously achievable rate regions and sum-rate, is superposition coding with successive interference cancellation. The optimal policy dominates its sub-optimal alternatives at the boundaries of the capacity region. By taking into account practical constraints such as finite constellation sets, frequency translation via carrier modulation, pulse shaping, real-time signal processing and decoding of finite-length waveforms, and fairness in rate distribution, we argue that sub-optimal orthogonal policies are preferred. For intra-cell multiplexing, all orthogonal schemes based on frequency, time and code division are equivalent. For inter-cell multiplexing, non-orthogonal code-division has a larger capacity than its orthogonal counterpart.

    Among intra-cell orthogonal schemes, we show that the most efficient broadcast signal is a linear superposition of many binary orthogonal waveforms. The information set is also binary. Each orthogonal waveform is generated by modulating a periodic stream of finite-length chip pulses with a receiver-specific signature code that is derived from a special class of binary antipodal, superimposed recursive orthogonal code sequences. With the imposition of practical pulse shapes for carrier modulation, we show that the multi-carrier format using cosine functions has higher bandwidth efficiency than the single-carrier format, even in an ideal Gaussian channel model. Each pulse is shaped via a prototype baseband filter such that when the demodulated signal is detected through a baseband matched filter, the resulting output samples satisfy the Generalized Nyquist criterion. Specifically, we propose finite-length, time-overlapping orthogonal pulse shapes that are g-Nyquist. They are derived from extended and modulated lapped transforms by proving the equivalence between the Perfect Reconstruction and Generalized Nyquist criteria. Using a binary data modulation format, we measure and analyze the accuracy of various Gaussian approximation methods for spread-spectrum modulated (SSM) signalling ...

    by Ceilidh Hoffmann. Ph.D.
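    Since the abstract only summarises the signalling scheme, a minimal sketch may help make the intra-cell orthogonal multiplexing concrete. Standard Walsh-Hadamard rows stand in here for the thesis's binary antipodal, superimposed recursive orthogonal code sequences, and channel distortion and noise are omitted; the point is only to show the superposition of code-modulated bits and matched-filter recovery.

```python
# Simplified illustration of intra-cell orthogonal code-division multiplexing:
# each receiver is assigned a binary antipodal signature code, the transmitter
# broadcasts a linear superposition of the code-modulated bits, and each
# receiver recovers its own bit by correlating against its signature.
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n_users = 8                        # receivers in the cell (illustrative)
codes = hadamard(n_users)          # one orthogonal +/-1 signature per receiver
bits = np.random.choice([-1, 1], size=n_users)   # binary information set

# Broadcast signal: linear superposition of code-modulated bits.
tx = codes.T @ bits                # composite chip sequence

# Matched-filter detection at each receiver: correlate with its own signature;
# orthogonality of the signatures cancels the other users' contributions.
decisions = np.sign(codes @ tx)
assert np.array_equal(decisions, bits.astype(float))
```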

    A framework for ancient and machine-printed manuscripts categorization

    Document image understanding (DIU) has attracted a great deal of attention and has become an active field of research. Although the ultimate goal of DIU is to extract the textual information of a document image, such a process involves many steps, including categorization, segmentation and layout analysis. All of these steps are needed in order to obtain an accurate result from character or word recognition of a document image. One of the important steps in DIU is document image categorization (DIC), which is needed whenever document images are written or printed in more than one script, font or language. This step provides useful information for the recognition system and helps reduce its error by allowing a category-specific Optical Character Recognition (OCR) or word recognition (WR) system to be incorporated. This research focuses on the problem of DIC across different categories of scripts, styles and languages, and establishes a framework for flexible representation and feature extraction that can be adapted to many DIC problems. The current methods for DIC have many limitations and drawbacks that restrict their practical usage. We propose an efficient framework for categorization of document images based on patch representation and Non-negative Matrix Factorization (NMF); the framework is flexible and can be adapted to different categorization problems.

    Many methods exist for script identification of document images, but few of them address the problem in handwritten manuscripts, and those that do have many limitations and drawbacks. Our first goal is therefore to introduce a novel method for script identification of ancient manuscripts. The proposed method is based on a patch representation in which the patches are extracted using the skeleton map of a document image. This representation overcomes the current methods' restriction to a fixed level of layout. The proposed feature extraction scheme, based on Projective Non-negative Matrix Factorization (PNMF), is robust against noise and handwriting variation and can be used for different scripts. The proposed method outperforms state-of-the-art methods and can be applied to different levels of layout.

    The current methods for font (style) identification are mostly designed for machine-printed document images, and many of them can only be used at a specific level of layout. We therefore propose a new method for font and style identification of printed and handwritten manuscripts based on patch representation and Non-negative Matrix Tri-Factorization (NMTF). The images are represented by overlapping patches obtained from the foreground pixels, whose positions are set based on the skeleton map to reduce the number of patches. NMTF is used to learn bases for each font (style), and these bases are then used to classify a new image based on minimum representation error. The proposed method can easily be extended to new fonts, as the bases for each font are learned separately from the other fonts. This method is tested on two datasets of machine-printed and ancient manuscripts, and the results confirm its performance compared to state-of-the-art methods.

    Finally, we propose a novel method for language identification of printed and handwritten manuscripts, also based on patch representation and NMTF. The current methods for language identification rely either on textual data obtained by an OCR engine or on image data that is coded and compared with textual data. The OCR-based methods require a great deal of processing, and the current image-based methods are not applicable to cursive scripts such as Arabic. The patch representation captures the components of the Arabic script (letters), which cannot be extracted simply by segmentation methods. NMTF is then used for dictionary learning, generating codebooks that represent each document image as a histogram. The proposed method is tested on two datasets of machine-printed and handwritten manuscripts and compared with n-gram features (text-based), texture features and codebook features (image-based) to validate its performance.

    The proposed methods are robust against variation in handwriting, changes in font (handwriting style) and the presence of degradation, and they are flexible enough to be applied at various levels of layout (from a text line to a paragraph). The methods in this research have been tested on datasets of handwritten and machine-printed manuscripts and compared to state-of-the-art methods. All of the evaluations show the efficiency, robustness and flexibility of the proposed methods for categorization of document images. As mentioned before, the proposed strategies provide a framework for efficient and flexible representation and feature extraction for document image categorization. This framework can be applied to different levels of layout, information from different levels can be merged, and the framework can be extended to more complex situations and different tasks.
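    As a rough illustration of the classification-by-minimum-representation-error idea described above, the sketch below learns one non-negative basis per category and assigns a test image's patches to the category whose basis reconstructs them best. scikit-learn's plain NMF is used as a stand-in for the PNMF/NMTF variants of the thesis, and patch extraction from the skeleton map is assumed to have been done already; the function and variable names are illustrative.

```python
# Minimal sketch: learn a non-negative basis per category from its training
# patches, then label a test patch matrix with the category whose basis gives
# the smallest reconstruction error.
import numpy as np
from sklearn.decomposition import NMF

def learn_bases(patches_by_class, n_components=20):
    """Fit one NMF model per category; each value is an (n_patches, n_pixels) matrix."""
    models = {}
    for label, X in patches_by_class.items():
        model = NMF(n_components=n_components, init="nndsvda", max_iter=500)
        model.fit(X)                      # X >= 0, e.g. flattened grey-level patches
        models[label] = model
    return models

def classify(models, X_test):
    """Return the label whose learned basis reconstructs X_test with lowest error."""
    errors = {}
    for label, model in models.items():
        W = model.transform(X_test)       # non-negative coefficients, basis held fixed
        reconstruction = W @ model.components_
        errors[label] = np.linalg.norm(X_test - reconstruction)
    return min(errors, key=errors.get)
```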

    Characterization and Identification of Distraction During Naturalistic Driving Using Wearable Non-Intrusive Physiological Measure of Galvanic Skin Responses

    Fatalities due to road accidents are mainly caused by distracted driving. Driving demands the continuous attention of the driver, and certain levels of distraction can cause the driver to lose attention, which may lead to a fatal accident. Early detection of distraction will therefore help reduce the number of accidents. Several studies have been conducted on automatic detection of driver distraction. Many previous approaches employ camera-based techniques; however, these methods may detect the distraction too late to warn the driver. Although neurophysiological signals measured with Electroencephalography (EEG) have been shown to be another reliable indicator of distraction, EEG signals are very complex and the technology is intrusive to drivers, which casts serious doubt on its practical deployment. In this thesis we investigate a non-intrusive physiological measure, Galvanic Skin Response (GSR), collected with a wrist-band wearable, and conduct an empirical characterization of driver GSR signals during a naturalistic driving experiment. The proposed method extracts and evaluates statistical, frequency-domain and time-domain features to identify distraction. Several data mining techniques, such as feature selection, feature ranking, dimensionality reduction and feature space analysis, are performed to generate discriminative bases that reduce the computational complexity for efficient identification of distraction using supervised learning. A signal processing technique specific to skin conductance signals, continuous decomposition analysis, was investigated to better understand the behavior of the raw signal during cognitive and visual overload from secondary tasks while driving. The proposed driver monitoring and identification system on the edge showed that GSR is a reliable indicator of driver distraction while meeting the requirement of early notification of the distraction state to the driver.

    Master of Science, Computer and Information Science, College of Engineering & Computer Science, University of Michigan-Dearborn. https://deepblue.lib.umich.edu/bitstream/2027.42/143521/1/Vikas Final Text Embedded.pdf
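    To make the feature-extraction step concrete, here is a minimal sketch that computes a few statistical, time-domain and frequency-domain features over fixed-length GSR windows and trains a supervised classifier on them. The 4 Hz sampling rate (typical of wrist-worn EDA sensors), the particular feature set and the random-forest classifier are assumptions for illustration, not the exact pipeline of the thesis.

```python
# Illustrative sketch: hand-crafted features per GSR window plus a supervised
# classifier separating "distracted" from "attentive" segments.
import numpy as np
from scipy import signal
from sklearn.ensemble import RandomForestClassifier

def gsr_features(window: np.ndarray, fs: float = 4.0) -> np.ndarray:
    """Statistical, time-domain and crude frequency-domain features of one window."""
    diff = np.diff(window)
    freqs, psd = signal.welch(window, fs=fs, nperseg=min(len(window), 64))
    return np.array([
        window.mean(), window.std(), window.min(), window.max(),
        diff.mean(), diff.std(),              # first-derivative statistics
        np.sum(psd[freqs < 0.5]),             # low-frequency (tonic) power
        np.sum(psd[freqs >= 0.5]),            # higher-frequency (phasic) power
    ])

def train_distraction_model(windows, labels):
    """windows: list of 1-D GSR arrays; labels: 1 = distracted, 0 = attentive."""
    X = np.vstack([gsr_features(w) for w in windows])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, np.asarray(labels))
    return clf
```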

    Digital watermark technology in security applications

    With the rising emphasis on security and the number of fraud-related crimes around the world, authorities are looking for new technologies to tighten the security of identity documents. Among many modern electronic technologies, digital watermarking has unique advantages for enhancing document authenticity. At the current stage of development, digital watermarking technologies are not as mature as other competing technologies for supporting identity authentication systems. This work presents improvements in the performance of two classes of digital watermarking techniques and investigates the issue of watermark synchronisation.

    Optimal performance can be obtained if the spreading sequences are designed to be orthogonal to the cover vector. In this thesis, two classes of orthogonalisation methods that generate binary sequences quasi-orthogonal to the cover vector are presented. One method, "Sorting and Cancelling", generates sequences with a high level of orthogonality to the cover vector. The Hadamard-matrix-based orthogonalisation method, "Hadamard Matrix Search", is able to realise overlapped embedding, so watermarking capacity and image fidelity can be improved compared with using short watermark sequences. The results are compared with traditional pseudo-randomly generated binary sequences, and the advantages of both classes of orthogonalisation methods are significant.

    Another watermarking method introduced in the thesis is based on writing-on-dirty-paper theory. The method is presented with biorthogonal codes, which offer the best robustness. The advantages and trade-offs of using biorthogonal codes with this watermark coding method are analysed comprehensively, and comparisons are made between orthogonal and non-orthogonal codes used in this method. It is found that fidelity and robustness are contradictory and cannot be optimised simultaneously.

    Comparisons are also made between all proposed methods, focusing on three major performance criteria: fidelity, capacity and robustness. From two different viewpoints the conclusions differ. From a fidelity-centric viewpoint, the dirty-paper coding method using biorthogonal codes has a very strong advantage in preserving image fidelity, and its advantage in capacity is also significant. From the power-ratio point of view, however, the orthogonalisation methods demonstrate a significant advantage in capacity and robustness. The conclusions are contradictory, but together they summarise the performance produced by different design considerations.

    Watermark synchronisation is first provided by high-contrast frames around the watermarked image. Edge detection filters are used to detect the high-contrast borders of the captured image; by scanning the pixels from the border to the centre, the locations of detected edges are stored. An optimal linear regression algorithm is used to estimate the watermarked image frames, and the estimated regression function gives the rotation angle as the slope of the rotated frames. Scaling is corrected by re-sampling the upright image to its original size. A theoretically studied method that can synchronise the captured image to sub-pixel accuracy is also presented: using invariant transforms and a symmetric phase-only matched filter, the captured image can be corrected accurately to its original geometric size. The method uses repeating watermarks to form an array in the spatial domain of the watermarked image; the locations of the array's elements reveal rotation, translation and scaling information through two filtering processes.
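    To illustrate why a spreading sequence quasi-orthogonal to the cover vector improves detection, the sketch below embeds one bit by additive spread-spectrum modulation and detects it by correlation; any residual correlation between cover and sequence acts as host interference at the detector. The greedy sign-flipping routine is only an illustrative stand-in, not the thesis's "Sorting and Cancelling" or "Hadamard Matrix Search" algorithms, whose details are not given in the abstract.

```python
# Illustrative sketch: quasi-orthogonalise a +/-1 spreading sequence against the
# cover, embed one bit additively, and detect it with a correlation detector.
import numpy as np

def quasi_orthogonal_sequence(cover: np.ndarray, rng=np.random.default_rng(0)):
    """Flip chips of a random +/-1 sequence to drive its correlation with the cover toward zero."""
    s = rng.choice([-1.0, 1.0], size=cover.size)
    for i in np.argsort(-np.abs(cover)):          # largest cover components first
        if abs(np.dot(cover, s) - 2 * cover[i] * s[i]) < abs(np.dot(cover, s)):
            s[i] = -s[i]                          # flipping this chip reduces |<cover, s>|
    return s

cover = np.random.default_rng(1).normal(size=1024)    # flattened cover-image block
s = quasi_orthogonal_sequence(cover)
bit, alpha = -1.0, 0.8
marked = cover + alpha * bit * s                       # additive spread-spectrum embedding

# Correlation detector: host interference <cover, s> is small by construction,
# so the sign of the correlation recovers the embedded bit.
detected = np.sign(np.dot(marked, s))
print(detected == bit)
```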

    Connected Attribute Filtering Based on Contour Smoothness
