
    Directional Discrete Cosine Transform for Handwritten Script Identification

    Authors' copy - ICDAR International Conference on Document Analysis and Recognition (2013), Washington DC, USA. International audience. This paper presents directional discrete cosine transform (D-DCT) based word-level handwritten script identification. The conventional discrete cosine transform (DCT) emphasizes the vertical and horizontal energies of an image and de-emphasizes directional edge information, which plays a significant role in shape analysis problems in particular. The conventional DCT is therefore not efficient at characterizing images where directional edges are dominant. In this paper, we investigate two different methods to capture directional edge information: one performs the 1D-DCT along the left and right diagonals of an image, and the other decomposes the 2D-DCT coefficients along the left and right diagonals. The means and standard deviations of the left and right diagonals of the DCT coefficients are computed and used for the classification of words with linear discriminant analysis (LDA) and the K-nearest neighbour (K-NN) classifier. We validate the method on 9,000 words belonging to six different scripts. The classification of words is performed in bi-script, tri-script and multi-script scenarios, achieving average identification accuracies of 96.95%, 96.42% and 85.77%, respectively.
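
    The following is a minimal sketch of the second variant described above (decomposing the 2D-DCT coefficients along the left and right diagonals and using their means and standard deviations as features). It assumes grayscale word images and SciPy's DCT routines; the block sizes, normalisation and exact decomposition used in the paper are not reproduced here.

```python
# Sketch: diagonal decomposition of 2D-DCT coefficients for word-level
# script identification. Parameter choices below are illustrative assumptions.
import numpy as np
from scipy.fft import dctn

def diagonal_dct_features(word_img: np.ndarray) -> np.ndarray:
    """Means and standard deviations of DCT coefficients along the diagonals."""
    coeffs = dctn(word_img.astype(float), norm="ortho")   # 2D-DCT of the word image
    n = min(coeffs.shape)
    coeffs = coeffs[:n, :n]                                # keep a square block
    feats = []
    for mat in (coeffs, np.fliplr(coeffs)):                # right and left diagonals
        for k in range(-n + 1, n):                         # every diagonal offset
            d = np.diag(mat, k)
            feats.extend([d.mean(), d.std()])
    return np.asarray(feats)
```

    The resulting feature vectors can then be fed to LDA or a K-NN classifier (e.g. sklearn.neighbors.KNeighborsClassifier), as done in the paper.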

    Pattern detection and recognition using over-complete and sparse representations

    Recent research in harmonic analysis and mammalian vision systems has revealed that over-complete and sparse representations play an important role in visual information processing. Applying such representations to pattern recognition and detection problems has become an interesting field of study. The main contribution of this thesis is to propose two feature extraction strategies - the global strategy and the local strategy - to make use of these representations. In the global strategy, over-complete and sparse transformations are applied to the input pattern as a whole and features are extracted in the transformed domain. This strategy has been applied to the problems of rotation-invariant texture classification and script identification, using the Ridgelet transform. Experimental results show better performance than the Gabor multi-channel filtering method and Wavelet-based methods. The local strategy is divided into two stages. The first is to analyze the local over-complete and sparse structure: the input 2-D patterns are divided into patches, and the local over-complete and sparse structure is learned from these patches using sparse approximation techniques. The second stage concerns the application of the local over-complete and sparse structure. For an object detection problem, we propose a sparsity testing technique, where a local over-complete and sparse structure is built to give sparse representations to text patterns and non-sparse representations to other patterns. Object detection is achieved by identifying patterns that can be sparsely represented by the learned structure. This technique has been applied to detect text in scene images with a recall rate of 75.23% (about 6% improvement compared with other works) and a precision rate of 67.64% (about 12% improvement). For applications like character or shape recognition, the learned over-complete and sparse structure is combined with a Convolutional Neural Network (CNN). A second text detection method is proposed based on such a combination to further improve the accuracy of text detection in scene images (about 11% higher than our first method based on sparsity testing). Finally, this method has been applied to handwritten Farsi numeral recognition, obtaining a 99.22% recognition rate on the CENPARMI Database and a 99.5% recognition rate on the HODA Database; for comparison, an SVM with gradient features achieves recognition rates of 98.98% and 99.22% on these databases, respectively.
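
    A minimal sketch of the sparsity testing idea is shown below: a dictionary is learned from vectorised text patches, and a new patch is flagged as text if it can be reconstructed accurately with only a few atoms. The dictionary size, sparsity level and residual threshold are illustrative assumptions, not the thesis' settings.

```python
# Sketch: sparsity testing for text detection with a learned over-complete
# dictionary. Hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import orthogonal_mp

def learn_text_dictionary(text_patches: np.ndarray, n_atoms: int = 256) -> np.ndarray:
    """text_patches: (n_samples, patch_dim) array of vectorised text patches."""
    dl = DictionaryLearning(n_components=n_atoms, transform_algorithm="omp")
    dl.fit(text_patches)
    return dl.components_                                   # (n_atoms, patch_dim)

def is_text_patch(patch: np.ndarray, dictionary: np.ndarray,
                  n_nonzero: int = 5, err_thresh: float = 0.1) -> bool:
    """Sparsity test: a small residual with few atoms suggests a text patch."""
    code = orthogonal_mp(dictionary.T, patch, n_nonzero_coefs=n_nonzero)
    residual = np.linalg.norm(patch - dictionary.T @ code)
    return residual / (np.linalg.norm(patch) + 1e-12) < err_thresh
```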

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are a pressing necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we address each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typeset prints, a theoretically optimal solution for the document binarization problem from both the computational complexity and the threshold selection points of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.
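
    As one illustration of the kind of processing such a pipeline involves, the sketch below shows a generic projection-profile skew estimator for a binarised page. It is a baseline only, not the layout-independent method proposed in the thesis; the search range and step size are assumptions.

```python
# Sketch: generic projection-profile skew estimation for a binarised page.
# This is a baseline illustration, not the thesis' own method.
import numpy as np
from scipy.ndimage import rotate

def estimate_skew(binary_page: np.ndarray,
                  angle_range: float = 5.0, step: float = 0.25) -> float:
    """Return the rotation angle (degrees) maximising row-projection variance."""
    best_angle, best_score = 0.0, -np.inf
    for angle in np.arange(-angle_range, angle_range + step, step):
        rotated = rotate(binary_page, angle, reshape=False, order=0)
        profile = rotated.sum(axis=1)    # amount of ink per row
        score = profile.var()            # deskewed text lines give sharp, high-variance peaks
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```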

    HVS Inspired System for Script Identification in Indian Multi-script Documents

    Identification of the script of the text present in multi-script documents is one of the important first steps in the design of an OCR system. Much work has been reported for Roman, Arabic, Chinese, Korean and Japanese scripts. Although some work involving Indian scripts has been reported, it is still in its nascent stage. For example, most of the work assumes that the script changes only at the level of the line, which is rarely an acceptable assumption in the Indian scenario. In this work, we report a script identification algorithm that takes into account the fact that the script changes at the word level in most Indian bilingual or multilingual documents. Initially, we deal with the identification of the script of words, using Gabor filters, in a bi-script scenario. Later, we extend this to tri-script and then five-script scenarios. The combination of Gabor features with a nearest neighbour classifier shows promising results. Words of different font styles and sizes are used. We show that our identification scheme, inspired by the Human Visual System (HVS) and utilizing the same feature and classifier combination, works consistently well for any of the combinations of scripts experimented with.
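
    A minimal sketch of Gabor-bank word features with a nearest-neighbour classifier is given below. The frequencies, number of orientations and 1-NN setup are illustrative assumptions; the paper's exact filter bank is not reproduced here.

```python
# Sketch: Gabor filter-bank word features for script identification,
# classified with a nearest-neighbour rule. Parameters are assumptions.
import numpy as np
from skimage.filters import gabor
from sklearn.neighbors import KNeighborsClassifier

def gabor_word_features(word_img: np.ndarray,
                        frequencies=(0.1, 0.2, 0.3),
                        n_orientations: int = 8) -> np.ndarray:
    """Mean and standard deviation of Gabor response magnitudes."""
    feats = []
    for f in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            real, imag = gabor(word_img, frequency=f, theta=theta)
            mag = np.hypot(real, imag)
            feats.extend([mag.mean(), mag.std()])
    return np.asarray(feats)

# Word-level identification then reduces to nearest-neighbour search, e.g.:
# knn = KNeighborsClassifier(n_neighbors=1).fit(train_features, train_script_labels)
```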

    Co-designing of an intervention to support health visitors’ implementation of practices recommended for prevention of excess weight gain in 0-2 year old children

    Ph.D. Thesis. Background: Early rapid weight gain is a risk factor for later obesity. UK health visitors (HVs) are well positioned to address excessive weight-gain trends in early childhood. However, HVs face unique barriers when caring for children under age two with excessive rates of weight gain. Interventions are needed that strengthen HVs' role by addressing key barriers to and facilitators of implementation of recommended guidelines in routine practice. Aim: This research engaged with HVs to systematically design an intervention to support their implementation of recommended practice behaviours. Methods: A mixed-methods evidence synthesis and a series of interactive workshops with HVs were conducted. HVs, who are the recipients of the intervention, provided their views of what is important, relevant, and feasible in the local context. The findings of the workshops were combined in an iterative process to inform the sequential steps of the Behaviour Change Wheel framework and guide the process of designing the intervention. Results: Theoretical analysis of the workshops revealed HVs' capabilities, opportunities, and motivations related to addressing early-childhood obesity prevention. Intervention strategies deemed most likely to support implementation (enablement, education, training, modelling, persuasion) were combined to design a face-to-face interactive training intervention. Outcome measures to test feasibility, acceptability, and fidelity of delivery of the proposed intervention were identified. Discussion: An interactive training intervention has been designed, informed by behaviour change theory, evidence, expert knowledge, and the experiences of health visitors, in an area of health promotion that is currently evolving. Future research should evaluate the acceptability and feasibility of the intervention in a pilot trial. The use of a systematic development process and the identification of intervention contents and their hypothesised mechanisms of action using standard terminology provide an opportunity for this research to contribute to the body of literature on the design of implementation interventions using a collaborative approach. Durham County Council, Newcastle University, and Fuse, the Centre for Translational Research in Public Health.

    Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information

    This thesis explores methods to rapidly bootstrap automatic speech recognition systems for languages that lack resources for speech and language processing. We focus on approaches that allow using data from multiple languages to improve performance for those languages at different levels, such as feature extraction, acoustic modeling and language modeling. From an application perspective, this thesis also includes research on non-native and code-switching speech.

    Research and Technology Report. Goddard Space Flight Center

    This issue of Goddard Space Flight Center's annual report highlights the importance of mission operations and data systems, covering mission planning and operations; TDRSS, positioning systems, and orbit determination; ground systems and networks, hardware and software; data processing and analysis; and World Wide Web use. The report also covers flight projects, space sciences, Earth system science, and engineering and materials.

    Research & Technology Report Goddard Space Flight Center

    The main theme of this edition of the annual Research and Technology Report is Mission Operations and Data Systems. The shift from centralized to distributed mission operations, and from human-interactive operations to highly automated operations, is reported. The following aspects are addressed: mission planning and operations; TDRSS, positioning systems, and orbit determination; hardware and software associated with ground systems and networks; data processing and analysis; and the World Wide Web. Flight projects are described along with achievements in the space sciences and Earth sciences. Spacecraft subsystems, cryogenic developments, and new tools and capabilities are also discussed.

    Remote Sensing Data Compression

    A huge amount of data is acquired nowadays by different remote sensing systems installed on satellites, aircraft, and UAVs. The acquired data then have to be transferred to image processing centres, stored and/or delivered to customers. In scenarios where resources are restricted, data compression is strongly desired or necessary. A wide diversity of coding methods can be used, depending on the requirements and their priority. In addition, the types and properties of images differ a lot, so practical implementation aspects have to be taken into account. The Special Issue paper collection on which this book is based touches on all of the aforementioned items to some degree, giving the reader an opportunity to learn about recent developments and research directions in the field of image compression. In particular, lossless and near-lossless compression of multi- and hyperspectral images remains a topical problem, since such images constitute extremely large data arrays with rich information that can be retrieved from them for various applications. Another important aspect is the impact of lossless compression on image classification and segmentation, where a reasonable compromise between the characteristics of compression and the final tasks of data processing has to be achieved. The problems of data transmission from UAV-based acquisition platforms, as well as the use of FPGAs and neural networks, have become very important. Finally, attempts to apply compressive sensing approaches to remote sensing image processing, with positive outcomes, are observed. We hope that readers will find our book useful and interesting.
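
    As a toy illustration of the inter-band predictive coding underlying many lossless and near-lossless hyperspectral compressors, the sketch below estimates the zeroth-order entropy of spectral-difference residuals. It is an assumption-laden example, not a method from the book; a real codec would follow the prediction step with an entropy coder.

```python
# Sketch: spectral-difference prediction for a hyperspectral cube and a
# zeroth-order entropy estimate of the residuals (bits per sample).
import numpy as np

def residual_entropy_bits(cube: np.ndarray) -> float:
    """cube: (bands, rows, cols) integer array."""
    residuals = [cube[0].astype(np.int64)]                        # first band kept as-is
    for b in range(1, cube.shape[0]):
        residuals.append(cube[b].astype(np.int64) - cube[b - 1])  # predict from previous band
    bits_per_band = []
    for r in residuals:
        _, counts = np.unique(r, return_counts=True)
        p = counts / counts.sum()
        bits_per_band.append(float(-(p * np.log2(p)).sum()))      # zeroth-order entropy
    return float(np.mean(bits_per_band))
```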