1,160 research outputs found

    Spectral Graph-based Features for Recognition of Handwritten Characters: A Case Study on Handwritten Devanagari Numerals

    Interpretation of different writing styles, unconstrained cursiveness and relationship between different primitive parts is an essential and challenging task for recognition of handwritten characters. As feature representation is inadequate, appropriate interpretation/description of handwritten characters seems to be a challenging task. Although existing research in handwritten characters is extensive, it still remains a challenge to get the effective representation of characters in feature space. In this paper, we make an attempt to circumvent these problems by proposing an approach that exploits the robust graph representation and spectral graph embedding concept to characterise and effectively represent handwritten characters, taking into account writing styles, cursiveness and relationships. For corroboration of the efficacy of the proposed method, extensive experiments were carried out on the standard handwritten numeral dataset of the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata. The experimental results demonstrate promising findings, which can be used in future studies. Comment: 16 pages, 8 figures

    PRESS: A Novel Framework of Trajectory Compression in Road Networks

    Location data is becoming increasingly important. In this paper, we focus on trajectory data and propose a new framework, namely PRESS (Paralleled Road-Network-Based Trajectory Compression), to effectively compress trajectory data under road network constraints. Different from existing work, PRESS proposes a novel representation for trajectories to separate the spatial representation of a trajectory from the temporal representation, and proposes a Hybrid Spatial Compression (HSC) algorithm and an error Bounded Temporal Compression (BTC) algorithm to compress the spatial and temporal information of trajectories respectively. PRESS also supports common spatial-temporal queries without fully decompressing the data. In an extensive experimental study on a real trajectory dataset, PRESS significantly outperforms existing approaches in terms of saving storage cost of trajectory data with bounded errors. Comment: 27 pages, 17 figures

    Cuneiform Character Similarity Using Graph Representations

    Motivated by the increased demand for computerized analysis of documents within the Digital Humanities, we are developing algorithms for cuneiform tablets, which contain the oldest handwritten script, used for more than three millennia. These tablets are typically found in the Middle East and contain a total amount of written words comparable to all documents in Latin or ancient Greek. In previous work we have shown how to extract vector drawings from 3D-models similar to those manually drawn over digital photographs. Both types of drawings share the Scalable Vector Graphics (SVG) format, representing the cuneiform characters as splines. We transform these splines into a graph representation and extend it by triangulation. Based on graph kernel methods we show a similarity metric for cuneiform characters, which have higher degrees of freedom than handwriting with ink on paper. An evaluation of the precision and recall of our proposed approach is shown and compared to well-known methods for processing handwriting. Finally a summary and an outlook are given.

    Riemannian Reading: Using Manifolds to Calculate and Unfold Narrative

    The purpose of this study is to investigate the space where readers and texts interact. By applying non-Euclidean geometry to the modern subgenre of science fiction known as steampunk, we can see that narratives have no intrinsic geometry. Instead, what we can understand is that readers unflatten inherently flat narratives by applying their own metric of understanding to a narrative. Steampunk acts as a primer to considering this mathematical process by explicitly flattening its settings and characters, as well as the historical accounts founding the narrative. Mark Hodder's novel, The Strange Affair of Spring-Heeled Jack, offers two characters that unsuccessfully attempt to act as non-Euclidean readers. Through manipulation of agency, Hodder's novel demonstrates the unflattening process as we read novels. However, our unflattening process distorts a narrative through the application of our metric of understanding. The study first gives a short historical account of non-Euclidean geometry in the 19th century. The analysis stems from the application of non-Euclidean geometric thinking to narrative structures.

    Working with ArcGIS 9.2 manual

    This manual is intended for undergraduate and graduate students learning to use ArcView 9 in a classroom setting. It is meant to be a complement to, rather than a substitute for, ArcView software manuals, ESRI training products, or the ArcView help options. It reflects the order and emphasis of topics that I have found most helpful while teaching introductory GIS classes. I expect that it will be particularly helpful to people new to GIS who may be intimidated by conventional software manuals. It may also be helpful as a resource to those who have completed a course in ArcView but don’t always remember how to perform particular tasks. This manual does not try to be comprehensive, focusing instead on the basic tools and functions that users new to GIS should know how to use. Those who master these basic functions should have the skills to learn about additional tools, using the ArcView help menus, or just exploring additional menu options, toolbars, and buttons.

    Image Processing Applications in Real Life: 2D Fragmented Image and Document Reassembly and Frequency Division Multiplexed Imaging

    In this era of modern technology, image processing is one of the most studied disciplines of signal processing and its applications can be found in every aspect of our daily life. In this work three main applications for image processing have been studied. In chapter 1, frequency division multiplexed imaging (FDMI), a novel idea in the field of computational photography, is introduced. Using FDMI, multiple images are captured simultaneously in a single shot and can later be extracted from the multiplexed image. This is achieved by spatially modulating the images so that they are placed at different locations in the Fourier domain. Finally, a Texas Instruments digital micromirror device (DMD) based implementation of FDMI is presented and results are shown. Chapter 2 discusses the problem of image reassembly, which is to restore an image to its original form from its pieces after it has been fragmented for various destructive reasons. We propose an efficient algorithm for the 2D image fragment reassembly problem based on solving a variation of the Longest Common Subsequence (LCS) problem. Our processing pipeline has three steps. First, the boundary of each fragment is extracted automatically; second, a novel boundary matching is performed by solving LCS to identify the best possible adjacency relationship among image fragment pairs; finally, a multi-piece global alignment is used to filter out incorrect pairwise matches and compose the final image. We perform experiments on complicated image fragment datasets and compare our results with existing methods to show the improved efficiency and robustness of our method. The problem of reassembling a hand-torn or machine-shredded document back to its original form is another useful version of the image reassembly problem. Reassembling a shredded document is different from reassembling an ordinary image because the geometric shape of the fragments does not carry much valuable information if the document has been machine-shredded rather than hand-torn. On the other hand, matching words and context can be used as an additional tool to help improve the task of reassembly. In the final chapter, the document reassembly problem is addressed by solving a graph optimization problem.
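
The boundary-matching step in the abstract above relies on a variation of LCS that the abstract does not spell out. As a rough illustration only, the sketch below implements the classic LCS dynamic program, not the thesis's variant. The idea that each fragment boundary is encoded as a sequence of quantized symbols, and the names `longest_common_subsequence` and `match_score`, are assumptions made for this example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def longest_common_subsequence(a: Sequence[T], b: Sequence[T]) -> List[T]:
    """Classic LCS via dynamic programming (textbook version, not the thesis's variant)."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back through the table to recover one optimal subsequence.
    out: List[T] = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    out.reverse()
    return out


def match_score(a: Sequence[T], b: Sequence[T]) -> float:
    """Hypothetical pairwise score: LCS length relative to the shorter boundary."""
    if not a or not b:
        return 0.0
    return len(longest_common_subsequence(a, b)) / min(len(a), len(b))


if __name__ == "__main__":
    # Made-up quantized boundary codes for two fragments (illustrative data only).
    edge_a = [3, 1, 4, 1, 5, 9, 2, 6]
    edge_b = [3, 4, 1, 5, 2, 6, 7]
    print(longest_common_subsequence(edge_a, edge_b))
    print(match_score(edge_a, edge_b))
```

In a pipeline like the one described, a score of this kind could rank candidate fragment pairs before a global alignment step discards inconsistent matches; how the thesis actually scores and filters matches is not stated in the abstract.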

    Information Extraction and Classification on Journal Papers

    The importance of journals for diffusing the results of scientific research has increased considerably. In the digital era, Portable Document Format (PDF) became the established format of electronic journal articles. This structured form, combined with regular and wide dissemination, spreads scientific advancements easily and quickly. However, the rapidly increasing number of published scientific articles requires more time and effort for systematic literature reviews, searches and screening. The comprehension and extraction of useful information from digital documents is also a challenging task, due to the complex structure of PDF. To help a soil science team from the United States Department of Agriculture (USDA) build a queryable journal paper system, we used a web crawler to download articles on soil science from the digital library. We applied named entity recognition and table analysis to extract useful information including authors, journal name and type, publication date, abstract, DOI, and experiment location in papers, and highlighted the paper characteristics in a computer-queryable format in the system. Text classification is applied to identify the parts of interest to the users and save their search time. We used traditional machine learning techniques including logistic regression, support vector machine, decision tree, naive Bayes, k-nearest neighbors, random forest, ensemble modeling, and neural networks in text classification and compared the advantages of these approaches in the end. Advisor: Stephen D. Scot

    Recognition of off-line handwritten cursive text

    The author presents novel algorithms to design unconstrained handwriting recognition systems organized in three parts: In Part One, novel algorithms are presented for processing of Arabic text prior to recognition. Algorithms are described to convert a thinned image of a stroke to a straight line approximation. Novel heuristic algorithms and novel theorems are presented to determine start and end vertices of an off-line image of a stroke. A straight line approximation of an off-line stroke is converted to a one-dimensional representation by a novel algorithm which aims to recover the original sequence of writing. The resulting ordering of the stroke segments is a suitable preprocessed representation for subsequent handwriting recognition algorithms as it helps to segment the stroke. The algorithm was tested against one data set of isolated handwritten characters and another data set of cursive handwriting, each provided by 20 subjects, and has been 91.9% and 91.8% successful for these two data sets, respectively. In Part Two, an entirely novel fuzzy set-sequential machine character recognition system is presented. Fuzzy sequential machines are defined to work as recognizers of handwritten strokes. An algorithm to obtain a deterministic fuzzy sequential machine from a stroke representation, that is capable of recognizing that stroke and its variants, is presented. An algorithm is developed to merge two fuzzy machines into one machine. The learning algorithm is a combination of many described algorithms. The system was tested against isolated handwritten characters provided by 20 subjects resulting in 95.8% recognition rate which is encouraging and shows that the system is highly flexible in dealing with shape and size variations. In Part Three, also an entirely novel text recognition system, capable of recognizing off-line handwritten Arabic cursive text having a high variability is presented. This system is an extension of the above recognition system. Tokens are extracted from a one-dimensional representation of a stroke. Fuzzy sequential machines are defined to work as recognizers of tokens. It is shown how to obtain a deterministic fuzzy sequential machine from a token representation that is capable of recognizing that token and its variants. An algorithm for token learning is presented. The tokens of a stroke are re-combined to meaningful strings of tokens. Algorithms to recognize and learn token strings are described. The recognition stage uses algorithms of the learning stage. The process of extracting the best set of basic shapes which represent the best set of token strings that constitute an unknown stroke is described. A method is developed to extract lines from pages of handwritten text, arrange main strokes of extracted lines in the same order as they were written, and present secondary strokes to main strokes. Presented secondary strokes are combined with basic shapes to obtain the final characters by formulating and solving assignment problems for this purpose. Some secondary strokes which remain unassigned are individually manipulated. The system was tested against the handwritings of 20 subjects yielding overall subword and character recognition rates of 55.4% and 51.1%, respectively.