2,648 research outputs found
A Computational Theory of Contextual Knowledge in Machine Reading
Machine recognition of off–line handwriting can be achieved by either recognising words as individual symbols (word level recognition) or by segmenting a word into parts, usually letters, and classifying those parts (letter level recognition). Whichever method is used, current handwriting recognition systems cannot overcome the inherent ambiguity in writingwithout recourse to contextual information.
This thesis presents a set of experiments that use Hidden Markov Models of language to resolve ambiguity in the classification process. It goes on to describe an algorithm designed to recognise a document written by a single–author and to improve recognition by adaptingto the writing style and learning new words. Learning and adaptation is achieved by
reading the document over several iterations. The algorithm is designed to incorporate contextual processing, adaptation to modify the shape of known words and learning of new words within a constrained dictionary.
Adaptation occurs when a word that has previously been trained in the classifier is recognised at either the word or letter level and the word image is used to modify the classifier. Learning occurs when a new word that has not been in the training set is recognised at the
letter level and is subsequently added to the classifier.
Words and letters are recognised using a nearest neighbour classifier and used features based on the two–dimensional Fourier transform. By incorporating a measure of confidence based on the distribution of training points around an exemplar, adaptation and learning is constrained to only occur when a word is confidently classified.
The algorithm was implemented and tested with a dictionary of 1000 words. Results show that adaptation of the letter classifier improved recognition on average by 3.9% with only 1.6% at the whole word level. Two experiments were carried out to evaluate the learning in the system. It was found that learning accounted for little improvement in the classification results and also that learning new words was prone to misclassifications being propagated
Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition
A human-computer interface is developed to provide services of computer assisted machine translation (CAT) and computer assisted transcription of handwritten text images (CATTI). The back-end machine translation (MT) and handwritten text recognition (HTR) systems are provided by the Pattern Recognition and Human Language Technology (PRHLT) research group. The idea is to provide users with easy to use tools to convert interactive translation and transcription feasible tasks. The assisted service is provided by remote servers with CAT or CATTI capabilities. The interface supplies the user with tools for efficient local edition: deletion, insertion and substitution.Ocampo Sepúlveda, JC. (2009). Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition. http://hdl.handle.net/10251/14318Archivo delegad
Inductive Visual Localisation: Factorised Training for Superior Generalisation
End-to-end trained Recurrent Neural Networks (RNNs) have been successfully
applied to numerous problems that require processing sequences, such as image
captioning, machine translation, and text recognition. However, RNNs often
struggle to generalise to sequences longer than the ones encountered during
training. In this work, we propose to optimise neural networks explicitly for
induction. The idea is to first decompose the problem in a sequence of
inductive steps and then to explicitly train the RNN to reproduce such steps.
Generalisation is achieved as the RNN is not allowed to learn an arbitrary
internal state; instead, it is tasked with mimicking the evolution of a valid
state. In particular, the state is restricted to a spatial memory map that
tracks parts of the input image which have been accounted for in previous
steps. The RNN is trained for single inductive steps, where it produces updates
to the memory in addition to the desired output. We evaluate our method on two
different visual recognition problems involving visual sequences: (1) text
spotting, i.e. joint localisation and reading of text in images containing
multiple lines (or a block) of text, and (2) sequential counting of objects in
aerial images. We show that inductive training of recurrent models enhances
their generalisation ability on challenging image datasets.Comment: In BMVC 2018 (spotlight
An investigation into the use of linguistic context in cursive script recognition by computer
The automatic recognition of hand-written text has been a goal
for over thirty five years. The highly ambiguous nature of cursive
writing (with high variability between not only different writers, but
even between different samples from the same writer), means that
systems based only on visual information are prone to errors.
It is suggested that the application of linguistic knowledge to
the recognition task may improve recognition accuracy. If a low-level
(pattern recognition based) recogniser produces a candidate lattice
(i.e. a directed graph giving a number of alternatives at each word
position in a sentence), then linguistic knowledge can be used to find
the 'best' path through the lattice.
There are many forms of linguistic knowledge that may be used
to this end. This thesis looks specifically at the use of collocation as a
source of linguistic knowledge. Collocation describes the statistical
tendency of certain words to co-occur in a language, within a defined
range. It is suggested that this tendency may be exploited to aid
automatic text recognition.
The construction and use of a post-processing system
incorporating collocational knowledge is described, as are a number
of experiments designed to test the effectiveness of collocation as an
aid to text recognition. The results of these experiments suggest that
collocational statistics may be a useful form of knowledge for this
application and that further research may produce a system of real
practical use
Multimodal Interactive Transcription of Handwritten Text Images
En esta tesis se presenta un nuevo marco interactivo y multimodal para la transcripción de
Documentos manuscritos. Esta aproximación, lejos de proporcionar la transcripción completa
pretende asistir al experto en la dura tarea de transcribir.
Hasta la fecha, los sistemas de reconocimiento de texto manuscrito disponibles no proporcionan
transcripciones aceptables por los usuarios y, generalmente, se requiere la intervención
del humano para corregir las transcripciones obtenidas. Estos sistemas han demostrado ser
realmente útiles en aplicaciones restringidas y con vocabularios limitados (como es el caso
del reconocimiento de direcciones postales o de cantidades numéricas en cheques bancarios),
consiguiendo en este tipo de tareas resultados aceptables. Sin embargo, cuando se trabaja
con documentos manuscritos sin ningún tipo de restricción (como documentos manuscritos
antiguos o texto espontáneo), la tecnología actual solo consigue resultados inaceptables.
El escenario interactivo estudiado en esta tesis permite una solución más efectiva. En este
escenario, el sistema de reconocimiento y el usuario cooperan para generar la transcripción final
de la imagen de texto. El sistema utiliza la imagen de texto y una parte de la transcripción
previamente validada (prefijo) para proponer una posible continuación. Despues, el usuario
encuentra y corrige el siguente error producido por el sistema, generando así un nuevo prefijo
mas largo. Este nuevo prefijo, es utilizado por el sistema para sugerir una nueva hipótesis. La
tecnología utilizada se basa en modelos ocultos de Markov y n-gramas. Estos modelos son
utilizados aquí de la misma manera que en el reconocimiento automático del habla. Algunas
modificaciones en la definición convencional de los n-gramas han sido necesarias para tener
en cuenta la retroalimentación del usuario en este sistema.Romero Gómez, V. (2010). Multimodal Interactive Transcription of Handwritten Text Images [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8541Palanci
Handwritten Text Recognition for Historical Documents in the tranScriptorium Project
""© Owner/Author 2014. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM, In Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage (pp. 111-117) http://dx.doi.org/10.1145/2595188.2595193Transcription of historical handwritten documents is a crucial
problem for making easier the access to these documents
to the general public. Currently, huge amount of historical
handwritten documents are being made available by on-line
portals worldwide. It is not realistic to obtain the transcription
of these documents manually, and therefore automatic
techniques has to be used. tranScriptorium is
a project that aims at researching on modern Handwritten
Text Recognition (HTR) technology for transcribing historical
handwritten documents. The HTR technology used in
tranScriptorium is based on models that are learnt automatically
from examples. This HTR technology has been
used on a Dutch collection from 15th century selected for
the tranScriptorium project. This paper provides preliminary
HTR results on this Dutch collection that are very
encouraging, taken into account that minimal resources have
been deployed to develop the transcription system.The research leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 600707 - tranScriptorium and the Spanish MEC under the STraDa (TIN2012-37475-C02-01) research project.Sánchez Peiró, JA.; Bosch Campos, V.; Romero Gómez, V.; Depuydt, K.; De Does, J. (2014). Handwritten Text Recognition for Historical Documents in the tranScriptorium Project. ACM. https://doi.org/10.1145/2595188.2595193
ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.[EN] This paper describes the Handwritten Text Recognition (HTR) competition on the READ dataset that has been held in the context of the International Conference on Frontiers in Handwriting Recognition 2016. This competition
aims to bring together researchers working on off-line HTR and provide them a suitable benchmark to compare their techniques on the task of transcribing typical historical handwritten documents. Two tracks with different conditions on
the use of training data were proposed. Ten research groups registered in the competition but finally five submitted results. The handwritten images for this competition were drawn from the German document Ratsprotokolle collection composed of minutes of the council meetings held from 1470 to 1805, used
in the READ project. The selected dataset is written by several hands and entails significant variabilities and difficulties. The five participants achieved good results with transcriptions word error rates ranging from 21% to 47% and character error rates rating from 5% to 19%.This work has been partially supported through the European Union's H2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943), and the MINECO/FEDER UE project TIN2015-70924-C2-1-R.Sánchez Peiró, JA.; Romero Gómez, V.; Toselli, AH.; Vidal, E. (2016). ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset. IEEE. https://doi.org/10.1109/ICFHR.2016.0120
Beyond writing: The development of literacy in the Ancient Near East
Previous discussions of the origins of writing in the Ancient Near East have not incorporated the neuroscience of literacy, which suggests that when southern Mesopotamians wrote marks on clay in the late-fourth millennium, they inadvertently reorganized their neural activity, a factor in manipulating the writing system to reflect language, yielding literacy through a combination of neurofunctional change and increased script fidelity to language. Such a development appears to take place only with a sufficient demand for writing and reading, such as that posed by a state-level bureaucracy; the use of a material with suitable characteristics; and the production of marks that are conventionalized, handwritten, simple, and non-numerical. From the perspective of Material Engagement Theory, writing and reading represent the interactivity of bodies, materiality, and brains: movements of hands, arms, and eyes; clay and the implements used to mark it and form characters; and vision, motor planning, object recognition, and language. Literacy is a cognitive change that emerges from and depends upon the nexus of interactivity of the components
Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
Sinhala is the native language of the Sinhalese people who make up the
largest ethnic group of Sri Lanka. The language belongs to the globe-spanning
language tree, Indo-European. However, due to poverty in both linguistic and
economic capital, Sinhala, in the perspective of Natural Language Processing
tools and research, remains a resource-poor language which has neither the
economic drive its cousin English has nor the sheer push of the law of numbers
a language such as Chinese has. A number of research groups from Sri Lanka have
noticed this dearth and the resultant dire need for proper tools and research
for Sinhala natural language processing. However, due to various reasons, these
attempts seem to lack coordination and awareness of each other. The objective
of this paper is to fill that gap of a comprehensive literature survey of the
publicly available Sinhala natural language tools and research so that the
researchers working in this field can better utilize contributions of their
peers. As such, we shall be uploading this paper to arXiv and perpetually
update it periodically to reflect the advances made in the field
- …