149 research outputs found

    Arabic Handwritten Word Recognition based on Bernoulli Mixture HMMs

    This thesis presents new approaches to off-line Arabic handwriting recognition based on conventional Bernoulli hidden Markov models. Off-line handwriting recognition, and Arabic handwriting recognition in particular, is still far from perfect. Hidden Markov models (HMMs) are now widely used for off-line handwriting recognition in many languages, including Arabic. As in speech recognition, they are usually built from shared, embedded HMMs at the symbol level, in which state-conditional probability density functions are modeled with Gaussian mixtures. In contrast to speech recognition, however, it is unclear which kind of features should be used and, indeed, very different feature sets are in use today. Among them, we have recently proposed simply using columns of raw, binary image pixels, which are fed directly into embedded Bernoulli (mixture) HMMs, that is, embedded HMMs in which the emission probabilities are modeled with Bernoulli mixtures. The idea is to bypass a separate feature extraction step, so that no discriminative information is filtered out before recognition; feature extraction is, in some sense, integrated into the recognition model. In this thesis, we review this idea along with some extensions that currently provide state-of-the-art results on Arabic handwritten word recognition.
    Alkhoury, I. (2010). Arabic Handwritten Word Recognition based on Bernoulli Mixture HMMs. http://hdl.handle.net/10251/11478
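To make the emission model in the abstract above concrete, here is a minimal sketch of how one binary pixel column could be scored under a Bernoulli mixture. The function name, the clamping constant `eps`, and the toy parameters are assumptions for illustration; the thesis's actual parameterization and training procedure are not given in the abstract.

```python
import math


def bernoulli_mixture_log_prob(column, priors, prototypes, eps=1e-6):
    """Log-probability of a binary pixel column under a Bernoulli mixture.

    column     -- list of 0/1 pixel values (one image column)
    priors     -- mixture weights, one per component (should sum to 1)
    prototypes -- per-component lists of P(pixel = 1), same length as column
    eps        -- hypothetical clamp that keeps log() away from 0 and 1
    """
    component_logs = []
    for prior, prototype in zip(priors, prototypes):
        log_p = math.log(prior)
        for pixel, p_on in zip(column, prototype):
            p_on = min(max(p_on, eps), 1.0 - eps)
            log_p += math.log(p_on) if pixel else math.log(1.0 - p_on)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    top = max(component_logs)
    return top + math.log(sum(math.exp(v - top) for v in component_logs))


# Toy example: two components with opposite "ink" patterns.
column = [1, 0, 1, 1]
priors = [0.5, 0.5]
prototypes = [[0.9, 0.1, 0.9, 0.9], [0.1, 0.9, 0.1, 0.1]]
log_p = bernoulli_mixture_log_prob(column, priors, prototypes)
```

In an embedded HMM, each state would hold its own `priors` and `prototypes`, and a value like `log_p` would serve as that state's emission score for the column.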

    Motion-resistant pulse oximetry

    This study examines the measurement of vital signs, such as peripheral capillary oxygen saturation (SpO2) and heart rate (HR), by a pulse oximeter. The pulse oximeter is a non-invasive device that measures photoplethysmography (PPG) signals and extracts vital signs from them. However, the quality of the PPG signal measured by oximetry sensors is known to deteriorate when substantial movement of the wearer or the sensor adds to the measurement noise. This study presents methods that suppress such noise in PPG signals measured by an oximeter and calculate the associated vital signs with high accuracy even when the wearer is under substantial motion. The spectral components of the PPG waveform are known to appear at a fundamental frequency that corresponds to the participant's HR and at its harmonics. To match this signal, a time-varying comb filter tuned to the participant's HR is employed. The filter captures the HR components and eliminates most other artifacts. A significant improvement in the accuracy of SpO2 calculated from the comb-filtered PPG signals is observed when tested on data collected from human participants at rest and during exercise. In addition, an architecture that integrates SpO2 levels from multiple PPG channels mounted on different parts of the wearer's arm is presented. The SpO2 levels are integrated using a Kalman filter that uses past measurements and a model of the SpO2 dynamics to attenuate the effect of motion artifacts. Again, data collected from human participants at rest and during exercise are used. The integrated SpO2 levels are shown to be more accurate and reliable than those calculated from individual channels. Motion-resistant algorithms typically require an additional noise reference signal to produce high-quality vital signs such as HR. A framework that employs PPG sensors only (one in the green and one in the infrared spectrum) to compute high-quality HR levels is developed. Our framework is tested on experimental data collected from human participants at rest and while running at various speeds. Our PPG-only framework generates HR levels with high accuracy and low computational complexity compared with leading HR calculation methods in the literature that require a noise reference signal. The methods for SpO2 and HR calculation presented in this study are desirable because (1) they yield high accuracy in estimating vital signs under substantial motion artifacts and (2) they are computationally efficient and can therefore be implemented in wearable devices.
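The comb-filtering idea can be illustrated with a deliberately simplified sketch. It assumes a fixed, known heart rate and a basic feedforward comb filter; the study's filter is time-varying, and its actual design is not described in the abstract. The function name, sampling rate, and test signal are all hypothetical.

```python
import math


def comb_filter(signal, sample_rate_hz, heart_rate_bpm):
    """Feedforward comb filter y[n] = (x[n] + x[n - delay]) / 2.

    The delay is one heart-beat period in samples, so the passband peaks
    sit at the heart-rate frequency and its harmonics, while components
    halfway between harmonics are cancelled.
    """
    delay = round(sample_rate_hz * 60.0 / heart_rate_bpm)
    if delay < 1:
        raise ValueError("heart rate too high for this sampling rate")
    filtered = []
    for n, x in enumerate(signal):
        # During the first beat period there is no delayed sample yet,
        # so the input is passed through unchanged.
        delayed = signal[n - delay] if n >= delay else x
        filtered.append(0.5 * (x + delayed))
    return filtered


# Synthetic check: a 75 bpm "pulse" plus an artifact at 1.5x that frequency.
fs = 100.0                      # assumed sampling rate in Hz
hr = 75.0                       # assumed heart rate in beats per minute
f0 = hr / 60.0                  # fundamental frequency in Hz
pulse = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]
artifact = [0.8 * math.sin(2 * math.pi * 1.5 * f0 * n / fs) for n in range(400)]
noisy = [p + a for p, a in zip(pulse, artifact)]
clean = comb_filter(noisy, fs, hr)
start = round(fs * 60.0 / hr)   # first sample with a full delay available
residual = max(abs(c - p) for c, p in zip(clean[start:], pulse[start:]))
```

A time-varying version would recompute the delay as the heart-rate estimate changes; that tracking step is outside this sketch.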

    Arabic Text Recognition and Machine Translation

    Research on Arabic Handwritten Text Recognition (HTR) and Arabic-English Machine Translation (MT) has usually been approached as two independent areas of study. Creating a single system that combines both areas, in order to generate English translations from images containing Arabic text, is still a very challenging task. This process can be interpreted as the translation of Arabic text images. In this thesis, we propose a system that recognizes Arabic handwritten text images and translates the recognized text into English. This system is built by combining an HTR system and an MT system. Regarding the HTR system, our work focuses on the use of Bernoulli Hidden Markov Models (BHMMs). BHMMs have proven to work very well with Latin script; indeed, empirical results based on them have been reported on well-known corpora such as IAM and RIMES. In this thesis, these results are extended to Arabic script, in particular to the well-known IfN/ENIT and NIST OpenHaRT databases for Arabic handwritten text. The need to transcribe Arabic text is not limited to handwritten text; it also extends to printed text. Printed Arabic text can be considered a simplified form of handwritten text, so we also propose Bernoulli HMMs for this kind of text. In addition, we compare BHMMs with state-of-the-art technology based on neural networks. A key idea that has proven very effective in this application of Bernoulli HMMs is the use of a sliding window of adequate width for feature extraction. This idea has allowed us to obtain very competitive results in the recognition of both handwritten and printed Arabic text. Indeed, a system based on it ranked first at the ICDAR 2011 Arabic recognition competition on the Arabic Printed Text Image (APTI) database. Moreover, this idea has been refined by applying repositioning techniques to the extracted windows, leading to further improvements in Arabic text recognition. In the case of handwritten text, this refinement improved our system that ranked first at the ICFHR 2010 Arabic handwriting recognition competition on IfN/ENIT. In the case of printed text, this refinement led to an improved system that ranked second at the ICDAR 2013 Competition on Multi-font and Multi-size Digitally Represented Arabic Text on APTI. Furthermore, this refinement was used with neural network-based technology, which led to state-of-the-art results. For machine translation, the system was based on the combination of three state-of-the-art statistical models: standard phrase-based models, hierarchical phrase-based models, and N-gram phrase-based models. This combination was done using the Recognizer Output Voting Error Reduction (ROVER) method. Finally, we propose three methods of combining HTR and MT to develop an Arabic image translation system. The system was evaluated on the NIST OpenHaRT database, where competitive results were obtained.
    Alkhoury, I. (2015). Arabic Text Recognition and Machine Translation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/53029
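The sliding-window feature extraction mentioned in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation: the function name, the zero padding at the image borders, and the odd-width restriction are hypothetical choices, and the repositioning refinement is not included.

```python
def sliding_window_features(image, width):
    """Build one binary feature vector per image column.

    image -- list of rows, each a list of 0/1 pixels (all rows same length)
    width -- odd window width in columns (assumed odd so the window can be
             centred on the current column)

    Each feature vector concatenates the pixel columns inside the window,
    left to right; columns that fall outside the image are zero-padded.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    height = len(image)
    n_cols = len(image[0]) if image else 0
    half = width // 2
    features = []
    for center in range(n_cols):
        window = []
        for col in range(center - half, center + half + 1):
            inside = 0 <= col < n_cols
            for row in range(height):
                window.append(image[row][col] if inside else 0)
        features.append(window)
    return features


# Toy 2x3 binary image, window of width 3.
toy_image = [
    [1, 0, 1],
    [0, 1, 0],
]
toy_features = sliding_window_features(toy_image, 3)
```

Each resulting vector has `height * width` binary entries, which is the form of input a Bernoulli mixture emission model expects.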

    Vision-based route following by an embodied insect-inspired sparse neural network

    We compared the efficiency of the FlyHash model, an insect-inspired sparse neural network (Dasgupta et al., 2017), to similar but non-sparse models in an embodied navigation task. The task requires a model to control steering by comparing current visual inputs to memories stored along a training route. We concluded that the FlyHash model is more efficient than the others, especially in terms of data encoding.
    Comment: 8 pages, 4 figures; work-in-progress submission, accepted as a poster at ICLR 2023 Workshop on Sparsity in Neural Networks; non-archival
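For readers unfamiliar with FlyHash, the following is a minimal sketch of the general scheme: a sparse random expansion followed by winner-take-all selection. All names and sizes here are hypothetical, input centring is omitted for brevity, and the abstract does not state the configuration used in the paper.

```python
import random


def make_projection(n_inputs, n_units, inputs_per_unit, seed=0):
    """Sparse random connectivity: each unit samples a few input indices."""
    rng = random.Random(seed)
    return [rng.sample(range(n_inputs), inputs_per_unit) for _ in range(n_units)]


def fly_hash(vector, projection, n_active):
    """Return the indices of the n_active most strongly driven units."""
    activations = [sum(vector[i] for i in indices) for indices in projection]
    ranked = sorted(range(len(activations)),
                    key=lambda unit: activations[unit], reverse=True)
    return frozenset(ranked[:n_active])  # winner-take-all sparse code


def overlap(code_a, code_b):
    """Number of shared active units, usable as a familiarity score."""
    return len(code_a & code_b)


# Toy "view" vector expanded from 20 inputs to 200 units, 10 of them active.
view = [0.1 * i for i in range(20)]
projection = make_projection(n_inputs=20, n_units=200, inputs_per_unit=3, seed=1)
code = fly_hash(view, projection, n_active=10)
```

In a route-following setting, codes stored along the training route could be compared with the code of the current view using `overlap`, and steering could favour the heading with the highest familiarity; that control loop is not shown here.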

    Arabic recognition and translation system

    To our knowledge, only a few systems can automatically translate handwritten text images into another language, particularly for Arabic. Typically, the available systems are based on a concatenation of two systems: a Handwritten Text Recognition (HTR) system and a Machine Translation (MT) system. For the recognition of Arabic text images, our work has focused on the use of embedded Bernoulli (mixture) HMMs (BHMMs), that is, embedded HMMs in which the emission probabilities are modeled with Bernoulli mixtures. For Arabic text translation, our work has focused on one of the state-of-the-art phrase-based log-linear translation models. In this work we evaluate our system on the LDC corpus introduced in the NIST OpenHaRT 2010 and 2013 evaluations. Very competitive and promising results are shown. Additionally, we present the idea of a simple mobile application for image translation that recognizes the Arabic text in an image and translates the recognized text into English.
    Alkhoury, I. (2013). Arabic recognition and translation system. http://hdl.handle.net/10251/33086

    Esophageal granular cell tumor colliding with intramucosal adenocarcinoma: a case report

    We report a case of a granular cell tumor colliding with intramucosal adenocarcinoma of the esophagus. A 58-year-old white patient was found to have a 5 mm nodule in the distal esophagus, detected by upper gastrointestinal endoscopy performed as part of the workup of long-standing reflux. Endoscopic biopsies revealed intramucosal adenocarcinoma arising in the setting of Barrett’s esophagus. The adenocarcinoma infiltrated a granular cell tumor also present at the nodular site. Endoscopic mucosal resection using Duette band ligation and hot snare electrocautery was performed. Margins were negative for both tumors, and endoscopic surveillance for recurrence is planned.