
33 research outputs found

Design of an Offline Handwriting Recognition System Tested on the Bangla and Korean Scripts

Author: Majid Nishatul
Publication venue: 'IUScholarWorks'
Publication date: 01/08/2020

This dissertation presents a flexible and robust offline handwriting recognition system, tested on the Bangla and Korean scripts. Offline handwriting recognition is one of the most challenging unsolved problems in machine learning. While a few popular scripts (like Latin) have received a lot of attention, many other widely used scripts (like Bangla) have seen very little progress. Features such as connectedness and vowels structured as diacritics make Bangla a challenging script to recognize. A simple and robust design for offline recognition is presented which not only works reliably but can also be used for almost any alphabetic writing system. The framework has been rigorously tested on Bangla, and experiments on the Korean script, whose two-dimensional arrangement of characters makes it a challenge to recognize, demonstrate how it can be adapted to other scripts. The base of this design is a character spotting network which detects the location of different script elements (such as characters and diacritics) in an unsegmented word image. A transcript is formed from the detected classes based on their corresponding location information. This is the first reported lexicon-free offline recognition system for Bangla, and it achieves a Character Recognition Accuracy (CRA) of 94.8%. It is also one of the most flexible architectures ever presented. Recognition of Korean was achieved with a 91.2% CRA. In addition, a powerful autonomous tagging technique was developed which can drastically reduce the effort of preparing a dataset for any script. The combination of the character spotting method and autonomous tagging brings the entire offline recognition problem very close to a singular solution. Additionally, a database named the Boise State Bangla Handwriting Dataset was developed. This is one of the richest offline datasets currently available for Bangla, and it has been made publicly accessible to accelerate research progress.
Many other tools were developed, and experiments were conducted to validate this framework more rigorously by evaluating the method against external datasets (CMATERdb 1.1.1, Indic Word Dataset and REID2019: Early Indian Printed Documents). Offline handwriting recognition is an extremely promising technology, and the outcome of this research moves the field significantly ahead.
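The step in which a transcript is formed from detected classes and their locations can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the `Detection` record, the `form_transcript` helper, the confidence threshold, and the simple left-to-right ordering are all assumptions made for illustration, and a real system for Bangla or Korean would need additional rules for diacritics and two-dimensional layouts.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical output record of a character spotting network."""
    label: str       # predicted class (a character or diacritic)
    x_center: float  # horizontal centre of the detected element, in pixels
    score: float     # detector confidence in [0, 1]


def form_transcript(detections, min_score=0.5):
    """Keep confident detections, order them left to right, join the labels.

    The threshold and the purely horizontal ordering are illustrative
    assumptions, not the method described in the abstract.
    """
    kept = [d for d in detections if d.score >= min_score]
    kept.sort(key=lambda d: d.x_center)
    return "".join(d.label for d in kept)


detections = [
    Detection("c", 52.0, 0.91),
    Detection("a", 20.5, 0.88),
    Detection("x", 35.0, 0.20),  # low-confidence detection, dropped
    Detection("b", 36.0, 0.95),
]
print(form_transcript(detections))  # prints "abc"
```

Because the transcript is assembled directly from detected elements, no word lexicon is consulted, which is in the spirit of the lexicon-free recognition described above.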

Boise State University - ScholarWorks

Articulatory Copy Synthesis Based on the Speech Synthesizer VocalTractLab

Author: Gao Yingming
Publication date: 04/08/2022

Articulatory copy synthesis (ACS), a subarea of speech inversion, refers to the reproduction of natural utterances and involves both the physiological articulatory processes and their corresponding acoustic results. This thesis proposes two novel methods for the ACS of human speech using the articulatory speech synthesizer VocalTractLab (VTL) to address or mitigate the existing problems of speech inversion, such as non-unique mapping, acoustic variation among different speakers, and the time-consuming nature of the process. The first method involved finding appropriate VTL gestural scores for given natural utterances using a genetic algorithm. It consisted of two steps: gestural score initialization and optimization. In the first step, gestural scores were initialized from the given acoustic signals using speech recognition, grapheme-to-phoneme (G2P) conversion, and a VTL rule-based method for converting phoneme sequences to gestural scores. In the second step, the initial gestural scores were optimized by a genetic algorithm via an analysis-by-synthesis (ABS) procedure that sought to minimize the cosine distance between the acoustic features of the synthetic and natural utterances. The articulatory parameters were also regularized during the optimization process to restrict them to reasonable values. The second method was based on long short-term memory (LSTM) and convolutional neural networks, which were responsible for capturing the temporal dependence and the spatial structure of the acoustic features, respectively. Neural network regression models were trained that took acoustic features as inputs and produced articulatory trajectories as outputs. In addition, to cover as much of the articulatory and acoustic space as possible, the training samples were augmented by manipulating the phonation type, speaking effort, and vocal tract length of the synthetic utterances.
Furthermore, two regularization methods were proposed: one based on the smoothness loss of articulatory trajectories and another based on the acoustic loss between original and predicted acoustic features. The best-performing genetic algorithms and convolutional LSTM systems (evaluated in terms of the difference between the estimated and reference VTL articulatory parameters) obtained average correlation coefficients of 0.985 and 0.983 for speaker-dependent utterances, respectively, and their reproduced speech achieved recognition accuracies of 86.25% and 64.69% for speaker-independent utterances of German words, respectively. When applied to German sentence utterances, as well as English and Mandarin Chinese word utterances, the neural network based ACS systems achieved recognition accuracies of 73.88%, 52.92%, and 52.41%, respectively. The results showed that both methods reproduced not only the articulatory processes but also the acoustic signals of the reference utterances. Moreover, the regularization methods led to more physiologically plausible articulatory processes and brought the estimated articulatory trajectories closer to those preferred by VTL, thus reproducing more natural and intelligible speech. This study also found that the convolutional layers, when used in conjunction with batch normalization layers, automatically learned more distinctive features from log power spectrograms. Furthermore, the neural network based ACS systems trained on German data could be generalized to the utterances of other languages.
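Two quantities named in this abstract, the cosine distance minimized in the analysis-by-synthesis loop and a smoothness loss on articulatory trajectories, can be sketched in a few lines of Python. The function names and the exact form of the smoothness term (mean squared difference between consecutive frames) are assumptions chosen for illustration; the abstract does not give the thesis's own definitions.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two feature vectors.

    Assumes equal-length, non-zero vectors; 0.0 means the same direction.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def smoothness_loss(trajectory):
    """Mean squared difference between consecutive frames.

    One common way to penalize jerky trajectories; an assumed form,
    not necessarily the one used in the thesis.
    """
    if len(trajectory) < 2:
        return 0.0
    diffs = [(nxt - cur) ** 2 for cur, nxt in zip(trajectory, trajectory[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical acoustic feature vectors for a natural and a synthetic frame.
natural = [1.0, 0.0]
synthetic = [0.0, 1.0]
print(cosine_distance(natural, synthetic))  # prints 1.0 (orthogonal vectors)

# A hypothetical one-parameter articulatory trajectory over three frames.
print(smoothness_loss([0.0, 1.0, 2.0]))  # prints 1.0
```

A lower cosine distance indicates that the synthetic features point in nearly the same direction as the natural ones, and a lower smoothness loss indicates a trajectory with smaller frame-to-frame jumps.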

Technische Universität Dresden: Qucosa

Distributed learning in sensor networks

Author: Gao Huaien
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 26/01/2009

Digitale Hochschulschriften der LMU

On the Synthesis of fuzzy neural systems.

Author: Chung Fu Lai
Publication venue: Department of Cultural and Religious Studies, The Chinese University of Hong Kong
Publication date: 01/01/1995

By Chung, Fu Lai. Thesis (Ph.D.), Chinese University of Hong Kong, 1995. Includes bibliographical references (leaves 166-174).

Contents:
Acknowledgement
Abstract
Chapter 1. Introduction
  1.1 Integration of Fuzzy Systems and Neural Networks
  1.2 Objectives of the Research
    1.2.1 Fuzzification of Competitive Learning Algorithms
    1.2.2 Capacity Analysis of FAM and FRNS Models
    1.2.3 Structure and Parameter Identifications of FRNS
  1.3 Outline of the Thesis
Chapter 2. A Fuzzy System Primer
  2.1 Basic Concepts of Fuzzy Sets
  2.2 Fuzzy Set-Theoretic Operators
  2.3 Linguistic Variable, Fuzzy Rule and Fuzzy Inference
  2.4 Basic Structure of a Fuzzy System
    2.4.1 Fuzzifier
    2.4.2 Fuzzy Knowledge Base
    2.4.3 Fuzzy Inference Engine
    2.4.4 Defuzzifier
  2.5 Concluding Remarks
Chapter 3. Categories of Fuzzy Neural Systems
  3.1 Introduction
  3.2 Fuzzification of Neural Networks
    3.2.1 Fuzzy Membership Driven Models
    3.2.2 Fuzzy Operator Driven Models
    3.2.3 Fuzzy Arithmetic Driven Models
  3.3 Layered Network Implementation of Fuzzy Systems
    3.3.1 Mamdani's Fuzzy Systems
    3.3.2 Takagi and Sugeno's Fuzzy Systems
    3.3.3 Fuzzy Relation Based Fuzzy Systems
  3.4 Concluding Remarks
Chapter 4. Fuzzification of Competitive Learning Networks
  4.1 Introduction
  4.2 Crisp Competitive Learning
    4.2.1 Unsupervised Competitive Learning Algorithm
    4.2.2 Learning Vector Quantization Algorithm
    4.2.3 Frequency Sensitive Competitive Learning Algorithm
  4.3 Fuzzy Competitive Learning
    4.3.1 Unsupervised Fuzzy Competitive Learning Algorithm
    4.3.2 Fuzzy Learning Vector Quantization Algorithm
    4.3.3 Fuzzy Frequency Sensitive Competitive Learning Algorithm
  4.4 Stability of Fuzzy Competitive Learning
  4.5 Controlling the Fuzziness of Fuzzy Competitive Learning
  4.6 Interpretations of Fuzzy Competitive Learning Networks
  4.7 Simulation Results
    4.7.1 Performance of Fuzzy Competitive Learning Algorithms
    4.7.2 Performance of Monotonically Decreasing Fuzziness Control Scheme
    4.7.3 Interpretation of Trained Networks
  4.8 Concluding Remarks
Chapter 5. Capacity Analysis of Fuzzy Associative Memories
  5.1 Introduction
  5.2 Fuzzy Associative Memories (FAMs)
  5.3 Storing Multiple Rules in FAMs
  5.4 A High Capacity Encoding Scheme for FAMs
  5.5 Memory Capacity
  5.6 Rule Modification
  5.7 Inference Performance
  5.8 Concluding Remarks
Chapter 6. Capacity Analysis of Fuzzy Relational Neural Systems
  6.1 Introduction
  6.2 Fuzzy Relational Equations and Fuzzy Relational Neural Systems
  6.3 Solving a System of Fuzzy Relational Equations
  6.4 New Solvable Conditions
    6.4.1 Max-t Fuzzy Relational Equations
    6.4.2 Min-s Fuzzy Relational Equations
  6.5 Approximate Resolution
  6.6 System Capacity
  6.7 Inference Performance
  6.8 Concluding Remarks
Chapter 7. Structure and Parameter Identifications of Fuzzy Relational Neural Systems
  7.1 Introduction
  7.2 Modelling Nonlinear Dynamic Systems by Fuzzy Relational Equations
  7.3 A General FRNS Identification Algorithm
  7.4 An Evolutionary Computation Approach to Structure and Parameter Identifications
    7.4.1 Guided Evolutionary Simulated Annealing
    7.4.2 An Evolutionary Identification (EVIDENT) Algorithm
  7.5 Simulation Results
  7.6 Concluding Remarks
Chapter 8. Conclusions
  8.1 Summary of Contributions
    8.1.1 Fuzzy Competitive Learning
    8.1.2 Capacity Analysis of FAM and FRNS
    8.1.3 Numerical Identification of FRNS
  8.2 Further Investigations
Appendix A. Publication List of the Candidate
Bibliography

CUHK Digital Repository

First Annual Workshop on Space Operations Automation and Robotics (SOAR 87)

Author: Griffin Sandy

Several topics related to automation and robotics technology are discussed. Automation of checkout, ground support, and logistics; automated software development; man-machine interfaces; neural networks; systems engineering and distributed/parallel processing architectures; and artificial intelligence/expert systems are among the topics covered.

NASA Technical Reports Server

Pertanika Journal of Science & Technology

Author: Universiti Putra Malaysia Press
Publication venue: Universiti Putra Malaysia Press
Publication date: 01/01/2017

Universiti Putra Malaysia Institutional Repository

Artificial Intelligence methodologies to early predict student outcome and enrich learning material

Author: CANALE LORENZO
Publication venue: country:Italy
Publication date: 13/12/2022

The abstract is in the attachment.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

The 1989 Goddard Conference on Space Applications of Artificial Intelligence

Author: Rash James

The following topics are addressed: mission operations support; planning and scheduling; fault isolation/diagnosis; image processing and machine vision; data management; and modeling and simulation.

NASA Technical Reports Server

Computer-aided design of cellular manufacturing layout.

Author: Wu Yue
Publication date: 01/01/1999

Durham e-Theses

Methods of covert communication of speech signals based on a bio-inspired principle

Author: Ballesteros Larrota Dora María
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2013

This work presents two speech hiding methods based on a bio-inspired concept known as the ability of adaptation of speech signals. A cryptographic model uses the adaptation to transform a secret message into a non-sensitive target speech signal, so that the scrambled speech signal is itself an intelligible signal. The residual intelligibility is extremely low, which makes the method appropriate for transmitting secure speech signals. In a steganographic model, on the other hand, the adapted speech signal is hidden in a host signal by using indirect substitution or direct substitution. In the first case, the scheme is known as Efficient Wavelet Masking (EWM), and in the second case, it is known as improved-EWM (iEWM). EWM proved to be highly statistically transparent, while iEWM proved to be highly robust against signal manipulations. Finally, for the purpose of transmitting secure speech signals in real-time operation, a hardware-based scheme is proposed.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Secretaría de Estado de Cultura