14 research outputs found

    Offline Handwritten Digit Recognition Using Triangle Geometry Properties

    Offline handwritten digit recognition is a frequently studied problem. Most digits have their own handwriting characteristics, and recognizing their patterns and types is a challenging task. Lately, triangle geometry properties have been adapted to identify the pattern and type of handwritten digits. However, the large number of generated triangle features and the volume of data have caused slow performance and longer processing times. Therefore, in this paper, we propose an improvement on triangle features that combines the ratio and gradient features to overcome this problem. Four datasets are used in the experiment: IFCHDB, HODA, MNIST and BANGLA. The comparison is based on the training time for each dataset. In addition, Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP) techniques are used to measure the accuracy on each dataset.

    MLP neural network based gas classification system on Zynq SoC

    Systems based on Wireless Gas Sensor Networks (WGSN) offer a powerful tool to observe and analyse data in complex environments over long monitoring periods. Since the reliability of sensors is very important in those systems, gas classification is a critical process within gas safety precautions. A gas classification system has to react fast so that essential actions can be taken when a fault is detected. This paper proposes a low-latency, real-time gas classification service system that uses a Multi-Layer Perceptron (MLP) Artificial Neural Network (ANN) to detect and classify the gas sensor data. An accurate MLP is developed to work with the data set obtained from an array of tin oxide (SnO2) gas sensors based on convex Micro hotplates (MHP). The overall system acquires the gas sensor data through RFID and processes it with the proposed MLP classifier implemented on a System on Chip (SoC) platform from Xilinx. The hardware implementation of the classifier is optimized to achieve very low latency for real-time application. The proposed architecture has been implemented on a ZYNQ SoC using a fixed-point format, and the results show an accuracy of 97.4%.
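
The abstract above does not state the fixed-point word length, the network size, or the activation function. As a rough illustration of what fixed-point MLP inference involves, the following minimal sketch assumes a Q8.8 format, a ReLU hidden layer, and made-up weights. All names (`to_fixed`, `quantize_layer`, `fixed_dense`, `classify`, `LAYERS`) are hypothetical and are not taken from the paper.

```python
# Sketch of fixed-point MLP inference using integer arithmetic only.
# Assumptions (not from the paper): Q8.8 format, ReLU hidden layer, toy weights.
FRAC_BITS = 8            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS   # 256 for Q8.8


def to_fixed(x):
    """Quantize a real number to a signed fixed-point integer."""
    return int(round(x * SCALE))


def quantize_layer(weights, biases):
    """Convert one layer's floating-point parameters to fixed point."""
    return ([[to_fixed(w) for w in row] for row in weights],
            [to_fixed(b) for b in biases])


def fixed_dense(inputs, weights, biases):
    """Fully connected layer: integer multiply-accumulate, then rescale."""
    outputs = []
    for row, bias in zip(weights, biases):
        # Products carry 2 * FRAC_BITS fractional bits, so align the bias first.
        acc = bias << FRAC_BITS
        for x, w in zip(inputs, row):
            acc += x * w
        outputs.append(acc >> FRAC_BITS)  # back to FRAC_BITS fractional bits
    return outputs


def relu(values):
    return [max(0, v) for v in values]


def classify(sample, layers):
    """Return the index of the largest output for a floating-point sample."""
    hidden = [to_fixed(v) for v in sample]
    for index, (weights, biases) in enumerate(layers):
        hidden = fixed_dense(hidden, weights, biases)
        if index < len(layers) - 1:   # no activation on the output layer
            hidden = relu(hidden)
    return max(range(len(hidden)), key=hidden.__getitem__)


# Toy two-layer network with hypothetical weights, for illustration only.
LAYERS = [
    quantize_layer([[1.0, -0.5], [-1.0, 0.5]], [0.0, 0.0]),
    quantize_layer([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
```

Calling `classify([0.5, 0.25], LAYERS)` runs the whole forward pass on integers. In hardware, the same multiply-accumulate and shift steps would map to fixed-width arithmetic, where overflow and saturation also need to be handled. This sketch ignores those because Python integers are unbounded.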

    A review on handwritten character and numeral recognition for Roman, Arabic, Chinese and Indian scripts

    Handwritten character recognition (HCR) has been researched intensively for almost four decades. The research has covered popular scripts such as Roman, Arabic, Chinese and Indian. In this paper we present a review of HCR work on these four scripts. We summarize most of the papers published from 2005 onward and analyze the various methods used to create a robust HCR system. We also suggest some future directions for HCR research.

    Convolutional neural networks for face recognition and finger-vein biometric identification

    The Convolutional Neural Network (CNN), a variant of the Multilayer Perceptron (MLP), has shown promise in solving complex recognition problems, particularly in visual pattern recognition. However, the classical LeNet-5 CNN model, on which most solutions are based, is highly compute-intensive. This CNN also suffers from long training times because of its large number of layers, which ranges from six to eight. In this research, a CNN model with reduced complexity is proposed for face recognition and finger-vein biometric identification. A simpler architecture is obtained by fusing the convolutional and subsampling layers into one layer, in conjunction with a partial connection scheme applied between the first two layers of the network. As a result, the total number of layers is reduced to four. The number of feature maps at each layer is optimized according to the type of image database being processed. Consequently, the number of network parameters (including neurons, trainable parameters and connections) is significantly reduced, which increases the generalization ability of the network. The Stochastic Diagonal Levenberg-Marquardt (SDLM) backpropagation algorithm is modified and applied in training the proposed network. With this learning algorithm, convergence is accelerated such that the proposed CNN converges within 15 epochs. For face recognition, the proposed CNN achieves recognition rates of 100.00% and 99.50% on the AT&T and AR Purdue face databases, respectively. Recognition time on the AT&T database is less than 0.003 seconds. These results outperform previous works. In addition, compared with other CNN-based face recognizers, the proposed CNN model has the fewest network parameters and hence better generalization ability. A training scheme is also proposed to recognize new categories without full CNN training.
In this research, a novel CNN solution for the finger-vein biometric identification problem is also proposed. To the best of our knowledge, no previous work reported in the literature has applied a CNN to finger-vein recognition. The proposed method is efficient in that only simple preprocessing algorithms are deployed. The CNN design is adapted to a finger-vein database that was developed in-house and contains 81 subjects. A recognition accuracy of 99.38% is achieved, which is similar to the results of state-of-the-art work. In conclusion, the success of the research in solving face recognition and finger-vein biometric identification problems demonstrates the feasibility of the proposed CNN model for pattern recognition systems.
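
The abstract does not say how the convolutional and subsampling layers are fused. One common way to get a similar effect is to evaluate the convolution only at every second position, that is, a convolution with stride 2. The sketch below illustrates that idea under this assumption. The function name `fused_conv_subsample` and its parameters are hypothetical, and the code is not the authors' implementation.

```python
# Sketch: convolution and subsampling fused into one strided convolution.
# Assumption (not from the abstract): fusion is modelled as stride-2 filtering.

def fused_conv_subsample(image, kernel, stride=2):
    """Valid 2-D cross-correlation evaluated only every `stride` pixels."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = (len(image) - kernel_h) // stride + 1
    out_w = (len(image[0]) - kernel_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            acc = 0.0
            for u in range(kernel_h):
                for v in range(kernel_w):
                    acc += image[top + u][left + v] * kernel[u][v]
            row.append(acc)
        output.append(row)
    return output


# Toy input: a 32x32 image of ones filtered with a 5x5 averaging kernel.
IMAGE = [[1.0] * 32 for _ in range(32)]
KERNEL = [[1.0 / 25.0] * 5 for _ in range(5)]
FEATURE_MAP = fused_conv_subsample(IMAGE, KERNEL)
```

With a 5x5 kernel and stride 2, a 32x32 input yields a 14x14 feature map. LeNet-5 reaches the same size only after a separate 5x5 convolution (28x28) followed by 2x2 subsampling, so the fused form needs one layer instead of two and computes roughly a quarter of the convolution outputs.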

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report); 2017. Since their introduction, FPGAs have appeared in a growing number of application fields. Their key advantage is the combination of software-like flexibility with performance otherwise typical of hardware. Nevertheless, every application field places special requirements on the computational architecture used. This paper provides an overview of the topics FPGAs have been used for in the last 15 years of research and why they were chosen over other processing units such as CPUs.

    Advanced document data extraction techniques to improve supply chain performance

    In this thesis, a novel machine learning technique to extract text-based information from scanned images has been developed. This information extraction is performed in the context of scanned invoices and bills used in financial transactions. These financial transactions contain a considerable amount of data that must be extracted, refined, and stored digitally before it can be used for analysis. Converting this data into a digital format is often a time-consuming process. Automation and data optimisation show promise as methods for reducing the time required and the cost of Supply Chain Management (SCM) processes, especially Supplier Invoice Management (SIM), Financial Supply Chain Management (FSCM) and Supply Chain procurement processes. This thesis uses a cross-disciplinary approach involving Computer Science and Operational Management to explore the benefit of automated invoice data extraction in business and its impact on SCM. The study adopts a multimethod approach based on empirical research, surveys, and interviews performed on selected companies. The expert system developed in this thesis focuses on two distinct areas of research: Text/Object Detection and Text Extraction. For Text/Object Detection, the Faster R-CNN model was analysed. While this model yields outstanding results in terms of object detection, it is limited by poor performance when image quality is low. The Generative Adversarial Network (GAN) model is proposed in response to this limitation. The GAN model consists of a generator network implemented with the help of the Faster R-CNN model and a discriminator that relies on PatchGAN. The output of the GAN model is text data with bounding boxes.
For text extraction from the bounding box, a novel data extraction framework was designed. It consists of several processes: XML processing when an existing OCR engine is used, bounding box pre-processing, text clean-up, OCR error correction, spell check, type check, pattern-based matching, and finally a learning mechanism for automating future data extraction. Whichever fields the system can extract successfully are provided in key-value format. The efficiency of the proposed system was validated using existing datasets such as SROIE and VATI. Real-time data was validated using invoices collected by two companies that provide invoice automation services in various countries. Currently, these scanned invoices are sent to an OCR system such as OmniPage, Tesseract, or ABBYY FRE to extract text blocks, and a rule-based engine is later used to extract relevant data. While the system’s methodology is robust, the companies surveyed were not satisfied with its accuracy. Thus, they sought out new, optimized solutions. To confirm the results, the engines were used to return XML-based files with text and metadata identified. The output XML data was then fed into this new system for information extraction. This system uses the existing OCR engine and a novel, self-adaptive, learning-based OCR engine. This new engine is based on the GAN model for better text identification. Experiments were conducted on various invoice formats to further test and refine its extraction capabilities. For cost optimisation and the analysis of spend classification, additional data were provided by another company in London that holds expertise in reducing their clients' procurement costs. This data was fed into our system to get a deeper level of spend classification and categorisation.
This helped the company to reduce its reliance on human effort and allowed for greater efficiency compared with performing similar tasks manually using Excel sheets and Business Intelligence (BI) tools. The intention behind the development of this novel methodology was twofold: first, to test and develop a novel solution that does not depend on any specific OCR technology; second, to increase information extraction accuracy over that of existing methodologies. Finally, the thesis evaluates the real-world need for the system and the impact it would have on SCM. This newly developed method is generic and can extract text from any given invoice, making it a valuable tool for optimizing SCM. In addition, the system uses a template-matching approach to ensure the quality of the extracted information.

    Implementation of the main layers of a sequential convolutional neural network in reconfigurable logic

    This work presents the implementation of the main layers of a convolutional neural network: the convolution operation, the ReLU activation function, the pooling operator and the perceptron. In addition, a floating-point unit with a 24-bit format was implemented to perform addition, subtraction and multiplication. The modules were implemented in the ISE DESIGN SUITE IDE, version 14.6; once implemented, and as they became more robust, they were migrated to the VIVADO IDE, version 2020.2. A PC with a 64-bit, 8-core AMD Ryzen 7 processor running at 2.3 GHz was used to program the scripts that produce the reference values, which are compared with the results of the implemented modules to assess how well they work. One module performs addition and subtraction in the 24-bit floating-point format and another performs multiplication; division is possible if numbers are represented with a negative exponent.