318 research outputs found

    A review of Arabic text recognition dataset

    Building a robust Optical Character Recognition (OCR) system for languages with cursive scripts, such as Arabic, has always been challenging. These challenges increase if the text contains diacritics of different sizes for characters and words. Apart from the complexity of the font used, these challenges must be addressed in recognizing the text of the Holy Quran. To solve these challenges, the OCR system would have to undergo different phases. Each problem would have to be addressed using different approaches; thus, researchers are studying these challenges and proposing various solutions. This has motivated this study to review Arabic OCR datasets, because the dataset plays a major role in determining the nature of the OCR system. State-of-the-art approaches in segmentation and recognition are found to implement Recurrent Neural Networks (Long Short-Term Memory, LSTM, and Gated Recurrent Unit, GRU) together with Connectionist Temporal Classification (CTC). This also includes deep learning models and the implementation of GRU in the Arabic domain. This paper contributes by profiling Arabic text recognition datasets, thereby determining the nature of the OCR systems developed, and identifies research directions in building Arabic text recognition datasets.

    Subword Recognition in Historical Arabic Documents using C-GRUs

    Recent years have witnessed an increased tendency to digitize historical manuscripts, which not only ensures the preservation of these collections but also gives researchers and end-users direct access to these images. Recognition of Arabic handwriting is challenging due to the highly cursive nature of the script and other challenges associated with historical documents (degradation etc.). This paper presents an end-to-end system to recognize Arabic handwritten subwords in historical documents. More specifically, we introduce a hybrid CNN-GRU model where the shallow convolutional network learns robust feature representations while the GRU layers carry out the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two different datasets, IBN SINA and VML-HD, reporting recognition rates of 96.10% and 98.60% respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of our proposed model in characterizing Arabic subwords.

    Enhancing Automation with Label Defect Detection and Content Parsing Algorithms

    The stable operation of power transmission and distribution is closely related to the overall performance and construction quality of circuit breakers. Focusing on circuit breakers as the research subject, we propose a machine vision method for automated defect detection, which can be applied in intelligent robots to improve detection efficiency, reduce costs, and address the issues related to performance and assembly quality. Based on the LeNet-5 convolutional neural network, a method for the detection of character defects on labels is proposed. This method is then combined with squeeze-and-excitation networks to achieve more precise classification with a feature graph mechanism. The experimental results show the accuracy of the LeNet-CB model can reach up to 99.75%, while the average time for single character detection is 17.9 milliseconds. Although the LeNet-SE model demonstrates certain limitations in handling some easily confused characters, it maintains an average accuracy of 98.95%. Through further optimization, a label content detection method based on the LSTM framework is constructed, with an average accuracy of 99.57% and an average detection time of 84 milliseconds. Overall, the system meets the detection accuracy requirements and delivers a rapid response, making the results of this research a meaningful contribution to the practical foundation for ongoing improvements in robot intelligence and machine vision.

    Deep GRU-CNN model for COVID-19 detection from chest X-rays data

    In the current era, big data is growing exponentially due to advancements in smart devices. Data scientists apply varied learning-based techniques to identify the underlying patterns in big medical data to address various health-related issues. In this context, automated disease detection has now become a central concern in medical science due to rapid population growth. It reduces the mortality rate by diagnosing the disease correctly and early enough. The novel virus disease COVID-19 has spread all over the world and is affecting millions of people. Many countries are facing a shortage of test kits, vaccines, and other resources due to substantial growth in COVID-19 cases. In order to accelerate the testing process, scientists around the world have sought to create novel alternative methods for the detection of the deadly virus. In this paper, we propose a hybrid deep learning model based on a convolutional neural network (CNN) and gated recurrent unit (GRU) for diagnosing the virus from chest X-rays (CXRs). In the proposed model, the CNN is used to extract features, and the GRU is used as a classifier. The model has been trained on 424 CXR images with 3 classes (COVID-19, Pneumonia, and Normal). The proposed model achieved encouraging results of 0.96, 0.96, and 0.95 in terms of precision, recall, and F1-score, respectively. These findings indicate how deep learning can significantly contribute to the early detection of COVID-19 patients using X-ray scans. Such indications can pave the way to mitigating the deadly disease. We believe that this model can be an effective tool for medical practitioners for the early diagnosis of coronavirus from CXRs.

    Broadcasting Convolutional Network for Visual Relational Reasoning

    In this paper, we propose the Broadcasting Convolutional Network (BCN) that extracts key object features from the global field of an entire input image and recognizes their relationship with local features. BCN is a simple network module that collects effective spatial features, embeds location information and broadcasts them to the entire feature maps. We further introduce the Multi-Relational Network (multiRN) that improves the existing Relation Network (RN) by utilizing the BCN module. In pixel-based relation reasoning problems, with the help of BCN, multiRN extends the concept of `pairwise relations' in conventional RNs to `multiwise relations' by relating each object with multiple objects at once. This yields O(n) complexity for n objects, which is a vast computational gain over RNs, which take O(n^2). Through experiments, multiRN has achieved state-of-the-art performance on the CLEVR dataset, which demonstrates the usability of BCN on relation reasoning problems. Comment: Accepted paper at ECCV 2018. 24 pages
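    The O(n) versus O(n^2) claim above can be illustrated with a small counting sketch. This is not the authors' implementation: the function names are hypothetical, and the counts assume that a pairwise RN scores every unordered pair of objects while a multiwise scheme scores each object once against a shared global summary.

```python
from itertools import combinations


def pairwise_relation_count(n: int) -> int:
    """Count relations when every unordered pair of n objects is scored.

    Assumption for illustration: one relation per pair, so the count
    grows as n * (n - 1) / 2, i.e. quadratically in n.
    """
    return sum(1 for _ in combinations(range(n), 2))


def multiwise_relation_count(n: int) -> int:
    """Count relations when each object is related once to a shared summary.

    Assumption for illustration: one relation per object, so the count
    grows linearly in n.
    """
    return n


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, pairwise_relation_count(n), multiwise_relation_count(n))
```

    Under these assumptions the gap between the two counts widens as the number of objects grows, which is the computational gain the abstract refers to.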