3 research outputs found

    Enhanced Characterness for Text Detection in the Wild

    Text spotting is an interesting research problem because text may appear anywhere in a scene and in many forms. Moreover, the ability to detect text opens the way to progress on many advanced computer vision problems. In this paper, we propose a novel language-agnostic text detection method for natural scenes that uses edge-enhanced Maximally Stable Extremal Regions (MSER) and defines strong characterness measures. We show that a simple combination of characterness cues helps to reject non-text regions. The remaining regions are then refined by rejecting non-textual neighbor regions. A comprehensive evaluation of the proposed scheme shows that its generalization performance is comparable to or better than that of traditional methods for this task.

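    Selecting the Smoothing Parameter of a Probabilistic Neural Network Using Particle Swarm Optimization for Text Detection in Images (original title: Pemilihan Parameter Smoothing pada Probabilistic Neural Network dengan Menggunakan Particle Swarm Optimization untuk Pendeteksian Teks pada Citra)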

    Text is found in many places, such as street names, shop names, banners, road signs, warnings, and so on. Text detection falls into three approaches: the texture-based approach, the edge-based approach, and the Connected Component approach. The texture-based approach detects text well but requires a large amount of training data. A Probabilistic Neural Network (PNN) can overcome this problem. However, PNN has a difficulty of its own: the value of its smoothing parameter is usually chosen by trial and error. Particle Swarm Optimization (PSO) is an optimization algorithm that can address this difficulty. In this study, PNN is used within the texture-based approach to address that approach's main problem, namely the amount of training data it requires. In addition, PSO is used to determine the PNN smoothing parameter so that PNN-PSO achieves better accuracy than traditional PNN. Experimental results show that PNN detects text with an accuracy of 75.42% using only 300 training samples, and 77.75% using 1500 training samples. PNN-PSO achieves an accuracy of 76.91% with 300 training samples and 77.89% with 1500 training samples. It can therefore be concluded that PNN detects text well even with little training data and can overcome the problem of the texture-based approach. PSO, in turn, can determine the value of the PNN smoothing parameter and yields better accuracy than traditional PNN, with an accuracy improvement of about 0.1% to 1.5%. Furthermore, using PSO with PNN allows the smoothing parameter to be determined automatically on different datasets.
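The idea of tuning the PNN smoothing parameter with PSO can be illustrated with a small sketch. This is not the authors' implementation: the two-dimensional toy features, the Gaussian kernel form, the search bounds, and the PSO coefficients (`w`, `c1`, `c2`) are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import math
import random


def pnn_predict(x, train, sigma):
    """PNN decision: average a Gaussian kernel per class, pick the largest."""
    scores, counts = {}, {}
    for features, label in train:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, features))
        kernel = math.exp(-dist2 / (2.0 * sigma ** 2))
        scores[label] = scores.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    return max(scores, key=lambda c: scores[c] / counts[c])


def accuracy(sigma, train, valid):
    """Fraction of validation samples classified correctly for a given sigma."""
    hits = sum(pnn_predict(x, train, sigma) == y for x, y in valid)
    return hits / len(valid)


def pso_sigma(train, valid, n_particles=8, n_iters=15,
              bounds=(0.05, 5.0), seed=0):
    """One-dimensional PSO over sigma, maximising validation accuracy."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                                   # personal bests
    best_fit = [accuracy(p, train, valid) for p in pos]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g], best_fit[g]             # global best
    w, c1, c2 = 0.7, 1.5, 1.5                           # assumed coefficients
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # clamp to bounds
            fit = accuracy(pos[i], train, valid)
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], fit
            if fit > g_fit:
                g_pos, g_fit = pos[i], fit
    return g_pos, g_fit


def make_blobs(n, seed):
    """Toy stand-in for texture features: two Gaussian clusters (assumption)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        data.append(((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)), "non-text"))
        data.append(((rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)), "text"))
    return data


train = make_blobs(30, seed=1)
valid = make_blobs(30, seed=2)
sigma, acc = pso_sigma(train, valid)
print(f"selected sigma={sigma:.3f}, validation accuracy={acc:.3f}")
```

Here the PSO fitness is simply validation accuracy, so the swarm replaces the manual trial-and-error search over the smoothing parameter. A real system would compute texture features from image windows instead of sampling toy clusters.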

    Text-detection and -recognition from natural images

    Text detection and recognition from images could have numerous practical applications in document analysis, such as assistance for visually impaired people; recognition of vehicle license plates; evaluation of articles containing tables, street signs, maps, and diagrams; keyword-based image exploration; document retrieval; recognition of parts within industrial automation; content-based extraction; object recognition; address block location; and text-based video indexing. This research exploited the advantages of artificial intelligence (AI) to detect and recognise text in natural images. Machine learning and deep learning were used to accomplish this task. In this research, we conducted an in-depth literature review of the detection and recognition methods currently used by researchers in order to identify the existing challenges: differences in text alignment, style, size, and orientation, combined with low image contrast and complex backgrounds, make automatic text extraction a considerably challenging task. As a result, state-of-the-art approaches obtain low detection rates (often less than 80%) and recognition rates (often less than 60%). This has led to the development of new approaches. The aim of the study was to develop a robust method for text detection and recognition in natural images with high accuracy and recall, which would be used as the target of the experiments. This method could detect all the text in the scene images, despite certain specific features associated with the text pattern.
Furthermore, we aimed to find a solution to the two main problems of detecting and recognising arbitrarily shaped text (horizontal, multi-oriented, and curved) in low-resolution scenes and at various scales and sizes. In this research, we propose a methodology that handles text detection by using a novel combination and selection of features for the algorithms that classify text and non-text regions. The text-region candidates were extracted from the grey-scale images by using the MSER technique. A machine learning-based method was then applied to refine and validate the initial detection. The effectiveness of features based on the aspect ratio and on GLCM, LBP, and HOG descriptors was investigated. MLP, SVM, and RF text-region classifiers were trained using selections of these features and their combinations. The publicly available datasets ICDAR 2003 and ICDAR 2011 were used to evaluate the proposed method. This method achieved state-of-the-art performance among machine learning methodologies on both datasets, and the improvements were significant in terms of Precision, Recall, and F-measure. The F-measure for ICDAR 2003 and ICDAR 2011 was 81% and 84%, respectively. The results showed that a suitable feature combination and selection approach could significantly increase the accuracy of the algorithms. A new dataset has been proposed to fill the gap in character-level annotation and in the availability of text in different orientations and of curved text. The proposed dataset was created particularly for deep learning methods, which require a large, complete, and varied range of training data. The proposed dataset includes 2,100 images annotated at the character and word levels, yielding 38,500 samples of English characters and 12,500 words. Furthermore, an augmentation tool has been proposed to support the proposed dataset.
The lack of an augmentation tool for object detection motivated the proposed tool, which can update the positions of bounding boxes after transformations are applied to images. This technique helps to increase the number of samples in the dataset and reduces annotation time, because the augmented samples need no manual annotation. The final part of the thesis presents a novel approach for text spotting: a new framework for an end-to-end character detection and recognition system built on an improved SSD convolutional neural network, in which layers are added to the SSD network and the aspect ratio of characters is taken into account because it differs from that of other objects. Compared with the other methods considered, the proposed method could detect and recognise characters by training the end-to-end model completely. The proposed method performed better on the proposed dataset, where it reached 90.34. Furthermore, the method's F-measure on ICDAR 2015, ICDAR 2013, and SVT was 84.5, 91.9, and 54.8, respectively. On ICDAR 2013, the method achieved the second-best accuracy. The proposed method could spot arbitrarily shaped (horizontal, oriented, and curved) scene text.
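The bounding-box update that such an augmentation tool performs can be sketched for two simple transformations. This is an illustrative assumption rather than the thesis's tool: boxes are taken to be axis-aligned `(x_min, y_min, x_max, y_max)` tuples in pixel coordinates with the origin at the top-left corner, and the function names are hypothetical.

```python
def hflip_box(box, img_w):
    """Mirror a box when the image is flipped left to right."""
    x_min, y_min, x_max, y_max = box
    return (img_w - x_max, y_min, img_w - x_min, y_max)


def rotate90_cw_box(box, img_h):
    """Move a box with the image when it is rotated 90 degrees clockwise.

    img_h is the height of the image before rotation; a point (x, y)
    lands at (img_h - y, x) in the rotated image.
    """
    x_min, y_min, x_max, y_max = box
    return (img_h - y_max, x_min, img_h - y_min, x_max)


# Example: a 100 x 50 image (width x height) with one annotated character box.
box = (10, 5, 30, 20)
flipped = hflip_box(box, img_w=100)
rotated = rotate90_cw_box(box, img_h=50)
print(flipped, rotated)
```

Because the new coordinates follow directly from the transformation, each augmented image inherits its annotations without manual labelling, which is the time saving described above.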