    Recognition of compound characters in Kannada language

    Recognizing degraded printed compound Kannada characters is a challenging research problem. Experiments confirm that noise removal is an essential preprocessing step. Two methods are proposed for degraded Kannada character recognition. Method 1 uses conventional histogram of oriented gradients (HOG) feature extraction: the extracted features are transformed and reduced with principal component analysis (PCA) and then classified, with various classifiers compared. This method classifies simple compound characters satisfactorily (more than 98% accuracy) but does not perform well on the other two compound types. Method 2 is a deep convolutional neural network (CNN) classifier, which outperforms HOG features with conventional classification. Its highest classification accuracy is 98.8%, for simple compound characters, and it performs far better on the other two compound types. The deep CNN also turns out to be better for pooled character classes.

    Automatic Vehicle Detection and Identification using Visual Features

    In recent decades, the vehicle has become the most popular means of transportation in the world. High accuracy and a high success rate are key factors in automatic vehicle detection and identification. As the most important label on a vehicle, the license plate serves as its means of public identification. However, a plate can be stolen and affixed to a different vehicle by criminals to conceal their identities. Furthermore, in some cases two vehicles from different countries can carry the same plate number. In this thesis, we propose a new vehicle identification system that provides a high degree of accuracy and a high success rate. The proposed system consists of four stages: license plate detection, license plate recognition, license plate province detection and vehicle shape detection. The features are converted into local binary pattern (LBP) and histogram of oriented gradients (HOG) descriptors, which form the training dataset. To reach high accuracy in real-time application, a novel method is used to update the system. The system also stores vehicle features and information in a database, which allows the procedure to automatically detect any discrepancy between a license plate and the vehicle carrying it.
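As a rough illustration of the LBP descriptor named in the abstract above, the following sketch computes the basic 3x3 local binary pattern code and its histogram for a grayscale image given as a list of rows. The function names, the neighbour ordering and the `>=` threshold are assumptions made for illustration; the thesis does not state which LBP variant it uses.

```python
from typing import List

# Clockwise 3x3 neighbourhood starting at the top-left pixel (an assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: List[List[int]], r: int, c: int) -> int:
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: List[List[int]]) -> List[int]:
    """Count LBP codes over all interior pixels; the 256 bins form the feature vector."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Hypothetical 3x3 patch: only the centre pixel has a full neighbourhood.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
patch_code = lbp_code(patch, 1, 1)
patch_hist = lbp_histogram(patch)
```

A histogram like `patch_hist`, computed per image region, is the kind of fixed-length vector that could be passed to a classifier alongside HOG features.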

    Efficient Representation of the Outline of Arabic Words by Combining the Active Contour Model with Corner Point Detection

    Fitting graphical curves and surfaces is an active area of research and application, including artistic applications, analysis applications and encoding. Capturing the outline of digital word images is important in most desktop publishing systems. Character shapes are stored in computer memory in terms of their outlines, and the outlines are expressed as Bezier curves. Existing methods for describing Arabic font outlines suffer from low fitting accuracy and efficiency. In our research, we developed a new method for outlining shapes using Bezier curves with a minimal set of curve points. A distinguishing characteristic of our method is that it combines the active contour method (snake) with corner detection to obtain an initial set of points that is as close to the shape's boundaries as possible. The method links these points (snake + corner) into a compound Bezier curve and iteratively improves the fit of the curve to the actual boundaries of the shape. We implemented and tested our method using MATLAB. Test cases covered simple, moderate and high shape complexity, depending on factors such as boundary concavities, number of corners and interior openings. Results show that our method achieved an average accuracy of 86% when measured relative to the true shape boundary. When compared to other similar methods (Masood & Sarfraz, 2009; Sarfraz & Khan, 2002; Sohel, Karmakar, Dooley, & Bennamoun, 2010), our method performed comparatively well. Keywords: Bezier curves, shape descriptor, curvature, corner points, control points, Active Contour Model.
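To make the Bezier outline representation concrete, here is a minimal sketch of evaluating one Bezier segment with de Casteljau's algorithm (repeated linear interpolation between control points). It is not the authors' MATLAB implementation; the function names and the sample control points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def de_casteljau(control: List[Point], t: float) -> Point:
    """Evaluate a Bezier curve at parameter t in [0, 1]."""
    pts = list(control)
    # Each pass interpolates between neighbouring points, shrinking the list by one.
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def sample_curve(control: List[Point], n: int = 5) -> List[Point]:
    """Return n evenly spaced (in t) points along the curve, endpoints included."""
    return [de_casteljau(control, i / (n - 1)) for i in range(n)]


# Hypothetical cubic segment: two endpoints and two interior control points.
segment = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
midpoint = de_casteljau(segment, 0.5)
samples = sample_curve(segment, 5)
```

A compound outline would chain several such segments end to end, and a fitting loop could compare sampled points such as `samples` against the traced boundary to decide where the curve needs adjusting.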

    OTS: A One-shot Learning Approach for Text Spotting in Historical Manuscripts

    Historical manuscript processing poses challenges such as limited annotated training data and the emergence of novel classes. To address this, we propose a novel One-shot learning-based Text Spotting (OTS) approach that accurately and reliably spots novel characters with just one annotated support sample. Drawing inspiration from cognitive research, we introduce a spatial alignment module that finds, focuses on, and learns the most discriminative spatial regions in the query image based on one support image. In particular, since the low-resource spotting task often faces the problem of example imbalance, we propose a novel loss function called torus loss, which makes the embedding space of the distance metric more discriminative. Our approach is highly efficient and requires only a few training samples while exhibiting a remarkable ability to handle novel characters and symbols. To enhance dataset diversity, a new manuscript dataset containing ancient Dongba hieroglyphics (DBH) is created. We conduct experiments on the publicly available VML-HD, TKH and NC datasets and on the newly proposed DBH dataset. The experimental results demonstrate that OTS outperforms state-of-the-art methods in one-shot text spotting. Overall, our proposed method offers promising applications in text spotting in historical manuscripts.

    Arabic Manuscripts Analysis and Retrieval

    Text search engine for digitized historical book

    There is a need to digitize numerous historical books and texts and to make them readable electronically. Often it is also desirable to preserve their original appearance, not just the text itself. These operations require systems that understand the books and texts as they are and can distinguish the text from the rest of the page content. Traditional optical character recognition systems perform well on modern printed text, but they may have problems with old handwritten texts. Such texts need to be analyzed with systems that can reliably segment the text areas from other, irrelevant information. That is why it is important that document image segmentation works well. This thesis focuses on manual rectification, automatic segmentation and text line search on document images in the Orationes project. Once the document images are segmented and the text lines found, information from the XML transcript is used to find characters and words in the segmented document images. The search engine was developed in the Python programming language, which was chosen to ensure high platform independence.

    Graph-based word spotting by inexact matching techniques

    In this project, a new method for word spotting (locating words) has been developed that takes into account the structure of the words being searched for. Word spotting techniques find handwritten words from a given example. The technique presented here is intended for use on old documents. An indexing scheme is then presented to speed up the search step: it quickly finds a set of candidates in large document collections to which the word spotting techniques are then applied. Finally, an example application of the developed techniques is shown for Android devices.