15 research outputs found

    An Approach to Extract Features from Document Image for Character Recognition

    Get PDF
    In this paper we present a technique to extract features from a document image that can be used by machine learning algorithms to recognize characters. The proposed method takes a scanned image of a handwritten character from a paper document as input and processes it through several stages to extract effective features. The character object in the binarized image is segmented from the background and resized to a global resolution. A morphological thinning operation is applied to the resized object, which is then scanned in search of features. The feature values are estimated by counting the frequency of occurrence of predefined shapes in the character object, and these frequencies are stored in a vector, each element of which serves as a single feature value, or attribute, of the corresponding image. The feature vectors of individual character objects can then be used to train a suitable machine learning algorithm to classify a test object. In this paper, a k-nearest neighbor classifier is used in simulation to assign handwritten characters to their recognized classes. The proposed technique is fast to compute, has low complexity, and improves classifier performance in matching handwritten characters to their machine-readable form.
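
    For readers who want a concrete picture of the pipeline, the following is a minimal Python sketch of the stages described above (binarization, segmentation, resizing, thinning, and shape-frequency counting) feeding a k-nearest-neighbor classifier. It assumes scikit-image and scikit-learn; the paper's specific predefined shapes are not given here, so counts of skeleton pixels grouped by neighbour count stand in as illustrative features, and the target resolution is an arbitrary choice.

    import numpy as np
    from skimage import io, img_as_float
    from skimage.transform import resize
    from skimage.morphology import skeletonize
    from sklearn.neighbors import KNeighborsClassifier

    def extract_features(path, size=(32, 32)):
        gray = img_as_float(io.imread(path, as_gray=True))
        binary = gray < 0.5                                          # dark ink on a light page
        ys, xs = np.nonzero(binary)
        obj = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]   # segment the object
        obj = resize(obj.astype(float), size) > 0.5                  # resize to a global resolution
        thin = skeletonize(obj)                                      # morphological thinning
        # Scan the thinned object and count simple local shapes; here skeleton
        # pixels are grouped by their number of 8-neighbours as a stand-in for
        # the paper's predefined shapes.
        counts = np.zeros(9)
        padded = np.pad(thin, 1)
        for y, x in zip(*np.nonzero(thin)):
            counts[int(padded[y:y + 3, x:x + 3].sum() - 1)] += 1
        return counts                                                # feature vector

    # Classification with k-nearest neighbours, as in the paper; X_train, y_train
    # and X_test are hypothetical arrays of feature vectors and labels.
    # knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
    # predicted_classes = knn.predict(X_test)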

    An Improved Adaptive Filtering Technique to Remove High Density Salt-and-Pepper Noise using Multiple Last Processed Pixels

    Get PDF
    This paper presents an efficient algorithm that can remove high-density salt-and-pepper noise from a corrupted digital image. The technique differentiates between corrupted and uncorrupted pixels and performs the filtering process only on the corrupted ones. The proposed algorithm calculates the median only among the noise-free neighbors in the processing window and replaces the corrupted centre pixel with that median value. Adaptive behavior is achieved by expanding the processing window based on the neighboring noise-free pixels. In the case of high-density noise corruption, where no noise-free neighbor is found within the maximum window size, the algorithm takes the last processed pixels into account. While most existing filtering techniques use only a single last processed pixel after reaching the maximum window, the proposed algorithm considers multiple last processed pixels so that a more accurate decision can be made when replacing the corrupted pixel.
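
    The decision logic described above can be illustrated with a small NumPy sketch; this is not the authors' implementation. Noise is assumed to take the extreme values 0 and 255, and the maximum window size and the number of retained last processed pixels are illustrative parameters.

    import numpy as np

    def adaptive_sap_filter(img, max_window=7, history=4):
        out = img.astype(np.float64).copy()
        noisy = (img == 0) | (img == 255)          # only these pixels are filtered
        last = []                                  # multiple last processed pixels
        h, w = img.shape
        for y in range(h):
            for x in range(w):
                if not noisy[y, x]:
                    last = (last + [out[y, x]])[-history:]
                    continue
                replaced = False
                for half in range(1, max_window // 2 + 1):        # expand the window
                    win = img[max(0, y - half):y + half + 1,
                              max(0, x - half):x + half + 1]
                    clean = (win != 0) & (win != 255)             # noise-free neighbours
                    if clean.any():
                        out[y, x] = np.median(win[clean])
                        replaced = True
                        break
                if not replaced and last:
                    # No clean neighbour within the largest window: use the median
                    # of several last processed pixels rather than a single one.
                    out[y, x] = np.median(last)
                last = (last + [out[y, x]])[-history:]
        return out.astype(img.dtype)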

    Book Cover Synthesis from the Summary

    Full text link
    The cover is the face of a book and a point of attraction for readers. Designing book covers is an essential task in the publishing industry. One of the main challenges in creating a book cover is representing the theme of the book's content in a single image. In this research, we explore ways to produce a book cover using artificial intelligence, based on the premise that a relationship exists between a book's summary and its cover. Our key motivation is the application of text-to-image synthesis methods to generate images from given text or captions. We explore several existing text-to-image conversion techniques for this purpose and propose an approach that exploits these frameworks to produce book covers from provided summaries. We construct a dataset of English books containing a large number of summaries of existing books paired with their cover images. In this paper, we describe our approach to collecting, organizing, and pre-processing the dataset for training models. We apply different text-to-image synthesis techniques to generate book covers from the summaries and present the results. Comment: Accepted as a full paper at AICCSA 2022 (19th ACS/IEEE International Conference on Computer Systems and Applications).
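
    The dataset-construction step can be pictured with a small, hypothetical Python sketch; the CSV layout, column names, and target resolution below are assumptions for illustration, not details taken from the paper.

    import csv
    from pathlib import Path
    from PIL import Image

    def load_pairs(csv_path, cover_dir, size=(256, 256)):
        """Yield (summary_text, cover_image) pairs for a text-to-image model."""
        with open(csv_path, newline="", encoding="utf-8") as f:
            # Columns assumed: title, summary, cover_file (hypothetical layout).
            for row in csv.DictReader(f):
                summary = " ".join(row["summary"].split())        # normalise whitespace
                cover = Image.open(Path(cover_dir) / row["cover_file"]).convert("RGB")
                yield summary, cover.resize(size)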

    Real Time Bangladeshi Sign Language Detection using Faster R-CNN

    Full text link
    Bangladeshi Sign Language (BdSL) is a commonly used medium of communication for hearing-impaired people in Bangladesh. Developing a real-time system to detect these signs from images is a great challenge. In this paper, we present a technique to detect BdSL from images in real time. Our method uses a Convolutional Neural Network based object detection technique to detect the presence of signs in an image region and to recognize their class. For this purpose, we adopt the Faster Region-based Convolutional Network (Faster R-CNN) approach and develop a dataset, BdSLImset, to train our system. Previous research on detecting BdSL generally depends on external devices, while most other vision-based techniques do not perform efficiently in real time. Our approach is free from such limitations, and the experimental results demonstrate that the proposed method successfully identifies and recognizes Bangladeshi signs in real time. Comment: 6 pages, accepted at the International Conference on Innovation in Engineering and Technology (ICIET), 27-29 December 2018, Dhaka, Bangladesh.
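
    A common way to realize this kind of setup, sketched below under assumptions about the class count and data wiring, is to fine-tune torchvision's pre-trained Faster R-CNN by swapping its box-predictor head; the loading of BdSLImset is only indicated in comments and is not the authors' code.

    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    NUM_CLASSES = 1 + 10   # background + an assumed number of BdSL sign classes

    # Start from a detector pre-trained on COCO and replace its box-predictor head.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

    # Training follows the usual torchvision detection loop: each image is a tensor
    # and each target a dict with "boxes" and "labels" drawn from BdSLImset.
    # losses = model(images, targets)       # dict of losses in training mode
    # detections = model(images)            # boxes/labels/scores in eval mode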

    Shapes2Toon: Generating Cartoon Characters from Simple Geometric Shapes

    Full text link
    Cartoons are an important part of our entertainment culture. Though drawing a cartoon is not for everyone, building a character from an arrangement of basic geometric primitives that approximates it is a fairly common technique in art. The key motivation behind this technique is that human bodies, as well as cartoon figures, can be broken down into various basic geometric primitives. Numerous tutorials demonstrate how to draw figures from an appropriate arrangement of fundamental shapes, assisting us in creating cartoon characters; the technique is also very helpful for teaching children how to draw cartoons. In this paper, we develop a tool, shape2toon, that aims to automate this approach by utilizing a generative adversarial network that takes an arrangement of geometric primitives (e.g., circles) and generates a cartoon figure (e.g., Mickey Mouse) based on the given approximation. For this purpose, we created a dataset of geometrically represented cartoon characters. We apply an image-to-image translation technique to our dataset and report the results in this paper. The experimental results show that our system can generate cartoon characters from an input layout of geometric shapes. In addition, we demonstrate a web-based tool as a practical application of our work. Comment: Accepted as a full paper at AICCSA 2022 (19th ACS/IEEE International Conference on Computer Systems and Applications).
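
    As a rough illustration of the image-to-image translation recipe mentioned above, the following sketches a pix2pix-style training step conditioned on the shape layout; the generator G, discriminator D, optimizers, and the L1 weighting are generic assumptions rather than the shape2toon configuration.

    import torch
    import torch.nn.functional as F

    def training_step(G, D, opt_g, opt_d, layout, cartoon, l1_weight=100.0):
        fake = G(layout)                                    # shape layout -> cartoon guess

        # Discriminator: real (layout, cartoon) pairs vs. generated pairs.
        opt_d.zero_grad()
        d_real = D(torch.cat([layout, cartoon], dim=1))
        d_fake = D(torch.cat([layout, fake.detach()], dim=1))
        d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
                  F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
        d_loss.backward()
        opt_d.step()

        # Generator: fool the discriminator while staying close to the target cartoon.
        opt_g.zero_grad()
        d_fake = D(torch.cat([layout, fake], dim=1))
        g_loss = (F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake)) +
                  l1_weight * F.l1_loss(fake, cartoon))
        g_loss.backward()
        opt_g.step()
        return d_loss.item(), g_loss.item()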

    Icosahedral Maps for a Multiresolution Representation of Earth Data

    No full text
    The icosahedral non-hydrostatic (ICON) model is a climate model based on an icosahedral representation of the Earth and is used for numerical weather prediction. In this thesis, we investigate the unstructured representation of different cells in ICON and undertake the task of designing a technique that converts it into a common structured representation. We introduce icosahedral maps, data structures designed to fit the geometry of cells in the ICON model irrespective of their type. These maps capture the connectivity information in ICON in a highly structured two-dimensional hexagonal representation that provides explicit neighborhood information. Our maps facilitate the execution of a multiresolution analysis on the ICON model. We demonstrate this by applying a hexagonal version of the discrete wavelet transform, in conjunction with our icosahedral maps, to decompose ICON data into different levels of detail and to compress it via thresholding of the wavelet coefficients.
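
    The thresholding-based compression can be illustrated generically with PyWavelets on a regular 2-D grid, as sketched below; the thesis itself applies a hexagonal wavelet transform over the icosahedral maps, for which no off-the-shelf routine exists, so this is only an analogy with assumed wavelet, level, and threshold settings.

    import numpy as np
    import pywt

    def compress(field, wavelet="haar", levels=3, threshold=0.1):
        coeffs = pywt.wavedec2(field, wavelet, level=levels)      # multiresolution decomposition
        approx, details = coeffs[0], coeffs[1:]
        # Zero out small detail coefficients; a larger threshold compresses more.
        kept = [tuple(pywt.threshold(d, threshold, mode="hard") for d in level)
                for level in details]
        return pywt.waverec2([approx] + kept, wavelet)            # reconstructed field

    # Synthetic data standing in for one ICON field on a regular grid.
    reconstructed = compress(np.random.rand(128, 128))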

    toon2real: Translating Cartoon Images to Realistic Images

    Full text link
    In image-to-image translation, Generative Adversarial Networks (GANs) have achieved great success, even when used with unsupervised datasets. In this work, we aim to translate cartoon images to photo-realistic images using a GAN. We apply several state-of-the-art models to perform this task; however, they fail to produce good-quality translations. We observe that the shallow difference between these two domains causes this issue. Based on this observation, we propose a method based on the CycleGAN model for image translation from the cartoon domain to the photo-realistic domain. To make our model efficient, we implement spectral normalization, which adds stability to our model. We present our experimental results and show that our proposed model achieves the lowest Fréchet Inception Distance score and better results than another state-of-the-art technique, UNIT. Comment: Accepted as a short paper at ICTAI 2020.
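
    Spectral normalization of the kind mentioned above is typically attached layer by layer to the discriminator; the sketch below shows one way to do this for a PatchGAN-style discriminator in PyTorch, with generic channel sizes that are not taken from the toon2real architecture.

    import torch.nn as nn
    from torch.nn.utils import spectral_norm

    def discriminator(in_channels=3, base=64):
        def block(c_in, c_out, norm=True):
            layers = [spectral_norm(nn.Conv2d(c_in, c_out, 4, stride=2, padding=1))]
            if norm:
                layers.append(nn.InstanceNorm2d(c_out))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        return nn.Sequential(
            *block(in_channels, base, norm=False),
            *block(base, base * 2),
            *block(base * 2, base * 4),
            spectral_norm(nn.Conv2d(base * 4, 1, 4, padding=1)),  # patch-wise real/fake scores
        )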