An Approach to Extract Features from Document Image for Character Recognition
In this paper, we present a technique to extract features from a document image that can be used by machine learning algorithms to recognize characters. The proposed method takes a scanned image of a handwritten character from a paper document as input and processes it through several stages to extract effective features. The object in the converted binary image is segmented from the background and resized to a global resolution. A morphological thinning operation is applied to the resized object, and the technique then scans the object to search for features. In this approach, feature values are estimated by counting the frequency of occurrence of certain predefined shapes in a character object. These frequencies are stored in a vector, and every element of that vector is treated as a single feature value, or attribute, of the corresponding image. The feature vectors for individual character objects can then be used to train a suitable machine learning algorithm to classify a test object. In this paper, the k-nearest neighbor classifier is used in simulation to assign each handwritten character to a recognized character class. The proposed technique takes less time to compute, has lower complexity, and improves the performance of classifiers in matching handwritten characters with their machine-readable form.
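The classification step described above can be sketched as a minimal k-nearest-neighbor routine. The shape-frequency vectors and labels below are hypothetical illustrations, not values from the paper's dataset:

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify a query feature vector by majority vote among its
    k nearest training vectors (Euclidean distance).

    `train` is a list of (feature_vector, label) pairs, where each
    feature vector holds shape-frequency counts as in the paper."""
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy shape-frequency vectors (hypothetical) for two character classes.
train = [
    ([4, 0, 1], "A"), ([5, 1, 1], "A"),
    ([0, 3, 2], "B"), ([1, 4, 2], "B"),
]
print(knn_classify(train, [4, 1, 1]))  # -> A
```

In practice each vector would hold one frequency per predefined shape, and k would be tuned on held-out samples.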
An Improved Adaptive Filtering Technique to Remove High Density Salt-and-Pepper Noise using Multiple Last Processed Pixels
This paper presents an efficient algorithm that can remove high-density salt-and-pepper noise from a corrupted digital image. The technique differentiates between corrupted and uncorrupted pixels and performs the filtering process only on the corrupted ones. The proposed algorithm calculates the median only among the noise-free neighbors in the processing window and replaces the corrupted center pixel with that median value. Adaptive behavior is enabled by expanding the processing window based on the noise-free pixels in the neighborhood. In cases of high-density noise corruption, where no noise-free neighbor is found within the maximum window size, the algorithm takes the last processed pixels into account. While most existing filtering techniques use only a single last processed pixel after reaching the maximum window, the proposed algorithm considers multiple last processed pixels so that a more accurate decision can be made when replacing the corrupted pixel.
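The filtering procedure can be sketched as follows. The maximum window size, the 0/255 noise test, and the length of the processed-pixel history are illustrative assumptions, not the authors' exact parameters:

```python
def adaptive_median(img, max_win=7, history=3):
    """Minimal sketch of the described filter (not the authors' code).

    Pixels with value 0 or 255 are treated as salt-and-pepper noise.
    For each corrupted pixel the window grows until it contains
    noise-free neighbours, whose median replaces the pixel.  If none
    are found at max_win, the median of the last `history` processed
    pixels is used instead of a single last pixel."""
    def is_noisy(v):
        return v in (0, 255)

    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    processed = []                      # recently output pixel values
    for y in range(h):
        for x in range(w):
            if not is_noisy(img[y][x]):
                processed.append(out[y][x])
                continue
            value = None
            for r in range(1, max_win // 2 + 1):   # expand the window
                win = [img[j][i]
                       for j in range(max(0, y - r), min(h, y + r + 1))
                       for i in range(max(0, x - r), min(w, x + r + 1))
                       if not is_noisy(img[j][i])]
                if win:
                    value = sorted(win)[len(win) // 2]
                    break
            if value is None:           # fall back on last processed pixels
                last = processed[-history:] or [128]
                value = sorted(last)[len(last) // 2]
            out[y][x] = value
            processed.append(value)
    return out

img = [[10, 20, 30], [40, 255, 60], [70, 80, 90]]
print(adaptive_median(img)[1][1])  # -> 60, median of the noise-free neighbours
```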
Book Cover Synthesis from the Summary
The cover is the face of a book and is a point of attraction for the readers.
Designing book covers is an essential task in the publishing industry. One of
the main challenges in creating a book cover is representing the theme of the
book's content in a single image. In this research, we explore ways to produce
a book cover using artificial intelligence based on the fact that there exists
a relationship between the summary of the book and its cover. Our key
motivation is the application of text-to-image synthesis methods to generate
images from given text or captions. We explore several existing text-to-image
conversion techniques for this purpose and propose an approach to exploit these
frameworks for producing book covers from provided summaries. We construct a
dataset of English books that contains a large number of samples of summaries
of existing books and their cover images. In this paper, we describe our
approach to collecting, organizing, and pre-processing the dataset to use it
for training models. We apply different text-to-image synthesis techniques to
generate book covers from the summary and exhibit the results in this paper.
Comment: Accepted as a full paper in AICCSA2022 (19th ACS/IEEE International Conference on Computer Systems and Applications).
Real Time Bangladeshi Sign Language Detection using Faster R-CNN
Bangladeshi Sign Language (BdSL) is a commonly used medium of communication
for the hearing-impaired people in Bangladesh. Developing a real time system to
detect these signs from images is a great challenge. In this paper, we present
a technique to detect BdSL from images that performs in real time. Our method
uses Convolutional Neural Network based object detection technique to detect
the presence of signs in the image region and to recognize its class. For this
purpose, we adopted Faster Region-based Convolutional Network approach and
developed a dataset BdSLImset to train our system. Previous research
works in detecting BdSL generally depend on external devices while most of the
other vision-based techniques do not perform efficiently in real time. Our
approach, however, is free from such limitations and the experimental results
demonstrate that the proposed method successfully identifies and recognizes
Bangladeshi signs in real time.
Comment: 6 pages, Accepted in International Conference on Innovation in Engineering and Technology (ICIET), 27-29 December 2018, Dhaka, Bangladesh.
Shapes2Toon: Generating Cartoon Characters from Simple Geometric Shapes
Cartoons are an important part of our entertainment culture. Though drawing a
cartoon is not for everyone, creating it using an arrangement of basic
geometric primitives that approximates that character is a fairly frequent
technique in art. The key motivation behind this technique is that human bodies
- as well as cartoon figures - can be broken down into various basic geometric
primitives. Numerous tutorials are available that demonstrate how to draw
figures using an appropriate arrangement of fundamental shapes, thus assisting
us in creating cartoon characters. This technique is very beneficial for
children in terms of teaching them how to draw cartoons. In this paper, we
develop a tool - shape2toon - that aims to automate this approach by utilizing
a generative adversarial network that combines geometric primitives (e.g.,
circles) and generates a cartoon figure (e.g., Mickey Mouse) based on the
given approximation. For this purpose, we created a dataset of geometrically
represented cartoon characters. We apply an image-to-image translation
technique on our dataset and report the results in this paper. The experimental
results show that our system can generate cartoon characters from input layout
of geometric shapes. In addition, we demonstrate a web-based tool as a
practical implication of our work.
Comment: Accepted as a full paper in AICCSA2022 (19th ACS/IEEE International Conference on Computer Systems and Applications).
Icosahedral Maps for a Multiresolution Representation of Earth Data
The icosahedral non-hydrostatic (ICON) model is a climate model based on an icosahedral representation of the Earth and is used for numerical weather prediction. In this thesis, we investigate the unstructured representation of different cells in ICON and undertake the task of designing a technique that converts it to a common structured representation. We introduce icosahedral maps, data structures that are designed to fit the geometry of cells in the ICON model irrespective of their types. These maps represent the connectivity information in ICON in a highly structured two-dimensional hexagonal representation that provides explicit neighborhood information. Our maps facilitate the execution of a multiresolution analysis on the ICON model. We demonstrate this by applying a hexagonal version of the discrete wavelet transform in conjunction with our icosahedral maps to decompose ICON data to different levels of detail and to compress it via a thresholding of the wavelet coefficients.
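The thresholding-based compression described above can be illustrated with a one-dimensional Haar transform. The thesis itself uses a hexagonal DWT over icosahedral maps; this stdlib-only sketch shows only the coefficient-thresholding idea:

```python
def haar_step(signal):
    """One level of the 1-D Haar DWT: pairwise averages (low-pass)
    and pairwise differences (high-pass wavelet coefficients)."""
    avg = [(a + b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    det = [(a - b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    return avg, det

def compress(signal, threshold):
    """Decompose, zero out small detail coefficients, reconstruct.
    Smooth regions lose nothing visible; sharp features are kept."""
    avg, det = haar_step(signal)
    det = [d if abs(d) > threshold else 0.0 for d in det]
    out = []
    for a, d in zip(avg, det):
        out += [a + d, a - d]          # inverse Haar step
    return out

data = [10.0, 10.2, 50.0, 20.0]
print(compress(data, 1.0))  # small detail zeroed; the large jump survives
```

Levels of detail fall out of the same idea: recursing `haar_step` on the averages yields a multiresolution pyramid, and higher thresholds give higher compression at lower fidelity.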
toon2real: Translating Cartoon Images to Realistic Images
In image-to-image translation, Generative Adversarial Networks (GANs) have
achieved great success even in the unsupervised setting. In this work, we aim
to translate cartoon images to photo-realistic
images using GAN. We apply several state-of-the-art models to perform this
task; however, they fail to perform good quality translations. We observe that
the shallow difference between these two domains causes this issue. Based on
this idea, we propose a method based on the CycleGAN model for image
translation from the cartoon domain to the photo-realistic domain. To make our
model efficient, we implement Spectral Normalization, which adds stability to our model. We
demonstrate our experimental results and show that our proposed model achieves
the lowest Fréchet Inception Distance score and better results
compared to another state-of-the-art technique, UNIT.
Comment: Accepted as a short paper at ICTAI 2020.
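Spectral Normalization rescales each weight matrix by its largest singular value, which is typically estimated with power iteration. A stdlib-only sketch of that estimate (illustrative, not the toon2real implementation, which would apply this per layer during training):

```python
import random

def spectral_norm(W, iters=50):
    """Estimate the largest singular value (spectral norm) of a
    matrix W, given as a list of rows, by power iteration."""
    random.seed(0)                      # deterministic start vector
    n = len(W[0])
    v = [random.random() for _ in range(n)]
    for _ in range(iters):
        u = [sum(w * x for w, x in zip(row, v)) for row in W]    # u = W v
        u_norm = sum(x * x for x in u) ** 0.5
        u = [x / u_norm for x in u]
        v = [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(n)]  # v = W^T u
        v_norm = sum(x * x for x in v) ** 0.5
        v = [x / v_norm for x in v]
    Wv = [sum(w * x for w, x in zip(row, v)) for row in W]
    return sum(a * b for a, b in zip(u, Wv))                     # sigma = u^T W v

W = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm(W)
W_sn = [[w / sigma for w in row] for row in W]  # normalized weights
print(round(sigma, 3))  # -> 3.0
```

Dividing the weights by sigma caps the layer's Lipschitz constant at roughly 1, which is what stabilizes GAN discriminator training.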