396 research outputs found

A comprehensive survey of handwritten document benchmarks: structure, usage and evaluation

Publication venue: Springer
Publication date: 24/12/2015

Springer - Publisher Connector

Data display and analysis

Author: De Fanti D.
Meads J.
Sweet P.

Graphical character recognizer and data displa

NASA Technical Reports Server

Neural Dataset Generality

Author: Gattupalli Vijetha
Li Baoxin
Venkatesan Ragav
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/05/2016

The filters learned by Convolutional Neural Networks (CNNs) from different datasets often appear similar, most prominently in the first few layers. This similarity is exploited for transfer learning, and some studies have analysed such transferability of features. It is also used as an initialization technique for different tasks on the same dataset or for the same task on similar datasets. Off-the-shelf CNN features have capitalized on this idea to promote their networks as best transferable and most general, and are used in a cavalier manner in day-to-day computer vision tasks. It is curious that, while the filters learned by these CNNs are related to the atomic structures of the images from which they are learnt, all datasets learn similar-looking low-level filters. With the understanding that a dataset that contains many such atomic structures learns general filters and is therefore useful for initializing other networks, we propose a way to analyse and quantify generality among datasets from their accuracies on transferred filters. We applied this metric to several popular character recognition and natural image datasets and a medical image dataset, and arrived at some interesting conclusions. On further experimentation we also discovered that particular classes in a dataset are themselves more general than others. Comment: Long version of the paper accepted at IEEE International Conference on Image Processing 201

arXiv.org e-Print Archive

Crossref

Hand-written English numeral recognition system using neural network

Author: Choudhary Akash
Publication date: 08/07/2014

This thesis aims to implement an algorithm for the recognition of hand-written English numerals. Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. In this thesis the digits are classified into two groups: one group comprises blobs with or without stems, and the other comprises digits with stems only. The blobs are identified using a new concept called the morphological region filling technique. This eliminates the issue of finding the size of the blobs and their structuring elements, and it completely eliminates the complex process of recognizing horizontal or vertical lines. The extracted feature is then classified with the help of the neural network train tool. The resulting English numeral recognition algorithm is faster because it uses part of the character instead of the complete image.

ethesis@nitr

WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models

Author: Christlein Vincent
Liwicki Marcus
Mokayed Hamam
Nikolaidou Konstantina
Retsinas George
Seuret Mathias
Sfikas Giorgos
Smith Elisa Barney
Publication date: 29/03/2023

Text-to-Image synthesis is the task of generating an image according to a specific text description. Generative Adversarial Networks have been considered the standard method for image synthesis virtually since their introduction; recently, Denoising Diffusion Probabilistic Models have been setting a new baseline, with remarkable results in Text-to-Image synthesis, among other fields. Aside from its usefulness per se, Text-to-Image synthesis can also be particularly relevant as a tool for data augmentation to aid training models for other document image processing tasks. In this work, we present a latent diffusion-based method for styled text-to-text-content-image generation at the word level. Our proposed method manages to generate realistic word image samples from different writer styles by using class index styles and text content prompts, without the need for adversarial training, writer recognition, or text recognition. We gauge system performance with the Fréchet Inception Distance, writer recognition accuracy, and writer retrieval. We show that the proposed model produces samples that are aesthetically pleasing, help boost text recognition performance, and achieve a writer retrieval score similar to that of real data.

arXiv.org e-Print Archive

Luleå University of Technology Publications