4,426 research outputs found
Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
The idea of computer vision as the Bayesian inverse problem to computer
graphics has a long history and an appealing elegance, but it has proved
difficult to directly implement. Instead, most vision tasks are approached via
complex bottom-up processing pipelines. Here we show that it is possible to
write short, simple probabilistic graphics programs that define flexible
generative models and to automatically invert them to interpret real-world
images. Generative probabilistic graphics programs consist of a stochastic
scene generator, a renderer based on graphics software, a stochastic likelihood
model linking the renderer's output and the data, and latent variables that
adjust the fidelity of the renderer and the tolerance of the likelihood model.
Representations and algorithms from computer graphics, originally designed to
produce high-quality images, are instead used as the deterministic backbone for
highly approximate and stochastic generative models. This formulation combines
probabilistic programming, computer graphics, and approximate Bayesian
computation, and depends only on general-purpose, automatic inference
techniques. We describe two applications: reading sequences of degraded and
adversarially obscured alphanumeric characters, and inferring 3D road models
from vehicle-mounted camera images. Each of the probabilistic graphics programs
we present relies on under 20 lines of probabilistic code, and supports
accurate, approximately Bayesian inferences about ambiguous real-world images.Comment: The first two authors contributed equally to this wor
A Study of Techniques and Challenges in Text Recognition Systems
The core system for Natural Language Processing (NLP) and digitalization is Text Recognition. These systems are critical in bridging the gaps in digitization produced by non-editable documents, as well as contributing to finance, health care, machine translation, digital libraries, and a variety of other fields. In addition, as a result of the pandemic, the amount of digital information in the education sector has increased, necessitating the deployment of text recognition systems to deal with it. Text Recognition systems worked on three different categories of text: (a) Machine Printed, (b) Offline Handwritten, and (c) Online Handwritten Texts. The major goal of this research is to examine the process of typewritten text recognition systems. The availability of historical documents and other traditional materials in many types of texts is another major challenge for convergence. Despite the fact that this research examines a variety of languages, the Gurmukhi language receives the most focus. This paper shows an analysis of all prior text recognition algorithms for the Gurmukhi language. In addition, work on degraded texts in various languages is evaluated based on accuracy and F-measure
GROUNDTRUTH GENERATION AND DOCUMENT IMAGE DEGRADATION
The problem of generating synthetic data for the training and evaluation of document analysis systems has been widely addressed in recent years. With the increased interest in processing multilingual sources, however, there is a tremendous need to be able to rapidly generate data in new languages and scripts, without the need to develop specialized systems. We have developed a system, which uses language support of the MS Windows operating system combined with custom print drivers to render tiff images simultaneously with windows Enhanced Metafile directives. The metafile information is parsed to generate zone, line, word, and character ground truth including location, font information and content in any language supported by Windows. The resulting images can be physically or synthetically degraded by our degradation modules, and used for training and evaluating Optical Character Recognition (OCR) systems. Our document image degradation methodology incorporates several often-encountered types of noise at the page and pixel levels. Examples of OCR evaluation and synthetically degraded document images are given to demonstrate the effectiveness
- …