15,612 research outputs found

A generic method for structure recognition of handwritten mail documents

Author: Camillerapp Jean
Coüasnon Bertrand
Lemaitre Aurélie
Publication venue: HAL CCSD
Publication date: 01/01/2008

This paper presents a system to extract the logical structure of handwritten mail documents. It consists of two joint tasks: the segmentation of documents into blocks and the labeling of those blocks. The main label classes considered are: addressee details, sender details, date, subject, text body, signature. This work must cope with the difficulties of unconstrained handwritten documents: variable structure and writing. We propose a method based on a geometric analysis of the arrangement of elements in the document. We give a description of the document using a two-dimensional grammatical formalism, which makes it possible to easily introduce knowledge on mail into a generic parser. Our grammatical parser is LL(k), which means several combinations are tried before extracting the correct one. The main interest of this approach is that we can deal with weakly structured documents. Moreover, as the segmentation into blocks often depends on the associated classes, our method is able to retry a different segmentation until labeling succeeds. We validated this method in the context of the French national project RIMES, which proposed a contest on a large base of documents. We obtain a recognition rate of 91.7% on 1150 images.

HAL-Rennes 1

Video advertisement mining for predicting revenue using random forest

Author: Huang Yuan Hsin
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2015

Shaken by the threat of financial crisis in 2008, industries began to work on predictive analytics to control inventory levels efficiently and minimize revenue risks. In this third-generation age of web-connected data, organizations emphasized the importance of data science and leveraged data mining techniques to gain a competitive edge. Consider the features of Web 3.0, where semantic-oriented interaction between humans and computers can offer a tailored service or product to meet consumers' needs by learning their preferences. In this study, we concentrate on the area of marketing science to demonstrate the correlation between TV commercial advertisements and sales achievement. Through different data mining and machine-learning methods, this research develops one concrete and complete predictive framework to clarify the effects of word of mouth by using open data sources from YouTube. The uniqueness of this predictive model is that we adopt sentiment analysis as one of our predictors. This research offers a preliminary study on unstructured marketing data for further business use.

Purdue E-Pubs

The Production of Speech Corpora

Author: Baumann Angela
Draxler Christoph
Ellbogen Tania
Schiel Florian
Steffen Alexander
Publication venue
Publication date: 21/03/2012

Open Access LMU

Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks

Author: Alberti Michele
Ingold Rolf
Liwicki Marcus
Pondenkandath Vinaychandran
Seuret Mathias
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/10/2017

In this paper, we present a novel approach to layer-wise weight initialization of deep neural networks using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with random values, with greedy layer-wise pre-training (usually as a Deep Belief Network or as an auto-encoder), or by re-using the layers from another network (transfer learning). Hence, many training epochs are needed before meaningful weights are learned, or a rather similar dataset is required for seeding a fine-tuning of transfer learning. In this paper, we describe how to turn an LDA into either a neural layer or a classification layer. We analyze the initialization technique on historical documents. First, we show that an LDA-based initialization is quick and leads to a very stable initialization. Furthermore, for the task of layout analysis at pixel level, we investigate the effectiveness of LDA-based initialization and show that it outperforms state-of-the-art random weight initialization methods.

arXiv.org e-Print Archive

Crossref

A step towards understanding paper documents

Author: Dengel Andreas
Publication venue: Sonstige Einrichtungen. DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
Publication date: 01/01/1990

This report focuses on the analysis steps necessary for paper document processing. It is divided into three major parts: document image preprocessing, a knowledge-based geometric classification of the image, and an expectation-driven text recognition. It first illustrates several low-level image processing procedures that provide the physical document structure of a scanned document image. Furthermore, it describes a knowledge-based approach, developed for the identification of logical objects (e.g., the sender or the footnote of a letter) in a document image. The logical identifiers provide a context-restricted consideration of the text they contain. Using specific logical dictionaries, expectation-driven text recognition can identify text parts of specific interest. The system has been implemented for the analysis of single-sided business letters in Common Lisp on a SUN 3/60 Workstation. It runs on a large population of different letters. The report also illustrates and discusses examples of typical results obtained by the system.

Universaar

Acronym

Access to financial services: the case of the ‘Mzansi’ account in South Africa

Author: Beck
Besanko
Bester
Bondell
Chan
Claessens
De Meza
Efron
Fan
FinMark Trust
Kon
Kostov
Lensink
Meagher
Osili
Parker
Stiglitz
Publication venue: 'Elsevier BV'
Publication date: 01/06/2015

The rationing of financial services in developing countries is a major obstacle to achieving sustainable growth. In recent years there have been co-ordinated efforts to increase the level of financial inclusion, i.e. to reduce the supply-side constraints restricting access to finance. This paper aims to understand households' latent decision-making behaviour in accessing financial services, by analysing an entry-level Mzansi account in South Africa. The willingness to access financial services is not taken as given; instead, it is defined by perceptions and attitudes. The Mzansi intervention appeals to individuals with basic but insufficient financial education. Aspirations seem to be very influential in revealing the choice of financial services and, to this end, Mzansi is perceived as a pre-entry account that does not meet the aspirations of individuals aiming to climb the financial services ladder.

University of Essex Research Repository

CLoK

Crossref

Elsevier - Publisher Connector

Directory of Open Access Journals

Business Advisory in Private Equity

Author: Emanuele Cairo
Publication venue
Publication date

Research Papers in Economics