Client-Driven Content Extraction Associated with Table
The goal of the project is to extract table content from document images
based on learnt patterns. Real-world users, i.e., clients, first provide a set
of key fields within the table which they think are important. These are used
to build a graph whose nodes are labelled with semantics (including other
features) and whose edges are attributed with relations. An attributed
relational graph (ARG) is then employed to mine similar graphs from a document
image. Each mined graph represents an item within the table, and a set of such
graphs therefore composes the table. We have validated the concept on a
real-world industrial problem.
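The ARG idea above can be made concrete with a small sketch: a client-provided pattern is stored as a graph with attributed nodes (key fields) and edges (spatial relations), and candidate subgraphs mined from the page are kept only if their attributes match. The field names, attributes, and the exact-match criterion below are illustrative assumptions, not the paper's actual feature set.

```python
# Minimal sketch of an attributed relational graph (ARG) for one table item.
# Node labels (field semantics) and edge relations are hypothetical examples.
ITEM_PATTERN = {
    "nodes": {
        "qty":   {"type": "numeric"},
        "label": {"type": "text"},
        "price": {"type": "numeric"},
    },
    "edges": {
        ("qty", "label"):   {"relation": "right_of"},
        ("label", "price"): {"relation": "right_of"},
    },
}

def matches(pattern, candidate):
    """Naive matcher: the candidate graph must carry the same node
    attributes and edge relations as the client-provided pattern."""
    for node, attrs in pattern["nodes"].items():
        if candidate["nodes"].get(node) != attrs:
            return False
    for edge, attrs in pattern["edges"].items():
        if candidate["edges"].get(edge) != attrs:
            return False
    return True
```

In this reading, every matched candidate graph corresponds to one table row, and collecting all matches reconstructs the table; a practical system would of course use tolerant (inexact) graph matching rather than the exact comparison shown here.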
Dark sector interaction: a remedy of the tensions between CMB and LSS data
The well-known tensions on the cosmological parameters $H_0$ and $\sigma_8$
within the $\Lambda$CDM cosmology shown by the Planck-CMB and LSS data are
possibly due to systematics in the data or to our ignorance of some new
physics beyond the $\Lambda$CDM model. In this letter, we focus on the second
possibility and investigate a minimal extension of the $\Lambda$CDM model that
allows a coupling between its dark sector components (dark energy and dark
matter). We analyze this scenario with Planck-CMB, KiDS and HST data, and find
that the $H_0$ and $\sigma_8$ tensions disappear at 68\% CL. In the joint
analyses with Planck, HST and KiDS data, we find a non-zero coupling in the
dark sector at up to 99\% CL. Thus, we find strong statistical support from
the observational data for an interaction in the dark sector of the Universe
while solving the $H_0$ and $\sigma_8$ tensions simultaneously.
Comment: 5 pages, 3 figures
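An interacting dark sector of the kind described is commonly written as a pair of coupled continuity equations; the form below, with the energy-exchange rate $Q$ proportional to the dark-energy density, is one standard parametrization assumed here for illustration (the paper's exact coupling may differ):

\begin{align}
\dot{\rho}_{\rm dm} + 3H\rho_{\rm dm} &= Q, \\
\dot{\rho}_{\rm de} + 3H(1+w)\rho_{\rm de} &= -Q, \\
Q &= 3H\xi\rho_{\rm de},
\end{align}

where $H$ is the Hubble rate, $w$ the dark-energy equation of state, and $\xi$ the dimensionless coupling; $\xi = 0$ recovers $\Lambda$CDM, so a non-zero $\xi$ preferred by the data signals the dark-sector interaction.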
Handwritten and Printed Text Separation in Real Document
The aim of the paper is to separate handwritten and printed text in a real
document that is embedded with noise and graphics, including annotations.
Relying on the run-length smoothing algorithm (RLSA), the extracted
pseudo-lines and pseudo-words are used as basic blocks for classification. To
handle this, a multi-class support vector machine (SVM) with a Gaussian kernel
performs a first labelling of each pseudo-word, including the study of its
local neighbourhood. It then propagates the context between neighbours so that
possible labelling errors can be corrected. To address the running-time
complexity issue, we propose linear-complexity methods based on a constrained
k-NN. When a kd-tree is used, the running time is almost linearly proportional
to the number of pseudo-words. The performance of our system is close to 90%,
even with a very small learning dataset whose samples are basically composed
of complex administrative documents.
Comment: Machine Vision Applications (2013)
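The RLSA step mentioned above can be sketched in a few lines: along a scanline of a binary image, any run of background pixels shorter than a threshold is flipped to foreground, which merges nearby characters into pseudo-words (and, with a larger threshold, pseudo-lines). The list-based representation and threshold value below are illustrative assumptions, not the paper's implementation.

```python
def rlsa_row(row, c):
    """Horizontal run-length smoothing on one binary scanline.

    Any interior run of background (0) pixels strictly shorter than the
    threshold c is filled with foreground (1), merging nearby components.
    Leading and trailing background runs (page margins) are left untouched.
    """
    out = row[:]
    n = len(out)
    i = 0
    while i < n:
        if out[i] == 0:
            j = i
            while j < n and out[j] == 0:
                j += 1  # j now points one past the background run
            # fill only gaps bounded by foreground on both sides
            if 0 < i and j < n and (j - i) < c:
                for k in range(i, j):
                    out[k] = 1
            i = j
        else:
            i += 1
    return out
```

For example, with threshold 3 the two-pixel gap between the first two strokes is filled while the four-pixel gap is preserved: `rlsa_row([1,0,0,1,0,0,0,0,1], 3)` returns `[1,1,1,1,0,0,0,0,1]`. A full pipeline would apply this horizontally and vertically and combine the results before extracting connected components as pseudo-words.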