Incremental Machine Learning Techniques for Document Layout Understanding
In real-world Digital Libraries, Artificial Intelligence techniques are essential for handling automatic document processing with sufficient flexibility. The great variability in document kind, content and shape requires powerful representation formalisms to capture the full complexity of the domain. The continuous flow of new documents requires adaptable techniques that can progressively adjust the acquired knowledge about documents as new evidence becomes available, even extending the set of recognized document types if needed. Neither of these issues has yet been thoroughly studied. This paper presents an incremental first-order logic learning framework for automatically dealing with various kinds of evolution in the content of digital repositories: evolution in class definitions, evolution in the set of known classes, and evolution by addition of new unknown classes. Experiments show that the approach can be applied to real-world