9 research outputs found
On Cropped versus Uncropped Training Sets in Tabular Structure Detection
Automated document processing for tabular information extraction is highly
desired in many organizations, from industry to government. Prior works have
addressed this problem under table detection and table structure detection
tasks. Proposed solutions leveraging deep learning approaches have been giving
promising results in these tasks. However, the impact of dataset structures on
table structure detection has not been investigated. In this study, we provide
a comparison of table structure detection performance with cropped and
uncropped datasets. The cropped set consists of only table images that are
cropped from documents assuming tables are detected perfectly. The uncropped
set consists of regular document images. Experiments show that deep learning
models can improve the detection performance by up to 9% in average precision
and average recall on the cropped versions. Furthermore, the impact of cropped
images is negligible under the Intersection over Union (IoU) values of 50%-70%
when compared to the uncropped versions. However, beyond 70% IoU thresholds,
cropped datasets provide significantly higher detection performance
Automated Development of Semantic Data Models Using Scientific Publications
The traditional methods for analyzing information in digital documents have evolved with the ever-increasing volume of data. Some challenges in analyzing scientific publications include the lack of a unified vocabulary and a defined context, different standards and formats in presenting information, various types of data, and diverse areas of knowledge. These challenges hinder detecting, understanding, comparing, sharing, and querying information rapidly.
I design a dynamic conceptual data model with common elements in publications from any domain, such as context, metadata, and tables. To enhance the models, I use related definitions contained in ontologies and the Internet. Therefore, this dissertation generates semantically-enriched data models from digital publications based on the Semantic Web principles, which allow people and computers to work cooperatively. Finally, this work uses a vocabulary and ontologies to generate a structured characterization and organize the data models. This organization allows integration, sharing, management, and comparing and contrasting information from publications
A Table Detection Method for Multipage PDF Documents via Visual Seperators and Tabular Structures
Table detection is always an important task of document analysis and recognition. In this paper, we propose a novel and effective table detection method via visual separators and geometric content layout information, targeting at PDF documents. The visual separators refer to not only the graphic ruling lines but also the white spaces to handle tables with or without ruling lines. Furthermore, we detect page columns in order to assist table region delimitation in complex layout pages. Evaluations of our algorithm on an e-Book dataset and a scientific document dataset show competitive performance. It is noteworthy that the proposed method has been successfully incorporated into a commercial software package for large-scale Chinese e-Book production.http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000343450700153&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=8e1609b174ce4e31116a60747a720701Computer Science, Artificial IntelligenceEngineering, Electrical & ElectronicCPCI-S(ISTP)
Gaze-Based Human-Robot Interaction by the Brunswick Model
We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model copes with face-to-face dyadic interaction, assuming that the interactants are communicating through a continuous exchange of non verbal social signals, in addition to the spoken messages. Social signals have to be interpreted, thanks to a proper recognition phase that considers visual and audio information. The Brunswick model allows to quantitatively evaluate the quality of the interaction using statistical tools which measure how effective is the recognition phase. In this paper we cast this theory when one of the interactants is a robot; in this case, the recognition phase performed by the robot and the human have to be revised w.r.t. the original model. The model is applied to Berrick, a recent open-source low-cost robotic head platform, where the gazing is the social signal to be considered