Technical Report on Web-based Visual Corpus Construction for Visual Document Understanding
We present a dataset generator engine named Web-based Visual Corpus Builder
(Webvicob). Webvicob can readily construct a large-scale visual corpus (i.e.,
images with text annotations) from a raw Wikipedia HTML dump. In this report,
we validate that Webvicob-generated data covers a wide range of contexts and
knowledge and helps practitioners build a powerful Visual Document
Understanding (VDU) backbone. The proposed engine is publicly available at
https://github.com/clovaai/webvicob