SlideImages: A Dataset for Educational Image Classification
In the past few years, convolutional neural networks (CNNs) have achieved
impressive results in computer vision tasks, but these tasks mainly focus on
photos with natural scene content. Meanwhile, non-sensor-derived images such as
illustrations, data visualizations, figures, etc. are typically used to convey
complex information or to explore large datasets. However, this kind of image
has received little attention in computer vision. CNNs and similar techniques
require large volumes of training data. Currently, many document analysis systems
are trained in part on scene images due to the lack of large datasets of
educational image data. In this paper, we address this issue and present
SlideImages, a dataset for the task of classifying educational illustrations.
SlideImages contains training data collected from various sources, e.g.,
Wikimedia Commons and the AI2D dataset, and test data collected from
educational slides. We have reserved all the actual educational images as a
test dataset in order to ensure that the approaches using this dataset
generalize well to new educational images, and potentially other domains.
Furthermore, we present a baseline system using a standard deep neural
architecture and discuss dealing with the challenge of limited training data.
Comment: 8 pages, 2 figures, to be presented at ECIR 202