Tandem 2.0: Image and Text Data Generation Application

Abstract

First created in the spring of 2012 as part of the Digital Humanities Praxis course at the CUNY Graduate Center, Tandem explores the generation of datasets composed of text and image data by leveraging Optical Character Recognition (OCR), Natural Language Processing (NLP), and Computer Vision (CV). This project builds upon that earlier work in a new programming framework. While other developers and digital humanities scholars have created similar tools geared specifically toward NLP (e.g., Voyant-Tools), as well as algorithms for image processing and feature extraction on the CV side, Tandem explores the process of developing a more robust and user-friendly web-based multimodal data generator using modern development processes, with the intention of expanding the use of the tool among interested academics. Tandem functions as a full-stack JavaScript in-browser web application that allows a user to log in and upload a corpus of image files for OCR, NLP, and CV-based image processing to facilitate data generation. The corpora intended for this tool include picture books, comics, and other types of image- and text-based manuscripts, and are discussed in detail. Once images are processed, the application provides some key initial insights and lightly visualized data in a dashboard view for the user. As a research question, this project explores the viability of full-stack JavaScript application development for academic end products by looking at a variety of courses and literature that inspired the work, alongside the documented process of developing the application and proposed future enhancements for the tool. For those interested in further research or development, the full codebase for this project is available for download.
