Processing Large Amounts of Images on Hadoop with OpenCV

Abstract

Modern image collections cannot be processed efficiently on a single computer because of their large size and the high computational cost of modern image processing algorithms. Hence, image processing often requires distributed computing. However, distributed computing is a complicated subject that demands deep technical knowledge, and it is often inaccessible to the researchers who develop image processing algorithms. A framework is needed that lets researchers concentrate on image processing tasks and hides the complicated details of distributed computing from them. In addition, the framework should provide researchers with familiar image processing tools. This paper describes an extension to the MapReduce Image Processing (MIPr) framework that makes it possible to use OpenCV in a Hadoop cluster for distributed image processing. The modified MIPr framework allows image processing programs to be developed in Java using the OpenCV Java binding. Performance testing of the resulting system on a cloud cluster demonstrated near-linear scalability.
