3 research outputs found

    Heuristics-based detection to improve text/graphics segmentation in complex engineering drawings.

    The digitisation of complex engineering drawings is increasingly important to industry, given the pressure to improve the efficiency and time effectiveness of operational processes. There have been numerous attempts to solve this problem, either by proposing a general form of document interpretation or by establishing an application-dependent framework. Text/graphics segmentation has also been presented as a particular way of addressing the document digitisation problem, with the main aim of splitting text and graphics into different layers. Given the challenging characteristics of complex engineering drawings, this paper presents a novel sequential heuristics-based methodology aimed at localising and detecting the most representative symbols of the drawing. Detecting these symbols first allows a text/graphics segmentation method to be applied more effectively afterwards. The experimental framework is composed of two parts: first we show the performance of the symbol detection system, and then we evaluate three state-of-the-art text/graphics segmentation techniques for finding text in the remaining image.

    Symbols in engineering drawings (SiED): an imbalanced dataset benchmarked by convolutional neural networks.

    Engineering drawings are common across domains such as Oil & Gas, construction and mechanical engineering, among others. Automatic processing and analysis of these drawings is a challenging task, partly because of the complexity of these documents and partly because of the lack of publicly available datasets that could help push research in this area. In this paper, we present a multiclass imbalanced dataset for the research community, made up of 2432 instances of engineering symbols. These symbols were extracted from a collection of complex engineering drawings known as Piping and Instrumentation Diagrams (P&IDs). By providing such a dataset to the research community, we anticipate that it will help attract more attention to an important yet overlooked industrial problem and advance research on this timely topic. We discuss the dataset's characteristics in detail and show how Convolutional Neural Networks (CNNs) perform on such an extremely imbalanced dataset. Finally, conclusions and future directions are discussed.