
    Component Segmentation of Engineering Drawings Using Graph Convolutional Networks

    We present a data-driven framework to automate the vectorization and machine interpretation of 2D engineering part drawings. In industrial settings, most manufacturing engineers still rely on manual reads to identify the topological and manufacturing requirements from drawings submitted by designers. The interpretation process is laborious and time-consuming, which severely inhibits the efficiency of part quotation and manufacturing tasks. While recent advances in image-based computer vision methods have demonstrated great potential in interpreting natural images through semantic segmentation approaches, the application of such methods to parsing engineering technical drawings into semantically accurate components remains a significant challenge. The severe pixel sparsity in engineering drawings also restricts the effective featurization of image-based data-driven methods. To overcome these challenges, we propose a deep learning-based framework that predicts the semantic type of each vectorized component. Taking a raster image as input, we vectorize all components through thinning, stroke tracing, and cubic Bézier fitting. A graph of these components is then generated based on the connectivity between them. Finally, a graph convolutional neural network is trained on this graph data to identify the semantic type of each component. We test our framework in the context of semantic segmentation of text, dimension, and contour components in engineering drawings. Results show that our method yields the best performance compared with recent image- and graph-based segmentation methods. Comment: Preprint accepted to Computers in Industry
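
    As a concrete illustration of the classification step described above, the following Python sketch implements one standard form of graph convolution over a component graph. The feature layout, adjacency matrix, and two-layer architecture are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

# Minimal sketch of a graph-convolution classifier over a component graph,
# assuming node features X (one row per vectorized component, e.g. statistics
# of its Bezier control points) and an adjacency matrix A built from component
# connectivity. Illustrative only, not the authors' exact architecture.

def gcn_layer(X, A, W):
    """One graph-convolution layer: symmetrically normalised neighbourhood
    aggregation followed by a learned linear map and ReLU."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

# Toy example: 4 components, 8 features each, 3 semantic classes
# (text, dimension, contour).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 3))
scores = gcn_layer(gcn_layer(X, A, W1), A, W2)
print(scores.argmax(axis=1))  # predicted semantic type per component
```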

    Human perception in segmentation of sketches

    In this paper, we study the segmentation of sketched engineering drawings into a set of straight and curved segments. Our immediate objective is to produce a benchmarking method for segmentation algorithms. The criterion is to minimise the differences between what the algorithm detects and what human beings perceive. We have created a set of sketched drawings and have asked people to segment them. By analysing the produced segmentations, we have obtained the number and locations of the segmentation points which people perceive. Evidence collected during our experiments supports useful hypotheses, for example that not all kinds of segmentation points are equally difficult to perceive. The resulting methodology can be repeated with other drawings to obtain a set of sketches and segmentation data which could be used as a benchmark for segmentation algorithms, to evaluate their capability to emulate human perception of sketches.
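
    A benchmark of this kind reduces to matching an algorithm's detected segmentation points against the human-perceived ones. The Python sketch below scores a detector by greedy matching within a pixel tolerance and reporting precision, recall, and F-measure; the matching scheme and tolerance value are assumptions for illustration, not the paper's exact protocol.

```python
import numpy as np

# Hedged sketch of a segmentation-point benchmark: how well do an
# algorithm's detected points match the points human subjects perceive?

def match_points(human_pts, algo_pts, tol=5.0):
    """Greedily match algorithm points to human points within `tol` pixels;
    return (precision, recall, f_measure)."""
    human = np.asarray(human_pts, dtype=float)
    algo = np.asarray(algo_pts, dtype=float)
    unmatched = list(range(len(human)))  # human points not yet matched
    hits = 0
    for p in algo:
        if not unmatched:
            break
        d = np.linalg.norm(human[unmatched] - p, axis=1)
        j = int(np.argmin(d))
        if d[j] <= tol:
            hits += 1
            unmatched.pop(j)             # each human point matches once
    precision = hits / len(algo) if len(algo) else 0.0
    recall = hits / len(human) if len(human) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if hits else 0.0)
    return precision, recall, f_measure

# Example: humans marked three corner points; the detector found two of them.
print(match_points([(10, 10), (50, 12), (90, 40)],
                   [(11, 9), (52, 13), (70, 25)]))
```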

    The VIA Annotation Software for Images, Audio and Video

    In this paper, we introduce a simple and standalone manual annotation tool for images, audio and video: the VGG Image Annotator (VIA). This is a lightweight, standalone and offline software package that does not require any installation or setup and runs solely in a web browser. The VIA software allows human annotators to define and describe spatial regions in images or video frames, and temporal segments in audio or video. These manual annotations can be exported to plain text data formats such as JSON and CSV and therefore are amenable to further processing by other software tools. VIA also supports collaborative annotation of a large dataset by a group of human annotators. The BSD open source license of this software allows it to be used in any academic project or commercial application. Comment: to appear in Proceedings of the 27th ACM International Conference on Multimedia (MM '19), October 21-25, 2019, Nice, France. ACM, New York, NY, USA, 4 pages
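
    Since the annotations export to plain JSON, downstream tools can consume them directly. The Python sketch below reads rectangle regions from a VIA 2.x-style image-annotation export; the file name is hypothetical, and the exact keys should be checked against your own export.

```python
import json

# Minimal sketch of consuming a VIA region-annotation JSON export in another
# tool. The structure below follows the VIA 2.x image-annotation export (one
# entry per image, each with a list of regions); treat the exact keys as an
# assumption and verify them against your own file.

with open("via_export.json") as f:           # hypothetical file name
    project = json.load(f)

for entry in project.values():
    fname = entry["filename"]
    for region in entry["regions"]:
        shape = region["shape_attributes"]   # geometry (rect, polygon, ...)
        attrs = region["region_attributes"]  # user-defined labels
        if shape["name"] == "rect":
            box = (shape["x"], shape["y"], shape["width"], shape["height"])
            print(fname, box, attrs)
```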

    Text Detection in Natural Scenes and Technical Diagrams with Convolutional Feature Learning and Cascaded Classification

    An enormous number of digital images are being generated and stored every day. Understanding text in these images is an important challenge with large impacts for academic, industrial and domestic applications. Recent studies address the difficulty of separating text targets from noise and background, all of which vary greatly in natural scenes. To tackle this problem, we develop a text detection system to analyze and utilize visual information in a data-driven, automatic and intelligent way.

    The proposed method incorporates features learned from data, including patch-based coarse-to-fine detection (Text-Conv), connected component extraction using region growing, and graph-based word segmentation (Word-Graph). Text-Conv is a sliding-window-based detector, with convolution masks learned using the Convolutional k-means algorithm (Coates et al., 2011). Unlike convolutional neural networks (CNNs), a single vector/layer of convolution mask responses is used to classify patches. An initial coarse detection considers both local and neighboring patch responses, followed by refinement using varying aspect ratios and rotations for a smaller local detection window. Different levels of visual detail from ground truth are utilized in each step, first using constraints on bounding box intersections, and then a combination of bounding box and pixel intersections. Combining masks from different Convolutional k-means initializations, e.g., seeded using random vectors and then support vectors, improves performance. The Word-Graph algorithm uses contextual information to improve word segmentation and prune false character detections based on visual features and spatial context.

    Our system obtains pixel, character, and word detection f-measures of 93.14%, 90.26%, and 86.77% respectively for the ICDAR 2015 Robust Reading Focused Scene Text dataset, outperforming state-of-the-art systems and producing highly accurate text detection masks at the pixel level.

    To investigate the utility of our feature learning approach for other image types, we perform tests on 8-bit greyscale USPTO patent drawing diagram images. An ensemble of AdaBoost classifiers with different convolutional features (MetaBoost) is used to classify patches as text or background. The Tesseract OCR system is used to recognize characters in detected labels and enhance performance. With appropriate pre-processing and post-processing, f-measures of 82% for part label location, and 73% for valid part label locations and strings, are obtained, which are the best obtained to date for the USPTO patent diagram data set used in our experiments.

    To sum up, an intelligent refinement of Convolutional k-means-based feature learning and novel automatic classification methods are proposed for text detection, which obtain state-of-the-art results without the need for strong prior knowledge. Different ground truth representations along with features including edges, color, shape and spatial relationships are used coherently to improve accuracy. Different variations of feature learning are explored, e.g. support-vector-seeded clustering and MetaBoost, with results suggesting that increased diversity in learned features benefits convolution-based text detectors.
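
    The core feature-learning step, Convolutional k-means, replaces backprop-trained filters with cluster centres of normalised image patches. A hedged Python sketch of that idea follows; the patch size, k, and normalisation are illustrative choices, not the thesis's exact settings.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sketch of Convolutional k-means feature learning (after Coates et al.,
# 2011): learn convolution masks by clustering contrast-normalised patches,
# then use the single layer of mask responses as the patch feature vector.

def learn_masks(images, patch=8, k=64, n_samples=20000, seed=0):
    """Cluster random patches from `images` (list of 2D greyscale arrays)
    into k convolution masks."""
    rng = np.random.default_rng(seed)
    patches = []
    for _ in range(n_samples):
        img = images[rng.integers(len(images))]
        y = rng.integers(img.shape[0] - patch)
        x = rng.integers(img.shape[1] - patch)
        p = img[y:y + patch, x:x + patch].ravel().astype(float)
        p = (p - p.mean()) / (p.std() + 1e-8)    # contrast-normalise
        patches.append(p)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed)
    km.fit(np.array(patches))
    return km.cluster_centers_                    # k masks, each patch*patch

def patch_features(patch_img, masks):
    """Single vector of mask responses for one patch (no deep stack, unlike
    a CNN): dot product of the normalised patch with each learned mask."""
    p = patch_img.ravel().astype(float)
    p = (p - p.mean()) / (p.std() + 1e-8)
    return masks @ p
```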

    Design as communication in micro-strategy — strategic sensemaking and sensegiving mediated through designed artefacts.

    This paper relates key concepts of strategic cognition in micro-strategy to design practice. It considers the potential roles of designers' output in strategic sensemaking and sensegiving. Designed artefacts play well-known roles as communication media; sketches, renderings, models, and prototypes are created to explore and test possibilities and to communicate these options within and outside the design team. The paper draws on design and strategy literature to propose that designed artefacts can and do play a role as symbolic communication resources in sensemaking and sensegiving activities that impact strategic decision making and change. Extracts from interviews with three designers serve as illustrative examples. The paper is a call for further empirical exploration of this complex subject.

    Chart recognition and interpretation in document images

    Ph.D. (Doctor of Philosophy) thesis.

    Semi Automatic Segmentation of a Rat Brain Atlas

    A common approach to segmenting an MRI dataset is to use a standard atlas to identify different regions of interest. Existing 2D atlases, prepared by freehand tracings of templates, are seldom complete for 3D volume segmentation. Although many of these atlases are prepared in graphics packages like Adobe Illustrator® (AI), which present the geometrical entities based on their mathematical description, the drawings are not numerically robust. This work presents an automatic conversion of graphical atlases suitable for further usage, such as creation of a segmented 3D numerical atlas. The system begins with DXF (Drawing Exchange Format) files of individual atlas drawings. The drawing entities are mostly in cubic spline format. Each segment of the spline is reduced to polylines, which reduces the complexity of the data. The system merges overlapping nodes and polylines to make the database of the drawing numerically integrated, i.e. each location within the drawing is referred to by only one point, each line is uniquely defined by only two nodes, etc. Numerous integrity diagnostics are performed to eliminate duplicate or overlapping lines, extraneous markers, open-ended loops, etc. Numerically intact closed loops are formed using atlas labels as seed points. These loops specify the boundary and tissue type for each area. The final results preserve the original atlas with its 1272 different neuroanatomical regions, which are complete, non-overlapping, contiguous sub-areas whose boundaries are composed of unique polylines.
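
    Two of the steps above, flattening cubic spline segments into polylines and merging coincident nodes so each location is referred to by a single point, are sketched in the Python snippet below. The sampling density and merge tolerance are illustrative assumptions, not the values used in this work.

```python
import numpy as np

# Illustrative sketch: (1) flatten a cubic Bezier segment into a polyline;
# (2) merge nodes that coincide within a tolerance so the drawing database
# refers to each location by exactly one point.

def flatten_cubic(p0, p1, p2, p3, n=16):
    """Sample a cubic Bezier segment at n+1 evenly spaced parameter values."""
    t = np.linspace(0.0, 1.0, n + 1)[:, None]
    p0, p1, p2, p3 = map(np.asarray, (p0, p1, p2, p3))
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

def merge_nodes(points, tol=1e-3):
    """Map near-duplicate points to one canonical node; return the unique
    node array and, for each input point, its node index."""
    nodes, index = [], []
    for p in points:
        for i, q in enumerate(nodes):
            if np.linalg.norm(p - q) <= tol:
                index.append(i)
                break
        else:
            nodes.append(p)
            index.append(len(nodes) - 1)
    return np.array(nodes), index

poly = flatten_cubic((0, 0), (1, 2), (3, 2), (4, 0))
# A segment traced twice (once reversed) collapses onto the same nodes.
nodes, idx = merge_nodes(np.vstack([poly, poly[::-1]]))
print(len(poly) * 2, "samples ->", len(nodes), "unique nodes")
```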

    Automated CAD conversion with the Machine Drawing Understanding System: concepts, algorithms, and performance
