
    Task-based Adaptation of Graphical Content in Smart Visual Interfaces

    To be effective, visual representations must be adapted to their respective context of use, especially in so-called Smart Visual Interfaces, which strive to present exactly the information required for the task at hand. This thesis proposes a generic approach that facilitates the automatic generation of task-specific visual representations from suitable task descriptions. It discusses how the approach applies to four principal content types (raster images, 2D vector graphics, 3D graphics, and data visualizations) and how existing display techniques can be integrated into the approach.

    Fine Art Pattern Extraction and Recognition

    This is a reprint of articles from the Special Issue published online in the open access journal Journal of Imaging (ISSN 2313-433X), available at: https://www.mdpi.com/journal/jimaging/special_issues/faper2020.

    Labeling, discovering, and detecting objects in images

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. By Bryan Christopher Russell. Includes bibliographical references (p. 131-138).

    Recognizing the many objects that comprise our visual world is a difficult task. Confounding factors, such as intra-class object variation, clutter, pose, lighting, never-before-seen objects, scale, and lack of visual experience, often fool existing recognition systems. In this thesis, we explore three issues that address a few of these factors: the importance of labeled image databases for recognition, the ability to discover object categories simply by looking at many images, and the use of large labeled image databases to efficiently detect objects embedded in scenes. For each of these issues, we need to cope with large collections of images.

    We begin by introducing LabelMe, a large labeled image database collected from users via a web annotation tool. The users of the annotation tool provided information about the identity, location, and extent of objects in images. Through this effort, we have collected about 160,000 images and 200,000 object labels to date. We show that the database spans more object categories and scenes, and offers a wider range of appearance variation, than most other labeled databases for object recognition. We also provide four useful extensions of the database: (i) resolving synonym ambiguities that arise in the object labels, (ii) recovering object-part relationships, (iii) extracting a depth ordering of the labeled objects in an image, and (iv) providing a semi-automatic process for the fast labeling of images.

    We then seek to learn models of objects in the extreme case when no supervision is provided. We draw inspiration from the success of unsupervised topic discovery in text. We apply the Latent Dirichlet Allocation model of Blei et al. to unlabeled images to automatically discover object categories. To achieve this, we employ the visual words representation of images, which is analogous to the words in text. We show that our unsupervised model achieves classification performance comparable to a model trained with supervision on an unseen image set depicting several object classes. We also successfully localize the discovered object classes in images. While the image representation used for the object discovery process is simple to compute and can distinguish between different object categories, it does not capture explicit spatial information about regions in different parts of the image. We describe a procedure for combining image segmentation with the object discovery process.
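
    The visual-words-plus-LDA pipeline described above lends itself to a compact illustration. The following is a minimal sketch, not the thesis's implementation: scikit-learn's MiniBatchKMeans builds the visual vocabulary and LatentDirichletAllocation plays the role of the Blei et al. model, while random vectors stand in for real local descriptors such as SIFT.

```python
# Minimal sketch of unsupervised object discovery via visual words + LDA.
# Random descriptors stand in for real local features (e.g., SIFT).
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(5000, 128))   # pooled local descriptors
image_ids = rng.integers(0, 100, size=5000)  # source image of each descriptor

# 1. Visual vocabulary: cluster descriptors into "visual words".
vocab = MiniBatchKMeans(n_clusters=200, random_state=0).fit(descriptors)
words = vocab.predict(descriptors)

# 2. Each image becomes a "document": a histogram of its visual words.
counts = np.zeros((100, 200), dtype=int)
for img, w in zip(image_ids, words):
    counts[img, w] += 1

# 3. LDA topics play the role of discovered object categories.
lda = LatentDirichletAllocation(n_components=10, random_state=0)
topic_mix = lda.fit_transform(counts)        # per-image topic proportions
print(topic_mix[0].round(2))
```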

    Aspects of Synthetic Vision Display Systems and the Best Practices of NASA's SVS Project

    NASA's Synthetic Vision Systems (SVS) Project conducted research aimed at eliminating visibility-induced errors and low-visibility conditions as causal factors in civil aircraft accidents, while enabling the operational benefits of clear-day flight operations regardless of actual outside visibility. SVS takes advantage of many enabling technologies to achieve this capability, including, for example, the Global Positioning System (GPS), data links, radar, imaging sensors, geospatial databases, advanced display media, and three-dimensional video graphics processors. Integration of these technologies to achieve the SVS concept provides pilots with high-integrity information that improves situational awareness with respect to terrain, obstacles, traffic, and flight path. This paper emphasizes the system aspects of SVS (true systems, rather than just terrain on a flight display) and documents from a historical viewpoint many of the best practices that evolved during the SVS Project, from the perspective of some of the NASA researchers most heavily involved in its execution. The Integrated SVS Concepts are visions of what production-grade Synthetic Vision systems might, or perhaps should, be in order to provide the desired functional capabilities that eliminate low visibility as a causal factor in accidents and enable clear-day operational benefits regardless of visibility conditions.

    Intelligent road lane mark extraction using a Mobile Mapping System

    102 p. In recent years, road landmark inventory has attracted increasing interest in different areas: the maintenance of transport infrastructures, 3D road modelling, GIS applications, etc. Lane mark detection is posed as a two-class classification problem over a highly class-imbalanced dataset. To cope with this imbalance, we have applied Active Learning approaches. This thesis is divided into two main computational parts. In the first part, we evaluated different Machine Learning approaches on panoramic images obtained from an image sensor, such as Random Forest (RF) and ensembles of Extreme Learning Machines (V-ELM), obtaining satisfactory results in the detection of continuous road lane marks. In the second part of the thesis, we applied a Random Forest algorithm to a LiDAR point cloud, obtaining a georeferenced classification of horizontal road signs. We identified not only continuous lines but also every horizontal lane mark detected by the LiDAR sensor.
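
    The imbalance-aware setup can be sketched as pool-based active learning with uncertainty sampling; the query strategy below is an illustrative assumption, not the thesis's exact procedure, and the synthetic features stand in for the real image or LiDAR descriptors.

```python
# Illustrative sketch: pool-based active learning with uncertainty sampling
# for a two-class, heavily imbalanced problem (lane mark vs. background).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(10000, 16))
y_pool = (rng.random(10000) < 0.02).astype(int)  # ~2% positives: heavy imbalance
X_pool[y_pool == 1] += 1.0                       # give positives a detectable offset

# Seed labeled set, guaranteed to contain both classes.
pos, neg = np.flatnonzero(y_pool == 1), np.flatnonzero(y_pool == 0)
labeled = list(rng.choice(pos, size=10, replace=False)) + \
          list(rng.choice(neg, size=40, replace=False))
unlabeled = [i for i in range(len(X_pool)) if i not in set(labeled)]

for _ in range(5):
    clf = RandomForestClassifier(n_estimators=100,
                                 class_weight="balanced",  # counter the imbalance
                                 random_state=0)
    clf.fit(X_pool[labeled], y_pool[labeled])
    # Query the samples the forest is least certain about (prob. closest to 0.5).
    proba = clf.predict_proba(X_pool[unlabeled])[:, 1]
    query = np.argsort(np.abs(proba - 0.5))[:20]
    newly = [unlabeled[i] for i in query]
    labeled += newly                              # the "oracle" labels them
    unlabeled = [i for i in unlabeled if i not in set(newly)]
```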

    Automated thematic mapping and change detection of ERTS-A images

    The author has identified the following significant results. For the recognition of terrain types, spatial signatures are developed from the diffraction patterns of small areas of ERTS-1 images. This knowledge is exploited for the measurement of a small number of meaningful spatial features from the digital Fourier transforms of ERTS-1 image cells containing 32 x 32 picture elements. Using these spatial features and a heuristic algorithm, the computer recognized the terrain types in the vicinity of Phoenix, Arizona with high accuracy. When the spatial features were combined with spectral features under the maximum likelihood criterion, the recognition accuracy of terrain types increased substantially. It was determined that the recognition accuracy with the maximum likelihood criterion depends on the statistics of the feature vectors; nonlinear transformations of the feature vectors are required so that the terrain class statistics become approximately Gaussian. It was also determined that, for a given geographic area, the statistics of the classes remain invariant over a period of a month but vary substantially between seasons.
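
    The spatial-feature step can be sketched as follows. Since the abstract does not specify which features are taken from the Fourier transforms, the ring-energy summary below is an assumed placeholder that merely illustrates extracting a small number of spatial features from a 32 x 32 cell.

```python
# Hypothetical sketch: spatial features from the 2D Fourier transform of a
# 32 x 32 image cell, summarized as energy in concentric frequency rings.
# The paper's actual feature definitions are not specified; this is illustrative.
import numpy as np

def ring_energy_features(cell, n_rings=4):
    """Return total spectral energy per frequency ring for one image cell."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(cell))) ** 2
    h, w = cell.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)          # radial frequency per bin
    edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    return np.array([spectrum[(r >= lo) & (r < hi)].sum()
                     for lo, hi in zip(edges[:-1], edges[1:])])

cell = np.random.default_rng(0).normal(size=(32, 32))  # stand-in image cell
print(ring_energy_features(cell))                      # one feature per ring
```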

    Text Detection in Natural Scenes and Technical Diagrams with Convolutional Feature Learning and Cascaded Classification

    An enormous number of digital images are generated and stored every day. Understanding text in these images is an important challenge, with large impacts for academic, industrial, and domestic applications. Recent studies address the difficulty of separating text targets from noise and background, all of which vary greatly in natural scenes. To tackle this problem, we develop a text detection system that analyzes and utilizes visual information in a data-driven, automatic, and intelligent way.

    The proposed method incorporates features learned from data, including patch-based coarse-to-fine detection (Text-Conv), connected component extraction using region growing, and graph-based word segmentation (Word-Graph). Text-Conv is a sliding-window-based detector, with convolution masks learned using the Convolutional k-means algorithm (Coates et al., 2011). Unlike convolutional neural networks (CNNs), a single vector/layer of convolution mask responses is used to classify patches. An initial coarse detection considers both local and neighboring patch responses, followed by refinement using varying aspect ratios and rotations for a smaller local detection window. Different levels of visual detail from ground truth are utilized in each step, first using constraints on bounding box intersections, and then a combination of bounding box and pixel intersections. Combining masks from different Convolutional k-means initializations, e.g., seeded using random vectors and then support vectors, improves performance. The Word-Graph algorithm uses contextual information to improve word segmentation and prune false character detections based on visual features and spatial context.

    Our system obtains pixel, character, and word detection f-measures of 93.14%, 90.26%, and 86.77%, respectively, on the ICDAR 2015 Robust Reading Focused Scene Text dataset, outperforming state-of-the-art systems and producing highly accurate text detection masks at the pixel level.

    To investigate the utility of our feature learning approach for other image types, we perform tests on 8-bit greyscale USPTO patent drawing diagram images. An ensemble of AdaBoost classifiers with different convolutional features (MetaBoost) is used to classify patches as text or background. The Tesseract OCR system is used to recognize characters in detected labels and enhance performance. With appropriate pre-processing and post-processing, f-measures of 82% for part label location, and 73% for valid part label locations and strings, are obtained, the best to date for the USPTO patent diagram dataset used in our experiments.

    To sum up, an intelligent refinement of convolutional k-means-based feature learning and novel automatic classification methods are proposed for text detection, which obtain state-of-the-art results without the need for strong prior knowledge. Different ground truth representations, along with features including edges, color, shape, and spatial relationships, are used coherently to improve accuracy. Different variations of feature learning are explored, e.g., support-vector-seeded clustering and MetaBoost, with results suggesting that increased diversity in learned features benefits convolution-based text detectors.
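
    The Convolutional k-means step has a compact core that can be sketched as follows. This is a simplified reading of Coates et al. (whitening and the coarse-to-fine machinery are omitted): centroids of normalized patches act as convolution masks, and a patch is represented by a single vector of mask responses.

```python
# Simplified sketch of Convolutional k-means feature learning: centroids of
# normalized patches act as convolution masks, and a patch is represented by
# its responses to all masks (a single layer, unlike a CNN).
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
patches = rng.normal(size=(20000, 8 * 8))           # 8x8 patches, flattened

# Normalize each patch (zero mean, unit norm) before clustering.
patches -= patches.mean(axis=1, keepdims=True)
patches /= np.linalg.norm(patches, axis=1, keepdims=True) + 1e-8

kmeans = MiniBatchKMeans(n_clusters=96, random_state=0).fit(patches)
masks = kmeans.cluster_centers_                      # learned convolution masks

def mask_responses(patch):
    """Feature vector for one patch: its dot product with every mask."""
    p = patch - patch.mean()
    p /= np.linalg.norm(p) + 1e-8
    return masks @ p                                 # one response per mask

print(mask_responses(rng.normal(size=64)).shape)     # (96,)
```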

    A Digitization and Conversion Tool for Imaged Drawings to Intelligent Piping and Instrumentation Diagrams (P&ID)

    In the Fourth Industrial Revolution, artificial intelligence technology and big data science are emerging rapidly. To apply these information technologies to the engineering industries, it is essential to digitize the data that are currently archived in image or hard-copy format. For previously created design drawings, consistency between design products is reduced in the digitization process, and the accuracy and reliability of estimates of equipment and materials from the digitized drawings are remarkably low. In this paper, we propose a method and system for automatically recognizing and extracting design information from imaged piping and instrumentation diagram (P&ID) drawings and automatically generating digitized drawings based on the extracted data, using digital image processing techniques such as template matching and the sliding window method. First, the symbols are recognized by template matching, extracted from the imaged P&ID drawing, and registered automatically in the database. Then, lines and text are recognized and extracted from the imaged P&ID drawing using the sliding window method and aspect ratio calculation, respectively. The extracted symbols for equipment and lines are associated with the attributes of the closest text and are stored in the database in a neutral format. This is mapped to the predefined intelligent P&ID information and transformed into commercial P&ID tool formats with the associated information stored. As illustrated through the validation case studies, with the intelligent digitized drawings generated by the above automatic conversion system, the consistency of the design product is maintained, and the problems experienced with the traditional and manual P&ID input method by engineering companies, such as time consumption, missing items, and misspellings, are solved through the final fine-tuning validation process.
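
    The template matching step can be sketched with OpenCV's normalized cross-correlation; the synthetic image and the 0.8 score threshold below are placeholder assumptions, and in practice the drawing and symbol templates would be loaded from image files.

```python
# Illustrative sketch of the symbol recognition step: template matching with
# normalized cross-correlation. A synthetic image stands in for a real P&ID
# drawing (in practice both would be loaded with cv2.imread); the 0.8 score
# threshold is a placeholder assumption.
import cv2
import numpy as np

rng = np.random.default_rng(0)
drawing = (rng.random((512, 512)) * 255).astype(np.uint8)  # stand-in drawing
template = drawing[100:140, 200:260].copy()                # stand-in symbol crop

# Scores near 1.0 indicate locations where the symbol appears.
scores = cv2.matchTemplate(drawing, template, cv2.TM_CCOEFF_NORMED)
ys, xs = np.where(scores >= 0.8)
th, tw = template.shape
detections = [(int(x), int(y), tw, th) for x, y in zip(xs, ys)]
print(f"{len(detections)} candidate symbol locations")     # includes (200, 100)
```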