
    As-Built 3D Heritage City Modelling to Support Numerical Structural Analysis: Application to the Assessment of an Archaeological Remain

    Terrestrial laser scanning is a widely used technology for digitising archaeological, architectural and cultural heritage. Compared with traditional data acquisition methods, it allows the assets' real condition to be modelled. This paper, based on the case study of the basilica in the Baelo Claudia archaeological ensemble (Tarifa, Spain), justifies the need for accurate heritage modelling, as opposed to excessively simplified approaches, in order to support structural safety analysis. To do this, after validating the 3D meshing process from point cloud data, the semi-automatic digital reconstitution of the basilica columns is performed. Next, a geometric analysis is conducted to calculate the structural alterations of the columns. To determine the structural performance, focusing on both the accuracy and suitability of the geometric models, static and modal analyses are carried out by means of the finite element method (FEM) on three different models of the most unfavourable column in terms of structural damage: (1) as-built, (2) simplified and (3) ideal model without deformations. Finally, the outcomes show that as-built modelling enhances the conservation status analysis of the 3D heritage city (in terms of realistic compliance factor values), although further automation still needs to be implemented in the modelling process.
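    The sketch below illustrates the kind of geometric analysis the abstract describes (quantifying structural alterations of a scanned column), under simplifying assumptions: it is not the paper's workflow, the helper `column_geometry` is hypothetical, and the input is a synthetic point cloud rather than real laser-scan data.

    ```python
    # Hypothetical sketch: estimating a scanned column's tilt and radial deviation
    # from a point cloud, as a crude stand-in for the geometric analysis of
    # structural alterations described in the abstract (not the paper's method).
    import numpy as np

    def column_geometry(points: np.ndarray):
        """points: (N, 3) array of XYZ coordinates sampled on a single column."""
        centroid = points.mean(axis=0)
        centered = points - centroid

        # The first principal direction of the cloud approximates the column axis.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        axis = vt[0]
        if axis[2] < 0:          # orient the axis upwards
            axis = -axis

        # Tilt: angle between the estimated axis and the vertical (Z) direction.
        tilt_deg = np.degrees(np.arccos(np.clip(axis[2], -1.0, 1.0)))

        # Radial deviation: distance of each point from the axis; the spread of
        # these radii hints at out-of-roundness or material loss.
        along = centered @ axis
        radial_vec = centered - np.outer(along, axis)
        radii = np.linalg.norm(radial_vec, axis=1)
        return tilt_deg, radii.mean(), radii.std()

    # Toy usage with a synthetic, slightly leaning, noisy cylinder.
    rng = np.random.default_rng(0)
    theta = rng.uniform(0, 2 * np.pi, 5000)
    z = rng.uniform(0, 4.0, 5000)
    pts = np.c_[0.3 * np.cos(theta) + 0.02 * z,   # small lean in X
                0.3 * np.sin(theta),
                z] + rng.normal(0, 0.003, (5000, 3))
    print(column_geometry(pts))
    ```

    In the paper's actual pipeline, quantities of this kind would be derived from the validated 3D mesh and then fed into the FEM models; the snippet only shows how a principal-axis fit can summarise a column's deformation.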

    Multi-modal gated recurrent units for image description

    Using a natural language sentence to describe the content of an image is a challenging but very important task. It is challenging because a description must not only capture the objects contained in the image and the relationships among them, but also be relevant and grammatically correct. In this paper we propose a multi-modal embedding model based on gated recurrent units (GRU) which can generate a variable-length description for a given image. In the training step, we apply a convolutional neural network (CNN) to extract the image feature. The feature is then fed into the multi-modal GRU together with the corresponding sentence representations, and the multi-modal GRU learns the inter-modal relations between image and sentence. In the testing step, when an image is fed into our multi-modal GRU model, a sentence describing the image content is generated. The experimental results demonstrate that our multi-modal GRU model obtains state-of-the-art performance on the Flickr8K, Flickr30K and MS COCO datasets.
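    A minimal PyTorch sketch of the general encoder-decoder idea follows: CNN image features conditioning a GRU that decodes a caption. This is an illustration only, not the authors' multi-modal GRU architecture; the class name `CaptionGRU`, the feature dimension and the fusion-via-initial-hidden-state choice are all assumptions.

    ```python
    # Illustrative CNN-feature + GRU caption decoder (teacher forcing),
    # loosely matching the training setup described in the abstract.
    import torch
    import torch.nn as nn

    class CaptionGRU(nn.Module):
        def __init__(self, vocab_size, img_feat_dim=2048, embed_dim=256, hidden_dim=512):
            super().__init__()
            self.img_proj = nn.Linear(img_feat_dim, hidden_dim)   # fuse image into GRU state
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def forward(self, img_feat, captions):
            # img_feat: (B, img_feat_dim) from any CNN backbone's pooled layer
            # captions: (B, T) token ids of the sentence (teacher forcing)
            h0 = torch.tanh(self.img_proj(img_feat)).unsqueeze(0)  # (1, B, hidden_dim)
            emb = self.embed(captions)                             # (B, T, embed_dim)
            outputs, _ = self.gru(emb, h0)                         # (B, T, hidden_dim)
            return self.out(outputs)                               # (B, T, vocab_size) logits

    # Toy usage with random "image features" and token ids; in a real setup the
    # targets would be the input tokens shifted by one position.
    model = CaptionGRU(vocab_size=1000)
    feats = torch.randn(4, 2048)
    caps = torch.randint(0, 1000, (4, 12))
    logits = model(feats, caps)
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 1000), caps.reshape(-1))
    ```

    At test time, decoding would instead proceed token by token (greedy or beam search) from a start symbol, which is how the variable-length description mentioned in the abstract is produced.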

    Learning to Evaluate Performance of Multi-modal Semantic Localization

    Semantic localization (SeLo) refers to the task of obtaining the most relevant locations in large-scale remote sensing (RS) images using semantic information such as text. As an emerging task based on cross-modal retrieval, SeLo achieves semantic-level retrieval with only caption-level annotation, which demonstrates its great potential for unifying downstream tasks. Although SeLo has been addressed in successive works, no work has yet systematically explored and analyzed this urgent direction. In this paper, we thoroughly study this field and provide a complete benchmark, in terms of metrics and test data, to advance the SeLo task. Firstly, based on the characteristics of this task, we propose multiple discriminative evaluation metrics to quantify the performance of the SeLo task. The devised significant area proportion, attention shift distance and discrete attention distance are used to evaluate the generated SeLo map at the pixel and region levels. Next, to provide standard evaluation data for the SeLo task, we contribute a diverse, multi-semantic, multi-objective Semantic Localization Testset (AIR-SLT). AIR-SLT consists of 22 large-scale RS images and 59 test cases with different semantics, and aims to provide a comprehensive evaluation of retrieval models. Finally, we analyze the SeLo performance of RS cross-modal retrieval models in detail, explore the impact of different variables on this task, and provide a complete benchmark for the SeLo task. We also establish a new paradigm for RS referring expression comprehension, and demonstrate the great advantage of SeLo in semantics by combining it with tasks such as detection and road extraction. The proposed evaluation metrics, semantic localization test sets and corresponding scripts are available at github.com/xiaoyuan1996/SemanticLocalizationMetrics.
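    To make the pixel-level evaluation idea concrete, here is a hedged sketch of one metric. The formula is a plausible reading of "significant area proportion" (share of thresholded attention falling inside the annotated region), not necessarily the definition used in the paper; the exact metrics are given in the linked repository.

    ```python
    # Hedged sketch of a pixel-level SeLo-style metric. The thresholding scheme
    # and the name significant_area_proportion are assumptions for illustration.
    import numpy as np

    def significant_area_proportion(selo_map: np.ndarray,
                                    gt_mask: np.ndarray,
                                    threshold: float = 0.5) -> float:
        """selo_map: (H, W) attention values in [0, 1]; gt_mask: (H, W) boolean region."""
        significant = selo_map >= threshold          # pixels the model deems relevant
        if significant.sum() == 0:
            return 0.0
        hits = np.logical_and(significant, gt_mask)  # relevant pixels inside the annotation
        return hits.sum() / significant.sum()

    # Toy example: a Gaussian-like attention blob roughly centred on the annotation.
    h, w = 64, 64
    yy, xx = np.mgrid[0:h, 0:w]
    selo = np.exp(-(((yy - 20) ** 2 + (xx - 30) ** 2) / (2 * 8.0 ** 2)))
    mask = np.zeros((h, w), dtype=bool)
    mask[10:30, 20:40] = True
    print(round(significant_area_proportion(selo, mask), 3))
    ```

    Region-level measures such as the attention shift distance would instead compare, for example, the attention centroid against the annotated region's centre, which is why the benchmark evaluates SeLo maps at both levels.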