12,370 research outputs found
As-Built 3D Heritage City Modelling to Support Numerical Structural Analysis: Application to the Assessment of an Archaeological Remain
Terrestrial laser scanning is a widely used technology to digitise archaeological, architectural
and cultural heritage. This allows for modelling the assets’ real condition in comparison with
traditional data acquisition methods. This paper, based on the case study of the basilica in the Baelo
Claudia archaeological ensemble (Tarifa, Spain), justifies the need for accurate heritage modelling
against excessively simplified approaches in order to support structural safety analysis. To do this,
after validating the 3D meshing process from point cloud data, the semi-automatic digital reconstitution
of the basilica columns is performed. Next, a geometric analysis is conducted to calculate the structural
alterations of the columns. In order to determine the structural performance, focusing both on the
accuracy and suitability of the geometric models, static and modal analyses are carried out by means of
the finite element method (FEM) on three different models for the most unfavourable column in terms
of structural damage: (1) as-built, (2) simplified, and (3) ideal model without deformations. Finally,
the outcomes show that the as-built modelling enhances the conservation status analysis of the 3D
heritage city (in terms of realistic compliance factor values), although further automation still needs to
be implemented in the modelling process.
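For readers unfamiliar with modal analysis, the following is a minimal sketch of the underlying eigenproblem, K φ = ω² M φ, on a hypothetical two-degree-of-freedom lumped-mass system. It is not the paper's FEM model of the column: the function name, the stiffness matrix and the masses are illustrative assumptions, and a real FEM model would have many more degrees of freedom.

```python
import math


def natural_frequencies_2dof(k, m):
    """Return the two natural angular frequencies (rad/s) of a 2-DOF system.

    k: symmetric 2x2 stiffness matrix [[k11, k12], [k12, k22]] in N/m
    m: tuple (m1, m2) of lumped masses in kg (diagonal mass matrix)

    Solves K phi = w^2 M phi by scaling with M^(-1/2) and using the
    closed-form eigenvalues of the resulting symmetric 2x2 matrix.
    Assumes K is positive definite, so both eigenvalues are positive.
    """
    (k11, k12), (_, k22) = k
    m1, m2 = m
    # Entries of the mass-normalised stiffness matrix M^(-1/2) K M^(-1/2).
    a = k11 / m1
    d = k22 / m2
    b = k12 / math.sqrt(m1 * m2)
    # Eigenvalues of [[a, b], [b, d]] are mean -/+ radius.
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return math.sqrt(mean - radius), math.sqrt(mean + radius)


# Hypothetical example: two unit masses in a fixed-free chain of unit springs.
stiffness = [[2.0, -1.0], [-1.0, 1.0]]
masses = (1.0, 1.0)
w1, w2 = natural_frequencies_2dof(stiffness, masses)
```

Changing the stiffness entries, as a deformed as-built geometry would, shifts both frequencies, which is why the choice of geometric model matters for this kind of analysis.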
Multi-modal gated recurrent units for image description
Using a natural language sentence to describe the content of an image is a
challenging but very important task. It is challenging because a description
must not only capture objects contained in the image and the relationships
among them, but also be relevant and grammatically correct. In this paper, we propose a
multi-modal embedding model based on gated recurrent units (GRU) which can
generate a variable-length description for a given image. In the training step,
we apply the convolutional neural network (CNN) to extract the image feature.
Then the feature is imported into the multi-modal GRU as well as the
corresponding sentence representations. The multi-modal GRU learns the
inter-modal relations between image and sentence. And in the testing step, when
an image is imported to our multi-modal GRU model, a sentence which describes
the image content is generated. The experimental results demonstrate that our
multi-modal GRU model achieves state-of-the-art performance on the Flickr8K,
Flickr30K and MS COCO datasets.
Comment: 25 pages, 7 figures, 6 tables, magazin
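As a rough illustration of the recurrent unit this abstract builds on, here is a minimal sketch of a single standard GRU update step in plain Python. It is not the authors' multi-modal model: feeding the concatenated image feature and word embedding as the input `x` is an assumption made for illustration, and `gru_step`, `zero_params` and the parameter names are hypothetical. The sketch uses the convention h_new = (1 - z) * h + z * h_candidate; some references swap the roles of z and 1 - z.

```python
import math


def sigmoid(v):
    """Element-wise logistic function on a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def add(*vecs):
    """Element-wise sum of several equal-length vectors."""
    return [sum(items) for items in zip(*vecs)]


def gru_step(x, h, p):
    """One GRU update: returns the new hidden state given input x and state h."""
    z = sigmoid(add(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"]))  # update gate
    r = sigmoid(add(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"]))  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]                             # reset-gated state
    pre = add(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])
    h_cand = [math.tanh(a) for a in pre]                               # candidate state
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_cand)]


def zero_params(n_in, n_hid):
    """All-zero parameters, used here only to make the example easy to check."""
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = [[0.0] * n_in for _ in range(n_hid)]
        p["U" + gate] = [[0.0] * n_hid for _ in range(n_hid)]
        p["b" + gate] = [0.0] * n_hid
    return p


# Assumption: the image feature and word embedding are simply concatenated.
image_feature = [0.2, 0.4]   # hypothetical CNN feature
word_embedding = [0.1]       # hypothetical word embedding
x = image_feature + word_embedding
h_new = gru_step(x, [1.0, -2.0], zero_params(n_in=3, n_hid=2))
```

With all-zero parameters both gates equal 0.5 and the candidate state is zero, so the step simply halves the previous hidden state; trained weights would instead let the gates decide how much of the image and sentence context to keep.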
Learning to Evaluate Performance of Multi-modal Semantic Localization
Semantic localization (SeLo) refers to the task of obtaining the most
relevant locations in large-scale remote sensing (RS) images using semantic
information such as text. As an emerging task based on cross-modal retrieval,
SeLo achieves semantic-level retrieval with only caption-level annotation,
which demonstrates its great potential in unifying downstream tasks. Although
SeLo has been carried out successively, no work has yet systematically
explored and analyzed this urgent direction. In this paper, we
thoroughly study this field and provide a complete benchmark in terms of
metrics and test data to advance the SeLo task. Firstly, based on the
characteristics of this task, we propose multiple discriminative evaluation
metrics to quantify the performance of the SeLo task. The devised significant
area proportion, attention shift distance, and discrete attention distance are
utilized to evaluate the generated SeLo map at the pixel level and the region level.
Next, to provide standard evaluation data for the SeLo task, we contribute a
diverse, multi-semantic, multi-objective Semantic Localization Testset
(AIR-SLT). AIR-SLT consists of 22 large-scale RS images and 59 test cases with
different semantics, which aims to provide a comprehensive evaluation for
retrieval models. Finally, we analyze the SeLo performance of RS cross-modal
retrieval models in detail, explore the impact of different variables on this
task, and provide a complete benchmark for the SeLo task. We have also
established a new paradigm for RS referring expression comprehension, and
demonstrated the great advantage of SeLo in semantics through combining it with
tasks such as detection and road extraction. The proposed evaluation metrics,
semantic localization testsets, and corresponding scripts are openly available
at github.com/xiaoyuan1996/SemanticLocalizationMetrics.
Comment: 19 pages, 11 figure