18 research outputs found
A window to the past through modern urban environments: Developing a photogrammetric workflow for the orientation parameter estimation of historical images
The ongoing process of digitization in archives is providing access to ever-growing historical image collections. In many of these repositories, images can typically be viewed in a list or gallery view. Due to the growing number of digitized objects, this type of visualization is becoming increasingly difficult to navigate: among other things, it is hard to determine how many photographs show a particular object, and spatial information can only be communicated via metadata.
Within the scope of this thesis, research is conducted on the automated determination and provision of this spatial data. Enhanced visualization options make this information more easily accessible to scientists as well as citizens. Different types of visualizations can be presented in three-dimensional (3D), Virtual Reality (VR) or Augmented Reality (AR) applications. However, applications of this type require the estimation of the photographer's point of view. In the photogrammetric context, this is referred to as estimating the interior and exterior orientation parameters of the camera. For determining the orientation parameters of single images, there are the established methods of Direct Linear Transformation (DLT) and photogrammetric space resection. Using these methods requires the assignment of measured object points to their homologue image points. This is feasible for single images, but quickly becomes impractical for the large number of images available in archives. Thus, for larger image collections, usually the Structure-from-Motion (SfM) method is chosen, which allows the simultaneous estimation of the interior as well as the exterior orientation of the cameras. While this method yields good results especially for sequential, contemporary image data, its application to unsorted historical photographs poses a major challenge.
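The single-image case mentioned above can be illustrated with a minimal NumPy sketch of the DLT: each 2D-3D point correspondence contributes two linear equations in the twelve entries of the projection matrix, which is recovered (up to scale) as the null vector of the stacked system. The function name and the synthetic camera are illustrative, not taken from the thesis.

```python
import numpy as np

def dlt_projection(X, x):
    """Estimate a 3x4 projection matrix P (up to scale) from n >= 6
    correspondences between 3D object points X (n x 3) and their
    homologue 2D image points x (n x 2)."""
    A = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        A.append([Xw, Yw, Zw, 1, 0, 0, 0, 0, -u * Xw, -u * Yw, -u * Zw, -u])
        A.append([0, 0, 0, 0, Xw, Yw, Zw, 1, -v * Xw, -v * Yw, -v * Zw, -v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)

# Synthetic check: project known object points with a known camera,
# then recover that camera from the correspondences alone.
P_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, 20.0],
                   [0.0, 0.0, 1.0, 5.0]])
X = np.array([[0, 0, 1], [1, 0, 2], [0, 1, 3], [1, 1, 1],
              [2, 1, 2], [1, 2, 3], [2, 2, 1]], dtype=float)
x_hom = (P_true @ np.hstack([X, np.ones((7, 1))]).T).T
x = x_hom[:, :2] / x_hom[:, 2:]

P = dlt_projection(X, x)
P = P / P[2, 3] * P_true[2, 3]  # remove the projective scale ambiguity
```

With noise-free correspondences the recovered matrix matches the true one; with real measurements, the same null-vector solution serves as the initial value for a least-squares refinement.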
In the context of this work, which is mainly limited to scenarios of urban terrestrial photographs, the reasons for failure of the SfM process are identified. In contrast to sequential image collections, pairs of images from different points in time or from varying viewpoints show huge differences in scene representation, such as deviations in the lighting situation, building state, or seasonal changes. Since homologue image points have to be found automatically in image pairs or image sequences during the feature matching step of SfM, these image differences pose the most complex problem.
In order to test different feature matching methods, a pre-oriented historical dataset is necessary. Since such a benchmark dataset did not yet exist, eight historical image triples (corresponding to 24 image pairs) are oriented in this work by manual selection of homologue image points. This dataset allows the evaluation of newly published feature matching methods. The methods used initially, which are based on algorithmic procedures for feature matching (e.g., the Scale Invariant Feature Transform (SIFT)), provide satisfactory results for only a few of the image pairs in this dataset. By introducing methods that use neural networks for feature detection and feature description, homologue features can be reliably found for a large fraction of the image pairs in the benchmark dataset.
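The matching stage that these detector/descriptor combinations feed can be sketched independently of the descriptor used: each descriptor of the first image is assigned its nearest neighbour in the second image, and the match is kept only if it passes Lowe's ratio test. The function name and the toy descriptors below are illustrative.

```python
import numpy as np

def ratio_test_match(desc1, desc2, ratio=0.8):
    """Match descriptors of image 1 to image 2 by nearest neighbour,
    keeping a match only if the best distance is clearly smaller than
    the second-best distance (Lowe's ratio test)."""
    # Pairwise Euclidean distances between the two descriptor sets.
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    matches = []
    for i in range(len(desc1)):
        order = np.argsort(d[i])
        best, second = order[0], order[1]
        if d[i, best] < ratio * d[i, second]:
            matches.append((i, best))
    return matches

# Toy example: the two descriptors of image 1 are noisy copies of
# descriptors 2 and 0 of image 2, so those pairs should be matched.
desc2 = np.eye(4)
desc1 = desc2[[2, 0]] + 0.05
matches = ratio_test_match(desc1, desc2)
```

The ratio test is what makes matching on historical images fragile: when repetitive facades or radiometric differences make the best and second-best distances similar, correct matches are discarded.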
In addition to a successful feature matching strategy, determining the camera orientation requires an initial estimate of the principal distance. However, for historical images, the principal distance cannot be determined directly, as the camera information is usually lost during the process of digitizing the analog original. A possible solution to this problem is to use three vanishing points that are automatically detected in the historical image and from which the principal distance can then be derived. The combination of principal distance estimation and robust feature matching is integrated into the SfM process and allows the determination of the interior and exterior camera orientation parameters of historical images. Based on these results, a workflow is designed that allows archives to be directly connected to 3D applications.
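The vanishing-point route to the principal distance rests on a classical result: for three mutually orthogonal vanishing points, the principal point is the orthocenter of the triangle they form, and any pair satisfies (v_i - p) . (v_j - p) = -c^2, where c is the principal distance. A small NumPy sketch under the usual assumptions (square pixels, no skew; names and coordinates are illustrative):

```python
import numpy as np

def principal_distance(v1, v2, v3):
    """Principal point p and principal distance c (in pixels) from three
    mutually orthogonal vanishing points v1, v2, v3 in image coordinates.
    p is the orthocenter of the triangle (v1, v2, v3); then
    (v1 - p) . (v2 - p) = -c**2."""
    v1, v2, v3 = (np.asarray(v, dtype=float) for v in (v1, v2, v3))
    # Intersect two altitudes: the altitude through v3 is perpendicular
    # to the edge v1 - v2, the altitude through v1 to the edge v2 - v3.
    A = np.array([v1 - v2, v2 - v3])
    b = np.array([(v1 - v2) @ v3, (v2 - v3) @ v1])
    p = np.linalg.solve(A, b)
    c = np.sqrt(-(v1 - p) @ (v2 - p))
    return p, c

# Synthetic check: these vanishing points were generated with a camera of
# principal distance 800 px and principal point (320, 240).
p, c = principal_distance((1120, 1040), (-480, 240), (1120, -1360))
```

In practice the three vanishing points come from line segments detected on the building facades, so the estimate is only as reliable as the assumption of three dominant orthogonal directions in the scene.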
A search query in archives is usually performed using keywords, which have to be assigned to the corresponding object as metadata. Therefore, a keyword search for a specific building also yields hits on drawings, paintings, events, and interior or detailed views connected to this building. However, for the successful application of SfM in an urban context, primarily the photographic exterior view of the building is of interest. While the images for a single building can be sorted by hand, this process is too time-consuming for multiple buildings.
Therefore, in collaboration with the Competence Center for Scalable Data Services and Solutions (ScaDS), an approach is developed to filter historical photographs by image similarities. This method reliably enables the search for content-similar views via the selection of one or more query images. By linking this content-based image retrieval with the SfM approach, automatic determination of camera parameters for a large number of historical photographs is possible. The developed method represents a significant improvement over commercial and open-source SfM standard solutions.
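The retrieval idea can be sketched generically: embed every image as a feature vector with a CNN, then rank the repository by cosine similarity to the (mean) embedding of the query images. Below is a minimal NumPy sketch in which placeholder vectors stand in for the CNN embeddings; the function name and data are illustrative, not the ScaDS implementation.

```python
import numpy as np

def retrieve(query_vecs, gallery_vecs, top_k=3):
    """Rank gallery images by cosine similarity to the mean query embedding.
    In the real pipeline the vectors would be CNN feature embeddings of the
    historical photographs; here they are placeholder vectors."""
    q = np.mean(query_vecs, axis=0)
    q = q / np.linalg.norm(q)
    g = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    sims = g @ q
    ranking = np.argsort(-sims)[:top_k]
    return ranking, sims[ranking]

# Toy gallery of three "images"; the query resembles gallery image 0 most,
# image 2 somewhat, and image 1 hardly at all.
gallery = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.8, 0.6, 0.0]])
ranking, scores = retrieve(np.array([[1.0, 0.1, 0.0]]), gallery)
```

Accepting only the top-ranked images before running SfM is what filters out drawings, interior views, and other keyword hits that would otherwise break the reconstruction.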
The result of this work is a complete workflow from archive to application that automatically filters images and calculates the camera parameters. The expected accuracy of a few meters for the camera position is sufficient for the applications presented in this work, but offers further potential for improvement. A connection to archives, which will automatically exchange photographs and positions via interfaces, is currently under development. This makes it possible to retrieve interior and exterior orientation parameters directly from a historical photograph as metadata, which opens up new fields of research.

1 Introduction 1
1.1 Thesis structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Historical image data and archives . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Structure-from-Motion for historical images . . . . . . . . . . . . . . . . . . . 4
1.3.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3.2 Selection of images and preprocessing . . . . . . . . . . . . . . . . . . 5
1.3.3 Feature detection, feature description and feature matching . . . . . . 6
1.3.3.1 Feature detection . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.3.2 Feature description . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.3.3 Feature matching . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.3.4 Geometric verification and robust estimators . . . . . . . . . 13
1.3.3.5 Joint methods . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.4 Initial parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3.5 Bundle adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.3.6 Dense reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.3.7 Georeferencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.4 Research objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2 Generation of a benchmark dataset using historical photographs for the evaluation of feature matching methods 29
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.1.1 Image differences based on digitization and image medium . . . . . . . 30
2.1.2 Image differences based on different cameras and acquisition technique 31
2.1.3 Object differences based on different dates of acquisition . . . . . . . . 31
2.2 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3 The image dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4 Comparison of different feature detection and description methods . . . . . . 35
2.4.1 Oriented FAST and Rotated BRIEF (ORB) . . . . . . . . . . . . . . . 36
2.4.2 Maximally Stable Extremal Region Detector (MSER) . . . . . . . . . 36
2.4.3 Radiation-invariant Feature Transform (RIFT) . . . . . . . . . . . . . 36
2.4.4 Feature matching and outlier removal . . . . . . . . . . . . . . . . . . 36
2.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.6 Conclusions and future work . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3 Photogrammetry as a link between image repository and 4D applications 45
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 Multimodal access on repositories . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2.1 Conventional access . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2.2 Virtual access using online collections . . . . . . . . . . . . . . . . . . 48
3.2.3 Virtual museums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3 Workflow and access strategies . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3.2 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.3 Photogrammetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3.4 Browser access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.3.5 VR and AR access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4 An adapted Structure-from-Motion Workflow for the orientation of historical images 69
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2 Related Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.2.1 Historical images for 3D reconstruction . . . . . . . . . . . . . . . . . 72
4.2.2 Algorithmic Feature Detection and Matching . . . . . . . . . . . . . . 73
4.2.3 Feature Detection and Matching using Convolutional Neural Networks 74
4.3 Feature Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.4 Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.4.1 Step 1: Data preparation . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.4.2 Step 2.1: Feature Detection and Matching . . . . . . . . . . . . . . . . 78
4.4.3 Step 2.2: Vanishing Point Detection and Principal Distance Estimation 80
4.4.4 Step 3: Scene Reconstruction . . . . . . . . . . . . . . . . . . . . . . . 80
4.4.5 Comparison with Three Other State-of-the-Art SfM Workflows . . . . 81
4.5 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.7 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.8 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5 Fully automated pose estimation of historical images 97
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.2.1 Image Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.2.2 Feature Detection and Matching . . . . . . . . . . . . . . . . . . . . . 101
5.3 Data Preparation: Image Retrieval . . . . . . . . . . . . . . . . . . . . . . . . 102
5.3.1 Experiment and Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.3.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.3.2.1 Layer Extraction Approach (LEA) . . . . . . . . . . . . . . . 104
5.3.2.2 Attentive Deep Local Features (DELF) Approach . . . . . . 105
5.3.3 Results and Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.4 Camera Pose Estimation of Historical Images Using Photogrammetric Methods 110
5.4.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.4.1.1 Benchmark Datasets . . . . . . . . . . . . . . . . . . . . . . . 111
5.4.1.2 Retrieval Datasets . . . . . . . . . . . . . . . . . . . . . . . . 113
5.4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.4.2.1 Feature Detection and Matching . . . . . . . . . . . . . . . . 115
5.4.2.2 Geometric Verification and Camera Pose Estimation . . . . . 116
5.4.3 Results and Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 117
5.5 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
5.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6 Related publications 129
6.1 Photogrammetric analysis of historical image repositories for virtual reconstruction in the field of digital humanities . . . . . . . . . . . . . . 130
6.2 Feature matching of historical images based on geometry of quadrilaterals . . 131
6.3 Geo-information technologies for a multimodal access on historical photographs and maps for research and communication in urban history . . . . . 132
6.4 An automated pipeline for a browser-based, city-scale mobile 4D VR application based on historical images . . . . . . . . . . . . . . . . . . . . . . 133
6.5 Software and content design of a browser-based mobile 4D VR application to explore historical city architecture . . . . . . . . . . . . . . . . . . . . 134
7 Synthesis 135
7.1 Summary of the developed workflows . . . . . . . . . . . . . . . . . . . . . . . 135
7.1.1 Error assessment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.1.2 Accuracy estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.1.3 Transfer of the workflow . . . . . . . . . . . . . . . . . . . . . . . . . . 141
7.2 Developments and Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8 Appendix 149
8.1 Setup for the feature matching evaluation . . . . . . . . . . . . . . . . . . . . 149
8.2 Transformation from COLMAP coordinate system to OpenGL . . . . . . . . 150
References 151
List of Figures 165
List of Tables 167
List of Abbreviations 169
Coloring the Past: Neural Historical Buildings Reconstruction from Archival Photography
Historical buildings are a treasure and milestone of human cultural heritage, and reconstructing their 3D models holds significant value. The rapid development of neural rendering methods makes it possible to recover the 3D shape based only on archival photographs. However, this task presents considerable challenges due to the limitations of such datasets: historical photographs are often limited in number, the scenes in these photos might have altered over time, and the radiometric quality of the images is often sub-optimal. To address these challenges, we introduce an approach to reconstruct the geometry of historical buildings employing volumetric rendering techniques. We leverage dense point clouds as a geometric prior and introduce a color appearance embedding loss to recover the color of the building given the limited available color images. We aim for our work to spark increased interest in and focus on preserving historical buildings. Thus, we also introduce a new historical dataset of the Hungarian National Theater, providing a new benchmark for reconstruction methods.
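The volumetric rendering referred to above follows, in essence, the standard formulation used by neural radiance fields; as a hedged sketch (the symbols are the conventional ones, not taken from this paper), the color of a ray $\mathbf{r}$ is alpha-composited from $N$ samples along the ray:

\[ \hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right)\mathbf{c}_i, \qquad T_i = \exp\Big(-\sum_{j=1}^{i-1} \sigma_j \delta_j\Big), \]

where $\sigma_i$ and $\mathbf{c}_i$ are the predicted density and color of sample $i$ and $\delta_i$ is the distance between adjacent samples. Consistent with the color appearance embedding loss mentioned above, a per-image embedding can condition the color prediction, $\mathbf{c}_i = F_\theta(\mathbf{x}_i, \mathbf{d}, \boldsymbol{\ell}_{\mathrm{img}})$, so that radiometrically inconsistent archival photographs can share a single geometry while each keeps its own appearance.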
A 4D information system for the exploration of multitemporal images and maps using photogrammetry, web technologies and VR/AR
[EN] This contribution shows the comparison, investigation, and implementation of different access strategies for multimodal data. The first part of the research is structured as a theoretical part contrasting and explaining the terms conventional access, virtual archival access, and virtual museums, while additionally referencing related work. In particular, issues that still persist in repositories, such as ambiguous or missing metadata, are pointed out. The second part explains the practical implementation of a workflow from a large image repository to various four-dimensional (4D) applications. Mainly, the filtering of images and, subsequently, the orientation of images is explained. The selection of relevant images is partly done manually, but also with the use of deep convolutional neural networks for image classification. In the following, photogrammetric methods are used for finding the relative orientation between image pairs in a projective frame. For this purpose, an adapted Structure-from-Motion (SfM) workflow is presented, in which the step of feature detection and matching is replaced by the Radiation-invariant Feature Transform (RIFT) and Matching On Demand with View Synthesis (MODS). Both methods have been evaluated on a benchmark dataset and performed better than other approaches. Subsequently, the oriented images are placed interactively, and in the future automatically, in a 4D browser application showing images, maps, and building models. Further usage scenarios are presented in several Virtual Reality (VR) and Augmented Reality (AR) applications.
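The geometric verification implicit in this workflow, in which putative RIFT/MODS matches are accepted only if they agree with a common view geometry, can be sketched as a RANSAC loop. A real SfM pipeline would verify against a fundamental matrix or homography; the 2D affine model below merely keeps the sketch short, and all names and data are illustrative.

```python
import numpy as np

def ransac_affine(src, dst, iters=200, thresh=1.0, seed=0):
    """Toy geometric verification: fit dst ~ [src, 1] @ params with RANSAC
    and return the boolean inlier mask over the putative matches."""
    rng = np.random.default_rng(seed)
    n = len(src)
    best = np.zeros(n, dtype=bool)
    for _ in range(iters):
        idx = rng.choice(n, size=3, replace=False)
        M = np.hstack([src[idx], np.ones((3, 1))])
        try:
            # Six affine parameters from a minimal sample of three matches.
            params = np.linalg.solve(M, dst[idx])
        except np.linalg.LinAlgError:
            continue  # degenerate (collinear) sample
        pred = np.hstack([src, np.ones((n, 1))]) @ params
        inliers = np.linalg.norm(pred - dst, axis=1) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return best

# Six correct matches under a known affine map, plus two gross outliers.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
                [2, 1], [1, 2], [3, 2], [2, 3]], dtype=float)
dst = src @ np.array([[1.2, 0.1], [-0.1, 1.2]]) + np.array([5.0, -3.0])
dst[0] += 50.0           # corrupt match 0
dst[5] += [30.0, -40.0]  # corrupt match 5
mask = ransac_affine(src, dst)
```

Only the matches surviving this verification are passed on to the orientation step, which is what makes the workflow robust to the mismatches that historical image pairs inevitably produce.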
The new representation of the archival data enables spatial and temporal browsing of repositories, allowing research from innovative perspectives and the uncovering of historical details.

Highlights: Strategies for a completely automated workflow from image repositories to four-dimensional (4D) access approaches. The orientation of historical images using adapted and evaluated feature matching methods. 4D access methods for historical images and 3D models using web technologies and Virtual Reality (VR)/Augmented Reality (AR).

The research upon which this paper is based is part of the junior research group UrbanHistory4D's activities, which has received funding from the German Federal Ministry of Education and Research under grant agreement No 01UG1630. This work was supported by the German Federal Ministry of Education and Research (BMBF, 01IS18026BA-F) by funding the competence center for Big Data "ScaDS Dresden/Leipzig".

Maiwald, F.; Bruschke, J.; Lehmann, C.; Niebling, F. (2019). A 4D information system for the exploration of multitemporal images and maps using photogrammetry, web technologies and VR/AR. Virtual Archaeology Review, 10(21), 1-13. https://doi.org/10.4995/var.2019.11867
Cham: Springer International Publishing.Oliva, L. S., Mura, A., Betella, A., Pacheco, D., Martinez, E., & Verschure, P. (2015). Recovering the history of Bergen Belsen using an interactive 3D reconstruction in a mixed reality space the role of pre-knowledge on memory recollection. Paper presented at the 2015 Digital Heritage.Pani Paudel, D., Habed, A., Demonceaux, C., & Vasseur, P. (2015). Robust and optimal sum-of-squares-based point-to-plane registration of image sets and structured scenes. Paper presented at the Proceedings of the IEEE International Conference on Computer Vision.Ross, S., & Hedstrom, M. (2005). Preservation research and sustainable digital libraries. International journal on digital libraries, 5(4), 317-324.Schindler, G., & Dellaert, F. (2012). 4D Cities: Analyzing, Visualizing, and Interacting with Historical Urban Photo Collections. Journal of Multimedia, 7(2), 124-131.Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-cam: Visual explanations from deep networks via gradient-based localization. Paper presented at the Proceedings of the IEEE International Conference on Computer Vision.Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.Slater, M., & Sanchez-Vives, M. V. (2016). Enhancing our lives with immersive virtual reality. Frontiers in Robotics and AI, 3, 74.Styliani, S., Fotis, L., Kostas, K., & Petros, P. (2009). Virtual museums, a survey and some issues for consideration. Journal of cultural Heritage, 10(4), 520-528.Tschirschwitz, F., Büyüksalih, G., Kersten, T., Kan, T., Enc, G., & Baskaraca, P. (2019). Virtualising an Ottoman Fortress - Laser Scanning and 3D Modelling for the Development of an Interactive, Immersive Virtual Reality Application. International archives of the photogrammetry, remote sensing and spatial information sciences, 42(2/W9).Web3D Consortium. (2019). 
Open Standards for Real-Time 3D Communication. Retrieved April 30, 2019, from http://www.web3d.org/Wu, C. (2013). Towards linear-time incremental structure from motion. Paper presented at the 3D Vision-3DV 2013, 2013 International conference on.Wu, Y., Ma, W., Gong, M., Su, L., & Jiao, L. (2015). A Novel Point-Matching Algorithm Based on Fast Sample Consensus for Image Registration. IEEE Geosci. Remote Sensing Lett., 12(1), 43-47.Yoon, J., & Chung, E. (2011). Understanding image needs in daily life by analyzing questions in a social Q&A site. Journal of the American Society for Information Science and Technology, 62(11), 2201-2213
Supporting Learning in Art History – Artificial Intelligence in Digital Humanities Education
In recent years, and especially in the context of the coronavirus pandemic, digital distance learning has increased. At the same time, selecting adequate learning materials for educational purposes is becoming more and more complex for academic students. This marks only one starting point where the use of artificial intelligence (AI) offers additional value. AI has great potential to enhance and support research and education in the field of digital humanities (DH). As international organisations have recently stated, AI is a defining topic and will decisively shape the future development of educational processes.
Introducing an Automated Pipeline for a Browser-based, City-scale Mobile 4D VR Application Based on Historical Images
The process of automatically creating 3D city models from contemporary photographs and visualizing them on mobile devices is now well established, but historical 4D city models are more challenging; the fourth dimension here is time. This contribution describes an automated VR pipeline that turns historical photographs into an interactive, browser-based 4D visualization and information system rendered on mobile devices. Since the pipeline is still under development, initial results for individual stages of the process are presented and assessed for feasibility.
Browsing and Experiencing Repositories of Spatially Oriented Historic Photographic Images
Many institutions archive historical images of architecture in urban areas and make them available to scholars and the general public through online platforms. Users can explore these often huge repositories by faceted browsing or keyword-based searching. The metadata that enable these kinds of investigations, however, are often incomplete, imprecise, or even wrong. Thus, retrieving images of interest can be a cumbersome task for users such as art and architectural historians trying to answer their research questions. Many of these images, often containing historic buildings and landscapes, can be oriented spatially using automatic methods such as Structure-from-Motion (SfM). Providing spatially and temporally oriented images of urban architecture, in combination with advanced searching and exploration techniques, offers new potential for supporting historians in their research. We are developing a 3D web environment that enables historians to search and access historic photographic images in a spatial context. Related projects use 2D maps, showing only a planar view of the current urban situation. In this paper, we present an approach to create interactive views of 4D city models, i.e., 3D spatial models that show changes over time, to provide a better understanding of the urban building situation regarding the photographer's position and surroundings. A major feature of the application is the ability to spatially align 3D reconstruction models to photogrammetrically digitized models based on historical photographs. At the same time, this mixed-methods approach is used for validation of the 3D reconstructions.
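Orienting an image spatially ultimately means estimating the camera's projection matrix, from which the photographer's position follows. As an illustration of the underlying geometry only (not the project's actual implementation), here is a minimal NumPy sketch of the Direct Linear Transformation from known 3D-2D correspondences; all function names are ours.

```python
import numpy as np

def dlt_projection(X, x):
    # Direct Linear Transformation: estimate the 3x4 projection matrix P
    # from n >= 6 non-coplanar 3D-2D point correspondences by homogeneous
    # least squares (smallest right singular vector of the design matrix).
    rows = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        Xh = np.array([Xw, Yw, Zw, 1.0])
        rows.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))
        rows.append(np.concatenate([np.zeros(4), Xh, -v * Xh]))
    _, _, Vt = np.linalg.svd(np.array(rows))
    return Vt[-1].reshape(3, 4)

def camera_center(P):
    # The projection centre (the photographer's position) is the
    # right null-space vector of P, dehomogenized.
    _, _, Vt = np.linalg.svd(P)
    C = Vt[-1]
    return C[:3] / C[3]
```

In practice the correspondences come from measured object points and their homologue image points; SfM estimates the same quantities jointly for many cameras.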
Novel Approaches to research and discover Urban History
Photographs and plans are an essential source for historical research (Münster, Kamposiori, Friedrichs, & Kröber, 2018) and key objects in Digital Humanities (Kwastek, 2014). Numerous digital image archives, containing vast numbers of photographs, have been set up in the context of digitization projects. These extensive repositories of image media are still difficult to search. It is not easy to identify sources relevant for research, analyze and contextualize them, or compare them with the historical original. The eHumanities research group HistStadt4D, funded by the German Federal Ministry of Education and Research (BMBF) until July 2020, consists of 14 people, including 4 post-doctoral and 5 PhD researchers. Since a focal interest is to comprehensively investigate how to enhance the accessibility of large-scale image repositories, the researchers and research approaches originate from the humanities, geo- and information technologies, as well as educational and information studies. In contrast to adjacent projects dealing primarily with large-scale linked text data, such as the Venice Time Machine project ("The Venice Time Machine," 2017), the sources addressed by the junior group are primarily historical photographs and plans. Historical media and their contextual information are being transferred into a 4D model (3D spatial plus temporal scale) to support research and education on urban history. Content will be made accessible in two ways: via a 4D browser and a location-dependent augmented-reality representation. The prototype database consists of about 200,000 digitized historical photographs and plans of Dresden from the Deutsche Fotothek ("Deutsche Fotothek,").
Solving photogrammetric cold cases using AI-based image matching: New potential for monitoring the past with historical aerial images
With the ongoing digitization in archives, an increasing amount of historical data is becoming available for research. This includes historical aerial images, which provide detailed information about the depicted area. Among the applications enabled by these images are change detection of land use, land cover, glaciers, and coastal environments, as well as the observation of land degradation and natural hazards. Studying the depicted areas and occurring 3D deformations requires the generation of a digital surface model (DSM), which is usually obtained via photogrammetric Structure-from-Motion (SfM). However, conventional SfM workflows often fail to register historical aerial images due to their radiometric characteristics introduced by digitization, the original image quality, or vast temporal changes between epochs. We demonstrate that the feature matching step in the SfM pipeline is particularly crucial. To address this issue, we apply the two synergetic neural network methods SuperGlue and DISK, improving feature matching for historical aerial images. This requires several modifications to enable rotational invariance and to leverage the high resolution of aerial images. In contrast to other studies, our workflow does not require any prior information such as DSMs, flight height, focal lengths, or scan resolution, which is often no longer extant in archives. It is shown that our methods, using adapted parameter settings, are even able to deal with quasi texture-less images. This enables the simultaneous processing of various kinds of mono-temporal and multi-temporal data in a single workflow, from data preparation and feature matching through to camera parameter estimation and the generation of a sparse point cloud.
It outperforms conventional strategies in the number of correct feature matches, the number of registered images, and calculated 3D points, and allows the generation of multi-temporal DSMs of high quality. Owing to its flexibility, the method enables the automatic processing of data that was formerly unusable or could only be processed interactively, e.g. aerial images where the flight route is unknown or with difficult radiometric properties. This makes it possible to go back even further in time, where data quality usually decreases, and enables holistic monitoring and comparison of environments of high interest. The code is made publicly available at https://github.com/tudipffmgt/HAI-SFM
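The feature-matching step singled out above is the stage that SuperGlue and DISK replace with learned matching. For orientation, here is a minimal NumPy sketch of the conventional baseline it improves on: nearest-neighbour descriptor matching with Lowe's ratio test plus a mutual cross-check. Function and parameter names are ours, not from the paper.

```python
import numpy as np

def match_descriptors(d1, d2, ratio=0.8):
    # Brute-force nearest-neighbour matching between two descriptor
    # sets d1 (n, k) and d2 (m, k), m >= 2, keeping a match only if
    # it passes Lowe's ratio test and is mutually nearest.
    dist = np.linalg.norm(d1[:, None, :] - d2[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    best, second = order[:, 0], order[:, 1]
    idx = np.arange(len(d1))
    # Ratio test: best match must be clearly closer than the runner-up.
    ok = dist[idx, best] < ratio * dist[idx, second]
    # Cross-check: the best match must also point back to the query.
    back = np.argmin(dist, axis=0)
    mutual = back[best] == idx
    return [(int(i), int(best[i])) for i in np.flatnonzero(ok & mutual)]
```

With historical aerial images, it is precisely this distance-based matching that breaks down under radiometric differences between epochs, which motivates the learned alternatives.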
A 4D Information System for the Exploration of Multi-Temporal Images and Maps Using Photogrammetry, Web Technologies and VR/AR
Abstract. The historical images preserved in archives and in private collections represent not only valuable documentation of Cultural Heritage objects; sometimes they are the only remaining evidence of destroyed assets of our past. In the last few years, improvements in photogrammetric vision technologies and implementations of new Structure-from-Motion (SfM) algorithms have made it possible to extract metric information from such images in order to carry out a digital reconstruction of these lost masterpieces. The study presented in this paper aims to evaluate an SfM approach for the 3D reconstruction of a dome that collapsed in 1971, using historical images. The final goal is to provide not only a digital replica but also a physical reconstruction of a portion of the collapsed dome as a support for the recovered fragments of the fresco originally present on the surface of the dome.
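The metric extraction described above rests on recovering 3D points from corresponding measurements in two or more oriented images. As a sketch of that core SfM step only (the interface is ours, not the paper's), here is a minimal two-view linear triangulation in NumPy.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    # Linear (DLT) triangulation of a single 3D point from two views.
    # P1, P2: 3x4 projection matrices; x1, x2: (u, v) image points.
    # Each observation contributes two homogeneous constraints
    # u * P[2] - P[0] = 0 and v * P[2] - P[1] = 0 on the point X.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

Repeating this over all matched features across the oriented historical photographs yields the sparse point cloud from which the digital replica is densified.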