51 research outputs found

    Automatic Annotation of Images from the Practitioner Perspective

    No full text
    This paper describes an ongoing project which seeks to contribute to a wider understanding of the realities of bridging the semantic gap in visual image retrieval. A comprehensive survey of the means by which real image retrieval transactions are realised is being undertaken. An image taxonomy has been developed, in order to provide a framework within which account may be taken of the plurality of image types, user needs and forms of textual metadata. Significant limitations exhibited by current automatic annotation techniques are discussed, and a possible way forward using ontologically supported automatic content annotation is briefly considered as a potential means of mitigating these limitations

    Audiovisual and Chill: An Evaluation of Video Digital Libraries and Catalogues

    No full text
    Research Problem: This research investigates how well video digital libraries and catalogues used in academic libraries meet user expectations. This is in the context of increasing use and demand for online audiovisual content by the wider community, as well as growing use of audiovisual materials for teaching, learning, and research at academic institutions. It also aims to give an understanding of how well libraries are meeting the challenges of delivering audiovisual materials to users in an on-demand world. Methodology: Twelve platforms—developed between 1996 and 2015—are evaluated against 23 user-centred criteria, divided into four core areas: retrieval functionality, user interface, collection qualities, and user support. Results: The study found that not one of the platforms evaluated met all the evaluation criteria, and identified three key areas in the usability of the video digital libraries and catalogues: search and retrieval, technology, and structure, scope, and strategy. Implications: From this we gain an understanding of performance and usability of video digital libraries and catalogues currently in use by academic libraries. We also learn about the difficulties those working with audiovisual materials are facing, and also of the solutions that are being proposed. Findings of this study could help influence decision making, development of future platforms, and influence policies for delivering audiovisual materials to users

    An examination of automatic video retrieval technology on access to the contents of an historical video archive

    Get PDF
    Purpose – This paper aims to provide an initial understanding of the constraints that historical video collections pose to video retrieval technology and the potential that online access offers to both archive and users. Design/methodology/approach – A small and unique collection of videos on customs and folklore was used as a case study. Multiple methods were employed to investigate the effectiveness of technology and the modality of user access. Automatic keyframe extraction was tested on the visual content while the audio stream was used for automatic classification of speech and music clips. The user access (search vs browse) was assessed in a controlled user evaluation. A focus group and a survey provided insight on the actual use of the analogue archive. The results of these multiple studies were then compared and integrated (triangulation). Findings – The amateur material challenged automatic techniques for video and audio indexing, thus suggesting that the technology must be tested against the material before deciding on a digitisation strategy. Two user interaction modalities, browsing vs searching, were tested in a user evaluation. Results show users preferred searching, but browsing becomes essential when the search engine fails in matching query and indexed words. Browsing was also valued for serendipitous discovery; however the organisation of the archive was judged cryptic and therefore of limited use. This indicates that the categorisation of an online archive should be thought of in terms of users who might not understand the current classification. The focus group and the survey showed clearly the advantage of online access even when the quality of the video surrogate is poor. The evidence gathered suggests that the creation of a digital version of a video archive requires a rethinking of the collection in terms of the new medium: a new archive should be specially designed to exploit the potential that the digital medium offers. Similarly, users' needs have to be considered before designing the digital library interface, as needs are likely to be different from those imagined. Originality/value – This paper is the first attempt to understand the advantages offered and limitations held by video retrieval technology for small video archives like those often found in special collections

    The TREC-2002 video track report

    Get PDF
    TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930's and the 1970's by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. 17 teams representing 5 companies and 12 universities - 4 from Asia, 9 from Europe, and 4 from the US - participated in one or more of three tasks in the 2001 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview of, the track framework - the tasks, data, and measures - the approaches taken by the participating groups, the results, and issues regrading the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings

    Indexing, browsing and searching of digital video

    Get PDF
    Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver

    Symbiosis between the TRECVid benchmark and video libraries at the Netherlands Institute for Sound and Vision

    Get PDF
    Audiovisual archives are investing in large-scale digitisation efforts of their analogue holdings and, in parallel, ingesting an ever-increasing amount of born- digital files in their digital storage facilities. Digitisation opens up new access paradigms and boosted re-use of audiovisual content. Query-log analyses show the shortcomings of manual annotation, therefore archives are complementing these annotations by developing novel search engines that automatically extract information from both audio and the visual tracks. Over the past few years, the TRECVid benchmark has developed a novel relationship with the Netherlands Institute of Sound and Vision (NISV) which goes beyond the NISV just providing data and use cases to TRECVid. Prototype and demonstrator systems developed as part of TRECVid are set to become a key driver in improving the quality of search engines at the NISV and will ultimately help other audiovisual archives to offer more efficient and more fine-grained access to their collections. This paper reports the experiences of NISV in leveraging the activities of the TRECVid benchmark

    TRECVID 2007 - Overview

    Get PDF

    TRECVID 2004 - an overview

    Get PDF

    A window to the past through modern urban environments: Developing a photogrammetric workflow for the orientation parameter estimation of historical images

    Get PDF
    The ongoing process of digitization in archives is providing access to ever-increasing historical image collections. In many of these repositories, images can typically be viewed in a list or gallery view. Due to the growing number of digitized objects, this type of visualization is becoming increasingly complex. Among other things, it is difficult to determine how many photographs show a particular object and spatial information can only be communicated via metadata. Within the scope of this thesis, research is conducted on the automated determination and provision of this spatial data. Enhanced visualization options make this information more eas- ily accessible to scientists as well as citizens. Different types of visualizations can be presented in three-dimensional (3D), Virtual Reality (VR) or Augmented Reality (AR) applications. However, applications of this type require the estimation of the photographer’s point of view. In the photogrammetric context, this is referred to as estimating the interior and exterior orientation parameters of the camera. For determination of orientation parameters for single images, there are the established methods of Direct Linear Transformation (DLT) or photogrammetric space resection. Using these methods requires the assignment of measured object points to their homologue image points. This is feasible for single images, but quickly becomes impractical due to the large amount of images available in archives. Thus, for larger image collections, usually the Structure-from-Motion (SfM) method is chosen, which allows the simultaneous estimation of the interior as well as the exterior orientation of the cameras. While this method yields good results especially for sequential, contemporary image data, its application to unsorted historical photographs poses a major challenge. In the context of this work, which is mainly limited to scenarios of urban terrestrial photographs, the reasons for failure of the SfM process are identified. In contrast to sequential image collections, pairs of images from different points in time or from varying viewpoints show huge differences in terms of scene representation such as deviations in the lighting situation, building state, or seasonal changes. Since homologue image points have to be found automatically in image pairs or image sequences in the feature matching procedure of SfM, these image differences pose the most complex problem. In order to test different feature matching methods, it is necessary to use a pre-oriented historical dataset. Since such a benchmark dataset did not exist yet, eight historical image triples (corresponding to 24 image pairs) are oriented in this work by manual selection of homologue image points. This dataset allows the evaluation of frequently new published methods in feature matching. The initial methods used, which are based on algorithmic procedures for feature matching (e.g., Scale Invariant Feature Transform (SIFT)), provide satisfactory results for only few of the image pairs in this dataset. By introducing methods that use neural networks for feature detection and feature description, homologue features can be reliably found for a large fraction of image pairs in the benchmark dataset. In addition to a successful feature matching strategy, determining camera orientation requires an initial estimate of the principal distance. Hence for historical images, the principal distance cannot be directly determined as the camera information is usually lost during the process of digitizing the analog original. A possible solution to this problem is to use three vanishing points that are automatically detected in the historical image and from which the principal distance can then be determined. The combination of principal distance estimation and robust feature matching is integrated into the SfM process and allows the determination of the interior and exterior camera orientation parameters of historical images. Based on these results, a workflow is designed that allows archives to be directly connected to 3D applications. A search query in archives is usually performed using keywords, which have to be assigned to the corresponding object as metadata. Therefore, a keyword search for a specific building also results in hits on drawings, paintings, events, interior or detailed views directly connected to this building. However, for the successful application of SfM in an urban context, primarily the photographic exterior view of the building is of interest. While the images for a single building can be sorted by hand, this process is too time-consuming for multiple buildings. Therefore, in collaboration with the Competence Center for Scalable Data Services and Solutions (ScaDS), an approach is developed to filter historical photographs by image similarities. This method reliably enables the search for content-similar views via the selection of one or more query images. By linking this content-based image retrieval with the SfM approach, automatic determination of camera parameters for a large number of historical photographs is possible. The developed method represents a significant improvement over commercial and open-source SfM standard solutions. The result of this work is a complete workflow from archive to application that automatically filters images and calculates the camera parameters. The expected accuracy of a few meters for the camera position is sufficient for the presented applications in this work, but offer further potential for improvement. A connection to archives, which will automatically exchange photographs and positions via interfaces, is currently under development. This makes it possible to retrieve interior and exterior orientation parameters directly from historical photography as metadata which opens up new fields of research.:1 Introduction 1 1.1 Thesis structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Historical image data and archives . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Structure-from-Motion for historical images . . . . . . . . . . . . . . . . . . . 4 1.3.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3.2 Selection of images and preprocessing . . . . . . . . . . . . . . . . . . 5 1.3.3 Feature detection, feature description and feature matching . . . . . . 6 1.3.3.1 Feature detection . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3.3.2 Feature description . . . . . . . . . . . . . . . . . . . . . . . 9 1.3.3.3 Feature matching . . . . . . . . . . . . . . . . . . . . . . . . 10 1.3.3.4 Geometric verification and robust estimators . . . . . . . . . 13 1.3.3.5 Joint methods . . . . . . . . . . . . . . . . . . . . . . . . . . 16 1.3.4 Initial parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . 19 1.3.5 Bundle adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 1.3.6 Dense reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 1.3.7 Georeferencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 1.4 Research objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2 Generation of a benchmark dataset using historical photographs for the evaluation of feature matching methods 29 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2.1.1 Image differences based on digitization and image medium . . . . . . . 30 2.1.2 Image differences based on different cameras and acquisition technique 31 2.1.3 Object differences based on different dates of acquisition . . . . . . . . 31 2.2 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 2.3 The image dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 2.4 Comparison of different feature detection and description methods . . . . . . 35 2.4.1 Oriented FAST and Rotated BRIEF (ORB) . . . . . . . . . . . . . . . 36 2.4.2 Maximally Stable Extremal Region Detector (MSER) . . . . . . . . . 36 2.4.3 Radiation-invariant Feature Transform (RIFT) . . . . . . . . . . . . . 36 2.4.4 Feature matching and outlier removal . . . . . . . . . . . . . . . . . . 36 2.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.6 Conclusions and future work . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3 Photogrammetry as a link between image repository and 4D applications 45 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 IX Contents 3.2 Multimodal access on repositories . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.2.1 Conventional access . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.2.2 Virtual access using online collections . . . . . . . . . . . . . . . . . . 48 3.2.3 Virtual museums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.3 Workflow and access strategies . . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.3.2 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.3.3 Photogrammetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 3.3.4 Browser access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.3.5 VR and AR access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4 An adapted Structure-from-Motion Workflow for the orientation of historical images 69 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.2 Related Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 4.2.1 Historical images for 3D reconstruction . . . . . . . . . . . . . . . . . 72 4.2.2 Algorithmic Feature Detection and Matching . . . . . . . . . . . . . . 73 4.2.3 Feature Detection and Matching using Convolutional Neural Networks 74 4.3 Feature Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 4.4 Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 4.4.1 Step 1: Data preparation . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.4.2 Step 2.1: Feature Detection and Matching . . . . . . . . . . . . . . . . 78 4.4.3 Step 2.2: Vanishing Point Detection and Principal Distance Estimation 80 4.4.4 Step 3: Scene Reconstruction . . . . . . . . . . . . . . . . . . . . . . . 80 4.4.5 Comparison with Three Other State-of-the-Art SfM Workflows . . . . 81 4.5 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 4.7 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 4.8 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 5 Fully automated pose estimation of historical images 97 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 5.2.1 Image Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 5.2.2 Feature Detection and Matching . . . . . . . . . . . . . . . . . . . . . 101 5.3 Data Preparation: Image Retrieval . . . . . . . . . . . . . . . . . . . . . . . . 102 5.3.1 Experiment and Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 5.3.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 5.3.2.1 Layer Extraction Approach (LEA) . . . . . . . . . . . . . . . 104 5.3.2.2 Attentive Deep Local Features (DELF) Approach . . . . . . 105 5.3.3 Results and Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 106 5.4 Camera Pose Estimation of Historical Images Using Photogrammetric Methods 110 5.4.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 5.4.1.1 Benchmark Datasets . . . . . . . . . . . . . . . . . . . . . . . 111 5.4.1.2 Retrieval Datasets . . . . . . . . . . . . . . . . . . . . . . . . 113 5.4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 5.4.2.1 Feature Detection and Matching . . . . . . . . . . . . . . . . 115 5.4.2.2 Geometric Verification and Camera Pose Estimation . . . . . 116 5.4.3 Results and Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 117 5.5 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 5.A Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 6 Related publications 129 6.1 Photogrammetric analysis of historical image repositores for virtual reconstruction in the field of digital humanities . . . . . . . . . . . . . . . . . . . . . . . 130 6.2 Feature matching of historical images based on geometry of quadrilaterals . . 131 6.3 Geo-information technologies for a multimodal access on historical photographs and maps for research and communication in urban history . . . . . . . . . . 132 6.4 An automated pipeline for a browser-based, city-scale mobile 4D VR application based on historical images . . . . . . . . . . . . . . . . . . . . . . . . . . 133 6.5 Software and content design of a browser-based mobile 4D VR application to explore historical city architecture . . . . . . . . . . . . . . . . . . . . . . . . 134 7 Synthesis 135 7.1 Summary of the developed workflows . . . . . . . . . . . . . . . . . . . . . . . 135 7.1.1 Error assessment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 7.1.2 Accuracy estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 7.1.3 Transfer of the workflow . . . . . . . . . . . . . . . . . . . . . . . . . . 141 7.2 Developments and Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 8 Appendix 149 8.1 Setup for the feature matching evaluation . . . . . . . . . . . . . . . . . . . . 149 8.2 Transformation from COLMAP coordinate system to OpenGL . . . . . . . . 150 References 151 List of Figures 165 List of Tables 167 List of Abbreviations 169Der andauernde Prozess der Digitalisierung in Archiven ermöglicht den Zugriff auf immer grĂ¶ĂŸer werdende historische BildbestĂ€nde. In vielen Repositorien können die Bilder typischerweise in einer Listen- oder Gallerieansicht betrachtet werden. Aufgrund der steigenden Zahl an digitalisierten Objekten wird diese Art der Visualisierung zunehmend unĂŒbersichtlicher. Es kann u.a. nur noch schwierig bestimmt werden, wie viele Fotografien ein bestimmtes Motiv zeigen. Des Weiteren können rĂ€umliche Informationen bisher nur ĂŒber Metadaten vermittelt werden. Im Rahmen der Arbeit wird an der automatisierten Ermittlung und Bereitstellung dieser rĂ€umlichen Daten geforscht. Erweiterte Visualisierungsmöglichkeiten machen diese Informationen Wissenschaftlern sowie BĂŒrgern einfacher zugĂ€nglich. Diese Visualisierungen können u.a. in drei-dimensionalen (3D), Virtual Reality (VR) oder Augmented Reality (AR) Anwendungen prĂ€sentiert werden. Allerdings erfordern Anwendungen dieser Art die SchĂ€tzung des Standpunktes des Fotografen. Im photogrammetrischen Kontext spricht man dabei von der SchĂ€tzung der inneren und Ă€ußeren Orientierungsparameter der Kamera. Zur Bestimmung der Orientierungsparameter fĂŒr Einzelbilder existieren die etablierten Verfahren der direkten linearen Transformation oder des photogrammetrischen RĂŒckwĂ€rtsschnittes. Dazu muss eine Zuordnung von gemessenen Objektpunkten zu ihren homologen Bildpunkten erfolgen. Das ist fĂŒr einzelne Bilder realisierbar, wird aber aufgrund der großen Menge an Bildern in Archiven schnell nicht mehr praktikabel. FĂŒr grĂ¶ĂŸere BildverbĂ€nde wird im photogrammetrischen Kontext somit ĂŒblicherweise das Verfahren Structure-from-Motion (SfM) gewĂ€hlt, das die simultane SchĂ€tzung der inneren sowie der Ă€ußeren Orientierung der Kameras ermöglicht. WĂ€hrend diese Methode vor allem fĂŒr sequenzielle, gegenwĂ€rtige BildverbĂ€nde gute Ergebnisse liefert, stellt die Anwendung auf unsortierten historischen Fotografien eine große Herausforderung dar. Im Rahmen der Arbeit, die sich grĂ¶ĂŸtenteils auf Szenarien stadtrĂ€umlicher terrestrischer Fotografien beschrĂ€nkt, werden zuerst die GrĂŒnde fĂŒr das Scheitern des SfM Prozesses identifiziert. Im Gegensatz zu sequenziellen BildverbĂ€nden zeigen Bildpaare aus unterschiedlichen zeitlichen Epochen oder von unterschiedlichen Standpunkten enorme Differenzen hinsichtlich der Szenendarstellung. Dies können u.a. Unterschiede in der Beleuchtungssituation, des Aufnahmezeitpunktes oder SchĂ€den am originalen analogen Medium sein. Da fĂŒr die Merkmalszuordnung in SfM automatisiert homologe Bildpunkte in Bildpaaren bzw. Bildsequenzen gefunden werden mĂŒssen, stellen diese Bilddifferenzen die grĂ¶ĂŸte Schwierigkeit dar. Um verschiedene Verfahren der Merkmalszuordnung testen zu können, ist es notwendig einen vororientierten historischen Datensatz zu verwenden. Da solch ein Benchmark-Datensatz noch nicht existierte, werden im Rahmen der Arbeit durch manuelle Selektion homologer Bildpunkte acht historische Bildtripel (entspricht 24 Bildpaaren) orientiert, die anschließend genutzt werden, um neu publizierte Verfahren bei der Merkmalszuordnung zu evaluieren. Die ersten verwendeten Methoden, die algorithmische Verfahren zur Merkmalszuordnung nutzen (z.B. Scale Invariant Feature Transform (SIFT)), liefern nur fĂŒr wenige Bildpaare des Datensatzes zufriedenstellende Ergebnisse. Erst durch die Verwendung von Verfahren, die neuronale Netze zur Merkmalsdetektion und Merkmalsbeschreibung einsetzen, können fĂŒr einen großen Teil der historischen Bilder des Benchmark-Datensatzes zuverlĂ€ssig homologe Bildpunkte gefunden werden. Die Bestimmung der Kameraorientierung erfordert zusĂ€tzlich zur Merkmalszuordnung eine initiale SchĂ€tzung der Kamerakonstante, die jedoch im Zuge der Digitalisierung des analogen Bildes nicht mehr direkt zu ermitteln ist. Eine mögliche Lösung dieses Problems ist die Verwendung von drei Fluchtpunkten, die automatisiert im historischen Bild detektiert werden und aus denen dann die Kamerakonstante bestimmt werden kann. Die Kombination aus SchĂ€tzung der Kamerakonstante und robuster Merkmalszuordnung wird in den SfM Prozess integriert und erlaubt die Bestimmung der Kameraorientierung historischer Bilder. Auf Grundlage dieser Ergebnisse wird ein Arbeitsablauf konzipiert, der es ermöglicht, Archive mittels dieses photogrammetrischen Verfahrens direkt an 3D-Anwendungen anzubinden. Eine Suchanfrage in Archiven erfolgt ĂŒblicherweise ĂŒber Schlagworte, die dann als Metadaten dem entsprechenden Objekt zugeordnet sein mĂŒssen. Eine Suche nach einem bestimmten GebĂ€ude generiert deshalb u.a. Treffer zu Zeichnungen, GemĂ€lden, Veranstaltungen, Innen- oder Detailansichten. FĂŒr die erfolgreiche Anwendung von SfM im stadtrĂ€umlichen Kontext interessiert jedoch v.a. die fotografische Außenansicht des GebĂ€udes. WĂ€hrend die Bilder fĂŒr ein einzelnes GebĂ€ude von Hand sortiert werden können, ist dieser Prozess fĂŒr mehrere GebĂ€ude zu zeitaufwendig. Daher wird in Zusammenarbeit mit dem Competence Center for Scalable Data Services and Solutions (ScaDS) ein Ansatz entwickelt, um historische Fotografien ĂŒber BildĂ€hnlichkeiten zu filtern. Dieser ermöglicht zuverlĂ€ssig ĂŒber die Auswahl eines oder mehrerer Suchbilder die Suche nach inhaltsĂ€hnlichen Ansichten. Durch die VerknĂŒpfung der inhaltsbasierten Suche mit dem SfM Ansatz ist es möglich, automatisiert fĂŒr eine große Anzahl historischer Fotografien die Kameraparameter zu bestimmen. Das entwickelte Verfahren stellt eine deutliche Verbesserung im Vergleich zu kommerziellen und open-source SfM Standardlösungen dar. Das Ergebnis dieser Arbeit ist ein kompletter Arbeitsablauf vom Archiv bis zur Applikation, der automatisch Bilder filtert und diese orientiert. Die zu erwartende Genauigkeit von wenigen Metern fĂŒr die Kameraposition sind ausreichend fĂŒr die dargestellten Anwendungen in dieser Arbeit, bieten aber weiteres Verbesserungspotential. Eine Anbindung an Archive, die ĂŒber Schnittstellen automatisch Fotografien und Positionen austauschen soll, befindet sich bereits in der Entwicklung. Dadurch ist es möglich, innere und Ă€ußere Orientierungsparameter direkt von der historischen Fotografie als Metadaten abzurufen, was neue Forschungsfelder eröffnet.:1 Introduction 1 1.1 Thesis structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Historical image data and archives . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Structure-from-Motion for historical images . . . . . . . . . . . . . . . . . . . 4 1.3.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3.2 Selection of images and preprocessing . . . . . . . . . . . . . . . . . . 5 1.3.3 Feature detection, feature description and feature matching . . . . . . 6 1.3.3.1 Feature detection . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3.3.2 Feature description . . . . . . . . . . . . . . . . . . . . . . . 9 1.3.3.3 Feature matching . . . . . . . . . . . . . . . . . . . . . . . . 10 1.3.3.4 Geometric verification and robust estimators . . . . . . . . . 13 1.3.3.5 Joint methods . . . . . . . . . . . . . . . .

    TRECVID 2009 - goals, tasks, data, evaluation mechanisms and metrics

    Get PDF
    The TREC Video Retrieval Evaluation (TRECVID) 2009 was a TREC-style video analysis and retrieval evaluation, the goal of which was to promote progress in content-based exploitation of digital video via open, metrics-based evaluation. Over the last 9 years TRECVID has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. 63 teams from various research organizations — 28 from Europe, 24 from Asia, 10 from North America, and 1 from Africa — completed one or more of four tasks: high-level feature extraction, search (fully automatic, manually assisted, or interactive), copy detection, or surveillance event detection. This paper gives an overview of the tasks, data used, evaluation mechanisms and performanc
    • 

    corecore