33 research outputs found

    Aerial reconstructions via probabilistic data fusion

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 133-136). In this thesis we propose a probabilistic model that incorporates multi-modal noisy measurements, namely aerial images and Light Detection and Ranging (LiDAR), to recover scene geometry and appearance in order to build a photo-realistic 3D model of a given scene. In urban environments, these reconstructions have many applications, such as surveillance and urban planning. The proposed probabilistic model can be viewed as a data fusion model, in which the two data sources complement each other and allow for better results than when only a single one is present. Moreover, this modeling approach can capture uncertainty in reconstructions and can easily incorporate additional scene measurements when the sensor models are available. Furthermore, the results obtained with the proposed method are qualitatively comparable to those obtained with traditional structure from motion, despite differences in modeling approach and reconstruction goals. The trade-off between appearance and geometry that the model strikes across the different data sources can be used to obtain a similar (and sometimes superior) reconstruction of complex urban scenes with fewer image observations than traditional reconstruction methods. Extending beyond reconstructions, the proposed model has two alluring features: first, we are able to determine absolute scale and orientation, and second, we are able to detect moving objects. From an implementation standpoint, this thesis has shown how to leverage the power of graphics processing units (GPUs) and parallel programming to allow fast inference, achieving real-time rendering of scenes with hundreds of thousands of geometric primitives and inferring latent appearance, camera pose and geometry on the order of seconds each. By Randi Cabezas. S.M.

    Response-Based and Counterfactual Learning for Sequence-to-Sequence Tasks in NLP

    Many applications nowadays, such as a rising number of virtual personal assistants, rely on statistical machine-learnt models. To train statistical models, large amounts of labelled data are typically required, which are expensive and difficult to obtain. In this thesis, we investigate two approaches that alleviate the need for labelled data by leveraging feedback to model outputs instead. Both approaches are applied to two sequence-to-sequence tasks for Natural Language Processing (NLP): machine translation and semantic parsing for question-answering. Additionally, we define a new question-answering task based on the geographical database OpenStreetMap (OSM) and collect a corpus, NLmaps v2, with 28,609 question-parse pairs. With the corpus, we build semantic parsers for subsequent experiments. Furthermore, we are the first to design a natural language interface to OSM, for which we specifically tailor a parser. The first approach to learning from feedback given to model outputs considers a scenario where weak supervision is available by grounding the model in a downstream task for which labelled data has been collected. Feedback obtained from the downstream task is used to improve the model in a response-based on-policy learning setup. We apply this approach to improve a machine translation system, which is grounded in a multilingual semantic parsing task, by employing ramp loss objectives. Next, we improve a neural semantic parser where only gold answers, but not gold parses, are available, by lifting ramp loss objectives to non-linear neural networks. In the second approach to learning from feedback, instead of collecting expensive labelled data, a model is deployed and user-model interactions are recorded in a log. This log is used to improve a model in a counterfactual off-policy learning setup. We first exemplify this approach on a domain adaptation task for machine translation.
Here, we show that counterfactual learning can be applied to tasks with large output spaces and that, in contrast to prevalent theory, deterministic logs can successfully be used on sequence-to-sequence tasks for NLP. Next, we demonstrate on a semantic parsing task that counterfactual learning can also be applied when the underlying model is a neural network and feedback is collected from human users. Applying both approaches to the same semantic parsing task allows us to draw a direct comparison between them. Response-based on-policy learning outperforms counterfactual off-policy learning, but requires expensive labelled data for the downstream task, whereas interaction logs for counterfactual learning can be easier to obtain in various scenarios.

    Grouping Uncertain Oriented Projective Geometric Entities with Application to Automatic Building Reconstruction

    The fully automatic reconstruction of 3D scenes from a set of 2D images has always been a key issue in photogrammetry and computer vision and has not been solved satisfactorily so far. Most of the current approaches match features between the images based on radiometric cues, followed by a reconstruction using the image geometry. The motivation for this work is the conjecture that in the presence of highly redundant data it should be possible to recover the scene structure by grouping together geometric primitives in a bottom-up manner. Oriented projective geometry is used throughout this work; it allows the representation of geometric primitives, such as points, lines and planes in 2D and 3D space, as well as projective cameras, together with their uncertainty. The first major contribution of the work is the use of uncertain oriented projective geometry, rather than uncertain projective geometry, which enables the representation of more complex compound entities, such as line segments and polygons in 2D and 3D space as well as 2D edgels and 3D facets. Within the uncertain oriented projective framework, a procedure is developed that allows pairwise relations between the various uncertain oriented projective entities to be tested. Again, the novelty lies in the possibility of checking relations between the novel compound entities. The second major contribution of the work is the development of a data structure specifically designed to perform the tests between large numbers of entities efficiently. With the ability to test relations between the geometric entities efficiently, a framework for grouping those entities together is developed. Various grouping methods are discussed.
The third major contribution of this work is the development of a novel grouping method that, by analyzing the entropy change incurred by incrementally adding observations to an estimation, is able to balance efficiency against robustness in order to achieve better grouping results. Finally, the applicability of the proposed representations, tests and grouping methods to the task of purely geometry-based building reconstruction from oriented aerial images is demonstrated. It will be shown that in the presence of highly redundant datasets it is possible to achieve reasonable reconstruction results by grouping together geometric primitives.