1,944 research outputs found

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling

    Methods for Ellipse Detection from Edge Maps of Real Images


    The Sound of Music: Externalist Style

    Philosophical exploration of individualism and externalism in the cognitive sciences has most recently focused on general evaluations of these two views (Adams & Aizawa 2008, Rupert 2008, Wilson 2004, Clark 2008). Here we return to and broaden an earlier phase of the debate between individualists and externalists about cognition, one that considered particular theories in detail, such as those in developmental psychology (Patterson 1991) and the computational theory of vision (Burge 1986, Segal 1989). Music cognition is an area in the cognitive sciences that has received little attention from philosophers, though it has relatively recently been thrown into the externalist spotlight (Cochrane 2008, Kruger 2014, Kersten forthcoming). Given that individualism can be thought of as a kind of paradigm for research on cognition, we provide a brief overview of the field of music cognition and individualistic tendencies within the field (sections 2 and 3) before turning to consider externalist alternatives to individualistic paradigms (sections 4-5) and then arguing for a qualified form of externalism about music cognition (section 6).

    Extraction of buildings from high-resolution satellite data and airborne Lidar

    Automatic building extraction is a difficult object recognition problem because of the high complexity of the scene content and the object representation. Selecting appropriate building models to reconstruct poses a dilemma: the models have to be generic enough to represent a variety of building shapes, yet specific enough to differentiate buildings from other objects in the scene. A scientific challenge of building extraction therefore lies in constructing a framework for modelling building objects with an appropriate balance between generic and specific models. This thesis investigates a synergy of IKONOS satellite imagery and airborne LIDAR data, which have recently emerged as powerful remote sensing tools, and aims to develop an automatic system that delineates building outlines of more complex shape while relying less on geometric constraints. The method described in this thesis is a two-step procedure: building detection and building description. A method of automatic building detection that can separate individual buildings from surrounding features is presented. The process follows a hierarchical strategy in which terrain, trees, and building objects are detected sequentially. Major research effort goes into the development of a LIDAR filtering technique that automatically detects terrain surfaces from a cloud of 3D laser points. The thesis also proposes a method of building description to automatically reconstruct building boundaries. A building object is generally represented as a mosaic of convex polygons. The first stage generates polygonal cues by a recursive intersection of both data-driven and model-driven linear features extracted from IKONOS imagery and LIDAR data. The second stage collects the relevant polygons comprising the building object and merges them to reconstruct the building outlines.
The developed LIDAR filter was tested on a range of different landforms and showed good results, meeting most of the requirements of DTM generation and building detection. The implemented building extraction system was also able to reconstruct the building outlines successfully, and the accuracy of the building extraction is good enough for mapping purposes.


    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.

    Grouping Uncertain Oriented Projective Geometric Entities with Application to Automatic Building Reconstruction

    The fully automatic reconstruction of 3D scenes from a set of 2D images has always been a key issue in photogrammetry and computer vision and has not been solved satisfactorily so far. Most current approaches match features between the images based on radiometric cues, followed by a reconstruction using the image geometry. The motivation for this work is the conjecture that in the presence of highly redundant data it should be possible to recover the scene structure by grouping together geometric primitives in a bottom-up manner. Oriented projective geometry is used throughout this work; it allows geometric primitives such as points, lines and planes in 2D and 3D space, as well as projective cameras, to be represented together with their uncertainty. The first major contribution of the work is the use of uncertain oriented projective geometry, rather than uncertain projective geometry, which enables the representation of more complex compound entities, such as line segments and polygons in 2D and 3D space as well as 2D edgels and 3D facets. Within the uncertain oriented projective framework a procedure is developed for testing pairwise relations between the various uncertain oriented projective entities. Again, the novelty lies in the possibility of checking relations between the novel compound entities. The second major contribution of the work is the development of a data structure specifically designed to perform these tests between large numbers of entities efficiently. Given the ability to test relations between the geometric entities efficiently, a framework for grouping those entities together is developed. Various grouping methods are discussed.
The third major contribution of this work is the development of a novel grouping method that, by analyzing the entropy change incurred by incrementally adding observations to an estimation, is able to balance efficiency against robustness in order to achieve better grouping results. Finally, the applicability of the proposed representations, tests and grouping methods to the task of purely geometry-based building reconstruction from oriented aerial images is demonstrated. It is shown that in the presence of highly redundant datasets it is possible to achieve reasonable reconstruction results by grouping together geometric primitives.
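
As a rough illustration of what an "entropy change incurred by incrementally adding observations" can look like, the following assumes (this is an assumption, not something the abstract states) that the estimate is modelled as an n-dimensional Gaussian with covariance matrix \Sigma. The symbols \Sigma_k, A and W are hypothetical names introduced here for the covariance after k observations, the design matrix of the new observation and its weight matrix.

```latex
% Differential entropy of an n-dimensional Gaussian estimate with covariance \Sigma
h(\Sigma) = \tfrac{1}{2} \ln\!\left( (2\pi e)^{n} \det \Sigma \right)

% Information update when one more observation (design matrix A, weight W) is added
\Sigma_{k+1}^{-1} = \Sigma_{k}^{-1} + A^{\top} W A

% Resulting entropy change; it is non-positive because \det \Sigma_{k+1} \le \det \Sigma_{k}
\Delta h = h(\Sigma_{k+1}) - h(\Sigma_{k})
         = \tfrac{1}{2} \ln \frac{\det \Sigma_{k+1}}{\det \Sigma_{k}} \le 0
```

Under this model, an observation that fits the current group reduces the entropy of the estimate. How the thesis actually uses the entropy change to decide on group membership is not given in the abstract.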

    Attentional Selection in Object Recognition

    A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection helps recognition, it has largely remained unsolved because of the difficulty of isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and varying object appearances. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, the attracted and pay-attention modes, as being appropriate for data-driven and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture and line group information in images, and of subsequently using that information to isolate areas of the scene likely to contain the model object. Among the specific results in this thesis are a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and occlusions without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating it with a 3D-from-2D object recognition system and recording the improvement in performance.
These results indicate that attentional selection can significantly reduce the computational bottleneck in object recognition, both through a reduction in the number of features and through a reduction in the number of matches during recognition, using the information derived during selection. Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object.

    Single-Microphone Speech Enhancement and Separation Using Deep Learning

    The cocktail party problem comprises the challenging task of understanding a speech signal in a complex acoustic environment, where multiple speakers and background noise signals simultaneously interfere with the speech signal of interest. A signal processing algorithm that can effectively increase the speech intelligibility and quality of speech signals in such complicated acoustic situations is highly desirable, especially for applications involving mobile communication devices and hearing assistive devices. Due to the re-emergence of machine learning techniques, known today as deep learning, the challenges involved with such algorithms might be overcome. In this PhD thesis, we study and develop deep learning-based techniques for two sub-disciplines of the cocktail party problem: single-microphone speech enhancement and single-microphone multi-talker speech separation. Specifically, we conduct an in-depth empirical analysis of the generalization capability of modern deep learning-based single-microphone speech enhancement algorithms. We show that the performance of such algorithms is closely linked to the training data, and that good generalization can be achieved with carefully designed training data. Furthermore, we propose uPIT, a deep learning-based algorithm for single-microphone speech separation, and we report state-of-the-art results on a speaker-independent multi-talker speech separation task. Additionally, we show that uPIT works well for joint speech separation and enhancement without explicit prior knowledge about the noise type or number of speakers. Finally, we show that deep learning-based speech enhancement algorithms designed to minimize the classical short-time spectral amplitude mean squared error lead to enhanced speech signals which are essentially optimal in terms of STOI, a state-of-the-art speech intelligibility estimator. Comment: PhD Thesis. 233 pages
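
The abstract names uPIT but does not describe it. As a minimal sketch, assuming uPIT refers to utterance-level permutation invariant training, the core criterion can be illustrated as follows: the mean squared error is computed over the whole utterance for every assignment of network outputs to target speakers, and the assignment with the smallest error is used. The function names `mse` and `upit_loss` and the list-of-lists spectrogram layout are hypothetical choices made for this illustration; the thesis's actual training setup (network, masks, features) is not reproduced here.

```python
from itertools import permutations


def mse(estimate, target):
    """Mean squared error between two equally sized frames-by-bins matrices."""
    total = 0.0
    count = 0
    for est_frame, tgt_frame in zip(estimate, target):
        for e, t in zip(est_frame, tgt_frame):
            total += (e - t) ** 2
            count += 1
    return total / count


def upit_loss(estimates, targets):
    """Return (loss, assignment) with the lowest utterance-level MSE.

    estimates, targets: lists of S spectrogram-like matrices (frames x bins).
    assignment[i] is the index of the target paired with estimates[i].
    The permutation is chosen once for the whole utterance, not per frame.
    """
    best_loss, best_perm = None, None
    for perm in permutations(range(len(targets))):
        # Average the per-speaker errors for this output-to-target pairing.
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / len(perm)
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

Because the number of permutations grows factorially with the number of speakers, this exhaustive search is only practical for the small speaker counts typical of such tasks.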