
    From Codes to Patterns: Designing Interactive Decoration for Tableware

    We explore the idea of making aesthetic decorative patterns that contain multiple visual codes. We chart an iterative collaboration with ceramic designers and a restaurant to refine a recognition technology to work reliably on ceramics, produce a pattern book of designs, and prototype sets of tableware and a mobile app to enhance a dining experience. We document how the designers learned to work with and creatively exploit the technology, enriching their patterns with embellishments and backgrounds and developing strategies for embedding codes into complex designs. We discuss the potential and challenges of interacting with such patterns. We argue for a transition from designing ‘codes to patterns’ that reflects the skills of designers alongside the development of new technologies.
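
    The abstract does not spell out the recognition technology, but codes embedded in decorative patterns are often defined topologically, by which regions contain which. The sketch below illustrates that general idea using OpenCV's contour hierarchy; the function name `read_pattern_code` and the nesting-based encoding are assumptions for illustration, not the authors' method.

```python
import cv2

def read_pattern_code(image_path):
    """Hypothetical reader for a containment-based visual code: the
    code is taken to be the sorted counts of blobs nested inside each
    sub-region of an outer boundary, e.g. "1:1:2:3:5"."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, hierarchy = cv2.findContours(
        binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    if hierarchy is None:
        return None
    h = hierarchy[0]  # rows: [next, prev, first_child, parent]

    def depth(i):
        d = 0
        while h[i][3] != -1:
            i, d = h[i][3], d + 1
        return d

    # Sub-regions sit one nesting level inside the outer boundary; the
    # blobs they contain act as the 'dots' encoding each digit.
    digits = sorted(
        sum(1 for j in range(len(h)) if h[j][3] == i)
        for i in range(len(h)) if depth(i) == 1)
    return ":".join(map(str, digits)) if digits else None
```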

    Vehicle make and model recognition for intelligent transportation monitoring and surveillance.

    Vehicle Make and Model Recognition (VMMR) has evolved into a significant subject of study due to its importance in numerous Intelligent Transportation Systems (ITS), such as autonomous navigation, traffic analysis, traffic surveillance and security systems. A highly accurate and real-time VMMR system significantly reduces the overhead cost of resources otherwise required. The VMMR problem is a multi-class classification task with a peculiar set of issues and challenges, such as multiplicity and inter- and intra-make ambiguity among various vehicle makes and models, which need to be solved in an efficient and reliable manner to achieve a highly robust VMMR system. In this dissertation, facing the growing importance of make and model recognition of vehicles, we present a VMMR system that provides very high accuracy rates and is robust to several challenges. We demonstrate that the VMMR problem can be addressed by locating discriminative parts where the most significant appearance variations occur in each category, and by learning expressive appearance descriptors. Given these insights, we consider two data-driven frameworks: a Multiple-Instance Learning-based (MIL) system using hand-crafted features and an extended application of deep neural networks using MIL. Our approach requires only image-level class labels, and the discriminative parts of each target class are selected in a fully unsupervised manner without any use of part annotations or segmentation masks, which may be costly to obtain. This advantage makes our system more intelligent, scalable, and applicable to other fine-grained recognition tasks. We constructed a dataset with 291,752 images representing 9,170 different vehicles to validate and evaluate our approach. Experimental results demonstrate that localizing parts and distinguishing their discriminative powers for categorization improve the performance of fine-grained categorization. Extensive experiments conducted using our approaches yield superior results for images that were occluded, captured under low illumination, or taken from partial or even non-frontal camera views, as available in our real-world VMMR dataset. The approaches presented herewith provide a highly accurate VMMR system for real-time applications in realistic environments. We also validate our system with a significant application of VMMR to ITS that involves automated vehicular surveillance. We show that our application can provide law enforcement agencies with efficient tools to search for a specific vehicle type, make, or model, and to track the path of a given vehicle using the positions of multiple cameras.
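
    As a rough illustration of the bag-level supervision described above, the following sketch trains a linear scorer with max-pooling multiple-instance learning: each image is a bag of candidate part descriptors, only image-level labels are given, and the most discriminative part is selected per image. The names and the logistic-loss setup are assumptions; the dissertation's deep variant would replace the linear scorer with a network.

```python
import numpy as np

def train_mil_max(bags, labels, dim, epochs=20, lr=0.1):
    """Max-pooling multiple-instance learning with a linear scorer and
    logistic loss (illustrative sketch, not the dissertation's system).

    bags:   list of (n_i, dim) arrays, one row per candidate part
    labels: 0/1 image-level labels; no part annotations are needed
    """
    rng = np.random.default_rng(0)
    w, b = rng.normal(0.0, 0.01, dim), 0.0
    for _ in range(epochs):
        for x, y in zip(bags, labels):
            scores = x @ w + b
            i = int(np.argmax(scores))             # most discriminative part
            p = 1.0 / (1.0 + np.exp(-scores[i]))   # bag probability
            g = p - y                              # logistic-loss gradient
            w -= lr * g * x[i]
            b -= lr * g
    return w, b
```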

    Learning cognitive maps: Finding useful structure in an uncertain world

    In this chapter we describe the central mechanisms that influence how people learn about large-scale space. We focus particularly on how these mechanisms enable people to cope effectively with both the uncertainty inherent in a constantly changing world and the high information content of natural environments. The major lessons are that humans get by with a ‘less is more’ approach to building structure, and that they are able to adapt quickly to environmental changes thanks to a range of general-purpose mechanisms. By looking at abstract principles, instead of concrete implementation details, it is shown that the study of human learning can provide valuable lessons for robotics. Finally, these issues are discussed in the context of an implementation on a mobile robot. © 2007 Springer-Verlag Berlin Heidelberg

    Probabilistic techniques in semantic mapping for mobile robotics

    Semantic maps are representations of the world that allow a robot to understand not only the spatial aspects of its workplace, but also the meaning of its elements (objects, rooms, etc.) and how humans interact with them (e.g. functionalities, events and relations). To achieve this, a semantic map adds to purely spatial representations, such as geometric or topological maps, meta-information about the types of elements and relations that can be found in the working environment. This meta-information, called semantic or common-sense knowledge, is typically encoded in Knowledge Bases. An example of this kind of information could be: "fridges are large, rectangular objects, usually located in kitchens, that can contain perishable food and medication". Encoding and handling this semantic knowledge allows the robot to reason about the information obtained from a given workplace, as well as to infer new information, in order to efficiently execute high-level tasks such as "hello robot! please take grandma her medication". This thesis proposes the use of probabilistic techniques to build and maintain semantic maps, which has three main advantages over traditional approaches: i) it permits handling uncertainty (coming from the robot's imprecise sensors and the models employed), ii) it provides coherent representations of the environment by exploiting the contextual relations among the observed elements (e.g. fridges are usually found in kitchens) from a holistic point of view, and iii) it produces certainty values that reflect how accurately the robot understands its environment. Specifically, the presented contributions can be grouped into two main topics. The first set of contributions addresses the problem of object and/or room recognition, since semantic mapping systems must rely on reliable recognition algorithms to build valid representations. To that end, we have explored the use of Probabilistic Graphical Models (PGMs) to exploit the contextual relations among objects and/or rooms while handling the uncertainty inherent in the recognition problem, and the use of Knowledge Bases to improve their performance in different ways, e.g., detecting incoherent results, providing prior information, reducing the complexity of probabilistic inference algorithms, generating synthetic training samples, enabling learning from past experiences, etc. The second group of contributions accommodates the probabilistic results coming from the developed recognition algorithms in a new semantic representation, called Multiversal Semantic Map (MvSmap). This map manages multiple interpretations of the robot's workspace, called universes, which are annotated with the probability of being the correct one according to the robot's current knowledge. Thus, this approach provides a grounded belief about the accuracy of the robot's understanding of its environment, which allows it to operate in a more efficient and coherent way.
The proposed probabilistic algorithms have been thoroughly tested and compared with other current, innovative approaches using state-of-the-art datasets. Additionally, this thesis contributes two datasets, UMA-Offices and Robot@Home, which contain sensory information captured in different office and home environments, as well as two software tools, the Undirected Probabilistic Graphical Models in C++ library (UPGMpp) and the Object Labeling Toolkit (OLT), for working with Probabilistic Graphical Models and for processing datasets, respectively.
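
    A toy two-variable example of the contextual reasoning described above: the classifier confidences below are invented, and the thesis' PGMs are far richer and are inferred with UPGMpp rather than by brute-force enumeration, but the effect of a context potential on uncertain unary evidence is the same.

```python
import numpy as np

# Toy context model: object o in {fridge, cabinet}, room r in {kitchen,
# bedroom}. Unary potentials stand in for (invented) classifier
# confidences; the pairwise table encodes common-sense context such as
# "fridges are usually found in kitchens".
phi_obj = np.array([0.55, 0.45])        # weak evidence for 'fridge'
phi_room = np.array([0.50, 0.50])       # room classifier is undecided
phi_ctx = np.array([[0.9, 0.2],         # rows: object, cols: room
                    [0.3, 0.6]])

joint = phi_obj[:, None] * phi_room[None, :] * phi_ctx
joint /= joint.sum()                    # exact inference by enumeration

p_obj = joint.sum(axis=1)               # marginals after context
p_room = joint.sum(axis=0)
print(p_obj)   # ~[0.60, 0.40]: belief in 'fridge' strengthened
print(p_room)  # ~[0.62, 0.38]: 'kitchen' now favoured
```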

    Inferring Geodesic Cerebrovascular Graphs: Image Processing, Topological Alignment and Biomarkers Extraction

    A vectorial representation of the vascular network that embodies quantitative features - location, direction, scale, and bifurcations - has many potential neuro-vascular applications. Patient-specific models support computer-assisted surgical procedures in neurovascular interventions, while analyses on multiple subjects are essential for group-level studies on which clinical prediction and therapeutic inference ultimately depend. This has motivated the development of a variety of methods to segment the cerebrovascular system. Nonetheless, a number of limitations, including data-driven inhomogeneities, anatomical intra- and inter-subject variability, the lack of exhaustive ground-truth, the need for operator-dependent processing pipelines, and the highly non-linear vascular domain, still make the automatic inference of the cerebrovascular topology an open problem. In this thesis, brain vessels’ topology is inferred by focusing on their connectedness. With a novel framework, the brain vasculature is recovered from 3D angiographies by solving a connectivity-optimised anisotropic level-set over a voxel-wise tensor field representing the orientation of the underlying vasculature. Assuming vessels join by minimal paths, a connectivity paradigm is formulated to automatically determine the vascular topology as an over-connected geodesic graph. Ultimately, deep-brain vascular structures are extracted with geodesic minimum spanning trees. The inferred topologies are then aligned with similar ones for labelling and propagating information over a non-linear vectorial domain, where the branching pattern of a set of vessels transcends a subject-specific quantized grid. Using a multi-source embedding of a vascular graph, the pairwise registration of topologies is performed with state-of-the-art graph matching techniques employed in computer vision. Functional biomarkers are determined over the neurovascular graphs with two complementary approaches. Efficient approximations of blood flow and pressure drop account for autoregulation and compensation mechanisms in the whole network in the presence of perturbations, using lumped-parameter analog-equivalents from clinical angiographies. Also, a localised NURBS-based parametrisation of bifurcations is introduced to model fluid-solid interactions by means of hemodynamic simulations using an isogeometric analysis framework, where both geometry and solution profile at the interface share the same homogeneous domain. Experimental results on synthetic and clinical angiographies validated the proposed formulations. Perspectives and future works are discussed for the group-wise alignment of cerebrovascular topologies over a population, towards defining cerebrovascular atlases, and for further topological optimisation strategies and risk prediction models for therapeutic inference. Most of the algorithms presented in this work are available as part of the open-source package VTrails.
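
    To make the "over-connected geodesic graph to tree" step concrete, here is a hedged sketch using networkx. The edge cost shown (Euclidean length penalized by a tubularity score) is a stand-in for the thesis' connectivity-optimised geodesic distances, and the function name is illustrative.

```python
import networkx as nx
import numpy as np

def geodesic_mst(points, edges, vesselness):
    """Reduce an over-connected vessel graph to a tree: weight each
    candidate connection by length penalized by tubularity, then take
    a minimum spanning tree (a stand-in for true geodesic costs).

    points:     (n, 3) coordinates of vessel samples
    edges:      iterable of (i, j) candidate connections
    vesselness: per-point tubularity score in (0, 1]
    """
    g = nx.Graph()
    for i, j in edges:
        length = np.linalg.norm(points[i] - points[j])
        cost = length / np.sqrt(vesselness[i] * vesselness[j])
        g.add_edge(i, j, weight=cost)
    return nx.minimum_spanning_tree(g, weight="weight")
```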

    Text Segmentation in Web Images Using Colour Perception and Topological Features

    The research presented in this thesis addresses the problem of text segmentation in Web images. Text is routinely created in image form (headers, banners etc.) on Web pages, as an attempt to overcome the stylistic limitations of HTML. This text, however, has a potentially high semantic value in terms of indexing and searching for the corresponding Web pages. As current search engine technology does not allow for text extraction and recognition in images, the text in image form is ignored. Moreover, it is desirable to obtain a uniform representation of all visible text of a Web page (for applications such as voice browsing or automated content analysis). This thesis presents two methods for text segmentation in Web images using colour perception and topological features. The nature of Web images and the implicit problems to text segmentation are described, and a study is performed to assess the magnitude of the problem and establish the need for automated text segmentation methods. Two segmentation methods are subsequently presented: the Split-and-Merge segmentation method and the Fuzzy segmentation method. Although approached in a distinctly different way in each method, the safe assumption that a human being should be able to read the text in any given Web image is the foundation of both methods’ reasoning. This anthropocentric character of the methods, along with the use of topological features of connected components, comprises the underlying working principles of the methods. An approach for classifying the connected components resulting from the segmentation methods as either characters or parts of the background is also presented.
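
    As an illustration of classifying connected components by topological and geometric cues, the sketch below filters candidate character components with scikit-image region properties. The thresholds and features are illustrative assumptions, not the thesis' actual character/background classifier.

```python
import numpy as np
from skimage.measure import label, regionprops

def character_candidates(binary, min_area=20):
    """Filter connected components of a binary segmentation using simple
    geometric and topological cues (illustrative thresholds only)."""
    candidates = []
    for c in regionprops(label(binary)):
        if c.area < min_area:
            continue
        h = c.bbox[2] - c.bbox[0]
        w = c.bbox[3] - c.bbox[1]
        aspect = w / max(h, 1)
        holes = 1 - c.euler_number   # for one 2D component: 1 - Euler number
        # Characters tend to be compact, of moderate aspect ratio, and
        # have few holes (e.g. 'o' has one, 'B' has two).
        if 0.1 < aspect < 3.0 and holes <= 2 and c.extent > 0.15:
            candidates.append(c)
    return candidates
```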

    Robust Visual Self-localization and Navigation in Outdoor Environments Using Slow Feature Analysis

    Self-localization and navigation in outdoor environments are fundamental problems a mobile robot has to solve in order to autonomously execute tasks in a spatial environment. Techniques based on the Global Positioning System (GPS) or laser-range finders have been well established but suffer from the drawbacks of limited satellite availability or high hardware effort and costs. Vision-based methods can provide an interesting alternative, but are still a field of active research due to the challenges of visual perception such as illumination and weather changes or long-term seasonal effects. This thesis approaches the problem of robust visual self-localization and navigation using a biologically motivated model based on unsupervised Slow Feature Analysis (SFA). It is inspired by the discovery of neurons in a rat’s brain that form a neural representation of the animal’s spatial attributes. A similar hierarchical SFA network has been shown to learn representations of either the position or the orientation directly from the visual input of a virtual rat, depending on the movement statistics during training. An extension to the hierarchical SFA network is introduced that makes it possible to learn an orientation-invariant representation of the position by manipulating the perceived image statistics, exploiting the properties of panoramic vision. The model is applied on a mobile robot in real-world open-field experiments, obtaining localization accuracies comparable to state-of-the-art approaches. The self-localization performance can be further improved by incorporating wheel odometry into the purely vision-based approach. To achieve this, a method for the unsupervised learning of a mapping from slow feature space to metric space is developed. Robustness w.r.t. short- and long-term appearance changes is tackled by re-structuring the temporal order of the training image sequence based on the identification of crossings in the training trajectory. Re-inserting images of the same place in different conditions into the training sequence increases the temporal variation of environmental effects and thereby improves invariance due to the slowness objective of SFA. Finally, a straightforward method for navigation in slow feature space is presented. Navigation can be performed efficiently by following the SFA gradient, approximated from distance measurements between the slow feature values at the target and the current location. It is shown that the properties of the learned representations enable complex navigation behaviors without explicit trajectory planning.
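
    For readers unfamiliar with the SFA objective, here is a minimal linear version. The thesis uses a hierarchical non-linear SFA network over camera images, but the core optimisation is the generalized eigenproblem below.

```python
import numpy as np
from scipy.linalg import eigh

def linear_sfa(X, n_components=2):
    """Minimal linear Slow Feature Analysis (a sketch only).

    X: (T, d) time series, T samples of d-dimensional features, with
    d < T and non-degenerate covariance so B is positive definite.
    SFA minimizes the temporal variation <(w . x_dot)^2> subject to a
    unit-variance constraint, i.e. the generalized eigenproblem
    A w = lam B w with A = cov(x_dot) and B = cov(x); the slowest
    features correspond to the smallest eigenvalues.
    """
    Xc = X - X.mean(axis=0)
    Xdot = np.diff(Xc, axis=0)                 # temporal derivative
    A = Xdot.T @ Xdot / (len(Xdot) - 1)
    B = Xc.T @ Xc / (len(Xc) - 1)
    _, W = eigh(A, B)                          # eigenvalues ascending
    return Xc @ W[:, :n_components]            # unit-variance slow features
```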

    Camera Marker Networks for Pose Estimation and Scene Understanding in Construction Automation and Robotics.

    The construction industry faces challenges that include high workplace injuries and fatalities, stagnant productivity, and skill shortage. Automation and Robotics in Construction (ARC) has been proposed in the literature as a potential solution that makes machinery easier to collaborate with, facilitates better decision-making, or enables autonomous behavior. However, there are two primary technical challenges in ARC: 1) unstructured and featureless environments; and 2) differences between the as-designed and the as-built. It is therefore impossible to directly replicate conventional automation methods adopted in industries such as manufacturing on construction sites. In particular, two fundamental problems, pose estimation and scene understanding, must be addressed to realize the full potential of ARC. This dissertation proposes a pose estimation and scene understanding framework that addresses the identified research gaps by exploiting cameras, markers, and planar structures to mitigate the identified technical challenges. A fast plane extraction algorithm is developed for efficient modeling and understanding of built environments. A marker registration algorithm is designed for robust, accurate, cost-efficient, and rapidly reconfigurable pose estimation in unstructured and featureless environments. Camera marker networks are then established for unified and systematic design, estimation, and uncertainty analysis in larger scale applications. The proposed algorithms' efficiency has been validated through comprehensive experiments. Specifically, the speed, accuracy and robustness of the fast plane extraction and the marker registration have been demonstrated to be superior to existing state-of-the-art algorithms. These algorithms have also been implemented in two groups of ARC applications to demonstrate the proposed framework's effectiveness, wherein the applications themselves have significant social and economic value. The first group is related to in-situ robotic machinery, including an autonomous manipulator for assembling digital architecture designs on construction sites to help improve productivity and quality; and an intelligent guidance and monitoring system for articulated machinery such as excavators to help improve safety. The second group emphasizes human-machine interaction to make ARC more effective, including a mobile Building Information Modeling and way-finding platform with discrete location recognition to increase indoor facility management efficiency; and a 3D scanning and modeling solution for rapid and cost-efficient dimension checking and concise as-built modeling.
    PhD, Civil Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/113481/1/cforrest_1.pd
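
    The fast plane extraction algorithm itself is not given in the abstract. As a point of reference, a basic RANSAC plane fit over a point cloud captures the geometric core of such methods; the sketch below is written under that assumption and is not the dissertation's algorithm.

```python
import numpy as np

def ransac_plane(points, iters=200, tol=0.02, seed=0):
    """Basic RANSAC plane fit on an (m, 3) point cloud: hypothesize a
    plane from 3 random points, keep the one with the most inliers
    within a distance tolerance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    best_inliers, best_model = None, None
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:                       # degenerate (collinear) sample
            continue
        n /= norm
        dist = np.abs((points - p0) @ n)      # point-to-plane distances
        inliers = dist < tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (p0, n)
    return best_model, best_inliers
```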

    New editing techniques for video post-processing

    This thesis contributes to capturing 3D cloth shape, editing cloth texture and altering object shape and motion in multi-camera and monocular video recordings. We propose a technique to capture cloth shape from a 3D scene flow by determining optical flow in several camera views. Together with a silhouette matching constraint we can track and reconstruct cloth surfaces in long video sequences. In the area of garment motion capture, we present a system to reconstruct time-coherent triangle meshes from multi-view video recordings. Texture mapping of the acquired triangle meshes is used to replace the recorded texture with new cloth patterns. We extend this work to the more challenging single camera view case. Extracting texture deformation and shading effects simultaneously enables us to achieve texture replacement effects for garments in monocular video recordings. Finally, we propose a system for the keyframe editing of video objects. A color-based segmentation algorithm together with automatic video inpainting for filling in missing background texture allows us to edit the shape and motion of 2D video objects. We present examples for altering object trajectories, applying non-rigid deformation and simulating camera motion.
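
    As a single-view building block of the flow-based tracking described above, the sketch below propagates an edited region between frames with dense Farneback optical flow in OpenCV; the helper name and flow parameters are assumptions, and the thesis combines flow from several views into 3D scene flow rather than working in one view.

```python
import cv2
import numpy as np

def propagate_mask(prev_frame, next_frame, mask):
    """Carry an edited region (mask over prev_frame) into next_frame
    using dense Farneback optical flow (illustrative sketch)."""
    prev_g = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    next_g = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)
    # Flow from next back to prev tells each pixel of the next frame
    # where it came from, which is exactly what backward warping needs.
    flow = cv2.calcOpticalFlowFarneback(
        next_g, prev_g, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = next_g.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(mask, map_x, map_y, cv2.INTER_LINEAR)
```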