1,007 research outputs found

    Human-robot interaction and computer-vision-based services for autonomous robots

    Get PDF
    L'Aprenentatge per Imitació (IL), o Programació de robots per Demostració (PbD), abasta mètodes pels quals un robot aprèn noves habilitats a través de l'orientació humana i la imitació. La PbD s'inspira en la forma en què els éssers humans aprenen noves habilitats per imitació amb la finalitat de desenvolupar mètodes pels quals les noves tasques es poden transferir als robots. Aquesta tesi està motivada per la pregunta genèrica de "què imitar?", Que es refereix al problema de com extreure les característiques essencials d'una tasca. Amb aquesta finalitat, aquí adoptem la perspectiva del Reconeixement d'Accions (AR) per tal de permetre que el robot decideixi el què cal imitar o inferir en interactuar amb un ésser humà. L'enfoc proposat es basa en un mètode ben conegut que prové del processament del llenguatge natural: és a dir, la bossa de paraules (BoW). Aquest mètode s'aplica a grans bases de dades per tal d'obtenir un model entrenat. Encara que BoW és una tècnica d'aprenentatge de màquines que s'utilitza en diversos camps de la investigació, en la classificació d'accions per a l'aprenentatge en robots està lluny de ser acurada. D'altra banda, se centra en la classificació d'objectes i gestos en lloc d'accions. Per tant, en aquesta tesi es demostra que el mètode és adequat, en escenaris de classificació d'accions, per a la fusió d'informació de diferents fonts o de diferents assajos. Aquesta tesi fa tres contribucions: (1) es proposa un mètode general per fer front al reconeixement d'accions i per tant contribuir a l'aprenentatge per imitació; (2) la metodologia pot aplicar-se a grans bases de dades, que inclouen diferents modes de captura de les accions; i (3) el mètode s'aplica específicament en un projecte internacional d'innovació real anomenat Vinbot.El Aprendizaje por Imitación (IL), o Programación de robots por Demostración (PbD), abarca métodos por los cuales un robot aprende nuevas habilidades a través de la orientación humana y la imitación. La PbD se inspira en la forma en que los seres humanos aprenden nuevas habilidades por imitación con el fin de desarrollar métodos por los cuales las nuevas tareas se pueden transferir a los robots. Esta tesis está motivada por la pregunta genérica de "qué imitar?", que se refiere al problema de cómo extraer las características esenciales de una tarea. Con este fin, aquí adoptamos la perspectiva del Reconocimiento de Acciones (AR) con el fin de permitir que el robot decida lo que hay que imitar o inferir al interactuar con un ser humano. El enfoque propuesto se basa en un método bien conocido que proviene del procesamiento del lenguaje natural: es decir, la bolsa de palabras (BoW). Este método se aplica a grandes bases de datos con el fin de obtener un modelo entrenado. Aunque BoW es una técnica de aprendizaje de máquinas que se utiliza en diversos campos de la investigación, en la clasificación de acciones para el aprendizaje en robots está lejos de ser acurada. Además, se centra en la clasificación de objetos y gestos en lugar de acciones. Por lo tanto, en esta tesis se demuestra que el método es adecuado, en escenarios de clasificación de acciones, para la fusión de información de diferentes fuentes o de diferentes ensayos. Esta tesis hace tres contribuciones: (1) se propone un método general para hacer frente al reconocimiento de acciones y por lo tanto contribuir al aprendizaje por imitación; (2) la metodología puede aplicarse a grandes bases de datos, que incluyen diferentes modos de captura de las acciones; y (3) el método se aplica específicamente en un proyecto internacional de innovación real llamado Vinbot.Imitation Learning (IL), or robot Programming by Demonstration (PbD), covers methods by which a robot learns new skills through human guidance and imitation. PbD takes its inspiration from the way humans learn new skills by imitation in order to develop methods by which new tasks can be transmitted to robots. This thesis is motivated by the generic question of “what to imitate?” which concerns the problem of how to extract the essential features of a task. To this end, here we adopt Action Recognition (AR) perspective in order to allow the robot to decide what has to be imitated or inferred when interacting with a human kind. The proposed approach is based on a well-known method from natural language processing: namely, Bag of Words (BoW). This method is applied to large databases in order to obtain a trained model. Although BoW is a machine learning technique that is used in various fields of research, in action classification for robot learning it is far from accurate. Moreover, it focuses on the classification of objects and gestures rather than actions. Thus, in this thesis we show that the method is suitable in action classification scenarios for merging information from different sources or different trials. This thesis makes three contributions: (1) it proposes a general method for dealing with action recognition and thus to contribute to imitation learning; (2) the methodology can be applied to large databases which include different modes of action captures; and (3) the method is applied specifically in a real international innovation project called Vinbot

    Situation Assessment for Mobile Robots

    Get PDF

    New perspectives and methods for stream learning in the presence of concept drift.

    Get PDF
    153 p.Applications that generate data in the form of fast streams from non-stationary environments, that is,those where the underlying phenomena change over time, are becoming increasingly prevalent. In thiskind of environments the probability density function of the data-generating process may change overtime, producing a drift. This causes that predictive models trained over these stream data become obsoleteand do not adapt suitably to the new distribution. Specially in online learning scenarios, there is apressing need for new algorithms that adapt to this change as fast as possible, while maintaining goodperformance scores. Examples of these applications include making inferences or predictions based onfinancial data, energy demand and climate data analysis, web usage or sensor network monitoring, andmalware/spam detection, among many others.Online learning and concept drift are two of the most hot topics in the recent literature due to theirrelevance for the so-called Big Data paradigm, where nowadays we can find an increasing number ofapplications based on training data continuously available, named as data streams. Thus, learning in nonstationaryenvironments requires adaptive or evolving approaches that can monitor and track theunderlying changes, and adapt a model to accommodate those changes accordingly. In this effort, Iprovide in this thesis a comprehensive state-of-the-art approaches as well as I identify the most relevantopen challenges in the literature, while focusing on addressing three of them by providing innovativeperspectives and methods.This thesis provides with a complete overview of several related fields, and tackles several openchallenges that have been identified in the very recent state of the art. Concretely, it presents aninnovative way to generate artificial diversity in ensembles, a set of necessary adaptations andimprovements for spiking neural networks in order to be used in online learning scenarios, and finally, adrift detector based on this former algorithm. All of these approaches together constitute an innovativework aimed at presenting new perspectives and methods for the field

    New perspectives and methods for stream learning in the presence of concept drift.

    Get PDF
    153 p.Applications that generate data in the form of fast streams from non-stationary environments, that is,those where the underlying phenomena change over time, are becoming increasingly prevalent. In thiskind of environments the probability density function of the data-generating process may change overtime, producing a drift. This causes that predictive models trained over these stream data become obsoleteand do not adapt suitably to the new distribution. Specially in online learning scenarios, there is apressing need for new algorithms that adapt to this change as fast as possible, while maintaining goodperformance scores. Examples of these applications include making inferences or predictions based onfinancial data, energy demand and climate data analysis, web usage or sensor network monitoring, andmalware/spam detection, among many others.Online learning and concept drift are two of the most hot topics in the recent literature due to theirrelevance for the so-called Big Data paradigm, where nowadays we can find an increasing number ofapplications based on training data continuously available, named as data streams. Thus, learning in nonstationaryenvironments requires adaptive or evolving approaches that can monitor and track theunderlying changes, and adapt a model to accommodate those changes accordingly. In this effort, Iprovide in this thesis a comprehensive state-of-the-art approaches as well as I identify the most relevantopen challenges in the literature, while focusing on addressing three of them by providing innovativeperspectives and methods.This thesis provides with a complete overview of several related fields, and tackles several openchallenges that have been identified in the very recent state of the art. Concretely, it presents aninnovative way to generate artificial diversity in ensembles, a set of necessary adaptations andimprovements for spiking neural networks in order to be used in online learning scenarios, and finally, adrift detector based on this former algorithm. All of these approaches together constitute an innovativework aimed at presenting new perspectives and methods for the field

    Tracking interacting targets in multi-modal sensors

    Get PDF
    PhDObject tracking is one of the fundamental tasks in various applications such as surveillance, sports, video conferencing and activity recognition. Factors such as occlusions, illumination changes and limited field of observance of the sensor make tracking a challenging task. To overcome these challenges the focus of this thesis is on using multiple modalities such as audio and video for multi-target, multi-modal tracking. Particularly, this thesis presents contributions to four related research topics, namely, pre-processing of input signals to reduce noise, multi-modal tracking, simultaneous detection and tracking, and interaction recognition. To improve the performance of detection algorithms, especially in the presence of noise, this thesis investigate filtering of the input data through spatio-temporal feature analysis as well as through frequency band analysis. The pre-processed data from multiple modalities is then fused within Particle filtering (PF). To further minimise the discrepancy between the real and the estimated positions, we propose a strategy that associates the hypotheses and the measurements with a real target, using a Weighted Probabilistic Data Association (WPDA). Since the filtering involved in the detection process reduces the available information and is inapplicable on low signal-to-noise ratio data, we investigate simultaneous detection and tracking approaches and propose a multi-target track-beforedetect Particle filtering (MT-TBD-PF). The proposed MT-TBD-PF algorithm bypasses the detection step and performs tracking in the raw signal. Finally, we apply the proposed multi-modal tracking to recognise interactions between targets in regions within, as well as outside the cameras’ fields of view. The efficiency of the proposed approaches are demonstrated on large uni-modal, multi-modal and multi-sensor scenarios from real world detections, tracking and event recognition datasets and through participation in evaluation campaigns

    Domain-Specific Computing Architectures and Paradigms

    Full text link
    We live in an exciting era where artificial intelligence (AI) is fundamentally shifting the dynamics of industries and businesses around the world. AI algorithms such as deep learning (DL) have drastically advanced the state-of-the-art cognition and learning capabilities. However, the power of modern AI algorithms can only be enabled if the underlying domain-specific computing hardware can deliver orders of magnitude more performance and energy efficiency. This work focuses on this goal and explores three parts of the domain-specific computing acceleration problem; encapsulating specialized hardware and software architectures and paradigms that support the ever-growing processing demand of modern AI applications from the edge to the cloud. This first part of this work investigates the optimizations of a sparse spatio-temporal (ST) cognitive system-on-a-chip (SoC). This design extracts ST features from videos and leverages sparse inference and kernel compression to efficiently perform action classification and motion tracking. The second part of this work explores the significance of dataflows and reduction mechanisms for sparse deep neural network (DNN) acceleration. This design features a dynamic, look-ahead index matching unit in hardware to efficiently discover fine-grained parallelism, achieving high energy efficiency and low control complexity for a wide variety of DNN layers. Lastly, this work expands the scope to real-time machine learning (RTML) acceleration. A new high-level architecture modeling framework is proposed. Specifically, this framework consists of a set of high-performance RTML-specific architecture design templates, and a Python-based high-level modeling and compiler tool chain for efficient cross-stack architecture design and exploration.PHDElectrical and Computer EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/162870/1/lchingen_1.pd

    Unsupervised learning on social data

    Get PDF

    Myocardial tagging by Cardiovascular Magnetic Resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications

    Get PDF
    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, where the technique allowed for the first time for visualizing transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this article easy to read and the covered techniques easy to follow. Major studies that applied CMR tagging for studying myocardial mechanics are also summarized. Finally, the current article includes a plethora of ideas and techniques with over 300 references that motivate the reader to think about the future of CMR tagging
    corecore