434 research outputs found

    Human-robot interaction and computer-vision-based services for autonomous robots

    Imitation Learning (IL), or robot Programming by Demonstration (PbD), covers methods by which a robot learns new skills through human guidance and imitation. PbD takes its inspiration from the way humans learn new skills by imitation in order to develop methods by which new tasks can be transferred to robots. This thesis is motivated by the generic question of "what to imitate?", which concerns the problem of how to extract the essential features of a task. To this end, we adopt an Action Recognition (AR) perspective in order to allow the robot to decide what has to be imitated or inferred when interacting with a human. The proposed approach is based on a well-known method from natural language processing, namely Bag of Words (BoW). This method is applied to large databases in order to obtain a trained model. Although BoW is a machine learning technique used in various fields of research, in action classification for robot learning it is far from accurate; moreover, it has mainly been applied to the classification of objects and gestures rather than actions. Thus, in this thesis we show that the method is suitable, in action classification scenarios, for merging information from different sources or from different trials. This thesis makes three contributions: (1) it proposes a general method for dealing with action recognition and thereby contributes to imitation learning; (2) the methodology can be applied to large databases that include different modes of action capture; and (3) the method is applied specifically in a real international innovation project called Vinbot.
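    The abstract does not spell out an implementation, but the Bag-of-Words idea it relies on can be illustrated with a minimal sketch: local descriptors from each action sequence are quantized against a learned vocabulary, and each sequence becomes a word-count histogram fed to a classifier. The synthetic data, vocabulary size, and linear SVM below are illustrative assumptions, not the thesis's actual pipeline.

    # Minimal Bag-of-Words pipeline for action classification (illustrative only).
    # Local descriptors are clustered into a visual vocabulary with k-means; each
    # sequence is encoded as a normalized histogram of visual-word counts.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Stand-in data: 60 sequences, each a variable-length set of 32-D local descriptors.
    sequences = [rng.normal(size=(rng.integers(40, 80), 32)) for _ in range(60)]
    labels = rng.integers(0, 3, size=60)  # three hypothetical action classes

    # 1) Learn the visual vocabulary from all descriptors pooled together.
    vocab_size = 50
    kmeans = KMeans(n_clusters=vocab_size, n_init=10, random_state=0)
    kmeans.fit(np.vstack(sequences))

    # 2) Encode each sequence as a normalized visual-word histogram.
    def bow_histogram(descriptors):
        words = kmeans.predict(descriptors)
        hist = np.bincount(words, minlength=vocab_size).astype(float)
        return hist / hist.sum()

    X = np.array([bow_histogram(s) for s in sequences])

    # 3) Classify the histograms; histograms from different sources or trials could
    #    be summed or concatenated before this step to fuse information.
    scores = cross_val_score(SVC(kernel="linear"), X, labels, cv=5)
    print("mean accuracy:", scores.mean())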

    A Fine Motor Skill Classifying Framework to Support Children's Self-Regulation Skills and School Readiness

    Children’s self-regulation skills predict their school readiness and social behaviors, and assessing these skills enables parents and teachers to target areas for improvement or to prepare children to enter school ready to learn and achieve. To assess children’s fine motor skills, educators currently either judge the correctness of drawn shapes or measure drawing time durations through paper-based assessments. However, these methods require human experts to manually assess children’s fine motor skills, which is time consuming and prone to human error and bias. Because many children already use sketch-based applications on mobile and tablet devices, computer-based fine motor skill assessment has high potential to overcome the limitations of paper-based assessments. Furthermore, sketch recognition technology can offer more detailed, accurate, and immediate drawing-skill information than paper-based assessments, such as drawing time or curvature differences. While a number of educational sketch applications exist for teaching children how to sketch, they lack the ability to assess children’s fine motor skills and have not validated the traditional methods in tablet environments. We introduce a fine motor skill classifying framework based on children’s digital drawings on tablet computers. The framework contains two fine motor skill classifiers and a sketch-based educational interface (EasySketch). The classifiers are: (1) KimCHI, which determines children’s fine motor skills based on their overall drawing skills, and (2) KimCHI2, which determines children’s fine motor skills based on their curvature- and corner-drawing skills. Our classifiers determine children’s fine motor skills by generating 131 sketch features that analyze drawing ability (e.g., the DCR sketch feature can determine curvature-drawing skills). We first implemented the KimCHI classifier, which determines children’s fine motor skills based on their overall drawing skills. From our evaluation with 10-fold cross-validation, we found that the classifier can determine children’s fine motor skills with an f-measure of 0.904. We then implemented the KimCHI2 classifier, which determines children’s fine motor skills based on their curvature- and corner-drawing skills. From our evaluation with 10-fold cross-validation, we found that the classifier can determine children’s curvature-drawing skills with an f-measure of 0.82 and corner-drawing skills with an f-measure of 0.78. The KimCHI2 classifier outperformed the KimCHI classifier in the fine motor skill evaluation. EasySketch is a sketch-based educational interface that (1) determines children’s fine motor skills based on their drawing skills and (2) guides children in drawing basic shapes such as alphabet letters or numbers based on their learning progress. When we evaluated the interface with children, it determined children’s fine motor skills more accurately than the conventional methodology, with f-measures of 0.907 and 0.744, respectively. Furthermore, children improved their drawing skills through our pedagogical feedback. Finally, we present our findings that sketch features (DCR and the Polyline Test) can explain children’s fine motor skill developmental stages. From the sketch feature distributions for each age group, we found that children show notable fine motor skill development from age 5.
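    As a rough illustration of the kind of measurement described above, the sketch below computes a curvature-style stroke feature (in the spirit of DCR, commonly defined in the sketch recognition literature as the maximum direction change along a stroke divided by the average direction change) and evaluates a simple classifier with a 10-fold cross-validated f-measure. The synthetic strokes and the shallow decision tree are assumptions for illustration; they are not the KimCHI classifiers or their 131 features.

    # Curvature-style stroke feature plus a 10-fold f-measure evaluation (illustrative).
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    def direction_change_ratio(points):
        """points: (n, 2) array of x, y samples along one stroke."""
        deltas = np.diff(points, axis=0)
        angles = np.arctan2(deltas[:, 1], deltas[:, 0])
        changes = np.abs(np.diff(np.unwrap(angles)))
        return changes.max() / (changes.mean() + 1e-9)

    rng = np.random.default_rng(1)

    def synthetic_stroke(noise):
        # A circle drawn with more or less hand jitter stands in for a child's stroke.
        t = np.linspace(0, 2 * np.pi, 64)
        circle = np.column_stack([np.cos(t), np.sin(t)])
        return circle + rng.normal(scale=noise, size=circle.shape)

    # Two hypothetical groups: steadier strokes (label 1) vs. shakier strokes (label 0).
    strokes = [synthetic_stroke(0.01) for _ in range(40)] + [synthetic_stroke(0.08) for _ in range(40)]
    y = np.array([1] * 40 + [0] * 40)
    X = np.array([[direction_change_ratio(s)] for s in strokes])

    # 10-fold cross-validated f-measure on the single feature.
    f1 = cross_val_score(DecisionTreeClassifier(max_depth=2), X, y, cv=10, scoring="f1")
    print("mean f-measure:", f1.mean())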

    Supporting Multi-Criteria Decision Support Queries over Disparate Data Sources

    In the era of the big data revolution, marked by an exponential growth of information, extracting value from data enables analysts and businesses to address challenging problems such as drug discovery, fraud detection, and earthquake prediction. Multi-Criteria Decision Support (MCDS) queries are at the core of big-data analytics, resulting in several classes of MCDS queries such as OLAP, top-k, Pareto-optimal, and nearest-neighbor queries. The intuitive nature of specifying multi-dimensional preferences has made Pareto-optimal queries, also known as skyline queries, popular. Existing skyline algorithms, however, do not address several crucial issues such as performing skyline evaluation over disparate sources, progressively generating skyline results, or robustly handling workloads with multiple skyline-over-join queries. In this dissertation we thoroughly investigate topics in the area of skyline-aware query evaluation. First, we propose a novel execution framework called SKIN that treats skylines over joins as first-class citizens during query processing. This is in contrast to existing techniques that treat skylines as an add-on, loosely integrated with query processing by being placed on top of the query plan. SKIN is effective in exploiting the skyline characteristics of the tuples within individual data sources as well as across disparate sources. This enables SKIN to significantly reduce two primary costs, namely the cost of generating the join results and the cost of the skyline comparisons needed to compute the final results. Second, we address the crucial business need to report results early, as soon as they are generated, so that users can make competitive decisions in near real time. On top of SKIN, we built a progressive query evaluation framework, ProgXe, that makes the execution of queries involving skylines over joins non-blocking, i.e., results are generated progressively, early and often. By exploiting SKIN's principle of processing queries at multiple levels of abstraction, ProgXe is able to: (1) extract output dependencies by analyzing both the input and output spaces, and (2) exploit this knowledge of abstract-level relationships to guarantee the correctness of early output. Third, real-world applications handle query workloads with diverse Quality of Service (QoS) requirements, also referred to as contracts. Time-sensitive queries, such as fraud detection, require results to be output progressively with minimal delay, while ad-hoc and reporting queries can tolerate delay. Building on the principles of ProgXe, we propose the Contract-Aware Query Execution (CAQE) framework to support the open problem of contract-driven multi-query processing. CAQE employs an adaptive execution strategy that continuously monitors the run-time satisfaction of queries and aggressively takes corrective steps whenever the contracts are not being met. Lastly, to demonstrate the portability of the core principle of this dissertation, namely reasoning and query processing at different levels of data abstraction, we apply it to an orthogonal research question: auto-generating recommendation queries that help users explore a complex database system. User queries are often too strict or too broad, requiring a frustrating trial-and-error refinement process to meet the desired result cardinality while preserving the original query semantics. Based on the principles of SKIN, we propose CAPRI to automatically generate refined queries that: (1) attain the desired cardinality and (2) minimize changes to the original query intentions. In a comprehensive experimental study of each part of this dissertation, we demonstrate the superiority of the proposed strategies over state-of-the-art techniques in both efficiency and resource consumption.
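    For readers unfamiliar with skyline semantics, the sketch below shows the baseline Pareto-optimal (skyline) computation that frameworks such as SKIN improve upon: a plain block-nested-loops skyline over in-memory tuples. It is not the SKIN, ProgXe, or CAQE machinery, and it assumes lower values are preferred in every dimension.

    # Block-nested-loops skyline over in-memory tuples (baseline illustration).
    from typing import List, Tuple

    def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
        """a dominates b if it is no worse in every dimension and better in at least one."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def skyline(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
        window: List[Tuple[float, ...]] = []
        for p in points:
            if any(dominates(q, p) for q in window):
                continue  # p is dominated by a point already in the window
            window = [q for q in window if not dominates(p, q)]  # drop points p dominates
            window.append(p)
        return window

    # Example: hotels as (price, distance_to_beach); both criteria are minimized.
    hotels = [(100, 3.0), (80, 5.0), (120, 1.0), (110, 4.0), (95, 5.5)]
    print(skyline(hotels))  # the Pareto-optimal hotels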

    Interactive Spaces: Natural interfaces supporting gestures and manipulations in interactive spaces

    This doctoral dissertation focuses on the development of interactive spaces through natural interfaces based on gestures and manipulative actions. In the real world, people use their senses to perceive the external environment, and they use manipulations and gestures to explore the world around them, communicate, and interact with other individuals. From this perspective, natural interfaces that exploit human sensory and explorative abilities help bridge the gap between the physical and digital worlds. In the first part of this thesis we describe the work done to improve interfaces and devices for tangible, multi-touch, and free-hand interaction. The idea is to design devices that also work in uncontrolled environments and in situations where control is mostly physical, so that even less experienced users can exercise their manipulative exploration and gestural communication abilities. We also analyze how these techniques can be combined to create an interactive space specifically designed for teamwork, in which the natural interfaces are distributed in order to encourage collaboration. We then give examples of how these interactive scenarios can host various types of applications, facilitating, for instance, the exploration of 3D models, the enjoyment of multimedia content, and social interaction. Finally, we discuss our results and put them in a wider context, focusing in particular on how the proposed interfaces actually improve people’s lives and activities, and on how interactive spaces become places of aggregation where we can pursue objectives that are both personal and shared with others.

    An analytical study on image databases

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (leaves 87-88). By Francine Ming Fang. M.Eng.