
    Manoeuvring a drone (Tello Talent) using eye gaze and/or finger gestures

    This project combines hand and eye input to control a Tello Talent drone, building on computer vision, machine learning, and an eye-tracking device for gaze detection and interaction. Its main purpose is gaming, experimentation, and education for the coming generation; in addition, it is very useful for people who cannot use their hands, who can manoeuvre the drone with eye movements alone, and hopefully this will bring them some fun. The idea is inspired by progress in technologies such as machine learning, computer vision, and object detection, which open a large field of applications across diverse domains; many researchers are devising new intelligent ways of controlling drones by combining computer vision, machine learning, artificial intelligence, and related techniques. The project can help anyone, even people with no prior knowledge of programming, computer vision, or eye-tracking theory: they learn the basics of drone concepts, object detection, programming, and integration of the hardware and software involved, and then play. As a final objective, they are able to build a simple application that controls the drone through movements of the hands, the eyes, or both; during practice they should observe the operating conditions and safety requirements specified by the manufacturers of the drone and the eye-tracking device. The Tello Talent drone is built around a set of features, functions, and scripts that are already developed, embedded in the autopilot firmware, and accessible to users through an SDK protocol. The SDK serves as an easy guide for developing both simple and complex applications and lets the user program a variety of flight missions. Several experiments were conducted to determine which scenario best detects hand movements and extracts key points in real time on a computer with low computing power. As a result, I found that Google's artificial intelligence research group offers an open-source platform well suited to this application: MediaPipe, a customizable machine-learning solution for live video streams. In this project, MediaPipe and the eye-tracking module are the fundamental tools for developing and realizing the application.
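    To make the described pipeline concrete, here is a minimal Python sketch of the kind of loop the abstract outlines: MediaPipe Hands extracts landmarks from a webcam frame, a trivial gesture rule maps them to a Tello SDK text command, and the command is sent over UDP (the plain-text transport documented in the Tello SDK). The gesture rule and the command mapping are illustrative assumptions, not the author's actual design.

    ```python
    import socket

    import cv2
    import mediapipe as mp

    # Tello SDK: plain-text commands over UDP to the drone's access point.
    TELLO_ADDR = ("192.168.10.1", 8889)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", 9000))  # local port for the drone's replies

    def send(cmd: str) -> None:
        """Send one SDK command, e.g. 'command', 'takeoff', 'land'."""
        sock.sendto(cmd.encode("ascii"), TELLO_ADDR)

    send("command")  # enter SDK mode before any other command

    hands = mp.solutions.hands.Hands(max_num_hands=1,
                                     min_detection_confidence=0.6)
    cap = cv2.VideoCapture(0)
    last_cmd = None

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            # Toy gesture rule (an assumption, not the project's mapping):
            # index fingertip (landmark 8) above its middle joint (6) means
            # the finger is raised -> take off; otherwise -> land.
            # Image y grows downward, hence the '<'.
            cmd = "takeoff" if lm[8].y < lm[6].y else "land"
            if cmd != last_cmd:  # debounce: only send on gesture change
                send(cmd)
                last_cmd = cmd
        cv2.imshow("camera", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc quits
            break

    cap.release()
    ```

    A real controller would add the safety interlocks the abstract insists on (e.g., confirming the drone's reply before issuing the next command) rather than reacting to every frame.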

    Robust Modeling of Epistemic Mental States and Their Applications in Assistive Technology

    This dissertation presents the design and implementation of EmoAssist: Emotion-Enabled Assistive Tool to Enhance Dyadic Conversation for the Blind. The key functionalities of the system are to recognize behavioral expressions, to predict 3-D affective dimensions from visual cues, and to provide audio feedback to the visually impaired in a natural environment. Before describing EmoAssist, this dissertation identifies and advances research challenges in the analysis of facial features and their temporal dynamics with epistemic mental states in dyadic conversation. A number of statistical analyses and simulations were performed to answer important research questions about the complex interplay between facial features and mental states; it was found that non-linear relations are far more prevalent than linear ones. Based on this analysis, a portable assistive-technology prototype was designed that can help a blind individual understand his or her interlocutor's mental states. A number of challenges related to the system, communication protocols, error-free face tracking, and robust modeling of behavioral expressions and affective dimensions were addressed to make EmoAssist effective in real-world scenarios. In addition, orientation-sensor information from the phone was used to correct image alignment, improving robustness in real-life deployment. EmoAssist predicts affective dimensions with acceptable accuracy in natural conversation (maximum correlation coefficients of 0.76 for valence, 0.78 for arousal, and 0.76 for dominance). The overall minimum and maximum response times are 64.61 ms and 128.22 ms, respectively. Integrating sensor information to correct orientation yielded a significant improvement (16% on average) in the accuracy of recognizing behavioral expressions. A user study with ten blind people shows that EmoAssist is highly acceptable to them in social interaction (average rating of 6.0 on a 7-point Likert scale, where 1 and 7 are the lowest and highest possible ratings).
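    The orientation-correction step lends itself to a short illustration. The sketch below is my own reconstruction of the general idea, not the EmoAssist code: the phone's roll angle, as reported by its orientation sensor, is used to counter-rotate the camera frame with OpenCV before the face tracker runs. The angle source (`read_roll_from_imu`) is a hypothetical placeholder.

    ```python
    import cv2

    def derotate(frame, roll_deg: float):
        """Counter-rotate a camera frame by the device roll angle so the
        face appears upright to the downstream tracker."""
        h, w = frame.shape[:2]
        # Rotation matrix about the image center; negating the angle
        # undoes the device roll.
        M = cv2.getRotationMatrix2D((w / 2, h / 2), -roll_deg, 1.0)
        return cv2.warpAffine(frame, M, (w, h))

    # Example: a frame captured while the phone was tilted.
    # 'read_roll_from_imu' is hypothetical; a real app would query the
    # platform's orientation API for the current roll angle.
    # upright = derotate(frame, read_roll_from_imu())
    ```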

    A vision-based approach for human hand tracking and gesture recognition.

    Hand gesture interfaces have become an active topic in human-computer interaction (HCI). Using hand gestures in a human-computer interface enables operators to interact with computer environments in a natural and intuitive manner. In particular, bare-hand interpretation frees users from the cumbersome devices typically required for communicating with computers, offering ease and naturalness in HCI. Meanwhile, virtual assembly (VA) applies virtual reality (VR) techniques to mechanical assembly. It provides computer tools that help product engineers plan, evaluate, optimize, and verify the assembly of mechanical systems without the need for physical objects. However, traditional devices such as keyboards and mice are no longer adequate because they handle three-dimensional (3D) tasks inefficiently, so special VR devices, such as data gloves, have been mandatory in VA. This thesis proposes a novel gesture-based interface for VA. It develops a hybrid approach that combines an appearance-based hand localization technique with a skin-tone filter to support gesture recognition and hand tracking in 3D space. With this interface, bare hands become a convenient substitute for special VR devices. Experimental results demonstrate the flexibility and robustness the proposed method brings to HCI. Thesis (M.Sc.), University of Windsor (Canada), 2004. Adviser: Xiaobu Yuan.
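    The skin-tone filtering component is the most self-contained piece of such a hybrid approach and is easy to sketch. The snippet below shows a common YCrCb-threshold skin filter of the kind the thesis pairs with appearance-based localization; the thresholds are widely cited textbook values, not the ones used in the thesis.

    ```python
    import cv2
    import numpy as np

    def skin_mask(frame_bgr):
        """Binary mask of candidate skin pixels via YCrCb thresholding.
        The Cr/Cb bounds are commonly used defaults, not the thesis's."""
        ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
        lower = np.array([0, 133, 77], dtype=np.uint8)
        upper = np.array([255, 173, 127], dtype=np.uint8)
        mask = cv2.inRange(ycrcb, lower, upper)
        # Morphological opening removes isolated false positives.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

    # The mask can then gate an appearance-based detector, so that only
    # skin-colored regions are searched for hand candidates.
    ```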

    Robust and real-time hand detection and tracking in monocular video

    In recent years, personal computing devices such as laptops, tablets and smartphones have become ubiquitous. Moreover, intelligent sensors are being integrated into many consumer devices such as eyeglasses, wristwatches and smart televisions. With the advent of touchscreen technology, a new human-computer interaction (HCI) paradigm arose that allows users to interface with their device in an intuitive manner. Using simple gestures, such as swipe or pinch movements, a touchscreen can be used to directly interact with a virtual environment. Nevertheless, touchscreens still form a physical barrier between the virtual interface and the real world. An increasingly popular field of research that tries to overcome this limitation is video-based gesture recognition, hand detection and hand tracking. Gesture-based interaction allows the user to directly interact with the computer in a natural manner by exploring a virtual reality using nothing but their own body language. In this dissertation, we investigate how robust hand detection and tracking can be accomplished under real-time constraints. In the context of human-computer interaction, real-time is defined as both low latency and low complexity, such that a complete video frame can be processed before the next one becomes available. Furthermore, for practical applications, the algorithms should be robust to illumination changes, camera motion, and cluttered backgrounds in the scene. Finally, the system should be able to initialize automatically, and to detect and recover from tracking failure. We study a wide variety of existing algorithms, and propose significant improvements and novel methods to build a complete detection and tracking system that meets these requirements. Hand detection, hand tracking and hand segmentation are related yet technically different challenges. Whereas detection deals with finding an object in a static image, tracking considers temporal information and is used to track the position of an object over time, throughout a video sequence. Hand segmentation is the task of estimating the hand contour, thereby separating the object from its background. Detection of hands in individual video frames allows us to automatically initialize our tracking algorithm, and to detect and recover from tracking failure. Human hands are highly articulated objects, consisting of finger parts that are connected with joints. As a result, the appearance of a hand can vary greatly, depending on the assumed hand pose. Traditional detection algorithms often assume that the appearance of the object of interest can be described using a rigid model and therefore cannot be used to robustly detect human hands. Therefore, we developed an algorithm that detects hands by exploiting their articulated nature. Instead of resorting to a template-based approach, we probabilistically model the spatial relations between different hand parts and the centroid of the hand. Detecting hand parts, such as fingertips, is much easier than detecting a complete hand. Based on our model of the spatial configuration of hand parts, the detected parts can be used to obtain an estimate of the position of the complete hand. To comply with the real-time constraints, we developed techniques to speed up the process by efficiently discarding unimportant information in the image. Experimental results show that our method is competitive with the state-of-the-art in object detection while providing a reduction in computational complexity by a factor of 1,000.
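    The part-based detection idea described above (detect easy-to-find parts such as fingertips, then let each detected part vote for the hand centroid through a model of their spatial configuration) can be illustrated with a toy sketch. Everything below, including the Gaussian offset model, the vote grid, and the example numbers, is my own simplified reconstruction under assumed learned offsets, not the dissertation's algorithm.

    ```python
    import numpy as np

    def vote_for_centroid(part_detections, offsets, image_shape, sigma=8.0):
        """Accumulate centroid votes from detected hand parts.

        part_detections: list of (part_id, x, y) fingertip-like detections.
        offsets: dict part_id -> mean (dx, dy) from that part to the hand
                 centroid, learned from training data (assumed given here).
        Returns the (x, y) cell with the strongest accumulated vote.
        """
        h, w = image_shape
        votes = np.zeros((h, w), dtype=np.float32)
        ys, xs = np.mgrid[0:h, 0:w]
        for part_id, x, y in part_detections:
            dx, dy = offsets[part_id]
            cx, cy = x + dx, y + dy  # where this part predicts the centroid
            # A soft Gaussian vote models the spatial uncertainty of the
            # learned offset instead of a single hard pixel vote.
            votes += np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2)
                            / (2 * sigma ** 2))
        iy, ix = np.unravel_index(np.argmax(votes), votes.shape)
        return ix, iy

    # Example with made-up detections and offsets:
    offsets = {"index_tip": (0, 40), "thumb_tip": (25, 30)}
    parts = [("index_tip", 100, 60), ("thumb_tip", 80, 75)]
    print(vote_for_centroid(parts, offsets, (240, 320)))
    ```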
Furthermore, we showed that our algorithm can also be used to detect other articulated objects such as persons or animals and is therefore not restricted to the task of hand detection. Once a hand has been detected, a tracking algorithm can be used to continuously track its position in time. We developed a probabilistic tracking method that can cope with uncertainty caused by image noise, incorrect detections, changing illumination, and camera motion. Furthermore, our tracking system automatically determines the number of hands in the scene, and can cope with hands entering or leaving the video canvas. We introduced several novel techniques that greatly increase tracking robustness, and that can also be applied in domains other than hand tracking. To achieve real-time processing, we investigated several techniques to reduce the search space of the problem, and deliberately employ methods that are easily parallelized on modern hardware. Experimental results indicate that our methods outperform the state-of-the-art in hand tracking, while providing a much lower computational complexity. One of the methods used by our probabilistic tracking algorithm is optical flow estimation. Optical flow is defined as a 2D vector field describing the apparent velocities of objects in a 3D scene, projected onto the image plane. Optical flow is known to be used by many insects and birds to visually track objects and to estimate their ego-motion. However, most optical flow estimation methods described in the literature are either too slow to be used in real-time applications, or are not robust to illumination changes and fast motion. We therefore developed an optical flow algorithm that can cope with large displacements, and that is illumination independent. Furthermore, we introduce a regularization technique that ensures a smooth flow field. This regularization scheme effectively reduces the number of noisy and incorrect flow-vector estimates, while maintaining the ability to handle motion discontinuities caused by object boundaries in the scene. The above methods are combined into a hand tracking framework which can be used for interactive applications in unconstrained environments. To demonstrate the possibilities of gesture-based human-computer interaction, we developed a new type of computer display. This display is completely transparent, allowing multiple users to perform collaborative tasks while maintaining eye contact. Furthermore, our display produces an image that seems to float in thin air, such that users can touch the virtual image with their hands. This floating-image display has been showcased at several national and international events and trade shows. The research described in this dissertation has been evaluated thoroughly by comparing detection and tracking results with those obtained by state-of-the-art algorithms. These comparisons show that the proposed methods outperform most algorithms in terms of accuracy, while achieving a much lower computational complexity, resulting in a real-time implementation. Results are discussed in depth at the end of each chapter. This research further resulted in an international journal publication; a second journal paper that has been submitted and is under review at the time of writing this dissertation; nine international conference publications; a national conference publication; a commercial license agreement concerning the research results; two hardware prototypes of a new type of computer display; and a software demonstrator.
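    For the optical-flow component, a dense flow field with built-in smoothness regularization can be obtained off the shelf. The sketch below uses OpenCV's Farnebäck estimator as a stand-in for the dissertation's custom illumination-robust algorithm, which is not publicly packaged; the input file name is hypothetical.

    ```python
    import cv2
    import numpy as np

    cap = cv2.VideoCapture("hand_sequence.mp4")  # hypothetical input file
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense flow: one 2-D displacement vector per pixel. The image
        # pyramid (pyr_scale/levels) handles large displacements, while
        # winsize trades noise robustness against sharpness at motion
        # boundaries, i.e. the regularization trade-off discussed above.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            pyr_scale=0.5, levels=3,
                                            winsize=15, iterations=3,
                                            poly_n=5, poly_sigma=1.2,
                                            flags=0)
        magnitude = np.linalg.norm(flow, axis=2)
        print("mean pixel speed:", magnitude.mean())
        prev_gray = gray

    cap.release()
    ```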

    Technology for Smart Stores (Tecnología para Tiendas Inteligentes)

    Final Degree Project, Double Degree in Computer Science and Mathematics, Facultad de Informática UCM, Departamento de Ingeniería del Software e Inteligencia Artificial, academic year 2020/2021. Smart-store technologies exemplify how Artificial Intelligence and the Internet of Things can effectively join forces to shape the future of retailing. With an increasing number of companies proposing and implementing their own smart-store concepts, such as Amazon Go or Tao Cafe, a new field is clearly emerging. Since the technologies used to build their infrastructure offer significant competitive advantages, companies are not publicly sharing their designs. For this reason, this work presents a new smart-store model named Mercury, which aims to mitigate the lack of public, accessible information and research documents in this field. We not only introduce a comprehensive smart-store model, but also work through a feasible, detailed implementation so that anyone can build their own system upon it.

    Robot manipulation in human environments

    Thesis (Ph.D.) by Aaron Ladd Edsinger, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (pp. 211-228). Human environments present special challenges for robot manipulation. They are often dynamic, difficult to predict, and beyond the control of a robot engineer. Fortunately, many characteristics of these settings can be used to a robot's advantage. Human environments are typically populated by people, and a robot can rely on the guidance and assistance of a human collaborator. Everyday objects exhibit common, task-relevant features that reduce the cognitive load required for the object's use. Many tasks can be achieved through the detection and control of these sparse perceptual features. And finally, a robot is more than a passive observer of the world. It can use its body to reduce its perceptual uncertainty about the world. In this thesis we present advances in robot manipulation that address the unique challenges of human environments. We describe the design of a humanoid robot named Domo, develop methods that allow Domo to assist a person in everyday tasks, and discuss general strategies for building robots that work alongside people in their homes and workplaces.

    Enhanced Concrete Bridge Assessment Using Artificial Intelligence and Mixed Reality

    Conventional methods for visual assessment of civil infrastructure have certain limitations, such as subjectivity of the collected data, long inspection times, and high labor costs. Although some newer technologies currently in practice (e.g., robotic techniques) can collect objective, quantified data, the inspector's own expertise remains critical in many instances, since these technologies are not designed to work interactively with a human inspector. This study aims to create a smart, human-centered method that offers significant contributions to infrastructure inspection, maintenance, management practice, and safety for bridge owners. With a smart Mixed Reality (MR) framework integrated into a wearable holographic headset, a bridge inspector can, for example, automatically analyze a defect such as a crack seen on an element and display its dimensions in real time along with the condition state. Such systems can potentially decrease the time and cost of infrastructure inspections by accelerating essential inspector tasks such as defect measurement, condition assessment, and data processing for management systems. The human-centered artificial intelligence (AI) helps the inspector collect more quantified and objective data while incorporating the inspector's professional judgment. This study describes the system in detail, along with the methodology of implementing attention-guided semi-supervised deep learning within a mixed reality technology that interacts with the human inspector during assessment; thereby, the inspector and the AI collaborate and communicate for improved visual inspection.
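    The defect-measurement step such a framework automates can be approximated in a few lines: given a binary crack mask from a segmentation network, estimate crack length and mean width in physical units. The skeleton-based width formula and the pixel-to-millimetre scale below are standard coarse approximations I am assuming, not details taken from this study.

    ```python
    import numpy as np
    from skimage.morphology import skeletonize  # pip install scikit-image

    MM_PER_PIXEL = 0.5  # assumed calibration from camera pose / depth sensor

    def crack_dimensions(mask: np.ndarray):
        """Estimate crack length and mean width from a binary mask (0/255).

        Length is approximated by the skeleton's pixel count; mean width
        by crack area divided by length. Both are coarse estimates.
        """
        binary = mask > 0
        skeleton = skeletonize(binary)
        length_px = int(skeleton.sum())
        area_px = int(binary.sum())
        if length_px == 0:
            return 0.0, 0.0
        mean_width_px = area_px / length_px
        return length_px * MM_PER_PIXEL, mean_width_px * MM_PER_PIXEL

    # Example on a synthetic 3-pixel-wide horizontal crack:
    mask = np.zeros((100, 100), dtype=np.uint8)
    mask[50:53, 10:90] = 255
    length_mm, width_mm = crack_dimensions(mask)
    print(f"length {length_mm:.1f} mm, mean width {width_mm:.1f} mm")
    ```

    In an MR headset, these numbers would be rendered next to the defect; the condition-state lookup would then map the measured width to the owner's rating scale.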