
    AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant

    It is still a pipe dream that personal AI assistants on phones and AR glasses can assist our daily life by addressing questions like "how to adjust the date for this watch?" and "how to set its heating duration?" (while pointing at an oven). The queries used in conventional tasks (i.e., Video Question Answering, Video Retrieval, Moment Localization) are often factoid and based on pure text. In contrast, we present a new task called Task-oriented Question-driven Video Segment Retrieval (TQVSR). Each of our questions is an image-box-text query that focuses on the affordance of items in our daily life and expects relevant answer segments to be retrieved from a corpus of instructional video-transcript segments. To support the study of this TQVSR task, we construct a new dataset called AssistSR. We design novel guidelines to create high-quality samples. This dataset contains 3.2k multimodal questions on 1.6k video segments from instructional videos on diverse daily-used items. To address TQVSR, we develop a simple yet effective model called Dual Multimodal Encoders (DME) that significantly outperforms several baseline methods while still leaving large room for future improvement. Moreover, we present detailed ablation analyses. Code and data are available at \url{https://github.com/StanLei52/TQVSR}. Comment: 20 pages, 12 figures
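    The dual-encoder retrieval setup the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' DME model: it assumes a query encoder and a segment encoder have already produced fixed-size embeddings (random vectors stand in here), and ranks segments by cosine similarity.

    ```python
    import numpy as np

    def cosine_scores(q, s):
        """Cosine similarity between one query vector and a matrix of segment vectors."""
        q = q / np.linalg.norm(q)
        s = s / np.linalg.norm(s, axis=1, keepdims=True)
        return s @ q  # one score per segment

    rng = np.random.default_rng(0)
    query_emb = rng.normal(size=128)              # stand-in for the query encoder output
    segment_embs = rng.normal(size=(1600, 128))   # stand-in for ~1.6k segment embeddings

    scores = cosine_scores(query_emb, segment_embs)
    top5 = np.argsort(scores)[::-1][:5]           # indices of the 5 best-matching segments
    ```

    In a real dual-encoder system the two towers are trained jointly (e.g., with a contrastive loss) so that matching query/segment pairs score higher than mismatched ones; at inference time the segment embeddings can be precomputed, making retrieval a single matrix-vector product.
    
    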

    High performance wearable ultrasound as a human-machine interface for wrist and hand kinematic tracking

    Objective: Non-invasive human-machine interfaces (HMIs) have high potential in medical, entertainment, and industrial applications. Traditionally, surface electromyography (sEMG) has been used to track muscular activity and infer motor intention. Ultrasound (US) has received increasing attention as an alternative to sEMG-based HMIs. Here, we developed a portable US armband system with 24 channels and a multiple-receiver approach, and compared it with existing sEMG- and US-based HMIs on movement intention decoding. Methods: US and motion capture data were recorded while participants performed wrist and hand movements of four degrees of freedom (DoFs) and their combinations. A linear regression model was used to predict hand kinematics offline from the US (or sEMG, for comparison) features. The method was further validated in real time in a 3-DoF target reaching task. Results: In the offline analysis, the wearable US system achieved an average R^2 of 0.94 in the prediction of four DoFs of the wrist and hand, while sEMG reached a performance of R^2 = 0.06. In online control, the participants achieved an average target completion rate of 93%. Conclusion: When tailored for HMIs, the proposed US A-mode system and processing pipeline can successfully regress hand kinematics in both offline and online settings, with performance comparable or superior to previously published interfaces. Significance: Wearable US technology may provide a new generation of HMIs that use muscular deformation to estimate limb movements. The wearable US system allowed for robust proportional and simultaneous control over multiple DoFs in both offline and online settings.
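    The offline decoding step above (linear regression from US features to kinematics, scored with R^2) can be sketched in a few lines. This is an illustrative toy, not the authors' pipeline: the feature dimensionality (24 channels) matches the abstract, but the data are synthetic and the true linear mapping is invented for the example.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 24))                     # synthetic 24-channel US features
    W_true = rng.normal(size=(24, 4))                  # invented ground-truth mapping
    Y = X @ W_true + 0.1 * rng.normal(size=(500, 4))   # 4 DoFs of wrist/hand kinematics

    # Fit a linear decoder by least squares, then score each DoF with R^2.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_hat = X @ W

    ss_res = ((Y - Y_hat) ** 2).sum(axis=0)
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    r2 = 1 - ss_res / ss_tot                           # per-DoF coefficient of determination
    ```

    In practice one would fit on a training block and report R^2 on held-out movement repetitions; fitting and scoring on the same data, as here, only illustrates the mechanics.
    
    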

    The unexpected resurgence of Weyl geometry in late 20th-century physics

    Weyl's original scale geometry of 1918 ("purely infinitesimal geometry") was withdrawn by its author from physical theorizing in the early 1920s. It had a comeback in the last third of the 20th century in different contexts: scalar-tensor theories of gravity, foundations of gravity, foundations of quantum mechanics, elementary particle physics, and cosmology. It seems that Weyl geometry continues to offer open research potential for the foundations of physics even after the turn of the new millennium. Comment: Completely rewritten conference paper, 'Beyond Einstein', Mainz, Sep 2008. Preprint ELHC (Epistemology of the LHC) 2017-02, 92 pages, 1 figure

    Computer Vision-Based Hand Tracking and 3D Reconstruction as a Human-Computer Input Modality with Clinical Application

    The recent pandemic has impeded patients with hand injuries from connecting in person with their therapists. To address this challenge and improve hand telerehabilitation, we propose two computer vision-based technologies, photogrammetry and augmented reality, as alternative and affordable solutions for visualization and remote monitoring of hand trauma without costly equipment. In this thesis, we extend the application of 3D rendering and a virtual reality-based user interface to hand therapy. We compare the performance of four popular photogrammetry software packages in reconstructing a 3D model of a synthetic human hand from videos captured with a smartphone. The visual quality, reconstruction time, and geometric accuracy of the output model meshes are compared. Reality Capture produces the best result, with an output mesh having the lowest error of 1 mm and a total reconstruction time of 15 minutes. We developed an augmented reality app using MediaPipe algorithms that extracts hand key points, finger joint coordinates, and angles in real time from hand images or live-stream media. We conducted a study to investigate its input variability and validity as a reliable tool for remote assessment of finger range of motion. The intraclass correlation coefficient obtained between DIGITS and in-person measurements is 0.767-0.81 for finger extension and 0.958-0.857 for finger flexion. Finally, we developed and surveyed the usability of a mobile application that collects patient data (medical history, self-reported pain levels, and hand 3D models) and transfers them to therapists. These technologies can improve hand telerehabilitation, aid clinicians in monitoring hand conditions remotely, and support decisions on appropriate therapy, medication, and hand orthoses.
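    The joint-angle extraction step described above reduces to a dot-product computation over three landmarks. The sketch below assumes 2D or 3D landmark coordinates such as those returned by a hand-tracking model (e.g., MediaPipe Hands); the specific points and values are illustrative, not from the thesis.

    ```python
    import numpy as np

    def joint_angle(a, b, c):
        """Angle at point b (in degrees) formed by the segments b->a and b->c.

        For a finger joint, a/b/c would be three consecutive landmarks,
        e.g. MCP, PIP, and fingertip, with the angle measured at the PIP.
        """
        v1 = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
        v2 = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
        cos_theta = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
        # Clip guards against floating-point values slightly outside [-1, 1].
        return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

    # A fully extended finger: collinear landmarks give ~180 degrees.
    extended = joint_angle((0, 0), (1, 0), (2, 0))
    # Right-angle flexion at the middle joint gives 90 degrees.
    flexed = joint_angle((0, 0), (1, 0), (1, 1))
    ```

    Range of motion then follows as the difference between the angles measured at maximum extension and maximum flexion.
    
    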

    EEG-Controlling Robotic Car and Alphabetic Display by Support Vector Machine for Aiding Amyotrophic Lateral Sclerosis Patients

    This thesis presents the design and experimental evaluation of a system that can detect human thoughts, such as driving directions and letters, using brainwave signals known as electroencephalogram (EEG) and a machine learning algorithm called support vector machine (SVM). This research is motivated by amyotrophic lateral sclerosis (ALS), a disease that causes patients to severely lose mobility and speaking capabilities. The developed system has three main steps. First, wearing the EPOC headset from Emotiv, a user records EEG signals while thinking of a direction or a letter, and the data are saved wirelessly to a personal computer. Next, a large amount of EEG data carrying the information of different directions and letters from this user is used to exhaustively train an SVM classification model. Finally, the trained SVM model is used to detect any new thought about directions and letters from the user. The detection results from the SVM model are transmitted wirelessly to a robotic car with an LCD display, built with Arduino microcontrollers, to control its motions as well as the alphabetic display on the LCD. One promising application of the developed system is an advanced brain-controlled wheelchair with an LCD display for aiding ALS patients with their mobility and daily communications.
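    The SVM training step in the pipeline above can be illustrated with a toy linear SVM trained by hinge-loss subgradient descent. This is not the thesis code: the two-class synthetic "EEG features" (14 dimensions, mimicking the 14-channel EPOC headset) and all hyperparameters are invented for the example, and a real system would use a tuned SVM library and multi-class setup.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    n, d = 200, 14
    # Synthetic features for two imagined commands, e.g. "left" vs "right".
    X = np.vstack([rng.normal(-1.0, 1.0, size=(n // 2, d)),
                   rng.normal(+1.0, 1.0, size=(n // 2, d))])
    y = np.array([-1] * (n // 2) + [+1] * (n // 2))

    # Linear SVM via stochastic subgradient descent on the hinge loss.
    w, b = np.zeros(d), 0.0
    lam, lr = 1e-3, 0.01          # regularization strength and learning rate
    for epoch in range(200):
        for i in rng.permutation(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:        # inside the margin: hinge-loss gradient step
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                 # correctly classified: only regularize
                w -= lr * lam * w

    acc = np.mean(np.sign(X @ w + b) == y)   # accuracy on the (separable) training set
    ```

    Once trained, classifying a new EEG feature vector is a single sign check of `x @ w + b`, which is cheap enough to run in real time before transmitting the decoded command to the robotic car.
    
    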

    Application of advanced technology to space automation

    Automated operations in space provide the key to optimized mission design and data acquisition at minimum cost for the future. The results of this study strongly support this statement and should provide further incentive for immediate development of the specific automation technology defined herein. Essential automation technology requirements were identified for future programs. The study was undertaken to address the future role of automation in the space program, the potential benefits to be derived, and the technology efforts that should be directed toward obtaining these benefits.