14 research outputs found

    The multimodal edge of human aerobotic interaction

    This paper presents the idea of multimodal human aerobotic interaction. An overview of the aerobotic system and its applications is given. The joystick-based controller interface and its limitations are discussed. Two techniques are suggested as emerging alternatives to the joystick-based controller interface used in human aerobotic interaction. The first is a multimodal combination of speech, gaze, gesture, and other non-verbal cues already used in regular human-human interaction. The second is telepathic interaction via brain-computer interfaces. The potential limitations of these alternatives are highlighted, and considerations for further work are presented.

    The performance and cognitive workload analysis of a multimodal speech and visual gesture (mSVG) UAV control interface

    This paper compares the performance and cognitive workload of three UAV control interfaces on an nCA (navigation control autonomy) Tier 1-III flight navigation task. The first interface is the standard RC Joystick (RCJ) controller, the second is the multimodal speech and visual gesture (mSVG) interface, and the third is a modified version of the RCJ interface with altitude, attitude, and position (AAP) assist. The modified RCJ interface was implemented with the aid of the keyboard (KBD). A model of the mSVG interface previously designed and tested was used in this comparison. An experimental study was designed to measure participants' completion time and navigation accuracy with each of the three interfaces on a developed path_v02 test flight path. Thirty-seven (37) participants volunteered. The NASA Task Load Index (TLX) survey questionnaire was administered at the end of each interface experiment to assess the participants' experience and to estimate the interface's cognitive workload. A commercial software package, the RealFlight Drone Simulator (RFDS), was used to estimate the participants' RCJ skill level. The results showed that flying hours, number of months flying, and RFDS Level 4 challenge performance were good estimators of participants' RCJ flying skill level. A two-way result was obtained in the comparison of the RCJ and mSVG interfaces: although the mSVG was better than the standard RCJ interface, the AAP-assisted RCJ was found to be as effective as (and in some cases better than) the mSVG interface. The speech-gesture ratio result also showed that participants preferred gesture over speech when using the mSVG interface. Further work, such as an outdoor field test and a performance comparison at higher nCA levels, was suggested.
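    The workload comparison rests on the NASA Task Load Index, which combines six subscale ratings into a single score. As a minimal sketch of how such a score is typically computed (the ratings, weights, and function name below are illustrative assumptions, not values or code from the study), the weighted TLX calculation looks roughly like this in Python:

    # Minimal sketch of a weighted NASA TLX workload score (illustrative values only).
    # Each of the six subscales is rated 0-100; in the weighted variant, 15 pairwise
    # comparisons yield per-subscale weights that sum to 15.
    SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")

    def tlx_weighted_score(ratings, weights):
        """Return the weighted NASA TLX workload score on a 0-100 scale."""
        assert sum(weights.values()) == 15, "pairwise-comparison weights must sum to 15"
        return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15.0

    # Hypothetical ratings and weights for one participant and one interface:
    ratings = {"mental": 70, "physical": 30, "temporal": 55,
               "performance": 40, "effort": 60, "frustration": 45}
    weights = {"mental": 4, "physical": 1, "temporal": 3,
               "performance": 2, "effort": 3, "frustration": 2}

    print(f"Weighted TLX workload: {tlx_weighted_score(ratings, weights):.1f}")

    The raw (unweighted) TLX variant simply averages the six ratings; either form supports the per-interface workload comparison described above.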

    Effects of varying noise levels and lighting levels on multimodal speech and visual gesture interaction with aerobots

    This paper investigated the effects of varying noise levels and varying lighting levels on speech and gesture control command interfaces for aerobots. The aim was to determine the practical suitability of the multimodal combination of speech and visual gesture in human aerobotic interaction by investigating the limits and feasibility of use of the individual components. To determine this, a custom multimodal speech and visual gesture interface was developed using the CMU (Carnegie Mellon University) Sphinx and OpenCV (Open Source Computer Vision) libraries, respectively. An experimental study was designed to measure the individual effects of each of the two main components, speech and gesture, and 37 participants were recruited. The ambient noise level was varied from 55 dB to 85 dB. The ambient lighting level was varied from 10 lux to 1400 lux, under different lighting colour temperature mixtures of yellow (3500 K) and white (5500 K) and different backgrounds for capturing the finger gestures. The results of the experiment, which comprised around 3108 speech utterance observations and 999 gesture quality observations, were presented and discussed. Speech recognition accuracy was observed to fall as noise levels rise, with 75 dB being the practical application limit for the aerobot, as the speech control interaction becomes very unreliable due to poor recognition beyond this level. It was concluded that multi-word speech commands were more reliable and effective than single-word speech commands. In addition, some speech command words (e.g., land) were more noise resistant than others (e.g., hover) at higher noise levels, owing to their articulation. From the results of the gesture-lighting experiment, the effects of both lighting conditions and the environment background on the quality of gesture recognition were almost insignificant, less than 0.5%. The implication is that other factors, such as the gesture capture system design and technology (camera and computer hardware), the type of gesture being captured (upper body, whole body, hand, finger, or facial gestures), and the image processing technique (gesture classification algorithms), are more important in developing a successful gesture recognition system. Further work suggested from these findings included using alternative ASR (Automatic Speech Recognition) speech models and developing more robust gesture recognition algorithms.
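    Because the speech results reduce to recognition success rates grouped by ambient noise level, the core analysis is a simple aggregation of trial outcomes. A minimal sketch of that aggregation is given below, using hypothetical trial records rather than the study's actual data:

    # Illustrative aggregation of speech-command trials into per-noise-level success
    # rates, the kind of summary behind the reported 75 dB practical limit.
    # The trial tuples are hypothetical, not the experiment's data.
    from collections import defaultdict

    # (ambient noise in dB, command spoken, recognised correctly?)
    trials = [
        (55, "take off", True), (55, "hover", True), (65, "land", True),
        (65, "hover", True), (75, "land", True), (75, "hover", False),
        (85, "land", True), (85, "hover", False), (85, "take off", False),
    ]

    hits, counts = defaultdict(int), defaultdict(int)
    for noise_db, _command, recognised in trials:
        counts[noise_db] += 1
        hits[noise_db] += recognised  # True counts as 1

    for noise_db in sorted(counts):
        rate = 100.0 * hits[noise_db] / counts[noise_db]
        print(f"{noise_db} dB: {rate:.0f}% recognition success over {counts[noise_db]} trials")

    The same grouping applied per command word would surface the articulation effect noted above, where "land" survives higher noise levels than "hover".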

    A practical mSVG interaction method for patrol, search, and rescue aerobots

    This paper briefly presents the multimodal speech and visual gesture (mSVG) control for aerobots at higher nCA autonomy levels, using a patrol, search, and rescue application example. The developed mSVG control architecture is presented and briefly discussed. It was successfully tested using both MATLAB simulation and Python-based ROS Gazebo UAV simulations. Some limitations were identified, which formed the basis for the further work presented.

    Multimodal human aerobotic interaction

    This chapter discusses HCI interfaces used to control aerial robotic systems (otherwise known as aerobots). The autonomy control level of aerobots is also discussed. However, due to the limitations of existing models, a novel classification model of autonomy specifically designed for multirotor aerial robots, called the navigation control autonomy (nCA) model, is developed. Unlike existing models such as the AFRL and ONR models, this model is presented in tiers and has a two-dimensional pyramidal structure. The model is able to identify the control void existing beyond tier-one autonomy component modes and to map the upper and lower limits of control interfaces. Two solutions are suggested for dealing with the existing control void and the limitations of the RC joystick controller: the multimodal HHI-like interface and the unimodal BCI interface. In addition, some human-factors-based performance measurements are recommended, and plans for further work are presented.

    Quantifying the effects of varying light-visibility and noise-sound levels in practical multimodal speech and visual gesture (mSVG) interaction with aerobots

    This paper discusses research conducted to quantify the effective range of lighting levels and ambient noise levels in order to inform the design and development of a multimodal speech and visual gesture (mSVG) control interface for a UAV. Noise level variation from 55 dB to 85 dB is observed under controlled lab conditions to determine where speech commands for a UAV fail, to consider why, and to suggest possible solutions. Similarly, lighting levels are varied within the controlled lab conditions to determine a range of effective visibility levels. The limitations of this work and further work arising from it are also presented.

    The multimodal speech and visual gesture (mSVG) control model for a practical patrol, search, and rescue aerobot

    This paper describes a model of the multimodal speech and visual gesture (mSVG) control for aerobots operating at higher nCA autonomy levels, within the context of a patrol, search, and rescue application. The developed mSVG control architecture, its mathematical navigation model, and some high-level command operation models are discussed. The model was successfully tested using both MATLAB simulation and Python-based ROS Gazebo UAV simulations. Some limitations were identified, which formed the basis for the further work presented.
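    The paper's architecture is only summarised here, but the ROS Gazebo testing implies a node that turns parsed speech or gesture commands into flight setpoints. The sketch below shows one plausible mapping onto a geometry_msgs/Twist publisher in ROS 1 (rospy); the command vocabulary, topic name, and speeds are assumptions for illustration, not the paper's navigation model:

    #!/usr/bin/env python
    # Illustrative ROS 1 node mapping parsed mSVG commands to velocity setpoints.
    # The command vocabulary, topic name, and speeds are assumptions for this sketch;
    # the paper's actual navigation model is not reproduced here.
    import rospy
    from geometry_msgs.msg import Twist

    # Hypothetical mapping: command word -> (vx, vy, vz, yaw rate) in m/s and rad/s
    COMMANDS = {
        "forward": (0.5, 0.0, 0.0, 0.0),
        "back":    (-0.5, 0.0, 0.0, 0.0),
        "up":      (0.0, 0.0, 0.5, 0.0),
        "down":    (0.0, 0.0, -0.5, 0.0),
        "turn":    (0.0, 0.0, 0.0, 0.5),
        "hover":   (0.0, 0.0, 0.0, 0.0),
    }

    def command_to_twist(word):
        """Build a Twist setpoint from a recognised command word (hover if unknown)."""
        vx, vy, vz, wz = COMMANDS.get(word, COMMANDS["hover"])
        msg = Twist()
        msg.linear.x, msg.linear.y, msg.linear.z = vx, vy, vz
        msg.angular.z = wz
        return msg

    if __name__ == "__main__":
        rospy.init_node("msvg_command_mapper")
        pub = rospy.Publisher("/cmd_vel", Twist, queue_size=10)
        rospy.sleep(1.0)                          # allow the publisher to connect
        pub.publish(command_to_twist("forward"))  # e.g. a recognised "forward" command
        rospy.sleep(2.0)
        pub.publish(command_to_twist("hover"))    # return to hover

    At the higher nCA levels the paper targets, commands would more likely resolve to waypoint or mission-level goals than raw velocities, but the command-to-setpoint publishing pattern is the same.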

    Accelerating the development of a psychological intervention to restore treatment decision-making capacity in patients with schizophrenia-spectrum disorder: study protocol for a multi-site, assessor-blinded, pilot Umbrella trial (the DEC:IDES trial)

    Background: A high proportion of patients diagnosed with schizophrenia-spectrum disorders will at some point in their lives be assessed as not having the capacity to make their own decisions about pharmacological treatment or inpatient care (‘capacity’). Few will be helped to regain it before these interventions proceed. This is partly because effective and safe methods to do so are lacking. Our aim is to accelerate their development by testing, for the first time in mental healthcare, the feasibility, acceptability and safety of running an ‘Umbrella’ trial. This involves running, concurrently and under one multi-site infrastructure, multiple assessor-blind randomised controlled trials, each of which is designed to examine the effect on capacity of improving a single psychological mechanism (‘mechanism’). Our primary objectives are to demonstrate feasibility of (i) recruitment and (ii) data retention on the MacArthur Competence Assessment Tool-Treatment (MacCAT-T; planned primary outcome for a future trial) at end-of-treatment. We selected three mechanisms to test: ‘self-stigma’, low self-esteem, and the ‘jumping to conclusions’ bias. Each is highly prevalent in psychosis, responsive to psychological intervention, and hypothesised to contribute to impaired capacity.

    Methods: Sixty participants with schizophrenia-spectrum diagnoses, impaired capacity, and one or more mechanism(s) will be recruited from outpatient and inpatient mental health services in three UK sites (Lothian, Scotland; Lancashire and Pennine, North West England). Those lacking capacity to consent to research could take part if key criteria were met, including either proxy consent (Scotland) or favourable Consultee advice (England). They will be allocated to one of three randomised controlled trials, depending on which mechanism(s) they have. They will then be randomised to receive, over an 8-week period and in addition to treatment as usual (TAU), 6 sessions of either a psychological intervention which targets the mechanism, or 6 sessions of assessment of the causes of their incapacity (control condition). Participants are assessed at 0 (baseline), 8 (end-of-treatment) and 24 (follow-up) weeks post-randomisation using measures of capacity (MacCAT-T), mechanism, adverse events, psychotic symptoms, subjective recovery, quality of life, service use, anxiety, core schemata, and depression. Two nested qualitative studies will be conducted: one to understand participant and clinician experiences, and one to investigate the validity of MacCAT-T appreciation ratings.

    Discussion: This will be the first Umbrella trial in mental healthcare. It will produce the first 3 single-blind randomised controlled trials of psychological interventions to support treatment decision-making in schizophrenia-spectrum disorder. Demonstrating feasibility will have significant implications not only for those seeking to support capacity in psychosis, but also for those who wish to accelerate the development of psychological interventions for other conditions.

    Trial registration: ClinicalTrials.gov NCT04309435. Pre-registered on 16th March 2020. https://clinicaltrials.gov/ct2/show/NCT04309435