
    Detecting head movement using gyroscope data collected via in-ear wearables

    Abstract. Head movement is considered an effective, natural, and simple way to indicate pointing towards an object. Head movement detection technology has significant potential in a diverse range of applications, and studies in this field support that claim. Applications include user interaction with computers, external control of devices, power wheelchair operation, driver drowsiness detection, video surveillance systems, and many more. Because of this diversity, the methods for detecting head movement are equally wide-ranging: researchers have introduced acoustic-based, video-based, computer-vision-based, and inertial-sensor-based approaches over the years. Various wearables can generate inertial sensor data, for example wrist bands, smart watches, and head-mounted devices. This thesis employs eSense, a representative earable device with a built-in inertial sensor that generates gyroscope data. The eSense device is a True Wireless Stereo (TWS) earbud equipped with a 6-axis inertial measurement unit, a microphone, and dual-mode Bluetooth (Bluetooth Classic and Bluetooth Low Energy). Features are extracted from the gyroscope data collected via the eSense device, and four machine learning models, Random Forest (RF), Support Vector Machine (SVM), Naïve Bayes, and Perceptron, are then applied to detect head movement. The performance of these models is evaluated with four metrics: accuracy, precision, recall, and F1 score. The results show that all four models are able to detect head movement. Random Forest performs best, detecting head movement with approximately 77% accuracy, while the other three models, Support Vector Machine, Naïve Bayes, and Perceptron, achieve similar accuracies of about 42%, 40%, and 39%, respectively. The precision, recall, and F1 score results further confirm that these models can distinguish different head directions such as left, right, or straight.
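    As a rough illustration of the pipeline described above, the sketch below extracts simple per-axis statistics from windowed gyroscope data and trains the four named classifiers with scikit-learn. The window length, feature set, and hyperparameters are assumptions for illustration, not the thesis's exact configuration.

```python
# Illustrative sketch: classifying head direction (left / right / straight)
# from windowed 3-axis gyroscope data with the four models named in the
# abstract. Features and hyperparameters are assumptions, not the thesis's
# exact pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def extract_features(window):
    """Simple per-axis statistics for one gyroscope window of shape (n_samples, 3)."""
    return np.concatenate([
        window.mean(axis=0), window.std(axis=0),
        window.min(axis=0), window.max(axis=0),
    ])

# windows: (n_windows, window_len, 3) gyroscope data; labels: 0=left, 1=right, 2=straight
windows = np.random.randn(300, 50, 3)        # placeholder for eSense recordings
labels = np.random.randint(0, 3, size=300)   # placeholder labels

X = np.array([extract_features(w) for w in windows])
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "Naive Bayes": GaussianNB(),
    "Perceptron": Perceptron(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0)
    print(f"{name}: acc={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```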

    Learning Sensory Representations with Minimal Supervision


    Development of a real-time classifier for the identification of the Sit-To-Stand motion pattern

    The Sit-to-Stand (STS) movement has significant importance in clinical practice, since it is an indicator of lower limb functionality. As an optimal trade-off between cost and accuracy, accelerometers have recently been used to synchronously recognise the STS transition in various Human Activity Recognition tasks. However, beyond the mere identification of the entire action, a major challenge remains the recognition of clinically relevant phases inside the STS motion pattern, due to the intrinsic variability of the movement. This work presents the development of a deep-learning model aimed at recognising specific clinically valid phases of the STS, relying on a pool of 39 young and healthy participants performing the task under self-paced (SP) and controlled-speed (CT) conditions. The movements were recorded using a total of 6 inertial sensors, and the accelerometric data were labelled into four sequential STS phases according to the Ground Reaction Force profiles acquired through a force plate. The optimised architecture combined convolutional and recurrent neural networks into a hybrid approach and was able to correctly identify the four STS phases, under both SP and CT movements, relying on the single sensor placed on the chest. The overall accuracy estimate (median [95% confidence interval]) for the hybrid architecture was 96.09 [95.37–96.56] in SP trials and 95.74 [95.39–96.21] in CT trials. Moreover, the prediction delays (4533 ms) were compatible with the temporal characteristics of the dataset, sampled at 10 Hz (100 ms). These results support the implementation of the proposed model in digital rehabilitation solutions able to synchronously recognise the STS movement pattern, with the aim of effectively evaluating and correcting its execution.
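    A minimal sketch of such a hybrid convolutional-recurrent classifier is shown below, assuming windows of 3-axis chest accelerometer data sampled at 10 Hz and four phase labels; the layer sizes and window length are illustrative assumptions, not the published architecture.

```python
# Illustrative hybrid convolutional-recurrent classifier for the four STS
# phases, assuming windows of 3-axis chest accelerometer data at 10 Hz.
# Layer sizes and window length are assumptions for illustration only.
import torch
import torch.nn as nn

class HybridSTSClassifier(nn.Module):
    def __init__(self, n_channels=3, n_phases=4, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_phases)

    def forward(self, x):                    # x: (batch, time, channels)
        z = self.conv(x.transpose(1, 2))     # -> (batch, 64, time)
        z, _ = self.lstm(z.transpose(1, 2))  # -> (batch, time, hidden)
        return self.head(z[:, -1])           # phase scores at the window end

model = HybridSTSClassifier()
window = torch.randn(8, 20, 3)   # 8 windows of 2 s at 10 Hz (placeholder data)
logits = model(window)           # (8, 4) scores over the four STS phases
print(logits.shape)
```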

    CorrNet: Fine-grained emotion recognition for video watching using wearable physiological sensors

    Recognizing user emotions while they watch short-form videos anytime and anywhere is essential for facilitating video content customization and personalization. However, most existing work either classifies a single emotion per video stimulus or is restricted to static, desktop environments. To address this, we propose a correlation-based emotion recognition algorithm (CorrNet) to recognize the valence and arousal (V-A) of each instance (a fine-grained segment of signals) using only wearable physiological signals (e.g., electrodermal activity, heart rate). CorrNet takes advantage of features both inside each instance (intra-modality features) and between different instances for the same video stimulus (correlation-based features). We first test our approach on an indoor-desktop affect dataset (CASE), and thereafter on an outdoor-mobile affect dataset (MERCA) which we collected using a smart wristband and a wearable eye tracker. Results show that for subject-independent binary classification (high-low), CorrNet yields promising recognition accuracies: 76.37% and 74.03% for V-A on CASE, and 70.29% and 68.15% for V-A on MERCA. Our findings show that: (1) instance segment lengths between 1–4 s result in the highest recognition accuracies; (2) accuracies of laboratory-grade and wearable sensors are comparable, even at low sampling rates (≤64 Hz); and (3) large amounts of neutral V-A labels, an artifact of continuous affect annotation, result in varied recognition performance.
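    The sketch below illustrates the two feature families named in the abstract, assuming equally long instances of a single physiological signal (e.g., EDA): statistics computed inside each instance, and Pearson correlations between an instance and the other instances of the same video stimulus. It is an illustration of the idea, not the published CorrNet implementation.

```python
# Sketch of the two feature families described in the abstract:
# (1) intra-modality statistics inside each instance, and
# (2) Pearson correlations between an instance and the other instances of
#     the same video stimulus.
# Placeholder data; not the published CorrNet code.
import numpy as np

def intra_features(instance):
    """instance: (n_samples,) physiological signal segment (e.g. EDA)."""
    return np.array([instance.mean(), instance.std(), instance.min(), instance.max()])

def correlation_features(instances, idx):
    """Correlate instance idx with every other instance of the same stimulus."""
    ref = instances[idx]
    return np.array([np.corrcoef(ref, other)[0, 1]
                     for j, other in enumerate(instances) if j != idx])

# 10 instances of 2 s EDA sampled at 64 Hz for one video (placeholder data)
instances = [np.random.randn(128) for _ in range(10)]
feat = np.concatenate([intra_features(instances[0]),
                       correlation_features(instances, 0)])
print(feat.shape)   # 4 intra-modality features + 9 correlation features
```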

    Multimodal radar sensing for ambient assisted living

    Data acquired from health and behavioural monitoring of daily life activities can be exploited to provide real-time medical and nursing services at affordable cost and with higher efficiency. A variety of sensing technologies for this purpose have been developed and presented in the literature, for instance, wearable IMUs (Inertial Measurement Units) to measure the acceleration and angular speed of a person, cameras to record images or video sequences, PIR (pyroelectric infrared) sensors to detect the presence of a person based on the pyroelectric effect, and radar to estimate the distance and radial velocity of a person. Each sensing technology has pros and cons and may not be optimal for every task. It is possible to leverage the strengths of all these sensors through information fusion in a multimodal fashion. The fusion can take place at three different levels, namely: i) the signal level, where commensurate data are combined; ii) the feature level, where feature vectors from different sensors are concatenated; and iii) the decision level, where the confidence levels or prediction labels of classifiers are used to generate a new output. For each level there are different fusion algorithms, and the key challenge lies in choosing the best existing fusion algorithm and in developing novel fusion algorithms that are more suitable for the current application. The fundamental contribution of this thesis is therefore exploring possible information fusion between radar, primarily FMCW (Frequency Modulated Continuous Wave) radar, and wearable IMUs, between distributed radar sensors, and between UWB impulse radar and a pressure sensor array. The objective is to sense and classify daily activity patterns, gait styles, and micro-gestures, as well as to produce early warnings of high-risk events such as falls. Initially, only “snapshot” activities (a single activity within a short X-s measurement) were collected and analysed to verify the accuracy improvement due to information fusion. Then continuous activities (activities performed one after another with random durations and transitions) were collected to simulate a real-world scenario. To overcome the drawbacks of the conventional sliding-window approach on continuous data, a Bi-LSTM (Bidirectional Long Short-Term Memory) network is proposed to identify the transitions between daily activities. Meanwhile, a hybrid fusion framework is presented to exploit the power of both soft and hard fusion. Moreover, a trilateration-based signal-level fusion method has been successfully applied to the range information of three UWB (Ultra-wideband) impulse radars, and the results show performance comparable to using micro-Doppler signatures, at the price of much lower computational load. For classifying “snapshot” activities, fusion between radar and the wearable shows approximately 12% accuracy improvement compared to using radar alone, whereas for classifying continuous activities and gaits, the proposed hybrid fusion and trilateration-based signal-level fusion improve accuracy by roughly 6.8% (from 89% to 95.8%) and 7.3% (from 85.4% to 92.7%), respectively.
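    As an illustration of the decision-level ("soft") fusion described above, the sketch below combines the class-probability outputs of a hypothetical radar classifier and a hypothetical wearable-IMU classifier through a weighted average; the weights and activity classes are assumptions, not values from the thesis.

```python
# Sketch of decision-level ("soft") fusion: per-class probability outputs of
# a radar classifier and a wearable-IMU classifier are combined by a weighted
# average before picking the activity label. Weights and class names are
# illustrative assumptions.
import numpy as np

ACTIVITIES = ["walking", "sitting_down", "standing_up", "falling"]

def soft_fusion(p_radar, p_imu, w_radar=0.6, w_imu=0.4):
    """p_radar, p_imu: per-class probability vectors from the two classifiers."""
    fused = w_radar * np.asarray(p_radar) + w_imu * np.asarray(p_imu)
    return fused / fused.sum()

p_radar = [0.10, 0.15, 0.05, 0.70]   # radar strongly suggests a fall
p_imu = [0.20, 0.30, 0.30, 0.20]     # IMU is less certain
fused = soft_fusion(p_radar, p_imu)
print(ACTIVITIES[int(np.argmax(fused))], fused.round(2))
```

    Hard fusion would instead combine the predicted labels themselves, for example by majority voting across sensors.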

    Human robot interaction in a crowded environment

    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered laborious, unsafe, or repetitive. Vision-based human robot interaction is a major component of HRI, in which visual information is used to interpret how human interaction takes place. Common HRI tasks include finding pre-trained static or dynamic gestures in an image, which involves localising key parts of the human body such as the face and hands; this information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3]. Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications, as difficulties may arise from people's movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting navigation commands. To this end, it is necessary to associate a gesture with the correct person, and automatic reasoning is required to extract the most probable location of the person who initiated the gesture. In this thesis, we propose a practical framework for addressing the above issues. It attempts to achieve a coarse-level understanding of a given environment before engaging in active communication. This includes recognising human robot interaction, where a person has the intention to communicate with the robot. In this regard, it is necessary to differentiate whether the people present are engaged with each other or with their surrounding environment. The basic task is to detect and reason about the environmental context and different interactions so as to respond accordingly. For example, if individuals are engaged in conversation, the robot should realise it is best not to disturb them; if an individual is receptive to the robot's interaction, it may approach the person; and if the user is moving in the environment, the robot can analyse the scene further to understand whether any help can be offered. The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine their potential intentions. To improve system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7].
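    The snippet below sketches one way such a Bayesian combination of visual cues could look, treating the cues as conditionally independent and computing the posterior probability that a given person intends to interact with the robot; the cue set, likelihoods, and prior are made-up assumptions, not values from the thesis.

```python
# Sketch of combining independent visual cues with Bayes' rule to estimate
# the probability that a person in the scene intends to interact with the
# robot. Cue likelihoods and the prior are illustrative assumptions.

# For each cue: (P(cue observed | intent), P(cue observed | no intent))
LIKELIHOOD = {
    "face_towards_robot": (0.8, 0.3),
    "hand_raised":        (0.6, 0.1),
    "moving_towards":     (0.7, 0.4),
}
PRIOR_INTENT = 0.2   # prior probability that a random person wants to interact

def p_intent_given_cues(cues):
    """cues: dict cue_name -> bool; naive-Bayes combination of the cues."""
    p_intent = PRIOR_INTENT
    p_no_intent = 1.0 - PRIOR_INTENT
    for name, observed in cues.items():
        p_given_intent, p_given_no_intent = LIKELIHOOD[name]
        p_intent *= p_given_intent if observed else (1 - p_given_intent)
        p_no_intent *= p_given_no_intent if observed else (1 - p_given_no_intent)
    return p_intent / (p_intent + p_no_intent)

print(p_intent_given_cues({"face_towards_robot": True,
                           "hand_raised": True,
                           "moving_towards": False}))
```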

    Seeds Collection by Research Field [English Version]

    [English version

    Co-designing opportunities for human-centred machine learning in supporting type 1 diabetes decision-making

    Type 1 Diabetes (T1D) self-management requires hundreds of daily decisions. Diabetes technologies that use machine learning have significant potential to simplify this process and provide better decision support, but they often rely on cumbersome data logging and cognitively demanding reflection on the collected data. We set out to use co-design to identify opportunities for machine learning to support diabetes self-management in everyday settings. However, over nine months of interviews and design workshops with 15 people with T1D, we had to reassess our assumptions about user needs. Our participants reported confidence in their personal knowledge and rejected machine learning-based decision support when coping with routine situations, but highlighted the need for technological support in unfamiliar or unexpected situations (holidays, illness, etc.). However, these are the situations where prior data are often lacking and drawing data-driven conclusions is challenging. Reflecting on this challenge, we provide suggestions on how machine learning and other artificial intelligence approaches, e.g., expert systems, could enable decision-making support in both routine and unexpected situations.