    Dual Discriminator Adversarial Distillation for Data-free Model Compression

    Knowledge distillation has been widely used to produce portable and efficient neural networks that can be readily deployed on edge devices for computer vision tasks. However, almost all top-performing knowledge distillation methods need access to the original training data, which is usually very large and often unavailable. To tackle this problem, we propose a novel data-free approach, named Dual Discriminator Adversarial Distillation (DDAD), to distill a neural network without any training data or meta-data. Specifically, we use a generator to create samples, through dual discriminator adversarial distillation, that mimic the original training data. The generator not only uses the pre-trained teacher's intrinsic statistics in existing batch normalization layers but also obtains the maximum discrepancy from the student model. Then the generated samples are used to train the compact student network under the supervision of the teacher. The proposed method obtains an efficient student network which closely approximates its teacher network, despite using no original training data. Extensive experiments are conducted to demonstrate the effectiveness of the proposed approach on the CIFAR-10, CIFAR-100 and Caltech101 datasets for classification tasks. Moreover, we extend our method to semantic segmentation tasks on several public datasets such as CamVid and NYUv2. All experiments show that our method outperforms all baselines for data-free knowledge distillation.
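
The adversarial roles described in this abstract can be illustrated with a minimal sketch. It assumes the teacher-student discrepancy is a mean absolute difference between output vectors; the abstract does not specify the measure, and the function name and example values below are hypothetical. The generator would seek samples that make this value large, while the student is trained to make it small.

```python
def l1_discrepancy(teacher_out, student_out):
    """Mean absolute difference between teacher and student outputs.

    Assumption: an L1 measure is used here for illustration only;
    the abstract does not name the discrepancy actually used.
    """
    if len(teacher_out) != len(student_out):
        raise ValueError("outputs must have the same length")
    return sum(abs(t - s) for t, s in zip(teacher_out, student_out)) / len(teacher_out)


# Hypothetical outputs of teacher and student for one generated sample.
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [1.0, 0.5, 0.0]

gap = l1_discrepancy(teacher_logits, student_logits)

# The two players pull in opposite directions on the same quantity:
generator_objective = -gap  # generator minimizes this, i.e. maximizes the gap
student_objective = gap     # student minimizes the gap to imitate the teacher
```

In a real training loop these objectives would be differentiated with an autograd framework; the sketch only shows how one quantity serves both players.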

    The Rocketbox Library and the Utility of Freely Available Rigged Avatars

    As part of the open sourcing of the Microsoft Rocketbox avatar library for research and academic purposes, here we discuss the importance of rigged avatars for the Virtual and Augmented Reality (VR, AR) research community. Avatars, virtual representations of humans, are widely used in VR applications. Furthermore, many research areas ranging from crowd simulation to neuroscience, psychology, or sociology have used avatars to investigate new theories or to demonstrate how they influence human performance and interactions. We divide this paper into two main parts: the first gives an overview of the different methods available to create and animate avatars. We cover the current main alternatives for face and body animation as well as introduce upcoming capture methods. The second part presents the scientific evidence of the utility of using rigged avatars for embodiment but also for applications such as crowd simulation and entertainment. All in all, this paper attempts to convey why rigged avatars will be key to the future of VR and its wide adoption.

    3D face reconstruction and gaze tracking in the HMD for virtual interaction

    With the rapid development of virtual reality (VR) technology, VR headsets, a.k.a. Head-Mounted Displays (HMDs), are widely available, allowing immersive 3D content to be viewed. A natural need for truly immersive VR is to allow bidirectional communication: the user should be able to interact with the virtual world using facial expressions and eye gaze, in addition to traditional means of interaction. Typical application scenarios include VR virtual conferencing and virtual roaming, where ideally users are able to see other users' expressions and have eye contact with them in the virtual world. In addition, eye gaze also provides a natural means of interaction with virtual objects. Despite significant achievements in recent years in the reconstruction of 3D faces from RGB or RGB-D images, it remains a challenge to reliably capture and reconstruct 3D facial expressions including eye gaze when the user is wearing an HMD, because the majority of the face is occluded, especially the areas around the eyes, which are essential for recognizing facial expressions and eye gaze. In this paper, we introduce a novel real-time system that is able to capture and reconstruct the 3D faces of users wearing HMDs, and robustly recover eye gaze. We further propose a novel method to map eye gaze directions to the 3D virtual world, which provides a novel and useful interactive mode in VR. We compare our method with state-of-the-art techniques both qualitatively and quantitatively, and demonstrate the effectiveness of our system using live capture.

    CorrNet: Fine-grained emotion recognition for video watching using wearable physiological sensors

    Recognizing user emotions while they watch short-form videos anytime and anywhere is essential for facilitating video content customization and personalization. However, most works either classify a single emotion per video stimulus, or are restricted to static, desktop environments. To address this, we propose a correlation-based emotion recognition algorithm (CorrNet) to recognize the valence and arousal (V-A) of each instance (fine-grained segment of signals) using only wearable, physiological signals (e.g., electrodermal activity, heart rate). CorrNet takes advantage of features both inside each instance (intra-modality features) and between different instances for the same video stimulus (correlation-based features). We first test our approach on an indoor-desktop affect dataset (CASE), and thereafter on an outdoor-mobile affect dataset (MERCA) which we collected using a smart wristband and wearable eyetracker. Results show that for subject-independent binary classification (high-low), CorrNet yields promising recognition accuracies: 76.37% and 74.03% for V-A on CASE, and 70.29% and 68.15% for V-A on MERCA. Our findings show: (1) instance segment lengths between 1–4 s result in the highest recognition accuracies; (2) accuracies between laboratory-grade and wearable sensors are comparable, even under low sampling rates (≤64 Hz); and (3) large amounts of neutral V-A labels, an artifact of continuous affect annotation, result in varied recognition performance.
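
The notion of an instance (a fine-grained segment of a signal) and of binary high-low labels can be sketched as follows. This is an illustration under stated assumptions, not CorrNet itself: the non-overlapping windowing, the fixed threshold, and all names and values are hypothetical, and the abstract does not say how instances are cut or how ratings are binarized.

```python
def segment_instances(signal, sampling_rate_hz, instance_seconds):
    """Split a 1-D signal into non-overlapping, fixed-length instances.

    Trailing samples that do not fill a whole instance are dropped.
    """
    samples_per_instance = int(sampling_rate_hz * instance_seconds)
    if samples_per_instance <= 0:
        raise ValueError("an instance must cover at least one sample")
    usable = len(signal) - len(signal) % samples_per_instance
    return [
        signal[start:start + samples_per_instance]
        for start in range(0, usable, samples_per_instance)
    ]


def binarize(ratings, threshold):
    """Map continuous ratings to 1 (high) or 0 (low) using a fixed threshold."""
    return [1 if rating > threshold else 0 for rating in ratings]


# Hypothetical 10 s recording sampled at 64 Hz, cut into 2 s instances.
signal = [0.0] * 640
instances = segment_instances(signal, sampling_rate_hz=64, instance_seconds=2)

# Hypothetical per-instance valence ratings on a 1-9 scale, split at 5.
labels = binarize([2.0, 5.0, 7.5, 4.0, 9.0], threshold=5.0)
```

With these inputs the recording yields five instances of 128 samples each.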

    Affective state recognition in Virtual Reality from electromyography and photoplethysmography using head-mounted wearable sensors.

    The three core components of Affective Computing (AC) are emotion expression recognition, emotion processing, and emotional feedback. Affective states are typically characterized in a two-dimensional space consisting of arousal, i.e., the intensity of the emotion felt, and valence, i.e., the degree to which the current emotion is pleasant or unpleasant. These fundamental properties of emotion can be measured not only using subjective ratings from users, but also with the help of physiological and behavioural measures, which potentially provide an objective evaluation across users. Multiple combinations of measures are utilised in AC for a range of applications, including education, healthcare, marketing, and entertainment. As the uses of immersive Virtual Reality (VR) technologies are growing, there is a rapidly increasing need for robust affect recognition in VR settings. However, the integration of affect detection methodologies with VR remains an unmet challenge due to constraints posed by current VR technologies, such as Head Mounted Displays. This EngD project is designed to overcome some of these challenges by effectively integrating valence and arousal recognition methods in VR technologies and by testing their reliability in seated and room-scale fully immersive VR conditions. The aim of this EngD research project is to identify how affective states are elicited in VR and how they can be efficiently measured, without constraining movement or decreasing the sense of presence in the virtual world. Through a three-year-long collaboration with Emteq labs Ltd, a wearable technology company, we assisted in the development of a novel multimodal affect detection system, specifically tailored towards the requirements of VR. This thesis will describe the architecture of the system, the research studies that enabled this development, and the future challenges.
The studies conducted validated the reliability of our proposed system, including the VR stimuli design, data measures and processing pipeline. This work could inform future studies in the field of AC in VR and assist in the development of novel applications and healthcare interventions.
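
The two-dimensional arousal-valence space described in this abstract can be made concrete with a small sketch. It assumes both ratings are centered so that 0 is the neutral midpoint; the class, function, and labels are hypothetical and are not taken from the thesis or from the system it describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectiveState:
    """A point in the two-dimensional affect space.

    Assumption: both axes are centered, so 0.0 is the neutral midpoint.
    """
    valence: float  # pleasant (> 0) versus unpleasant (< 0)
    arousal: float  # intense (> 0) versus calm (< 0)


def quadrant(state):
    """Describe which quadrant of the arousal-valence plane a state falls in."""
    arousal_label = "high arousal" if state.arousal > 0 else "low arousal"
    valence_label = "positive valence" if state.valence > 0 else "negative valence"
    return f"{arousal_label}, {valence_label}"


# Hypothetical rating: a pleasant but calm state.
calm_pleasant = AffectiveState(valence=0.6, arousal=-0.4)
description = quadrant(calm_pleasant)
```

Ratings that sit exactly on an axis are grouped with the low or negative side here, which is an arbitrary choice made for the sketch.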