
    SpeakingFaces: A Large-Scale Multimodal Dataset of Voice Commands with Visual and Thermal Video Streams

    We present SpeakingFaces as a publicly available, large-scale dataset developed to support multimodal machine learning research in contexts that utilize a combination of thermal, visual, and audio data streams; examples include human-computer interaction (HCI), biometric authentication, recognition systems, domain transfer, and speech recognition. SpeakingFaces comprises well-aligned, high-resolution thermal and visual spectrum image streams of fully-framed faces, synchronized with audio recordings of each subject speaking approximately 100 imperative phrases. Data were collected from 142 subjects, yielding over 13,000 instances of synchronized data (~3.8 TB). For technical validation, we demonstrate two baseline examples. The first baseline shows classification by gender, utilizing different combinations of the three data streams in both clean and noisy environments. The second example consists of thermal-to-visual facial image translation, as an instance of domain transfer.
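
    The gender-classification baseline combines the visual, thermal, and audio streams. Below is a minimal late-fusion sketch in PyTorch; the feature dimensions, encoder sizes, and fusion strategy are illustrative assumptions, not the configuration used in the paper.

```python
# Hypothetical late-fusion classifier over visual, thermal, and audio embeddings.
# Feature dimensions and architecture are illustrative assumptions only.
import torch
import torch.nn as nn


class LateFusionGenderClassifier(nn.Module):
    def __init__(self, visual_dim=512, thermal_dim=512, audio_dim=128, hidden_dim=256):
        super().__init__()
        # One small encoder per stream; any per-stream backbone could produce these features.
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, hidden_dim), nn.ReLU())
        self.thermal_enc = nn.Sequential(nn.Linear(thermal_dim, hidden_dim), nn.ReLU())
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden_dim), nn.ReLU())
        # Concatenate the encoded streams and predict the two classes.
        self.head = nn.Linear(3 * hidden_dim, 2)

    def forward(self, visual, thermal, audio):
        fused = torch.cat(
            [self.visual_enc(visual), self.thermal_enc(thermal), self.audio_enc(audio)],
            dim=-1,
        )
        return self.head(fused)


# Example forward pass on random features standing in for pre-extracted embeddings.
model = LateFusionGenderClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 512), torch.randn(4, 128))
print(logits.shape)  # torch.Size([4, 2])
```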

    A Particle-Based COVID-19 Simulator With Contact Tracing and Testing

    Goal: The COVID-19 pandemic has emerged as the most severe public health crisis in over a century. As of January 2021, there are more than 100 million cases and 2.1 million deaths. For informed decision making, reliable statistical data and capable simulation tools are needed. Our goal is to develop an epidemic simulator that can model the effects of random population testing and contact tracing. Methods: Our simulator models individuals as particles with position, velocity, and epidemic status states on a 2D map, and runs an SEIR epidemic model with contact-tracing and testing modules. The simulator is available on GitHub under the MIT license. Results: The results show that the synergistic use of contact tracing and massive testing is effective in suppressing the epidemic (the number of deaths was reduced by 72%). Conclusions: The particle-based COVID-19 simulator enables the modeling of intervention measures, random testing, and contact tracing for epidemic mitigation and suppression.
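
    A minimal sketch of the particle-based SEIR dynamics described above is shown below, assuming numpy and illustrative parameter values; the released simulator additionally implements the contact-tracing and testing modules and uses calibrated parameters that are not reproduced here.

```python
# Minimal particle-based SEIR step: positions and velocities on a 2D map,
# proximity-based exposure, and probabilistic state transitions.
# All parameter values are illustrative assumptions, not calibrated ones.
import numpy as np

N = 1000                 # number of particles (individuals)
MAP_SIZE = 100.0         # side length of the square 2D map
CONTACT_RADIUS = 1.0     # distance below which a contact occurs
P_INFECT = 0.05          # exposure probability per contact with an infected particle
P_E_TO_I = 0.2           # per-step probability exposed -> infected
P_I_TO_R = 0.1           # per-step probability infected -> recovered
S, E, I, R = 0, 1, 2, 3  # epidemic status codes

rng = np.random.default_rng(0)
pos = rng.uniform(0, MAP_SIZE, size=(N, 2))
vel = rng.normal(0, 0.5, size=(N, 2))
status = np.full(N, S)
status[rng.choice(N, size=10, replace=False)] = I  # seed infections


def step(pos, vel, status):
    # Move particles and keep them inside the map.
    pos = np.clip(pos + vel, 0, MAP_SIZE)

    # Susceptible particles near an infected particle may become exposed.
    infected_pos = pos[status == I]
    if len(infected_pos) > 0:
        for idx in np.where(status == S)[0]:
            dists = np.linalg.norm(infected_pos - pos[idx], axis=1)
            contacts = int(np.sum(dists < CONTACT_RADIUS))
            if contacts and rng.random() < 1 - (1 - P_INFECT) ** contacts:
                status[idx] = E

    # Progress epidemic states (masks computed before applying the updates).
    to_infected = (status == E) & (rng.random(N) < P_E_TO_I)
    to_recovered = (status == I) & (rng.random(N) < P_I_TO_R)
    status[to_infected] = I
    status[to_recovered] = R
    return pos, vel, status


for _ in range(100):
    pos, vel, status = step(pos, vel, status)
print({name: int(np.sum(status == code)) for name, code in zip("SEIR", (S, E, I, R))})
```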

    Deep Learning Based Object Recognition Using Physically-Realistic Synthetic Depth Scenes

    Recognizing objects and estimating their poses have a wide range of applications in robotics. For instance, to grasp objects, robots need the position and orientation of objects in 3D. The task becomes challenging in a cluttered environment with different types of objects. A popular approach to tackle this problem is to utilize a deep neural network for object recognition. However, deep learning-based object detection in cluttered environments requires a substantial amount of data, and collecting these data requires time and extensive human labor for manual labeling. In this study, our objective was the development and validation of a deep object recognition framework using a synthetic depth image dataset. We synthetically generated a depth image dataset of 22 objects randomly placed in a 0.5 m × 0.5 m × 0.1 m box, and automatically labeled all objects with an occlusion rate below 70%. The Faster Region-based Convolutional Neural Network (Faster R-CNN) architecture was adopted for training on a dataset of 800,000 synthetic depth images, and its performance was tested on a real-world depth image dataset consisting of 2,000 samples. The deep object recognizer achieved 40.96% detection accuracy on the real depth images and 93.5% on the synthetic depth images. Training the deep learning model with noise-added synthetic images improves the recognition accuracy on real images to 46.3%. The object detection framework can thus be trained on synthetically generated depth data and then employed for object recognition on real depth data in a cluttered environment. Synthetic depth data-based deep object detection has the potential to substantially decrease the time and human effort required for extensive data collection and labeling.
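
    The noise-augmentation step that narrows the synthetic-to-real gap could look roughly like the following; the noise model (Gaussian noise plus randomly dropped pixels) and its parameters are assumptions for illustration, not the procedure reported in the study, and the Faster R-CNN training itself is omitted.

```python
# Illustrative noise augmentation for synthetic depth images, approximating
# real depth-sensor artifacts (Gaussian noise plus random missing pixels).
# The noise model and parameter values are assumptions, not those used in the study.
import numpy as np


def add_depth_noise(depth, noise_std=0.005, dropout_prob=0.02, rng=None):
    """Return a noisy copy of a depth image given in meters."""
    rng = rng or np.random.default_rng()
    noisy = depth + rng.normal(0.0, noise_std, size=depth.shape)
    # Simulate invalid/missing depth readings as zeros.
    noisy[rng.random(depth.shape) < dropout_prob] = 0.0
    return np.clip(noisy, 0.0, None)


# Example: a flat synthetic depth map at 0.6 m, perturbed before being fed to training.
synthetic_depth = np.full((480, 640), 0.6, dtype=np.float32)
noisy_depth = add_depth_noise(synthetic_depth)
```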

    A Network-Based Stochastic Epidemic Simulator: Controlling COVID-19 with Region-Specific Policies

    In this work, we present an open-source stochastic epidemic simulator calibrated with extant epidemic experience of COVID-19. The simulator models a country as a network in which each node represents an administrative region, and the transportation connections between regions are modeled as the edges of this network. Each node runs a Susceptible-Exposed-Infected-Recovered (SEIR) model, and population transfer between the nodes is computed over the transportation network, which allows modeling of the geographic spread of the disease. The simulator incorporates information ranging from population demographics and mobility data to health care resource capacity, by region, with interactive controls of system variables to allow dynamic and interactive modeling of events. The single-node simulator was validated using the thoroughly reported data from Lombardy, Italy. Then, the epidemic situation in Kazakhstan as of 31 May 2020 was accurately recreated. Afterward, we simulated a number of scenarios for Kazakhstan with different sets of policies. We also demonstrate the effects of region-based policies, such as transportation limitations between administrative units and the application of different policies to different regions based on epidemic intensity and geographic location. The results show that the simulator can be used to estimate outcomes of policy options to inform deliberations on governmental interdiction policies.
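
    A rough sketch of the network formulation, assuming numpy, a two-region example, and illustrative rates; the actual simulator's calibration, demographic inputs, and interactive controls are not reproduced here.

```python
# Sketch of the network idea: each node runs an SEIR model and a mobility
# matrix moves a fraction of each compartment between connected regions.
# Rates, populations, and the two-region mobility matrix are illustrative assumptions.
import numpy as np

BETA, SIGMA, GAMMA = 0.3, 0.2, 0.1   # transmission, incubation, recovery rates
DT = 1.0                             # time step in days

# Rows: regions (nodes); columns: S, E, I, R compartments.
state = np.array([[999000., 0., 1000., 0.],
                  [500000., 0., 0., 0.]])

# mobility[i, j]: fraction of region i's population moving to region j per step.
mobility = np.array([[0.00, 0.01],
                     [0.02, 0.00]])


def seir_step(state, mobility):
    S, E, I, R = state.T
    N = state.sum(axis=1)
    # Within-node SEIR dynamics.
    new_E = BETA * S * I / N * DT
    new_I = SIGMA * E * DT
    new_R = GAMMA * I * DT
    state = np.stack([S - new_E, E + new_E - new_I, I + new_I - new_R, R + new_R], axis=1)
    # Population transfer along the transportation edges, compartment by compartment.
    outflow = mobility.sum(axis=1, keepdims=True) * state
    inflow = mobility.T @ state
    return state - outflow + inflow


for _ in range(120):
    state = seir_step(state, mobility)
print(state.round(0))
```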