    Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living

    Domain shifts, such as appearance changes, are a key challenge in real-world applications of activity recognition models, which range from assistive robotics and smart homes to driver observation in intelligent vehicles. For example, while simulations are an excellent way of economical data collection, a Synthetic-to-Real domain shift leads to a more than 60% drop in accuracy when recognizing Activities of Daily Living (ADLs). We tackle this challenge and introduce an activity domain generation framework which creates novel ADL appearances (novel domains) from different existing activity modalities (source domains) inferred from video training data. Our framework computes human poses, heatmaps of body joints, and optical flow maps and uses them alongside the original RGB videos to learn the essence of source domains in order to generate completely new ADL domains. The model is optimized by maximizing the distance between the existing source appearances and the generated novel appearances while ensuring that the semantics of an activity are preserved through an additional classification loss. While source-data multimodality is an important concept in this design, our setup does not rely on multi-sensor setups (i.e., all source modalities are inferred from a single video). The newly created activity domains are then integrated into the training of the ADL classification networks, resulting in models far less susceptible to changes in data distributions. Extensive experiments on the Synthetic-to-Real benchmark Sims4Action demonstrate the potential of the domain generation paradigm for cross-domain ADL recognition, setting new state-of-the-art results. Our code is publicly available at https://github.com/Zrrr1997/syn2real_DG. Comment: 8 pages, 7 figures, to be published in IROS 2022.
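
    To make the described optimization concrete, below is a minimal sketch of such a combined objective in PyTorch. The function name, the specific distance term (MSE), and the weighting are assumptions for illustration; the paper's exact losses are not reproduced here.

    ```python
    import torch.nn.functional as F

    def domain_generation_loss(novel, source, activity_logits, labels, lam=1.0):
        # Novelty term: push generated appearances away from the source
        # appearances (maximized, hence negated in the minimized total).
        novelty = -F.mse_loss(novel, source)
        # Semantic term: the generated clip must still be classified as the
        # same activity, preserving its semantics.
        semantics = F.cross_entropy(activity_logits, labels)
        # lam balances novelty against semantic consistency (assumed weight).
        return novelty + lam * semantics
    ```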

    ModSelect: Automatic Modality Selection for Synthetic-to-Real Domain Generalization

    Modality selection is an important step when designing multimodal systems, especially in the case of cross-domain activity recognition, as certain modalities are more robust to domain shift than others. However, selecting only the modalities which make a positive contribution requires a systematic approach. We tackle this problem by proposing an unsupervised modality selection method (ModSelect), which does not require any ground-truth labels. We determine the correlation between the predictions of multiple unimodal classifiers and the domain discrepancy between their embeddings. Then, we systematically compute modality selection thresholds, which select only modalities with a high correlation and low domain discrepancy. We show in our experiments that our method ModSelect chooses only modalities with positive contributions and consistently improves the performance on a Synthetic-to-Real domain adaptation benchmark, narrowing the domain gap. Comment: 14 pages, 6 figures, accepted at the ECCV 2022 OOD workshop.
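
    The selection rule can be illustrated as follows: keep only modalities whose prediction correlation lies above a threshold and whose domain discrepancy lies below one. Using the means of both statistics as thresholds, and the numbers themselves, are assumptions of this sketch rather than values from the paper.

    ```python
    import numpy as np

    def select_modalities(correlation, discrepancy):
        # Thresholds: mean correlation and mean discrepancy across modalities
        # (using the means is an assumption of this sketch).
        corr_thr = np.mean(list(correlation.values()))
        disc_thr = np.mean(list(discrepancy.values()))
        # Keep modalities that agree with the others and transfer well.
        return [m for m in correlation
                if correlation[m] >= corr_thr and discrepancy[m] <= disc_thr]

    # Hypothetical per-modality statistics (not taken from the paper):
    correlation = {"rgb": 0.71, "flow": 0.65, "pose": 0.65, "heatmaps": 0.23}
    discrepancy = {"rgb": 0.90, "flow": 0.40, "pose": 0.30, "heatmaps": 0.80}
    print(select_modalities(correlation, discrepancy))  # -> ['flow', 'pose']
    ```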

    Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments

    Deep learning-based models are at the forefront of most driver observation benchmarks due to their remarkable accuracies, but they are also associated with high computational costs. This is challenging, as resources are often limited in real-world driving scenarios. This paper introduces a lightweight framework for resource-efficient driver activity recognition. The framework enhances 3D MobileNet, a neural architecture optimized for speed in video classification, by incorporating knowledge distillation and model quantization to balance model accuracy and computational efficiency. Knowledge distillation helps maintain accuracy while reducing the model size by leveraging soft labels from a larger teacher model (I3D) instead of relying solely on the original ground-truth data. Model quantization significantly lowers memory and computation demands by using lower-precision integers for model weights and activations. Extensive testing on a public dataset for in-vehicle monitoring during autonomous driving demonstrates that this new framework achieves a threefold reduction in model size and a 1.4-fold improvement in inference time compared to an already optimized architecture. The code for this study is available at https://github.com/calvintanama/qd-driver-activity-reco. Comment: Accepted at IROS 2023.
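
    The distillation component can be illustrated with the standard soft-label objective below; the temperature and weighting are illustrative defaults, not the settings reported by the authors.

    ```python
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
        # Soft term: the student (e.g., 3D MobileNet) matches the temperature-
        # softened predictions of the teacher (e.g., I3D).
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)  # rescale so gradients stay comparable across temperatures
        # Hard term: the student also fits the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        # alpha trades off teacher imitation vs. label fitting (assumed value).
        return alpha * soft + (1 - alpha) * hard
    ```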

    AutoPET Challenge 2023: Sliding Window-based Optimization of U-Net

    Tumor segmentation in medical imaging is crucial and relies on precise delineation. Fluorodeoxyglucose Positron-Emission Tomography (FDG-PET) is widely used in clinical practice to detect metabolically active tumors. However, FDG-PET scans may misinterpret irregular glucose consumption in healthy or benign tissues as cancer. Combining PET with Computed Tomography (CT) can enhance tumor segmentation by integrating metabolic and anatomic information. FDG-PET/CT scans are pivotal for cancer staging and reassessment, utilizing radiolabeled fluorodeoxyglucose to highlight metabolically active regions. Accurately distinguishing tumor-specific uptake from physiological uptake in normal tissues is a challenging aspect of precise tumor segmentation. The AutoPET challenge addresses this by providing a dataset of 1014 FDG-PET/CT studies, encouraging advancements in accurate tumor segmentation and analysis within the FDG-PET/CT domain. Code: https://github.com/matt3o/AutoPET2-Submission/. Comment: 9 pages, 1 figure, MICCAI 2023 AutoPET Challenge submission. Version 2: added all results on the preliminary test set.
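
    A common way to realize sliding-window inference for a 3D U-Net is shown below, here using MONAI's sliding_window_inference. The stand-in network, window size, and overlap are assumptions for illustration, not necessarily the submission's configuration.

    ```python
    import torch
    from monai.inferers import sliding_window_inference
    from monai.networks.nets import BasicUNet

    # Untrained stand-in for the submission's U-Net (2 input channels: PET+CT).
    model = BasicUNet(spatial_dims=3, in_channels=2, out_channels=2)
    model.eval()

    volume = torch.randn(1, 2, 160, 160, 160)  # dummy co-registered PET/CT pair
    with torch.no_grad():
        logits = sliding_window_inference(
            inputs=volume,
            roi_size=(96, 96, 96),   # patch size fed to the network
            sw_batch_size=2,         # patches evaluated per forward pass
            predictor=model,
            overlap=0.5,             # overlap between neighboring windows
            mode="gaussian",         # down-weight patch borders when blending
        )
    mask = logits.argmax(dim=1)      # voxel-wise tumor/background prediction
    print(mask.shape)                # torch.Size([1, 160, 160, 160])
    ```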

    Pose2Drone: A Skeleton-Pose-based Framework for Human-Drone Interaction

    Drones have become a common tool, utilized in many tasks such as aerial photography, surveillance, and delivery. However, operating a drone increasingly requires interaction with the user. A natural and safe method for Human-Drone Interaction (HDI) is using gestures. In this paper, we introduce an HDI framework built upon skeleton-based pose estimation. Our framework provides the functionality to control the movement of the drone with simple arm gestures and to follow the user while keeping a safe distance. We also propose a monocular distance estimation method, which is entirely based on image features and does not require any additional depth sensors. To perform comprehensive experiments and quantitative analysis, we create a customized testing dataset. The experiments indicate that our HDI framework can achieve an average of 93.5% accuracy in the recognition of 11 common gestures. The code is available at: https://github.com/Zrrr1997/Pose2Drone
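
    To illustrate how a distance can be recovered from 2D skeleton keypoints alone, here is a toy pinhole-camera estimate based on shoulder width. The focal length and assumed shoulder width are hypothetical, and the paper's actual image-feature-based method is not necessarily this exact geometry.

    ```python
    import math

    def estimate_distance(l_shoulder, r_shoulder, focal_px=600.0, shoulder_m=0.40):
        # Pixel distance between the two shoulder keypoints.
        px_width = math.dist(l_shoulder, r_shoulder)
        if px_width < 1e-6:
            raise ValueError("degenerate keypoints")
        # Similar triangles: distance / shoulder_m == focal_px / px_width.
        return focal_px * shoulder_m / px_width

    # Hypothetical keypoints in pixel coordinates:
    print(round(estimate_distance((300, 240), (380, 242)), 2))  # ~3.0 m
    ```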

    Assessment of the acute impact of normobaric hypoxia as a part of an intermittent hypoxic training on heart rate variability

    Aim: To assess the dynamics of the autonomic nervous system (ANS) by means of heart rate variability (HRV) during and after acute exposure to normobaric hypoxia, representing a single session of an intermittent hypoxic training protocol. Material and methods: Twenty-four healthy males aged 28.0±7.2 years (mean±SD) breathed hypoxic air (FIO2 = 12.3±1.5%) for one hour, delivered via a hypoxicator (AltiPro 8850 Summit+, Altitude Tech, Canada). Pulse oximetry and HRV were measured before, during, and after the hypoxic exposure. Results: At the end of the hypoxic session, all of the tested subjects had higher low frequency power (lnLF) (6.9±1.1 ms² vs. 7.5±1.1 ms²; p=0.042), LF/HF (1.5±0.8 vs. 3.3±2.8; p=0.007), and standard deviation 2 of the Poincaré plot (SD2) (92.8±140.0 ms vs. 120.2±54.2 ms; p=0.005), as well as increases in total power (7.7±1.1 ms² vs. 8.1±1.2 ms²; p=0.032) and the standard deviation of normal-to-normal interbeat intervals (SDNN) (57.3±31.0 ms vs. 72.3±41.1 ms; p=0.024), but lower sample entropy (SampEn) (1.6±0.2 vs. 1.4±0.2; p=0.010). Immediately after the hypoxic exposure, LF/HF decreased (3.3±2.8 vs. 2.2±1.8; p=0.001) while lnHF significantly increased (6.6±1.4 ms² vs. 7.1±1.3 ms²; p=0.020). Conclusion: Acute normobaric hypoxia as part of a single session of an intermittent hypoxic training protocol leads to changes in the activity of the ANS. The sympathetic tone prevails during hypoxic exposure, and the parasympathetic tone increases immediately after the hypoxic factor is withdrawn.
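
    For readers unfamiliar with the reported indices, the sketch below computes SDNN and the Poincaré descriptors SD1/SD2 from a series of RR intervals using their standard definitions; the RR values are made up, and the spectral indices (lnLF, lnHF) would additionally require a power spectral density estimate.

    ```python
    import numpy as np

    def hrv_metrics(rr_ms):
        rr = np.asarray(rr_ms, dtype=float)
        sdnn = rr.std(ddof=1)                        # overall variability (ms)
        diff = np.diff(rr)                           # beat-to-beat changes
        sd1 = np.sqrt(0.5 * diff.var(ddof=1))        # Poincaré width (short-term)
        sd2 = np.sqrt(max(2 * sdnn**2 - sd1**2, 0))  # Poincaré length (long-term)
        return {"SDNN": sdnn, "SD1": sd1, "SD2": sd2}

    # Hypothetical RR interval series in milliseconds:
    print(hrv_metrics([812, 798, 840, 825, 810, 795, 850, 830]))
    ```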

    Evaluation of Acute Exogenous Hypoxia Impact on the Fraction of Exhaled Nitric Oxide in Healthy Males

    Introduction: Exogenous hypoxia increases ventilation and contracts the pulmonary vessels. Whether these factors change the values of nitric oxide in exhaled air has not yet been evaluated. Objective: To examine the effect of exogenous normobaric hypoxia on the values of the fraction of nitric oxide in exhaled breath (FeNO). Subjects and methods: Twenty healthy non-smoking males with a mean age of 25.4 years (SD = 3.7) were tested. The basal FeNO values were compared with those at 7 min and 15 min after introduction into the hypoxic environment (a hypoxic tent) imitating atmospheric air with an oxygen concentration corresponding to 3200 m above sea level. Exhaled breath temperature was measured at baseline and at 10-12 min of the hypoxic exposure. Heart rate and oxygen saturation were registered by pulse oximetry. Results: All the subjects had FeNO values in the reference range. The mean baseline value was 14.0 ± 3.2 ppb, versus 15.5 ± 3.8 ppb (7 min) and 15.3 ± 3.6 ppb (15 min) in hypoxic conditions; the elevation was statistically significant (p = 0.011 and p = 0.008, respectively). The values of exhaled breath temperature were 33.79 ± 1.55°C at baseline and 33.87 ± 1.83°C in hypoxic conditions (p = 0.70). Baseline oxygen saturation in all subjects was higher than that measured in hypoxia (96.93 ± 1.29% vs. 94.27 ± 2.53%; p < 0.001). Conclusions: Exogenous hypoxia leads to an increase in FeNO values but does not affect exhaled breath temperature.

    The impact of dietary intervention on myokines: a narrative review

    This review elucidates the role of dietary factors in influencing myokine secretion, complementing the recognized benefits of physical activity on metabolic health. While exercise is known to positively modulate myokines, contributing to their health-promoting effects, exploration of how diet affects these critical cytokines remains sparse. Through an analysis of the literature, this work identifies dietary patterns that markedly affect myokine levels, underscoring the complex relationship between nutrition and myokine activity. The investigation reveals that specific dietary approaches can profoundly alter myokine secretion, suggesting an essential avenue for combined dietary and exercise interventions to optimize health outcomes. These insights highlight the need for further detailed study of diet-induced myokine modulation.

    MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision

    Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. As of today, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. As examples, we present use cases in the fields of brain tumor classification, facial and skull reconstruction, multi-class anatomy completion, education, and 3D printing. In the future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback. Comment: 16 pages.
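
    For readers who want to work with such shapes, the snippet below shows one generic way to inspect a downloaded mesh using the trimesh library; the file name is a placeholder, and this is not the project's own Python API.

    ```python
    import trimesh

    # Load a shape file downloaded from MedShapeNet (hypothetical file name).
    mesh = trimesh.load("liver_0001.stl")
    print(mesh.vertices.shape, mesh.faces.shape)  # vertex and triangle counts
    print(mesh.is_watertight, mesh.volume)        # basic shape statistics
    ```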