    Cross-modal self-attention mechanism for controlling robot volleyball motion

    Introduction: The emergence of cross-modal perception and deep learning technologies has had a profound impact on modern robotics. This study focuses on the application of these technologies to robot control, specifically in the context of volleyball tasks. The primary objective is to achieve precise control of robots in volleyball tasks by effectively integrating information from different sensors using a cross-modal self-attention mechanism.
    Methods: Our approach uses a cross-modal self-attention mechanism to integrate information from various sensors, providing robots with a more comprehensive perception of volleyball scenes. To enhance the diversity and practicality of robot training, we employ Generative Adversarial Networks (GANs) to synthesize realistic volleyball scenarios. Furthermore, we leverage transfer learning to incorporate knowledge from other sports datasets, enriching the robots' skill acquisition.
    Results: To validate the feasibility of our approach, we conducted experiments simulating robot volleyball scenarios using multiple volleyball-related datasets, measuring quantitative metrics including accuracy, recall, precision, and F1 score. The experimental results indicate a significant improvement in the performance of our approach on robot volleyball tasks.
    Discussion: The outcomes of this study offer valuable insights into the application of multi-modal perception and deep learning in sports robotics. By effectively integrating information from different sensors and incorporating synthetic data through GANs and transfer learning, our approach demonstrates improved robot performance in volleyball tasks. These findings not only advance the field of robotics but also open up new possibilities for human-robot collaboration in sports and athletic performance improvement. This research paves the way for further exploration of advanced technologies in sports robotics, benefiting both the scientific community and athletes seeking performance enhancement through robotic assistance.
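    The abstract describes fusing sensor streams with cross-modal self-attention. The following is a minimal PyTorch sketch of one way such a layer could work, with one modality's tokens attending over another's; the tensor shapes and names (`CrossModalAttention`, `vision_feats`, `imu_feats`) and the residual design are illustrative assumptions, not the paper's actual architecture.

    ```python
    import torch
    import torch.nn as nn

    class CrossModalAttention(nn.Module):
        """One modality's tokens attend over another's (illustrative sketch)."""
        def __init__(self, dim: int = 256, num_heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, vision_feats, imu_feats):
            # vision_feats: (B, N_v, dim) camera tokens; imu_feats: (B, N_i, dim)
            # tokens from other sensors. Queries come from vision, keys/values
            # from the second modality, so the attention weights select which
            # sensor readings inform each visual token.
            fused, _ = self.attn(query=vision_feats, key=imu_feats, value=imu_feats)
            return self.norm(vision_feats + fused)  # residual keeps visual context

    # Example: fuse 49 visual tokens with 16 sensor tokens.
    layer = CrossModalAttention()
    out = layer(torch.randn(2, 49, 256), torch.randn(2, 16, 256))  # (2, 49, 256)
    ```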

    The classification of skateboarding trick images by means of transfer learning and machine learning models

    The evaluation of trick execution in skateboarding is commonly carried out manually and subjectively. Panels of judges often rely on their prior experience to assess the effectiveness of tricks performed during skateboarding competitions. This way of classifying tricks is impractical, particularly for large competitions, so an objective and unbiased means of evaluating a skateboarder's tricks is nontrivial. This study aims to classify the flat-ground tricks Ollie, Kickflip, Pop Shove-it, Nollie Frontside Shove-it, and Frontside 180 through camera vision and a combination of Transfer Learning (TL) and Machine Learning (ML). An amateur skateboarder (23 years of age with ± 5.0 years' experience) repeatedly executed five tricks of each type on an HZ skateboard, recorded by a YI action camera placed at a distance of 1.26 m on cemented ground. Features were extracted from the images automatically via 18 TL models and then fed into different tuned ML classifiers, for instance Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Random Forest (RF). Grid-search optimization with five-fold cross-validation was used to tune the hyperparameters of the classifiers evaluated. The data (722 images) was split into training, validation, and testing sets with a stratified 60:20:20 ratio. The study demonstrated that VGG16 + SVM and VGG19 + RF attained classification accuracies (CA) of 100% and 98%, respectively, on the test dataset, followed by VGG19 + k-NN and DenseNet201 + k-NN, which achieved a CA of 97%. To evaluate the developed pipelines, a robustness evaluation was carried out in the form of independent testing on augmented images (2,250 images). VGG16 + SVM, VGG19 + k-NN, and DenseNet201 + RF were found to yield reasonable CA of (on average) 99%, 98%, and 97%, respectively. Based on the robustness evaluation, it can be concluded that the VGG16 + SVM pipeline classifies the tricks exceptionally well. The present study thus demonstrates that the proposed pipelines may help judges provide a more accurate evaluation of the tricks performed than the traditional method currently applied in competitions.
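    As a concrete illustration of the TL + ML pipeline described above (pre-trained CNN features feeding a grid-searched classifier after a stratified 60:20:20 split), here is a hedged Python sketch using VGG16 and an SVM. The placeholder images and labels, the hyperparameter grid, and the random seeds are assumptions; the study's exact preprocessing and search grids are not reproduced here.

    ```python
    import numpy as np
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.applications.vgg16 import preprocess_input
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.svm import SVC

    # Placeholder data: 50 images across five trick classes (assumed shapes/labels).
    rng = np.random.default_rng(0)
    images = rng.uniform(0, 255, size=(50, 224, 224, 3)).astype("float32")
    labels = np.repeat(np.arange(5), 10)

    # Extract deep features with a pre-trained VGG16 backbone (global average pooled).
    extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")
    features = extractor.predict(preprocess_input(images), verbose=0)

    # Stratified 60:20:20 split into train / validation / test.
    X_tmp, X_test, y_tmp, y_test = train_test_split(
        features, labels, test_size=0.20, stratify=labels, random_state=42)
    X_train, X_val, y_train, y_val = train_test_split(
        X_tmp, y_tmp, test_size=0.25, stratify=y_tmp, random_state=42)

    # Tune SVM hyperparameters by grid search with five-fold cross-validation.
    grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}, cv=5)
    grid.fit(X_train, y_train)
    print("validation accuracy:", grid.score(X_val, y_val))
    print("test accuracy:", grid.score(X_test, y_test))
    ```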

    DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

    Current deep networks are very data-hungry and benefit from training on large-scale datasets, which are often time-consuming to collect and annotate. By contrast, synthetic data can be generated in unlimited quantities using generative models such as DALL-E and diffusion models, with minimal effort and cost. In this paper, we present DatasetDM, a generic dataset-generation model that can produce diverse synthetic images together with the corresponding high-quality perception annotations (e.g., segmentation masks and depth). Our method builds upon a pre-trained diffusion model and extends text-guided image synthesis to perception data generation. We show that the rich latent code of the diffusion model can be effectively decoded into accurate perception annotations by a decoder module. Training the decoder requires less than 1% (around 100 images) of manually labeled data, enabling the generation of an infinitely large annotated dataset. These synthetic data can then be used to train various perception models for downstream tasks. To showcase the power of the proposed approach, we generate datasets with rich dense pixel-wise labels for a wide range of downstream tasks, including semantic segmentation, instance segmentation, and depth estimation. Notably, the approach achieves 1) state-of-the-art results on semantic segmentation and instance segmentation; 2) significantly better robustness to domain shift than training on real data alone, along with state-of-the-art results in the zero-shot segmentation setting; and 3) flexibility for efficient application and novel task composition (e.g., image editing). The project website and code can be found at https://weijiawu.github.io/DatasetDM_page/ and https://github.com/showlab/DatasetDM, respectively.
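    The core mechanism in this abstract is a small decoder, trained on roughly 100 labeled images, that maps diffusion latent features to dense annotations. The PyTorch sketch below illustrates only that idea: the `PerceptionDecoder` module, its channel counts, and the random stand-in features are assumptions, since DatasetDM's actual decoder operates on features taken from a pre-trained text-to-image diffusion model's UNet.

    ```python
    import torch
    import torch.nn as nn

    class PerceptionDecoder(nn.Module):
        """Maps (B, C, H/8, W/8) diffusion features to per-pixel class logits."""
        def __init__(self, in_ch: int = 320, num_classes: int = 21):
            super().__init__()
            self.head = nn.Sequential(
                nn.Conv2d(in_ch, 128, kernel_size=3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
                nn.Conv2d(128, num_classes, kernel_size=1),
            )

        def forward(self, feats):
            return self.head(feats)

    decoder = PerceptionDecoder()
    opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)

    # One training step on stand-in data; the real features would come from a
    # frozen diffusion UNet, and the masks from the small manually labeled set.
    feats = torch.randn(4, 320, 32, 32)
    masks = torch.randint(0, 21, (4, 256, 256))
    loss = nn.CrossEntropyLoss()(decoder(feats), masks)
    opt.zero_grad(); loss.backward(); opt.step()
    ```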

    Human activity recognition for people with knee osteoarthritis—A proof‐of‐concept

    Clinicians lack objective means of monitoring whether their knee osteoarthritis patients are improving outside of the clinic (e.g., at home). Previous human activity recognition (HAR) models using wearable sensor data have been trained only on data from healthy people, and such models are typically imprecise for people with medical conditions that affect movement. HAR models designed for people with knee osteoarthritis have classified rehabilitation exercises but not the clinically relevant activities of transitioning from a chair, negotiating stairs, and walking, which are commonly monitored for improvement during therapy for this condition. It is therefore unknown whether a HAR model trained on data from people who have knee osteoarthritis can accurately classify these three clinically relevant activities. We therefore collected inertial measurement unit (IMU) data from 18 participants with knee osteoarthritis and trained convolutional neural network models to identify chair, stairs, and walking activities, and their phases. Model accuracy was 85% at the first level of classification (activity), 89–97% at the second (direction of movement), and 60–67% at the third (phase). This study is the first proof‐of‐concept that an accurate HAR system can be developed using IMU data from people with knee osteoarthritis to classify activities and phases of activities.
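    For readers curious what such a classifier might look like, below is a minimal PyTorch sketch of a 1D CNN over fixed-length IMU windows with a single first-level head (chair / stairs / walking). The channel count, window length, and layer sizes are assumptions; the study's hierarchical model with direction and phase levels is not reproduced here.

    ```python
    import torch
    import torch.nn as nn

    class IMUActivityCNN(nn.Module):
        """1D CNN over IMU windows; single-level head for illustration."""
        def __init__(self, n_channels: int = 6, n_classes: int = 3):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),  # pool over time to a fixed-size vector
            )
            # First-level head (chair / stairs / walking); the study adds
            # further levels for direction of movement and phase.
            self.classifier = nn.Linear(64, n_classes)

        def forward(self, x):  # x: (batch, n_channels, window_length)
            return self.classifier(self.features(x).squeeze(-1))

    model = IMUActivityCNN()
    logits = model(torch.randn(8, 6, 128))  # 8 windows of accel + gyro samples
    ```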

    Home-based rehabilitation of the shoulder using auxiliary systems and artificial intelligence: an overview

    Advancements in modern medicine have bolstered the usage of home-based rehabilitation services for patients, particularly those recovering from diseases or conditions that necessitate a structured rehabilitation process. Understanding the technological factors that can influence the efficacy of home-based rehabilitation is crucial for optimizing patient outcomes. As technologies continue to evolve rapidly, it is imperative to document the current state of the art and elucidate the key features of the hardware and software employed in these rehabilitation systems. This narrative review aims to provide a summary of the modern technological trends and advancements in home-based shoulder rehabilitation scenarios. It specifically focuses on wearable devices, robots, exoskeletons, machine learning, virtual and augmented reality, and serious games. Through an in-depth analysis of existing literature and research, this review presents the state of the art in home-based rehabilitation systems, highlighting their strengths and limitations. Furthermore, this review proposes hypotheses and potential directions for future upgrades and enhancements in these technologies. By exploring the integration of these technologies into home-based rehabilitation, this review aims to shed light on the current landscape and offer insights into the future possibilities for improving patient outcomes and optimizing the effectiveness of home-based rehabilitation programs.