58 research outputs found

    HumanMAC: Masked Motion Completion for Human Motion Prediction

    Human motion prediction is a classical problem in computer vision and computer graphics with a wide range of practical applications. Previous efforts achieve great empirical performance based on an encoding-decoding style. Methods of this style work by first encoding previous motions into latent representations and then decoding the latent representations into predicted motions. However, in practice, they remain unsatisfactory due to several issues, including complicated loss constraints, cumbersome training processes, and limited ability to switch between different categories of motion in prediction. In this paper, to address these issues, we depart from the foregoing style and propose a novel framework from a new perspective. Specifically, our framework works in a masked completion fashion. In the training stage, we learn a motion diffusion model that generates motions from random noise. In the inference stage, with a denoising procedure, we condition motion prediction on observed motions to output more continuous and controllable predictions. The proposed framework enjoys promising algorithmic properties: it needs only one loss in optimization and is trained in an end-to-end manner. Additionally, it switches between different categories of motion effectively, which is significant in realistic tasks, e.g., the animation task. Comprehensive experiments on benchmarks confirm the superiority of the proposed framework. The project page is available at https://lhchen.top/Human-MAC
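The abstract does not spell out how conditioning on observed motion works during denoising. One common inpainting-style reading of "masked completion" is sketched below: at each denoising step, observed frames are taken from a noised copy of the observation and the remaining frames from the model's denoised sample. The function name `masked_completion_step`, the per-frame scalar representation, and the blending rule itself are assumptions for illustration, not the authors' confirmed procedure.

```python
from typing import List


def masked_completion_step(
    noised_observed: List[float],
    denoised: List[float],
    mask: List[int],
) -> List[float]:
    """Blend one denoising step (hypothetical sketch).

    mask[i] == 1 marks an observed frame, which is kept from the noised
    observation; mask[i] == 0 marks a frame to predict, which is taken
    from the model's denoised sample.
    """
    if not (len(noised_observed) == len(denoised) == len(mask)):
        raise ValueError("all inputs must have the same number of frames")
    return [
        m * o + (1 - m) * d
        for o, d, m in zip(noised_observed, denoised, mask)
    ]


# Illustrative values only: two observed frames followed by two predicted frames.
blended = masked_completion_step(
    noised_observed=[1.0, 2.0, 3.0, 4.0],
    denoised=[9.0, 9.0, 9.0, 9.0],
    mask=[1, 1, 0, 0],
)
```

In a real sampler each frame would be a pose vector rather than a scalar, and this blend would be repeated at every denoising step.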

    Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals

    High sample complexity has long been a challenge for RL. On the other hand, humans learn to perform tasks not only from interaction or demonstrations, but also by reading unstructured text documents, e.g., instruction manuals. Instruction manuals and wiki pages are among the most abundant data that could inform agents of valuable features and policies or task-specific environmental dynamics and reward structures. Therefore, we hypothesize that the ability to utilize human-written instruction manuals to assist learning policies for specific tasks should lead to a more efficient and better-performing agent. We propose the Read and Reward framework. Read and Reward speeds up RL algorithms on Atari games by reading manuals released by the Atari game developers. Our framework consists of a QA Extraction module that extracts and summarizes relevant information from the manual and a Reasoning module that evaluates object-agent interactions based on information from the manual. An auxiliary reward is then provided to a standard A2C RL agent when an interaction is detected. Experimentally, various RL algorithms obtain significant improvement in performance and training speed when assisted by our design.
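The auxiliary-reward idea can be sketched as a small reward-shaping function. The names `shaped_reward`, `interaction_detected`, `manual_says_beneficial`, and the `bonus` magnitude are hypothetical. The abstract states only that an auxiliary reward is provided when an interaction is detected, so the signed bonus below is an assumption.

```python
def shaped_reward(
    env_reward: float,
    interaction_detected: bool,
    manual_says_beneficial: bool,
    bonus: float = 1.0,
) -> float:
    """Add an auxiliary reward to the environment reward (hypothetical sketch).

    If no object-agent interaction is detected, the environment reward is
    returned unchanged. Otherwise a bonus is added when the manual-derived
    judgement is positive and subtracted when it is negative.
    """
    if not interaction_detected:
        return env_reward
    return env_reward + (bonus if manual_says_beneficial else -bonus)


# Illustrative values only.
r_no_hit = shaped_reward(0.0, interaction_detected=False, manual_says_beneficial=True)
r_good = shaped_reward(0.0, interaction_detected=True, manual_says_beneficial=True)
r_bad = shaped_reward(0.0, interaction_detected=True, manual_says_beneficial=False)
```

The shaped value would be passed to the RL agent in place of the raw environment reward.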

    Retrieving positions of closely packed sub-wavelength nanoparticles from their diffraction patterns

    Distinguishing two objects or point sources located closer than the Rayleigh distance is impossible in conventional microscopy. Understandably, the task becomes increasingly harder with a growing number of particles placed in close proximity. It has been recently demonstrated that subwavelength nanoparticles in closely packed clusters can be counted by AI-enabled analysis of the diffraction patterns of coherent light scattered by the cluster. Here we show that deep learning analysis can determine the actual position of the nanoparticle in the cluster of subwavelength particles from a single-shot diffraction pattern even if they are separated by distances below the Rayleigh resolution limit of a conventional microscope. Comment: 6 pages, 3 figures

    Prediction model of obstructive sleep apnea–related hypertension: Machine learning–based development and interpretation study

    Background: Obstructive sleep apnea (OSA) is a globally prevalent disease closely associated with hypertension. To date, no predictive model for OSA-related hypertension has been established. We aimed to use machine learning (ML) to construct a model to analyze risk factors and predict OSA-related hypertension. Materials and methods: We retrospectively collected the clinical data of OSA patients diagnosed by polysomnography from October 2019 to December 2021 and randomly divided them into training and validation sets. A total of 1,493 OSA patients with 27 variables were included. Independent risk factors for OSA-related hypertension were screened by multifactorial logistic regression models. Six ML algorithms, namely logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), bootstrapped aggregating (Bagging), and multilayer perceptron (MLP), were used to develop the model on the training set. The validation set was used to tune the model hyperparameters to determine the final prediction model. We compared the accuracy and discrimination of the models to identify the best machine learning algorithm for predicting OSA-related hypertension. In addition, a web-based tool was developed to promote its clinical application. We used permutation importance and Shapley additive explanations (SHAP) to determine the importance of the selected features and interpret the ML models. Results: A total of 18 variables were selected for the models. The GBM model achieved the best discriminatory ability (area under the receiver operating characteristic curve = 0.873, accuracy = 0.885, sensitivity = 0.713), and on the basis of this model, an online tool was built to help clinicians optimize the diagnosis of OSA-related hypertension. Finally, age, family history of hypertension, minimum arterial oxygen saturation, body mass index, and percentage of time with SaO2 < 90% were revealed by the SHAP method as the top five critical variables contributing to the diagnosis of OSA-related hypertension. Conclusion: We established a risk prediction model for OSA-related hypertension patients using the ML method and demonstrated that, among the six ML models, the gradient boosting machine model performs best. This prediction model could help to identify high-risk OSA-related hypertension patients, provide early and individualized diagnoses and treatment plans, protect patients from the serious consequences of OSA-related hypertension, and minimize the burden on society.
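For readers unfamiliar with the reported metrics, accuracy and sensitivity follow directly from confusion-matrix counts. The sketch below uses the standard definitions. The counts in the usage lines are made up for illustration and are not taken from the study.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all cases classified correctly: (TP + TN) / total."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positive cases that the model detects: TP / (TP + FN)."""
    positives = tp + fn
    if positives == 0:
        raise ValueError("no positive cases")
    return tp / positives


# Illustrative counts only (not the study's data).
acc = accuracy(tp=40, fp=10, tn=45, fn=5)
sens = sensitivity(tp=40, fn=5)
```

A model can have high accuracy and noticeably lower sensitivity when negative cases dominate, which is why both figures are usually reported together.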

    CLIP-Based Adaptive Graph Attention Network for Large-Scale Unsupervised Multi-Modal Hashing Retrieval

    With the proliferation of multi-modal data generated by various sensors, unsupervised multi-modal hashing retrieval has been extensively studied due to its advantages in storage, retrieval efficiency, and label independence. However, existing unsupervised methods still face two obstacles: (1) Because they cannot fully capture the complementary and co-occurrence information of multi-modal data, they suffer from inaccurate similarity measures. (2) They suffer from unbalanced multi-modal learning, and the semantic structure of the data is corrupted during hash code binarization. To address these obstacles, we devise an effective CLIP-based Adaptive Graph Attention Network (CAGAN) for large-scale unsupervised multi-modal hashing retrieval. Firstly, we use the multi-modal model CLIP to extract fine-grained semantic features, mine similarity information from different perspectives of multi-modal data, and perform similarity fusion and enhancement. In addition, this paper proposes an adaptive graph attention network to assist the learning of hash codes, which uses an attention mechanism to learn adaptive graph similarity across modalities. It further aggregates the intrinsic neighborhood information of neighboring data nodes through a graph convolutional network to generate more discriminative hash codes. Finally, this paper employs an iterative approximate optimization strategy to mitigate the information loss in the binarization process. Extensive experiments on three benchmark datasets demonstrate that the proposed method significantly outperforms several representative hashing methods in unsupervised multi-modal retrieval tasks.

    A Kind of Reinforcement Learning to Improve Genetic Algorithm for Multiagent Task Scheduling

    It is difficult to coordinate the various processes in the process industry. We built a multiagent distributed hierarchical intelligent control model for manufacturing systems that integrate multiple production units, based on multiagent system technology. The model organically combines multiple intelligent agent modules and physical entities to form an intelligent control system with certain functions. The model consists of a system management agent, a workshop control agent, and an equipment agent. For the task assignment problem in this model, we use reinforcement learning to improve the genetic algorithm for multiagent task scheduling and use the standard task scheduling dataset in OR-Library for simulation experiment analysis. Experimental results show that the algorithm is superior.