109 research outputs found

    Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence

    Self-supervised pre-training paradigms have been extensively explored in the field of skeleton-based action recognition. In particular, methods based on masked prediction have pushed the performance of pre-training to a new height. However, these methods take low-level features, such as raw joint coordinates or temporal motion, as prediction targets for the masked regions, which is suboptimal. In this paper, we show that using high-level contextualized features as prediction targets can achieve superior performance. Specifically, we propose Skeleton2vec, a simple and efficient self-supervised 3D action representation learning framework, which utilizes a transformer-based teacher encoder taking unmasked training samples as input to create latent contextualized representations as prediction targets. Benefiting from the self-attention mechanism, the latent representations generated by the teacher encoder can incorporate the global context of the entire training samples, leading to a richer training task. Additionally, considering the high temporal correlations in skeleton sequences, we propose a motion-aware tube masking strategy which divides the skeleton sequence into several tubes and performs persistent masking within each tube based on motion priors, thus forcing the model to build long-range spatio-temporal connections and focus on action-semantic richer regions. Extensive experiments on NTU-60, NTU-120, and PKU-MMD datasets demonstrate that our proposed Skeleton2vec outperforms previous methods and achieves state-of-the-art results. Comment: Submitted to CVPR 202

    Gradient Attention Balance Network: Mitigating Face Recognition Racial Bias via Gradient Attention

    Although face recognition has made impressive progress in recent years, we ignore the racial bias of the recognition system when we pursue a high level of accuracy. Previous work found that for different races, face recognition networks focus on different facial regions, and the sensitive regions of darker-skinned people are much smaller. Based on this discovery, we propose a new de-bias method based on gradient attention, called Gradient Attention Balance Network (GABN). Specifically, we use the gradient attention map (GAM) of the face recognition network to track the sensitive facial regions and make the GAMs of different races tend to be consistent through adversarial learning. This method mitigates the bias by making the network focus on similar facial regions. In addition, we also use masks to erase the Top-N sensitive facial regions, forcing the network to allocate its attention to a larger facial region. This method expands the sensitive region of darker-skinned people and further reduces the gap between GAM of darker-skinned people and GAM of Caucasians. Extensive experiments show that GABN successfully mitigates racial bias in face recognition and learns more balanced performance for people of different races. Comment: Accepted by CVPR 2023 worksho

    Deep Learning for Automated Contouring of Gross Tumor Volumes in Esophageal Cancer

    Purpose: The aim of this study was to propose and evaluate a novel three-dimensional (3D) V-Net and two-dimensional (2D) U-Net mixed (VUMix-Net) architecture for fully automatic and accurate delineation of gross tumor volume (GTV) contours in esophageal cancer (EC). Methods: We collected the computed tomography (CT) scans of 215 EC patients. 3D V-Net, 2D U-Net, and VUMix-Net were developed and further applied simultaneously to delineate GTVs. The Dice similarity coefficient (DSC) and 95th-percentile Hausdorff distance (95HD) were used as quantitative metrics to evaluate the performance of the three models in ECs from different segments. The CT data of 20 patients were randomly selected as the ground truth (GT) masks, and the corresponding delineation results were generated by artificial intelligence (AI). Score differences between the two groups (GT versus AI) and the evaluation consistency were compared. Results: In all patients, there was a significant difference in the 2D DSCs from U-Net, V-Net, and VUMix-Net (p=0.01). In addition, VUMix-Net achieved better 3D-DSC and 95HD values. There was a significant difference among the 3D-DSC (mean ± STD) and 95HD values for upper-, middle-, and lower-segment EC (p<0.001), and the middle EC values were the best. In middle-segment EC, VUMix-Net achieved the highest 2D-DSC values (p<0.001) and lowest 95HD values (p=0.044). Conclusion: The new model (VUMix-Net) showed certain advantages in delineating the GTVs of EC. Additionally, it can generate the GTVs of EC that meet clinical requirements and have the same quality as human-generated contours. The system demonstrated the best performance for the ECs of the middle segment.
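For readers unfamiliar with the Dice similarity coefficient named in the abstract above, the following is a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), applied to two binary masks. It is not the authors' evaluation code: the function name `dice_coefficient`, the flat 0/1 list representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two equal-length binary masks.

    Masks are flat sequences of 0/1 values (a hypothetical representation;
    real segmentation masks would be 2D or 3D arrays flattened first).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of elements")

    # Count voxels set in both masks, and in each mask separately.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)

    if size_a + size_b == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / (size_a + size_b)


if __name__ == "__main__":
    ground_truth = [1, 1, 0, 0]
    prediction = [1, 0, 1, 0]
    print(dice_coefficient(ground_truth, prediction))
```

A value of 1.0 means the predicted and reference contours coincide exactly, and 0.0 means they do not overlap at all.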

    The Ginger-shaped Asteroid 4179 Toutatis: New Observations from a Successful Flyby of Chang'e-2

    On 13 December 2012, Chang'e-2 conducted a successful flyby of the near-Earth asteroid 4179 Toutatis at a closest distance of 770 ± 120 meters from the asteroid's surface. The highest-resolution image, with a resolution of better than 3 meters, reveals new discoveries on the asteroid, e.g., a giant basin at the big end, a sharply perpendicular silhouette near the neck region, and direct evidence of boulders and regolith, which suggests that Toutatis may bear a rubble-pile structure. Toutatis' maximum physical length and width are (4.75 × 1.95 km) ± 10%, respectively, and the direction of the +z axis is estimated to be (250° ± 5°, 63° ± 5°) with respect to the J2000 ecliptic coordinate system. The bifurcated configuration is indicative of a contact binary origin for Toutatis, which is composed of two lobes (head and body). Chang'e-2 observations have significantly improved our understanding of the characteristics, formation, and evolution of asteroids in general. Comment: 21 pages, 3 figures, 1 tabl

    Effect of Continuous Loading Coupled with Wet–Dry Cycles on Strength Deterioration of Concrete

    In practical engineering, concrete is often under continuous stress, so considering the effect of wet–dry cycles alone on the strength deterioration of concrete has limitations. To study the deterioration of concrete strength under the coupling of load and wet–dry cycles, concrete specimens were loaded at stress levels of 0%, 10%, 20%, and 35% while undergoing one, three, and seven wet–dry cycles. The strength deterioration of the concrete was obtained by uniaxial compression, and a regression equation was established. The strength deterioration mechanism of the concrete under the coupled conditions was analyzed using acoustic emission (AE) and nuclear magnetic resonance (NMR) techniques. The results show that, for the same number of wet–dry cycles, the variation of the uniaxial compressive strength of concrete with stress level has two thresholds, a and b, and that as the wet–dry cycles progress, the length of the interval from a to b gradually shortens until it reaches 0. The cumulative AE energy of concrete decreases with the progression of wet–dry cycles; using the crack initiation stress as the threshold, the calm phase of concrete acoustic emission, the fluctuating phase, and the NMR T2 spectral peak area show different patterns of variation as the number of wet–dry cycles increases.