9 research outputs found

    Accurate and Fast Compressed Video Captioning

    Existing video captioning approaches typically first sample video frames from a decoded video and then run a subsequent process (e.g., feature extraction and/or captioning model learning). In this pipeline, manual frame sampling may ignore key information in videos and thus degrade performance. Additionally, redundant information in the sampled frames may make video captioning inference inefficient. To address this, we study video captioning from a different perspective, in the compressed domain, which brings multi-fold advantages over the existing pipeline: 1) Compared to raw images from the decoded video, the compressed video, consisting of I-frames, motion vectors and residuals, is highly distinguishable, which allows us to leverage the entire video for learning without manual sampling through a specialized model design; 2) The captioning model is more efficient in inference because it processes a smaller and less redundant input. We propose a simple yet effective end-to-end transformer in the compressed domain that learns video captioning directly from the compressed video. We show that even with a simple design, our method can achieve state-of-the-art performance on different benchmarks while running almost 2x faster than existing approaches. Code is available at https://github.com/acherstyx/CoCap

    Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment

    This work studies how face anti-spoofing (FAS) models generalize across domain gaps, such as image resolution, blurriness and sensor variations. Most prior works regard domain-specific signals as harmful, and apply metric learning or adversarial losses to remove them from the feature representation. Though learning a domain-invariant feature space is viable for the training data, we show that the feature shift still exists in an unseen test domain, which hurts the generalizability of the classifier. In this work, instead of constructing a domain-invariant feature space, we encourage domain separability while aligning the live-to-spoof transition (i.e., the trajectory from live to spoof) to be the same for all domains. We formulate this FAS strategy of separability and alignment (SA-FAS) as a problem of invariant risk minimization (IRM), and learn a domain-variant feature representation but a domain-invariant classifier. We demonstrate the effectiveness of SA-FAS on challenging cross-domain FAS datasets and establish state-of-the-art performance. Comment: Accepted in CVPR202

    Effect of lithium anti-ablation and grain refinement introduced by TiC nanoparticles in LPBF Al–Li alloy

    Al–Li alloys are extensively utilized in the field of aerospace due to their low density and high specific strength. However, laser powder bed fusion (LPBF) processed Al–Li alloys still encounter challenges because of hot cracking and Li element ablation. In this study, a TiC nanoparticle modified Al–Mg–Li alloy is developed for the LPBF process. Fully dense printed TiC modified Al–Mg–Li alloy can be obtained. The presence of TiC nanoparticles in the melt pool effectively increased the viscosity of the liquid Al alloy, leading to a reduction in metal vaporization and liquid spatter, thus preventing Li ablation during LPBF. The Li content was significantly increased from 0.87 wt.% in the Al–Mg–Li alloy to 1.34 % in the TiC modified Al–Mg–Li alloy. Moreover, the TiC nanoparticles played a key role in the columnar-to-equiaxed grain transition. The average grain size of the TiC modified Al–Mg–Li alloy was refined to about 1.5 μm, two orders of magnitude smaller than that in the printed Al–Mg–Li alloy. A gradient transition reaction from TiC to Al3Ti was found on the TiC nanoparticle surface during LPBF. The in-situ formed Al3Ti phase on the TiC nanoparticles significantly decreased the lattice mismatch with the Al matrix, thereby resulting in outstanding mechanical properties: an ultimate tensile strength of 343 MPa and an elongation of 9.3 %. The Li anti-ablation effect induced by TiC nanoparticles provides a new pathway for the additive manufacturing of lightweight alloys.

    A novel experimental method for in situ strain measurement during selective laser melting

    Selective laser melting (SLM), a powder bed-based additive manufacturing technology, has been developed and applied in multiple industrial fields in the last decade. However, the distortion and swelling in the SLM process resulting from thermal stress cannot be predicted subject to measurement. In this work, an in situ distortion measurement system applied to the SLM process is presented. The distortion behaviour of a component under laser scanning can be precisely recorded in real time by this system. The detailed evolution and driving force of specimen distortion in the SLM process are discussed based on the experimental results. The distortion in a single laser scan presents a strong instantaneous upward motion of the central section during laser heating and a relatively slow downward recovery motion of the central section during cooling. The distortion behaviour of the sample with and without a layer of metal powder is compared, and laser scanning on the bare sample surface leads to a significantly higher residual distortion. The influence of SLM parameter variables (such as scanning speed, laser power, scanning width, layer thickness and scanning times) on SLM distortion is also analysed. Finally, the stress distribution of laser melting is verified by high-resolution EBSD analysis.

    Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing

    While recent face anti-spoofing methods perform well in intra-domain setups, an effective approach needs to account for the much larger appearance variations of images acquired in complex scenes with different sensors to achieve robust performance. In this paper, we present adaptive vision transformers (ViT) for robust cross-domain face anti-spoofing. Specifically, we adopt ViT as a backbone to exploit its strength in accounting for long-range dependencies among pixels. We further introduce an ensemble adapters module and feature-wise transformation layers in the ViT to adapt to different domains for robust performance with a few samples. Experiments on several benchmark datasets show that the proposed models achieve both robust and competitive performance against the state-of-the-art methods.