492 research outputs found

    FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation

    Facial expression analysis based on machine learning requires a large amount of well-annotated data to reflect different changes in facial motion. Publicly available datasets help accelerate research in this area by providing a benchmark resource, but all of these datasets, to the best of our knowledge, are limited to rough annotations for action units, including only their absence, presence, or a five-level intensity according to the Facial Action Coding System (FACS). To meet the need for videos labeled in great detail, we present a well-annotated dataset named FEAFA for Facial Expression Analysis and 3D Facial Animation. One hundred and twenty-two participants, including children, young adults and elderly people, were recorded in real-world conditions. In addition, 99,356 frames were manually labeled using the Expression Quantitative Tool developed by us to quantify 9 symmetrical FACS action units, 10 asymmetrical (unilateral) FACS action units, 2 symmetrical FACS action descriptors and 2 asymmetrical FACS action descriptors, and each action unit or action descriptor is annotated with a floating-point number between 0 and 1. To provide a baseline for use in future research, a benchmark for the regression of action unit values based on Convolutional Neural Networks is presented. We also demonstrate the potential of our FEAFA dataset for 3D facial animation. Almost all state-of-the-art algorithms for facial animation are based on 3D face reconstruction. We hence propose a novel method that drives virtual characters based only on action unit value regression of the 2D video frames of source actors. Comment: 9 pages, 7 figures

    Do you really follow me? Adversarial Instructions for Evaluating the Robustness of Large Language Models

    Large Language Models (LLMs) have shown remarkable proficiency in following instructions, making them valuable in customer-facing applications. However, their impressive capabilities also raise concerns about the amplification of risks posed by adversarial instructions, which can be injected into the model input by third-party attackers to manipulate LLMs' original instructions and prompt unintended actions and content. Therefore, it is crucial to understand LLMs' ability to accurately discern which instructions to follow to ensure their safe deployment in real-world scenarios. In this paper, we propose a pioneering benchmark for automatically evaluating the robustness of LLMs against adversarial instructions. The objective of this benchmark is to quantify the extent to which LLMs are influenced by injected adversarial instructions and assess their ability to differentiate between these adversarial instructions and original user instructions. Through experiments conducted with state-of-the-art instruction-following LLMs, we uncover significant limitations in their robustness against adversarial instruction attacks. Furthermore, our findings indicate that prevalent instruction-tuned models are prone to being overfitted to follow any instruction phrase in the prompt without truly understanding which instructions should be followed. This highlights the need to address the challenge of training models to comprehend prompts instead of merely following instruction phrases and completing the text. Comment: Work in progress

    Multiscale reconstruction of porous media based on multiple dictionaries learning

    Digital modeling of the microstructure is important for studying the physical and transport properties of porous media. Multiscale modeling for porous media can accurately characterize macro-pores and micro-pores in a large-FoV (field of view) high-resolution three-dimensional pore structure model. This paper proposes a multiscale reconstruction algorithm based on multiple dictionaries learning, in which edge patterns and micro-pore patterns from a homologous high-resolution pore structure are introduced into a low-resolution pore structure to build a fine multiscale pore structure model. Qualitative and quantitative comparisons of the experimental results show that the multiscale reconstructions are similar to the real high-resolution pore structure in terms of complex pore geometry and pore surface morphology. The geometric, topological and permeability properties of the multiscale reconstruction results are almost identical to those of the real high-resolution pore structures. The experiments also demonstrate that the proposed algorithm is capable of multiscale reconstruction regardless of the size of the input. This work provides an effective method for fine multiscale modeling of porous media.

    LEAP: Efficient and Automated Test Method for NLP Software

    The widespread adoption of DNNs in NLP software has highlighted the need for robustness. Researchers have proposed various automatic testing techniques that generate adversarial test cases. However, existing methods suffer from two limitations: weak error-discovering capabilities, with success rates ranging from 0% to 24.6% for BERT-based NLP software, and time inefficiency, taking 177.8s to 205.28s per test case, which makes them hard to apply in time-constrained scenarios. To address these issues, this paper proposes LEAP, an automated test method that uses LEvy flight-based Adaptive Particle swarm optimization integrated with textual features to generate adversarial test cases. Specifically, we adopt Levy flight for population initialization to increase the diversity of generated test cases. We also design an inertial weight adaptive update operator to improve the efficiency of LEAP's global optimization of high-dimensional text examples and a mutation operator based on the greedy strategy to reduce the search time. We conducted a series of experiments to validate LEAP's ability to test NLP software and found that the average success rate of LEAP in generating adversarial test cases is 79.1%, which is 6.1% higher than the next best approach (PSOattack). While ensuring high success rates, LEAP significantly reduces time overhead by up to 147.6s compared to other heuristic-based methods. Additionally, the experimental results demonstrate that LEAP can generate more transferable test cases and significantly enhance the robustness of DNN-based systems. Comment: Accepted at ASE 202
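The abstract above mentions Levy flight for population initialization but gives no formula. As a rough illustration only, the sketch below draws Levy-distributed steps with Mantegna's algorithm, a common way to sample them, and uses those steps to perturb uniformly placed particles. The function names, the `beta` and `scale` parameters, and the continuous search box are assumptions for illustration; they are not taken from the LEAP paper, which operates on text examples.

```python
import math
import random
from typing import List, Optional


def mantegna_sigma(beta: float = 1.5) -> float:
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta: float = 1.5, rng: Optional[random.Random] = None) -> float:
    """Draw one heavy-tailed step: u / |v|^(1/beta)."""
    rng = rng or random.Random()
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero (vanishingly rare)
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def init_population(n_particles: int, dim: int, lower: float = -1.0,
                    upper: float = 1.0, scale: float = 0.1,
                    seed: Optional[int] = None) -> List[List[float]]:
    """Hypothetical initializer: uniform start plus a Levy perturbation,
    clipped back into the [lower, upper] box."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_particles):
        particle = []
        for _ in range(dim):
            x = rng.uniform(lower, upper) + scale * levy_step(rng=rng)
            particle.append(min(upper, max(lower, x)))
        population.append(particle)
    return population


if __name__ == "__main__":
    for p in init_population(n_particles=3, dim=4, seed=0):
        print(p)
```

Because Levy steps are heavy-tailed, most particles stay close to their uniform draw while a few jump far, which is the usual argument for why such initialization increases diversity.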

    Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

    Accents, as variations from standard pronunciation, pose significant challenges for speech recognition systems. Although joint automatic speech recognition (ASR) and accent recognition (AR) training has been proven effective in handling multi-accent scenarios, current multi-task ASR-AR approaches overlook the granularity differences between tasks. Fine-grained units capture pronunciation-related accent characteristics, while coarse-grained units are better for learning linguistic information. Moreover, an explicit interaction between the two tasks can provide complementary information and improve the performance of each, but it is rarely used by existing approaches. In this paper, we propose a novel Decoupling and Interacting Multi-task Network (DIMNet) for joint speech and accent recognition, which comprises a connectionist temporal classification (CTC) branch, an AR branch, an ASR branch, and a bottom feature encoder. Specifically, AR and ASR are first decoupled by separate branches and two-granular modeling units to learn task-specific representations. The AR branch is from our previously proposed linguistic-acoustic bimodal AR model and the ASR branch is an encoder-decoder based Conformer model. Then, for the task interaction, the CTC branch provides aligned text for the AR task, while accent embeddings extracted from our AR model are incorporated into the ASR branch's encoder and decoder. Finally, during ASR inference, a cross-granular rescoring method is introduced to fuse the complementary information from the CTC and attention decoder after the decoupling. Our experiments on English and Chinese datasets demonstrate the effectiveness of the proposed model, which achieves 21.45%/28.53% AR accuracy relative improvement and 32.33%/14.55% ASR error rate relative reduction over a published standard baseline, respectively. Comment: Accepted by IEEE Transactions on Audio, Speech and Language Processing (TASLP)
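The abstract above reports results as relative improvement and relative reduction, which are easy to confuse with absolute percentage-point differences. The sketch below shows the usual definitions. The helper names and the input numbers are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_improvement(baseline: float, proposed: float) -> float:
    """Relative gain in a higher-is-better metric (e.g. accuracy), in percent."""
    return (proposed - baseline) / baseline * 100.0


def relative_reduction(baseline: float, proposed: float) -> float:
    """Relative drop in a lower-is-better metric (e.g. error rate), in percent."""
    return (baseline - proposed) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical numbers: accuracy 60.0 -> 72.0, error rate 10.0 -> 8.0.
    print(relative_improvement(60.0, 72.0))
    print(relative_reduction(10.0, 8.0))
```

With these made-up inputs, a 12-point accuracy gain corresponds to a relative improvement of about 20%, and a 2-point error drop to a relative reduction of about 20%.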

    A retrospective study evaluating the correlation between the severity of intervertebral disc injury and the anteroposterior type of thoracolumbar vertebral fractures

    OBJECTIVE: To evaluate the correlation between the severity of intervertebral disc injury and the anteroposterior type of thoracolumbar vertebral fractures. METHODS: Fifty-six cases of thoracolumbar vertebral fractures treated in our trauma center from October 2012 to October 2013 were included in this study. The fractures were classified by the anteroposterior classification, whereas the severity of intervertebral disc injury was evaluated using magnetic resonance imaging. The Spearman correlation coefficient was used to analyze the correlation between the severity of intervertebral disc injury and the anteroposterior type of thoracolumbar fractures, whereas a χ² test was adopted to measure the variability between different fracture types and upper and lower adjacent disc injuries. RESULTS: The Spearman correlation coefficients between fracture types and the severity of the upper and lower adjacent disc injuries were 0.739 (P