FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation
Facial expression analysis based on machine learning requires a large amount of
well-annotated data to reflect different changes in facial motion. Publicly
available datasets truly help to accelerate research in this area by providing
a benchmark resource, but all of these datasets, to the best of our knowledge,
are limited to rough annotations for action units, including only their
absence, presence, or a five-level intensity according to the Facial Action
Coding System. To meet the need for videos labeled in great detail, we present
a well-annotated dataset named FEAFA for Facial Expression Analysis and 3D
Facial Animation. One hundred and twenty-two participants, including children,
young adults and elderly people, were recorded in real-world conditions. In
addition, 99,356 frames were manually labeled using the Expression Quantitative
Tool, which we developed, to quantify 9 symmetrical FACS action units, 10
asymmetrical (unilateral) FACS action units, 2 symmetrical FACS action
descriptors and 2 asymmetrical FACS action descriptors, and each action unit or
action descriptor is well-annotated with a floating point number between 0 and
1. To provide a baseline for use in future research, a benchmark for the
regression of action unit values based on Convolutional Neural Networks is
presented. We also demonstrate the potential of our FEAFA dataset for 3D facial
animation. Almost all state-of-the-art algorithms for facial animation rely on
3D face reconstruction. We hence propose a novel method that drives virtual
characters based only on action unit value regression of the 2D video frames of
source actors.
Comment: 9 pages, 7 figures
Do you really follow me? Adversarial Instructions for Evaluating the Robustness of Large Language Models
Large Language Models (LLMs) have shown remarkable proficiency in following
instructions, making them valuable in customer-facing applications. However,
their impressive capabilities also raise concerns about the amplification of
risks posed by adversarial instructions, which can be injected into the model
input by third-party attackers to manipulate LLMs' original instructions and
prompt unintended actions and content. Therefore, it is crucial to understand
LLMs' ability to accurately discern which instructions to follow to ensure
their safe deployment in real-world scenarios. In this paper, we propose a
pioneering benchmark for automatically evaluating the robustness of LLMs
against adversarial instructions. The objective of this benchmark is to
quantify the extent to which LLMs are influenced by injected adversarial
instructions and assess their ability to differentiate between these
adversarial instructions and original user instructions. Through experiments
conducted with state-of-the-art instruction-following LLMs, we uncover
significant limitations in their robustness against adversarial instruction
attacks. Furthermore, our findings indicate that prevalent instruction-tuned
models are prone to being overfitted to follow any instruction phrase in the
prompt without truly understanding which instructions should be followed. This
highlights the need to address the challenge of training models to comprehend
prompts instead of merely following instruction phrases and completing the
text.
Comment: Work in progress
Multiscale reconstruction of porous media based on multiple dictionaries learning
Digital modeling of the microstructure is important for studying the physical
and transport properties of porous media. Multiscale modeling for porous media
can accurately characterize macro-pores and micro-pores in a large-FoV (field
of view) high-resolution three-dimensional pore structure model. This paper
proposes a multiscale reconstruction algorithm based on multiple dictionaries
learning, in which edge patterns and micro-pore patterns from homology
high-resolution pore structure are introduced into low-resolution pore
structure to build a fine multiscale pore structure model. The qualitative and
quantitative comparisons of the experimental results show that the results of
multiscale reconstruction are similar to the real high-resolution pore
structure in terms of complex pore geometry and pore surface morphology. The
geometric, topological and permeability properties of multiscale reconstruction
results are almost identical to those of the real high-resolution pore
structures. The experiments also demonstrate that the proposed algorithm is
capable of multiscale reconstruction without regard to the size of the input.
This work provides an effective method for fine multiscale modeling of porous media.
LEAP: Efficient and Automated Test Method for NLP Software
The widespread adoption of DNNs in NLP software has highlighted the need for
robustness. Researchers have proposed various automatic testing techniques for
adversarial test cases. However, existing methods suffer from two limitations:
weak error-discovering capabilities, with success rates ranging from 0% to
24.6% for BERT-based NLP software, and time inefficiency, taking 177.8s to
205.28s per test case, making them challenging for time-constrained scenarios.
To address these issues, this paper proposes LEAP, an automated test method
that uses LEvy flight-based Adaptive Particle swarm optimization integrated
with textual features to generate adversarial test cases. Specifically, we
adopt Levy flight for population initialization to increase the diversity of
generated test cases. We also design an inertial weight adaptive update
operator to improve the efficiency of LEAP's global optimization of
high-dimensional text examples and a mutation operator based on the greedy
strategy to reduce the search time. We conducted a series of experiments to
validate LEAP's ability to test NLP software and found that the average success
rate of LEAP in generating adversarial test cases is 79.1%, which is 6.1%
higher than the next best approach (PSOattack). While ensuring high success
rates, LEAP significantly reduces time overhead by up to 147.6s compared to
other heuristic-based methods. Additionally, the experimental results
demonstrate that LEAP can generate more transferable test cases and
significantly enhance the robustness of DNN-based systems.
Comment: Accepted at ASE 202
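The abstract does not say how LEAP draws its Levy flight steps. As a minimal sketch, assuming Mantegna's algorithm (a common way to sample Levy-distributed steps, not confirmed as the authors' choice), population initialization could look like the following. The function names, the stability index `beta=1.5`, and the use of continuous position vectors are all illustrative assumptions; mapping positions to word substitutions is not shown.

```python
import math
import random


def mantegna_sigma(beta):
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta, rng):
    """Draw one heavy-tailed step: u / |v|^(1/beta), u ~ N(0, sigma^2), v ~ N(0, 1)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero (vanishingly unlikely)
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def init_population(size, dim, beta=1.5, seed=0):
    """Hypothetical initializer: every particle coordinate is one Levy step."""
    rng = random.Random(seed)
    return [[levy_step(beta, rng) for _ in range(dim)] for _ in range(size)]


if __name__ == "__main__":
    for particle in init_population(size=3, dim=4):
        print(particle)
```

Because the step distribution is heavy-tailed, most coordinates stay near zero while a few are large, which is the usual argument for why Levy flight increases the diversity of an initial population.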
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition
Accents, as variations from standard pronunciation, pose significant
challenges for speech recognition systems. Although joint automatic speech
recognition (ASR) and accent recognition (AR) training has been proven
effective in handling multi-accent scenarios, current multi-task ASR-AR
approaches overlook the granularity differences between tasks. Fine-grained
units capture pronunciation-related accent characteristics, while
coarse-grained units are better for learning linguistic information. Moreover,
an explicit interaction of two tasks can also provide complementary information
and improve the performance of each other, but it is rarely used by existing
approaches. In this paper, we propose a novel Decoupling and Interacting
Multi-task Network (DIMNet) for joint speech and accent recognition, which
comprises a connectionist temporal classification (CTC) branch, an AR
branch, an ASR branch, and a bottom feature encoder. Specifically, AR and ASR
are first decoupled by separated branches and two-granular modeling units to
learn task-specific representations. The AR branch is from our previously
proposed linguistic-acoustic bimodal AR model and the ASR branch is an
encoder-decoder based Conformer model. Then, for the task interaction, the CTC
branch provides aligned text for the AR task, while accent embeddings extracted
from our AR model are incorporated into the ASR branch's encoder and decoder.
Finally, during ASR inference, a cross-granular rescoring method is introduced
to fuse the complementary information from the CTC and attention decoder after
the decoupling. Our experiments on English and Chinese datasets demonstrate the
effectiveness of the proposed model, which achieves 21.45%/28.53% AR accuracy
relative improvement and 32.33%/14.55% ASR error rate relative reduction over a
published standard baseline, respectively.
Comment: Accepted by IEEE Transactions on Audio, Speech and Language Processing (TASLP)
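The relative figures quoted in this abstract are presumably computed against the baseline in the conventional way. The symbols below (baseline and model accuracy $A$, baseline and model error rate $E$) are illustrative and do not appear in the abstract.

```latex
\[
\text{relative improvement} = \frac{A_{\text{model}} - A_{\text{base}}}{A_{\text{base}}},
\qquad
\text{relative reduction} = \frac{E_{\text{base}} - E_{\text{model}}}{E_{\text{base}}}
\]
```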
EZH2 RIP-seq Identifies Tissue-specific Long Non-coding RNAs.
Background: Polycomb Repressive Complex 2 (PRC2) catalyzes histone methylation at H3 Lys27 and plays crucial roles during development and disease in numerous systems. Its catalytic subunit EZH2 represents a key nuclear target for long non-coding RNAs (lncRNAs), which are emerging as a novel class of epigenetic regulators and participate in diverse cellular processes. LncRNAs are characterized by high tissue specificity; however, little is known about the tissue profile of the EZH2-interacting lncRNAs. Objective: Here we performed a global screening for EZH2-binding lncRNAs in tissues including brain, lung, heart, liver, kidney, intestine, spleen, testis, muscle and blood by combining RNA immunoprecipitation and RNA sequencing. We identified 1328 EZH2-binding lncRNAs, among which 470 were shared by at least two tissues while 858 were detected in only a single tissue. An RNA motif with a specific secondary structure was identified in a number of lncRNAs, albeit not in all EZH2-binding lncRNAs. The EZH2-binding lncRNAs fell into four categories, namely intergenic lncRNA, antisense lncRNA, intron-related lncRNA and promoter-related lncRNA, suggesting diverse regulation through both cis and trans mechanisms. A promoter-related lncRNA, Hnf1aos1, bound to EZH2 specifically in the liver, a feature shared with its paired coding gene Hnf1a, further confirming the validity of our study. In addition to well-known EZH2-binding lncRNAs such as Kcnq1ot1, Gas5, Meg3, Hotair and Malat1, the majority of the lncRNAs are reported here for the first time to be associated with EZH2. Conclusion: Our findings provide a profiling view of the EZH2-interacting lncRNAs across different tissues, and suggest critical roles of lncRNAs during cell differentiation and maturation.
A retrospective study evaluating the correlation between the severity of intervertebral disc injury and the anteroposterior type of thoracolumbar vertebral fractures
OBJECTIVE: To evaluate the correlation between the severity of intervertebral disc injury and the anteroposterior type of thoracolumbar vertebral fractures. METHODS: Fifty-six cases of thoracolumbar vertebral fractures treated in our trauma center from October 2012 to October 2013 were included in this study. The fractures were classified by the anteroposterior classification, whereas the severity of intervertebral disc injury was evaluated using magnetic resonance imaging. The Spearman correlation coefficient was used to analyze the correlation between the severity of intervertebral disc injury and the anteroposterior type of thoracolumbar fractures, whereas a χ2 test was adopted to measure the variability between different fracture types and upper and lower adjacent disc injuries. RESULTS: The Spearman correlation coefficients between fracture types and the severity of the upper and lower adjacent disc injuries were 0.739 (P
- …