How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities
The rapid progress in open-source Large Language Models (LLMs) is
significantly driving AI development forward. However, there is still a limited
understanding of their trustworthiness. Deploying these models at scale without
sufficient trustworthiness can pose significant risks, highlighting the need to
uncover these issues promptly. In this work, we conduct an adversarial
assessment of open-source LLMs on trustworthiness, scrutinizing them across
eight different aspects including toxicity, stereotypes, ethics, hallucination,
fairness, sycophancy, privacy, and robustness against adversarial
demonstrations. We propose advCoU, an extended Chain of Utterances-based (CoU)
prompting strategy by incorporating carefully crafted malicious demonstrations
for trustworthiness attack. Our extensive experiments encompass recent and
representative series of open-source LLMs, including Vicuna, MPT, Falcon,
Mistral, and Llama 2. The empirical outcomes underscore the efficacy of our
attack strategy across diverse aspects. More interestingly, our result analysis
reveals that models with superior performance in general NLP tasks do not
always have greater trustworthiness; in fact, larger models can be more
vulnerable to attacks. Additionally, models that have undergone instruction
tuning, focusing on instruction following, tend to be more susceptible,
although fine-tuning LLMs for safety alignment proves effective in mitigating
adversarial trustworthiness attacks.
Comment: NAACL 202
Controllable Decontextualization of Yes/No Question and Answers into Factual Statements
Yes/No or polar questions represent one of the main linguistic question
categories. They consist of a main interrogative clause, for which the answer
is binary (assertion or negation). Polar questions and answers (PQA) represent
a valuable knowledge resource present in many community and other curated QA
sources, such as forums or e-commerce applications. Using answers to polar
questions alone in other contexts is not trivial. Answers are contextualized,
and presume that the interrogative question clause and any shared knowledge
between the asker and answerer are provided.
We address the problem of controllable rewriting of answers to polar
questions into decontextualized and succinct factual statements. We propose a
Transformer sequence to sequence model that utilizes soft-constraints to ensure
controllable rewriting, such that the output statement is semantically
equivalent to its PQA input. Evaluation on three separate PQA datasets, using
automated and human evaluation metrics, shows that our proposed
approach achieves the best performance when compared to existing baselines.
Comment: Accepted at ECIR 202
A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models
Counter narratives - informed responses to hate speech contexts designed to
refute hateful claims and de-escalate encounters - have emerged as an effective
hate speech intervention strategy. While previous work has proposed automatic
counter narrative generation methods to aid manual interventions, the
evaluation of these approaches remains underdeveloped. Previous automatic
metrics for counter narrative evaluation lack alignment with human judgment as
they rely on superficial reference comparisons instead of incorporating key
aspects of counter narrative quality as evaluation criteria. To address prior
evaluation limitations, we propose a novel evaluation framework prompting LLMs
to provide scores and feedback for generated counter narrative candidates using
5 defined aspects derived from guidelines from counter narrative specialized
NGOs. We found that LLM evaluators achieve strong alignment to human-annotated
scores and feedback and outperform alternative metrics, indicating their
potential as multi-aspect, reference-free and interpretable evaluators for
counter narrative evaluation.
Comment: 22 pages, camera-ready version; references added, typos corrected,
methodology section expanded, additional tabl
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
Text-guided image editing is widely needed in daily life, ranging from
personal use to professional applications such as Photoshop. However, existing
methods are either zero-shot or trained on an automatically synthesized
dataset, which contains a high volume of noise. Thus, they still require lots
of manual tuning to produce desirable outcomes in practice. To address this
issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/),
the first large-scale, manually annotated dataset for instruction-guided real
image editing that covers diverse scenarios: single-turn, multi-turn,
mask-provided, and mask-free editing. MagicBrush comprises over 10K manually
annotated triplets (source image, instruction, target image), which support
training large-scale text-guided image editing models. We fine-tune
InstructPix2Pix on MagicBrush and show that the new model can produce much
better images according to human evaluation. We further conduct extensive
experiments to evaluate current image editing baselines from multiple
dimensions including quantitative, qualitative, and human evaluations. The
results reveal the challenging nature of our dataset and the gap between
current baselines and real-world editing needs.
Comment: NeurIPS 2023; Website: https://osu-nlp-group.github.io/MagicBrush
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents
Language agents powered by large language models (LLMs) have seen exploding
development. Their capability of using language as a vehicle for thought and
communication lends an incredible level of flexibility and versatility. People
have quickly capitalized on this capability to connect LLMs to a wide range of
external components and environments: databases, tools, the Internet, robotic
embodiment, etc. Many believe an unprecedentedly powerful automation technology
is emerging. However, new automation technologies come with new safety risks,
especially for intricate systems like language agents. There is a surprisingly
large gap between the speed and scale of their development and deployment and
our understanding of their safety risks. Are we building a house of cards? In
this position paper, we present the first systematic effort in mapping
adversarial attacks against language agents. We first present a unified
conceptual framework for agents with three major components: Perception, Brain,
and Action. Under this framework, we present a comprehensive discussion and
propose 12 potential attack scenarios against different components of an agent,
covering different attack strategies (e.g., input manipulation, adversarial
demonstrations, jailbreaking, backdoors). We also draw connections to
successful attack strategies previously applied to LLMs. We emphasize the
urgency to gain a thorough understanding of language agent risks before their
widespread deployment
Could an isolated human body lower limb model predict leg biomechanical response of Chinese pedestrians in vehicle collisions?
Purpose: The purpose of the current study was to investigate whether an isolated human body lower limb FE model could predict the leg kinematics and biomechanical response of a full-body Chinese pedestrian model in vehicle collisions. Methods: A human body lower limb FE model representing midsize Chinese adult male anthropometry was employed, and different upper body weight attachments were evaluated by comparing its predictions to those of a full-body pedestrian model in vehicle-to-pedestrian collisions with different front-end shapes. Results: The results indicate that upper body mass has a significant influence on pedestrian lower limb injury risk; the effect varies with vehicle front-end shape and is more pronounced for the femur and knee ligaments than for the tibia. In particular, upper body mass generally increases femur and knee ligament injury risk but has no obvious effect on tibia injury risk. The results also show that a higher attached buttock mass is needed in the isolated pedestrian lower limb model for impacts with vehicles that have a higher bonnet leading edge. Conclusions: The findings suggest that it is necessary to consider vehicle shape variation when assessing vehicle pedestrian protection performance, that leg-form impactors with adaptive upper body mass should be used for vehicles with different front-end shapes, and the use of regional leg-form impactor modeling the local anthropometry to evaluate the actual lower limb injury of pedestrians in different countries and regions
Seasonal Dynamics and Influencing Factors of Litterfall Production and Carbon Input in Typical Forest Community Types in Lushan Mountain, China
Litterfall is an important part of nutrient cycling and energy flow in forest ecosystems. Mountain forests are strongly eroded by running water, their surface soil is thinner, and their terrain is complex and diverse. They are more sensitive to climate change, which will affect the ecological processes and carbon sink functions of forest ecosystems. Taking Lushan Mountain as an example, we studied the dynamic characteristics of litterfall components, seasonal changes in carbon input, and the influencing factors in typical subtropical forest communities. The results showed that total annual average litterfall ranked as evergreen broad-leaved forest (EBF) > artificial coniferous forest (ACF) > deciduous broad-leaved forest (DBF) > renew young forest (RYF); that leaf litterfall was the largest of the litterfall components; and that litterfall peaked mainly in spring and autumn, showing a single- or double-peaked pattern. There was a linear relationship between the litterfall components in the four forest communities and the stand factor, but the coefficient of determination (R²) was small. Overall, the results showed that the total amount of litterfall in the four forest communities was affected by canopy density and stand density. Light, temperature and water at different altitudes had different effects on the amount of litterfall: excessive temperatures at lower altitudes are likely to limit forest growth and development under adequate light and water, and the opposite was true at higher altitudes. Pearson correlation analysis showed that litterfall in EBF and DBF was negatively correlated with rainfall, that litterfall in ACF and RYF was negatively correlated with temperature and rainfall, and that wind speed was positively correlated with litterfall.
The average annual carbon input size of the four forest communities was EBF > ACF > RYF > DBF, which may be related to environmental conditions and vegetation types, and the seasonal differences were arranged in order of spring > autumn > summer > winter. It can be seen that, considering performance under future climate change, EBF is more conducive to nutrient input and has good soil fertility maintenance ability
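
For readers unfamiliar with the Pearson correlation analysis and the R² value mentioned in the litterfall abstract above, the following is a minimal sketch of how such a coefficient can be computed. It is not the authors' code or data: the function name `pearson_r` and the rainfall and litterfall values are hypothetical, invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, invented for illustration only (not from the study)
rainfall_mm = [40.0, 85.0, 120.0, 210.0, 160.0, 90.0]
litterfall = [62.0, 55.0, 48.0, 30.0, 41.0, 52.0]

r = pearson_r(rainfall_mm, litterfall)
r_squared = r ** 2  # for a simple linear fit, R^2 equals r squared
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

A negative `r` corresponds to the kind of negative correlation with rainfall reported in the abstract, and a small `r_squared` would indicate that a linear relationship explains little of the variation.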