8 research outputs found

    How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities

    Full text link
    The rapid progress in open-source Large Language Models (LLMs) is significantly driving AI development forward. However, there is still a limited understanding of their trustworthiness. Deploying these models at scale without sufficient trustworthiness can pose significant risks, highlighting the need to uncover these issues promptly. In this work, we conduct an adversarial assessment of open-source LLMs on trustworthiness, scrutinizing them across eight different aspects including toxicity, stereotypes, ethics, hallucination, fairness, sycophancy, privacy, and robustness against adversarial demonstrations. We propose advCoU, an extended Chain of Utterances (CoU)-based prompting strategy that incorporates carefully crafted malicious demonstrations for trustworthiness attacks. Our extensive experiments encompass recent and representative series of open-source LLMs, including Vicuna, MPT, Falcon, Mistral, and Llama 2. The empirical outcomes underscore the efficacy of our attack strategy across diverse aspects. More interestingly, our result analysis reveals that models with superior performance in general NLP tasks do not always have greater trustworthiness; in fact, larger models can be more vulnerable to attacks. Additionally, models that have undergone instruction tuning, focusing on instruction following, tend to be more susceptible, although fine-tuning LLMs for safety alignment proves effective in mitigating adversarial trustworthiness attacks. Comment: NAACL 202
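
    A minimal sketch of how a Chain-of-Utterances-style prompt with malicious demonstrations might be assembled; the role names, demonstration texts, probe strings, and the build_adv_cou_prompt helper are illustrative assumptions, not the paper's released code.

```python
# Hedged sketch: build a dialogue-style prompt that prepends crafted malicious
# demonstrations before the actual trustworthiness probe. All strings below are
# placeholders.

TRUST_ASPECTS = [
    "toxicity", "stereotypes", "ethics", "hallucination",
    "fairness", "sycophancy", "privacy", "adversarial demonstrations",
]

def build_adv_cou_prompt(demonstrations, probe, attacker="Red-LM", target="Base-LM"):
    """Concatenate (query, unsafe answer) demonstration pairs as prior utterances,
    then append the probe that the model under test should respond to."""
    turns = []
    for query, unsafe_answer in demonstrations:
        turns.append(f"{attacker}: {query}")
        turns.append(f"{target}: {unsafe_answer}")
    turns.append(f"{attacker}: {probe}")
    turns.append(f"{target}:")
    return "\n".join(turns)

# Hypothetical usage: send the prompt to an open-source LLM and score its reply
# for the aspect being probed.
prompt = build_adv_cou_prompt(
    demonstrations=[("<crafted query>", "<crafted unsafe answer>")],
    probe=f"<probe targeting {TRUST_ASPECTS[0]}>",
)
```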

    Controllable Decontextualization of Yes/No Question and Answers into Factual Statements

    Full text link
    Yes/No or polar questions represent one of the main linguistic question categories. They consist of a main interrogative clause, for which the answer is binary (assertion or negation). Polar questions and answers (PQA) represent a valuable knowledge resource present in many community and other curated QA sources, such as forums or e-commerce applications. Using answers to polar questions alone in other contexts is not trivial: answers are contextualized and presume that the interrogative question clause and any shared knowledge between the asker and answerer are provided. We address the problem of controllable rewriting of answers to polar questions into decontextualized and succinct factual statements. We propose a Transformer sequence-to-sequence model that utilizes soft constraints to ensure controllable rewriting, such that the output statement is semantically equivalent to its PQA input. Evaluation on three separate PQA datasets, as measured through automated and human evaluation metrics, shows that our proposed approach achieves the best performance when compared to existing baselines. Comment: Accepted at ECIR 202
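
    A rough sketch of casting PQA rewriting as constrained sequence-to-sequence generation with a generic pretrained encoder-decoder; the checkpoint name, the control-token scheme, and the rewrite_pqa helper are assumptions for illustration, not the paper's actual model or constraint mechanism.

```python
# Hedged sketch: rewrite a (question, answer) pair into a standalone statement
# with a generic seq2seq model. The control tokens only approximate the idea of
# soft constraints described in the abstract.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "t5-base"  # placeholder checkpoint, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def rewrite_pqa(question: str, answer: str) -> str:
    # Mark the question and answer spans whose semantics the output must preserve.
    source = f"rewrite: <question> {question} </question> <answer> {answer} </answer>"
    inputs = tokenizer(source, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Example call: rewrite_pqa("Is the blender jar dishwasher safe?", "Yes, but only the jar.")
# would ideally return a standalone statement such as
# "The blender jar is dishwasher safe."
```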

    A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models

    Full text link
    Counter narratives - informed responses to hate speech contexts designed to refute hateful claims and de-escalate encounters - have emerged as an effective hate speech intervention strategy. While previous work has proposed automatic counter narrative generation methods to aid manual interventions, the evaluation of these approaches remains underdeveloped. Previous automatic metrics for counter narrative evaluation lack alignment with human judgment, as they rely on superficial reference comparisons instead of incorporating key aspects of counter narrative quality as evaluation criteria. To address these limitations, we propose a novel evaluation framework that prompts LLMs to provide scores and feedback for generated counter narrative candidates using five defined aspects derived from guidelines published by NGOs specializing in counter narratives. We found that LLM evaluators achieve strong alignment with human-annotated scores and feedback and outperform alternative metrics, indicating their potential as multi-aspect, reference-free, and interpretable evaluators for counter narrative evaluation. Comment: 22 pages, camera-ready version; references added, typos corrected, methodology section expanded, additional tabl
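
    A hedged sketch of what a multi-aspect, reference-free LLM judge could look like; the five aspect names, the prompt wording, and the JSON schema are placeholders, since the paper's actual aspects come from NGO guidelines not reproduced here.

```python
# Sketch of prompting an LLM judge for per-aspect scores and feedback.
# Aspect names and the response schema are illustrative only.
import json

EVAL_ASPECTS = ["relatedness", "specificity", "richness", "coherence", "overall quality"]  # placeholders

def build_judge_prompt(hate_speech: str, counter_narrative: str) -> str:
    # One rubric line per aspect; the judge is asked for a 1-5 score and feedback.
    rubric = "\n".join(f"- {a}: score 1-5 plus a one-sentence justification" for a in EVAL_ASPECTS)
    return (
        "You are evaluating a counter narrative written in response to hate speech.\n"
        f"Hate speech: {hate_speech}\n"
        f"Counter narrative: {counter_narrative}\n"
        "Score the counter narrative on each aspect and reply as a JSON object "
        'of the form {"aspect": {"score": <int>, "feedback": "<text>"}}.\n'
        + rubric
    )

def parse_judgement(llm_reply: str) -> dict:
    # Assumes the judge model returns exactly the JSON object requested above.
    return json.loads(llm_reply)
```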

    MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

    Full text link
    Text-guided image editing is widely needed in daily life, ranging from personal use to professional applications such as Photoshop. However, existing methods are either zero-shot or trained on an automatically synthesized dataset, which contains a high volume of noise. Thus, they still require lots of manual tuning to produce desirable outcomes in practice. To address this issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/), the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing. MagicBrush comprises over 10K manually annotated triplets (source image, instruction, target image), which supports training large-scale text-guided image editing models. We fine-tune InstructPix2Pix on MagicBrush and show that the new model can produce much better images according to human evaluation. We further conduct extensive experiments to evaluate current image editing baselines from multiple dimensions including quantitative, qualitative, and human evaluations. The results reveal the challenging nature of our dataset and the gap between current baselines and real-world editing needs. Comment: NeurIPS 2023; Website: https://osu-nlp-group.github.io/MagicBrush
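
    To make the triplet structure concrete, here is one possible record layout for (source image, instruction, target image) data such as MagicBrush; the field names and the EditTriplet class are illustrative assumptions, not the dataset's actual schema.

```python
# Hedged sketch of an instruction-guided editing record; all field names are placeholders.
from dataclasses import dataclass
from typing import Optional

@dataclass
class EditTriplet:
    source_image_path: str           # image before editing
    instruction: str                 # natural-language edit request
    target_image_path: str           # manually edited result image
    mask_path: Optional[str] = None  # present only for mask-provided edits
    turn_index: int = 0              # >0 for later turns in a multi-turn session

# A fine-tuning loop for an InstructPix2Pix-style model would iterate over such
# triplets, conditioning on (source image, instruction) and supervising against
# the target image.
example = EditTriplet(
    source_image_path="images/0001_src.png",
    instruction="make the sky look like a sunset",
    target_image_path="images/0001_tgt.png",
)
```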

    A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

    Full text link
    Language agents powered by large language models (LLMs) have seen exploding development. Their capability of using language as a vehicle for thought and communication lends an incredible level of flexibility and versatility. People have quickly capitalized on this capability to connect LLMs to a wide range of external components and environments: databases, tools, the Internet, robotic embodiment, etc. Many believe an unprecedentedly powerful automation technology is emerging. However, new automation technologies come with new safety risks, especially for intricate systems like language agents. There is a surprisingly large gap between the speed and scale of their development and deployment and our understanding of their safety risks. Are we building a house of cards? In this position paper, we present the first systematic effort to map adversarial attacks against language agents. We first present a unified conceptual framework for agents with three major components: Perception, Brain, and Action. Under this framework, we present a comprehensive discussion and propose 12 potential attack scenarios against different components of an agent, covering different attack strategies (e.g., input manipulation, adversarial demonstrations, jailbreaking, backdoors). We also draw connections to successful attack strategies previously applied to LLMs. We emphasize the urgency of gaining a thorough understanding of language agent risks before their widespread deployment.
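
    A sketch of how such a taxonomy could be represented; the component and strategy names come from the abstract itself, while the catalogue entry is a placeholder rather than one of the paper's 12 scenarios.

```python
# Hedged sketch: catalogue attack scenarios against the Perception/Brain/Action
# framework described in the abstract. The example entry is illustrative only.
from dataclasses import dataclass
from enum import Enum

class Component(Enum):
    # The three major agent components named in the framework.
    PERCEPTION = "perception"
    BRAIN = "brain"
    ACTION = "action"

class Strategy(Enum):
    # Attack strategies mentioned in the abstract.
    INPUT_MANIPULATION = "input manipulation"
    ADVERSARIAL_DEMONSTRATIONS = "adversarial demonstrations"
    JAILBREAKING = "jailbreaking"
    BACKDOOR = "backdoor"

@dataclass
class AttackScenario:
    component: Component
    strategy: Strategy
    description: str

# Placeholder entry, not one of the paper's scenarios.
catalogue = [
    AttackScenario(
        Component.PERCEPTION,
        Strategy.INPUT_MANIPULATION,
        "Adversarially perturbed web content is fed into the agent's observation stream.",
    ),
]
```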

    Could an isolated human body lower limb model predict leg biomechanical response of Chinese pedestrians in vehicle collisions?

    No full text
    Purpose: The purpose of the current study was to investigate whether an isolated human body lower limb FE model could predict the leg kinematics and biomechanical response of a full body Chinese pedestrian model in vehicle collisions. Methods: A human body lower limb FE model representing midsize Chinese adult male anthropometry was employed, and different upper body weight attachments were evaluated by comparing its predictions to those of a full body pedestrian model in vehicle-to-pedestrian collisions with different front-end shapes. Results: The results indicate that upper body mass has a significant influence on pedestrian lower limb injury risk; the effect varies with vehicle front-end shape and is more pronounced for the femur and knee ligaments than for the tibia. In particular, the upper body mass generally increases femur and knee ligament injury risk, but has no obvious effect on tibia injury risk. The results also show that a higher attached buttock mass is needed for the isolated pedestrian lower limb model in impacts with vehicles that have a higher bonnet leading edge. Conclusions: The findings of this study may suggest that vehicle shape variation should be considered when assessing vehicle pedestrian protection performance, that leg-form impactors with adaptive upper body mass should be used for vehicles with different front-end shapes, and that regional leg-form impactors modeling local anthropometry should be used to evaluate the actual lower limb injuries of pedestrians in different countries and regions.

    Seasonal Dynamics and Influencing Factors of Litterfall Production and Carbon Input in Typical Forest Community Types in Lushan Mountain, China

    No full text
    Litterfall is an important part of the process of nutrient circulation and energy flow in forest ecosystems. Mountain forests are strongly eroded by running water, so their surface soil is thinner and their terrain is complex and diverse. They are more sensitive to climate change, which will affect the ecological processes and carbon sink functions of forest ecosystems. Taking Lushan Mountain as an example, we studied the dynamic characteristics of litterfall components, seasonal changes in carbon input, and the influencing factors in typical subtropical forest communities. The results showed that the total annual average litterfall followed the order evergreen broad-leaved forest (EBF) > artificial coniferous forest (ACF) > deciduous broad-leaved forest (DBF) > renew young forest (RYF), that leaf litter was the largest contributor among the litterfall components, and that the peak of litterfall was mainly concentrated in spring and autumn, showing a single- or double-peaked pattern. There was a linear relationship between the litterfall components of the four forest communities and the stand factors, but the coefficient of determination (R²) was small. Overall, the results showed that the total amount of litterfall in the four forest communities was affected by canopy density and stand density. Light, temperature, and water at different altitudes had different effects on the amount of litterfall: excessive temperatures at lower altitudes were likely to limit forest growth and development even under adequate light and water, and the opposite held at higher altitudes. Pearson correlation analysis showed that litterfall in EBF and DBF was negatively correlated with rainfall, that litterfall in ACF and RYF was negatively correlated with temperature and rainfall, and that wind speed was positively correlated with litterfall. The average annual carbon input of the four forest communities followed the order EBF > ACF > RYF > DBF, which may be related to environmental conditions and vegetation types, and seasonal differences followed the order spring > autumn > summer > winter. Considering performance under future climate change, EBF is more conducive to nutrient input and better able to maintain soil fertility.
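
    A rough illustration of the correlation analysis described above: the sketch computes Pearson correlations between monthly litterfall and meteorological drivers. The CSV file, column names, and driver labels are placeholders, not the study's data.

```python
# Hedged sketch: Pearson correlations between litterfall and climate drivers
# for each forest community, assuming a hypothetical long-format table with one
# row per community and month.
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("litterfall_monthly.csv")  # placeholder file

for community in ["EBF", "ACF", "DBF", "RYF"]:
    sub = df[df["community"] == community]
    for driver in ["temperature", "rainfall", "wind_speed"]:
        r, p = pearsonr(sub["litterfall"], sub[driver])
        print(f"{community}: litterfall vs {driver}: r={r:.2f}, p={p:.3f}")
```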