14,400 research outputs found

    Dynamical properties of a trapped dipolar Fermi gas at finite temperature

    Full text link
    We investigate the dynamical properties of a trapped finite-temperature normal Fermi gas with dipole-dipole interaction. For the free expansion dynamics, we show that the expanded gas always becomes stretched along the direction of the dipole moment. In addition, we present the temperature and interaction dependences of the asymptotical aspect ratio. We further study the collapse dynamics of the system by suddenly increasing the dipolar interaction strength. We show that, in contrast to the anisotropic collapse of a dipolar Bose-Einstein condensate, a dipolar Fermi gas always collapses isotropically when the system becomes globally unstable. We also explore the interaction and temperature dependences for the frequencies of the low-lying collective excitations.Comment: 11 pages, 7 figure

    Latent Jailbreak: A Test Suite for Evaluating Both Text Safety and Output Robustness of Large Language Models

    Full text link
    Considerable research efforts have been devoted to ensuring that large language models (LLMs) align with human values and generate safe text. However, an excessive focus on sensitivity to certain topics can compromise the model's robustness in following instructions, thereby impacting its overall performance in completing tasks. Previous benchmarks for jailbreaking LLMs have primarily focused on evaluating the safety of the models without considering their robustness. In this paper, we propose a benchmark that assesses both the safety and robustness of LLMs, emphasizing the need for a balanced approach. To comprehensively study text safety and output robustness, we introduce a latent jailbreak prompt dataset, each involving malicious instruction embedding. Specifically, we instruct the model to complete a regular task, such as translation, with the text to be translated containing malicious instructions. To further analyze safety and robustness, we design a hierarchical annotation framework. We present a systematic analysis of the safety and robustness of LLMs regarding the position of explicit normal instructions, word replacements (verbs in explicit normal instructions, target groups in malicious instructions, cue words for explicit normal instructions), and instruction replacements (different explicit normal instructions). Our results demonstrate that current LLMs not only prioritize certain instruction verbs but also exhibit varying jailbreak rates for different instruction verbs in explicit normal instructions. Code and data are available at https://github.com/qiuhuachuan/latent-jailbreak.Comment: Code and data are available at https://github.com/qiuhuachuan/latent-jailbrea

    5-(2,3,4,5,6-Penta­fluoro­phen­yl)-1,3,4-thia­diazol-2-amine

    Get PDF
    The title compound, C8H2F5N3S, was synthesized by the reaction of perfluoro­benzoic acid and thio­semicarbazide. The dihedral angle between the thia­diazole and perfluoro­phenyl ring is 35.41 (6)°. In the crystal, inter­molecular N—H⋯N hydrogen bonds link the mol­ecules, forming a three-dimensional network

    PsyBench: a balanced and in-depth Psychological Chinese Evaluation Benchmark for Foundation Models

    Full text link
    As Large Language Models (LLMs) are becoming prevalent in various fields, there is an urgent need for improved NLP benchmarks that encompass all the necessary knowledge of individual discipline. Many contemporary benchmarks for foundational models emphasize a broad range of subjects but often fall short in presenting all the critical subjects and encompassing necessary professional knowledge of them. This shortfall has led to skewed results, given that LLMs exhibit varying performance across different subjects and knowledge areas. To address this issue, we present psybench, the first comprehensive Chinese evaluation suite that covers all the necessary knowledge required for graduate entrance exams. psybench offers a deep evaluation of a model's strengths and weaknesses in psychology through multiple-choice questions. Our findings show significant differences in performance across different sections of a subject, highlighting the risk of skewed results when the knowledge in test sets is not balanced. Notably, only the ChatGPT model reaches an average accuracy above 70%70\%, indicating that there is still plenty of room for improvement. We expect that psybench will help to conduct thorough evaluations of base models' strengths and weaknesses and assist in practical application in the field of psychology
    corecore