132 research outputs found

    Is ChatGPT A Good Translator? Yes With GPT-4 As The Engine

    Full text link
    This report provides a preliminary evaluation of ChatGPT for machine translation, covering translation prompts, multilingual translation, and translation robustness. We adopt the prompts advised by ChatGPT to trigger its translation ability and find that the candidate prompts generally work well, with minor performance differences. Evaluating on a number of benchmark test sets, we find that ChatGPT performs competitively with commercial translation products (e.g., Google Translate) on high-resource European languages but lags behind significantly on low-resource or distant languages. As for translation robustness, ChatGPT does not perform as well as the commercial systems on biomedical abstracts or Reddit comments but exhibits good results on spoken language. Further, we explore an interesting strategy named pivot prompting for distant languages, which asks ChatGPT to translate the source sentence into a high-resource pivot language before translating it into the target language, improving the translation performance noticeably. With the launch of the GPT-4 engine, the translation performance of ChatGPT is significantly boosted, becoming comparable to commercial translation products, even for distant languages. Human analysis of Google Translate and ChatGPT suggests that ChatGPT with GPT-3.5 tends to generate more hallucinations and mistranslation errors, while ChatGPT with GPT-4 makes the fewest errors. In other words, ChatGPT has already become a good translator. Please refer to our GitHub project for more details: https://github.com/wxjiao/Is-ChatGPT-A-Good-Translator
    Comment: Analyzed/compared the outputs between ChatGPT and Google Translate; both automatic and human evaluation
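    As a rough illustration of the pivot prompting strategy described above, the sketch below chains two translation requests through a high-resource pivot language. It is a minimal sketch assuming an OpenAI-style chat client; the prompt wording, model name, and the `translate` helper are illustrative assumptions, not the authors' implementation (see their repository for the actual prompts).

    ```python
    # Minimal sketch of pivot prompting, assuming an OpenAI-style chat client.
    # The client, model name, and prompt wording are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def translate(text: str, src: str, tgt: str) -> str:
        """Ask the model to translate `text` from `src` to `tgt`."""
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{
                "role": "user",
                "content": f"Please translate the following {src} sentence "
                           f"into {tgt}:\n{text}",
            }],
        )
        return response.choices[0].message.content

    def pivot_translate(text: str, src: str, tgt: str, pivot: str = "English") -> str:
        """Pivot prompting: translate src -> pivot, then pivot -> tgt."""
        intermediate = translate(text, src, pivot)
        return translate(intermediate, pivot, tgt)
    ```

    The appeal of the design is that both hops go through a language the model has seen far more data for, rather than asking for a single direct translation between two low-resource or distant languages.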

    GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher

    Full text link
    Safety lies at the core of the development of Large Language Models (LLMs). There is ample work on aligning LLMs with human ethics and preferences, including data filtering in pretraining, supervised fine-tuning, reinforcement learning from human feedback, and red teaming. In this study, we discover that chatting in cipher can bypass the safety alignment techniques of LLMs, which are mainly conducted in natural languages. We propose a novel framework, CipherChat, to systematically examine the generalizability of safety alignment to non-natural languages -- ciphers. CipherChat enables humans to chat with LLMs through cipher prompts topped with system role descriptions and few-shot enciphered demonstrations. We use CipherChat to assess state-of-the-art LLMs, including ChatGPT and GPT-4, with different representative human ciphers across 11 safety domains in both English and Chinese. Experimental results show that certain ciphers succeed in bypassing the safety alignment of GPT-4 almost 100% of the time in several safety domains, demonstrating the necessity of developing safety alignment for non-natural languages. Notably, we identify that LLMs seem to have a "secret cipher", and propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability. SelfCipher surprisingly outperforms existing human ciphers in almost all cases. Our code and data will be released at https://github.com/RobustNLP/CipherChat.
    Comment: 13 pages, 4 figures, 9 tables
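    To make the prompt construction concrete, here is a minimal sketch of a CipherChat-style prompt using a Caesar cipher with shift 3 as the running example. The system role wording, the demonstrations, and the `build_cipher_prompt` helper are illustrative assumptions; the actual templates live in the authors' repository.

    ```python
    # Minimal sketch of a CipherChat-style prompt, assuming a Caesar cipher
    # (shift 3). The template wording is an illustrative assumption.

    def caesar_encipher(text: str, shift: int = 3) -> str:
        """Shift each ASCII letter forward by `shift` positions."""
        out = []
        for ch in text:
            if ch.isalpha():
                base = ord("a") if ch.islower() else ord("A")
                out.append(chr((ord(ch) - base + shift) % 26 + base))
            else:
                out.append(ch)
        return "".join(out)

    def build_cipher_prompt(query: str, demos: list[str]) -> str:
        """Compose system role description + enciphered demos + enciphered query."""
        lines = [
            "You are an expert on the Caesar cipher. We will communicate only "
            "in Caesar cipher (shift 3). Do not translate; reply in cipher.",
            "Here are some examples:",
        ]
        lines += [caesar_encipher(d) for d in demos]
        lines.append(caesar_encipher(query))
        return "\n".join(lines)

    print(build_cipher_prompt("How are you today?", ["Hello there.", "Nice weather."]))
    ```

    The key ingredients are exactly the three named in the abstract: a system role description telling the model to stay in cipher, a few enciphered demonstrations, and the enciphered query itself.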

    The Earth is Flat? Unveiling Factual Errors in Large Language Models

    Full text link
    Large Language Models (LLMs) like ChatGPT are foundational in various applications due to the extensive knowledge they acquire from pre-training and fine-tuning. Despite this, they are prone to generating factual and commonsense errors, raising concerns that they may mislead users in critical areas like healthcare, journalism, and education. Current methods for evaluating LLMs' veracity are limited by test data leakage or the need for extensive human labor, hindering efficient and accurate error detection. To tackle this problem, we introduce a novel, automatic testing framework, FactChecker, aimed at uncovering factual inaccuracies in LLMs. This framework involves three main steps: First, it constructs a factual knowledge graph by retrieving fact triplets from a large-scale knowledge database. Then, leveraging the knowledge graph, FactChecker employs a rule-based approach to generate three types of questions (Yes-No, Multiple-Choice, and WH questions) that involve single-hop and multi-hop relations, along with correct answers. Lastly, it assesses the LLMs' responses for accuracy using tailored matching strategies for each question type. Our extensive tests on six prominent LLMs, including text-davinci-002, text-davinci-003, ChatGPT (gpt-3.5-turbo, gpt-4), Vicuna, and LLaMA-2, reveal that FactChecker can trigger factual errors in up to 45% of questions in these models. Moreover, we demonstrate that FactChecker's test cases can improve LLMs' factual accuracy through in-context learning and fine-tuning (e.g., llama-2-13b-chat's accuracy increases from 35.3% to 68.5%). We are making all code, data, and results available for future research endeavors.
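    The rule-based question generation step can be illustrated with a small sketch. The relation template, distractors, and `generate_questions` helper below are illustrative assumptions rather than the paper's actual rules; in practice the multiple-choice options would also be shuffled.

    ```python
    # Minimal sketch of rule-based question generation from a fact triplet,
    # illustrating FactChecker's second step. Templates and distractors are
    # illustrative assumptions, not the authors' actual rules.

    TEMPLATES = {
        # One hand-written template per relation, as a rule-based system would have.
        "capital_of": {
            "yes-no": "Is {obj} the capital of {subj}?",
            "wh": "What is the capital of {subj}?",
        },
    }

    def generate_questions(subj, rel, obj, distractors):
        """Turn a (subject, relation, object) triplet into three question types."""
        t = TEMPLATES[rel]
        choices = [obj] + distractors  # correct answer first; shuffle in practice
        mc_body = "  ".join(f"{'ABCD'[i]}) {c}" for i, c in enumerate(choices))
        return {
            "yes-no": (t["yes-no"].format(subj=subj, obj=obj), "Yes"),
            "multiple-choice": (t["wh"].format(subj=subj) + " " + mc_body, "A"),
            "wh": (t["wh"].format(subj=subj), obj),
        }

    for qtype, (q, gold) in generate_questions(
        "France", "capital_of", "Paris", ["Berlin", "Madrid", "Rome"]
    ).items():
        print(f"{qtype}: {q}  | gold answer: {gold}")
    ```

    Because each question carries its gold answer straight from the triplet, the final matching step only has to compare the LLM's response against a known string per question type.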

    An Advanced Adaptive Control of Lower Limb Rehabilitation Robot

    Get PDF
    Rehabilitation robots play an important role in the rehabilitation field, and effective human-robot interaction contributes to their development. Although many studies on human-robot interaction have been carried out, the flexibility and stability of the control systems remain limited. We therefore propose an advanced adaptive control method for a lower limb rehabilitation robot. The method uses a dual closed-loop control strategy based on surface electromyography (sEMG) and plantar pressure to improve the robustness of the adaptive control of rehabilitation robots. First, in the outer control loop, an advanced variable impedance controller based on the sEMG and plantar pressure signals corrects the robot's reference trajectory. Then, in the inner control loop, a sliding mode iterative learning controller (SMILC) based on a variable boundary saturation function tracks the corrected reference trajectory. The experimental results show that, within the designed dual closed-loop strategy, the variable impedance controller effectively reduces trajectory tracking errors and adaptively modifies the reference trajectory in synchrony with the patient's motion intention, and the sliding mode iterative learning controller effectively reduces chattering in the sliding mode control while accurately tracking the rehabilitation robot's reference trajectory. This study improves the human-robot interaction performance of the rehabilitation robot system and broadens its application in the rehabilitation field.
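    As a rough sketch of the outer-loop idea, the snippet below integrates a standard mass-damper-spring impedance model whose stiffness varies with muscle activation, producing a correction that is added to the nominal reference trajectory. The sEMG-to-stiffness mapping and all parameter values are illustrative assumptions, not the paper's identified model.

    ```python
    # Minimal sketch of variable impedance trajectory correction (outer loop).
    # Impedance model: M*dq'' + B*dq' + K*dq = f_ext, with K varied by sEMG.
    # Parameters and the sEMG-to-stiffness mapping are illustrative assumptions.
    import numpy as np

    def impedance_correction(f_ext, semg, dt=0.001, M=1.0, B=20.0,
                             k_min=50.0, k_max=400.0):
        """Integrate the impedance model to get a trajectory correction dq.

        f_ext: interaction force estimated from plantar pressure (N), per step
        semg:  normalized sEMG activation in [0, 1], per step
        """
        dq = dq_dot = 0.0
        corrections = []
        for f, a in zip(f_ext, semg):
            K = k_min + (k_max - k_min) * a       # stiffer when muscle is active
            dq_ddot = (f - B * dq_dot - K * dq) / M
            dq_dot += dq_ddot * dt                # explicit Euler integration
            dq += dq_dot * dt
            corrections.append(dq)
        return np.array(corrections)  # added to the nominal reference trajectory

    # Example: constant 5 N interaction force, moderate muscle activation.
    dq = impedance_correction(f_ext=[5.0] * 1000, semg=[0.3] * 1000)
    print(f"steady-state correction ≈ {dq[-1]:.4f} rad")
    ```

    The point of making K depend on sEMG is that a strongly active patient is given a compliant (low-stiffness) robot that yields to their intention, while a passive patient gets a stiffer trajectory that guides the limb.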

    Spectrophotometric determination of trace nitrite with a novel self-coupling diazotizing reagent: J-acid

    No full text
    A simple and sensitive method for the spectrophotometric determination of nitrite was described, and the optimum reaction conditions along with other important analytical parameters were established. In the presence of potassium bromide at 25°C, nitrite reacts with J-acid in hydrochloric acid to produce a diazonium salt, which then couples with excess J-acid in sodium carbonate solution to yield a red azo compound. At a wavelength of 500 nm, Beer's law was obeyed over the concentration range of 0.02–0.60 mg·L⁻¹. The molar absorptivity was 3.92×10⁴ L·mol⁻¹·cm⁻¹. The method was easily applied to the determination of trace nitrite in environmental water, with recoveries of 98.7–101.2%.
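    As a quick consistency check of the stated figures, assuming the molar absorptivity is expressed per mole of nitrite (M(NO₂⁻) ≈ 46.01 g/mol) and a 1 cm cell, the absorbance expected at the top of the linear range is:

    ```latex
    % Worked Beer's law check at the upper end of the linear range, assuming
    % epsilon is referred to nitrite (M(NO2^-) ~ 46.01 g/mol) and a 1 cm cell.
    \[
      c = \frac{0.60\ \mathrm{mg\,L^{-1}}}{46.01\ \mathrm{g\,mol^{-1}}}
        \approx 1.30 \times 10^{-5}\ \mathrm{mol\,L^{-1}},
    \qquad
      A = \varepsilon c \ell
        = (3.92 \times 10^{4})(1.30 \times 10^{-5})(1)
        \approx 0.51 .
    \]
    ```

    An absorbance of about 0.5 at 0.60 mg·L⁻¹ sits comfortably within the photometrically reliable range, consistent with the reported linearity.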