Is ChatGPT A Good Translator? Yes With GPT-4 As The Engine
This report provides a preliminary evaluation of ChatGPT for machine
translation, including translation prompt, multilingual translation, and
translation robustness. We adopt the prompts advised by ChatGPT to trigger its
translation ability and find that the candidate prompts generally work well
with minor performance differences. By evaluating on a number of benchmark test
sets, we find that ChatGPT performs competitively with commercial translation
products (e.g., Google Translate) on high-resource European languages but lags
behind significantly on low-resource or distant languages. As for the
translation robustness, ChatGPT does not perform as well as the commercial
systems on biomedical abstracts or Reddit comments but exhibits good results on
spoken language. Further, we explore an interesting strategy named pivot prompting
for distant languages, which asks ChatGPT to
translate the source sentence into a high-resource pivot language before translating it into
the target language, improving the translation performance noticeably. With the
launch of the GPT-4 engine, the translation performance of ChatGPT is
significantly boosted, becoming comparable to commercial translation products,
even for distant languages. Human analysis on Google Translate and ChatGPT
suggests that ChatGPT with GPT-3.5 tends to generate more hallucinations and
mis-translation errors, while ChatGPT with GPT-4 makes the fewest errors. In other
words, ChatGPT has already become a good translator. Please refer to our GitHub
project for more details:
https://github.com/wxjiao/Is-ChatGPT-A-Good-Translator
Comment: Analyzed/compared the outputs between ChatGPT and Google Translate;
both automatic and human evaluation
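The pivot strategy described in this abstract can be illustrated with a small prompt builder. This is only a sketch: the function name `build_pivot_prompt` and the exact prompt wording are assumptions made for illustration, not the prompts used by the authors.

```python
def build_pivot_prompt(sentence: str, source: str, pivot: str, target: str) -> str:
    """Build a two-step translation prompt that routes through a pivot language.

    The wording is a hypothetical template, not the authors' prompt.
    """
    return (
        f"Translate the following {source} sentence into {pivot} first, "
        f"and then translate the {pivot} result into {target}.\n"
        f"{source}: {sentence}"
    )


# Example: a distant language pair routed through English as the pivot.
prompt = build_pivot_prompt("Guten Morgen.", "German", "English", "Chinese")
print(prompt)
```

The only design point is that the model is asked for the pivot translation before the target translation within a single request.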
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Safety lies at the core of the development of Large Language Models (LLMs).
There is ample work on aligning LLMs with human ethics and preferences,
including data filtering in pretraining, supervised fine-tuning, reinforcement
learning from human feedback, and red teaming, etc. In this study, we discover
that chat in cipher can bypass the safety alignment techniques of LLMs, which
are mainly conducted in natural languages. We propose a novel framework
CipherChat to systematically examine the generalizability of safety alignment
to non-natural languages -- ciphers. CipherChat enables humans to chat with
LLMs through cipher prompts topped with system role descriptions and few-shot
enciphered demonstrations. We use CipherChat to assess state-of-the-art LLMs,
including ChatGPT and GPT-4, on different representative human ciphers across
11 safety domains in both English and Chinese. Experimental results show that
certain ciphers succeed almost 100% of the time in bypassing the safety alignment
of GPT-4 in several safety domains, demonstrating the necessity of developing
safety alignment for non-natural languages. Notably, we identify that LLMs seem
to have a "secret cipher", and propose a novel SelfCipher that uses only role
play and several demonstrations in natural language to evoke this capability.
SelfCipher surprisingly outperforms existing human ciphers in almost all cases.
Our code and data will be released at https://github.com/RobustNLP/CipherChat.
Comment: 13 pages, 4 figures, 9 tables
The Earth is Flat? Unveiling Factual Errors in Large Language Models
Large Language Models (LLMs) like ChatGPT are foundational in various
applications due to their extensive knowledge from pre-training and
fine-tuning. Despite this, they are prone to generating factual and commonsense
errors, raising concerns that they may mislead users in critical areas like
healthcare, journalism, and education. Current methods for evaluating LLMs' veracity are
limited by test data leakage or the need for extensive human labor, hindering
efficient and accurate error detection. To tackle this problem, we introduce a
novel, automatic testing framework, FactChecker, aimed at uncovering factual
inaccuracies in LLMs. This framework involves three main steps: First, it
constructs a factual knowledge graph by retrieving fact triplets from a
large-scale knowledge database. Then, leveraging the knowledge graph,
FactChecker employs a rule-based approach to generate three types of questions
(Yes-No, Multiple-Choice, and WH questions) that involve single-hop and
multi-hop relations, along with correct answers. Lastly, it assesses the LLMs'
responses for accuracy using tailored matching strategies for each question
type. Our extensive tests on six prominent LLMs, including text-davinci-002,
text-davinci-003, ChatGPT (gpt-3.5-turbo, gpt-4), Vicuna, and LLaMA-2, reveal
that FactChecker can trigger factual errors in up to 45% of questions in these
models. Moreover, we demonstrate that FactChecker's test cases can improve
LLMs' factual accuracy through in-context learning and fine-tuning (e.g.,
llama-2-13b-chat's accuracy increases from 35.3% to 68.5%). We are making all
code, data, and results available for future research endeavors.
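The three steps in this abstract (fact triplets, rule-based question generation, per-type answer matching) can be sketched as follows. Everything here is a hypothetical illustration: the `Triple` class, the question templates, and the `matches` rules are assumptions, not the FactChecker implementation, and only single-hop questions are covered.

```python
import random
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One (subject, relation, object) fact, e.g. (France, capital, Paris)."""
    subject: str
    relation: str
    obj: str


def yes_no_question(t: Triple) -> tuple[str, str]:
    # Hypothetical template; the expected answer for a true fact is "Yes".
    return f"Is {t.obj} the {t.relation} of {t.subject}?", "Yes"


def wh_question(t: Triple) -> tuple[str, str]:
    return f"What is the {t.relation} of {t.subject}?", t.obj


def multiple_choice_question(
    t: Triple, distractors: list[str], rng: random.Random
) -> tuple[str, str]:
    # Shuffle the correct object among the distractors and label the options.
    options = [t.obj, *distractors]
    rng.shuffle(options)
    labels = string.ascii_uppercase
    listed = " ".join(f"{labels[i]}. {o}" for i, o in enumerate(options))
    answer = labels[options.index(t.obj)]
    return f"What is the {t.relation} of {t.subject}? {listed}", answer


def matches(kind: str, response: str, answer: str) -> bool:
    """Naive per-type matching (an assumption, not the paper's strategy)."""
    text = response.strip().lower()
    if kind in ("yes_no", "multiple_choice"):
        return text.startswith(answer.lower())
    return answer.lower() in text  # WH questions: answer appears anywhere


fact = Triple("France", "capital", "Paris")
question, expected = wh_question(fact)
print(question, "->", expected)
```

A multi-hop question would chain two triples that share an entity before applying a template; that step is omitted here.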
An Advanced Adaptive Control of Lower Limb Rehabilitation Robot
Rehabilitation robots play an important role in the rehabilitation field, and effective human-robot interaction helps advance their development. Although many studies of human-robot interaction have been carried out, the flexibility and stability of the control system remain limited. Therefore, we proposed an advanced adaptive control method for a lower limb rehabilitation robot. The method was devised with a dual closed-loop control strategy based on surface electromyography (sEMG) and plantar pressure to improve the robustness of adaptive control for rehabilitation robots. First, in the outer control loop, an advanced variable impedance controller based on sEMG and plantar pressure was designed to correct the robot's reference trajectory. Then, in the inner control loop, a sliding mode iterative learning controller (SMILC) based on a variable boundary saturation function was designed to track the reference trajectory. The experimental results showed that, in the designed dual closed-loop control strategy, the variable impedance controller can effectively reduce trajectory tracking errors and adaptively modify the reference trajectory in synchrony with the patient's motion intention, while the sliding mode iterative learning controller can effectively reduce chattering in sliding mode control and accurately track the rehabilitation robot's reference trajectory. This study can improve the performance of human-robot interaction in rehabilitation robot systems and expand their application in the rehabilitation field.
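The chattering reduction mentioned in this abstract is commonly obtained by replacing the discontinuous sign function in the switching term with a saturation function over a boundary layer. The sketch below shows that standard boundary-layer idea only; the names `sat` and `switching_term` are hypothetical, and how the authors vary the boundary is not stated in the abstract, so the boundary is simply passed in as a parameter.

```python
def sat(s: float, boundary: float) -> float:
    """Saturation function: linear inside the boundary layer, clipped to +/-1 outside."""
    if boundary <= 0:
        raise ValueError("boundary must be positive")
    return max(-1.0, min(1.0, s / boundary))


def switching_term(s: float, gain: float, boundary: float) -> float:
    """Smoothed switching control -gain * sat(s / boundary).

    With sign(s) the output would jump between -gain and +gain near s = 0,
    which causes chattering; sat() makes the transition continuous.
    """
    return -gain * sat(s, boundary)


# Near the sliding surface the control is proportional to s rather than bang-bang.
print(switching_term(0.05, gain=2.0, boundary=0.1))
print(switching_term(0.50, gain=2.0, boundary=0.1))
```

A wider boundary gives a smoother control signal at the cost of a larger steady-state tracking band, which is the usual motivation for letting the boundary vary.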
Spectrophotometric determination of trace nitrite with a novel self-coupling diazotizing reagent: J-acid
A simple and sensitive method for the spectrophotometric determination of nitrite was described, and the optimum reaction conditions and other important analytical parameters were established. In the presence of potassium bromide at 25°C, nitrite reacted with J-acid in hydrochloric acid to produce a diazonium salt, which then coupled with excess J-acid in sodium carbonate solution to yield a red azo compound. At a wavelength of 500 nm, Beer's law was obeyed over the concentration range of 0.02 – 0.60 mg∙L⁻¹. The molar absorptivity was 3.92 × 10⁴ L∙mol⁻¹∙cm⁻¹. The method was readily applied to the determination of trace nitrite in environmental water, with recoveries of 98.7 – 101.2%.
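As a worked illustration of Beer's law with the figures quoted in this abstract, the sketch below converts a mass concentration to an expected absorbance and back. Two assumptions are not in the source: the concentration is taken to be expressed as the nitrite ion NO₂⁻ (molar mass 46.005 g/mol), and the cuvette path length is taken to be 1 cm.

```python
NITRITE_MOLAR_MASS = 46.005   # g/mol for NO2-, assumed basis of the mg/L values
MOLAR_ABSORPTIVITY = 3.92e4   # L mol^-1 cm^-1, from the abstract (500 nm)
PATH_LENGTH_CM = 1.0          # assumed cuvette path length


def absorbance(conc_mg_per_l: float) -> float:
    """Beer's law A = epsilon * c * l, with c converted from mg/L to mol/L."""
    molar_conc = conc_mg_per_l * 1e-3 / NITRITE_MOLAR_MASS
    return MOLAR_ABSORPTIVITY * molar_conc * PATH_LENGTH_CM


def concentration_mg_per_l(measured_absorbance: float) -> float:
    """Invert Beer's law to recover the mass concentration in mg/L."""
    molar_conc = measured_absorbance / (MOLAR_ABSORPTIVITY * PATH_LENGTH_CM)
    return molar_conc * NITRITE_MOLAR_MASS * 1e3


# Expected absorbance at both ends of the reported linear range.
for conc in (0.02, 0.60):
    print(f"{conc:.2f} mg/L -> A = {absorbance(conc):.3f}")
```

Under these assumptions the upper end of the linear range corresponds to an absorbance of roughly 0.5.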