29 research outputs found

    Chain of Thought Explanation for Dialogue State Tracking

    Full text link
    Dialogue state tracking (DST) aims to record user queries and goals during a conversational interaction achieved by maintaining a predefined set of slots and their corresponding values. Current approaches decide slot values opaquely, while humans usually adopt a more deliberate approach by collecting information from relevant dialogue turns and then reasoning the appropriate values. In this work, we focus on the steps needed to figure out slot values by proposing a model named Chain-of-Thought-Explanation (CoTE) for the DST task. CoTE, which is built on the generative DST framework, is designed to create detailed explanations step by step after determining the slot values. This process leads to more accurate and reliable slot values. More-over, to improve the reasoning ability of the CoTE, we further construct more fluent and high-quality explanations with automatic paraphrasing, leading the method CoTE-refined. Experimental results on three widely recognized DST benchmarks-MultiWOZ 2.2, WoZ 2.0, and M2M-demonstrate the remarkable effectiveness of the CoTE. Furthermore, through a meticulous fine-grained analysis, we observe significant benefits of our CoTE on samples characterized by longer dialogue turns, user responses, and reasoning steps

    From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

    Full text link
    Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents. However, there is still a wide gap between the performance of recent MLLM-based applications and the expectation of the broad public, even though the most powerful OpenAI's GPT-4 and Google's Gemini have been deployed. This paper strives to enhance understanding of the gap through the lens of a qualitative study on the generalizability, trustworthiness, and causal reasoning capabilities of recent proprietary and open-source MLLMs across four modalities: ie, text, code, image, and video, ultimately aiming to improve the transparency of MLLMs. We believe these properties are several representative factors that define the reliability of MLLMs, in supporting various downstream applications. To be specific, we evaluate the closed-source GPT-4 and Gemini and 6 open-source LLMs and MLLMs. Overall we evaluate 230 manually designed cases, where the qualitative results are then summarized into 12 scores (ie, 4 modalities times 3 properties). In total, we uncover 14 empirical findings that are useful to understand the capabilities and limitations of both proprietary and open-source MLLMs, towards more reliable downstream multi-modal applications

    Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study

    No full text
    While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets: (ReCoNLL, PLONER) for the future research at our project page: http://pfliu.com/InterpretNER/

    Neural Networks Incorporating Dictionaries for Chinese Word Segmentation

    No full text
    In recent years, deep neural networks have achieved significant success in Chinese word segmentation and many other natural language processing tasks. Most of these algorithms are end-to-end trainable systems and can effectively process and learn from large scale labeled datasets. However, these methods typically lack the capability of processing rare words and data whose domains are different from training data. Previous statistical methods have demonstrated that human knowledge can provide valuable information for handling rare cases and domain shifting problems. In this paper, we seek to address the problem of incorporating dictionaries into neural networks for the Chinese word segmentation task. Two different methods that extend the bi-directional long short-term memory neural network are proposed to perform the task. To evaluate the performance of the proposed methods, state-of-the-art supervised models based methods and domain adaptation approaches are compared with our methods on nine datasets from different domains. The experimental results demonstrate that the proposed methods can achieve better performance than other state-of-the-art neural network methods and domain adaptation approaches in most cases

    Adaptive Co-attention Network for Named Entity Recognition in Tweets

    No full text
    In this study, we investigate the problem of named entity recognition for tweets. Named entity recognition is an important task in natural language processing and has been carefully studied in recent decades.  Previous named entity recognition methods usually only used the textual content when processing tweets. However, many tweets contain not only textual content, but also images. Such visual information is also valuable in the name entity recognition task. To make full use of textual and visual information, this paper proposes a novel method to process tweets that contain multimodal information. We extend a bi-directional long short term memory network with conditional random fields and an adaptive co-attention network to achieve this task.  To evaluate the proposed methods, we constructed a large scale labeled dataset that contained multimodal tweets. Experimental results demonstrated that the proposed method could achieve a better performance than the previous methods in most cases

    Palmitate Promotes Autophagy and Apoptosis Through ROS-Dependent JNK and p38 MAPK

    No full text
    Palmitate (PA), one of the most prevalent saturated fatty acids, causes myocardial dysfunction. However, the mechanisms by which PA induces cell apoptosis and autophagy remain to be elucidated. We showed that autophagy was induced in an mTORC1-dependent way and played a protective role against PA-induced apoptosis, which was verified by pretreatment with 3-methyladenine (3MA) and rapamycin. However, p62 began to accumulate after 18 h treatment with PA, suggesting prolonged exposure to PA lead to an impairment of autophagic flux. PA enhanced ROS production as well as activated p38-mitogen-activated protein kinase (p38 MAPK) and c-jun NH2 terminal kinases (JNKs). The antioxidant N-Acety-L-Cysteine (NAC) was found to attenuate the JNK and p38 MAPK activation with a concomitant reduction of PA-induced autophagy and apoptosis. Furthermore, both JNK and p38 MAPK inhibitors were shown to directly abrogate caspase 7 cleavage as well as the conversion of LC3BI to LC3BII. Thus, we demonstrate that PA stimulates autophagy and apoptosis via ROS-dependent JNK and p38 MAPK pathways
    corecore