
    Robust Dialog Management Through A Context-centric Architecture

    This dissertation presents and evaluates a method of managing spoken dialog interactions with robust attention to fulfilling the human user’s goals in the presence of speech recognition limitations. Assistive speech-based embodied conversation agents are computer-based entities that interact with humans via spoken input and output to help accomplish a task or communicate information. A challenging aspect of this task involves open dialog, where the user is free to converse in an unstructured manner. With this style of input, the machine’s ability to communicate may be hindered by poor reception of utterances, caused by a user’s inadequate command of the language and/or faults in the speech recognition facilities. Since speech-based input is emphasized, this endeavor involves the fundamental issues of natural language processing, automatic speech recognition, and dialog system design. Driven by Context-Based Reasoning, the presented dialog manager features a discourse model that implements mixed-initiative conversation with a focus on the user’s assistive needs. The discourse behavior must maintain a sense of generality, so that the assistive nature of the system remains constant regardless of its knowledge corpus. The dialog manager was encapsulated in a speech-based embodied conversation agent platform for prototyping and testing. A battery of user trials was performed on this agent to evaluate its performance as a robust, domain-independent, speech-based interaction entity capable of satisfying the needs of its users.

    I feel you: the design and evaluation of a domotic affect-sensitive spoken conversational agent

    We describe work on infusing emotion into a limited-task autonomous spoken conversational agent situated in the domestic environment, using a need-inspired, task-independent emotion model (NEMO). To demonstrate the generation of affect through the model, we describe integrating it with a natural-language, mixed-initiative HiFi-control spoken conversational agent (SCA). NEMO and the host system communicate externally, removing the need to modify the Dialog Manager, as most existing dialog systems must do in order to become adaptive. The first part of the paper concerns the integration between NEMO and the host agent. The second part summarizes work on automatic affect prediction, namely frustration and contentment, from dialog features, a non-conventional source, in an attempt to move towards a more user-centric approach. The final part reports the evaluation results from a user study in which both versions of the agent (non-adaptive and emotionally adaptive) were compared. The results provide substantial evidence of the benefits of adding emotion to a spoken conversational agent, especially in mitigating users' frustration and, ultimately, improving their satisfaction.
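    The decoupled architecture this abstract describes, where the emotion model sits outside the host system so the Dialog Manager itself is never modified, can be sketched as an observer pattern. All class and field names below are illustrative assumptions, not NEMO's actual interface:

```python
class DialogManager:
    """Host dialog manager: publishes per-turn dialog features to external
    listeners and otherwise runs unchanged."""

    def __init__(self):
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def end_turn(self, features: dict):
        # Broadcast this turn's dialog features to every external observer.
        for listener in self.listeners:
            listener.on_turn(features)


class ExternalEmotionModel:
    """Stand-in for an external affect model predicting frustration vs.
    contentment from dialog features (a toy rule, not NEMO's actual logic)."""

    def __init__(self):
        self.state = "contentment"

    def on_turn(self, features: dict):
        # Toy heuristic: repeated recognition errors suggest rising frustration.
        if features.get("consecutive_asr_errors", 0) >= 2:
            self.state = "frustration"
        else:
            self.state = "contentment"
```

    The point of the design is that adaptivity is added by subscription rather than by editing the Dialog Manager's internals, which is what the abstract contrasts with most existing dialog systems.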

    An Evaluation of Three Online Chatbots

    Chatbots enable machines to emulate human conversation. While research has examined how human-like communication with chatbots can be, comparisons of such systems with humans have heretofore not accounted for abnormal behavior from users. For example, people using a chatbot might be lying, or trying, in turn, to imitate a computer's responses. Results of a study comparing transcripts from three chatbots and two humans show that student evaluators correctly identified two of the computer transcripts but failed on one. Further, they incorrectly guessed that one of the humans was a chatbot. The study also presents a detailed analysis of the 11 responses from the agents.

    Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting

    Most existing stylistic text rewriting methods and evaluation metrics operate at the sentence level, but ignoring the broader context of the text can lead to preferring generic, ambiguous, and incoherent rewrites. In this paper, we investigate integrating the preceding textual context into both the rewriting and evaluation stages of stylistic text rewriting, and introduce a new composite contextual evaluation metric, CtxSimFit, that combines similarity to the original sentence with contextual cohesiveness. We comparatively evaluate non-contextual and contextual rewrites in formality, toxicity, and sentiment transfer tasks. Our experiments show that humans significantly prefer contextual rewrites as more fitting and natural than non-contextual ones, yet existing sentence-level automatic metrics (e.g., ROUGE, SBERT) correlate poorly with human preferences (ρ = 0–0.3). In contrast, human preferences are much better reflected both by our novel CtxSimFit (ρ = 0.7–0.9) and by proposed context-infused versions of common metrics (ρ = 0.4–0.7). Overall, our findings highlight the importance of integrating context into the generation and especially the evaluation stages of stylistic text rewriting. Comment: EMNLP 2023 main, camera-ready.
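    The abstract does not give CtxSimFit's actual formulation, only that it combines similarity to the original sentence with contextual cohesiveness. Purely to illustrate the general shape of such a composite metric, here is a minimal sketch; the blending weight and the bag-of-words cosine (a stand-in for an embedding-based similarity such as SBERT) are assumptions, not the paper's definition:

```python
from collections import Counter
from math import sqrt

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (crude stand-in for an embedding model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def composite_contextual_score(original: str, rewrite: str, context: str,
                               alpha: float = 0.5) -> float:
    """Blend similarity-to-original with cohesiveness to the preceding context.

    alpha and both component scorers are illustrative, not CtxSimFit itself.
    """
    return alpha * cosine(rewrite, original) + (1 - alpha) * cosine(rewrite, context)
```

    The key property such a metric adds over sentence-level scores like ROUGE is the second term: a rewrite can no longer score highly on fidelity alone while clashing with the surrounding discourse.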

    A TOGAF Based Chatbot Evaluation Metrics: Insights from Literature Review

    Chatbots are used for basic conversational functionality and task performance in today's world. With the surge in chatbot use, many design features have emerged to meet rising demand and increasing complexity. Researchers have grappled with modeling and evaluating these tools because of the vast number of metrics associated with measuring their success. This paper conducted a literature survey to identify the various conversational metrics used to evaluate chatbots. The selected evaluation metrics were mapped to the layers of The Open Group Architecture Framework (TOGAF). The TOGAF architecture helped us divide the metrics according to the facets critical to developing successful chatbot applications. Our results show that metrics related to the business layer have been well studied, whereas metrics associated with the data, information, and system layers warrant more research. As chatbots become more complex, success metrics across the intermediate layers may assume greater significance.
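    The layer-mapping exercise the abstract describes can be sketched as a simple lookup structure. The specific metric-to-layer assignments below are illustrative assumptions for demonstration only, not the mapping derived in the survey:

```python
# Illustrative grouping of chatbot evaluation metrics by the layers the
# abstract names (business, data, information, system). Assignments are
# hypothetical examples, not the survey's results.
METRICS_BY_LAYER = {
    "business":    ["task completion rate", "user satisfaction"],
    "data":        ["training-corpus coverage", "annotation quality"],
    "information": ["intent accuracy", "response relevance"],
    "system":      ["response latency", "uptime"],
}

def understudied_layers(studied_metrics: set[str]) -> list[str]:
    """Return layers none of whose metrics appear in the studied set,
    i.e. the layers a survey would flag as warranting more research."""
    return [layer for layer, metrics in METRICS_BY_LAYER.items()
            if not any(m in studied_metrics for m in metrics)]
```

    A mapping like this makes the paper's gap analysis mechanical: once each surveyed metric is filed under a layer, the under-studied layers fall out of a set difference.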