Towards responsive Sensitive Artificial Listeners
This paper describes work in the recently started SEMAINE project, which aims to build a set of Sensitive Artificial Listeners: conversational agents designed to sustain an interaction with a human user despite limited verbal skills, through robust real-time recognition and generation of non-verbal behaviour, both while the agent is speaking and while it is listening. We report on data collection and on the design of a system architecture geared towards real-time responsiveness.
Open-domain neural conversational agents: The step towards artificial general intelligence
The development of conversational agents started half a century ago, and since then the field has matured into a technology that is accessible in many aspects of everyday life. This paper surveys the current state of the art in open-domain neural conversational agent research and outlines future research directions towards the creation of Artificial General Intelligence (AGI). In the effort to create a conversational agent able to pass the Turing Test, numerous research efforts focus on open-domain dialogue systems. The paper presents the latest research on neural-network reasoning and logical association, sentiment analysis, and real-time learning approaches applied to open-domain neural conversational agents. To chart future directions, it also discusses current cutting-edge approaches in rationale generation and the state-of-the-art research directions in alternative training methods.
Bias and Fairness in Chatbots: An Overview
Chatbots have been studied for more than half a century. With the rapid development of natural language processing (NLP) technologies in recent years, chatbots built on large language models (LLMs) have received much attention. Compared with traditional ones, modern chatbots are more powerful and have been deployed in real-world applications. There are, however, bias and fairness concerns in modern chatbot design. Due to the huge amounts of training data, extremely large model sizes, and lack of interpretability, bias mitigation and fairness preservation in modern chatbots are challenging. This paper therefore gives a comprehensive overview of bias and fairness in chatbot systems. The history of chatbots and their categories are first reviewed. Then, bias sources and potential harms in applications are analyzed. Considerations in designing fair and unbiased chatbot systems are examined. Finally, future research directions are discussed.
Deep Learning-Based Speech Emotion Recognition Using Librosa
Speech Emotion Recognition is a problem in computational paralinguistics and speech processing that aims to identify and classify the emotions expressed in spoken language. The objective is to infer a speaker's emotional state, such as happiness, anger, sadness, or frustration, from speech patterns such as prosody, pitch, and rhythm. In the modern world, emotion detection has also become an important marketing tool, since products and services can be tailored to best fit a person's interests. This motivated us to work on a project that identifies a person's emotions from speech alone, enabling a variety of AI-related applications. Examples include call centers playing calming music during tense exchanges, or a smart automobile that slows down when the driver is scared or furious. We processed the audio files and extracted features in Python using the Librosa module. Librosa is a Python library for audio and music analysis; it offers the fundamental components required to build music information retrieval systems. Applications of this kind therefore have considerable market potential, helping businesses and improving customer safety.
Applications of Artificial Intelligence in the Treatment of Behavioral and Mental Health Conditions
Introduction
Artificial intelligence (AI) is the branch of science that studies and designs intelligent devices. For individuals unfamiliar with artificial intelligence, the concept of intelligent machines may bring up visions of attractive human-like computers or robots like those described in science fiction. Others may consider AI to be mysterious machinery confined to research facilities, or a technical triumph that lies in the far future. Popular media accounts of aerial drones, autonomous cars, or the potential dangers of developing super-intelligent technologies may have raised some broad awareness of the subject.
Defining a New NLP Playground
The recent explosion of performance of large language models (LLMs) has
changed the field of Natural Language Processing (NLP) more abruptly and
seismically than any other shift in the field's 80-year history. This has
resulted in concerns that the field will become homogenized and
resource-intensive. The new status quo has put many academic researchers,
especially PhD students, at a disadvantage. This paper aims to define a new NLP
playground by proposing 20+ PhD-dissertation-worthy research directions,
covering theoretical analysis, new and challenging problems, learning
paradigms, and interdisciplinary applications.
Comment: EMNLP Findings 2023 "Theme Track: Large Language Models and the Future of NLP"
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World
To facilitate the research on intelligent and human-like chatbots with
multi-modal context, we introduce a new video-based multi-modal dialogue
dataset, called TikTalk. We collect 38K videos from a popular video-sharing
platform, along with 367K conversations posted by users beneath them. Users
engage in spontaneous conversations based on their multi-modal experiences from
watching videos, which helps recreate real-world chitchat context. Compared to
previous multi-modal dialogue datasets, the richer context types in TikTalk
lead to more diverse conversations, but also increase the difficulty in
capturing human interests from intricate multi-modal information to generate
personalized responses. Moreover, external knowledge is more frequently evoked
in our dataset. These facts reveal new challenges for multi-modal dialogue
models. We quantitatively demonstrate the characteristics of TikTalk, propose a
video-based multi-modal chitchat task, and evaluate several dialogue baselines.
Experimental results indicate that the models incorporating large language
models (LLM) can generate more diverse responses, while the model utilizing
knowledge graphs to introduce external knowledge performs the best overall.
Furthermore, no existing model solves all the above challenges well, and there is still much room for future improvement, even for LLMs with visual extensions. Our dataset is available at \url{https://ruc-aimind.github.io/projects/TikTalk/}.
Comment: Accepted to ACM Multimedia 202