Recognising affect, which encompasses emotions, moods, and feelings, plays a
pivotal role in human communication. In the realm of conversational artificial
intelligence (AI), the ability to discern and respond to human affective cues
is a critical factor for creating engaging and empathetic interactions. This
study investigates the capacity of large language models (LLMs) to recognise
human affect in conversations, with a focus on both open-domain chit-chat
dialogues and task-oriented dialogues. Leveraging three diverse datasets,
namely IEMOCAP, EmoWOZ, and DAIC-WOZ, covering a spectrum of dialogues from
casual conversations to clinical interviews, we evaluate and compare LLMs'
performance in affect recognition. Our investigation explores the zero-shot and
few-shot capabilities of LLMs through in-context learning (ICL) as well as
their model capacities through task-specific fine-tuning. Additionally, this
study takes into account the potential impact of automatic speech recognition
(ASR) errors on LLM predictions. With this work, we aim to shed light on the
extent to which LLMs can replicate human-like affect recognition capabilities
in conversations.