Zero-shot Conversational Summarization Evaluations with small Large Language Models
Large Language Models (LLMs) exhibit powerful summarization abilities.
However, their capabilities on conversational summarization remain
underexplored. In this work we evaluate LLMs (approx. 10 billion parameters) on
conversational summarization and showcase their performance on various prompts.
We show that the summaries generated by models depend on the instructions and
the performance of LLMs varies with different instructions, sometimes resulting
in a steep drop in ROUGE scores if prompts are not selected carefully. We also
evaluate the models with human evaluations and discuss the limitations of the
models on conversational summarization.
Comment: Accepted at RoF0Mo workshop at NeurIPS 202
End-to-End Evaluation of a Spoken Dialogue System for Learning Basic Mathematics
The advances in language-based Artificial Intelligence (AI) technologies
applied to build educational applications can present AI for social-good
opportunities with a broader positive impact. Across many disciplines,
enhancing the quality of mathematics education is crucial in building critical
thinking and problem-solving skills at younger ages. Conversational AI systems
have started maturing to a point where they could play a significant role in
helping students learn fundamental math concepts. This work presents a
task-oriented Spoken Dialogue System (SDS) built to support play-based learning
of basic math concepts for early childhood education. The system has been
evaluated via real-world deployments at school while the students are
practicing early math concepts with multimodal interactions. We discuss our
efforts to improve the SDS pipeline built for math learning, for which we
explore utilizing MathBERT representations for potential enhancement to the
Natural Language Understanding (NLU) module. We perform an end-to-end
evaluation using real-world deployment outputs from the Automatic Speech
Recognition (ASR), Intent Recognition, and Dialogue Manager (DM) components to
understand how error propagation affects the overall performance in real-world
scenarios.
Comment: Proceedings of the 1st Workshop on Mathematical Natural Language
Processing (MathNLP) at EMNLP 202
Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home
Enriching the quality of early childhood education with interactive at-home math
learning systems, empowered by recent advances in conversational AI
technologies, is slowly becoming a reality. With this motivation, we implement
a multimodal dialogue system to support play-based learning experiences at
home, guiding kids to master basic math concepts. This work explores the Spoken
Language Understanding (SLU) pipeline within a task-oriented dialogue system
developed for Kid Space, with cascading Automatic Speech Recognition (ASR) and
Natural Language Understanding (NLU) components evaluated on our home
deployment data with kids going through gamified math learning activities. We
validate the advantages of a multi-task architecture for NLU and experiment
with a diverse set of pretrained language representations for Intent
Recognition and Entity Extraction tasks in the math learning domain. To
recognize kids' speech in realistic home environments, we investigate several
ASR systems, including the commercial Google Cloud and the latest open-source
Whisper solutions with varying model sizes. We evaluate the SLU pipeline by
testing our best-performing NLU models on noisy ASR output to inspect the
challenges of understanding children for math learning in authentic homes.
Comment: Proceedings of the 18th Workshop on Innovative Use of NLP for
Building Educational Applications (BEA) at ACL 202