Crowdsourcing for Creating a Dataset for Training a Medication Chatbot
To facilitate interaction with mobile health applications, chatbots are increasingly used. They realize the interaction as a dialog in which users ask questions and receive answers from the chatbot. A major challenge is creating a comprehensive knowledge base of patterns and rules representing the possible user queries the chatbot has to understand and interpret. In this work, we assess how crowdsourcing can be used to generate examples of possible user queries for a medication chatbot. Within one week, crowd workers generated 2,738 user questions. The examples provide a large variety of possible formulations and information needs. As a next step, these example user queries will be used to train our medication chatbot.
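As an illustration of how such crowdsourced questions might feed a chatbot's training set, the sketch below deduplicates and labels question variants per intent. The intents, example questions, and function name are invented for illustration; they are assumptions, not the paper's actual pipeline.

```python
# Hypothetical sketch: turn crowdsourced question variants into labeled
# (text, intent) training pairs. Intents and questions are invented.

def build_training_pairs(crowd_answers):
    """crowd_answers maps an intent name to the raw user-question
    variants that crowd workers produced for that intent."""
    seen = set()
    pairs = []
    for intent, questions in crowd_answers.items():
        for q in questions:
            norm = " ".join(q.lower().split())  # collapse whitespace, lowercase
            if norm and norm not in seen:       # drop exact duplicates
                seen.add(norm)
                pairs.append((norm, intent))
    return pairs

examples = {
    "dosage_question": ["How much ibuprofen can I take?",
                        "how much  ibuprofen can I take?",   # near-duplicate
                        "What is the maximum daily dose?"],
    "side_effects": ["Does this medication cause drowsiness?"],
}
pairs = build_training_pairs(examples)
```

In practice such pairs would then be passed to whatever NLU training framework the chatbot uses; the deduplication step matters because crowd workers often submit trivially repeated formulations.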
Chatting with Confidence: A Review on the Impact of User Interface, Trust, and User Experience in Chatbots, and a Proposal of a Redesigned Prototype
As artificial intelligence (AI) becomes more prevalent in our daily lives, trust has become a critical issue in ensuring that AI systems are reliable, ethical, and beneficial to society. This paper explores the role of user experience (UX) in shaping users' trust in chat AI. Chat AI has become increasingly popular as a communication tool, but users often struggle to trust the technology. By analyzing previous literature in this area, the paper examines how different design elements, such as conversational style, interface, and feedback mechanisms, affect users' perception of trust in chat AI. Research demonstrates that UX plays a critical role in users' trust in chat AI, with factors such as transparency, responsiveness, and empathy contributing to higher levels of trust. Based on these findings, a redesigned prototype of the popular chat AI ChatGPT was created in Figma.
Focusing on Improving the Expression Diversity of Collected Results
Thesis (Master's) -- Seoul National University, Graduate School of Convergence Science and Technology, Digital Information Convergence program, August 2019. Advisor: 이중식 (Joongseek Lee). A conversational agent is a system that receives natural language input from a user, understands the intent, and performs the corresponding function. As speech recognition technology has advanced and IT companies have released development platforms, service development using conversational agents has become widespread.
Developing such a conversational agent requires a large amount of training data. Conversational agents currently offer users an interaction style that resembles conversing with a human being. Accordingly, the agent must understand the user's intent, and intent understanding is learned from a large and varied set of training examples.
However, collecting training data for conversational agent development is very difficult because of the expressive diversity of natural language and the limitations of existing collection methods. Expressive diversity means that utterances with the same meaning can take different structures, and training data collection should take this characteristic into account. Although some collection methods have been proposed, problems of time, cost, and accessibility remain.
With the recent growth of artificial intelligence, crowdsourcing has matured to the point where it may solve these problems. Crowdsourcing has the advantage of delegating problems that are hard for computers to people, and of collecting data from a large number of people at low cost. In practice, the possibility of using crowdsourcing for training data acquisition has been raised.
However, although the quality of crowdsourced results is strongly influenced by task design and the diversity of the training data matters, task design methods remain poorly understood. This thesis therefore focuses on improving expression diversity: it examines how task design elements affect training data collection and then proposes a design method for collecting training data effectively.
For this purpose, three design elements (task amount, bonus compensation method, and social-proof-based explanation method) were selected, and one experiment was conducted for each. A paraphrasing task, whose suitability for training data acquisition has been demonstrated, was used; 1,473 data items were collected from 480 participants on Amazon Mechanical Turk at a cost of $73.65. The collected data were analyzed with four indicators: semantic equivalence, diversity, error rate, and execution time.
The analysis showed that, as the amount of the task increased, it became harder to obtain data with the intended meaning. Offering a bonus increased the efficiency of collection. Finally, for social-proof-based explanations, there was a trade-off between diversity and efficiency. Individual differences among participants and the pressure of collection requirements are discussed, and an integrated task design method is proposed.
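The abstract does not define how its diversity indicator is computed. A common lexical proxy for the diversity of collected paraphrases is distinct-n, the fraction of unique n-grams among all n-grams produced by the crowd. The sketch below assumes that operationalization; it is an illustration, not the thesis's actual metric.

```python
def distinct_n(sentences, n=2):
    """Fraction of unique n-grams among all n-grams in the collected
    paraphrases; a common proxy for lexical diversity (assumed here)."""
    all_ngrams = []
    for s in sentences:
        tokens = s.lower().split()
        all_ngrams.extend(tuple(tokens[i:i + n])
                          for i in range(len(tokens) - n + 1))
    if not all_ngrams:
        return 0.0
    return len(set(all_ngrams)) / len(all_ngrams)

paraphrases = [
    "set an alarm for seven",
    "wake me up at seven",
    "set an alarm for seven",   # exact repeat lowers diversity
]
score = distinct_n(paraphrases, n=2)
```

A crowd that keeps resubmitting the same phrasing drives the score toward zero, while fully novel paraphrases keep it near one, which matches the diversity-versus-efficiency trade-off the thesis reports.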
This thesis is academically significant in that, while prior work has mainly examined whether training data can be collected at all, it studies how to improve the quality of what is collected. It also has practical and timely value in addressing a problem actually experienced in industry. Finally, it has interdisciplinary significance in combining social psychology theory, HCI, and engineering.
Effective crowdsourced generation of training data for chatbots natural language understanding
Chatbots are text-based conversational agents. Natural Language Understanding (NLU) models are used to extract meaning and intention from user messages sent to chatbots. The user experience of chatbots largely depends on the performance of the NLU model, which itself largely depends on the initial dataset the model is trained with. The training data should cover the diversity of real user requests the chatbot will receive. Obtaining such data is a challenging task even for big corporations. We introduce a generic approach to generating training data with the help of crowd workers, and we discuss the approach's workflow and the design of crowdsourcing tasks that assure high quality. We evaluate the approach by running an experiment collecting data for 9 different intents. We use the collected training data to train a natural language understanding model. We analyse the performance of the model under different training set sizes for each intent. We provide recommendations on selecting an optimal confidence threshold for predicting intents, based on the cost model of incorrect and unknown predictions.
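The final recommendation, choosing a confidence threshold based on the relative costs of incorrect and unknown predictions, can be sketched as a sweep over candidate thresholds on a validation set: predictions below the threshold are rejected as "unknown" (e.g. handed to a fallback) at a small cost, while accepted-but-wrong predictions incur a larger cost. The cost values, data, and function name below are illustrative assumptions, not the paper's.

```python
def best_threshold(preds, cost_incorrect=5.0, cost_unknown=1.0):
    """preds: list of (confidence, is_correct) pairs from a validation set.
    Predictions below the threshold are rejected as 'unknown' and cost
    cost_unknown; accepted predictions cost cost_incorrect only when wrong.
    Returns the (threshold, total_cost) pair minimizing total cost."""
    # Candidate thresholds: every observed confidence, plus the extremes
    # "accept everything" (0.0) and "reject everything" (> 1.0).
    candidates = sorted({c for c, _ in preds} | {0.0, 1.01})
    best = (0.0, float("inf"))
    for t in candidates:
        cost = 0.0
        for conf, correct in preds:
            if conf < t:
                cost += cost_unknown      # rejected -> fallback cost
            elif not correct:
                cost += cost_incorrect    # confidently wrong -> larger cost
        if cost < best[1]:
            best = (t, cost)
    return best

val = [(0.95, True), (0.90, True), (0.55, False), (0.40, False), (0.85, True)]
choice = best_threshold(val)
```

With these illustrative costs the sweep rejects the two low-confidence wrong predictions; raising the cost of "unknown" relative to "incorrect" pushes the chosen threshold back toward accepting everything, which is exactly the dependence on the cost model that the abstract describes.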