5 research outputs found

    Crowdsourcing for Creating a Dataset for Training a Medication Chatbot

    Chatbots are increasingly used to facilitate interaction with mobile health applications. They realize the interaction as a dialog in which users ask questions and get answers from the chatbot. A big challenge is creating a comprehensive knowledge base of patterns and rules that represent the possible user queries the chatbot has to understand and interpret. In this work, we assess how crowdsourcing can be used to generate examples of possible user queries for a medication chatbot. Within one week, crowdworkers generated 2,738 user questions. The examples provide a large variety of possible formulations and information needs. As a next step, these example queries will be used to train our medication chatbot.

    Chatting with Confidence: A Review on the Impact of User Interface, Trust, and User Experience in Chatbots, and a Proposal of a Redesigned Prototype

    As artificial intelligence (AI) becomes more prevalent in our daily lives, trust has become a critical issue in ensuring that AI systems are reliable, ethical, and beneficial to society. This paper explores the role of user experience (UX) in shaping users' trust in chat AI. Chat AI has become increasingly popular as a communication tool, but users often struggle to trust the technology. By analyzing previous literature in this area, the paper examines how design elements such as conversational style, interface, and feedback mechanisms affect users' perception of trust in chat AI. The reviewed research indicates that UX plays a critical role in users' trust in chat AI, with factors such as transparency, responsiveness, and empathy contributing to higher levels of trust. Based on these findings, a redesigned prototype of the popular chat AI application ChatGPT was created in Figma.

    Focusing on Improving the Diversity of Expression in Collected Results (수집 결과의 표현 다양성 향상을 중심으로)

    Master's thesis, Seoul National University Graduate School, Graduate School of Convergence Science and Technology, Department of Convergence Science (Digital Information Convergence major), August 2019. 이중식.

    A conversational agent is a system that receives natural language input from the user and identifies the intent in order to perform a function. With advances in speech recognition technology and the development platforms offered by large IT companies, building services on conversational agents is becoming common. Developing such an agent requires a large amount of training data. Conversational agents currently let users interact with them as if they were talking to a human. The agent therefore has to understand the user's intent, and intent understanding is learned from large and varied training data. Collecting that data, however, is very difficult because of the diversity of expression in natural language and the limitations of collection methods. Diversity of expression means that utterances with the same meaning can have different structures, and training data collection should take this characteristic into account. Some collection methods have been proposed, but they raise problems of time, cost, and accessibility. As artificial intelligence development has grown, the crowdsourcing field has developed and offers a possible way to solve these problems. Crowdsourcing has the advantage of having people solve problems that are difficult for computers and of collecting data from many people at low cost. In practice, the use of crowdsourcing for training data acquisition has been proposed. However, although the results of crowdsourcing are strongly influenced by task design and the diversity of training data is important, task design methods are not well understood.

    Therefore, this thesis focuses on improving diversity of expression, examines the effect of task design elements on training data collection, and suggests a design method for collecting training data effectively. It selects three design elements (task amount, bonus compensation method, and social-proof-based explanation method) and conducts a series of three experiments, one per element. A paraphrasing task, whose suitability for acquiring training data had already been demonstrated, was used; 1,473 items were collected from 480 participants on MTurk for $73.65. The collected data were analyzed with four indicators (semantic equivalence, diversity, error rate, and execution time). The analysis showed that data with the same meaning became harder to obtain as the task amount increased. Collection efficiency increased when bonus compensation was offered. For social-proof-based explanations, there was a trade-off between diversity and efficiency. Individual differences among participants and pressure regarding collection results are discussed, and an integrated task design method is suggested. The thesis has academic significance in that it studies how to improve collection results, whereas prior work has mainly examined whether training data can be collected at all. It is also timely and useful in that it addresses a problem actually experienced in industry. Finally, it is significant in terms of convergence in that it combines social psychology theory, HCI, and engineering.

    Contents: Chapter 1. Introduction (1. Research background; 2. Organization of the thesis). Chapter 2. Theoretical background (1. Intent understanding in conversational agents; 2. Studies on using crowdsourcing for natural language training data; 3. Task design factors related to crowdsourcing collection results; 4. The social proof effect). Chapter 3. Research questions. Chapter 4. Research method (1. Task and experimental procedure; 2. Experimental materials; 3. Measures and analysis methods). Chapter 5. Results (1. Collection results by task amount; 2. Collection results by bonus compensation method; 3. Collection results by social-proof-based explanation method). Chapter 6. Design recommendations. Chapter 7. Conclusion (1. Summary of findings; 2. Limitations; 3. Significance). References. Abstract.
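
The collection figures reported in this abstract can be checked with a short calculation, and the "diversity" indicator can be illustrated with a simple proxy. The sketch below is only an illustration: the abstract does not say how diversity was measured, so `distinct_ngram_ratio` (share of unique word n-grams) is an assumed stand-in, and both function names and the sample paraphrases are hypothetical.

```python
def cost_per_item(total_cost_usd: float, n_items: int) -> float:
    """Average amount paid per collected item."""
    return total_cost_usd / n_items


def distinct_ngram_ratio(sentences: list[str], n: int = 2) -> float:
    """Share of unique word n-grams among all n-grams.

    ASSUMPTION: this is a common diversity proxy, not necessarily
    the indicator used in the thesis.
    """
    ngrams = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Slide a window of n tokens over the sentence.
        ngrams.extend(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


# Figures from the abstract: $73.65 for 1,473 items, about 0.05 USD each.
print(round(cost_per_item(73.65, 1473), 4))

# Hypothetical paraphrases of one intent, for illustration only.
paraphrases = [
    "turn on the light",
    "switch on the light",
    "turn the light on",
]
print(distinct_ngram_ratio(paraphrases))
```

A higher ratio means the paraphrases share fewer word sequences. A measure like this would normally be read alongside semantic equivalence, since varied wording is only useful when the meaning is preserved.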

    Effective crowdsourced generation of training data for chatbots natural language understanding

    No full text
    Chatbots are text-based conversational agents. Natural Language Understanding (NLU) models are used to extract meaning and intention from user messages sent to chatbots. The user experience of chatbots largely depends on the performance of the NLU model, which itself largely depends on the initial dataset the model is trained with. The training data should cover the diversity of real user requests the chatbot will receive. Obtaining such data is a challenging task even for big corporations. We introduce a generic approach to generating training data with the help of crowd workers, and we discuss the workflow of the approach and the design of crowdsourcing tasks that assure high quality. We evaluate the approach by running an experiment collecting data for 9 different intents. We use the collected training data to train a natural language understanding model. We analyse the performance of the model under different training set sizes for each intent. We provide recommendations on selecting an optimal confidence threshold for predicting intents, based on the cost model of incorrect and unknown predictions.
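
The idea of choosing a confidence threshold from a cost model can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not give the cost model, so the two cost parameters, the function names, and the sample data are all hypothetical. A prediction below the threshold is treated as "unknown" (the bot asks the user to rephrase), and an accepted but wrong prediction is treated as "incorrect".

```python
def expected_cost(confidences, correct, threshold,
                  cost_incorrect, cost_unknown):
    """Average cost per message for a given confidence threshold."""
    total = 0.0
    for conf, is_correct in zip(confidences, correct):
        if conf < threshold:
            total += cost_unknown      # prediction rejected as "unknown"
        elif not is_correct:
            total += cost_incorrect    # wrong intent was acted on
    return total / len(confidences)


def best_threshold(confidences, correct, cost_incorrect, cost_unknown):
    """Pick the candidate threshold with the lowest average cost."""
    # Only observed confidence values can change the outcome, plus
    # "accept everything" (0.0) and "reject everything" (infinity).
    candidates = [0.0] + sorted(set(confidences)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: expected_cost(
            confidences, correct, t, cost_incorrect, cost_unknown
        ),
    )


# Hypothetical validation data: model confidence and whether the
# predicted intent was right.
confidences = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40]
correct = [True, True, True, False, True, False]

# ASSUMPTION: a wrong answer is five times as costly as "unknown".
threshold = best_threshold(confidences, correct,
                           cost_incorrect=5.0, cost_unknown=1.0)
print(threshold)
```

Raising `cost_incorrect` relative to `cost_unknown` pushes the chosen threshold up, so the bot declines to answer more often. Lowering it does the opposite.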

    Effective crowdsourced generation of training data for chatbots natural language understanding

    No full text
    Chatbots are text-based conversational agents. Natural Language Understanding (NLU) models are used to extract meaning and intention from user messages sent to chatbots. The user experience of chatbots largely depends on the performance of the NLU model, which itself largely depends on the initial dataset the model is trained with. The training data should cover the diversity of real user requests the chatbot will receive. Obtaining such data is a challenging task even for big corporations. We introduce a generic approach to generating training data with the help of crowd workers, and we discuss the workflow of the approach and the design of crowdsourcing tasks that assure high quality. We evaluate the approach by running an experiment collecting data for 9 different intents. We use the collected training data to train a natural language understanding model. We analyse the performance of the model under different training set sizes for each intent. We provide recommendations on selecting an optimal confidence threshold for predicting intents, based on the cost model of incorrect and unknown predictions. Accepted Author Manuscript. Web Information System