27 research outputs found

    Dynamic Estimation of Rater Reliability using Multi-Armed Bandits

    One of the critical success factors for supervised machine learning is the quality of target values, or predictions, associated with training instances. Predictions can be discrete labels (such as a binary variable specifying whether a blog post is positive or negative) or continuous ratings (for instance, how boring a video is on a 10-point scale). In some areas, predictions are readily available, while in others, the effort of human workers has to be involved. For instance, in the task of emotion recognition from speech, a large corpus of speech recordings is usually available, and humans denote which emotions are present in which recordings.

    Reinforcement Learning for Machine Translation: from Simulations to Real-World Applications

    If a machine translation is wrong, how can we tell the underlying model to fix it? Answering this question requires (1) a machine learning algorithm to define update rules, (2) an interface for feedback to be submitted, and (3) expertise on the side of the human who gives the feedback. This thesis investigates solutions for machine learning updates, the suitability of feedback interfaces, and the dependency on reliability and expertise for different types of feedback. We start with an interactive online learning scenario where a machine translation (MT) system receives bandit feedback (i.e. only once per source) instead of references for learning. Policy gradient algorithms for statistical and neural MT are developed to learn from absolute and pairwise judgments. Our experiments on domain adaptation with simulated online feedback show that the models can largely improve under weak feedback, with variance reduction techniques being very effective. In production environments, offline learning is often preferred over online learning. We evaluate algorithms for counterfactual learning from human feedback in a study on eBay product title translations. Feedback is either collected via explicit star ratings from users, or implicitly from the user interaction with cross-lingual product search. Leveraging implicit feedback turns out to be more successful due to lower levels of noise. We compare the reliability and learnability of absolute Likert-scale ratings with pairwise preferences in a smaller user study, and find that absolute ratings are overall more effective for improvements in downstream tasks. Furthermore, we discover that error markings provide a cheap and practical alternative to error corrections. In a generalized interactive learning framework we propose a self-regulation approach, where the learner, guided by a regulator module, decides which type of feedback to choose for each input. The regulator is reinforced to find a good trade-off between supervision effect and cost. In our experiments, it discovers strategies that are more efficient than active learning and standard fully supervised learning.
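
    As a rough illustration of learning from bandit feedback with a policy gradient, the sketch below trains a softmax policy over a handful of candidate outputs when only the reward of the sampled candidate is observed. It is not the thesis's algorithm: the toy reward values, the function names, and the running-mean baseline (one simple variance reduction technique) are all assumptions made for the example.

```python
import math
import random


def softmax(scores):
    """Turn unnormalized scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit_policy(true_quality, steps=5000, lr=0.1, seed=0):
    """REINFORCE on a toy problem with bandit feedback.

    true_quality is a hypothetical list of mean rewards, one per
    candidate output. The learner never sees this list directly; it
    only observes a noisy reward for the candidate it sampled.
    """
    rng = random.Random(seed)
    n = len(true_quality)
    logits = [0.0] * n
    baseline = 0.0  # running mean of observed rewards
    for t in range(1, steps + 1):
        probs = softmax(logits)
        k = rng.choices(range(n), weights=probs)[0]
        reward = true_quality[k] + rng.gauss(0.0, 0.1)
        advantage = reward - baseline
        # Gradient of log softmax(logits)[k] with respect to logits[j].
        for j in range(n):
            grad = (1.0 if j == k else 0.0) - probs[j]
            logits[j] += lr * advantage * grad
        baseline += (reward - baseline) / t
    return softmax(logits)


probs = train_bandit_policy([0.2, 0.9, 0.5])
print(probs)
```

    Subtracting the baseline leaves the expected gradient unchanged but reduces its variance, which is the kind of effect the abstract attributes to variance reduction techniques.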

    How Technology Impacts and Compares to Humans in Socially Consequential Arenas

    One of the main promises of technology development is for it to be adopted by people, organizations, societies, and governments -- incorporated into their life, work stream, or processes. Often, this is socially beneficial as it automates mundane tasks, frees up more time for other more important things, or otherwise improves the lives of those who use the technology. However, these beneficial results do not apply in every scenario and may not impact everyone in a system the same way. Sometimes a technology is developed which both produces benefits and inflicts some harm. These harms may come at a higher cost to some people than others, raising the question: how are benefits and harms weighed when deciding if and how a socially consequential technology gets developed? The most natural way to answer this question, and in fact how people first approach it, is to compare the new technology to what used to exist. As such, in this work, I make comparative analyses between humans and machines in three scenarios and seek to understand how sentiment about a technology, performance of that technology, and the impacts of that technology combine to influence how one decides to answer my main research question. Comment: Doctoral thesis proposal. arXiv admin note: substantial text overlap with arXiv:2110.08396, arXiv:2108.12508, arXiv:2006.1262

    Lindsey the Tour Guide Robot: Adaptive Long-Term Autonomy in Social Environments

    This project proposes a framework for online adaptation of robot behaviours deployed autonomously in social settings with the goal of increasing the overall users' engagement during the interactions. One of the most critical aspects to address for robots deployed in "the real world" is the necessity of interacting with people, whether intentionally or not. Interacting with people requires a wide range of capabilities, from perceiving the different people's intentions and emotional states to generating appropriate behaviours for the specific context of the interaction. Moreover, it requires that robots learn and adapt from experience while interacting with their users. In this project, a mobile robot is embedded in a long-term study in a public museum. The robot has been deployed for more than a year, to date, as an autonomous tour guide to the museum's visitors, with its tasks being guiding people to the position of various exhibits and giving a description of each item. The long-term scenario allows studying how people interact with a robot in an unconstrained setting and gives the opportunity of improving the current state of the art in robot autonomy in a social setting. The initial data collection shows that users' engagement during the robotised tours steeply declines after the initial moments of the interaction. The first main contribution of this project is to investigate whether it is possible to automatically assess the users' engagement from the robot's point of view during the interactions. A dataset of robot ego-centric videos was collected and manually annotated by independent coders with continuous engagement values. From it, an end-to-end regression model was trained to predict engagement from the robot's point of view using a single camera. Experimental evaluation shows that the model accurately estimates the engagement level of people during an interaction, even in diverse environments and with different robots. Once the robot can detect the engagement state of users during the interactions, it can potentially plan tangential behaviours to influence the users' attentional state itself. The second contribution of this work is devising an online reinforcement learning algorithm that allows the robot to adapt its behaviour online from the feedback obtained during the interactions. The feedback is obtained from users' engagement values estimated from the robot head camera. In the experimental evaluation, the robot delivers the usual tours to the users with the difference that the choice of some actions is left to the adaptive learning algorithm. Results show that after a few months of exploration, the robot successfully learns a policy that leads people to stay in the interaction for longer.
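
    The idea of leaving some action choices to an adaptive learner that is rewarded by estimated engagement can be sketched with a simple epsilon-greedy bandit. The abstract does not say which algorithm the project uses, so the class below, the simulated engagement means, and the noise model are purely hypothetical stand-ins.

```python
import random


class EpsilonGreedyBandit:
    """Keeps a running mean reward per action and mostly picks the best one."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_actions
        self.values = [0.0] * n_actions
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, action, reward):
        # Incremental update of the mean reward for the chosen action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def simulate(mean_engagement, rounds=3000, seed=1):
    """Simulated interactions; mean_engagement is an assumed value per behaviour."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyBandit(len(mean_engagement), seed=seed)
    for _ in range(rounds):
        action = bandit.select()
        # Stand-in for an engagement estimate in [0, 1] from a camera model.
        reward = min(1.0, max(0.0, rng.gauss(mean_engagement[action], 0.1)))
        bandit.update(action, reward)
    return bandit


bandit = simulate([0.3, 0.6, 0.45])
print(bandit.values, bandit.counts)
```

    In a real deployment each round would correspond to one interaction with visitors, which is why exploration can take months rather than the seconds this simulation needs.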

    Exploring Diversity and Fairness in Machine Learning

    With algorithms, artificial intelligence, and machine learning becoming ubiquitous in our society, we need to start thinking about the implications and ethical concerns of new machine learning models. In fact, two types of biases that impact machine learning models are social injustice bias (bias created by society) and measurement bias (bias created by unbalanced sampling). Biases against groups of individuals found in machine learning models can be mitigated through the use of diversity and fairness constraints. This dissertation introduces models to help humans make decisions by enforcing diversity and fairness constraints. This work starts with a call to action. Bias is rife in hiring, and since algorithms are being used in multiple companies to filter applicants, we need to pay special attention to this application. Inspired by this hiring application, I introduce new multi-armed bandit frameworks to help assign human resources in the hiring process while enforcing diversity through a submodular utility function. These frameworks increase diversity while using fewer resources compared to original admission decisions of the Computer Science graduate program at the University of Maryland. Moving outside of hiring, I present a contextual multi-armed bandit algorithm that enforces group fairness by learning a societal bias term and correcting for it. This algorithm is tested on two real-world datasets and shows marked improvement over other in-use algorithms. Additionally, I take a look at fairness in traditional machine learning domain adaptation. I provide the first theoretical analysis of this setting and test the resulting model on two real-world datasets. Finally, I explore extensions to my core work, delving into suicidality, comprehension of fairness definitions, and student evaluations.

    Conversing with Artificial Intelligence: Developing Conversational Agent Systems for One-on-One and Group Interaction

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Social Sciences, Department of Communication, 2022.2. Joonhwan Lee. Beyond "human-computer interaction" and "user experience", the era of "human-AI interaction" and "algorithmic experience" is arriving. Advances in technology have shifted the paradigm of how we communicate and collaborate. Machine agents play an active, leading role in human communication. However, understanding and discussion of how to design effective AI-based communication and discussion systems are still lacking. This study therefore aims to explore, from the perspective of human-computer interaction, technical methods that can support various forms of communication. To this end, the author presents conversational agents that support one-on-one and group interaction. Specifically, this study designed and developed 1) a conversational agent that increases user engagement in one-on-one interaction, 2) an agent that supports everyday social group discussion, and 3) an agent that enables deliberative discussion, and verified their effects both quantitatively and qualitatively. In designing the systems, a multidisciplinary approach was applied that combines not only human-computer interaction but also communication studies, psychology, and data science. The first study verified the effect of a conversational agent on user engagement in one-on-one interaction. Conducted in the context of surveys, this study aimed to explore the potential of a text-based conversational agent as a new interaction method for overcoming the response data quality problems caused by inattentive respondents in web surveys. To this end, a 2 (interface: web vs. chatbot) x 2 (conversational style: formal vs. casual) experiment was conducted, and the quality of the response data was evaluated on the basis of satisficing theory. The results showed that participants in the chatbot survey were more engaged than participants in the web survey and consequently produced higher-quality data. However, this effect of the chatbot on data quality appeared only when the chatbot used a friendly, casual conversational style. This result implies that conversational interactivity arises when the interface is accompanied by an effective message strategy, namely conversational style. The second study concerns a conversational system that supports group decision-making and discussion in everyday social chat groups. To this end, a conversational agent called GroupfeedBot was built; GroupfeedBot (1) manages discussion time, (2) promotes even participation among members, and (3) summarizes and organizes members' diverse opinions. To evaluate the agent, user studies were conducted across various tasks (reasoning, decision-making, open debate, and problem-solving tasks) and group sizes (small and medium). The results showed that, in terms of opinion diversity, groups that discussed with GroupfeedBot generated more diverse opinions than groups that discussed with a basic agent, but there was no difference in the quality of the output or the number of messages. The effect of GroupfeedBot on even participation varied with the characteristics of the task; in particular, GroupfeedBot promoted even participation in the open-debate task. The third study concerns a conversational system that supports deliberative discussion. 
Unlike GroupfeedBot, the DebateBot developed in the third study was applied in a more serious social context. DebateBot performed two main functions: (1) structuring the discussion according to the Think-Pair-Share strategy, and (2) promoting equal participation by requesting opinions from reticent discussants. The user evaluation showed that DebateBot enabled deliberative discussion by improving group interaction. Structuring the discussion had a positive effect on discussion quality, and facilitating participation contributed to reaching genuine consensus and improved the subjective satisfaction of group members. Based on the results of these three studies, this work derives various implications for human-AI communication and organizes them into the TAMED (Task-Agent-Message-Information Exchange-Relationship Dynamics) model. The advancements in technology shift the paradigm of how individuals communicate and collaborate. Machines play an active role in human communication. However, we still lack a generalized understanding of how exactly to design effective machine-driven communication and discussion systems. How should machine agents be designed differently when interacting with a single user as opposed to when interacting with multiple users? How can machine agents be designed to drive user engagement during dyadic interaction? What roles can machine agents perform for the sake of group interaction contexts? How should technology be implemented in support of the group decision-making process and to promote group dynamics? What are the design and technical issues which should be considered for the sake of creating human-centered interactive systems? In this thesis, I present new interactive systems in the form of a conversational agent, or a chatbot, that facilitate dyadic and group interactions. 
Specifically, I focus on: 1) a conversational agent to engage users in dyadic communication, 2) a chatbot called GroupfeedBot that facilitates daily social group discussion, 3) a chatbot called DebateBot that enables deliberative discussion. My approach to research is multidisciplinary and informed not only by HCI, but also by communication, psychology, and data science. In my work, I conduct in-depth qualitative inquiry and quantitative data analysis towards understanding issues that users have with current systems, before developing new computational techniques that meet those user needs. Finally, I design, build, and deploy systems that use these techniques to the public in order to achieve real-world impact and to study their use in different usage contexts. The findings of this thesis are as follows. For a dyadic interaction, participants interacting with a chatbot system were more engaged as compared to those with a static web system. However, the conversational agent leads to better user engagement only when the messages apply a friendly, human-like conversational style. These results imply that the chatbot interface itself is not quite sufficient for the purpose of conveying conversational interactivity. Messages should also be carefully designed to convey such. Unlike dyadic interactions, which focus on message characteristics, other elements of the interaction should be considered when designing agents for group communication. In terms of messages, it is important to synthesize and organize information given that countless messages are exchanged simultaneously. In terms of relationship dynamics, rather than developing a rapport with a single user, it is essential to understand and facilitate the dynamics of the group as a whole. In terms of task performance, technology should support the group's decision-making process by efficiently managing the task execution process. 
Considering the above characteristics of group interactions, I created chatbot agents that facilitate group communication in two different contexts and verified their effectiveness. GroupfeedBot was designed and developed with the aim of enhancing group discussion in social chat groups. GroupfeedBot has three features: (1) managing time, (2) encouraging members to participate evenly, and (3) organizing the members' diverse opinions. The group that discussed with GroupfeedBot tended to produce more diverse opinions compared to the group that discussed with the basic chatbot. Some effects of GroupfeedBot varied by the task's characteristics. GroupfeedBot encouraged the members to contribute evenly to the discussions, especially for the open-debating task. On the other hand, DebateBot was designed and developed to facilitate deliberative discussion. In contrast to GroupfeedBot, DebateBot was applied to more serious and less casual social contexts. Two main features were implemented in DebateBot: (1) structure discussion and (2) request opinions from reticent discussants. This work found that a chatbot agent which structures discussions and promotes even participation can improve discussions, resulting in higher quality deliberative discussion. Overall, adding structure to the discussion positively influenced the discussion quality, and the facilitation helped groups reach a genuine consensus and improved the subjective satisfaction of the group members. The findings of this thesis reflect the importance of understanding human factors in designing AI-infused systems. By understanding the characteristics of individual humans and collective groups, we are able to place humans at the heart of the system and utilize AI technology in a human-friendly way. Table of contents: 1. Introduction 1.1 Background 1.2 Rise of Machine Agency 1.3 Theoretical Framework 1.4 Research Goal 1.5 Research Approach 1.6 Summary of Contributions 1.7 Thesis Overview 2. Related Work 2.1 A Brief History of Conversational Agents 2.2 TAMED Framework 3. Designing Conversational Agents for Dyadic Interaction 3.1 Background 3.2 Related Work 3.3 Method 3.4 Results 3.5 Discussion 3.6 Conclusion 4. Designing Conversational Agents for Social Group Discussion 4.1 Background 4.2 Related Work 4.3 Needfinding Survey for Facilitator Chatbot Agent 4.4 GroupfeedBot: A Chatbot Agent For Facilitating Discussion in Group Chats 4.5 Qualitative Study with Small-Sized Group 4.6 User Study With Medium-Sized Group 4.7 Discussion 4.8 Conclusion 5. Designing Conversational Agents for Deliberative Group Discussion 5.1 Background 5.2 Related Work 5.3 DebateBot 5.4 Method 5.5 Results 5.6 Discussion and Design Implications 5.7 Conclusion 6. Discussion 6.1 Designing Conversational Agents as a Communicator 6.2 Design Guidelines Based on TAMED Model 6.3 Technical Considerations 6.4 Human-AI Collaborative System 7. Conclusion 7.1 Research Summary 7.2 Summary of Contributions 7.3 Future Work 7.4 Conclusion

    Interactive Machine Learning with Applications in Health Informatics

    Recent years have witnessed unprecedented growth of health data, including millions of biomedical research publications, electronic health records, patient discussions on health forums and social media, fitness tracker trajectories, and genome sequences. Information retrieval and machine learning techniques are powerful tools to unlock invaluable knowledge in these data, yet they need to be guided by human experts. Unlike training machine learning models in other domains, labeling and analyzing health data requires highly specialized expertise, and the time of medical experts is extremely limited. How can we mine big health data with little expert effort? In this dissertation, I develop state-of-the-art interactive machine learning algorithms that bring together human intelligence and machine intelligence in health data mining tasks. By making efficient use of human experts' domain knowledge, we can achieve high-quality solutions with minimal manual effort. I first introduce a high-recall information retrieval framework that helps human users efficiently harvest not just one but as many relevant documents as possible from a searchable corpus. This is a common need in professional search scenarios such as medical search and literature review. Then I develop two interactive machine learning algorithms that leverage human experts' domain knowledge to combat the curse of "cold start" in active learning, with applications in clinical natural language processing. A consistent empirical observation is that the overall learning process can be reliably accelerated by a knowledge-driven "warm start", followed by machine-initiated active learning. As a theoretical contribution, I propose a general framework for interactive machine learning. Under this framework, a unified optimization objective explains many existing algorithms used in practice, and inspires the design of new algorithms. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/147518/1/raywang_1.pd