5 research outputs found

    Cody: An AI-Based System to Semi-Automate Coding for Qualitative Research

    Qualitative research can produce a rich understanding of a phenomenon but requires an essential and strenuous data annotation process known as coding. Coding can be repetitive and time-consuming, particularly for large datasets. Existing AI-based approaches to partially automating coding, such as supervised machine learning (ML) or explicit knowledge represented in code rules, require high technical literacy and lack transparency. Further, little is known about how researchers interact with AI-based coding assistance. We introduce Cody, an AI-based system that semi-automates coding through code rules and supervised ML. Cody supports researchers in interactively (re)defining code rules and uses ML to extend coding to unseen data. In two studies with qualitative researchers, we found that (1) code rules provide structure and transparency, (2) explanations are commonly desired but rarely used, and (3) suggestions benefit coding quality rather than coding speed, increasing intercoder reliability, measured with Krippendorff's alpha, from 0.085 (MAXQDA) to 0.33 (Cody).
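
    To make the reliability figures above concrete, here is a minimal sketch (not code from the paper) of how Krippendorff's alpha can be computed with the open-source krippendorff Python package; the two coders and their labels below are hypothetical.

        # Minimal sketch (not from the paper): intercoder reliability via
        # Krippendorff's alpha, using the open-source `krippendorff` package
        # (pip install krippendorff). Coders and labels are hypothetical.
        import krippendorff
        import numpy as np

        # Rows = coders, columns = coded text segments; np.nan marks a
        # segment a coder did not annotate. Codes are nominal categories.
        reliability_data = np.array([
            [1, 2, 2, 1, 3, np.nan, 2],  # coder A
            [1, 2, 3, 1, 3, 2,      2],  # coder B
        ])

        alpha = krippendorff.alpha(reliability_data=reliability_data,
                                   level_of_measurement="nominal")
        print(f"Krippendorff's alpha: {alpha:.3f}")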

    Designing AI-Based Systems for Qualitative Data Collection and Analysis

    With the continuously increasing impact of information systems (IS) on private and professional life, it has become crucial to integrate users in the IS development process. One of the critical reasons for failed IS projects is the inability to accurately meet user requirements, resulting from an incomplete or inaccurate collection of requirements during the requirements elicitation (RE) phase. While interviews are the most effective RE technique, they face several challenges that make them a questionable fit for the numerous, heterogeneous, and geographically distributed users of contemporary IS. Three significant challenges limit the involvement of a large number of users in IS development processes today. Firstly, there is a lack of tool support for conducting interviews with a wide audience. While initial studies show promising results in utilizing text-based conversational agents (chatbots) as interviewer substitutes, we lack design knowledge for building AI-based chatbots that leverage established interviewing techniques in the context of RE. By successfully applying chatbot-based interviewing, vast amounts of qualitative data can be collected. Secondly, there is a need for tool support enabling the analysis of large amounts of qualitative interview data. Once again, while modern technologies, such as machine learning (ML), promise a remedy, concrete implementations of automated analysis for unstructured qualitative data lag behind that promise. There is a need to design interactive ML (IML) systems that support the coding of qualitative data, centered on simple interaction formats for teaching the ML system and on transparent, understandable suggestions to support data analysis. Thirdly, while organizations rely on online feedback to inform requirements without explicitly conducting RE interviews (e.g., from app stores), we know little about the demographics of who gives such feedback and what motivates them to do so. Using online feedback as a requirements source risks capturing only the concerns and desires of vocal user groups. With this thesis, I tackle these three challenges in two parts. In Part I, I address the first and second challenges by presenting and evaluating two innovative AI-based systems: a chatbot for requirements elicitation and an IML system to semi-automate qualitative coding. In Part II, I address the third challenge by presenting results from a large-scale study on IS feedback engagement. With both parts, I contribute prescriptive knowledge for designing AI-based qualitative data collection and analysis systems and help establish a deeper understanding of the coverage of existing data collected from online sources. Besides providing concrete artifacts, architectures, and evaluations, I demonstrate the application of a chatbot interviewer to understand user values in smartphones and provide guidance for extending feedback coverage to underrepresented IS user groups.
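
    As a rough illustration of the IML idea described above (a sketch under assumptions, not the system built in the thesis), a simple text classifier can be trained on researcher-confirmed codes and then propose codes for unseen segments, exposing its confidence as a basic transparency cue; all segments and codes below are hypothetical.

        # Minimal sketch (assumptions, not the thesis implementation) of an
        # interactive ML loop for semi-automating qualitative coding: train
        # on researcher-confirmed labels, then suggest codes with confidence.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Hypothetical researcher-coded seed segments (text -> code).
        coded = {
            "The app keeps crashing when I upload photos": "reliability",
            "I wish the interface were less cluttered": "usability",
            "Crashes after the last update, unusable": "reliability",
            "Menus are confusing and hard to navigate": "usability",
        }

        model = make_pipeline(TfidfVectorizer(), LogisticRegression())
        model.fit(list(coded.keys()), list(coded.values()))

        # Suggest codes for unseen segments; in an IML loop, the researcher
        # would accept or correct each suggestion before the model retrains.
        for segment in ["The settings screen is a maze", "It froze twice today"]:
            probs = model.predict_proba([segment])[0]
            code = model.classes_[probs.argmax()]
            print(f"{segment!r} -> suggested code {code!r} ({probs.max():.0%})")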

    Empowering users to communicate their preferences to machine learning models in Visual Analytics

    Recent visual analytics (VA) systems rely on machine learning (ML) to let users perform a variety of data-analytic tasks, e.g., biologists clustering genome samples, medical practitioners predicting a diagnosis for a new patient, or ML practitioners tuning models' hyperparameter settings. These VA systems support interactive model construction for people (I call them power users) with diverse ML expertise, from non-experts to intermediates to expert ML users. Through my research, I designed and developed VA systems that empower power users to communicate their preferences and interactively construct machine learning models for their analytical tasks. In this process, I designed algorithms to incorporate user interaction data into machine learning modeling pipelines. Specifically, I deployed and tested (e.g., via task completion times, user satisfaction ratings, success rates in finding user-preferred models, and model accuracies) two main interaction techniques, multi-model steering and interactive objective functions, to facilitate the specification of user goals and objectives to the underlying model(s) in VA. Designing these VA systems for power users, however, poses various challenges, such as addressing diversity in user expertise, selecting metrics, modeling users to automatically infer preferences, and evaluating the success of these systems. Through this work, I contribute a set of VA systems that support interactive construction and selection of supervised and unsupervised models using tabular data. In addition, I present findings from a design study of interactive ML in a specific domain with real users and real data.
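
    As a minimal sketch of the interactive-objective-function idea (assumptions only, not the dissertation's systems), candidate models can be re-ranked by a user-weighted combination of competing metrics, so that changing the weights steers which model is selected; the candidates and metrics below are hypothetical.

        # Minimal sketch (assumptions, not the dissertation's systems) of an
        # interactive objective function: candidate models are re-ranked by a
        # user-weighted trade-off, so changing weights steers model selection.
        from sklearn.datasets import make_classification
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=300, random_state=0)

        # Candidate models: trees of varying depth (a stand-in for any sweep).
        candidates = {d: DecisionTreeClassifier(max_depth=d, random_state=0)
                      for d in (2, 4, 8, None)}

        def best_depth(w_accuracy, w_simplicity):
            """Return the depth whose model maximizes the weighted objective."""
            scores = {}
            for depth, model in candidates.items():
                accuracy = cross_val_score(model, X, y, cv=5).mean()
                simplicity = 1.0 / (depth or 32)  # shallower tree = simpler
                scores[depth] = w_accuracy * accuracy + w_simplicity * simplicity
            return max(scores, key=scores.get)

        # Moving a hypothetical slider from "accuracy matters" toward
        # "simplicity matters" changes which candidate wins.
        print(best_depth(w_accuracy=1.0, w_simplicity=0.0))
        print(best_depth(w_accuracy=0.3, w_simplicity=0.7))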