34 research outputs found

    Ciao AI: the Italian adaptation and validation of the Chatbot Usability Scale

    Chatbot-based tools are becoming pervasive in multiple domains, from commercial websites to rehabilitation applications. Only recently, an eleven-item satisfaction inventory (the ChatBot Usability Scale, BUS-11) was developed to help designers assess their systems. The BUS-11 has been validated in multiple contexts and languages, i.e., English, German, Dutch, and Spanish. This scale forms a solid platform enabling designers to rapidly assess chatbots both during and after the design process. The present work aims to adapt and validate the BUS-11 inventory in Italian. A total of 1360 questionnaires relating to 10 Italian chatbot-based systems were collected using the BUS-11 inventory and, for convergent validity purposes, the lite version of the Usability Metrics for User eXperience (UMUX-Lite). The Italian version of the BUS-11 was adapted in terms of the wording of one item, and a Multi-Group Confirmatory Factorial Analysis was performed to establish the factorial structure of the scale and compare the effects of the wording adaptation. Results indicate that the adapted Italian version of the scale matches the expected factorial structure of the original scale. The Italian BUS-11 is highly reliable (Cronbach's alpha: 0.921), and it correlates with other measures of satisfaction (e.g., UMUX-Lite, τb = 0.67; p < .001) while also offering specific insights regarding the chatbots’ characteristics. The Italian BUS-11 can be confidently used by chatbot designers to assess the satisfaction of their users during formative or summative tests.
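    The reliability figure quoted above is a Cronbach's alpha. As a minimal sketch of how such a coefficient is commonly computed (the function name and the sample ratings below are illustrative assumptions, not the study's data or analysis code), the textbook formula is alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores):

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])  # number of items in the scale
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up ratings: three respondents answering a two-item scale.
example = [[2, 3], [4, 4], [5, 6]]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

    For an eleven-item inventory such as the BUS-11, each row would hold eleven ratings; the sketch assumes complete responses and that the total scores are not all identical.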

    A confirmatory factorial analysis of the Chatbot Usability Scale: a multilanguage validation

    The Bot Usability Scale (BUS) is a standardised tool to assess and compare the satisfaction of users after interacting with chatbots to support the development of usable conversational systems. The English version of the 15-item BUS scale (BUS-15) was the result of an exploratory factorial analysis; a confirmatory factorial analysis tests the replicability of the initial model and further explores the properties of the scale, aiming to optimise the tool by examining the stability of the original model, the potential reduction of items, and multiple language versions of the scale. BUS-15 and the usability metrics for user experience (UMUX-LITE), used here for convergent validity purposes, were translated from English to Spanish, German, and Dutch. A total of 1292 questionnaires were completed in multiple languages; these were collected from 209 participants interacting with an overall pool of 26 chatbots. BUS-15 was acceptably reliable; however, a shorter and more reliable solution with 11 items (BUS-11) emerged from the data. The satisfaction ratings obtained with the translated versions of BUS-11 were not significantly different from those obtained with the original English version, suggesting that the BUS-11 could be used in multiple languages. The results also suggested that the age of participants affects the evaluation when using the scale, with older participants rating the chatbots as significantly less satisfactory than younger participants did. In line with expectations, based on reliability, BUS-11 positively correlates with the UMUX-LITE scale. The new version of the scale (BUS-11) aims to facilitate the evaluation of chatbots, and its diffusion could help practitioners to compare performance and benchmark chatbots during the product assessment stage. This tool could be a way to harmonise and enable comparability in the field of human and conversational agent interaction.

    The Chatbot Usability Scale: the Design and Pilot of a Usability Scale for Interaction with AI-Based Conversational Agents

    Standardised tools to assess a user's satisfaction with the experience of using chatbots and conversational agents are currently unavailable. This work describes four studies, including a systematic literature review, a survey with an overall sample of 141 participants (experts and novices), focus group sessions, and testing of chatbots, to i) define attributes to assess the quality of interaction with chatbots, and ii) design and pilot a new scale to measure satisfaction after the experience with chatbots. Two instruments were developed: i) A diagnostic tool in the form of a checklist (BOT-Check). This tool is a development of previous works and can be used reliably to check the quality of a chatbot experience in line with commonplace principles. ii) A 15-item questionnaire (BOT Usability Scale, BUS-15) with estimated reliability between .76 and .87, distributed in five factors. BUS-15 strongly correlates with UMUX-LITE while enabling designers to consider a broader range of aspects usually not considered in satisfaction tools for non-conversational agents, e.g., conversational efficiency and accessibility, quality of the chatbot's functionality and so on. Despite the convincing psychometric properties, BUS-15 requires further testing and validation. Designers can use it as a tool to assess products, thus building independent databases for future evaluation of its reliability, validity, and sensitivity.

    DATUS: Dashboard Assessment Usability Model: A case study with student dashboards

    The software market sees the appearance of new companies and products every day. This growth translates into competition, and the survival of companies comes down to investment in their products. Universities are also interested in improving their product, education. This improvement can be achieved by investing in the learning experience of students. Usability and user experience play an important role and have proved to be a competitive advantage worth investing in. Consequently, new methods have emerged to improve the process of evaluating the usability of products. Despite this growth, there is no direct model for assessing the usability of a dashboard. This gap motivated this dissertation, which proposes a new model, the Dashboard Assessment Usability Model (DATUS), accompanied by an evaluation method, which can be applied to the evaluation of the usability of dashboards. Eight usability dimensions are included in DATUS, each corresponding to a specific usability facet that has been identified in an existing standard or model, and they are decomposed into a total of 20 metrics. To verify whether the model is feasible, and as a contribution to Iscte - Instituto Universitário de Lisboa, a prototype dashboard was designed for the Fénix platform, to which the DATUS model was applied. To test the usability of the dashboards, a behavioural study was conducted with 30 Iscte students. After analysing the results, not only was the feasibility of the proposed model and method confirmed, but positive conclusions were also reached regarding the usability of the prototype.

    EXPLORING USABILITY IN EXERCISE INTERVENTIONS: FROM CONCEPTUALIZATION TO MEASUREMENT AND APPLICATION

    Exercise interventions hold promise for preventing and treating numerous conditions, diseases, and injuries. However, these interventions will only be effective if they are being used. Unfortunately, uptake and adherence to prescribed exercise and physical activity guidelines are insufficient. Some reasons for this include lack of knowledge, resources, flexibility, and enjoyment. Exercise program developers need to not only consider the effectiveness of the program during the development phase, but also involve end-users and receive feedback on program usability to determine likelihood of uptake and adoption. Usability testing can be used to detect barriers to use and implementation likelihood but has not yet been utilized within the domain of exercise-based interventions. The goal of this research was to better characterize and quantify exercise program usability to promote the design of more usable exercise programs. In the first study, a modified usability scale was used to assess and identify important program characteristics and their relationship to female handball players’ intention to use a newly developed anterior cruciate ligament (ACL) injury prevention program (IPP). Study 2 used a mixed methods approach, combining surveys with interviews of coaches and players, to gain deeper insight into factors affecting use of an IPP and the relationship between perceived program characteristics and effectiveness of the program. Results from studies 1 and 2 indicated that perceived effectiveness, enjoyability, efficiency and flexibility affected players’ and coaches’ intention and willingness to continue using the IPP. Building on these findings, the Intervention Usability Scale for Exercise (IUSE) was developed and validated in study 3. Exercise intervention stakeholders and target users of an exercise program contributed to item generation and content validation. Subsequently, a large sample of target users used the full scale to assess the usability of an exercise program. Following an extensive data analysis process, the 8-item IUSE indicated good psychometric properties. Collectively, this research sought to improve exercise program usability by developing a tool that exercise intervention developers can use as part of their program development and assessment process. Future studies should evaluate the predictive utility of the scale on actual uptake and adherence to an exercise intervention.

    Measuring the Quality of the Website User Experience

    Consumers spend an increasing amount of time and money online finding information, completing tasks, or making purchases. The quality of the website experience has become a key differentiator for organizations, affecting whether visitors purchase and how likely they are to return and recommend a website to friends. Two instruments were created to more effectively measure the quality of the website user experience and help improve it. Three studies used Classical Test Theory (CTT) to create a new instrument to measure the quality of the website user experience from the website visitor's perspective. Data were collected over five years from more than 4,000 respondents reflecting on experiences with more than 100 websites. An eight-item questionnaire of website quality was created - the Standardized User Experience Percentile Rank Questionnaire (SUPR-Q). The SUPR-Q contains four factors: usability, trust, appearance, and loyalty. The factor structure was replicated across three studies, with data collected both during usability tests and retrospectively in surveys. There was evidence of convergent validity with existing questionnaires, including the System Usability Scale (SUS). An initial distribution of scores across the websites generated a database used to produce percentile ranks and make scores more meaningful to researchers and practitioners. In Study 4, a new set of data and confirmatory factor analysis (CFA) confirmed the factor structure and generated alternative items that work on non-e-commerce websites. The SUPR-Q can be used to generate reliable scores in benchmarking websites, and the normed scores can be used to understand how well a website scores relative to others in the database. A fifth study was designed to develop and evaluate guidelines regarding the quality of the user experience that could be judged by experts. Study 5 establishes a Calibrated Evaluator's Guide (CEG) for evaluators to review websites against a set of guidelines to predict perceptions of quality of website user experience. The CEG was refined from 105 to 37 items using the many-faceted Rasch model. The CEG was found to complement the SUPR-Q by providing a more detailed description of the website user experience. Suggestions for practical use and future research are discussed.
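    The abstract above mentions turning raw scores into percentile ranks against a database of websites but does not describe the norming procedure. The following is a hedged sketch of a generic empirical percentile rank only; the function name, the tie-handling rule, and the reference scores are assumptions for illustration and are not the SUPR-Q's actual method or data:

```python
def percentile_rank(score, reference_scores):
    """Percentage of reference scores below `score`, counting ties as half."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    below = sum(1 for s in reference_scores if s < score)
    ties = sum(1 for s in reference_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(reference_scores)


# Hypothetical mean questionnaire scores for five benchmark websites.
benchmark = [3.2, 3.8, 4.0, 4.1, 4.5]
print(percentile_rank(4.0, benchmark))
```

    Counting ties as half keeps a score that equals the median of the reference set at the 50th percentile; other conventions (strictly below, or at-or-below) would shift the result slightly.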

    Analyzing the usability of the WYRED Platform with undergraduate students to improve its features

    The WYRED ecosystem is a technological ecosystem developed as part of WYRED (netWorked Youth Research for Empowerment in the Digital society), a European Project funded by the Horizon 2020 program. The main aim of the project is to provide a framework for research in which children and young people can express and explore their perspectives and interests concerning digital society. The WYRED ecosystem supports this framework from a technological point of view. The WYRED Platform is one of the main software components of this complex technological solution; it is focused on supporting the social dialogues that take place between children, young people and stakeholders. The ecosystem, and in particular the Platform, are already developed, but it is vital to ensure acceptance by the final users, mainly children and young people. This work presents the usability test carried out with the System Usability Scale to evolve the Platform. The test allows the weaknesses of the Platform's characteristics to be identified and the Platform to be improved accordingly, and it will serve as a reference for further usability testing
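    The abstract above does not say how the questionnaire responses were scored. As an illustration only, this sketch applies the standard System Usability Scale scoring rule (ten items rated 1 to 5; odd-numbered items contribute rating minus 1, even-numbered items contribute 5 minus rating; the sum is multiplied by 2.5). The function name and sample responses are assumptions, not taken from the WYRED study:

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten ratings on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten item ratings")
    if any(rating not in (1, 2, 3, 4, 5) for rating in responses):
        raise ValueError("each rating must be an integer from 1 to 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical answers from one participant.
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
```

    Scores from individual participants would typically be averaged to give one figure per system.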