
    WevQuery: Testing Hypotheses about Web Interaction Patterns

    Remotely stored user interaction logs, which give access to a wealth of data generated by large numbers of users, have long been used to understand whether interactive systems meet designers' expectations. Unfortunately, detailed insight into users' interaction behaviour still requires a high degree of expertise and domain-specific knowledge. We present WevQuery, a scalable system for querying user interaction logs that allows designers to test their hypotheses about users' behaviour. WevQuery supports this purpose with a graphical notation for defining the interaction patterns designers are seeking. WevQuery is scalable because queries can be executed against large user interaction datasets using the MapReduce paradigm. In this way, WevQuery gives designers effortless access to users' interaction patterns, removing the burden of low-level interaction data analysis. We present two scenarios showcasing the potential of WevQuery, from the design of the queries to their execution on real interaction data comprising 5.7 million events generated by 2,445 unique users.
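    The idea of matching interaction patterns with a map/reduce job can be sketched in miniature. This is a hypothetical illustration, not WevQuery's implementation: the log layout, event names and the subsequence semantics of a "pattern" are all assumptions made for the example.

```python
from collections import defaultdict

# Toy interaction log: (user_id, timestamp, event_type) triples.
LOG = [
    ("u1", 1, "mousedown"), ("u1", 2, "scroll"), ("u1", 3, "click"),
    ("u2", 1, "scroll"),    ("u2", 2, "mousedown"),
    ("u3", 1, "mousedown"), ("u3", 2, "mousemove"), ("u3", 3, "scroll"),
]

def map_phase(records):
    """Map: key each event by user, keeping the timestamp for ordering."""
    for user, ts, event in records:
        yield user, (ts, event)

def reduce_phase(values, pattern):
    """Reduce: sort one user's events by time and test whether the event
    types contain `pattern` as an ordered (not necessarily contiguous)
    subsequence."""
    events = [event for _, event in sorted(values)]
    it = iter(events)
    return all(p in it for p in pattern)  # consuming-iterator subsequence test

def run_query(records, pattern):
    """Group mapped values by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in map_phase(records):
        groups[key].append(value)
    return {user: reduce_phase(values, pattern) for user, values in groups.items()}

print(run_query(LOG, ["mousedown", "scroll"]))
# {'u1': True, 'u2': False, 'u3': True}
```

    In a real deployment the map and reduce phases would run distributed over the event store; the single-process version above only shows the shape of the computation.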

    Group versus Individual Web Accessibility Evaluations: Effects with Novice Evaluators

    We present an experiment comparing the performance of 20 novice accessibility evaluators carrying out Web Content Accessibility Guidelines 2.0 conformance reviews individually with their performance when working in teams of two. Participants first carried out an individual assessment of a web page; they were then randomly paired and asked to revise their initial assessments and produce a group assessment of the same page. Results indicate significant differences for sensitivity (inversely related to false negatives: +8%) and agreement (measured in terms of the majority view: +10%). Members of groups exhibited strong agreement on the evaluation results, both among themselves and with the group outcome. Other measures of validity and reliability were not significantly affected by group work. A practical implication of these findings is that when reducing the false-negative rate matters, employing a group of two people is more useful than having individuals carry out the assessment. Openings for future research include exploring whether similar results hold for groups larger than two, and what the effect is of mixing people with different accessibility backgrounds.
    RESEARCH HIGHLIGHTS
    • When novice accessibility evaluators work in groups, their ability to identify all the true problems increases (by 8%).
    • Likewise, the reliability of group evaluations increases (by 10%).
    • Individual and group evaluations can be considered equivalent methods with respect to false positives (if differences of up to 8% in correctness are tolerated).
    • Individual and group evaluations can be considered equivalent methods with respect to overall effectiveness (if differences of up to 11% in F-measure are tolerated).
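    The evaluation measures named above follow standard definitions: sensitivity is the fraction of true problems found, and F-measure is the harmonic mean of correctness (precision) and sensitivity. A minimal sketch, assuming these standard formulas; the counts in the example are hypothetical, not taken from the study.

```python
def sensitivity(tp, fn):
    """True-positive rate: fraction of true problems that were found.
    Inversely related to the false-negative rate."""
    return tp / (tp + fn)

def f_measure(tp, fp, fn):
    """Harmonic mean of correctness (precision) and sensitivity (recall)."""
    precision = tp / (tp + fp)
    recall = sensitivity(tp, fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical review: a page has 20 true accessibility problems; an
# evaluator reports 14 of them (6 missed) plus 4 false alarms.
print(round(sensitivity(14, 6), 2))   # 0.7
print(round(f_measure(14, 4, 6), 2))  # 0.74
```

    With these definitions, the paper's +8% sensitivity gain for pairs means fewer missed problems, while precision (and hence false positives) stays within the stated tolerance bands.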

    What Makes Research Software Sustainable? An Interview Study With Research Software Engineers

    Software is now a vital scientific instrument, providing the tools for data collection and analysis across disciplines from bioinformatics and computational physics to the humanities. The software used in research is often home-grown and bespoke: it is constructed for a particular project and rarely maintained beyond it, leading to rapid decay and frequent 'reinvention of the wheel'. Understanding how to develop sustainable research software, such that it is suitable for future reuse, is therefore of interest to both researchers and funders, but how to achieve this remains an open question. Here we report the results of an interview study examining how research software engineers -- the people actively developing software in an academic research environment -- subjectively define software sustainability. Thematic analysis of the data reveals two interacting dimensions: intrinsic sustainability, which relates to internal qualities of software, such as modularity, encapsulation and testability, and extrinsic sustainability, concerning cultural and organisational factors, including how software is resourced, supported and shared. Research software engineers believe that an increased focus on quality and discoverability is key to increasing the sustainability of academic research software.

    Does descriptive text change how people look at art? A novel analysis of eye-movements using data-driven Units of Interest

    Does reading a description of an artwork affect how a person subsequently views it? In a controlled study, we show that in most cases textual description does not influence how people subsequently view paintings, contrary to participants' self-reported belief that it did. To examine whether the description affected transition behaviour, we devised a novel analysis method that systematically determines Units of Interest (UOIs) and calculates transitions between them, in order to quantify the effect of an external factor (a descriptive text) on the viewing pattern of a naturalistic stimulus (a painting). UOIs are defined using a grid-based system in which the cell size is determined by a clustering algorithm (DBSCAN). For each painting, the Hellinger distance is computed between the two Markov chains constructed from the two groups' transition matrices (visual shifts between UOIs), and its significance is assessed with a permutation test. Results show that the description does not affect the way people transition between UOIs for all but one of the paintings -- an abstract work -- suggesting that description may play more of a role in determining transition behaviour when a lack of semantic cues makes it unclear how the painting should be interpreted. The contribution is twofold: to the domain of art/curation, we provide evidence that descriptive texts do not affect how people view paintings, with the possible exception of some abstract paintings; to the domain of eye-movement research, we provide a method with the potential to answer questions across multiple research areas where the goal is to determine whether a particular factor or condition consistently affects viewing behaviour of naturalistic stimuli.
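    The distance-plus-permutation step can be sketched as follows. This is a minimal illustration under assumptions: scanpaths are integer UOI sequences, chains are compared by the mean per-row Hellinger distance between row-stochastic transition matrices, and the null distribution is built by shuffling pooled transitions between groups. The paper's exact construction may differ.

```python
import random
import numpy as np

def transition_matrix(pairs, n_units):
    """Row-normalised counts of UOI-to-UOI transitions."""
    M = np.zeros((n_units, n_units))
    for a, b in pairs:
        M[a, b] += 1
    rows = M.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1  # leave unvisited UOIs as zero rows
    return M / rows

def hellinger(P, Q):
    """Mean per-row Hellinger distance between two row-stochastic matrices
    (one plausible way to compare Markov chains; an assumption here)."""
    per_row = np.sqrt(0.5 * ((np.sqrt(P) - np.sqrt(Q)) ** 2).sum(axis=1))
    return per_row.mean()

def permutation_test(seq_a, seq_b, n_units, n_perm=999, seed=0):
    """Shuffle pooled transitions between the two groups to build a null
    distribution for the observed Hellinger distance."""
    pa = list(zip(seq_a, seq_a[1:]))
    pb = list(zip(seq_b, seq_b[1:]))
    observed = hellinger(transition_matrix(pa, n_units),
                         transition_matrix(pb, n_units))
    pooled, rng, hits = pa + pb, random.Random(seed), 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        qa, qb = pooled[:len(pa)], pooled[len(pa):]
        if hellinger(transition_matrix(qa, n_units),
                     transition_matrix(qb, n_units)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Two toy scanpaths over 3 UOIs: distance and permutation p-value.
d, p = permutation_test([0, 1, 2, 0, 1], [2, 1, 0, 2, 1], n_units=3)
```

    A small p-value would indicate that the two groups' transition behaviour differs more than expected by chance, which is the paper's criterion for a description effect.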

    Assessing the communication gap between AI models and healthcare professionals: explainability, utility and trust in AI-driven clinical decision-making

    This paper contributes a pragmatic evaluation framework for explainable Machine Learning (ML) models for clinical decision support. The study revealed a more nuanced role for ML explanation models when these are pragmatically embedded in the clinical context. Despite the generally positive attitude of healthcare professionals (HCPs) towards explanations as a safety and trust mechanism, for a significant set of participants there were negative effects associated with confirmation bias, accentuating model over-reliance and increasing the effort needed to interact with the model. Also, contradicting one of their main intended functions, standard explanatory models showed limited ability to support a critical understanding of the model's limitations. However, we found new significant positive effects that reposition the role of explanations within a clinical context: these include reduction of automation bias, addressing ambiguous clinical cases (cases where HCPs were not certain about their decision) and supporting less experienced HCPs in acquiring new domain knowledge. Comment: supplementary information in the main PDF.