WevQuery: Testing Hypotheses about Web Interaction Patterns
Remotely stored user interaction logs, which give access to a wealth of data generated by large numbers of users, have long been used to understand whether interactive systems meet the expectations of designers. Unfortunately, detailed insight into users' interaction behaviour still requires a high degree of expertise and domain-specific knowledge. We present WevQuery, a scalable system for querying user interaction logs that allows designers to test their hypotheses about users' behaviour. WevQuery supports this purpose with a graphical notation for defining the interaction patterns designers are seeking. WevQuery is scalable because queries can be executed against large user interaction datasets using the MapReduce paradigm. In this way WevQuery gives designers effortless access to users' interaction patterns, removing the burden of low-level interaction data analysis. We present two scenarios that showcase the potential of WevQuery, from the design of the queries to their execution on real interaction data comprising 5.7m events generated by 2,445 unique users.
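The MapReduce-style pattern counting described above can be illustrated with a minimal sketch; the event schema and pattern here are hypothetical illustrations, not WevQuery's actual data model or API:

```python
from collections import Counter
from functools import reduce

# Hypothetical interaction log: (user_id, event_type) pairs, as a
# WevQuery-style system might extract from remotely stored logs.
events = [
    ("u1", "click"), ("u1", "scroll"), ("u2", "click"),
    ("u2", "click"), ("u3", "keypress"),
]

def map_phase(event):
    """Map step: emit a (pattern_key, 1) pair for each event."""
    user, kind = event
    return (kind, 1)

def reduce_phase(acc, pair):
    """Reduce step: sum counts per pattern key."""
    key, n = pair
    acc[key] += n
    return acc

# Sequential stand-in for a distributed MapReduce job.
counts = reduce(reduce_phase, map(map_phase, events), Counter())
print(counts["click"])  # 3
```

In a real deployment the map and reduce steps would run distributed over partitions of the log rather than in a single process, which is what makes the approach scale to millions of events.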
To Sign Up, or not to Sign Up? Maximizing Citizen Science Contribution Rates through Optional Registration.
The Visualisation of Eye-tracking Scanpaths: What can they tell us about how Clinicians View Electrocardiograms?
Group versus Individual Web Accessibility Evaluations: Effects with Novice Evaluators
We present an experiment comparing the performance of 20 novice accessibility evaluators carrying out Web Content Accessibility Guidelines 2.0 conformance reviews individually with their performance when working in teams of two. Participants first carried out an individual assessment of a web page; they were then matched randomly into groups of two and asked to revise their initial assessments and produce a group assessment of the same page. Results indicate significant differences for sensitivity (inversely related to false negatives: +8%) and agreement (when measured in terms of the majority view: +10%). Members of groups exhibited strong agreement on the evaluation results, both among themselves and with the group outcome. Other measures of validity and reliability are not significantly affected by group work. A practical implication of these findings is that, when it is important to reduce the false-negative rate, employing a group of two people is more useful than having individuals carry out the assessment. Openings for future research include exploring whether similar results hold for groups larger than two, and the effect of mixing people with different accessibility backgrounds. RESEARCH HIGHLIGHTS: When novice accessibility evaluators work in groups, their ability to identify all the true problems increases (by 8%). Likewise, the reliability of group evaluations increases (by 10%). Individual and group evaluations can be considered equivalent methods with respect to false positives (if differences of up to 8% in correctness are tolerated) and with respect to overall effectiveness (if differences of up to 11% in F-measure are tolerated).
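The sensitivity and F-measure metrics used above can be made concrete with a small sketch; the problem identifiers and ground-truth set below are hypothetical, chosen only to illustrate the computation:

```python
def sensitivity(true_problems, reported):
    """Fraction of true accessibility problems an evaluation catches
    (inversely related to the false-negative rate)."""
    found = len(set(true_problems) & set(reported))
    return found / len(true_problems)

def f_measure(true_problems, reported):
    """Harmonic mean of correctness (precision) and sensitivity (recall)."""
    tp = len(set(true_problems) & set(reported))
    precision = tp / len(reported) if reported else 0.0
    recall = tp / len(true_problems)
    return 2 * precision * recall / (precision + recall) if tp else 0.0

# Illustrative data: five true problems, one evaluator's report.
truth = {"p1", "p2", "p3", "p4", "p5"}
individual = {"p1", "p2", "p3", "x1"}   # one false positive, two misses
print(sensitivity(truth, individual))   # 0.6
```

A group evaluation that catches one extra true problem would raise sensitivity from 0.6 to 0.8, which is the kind of gain the +8% figure reports at the level of the whole study.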
What Makes Research Software Sustainable? An Interview Study With Research Software Engineers
Software is now a vital scientific instrument, providing the tools for data collection and analysis across disciplines from bioinformatics and computational physics to the humanities. The software used in research is often home-grown and bespoke: it is constructed for a particular project and rarely maintained beyond it, leading to rapid decay and frequent 'reinvention of the wheel'. Understanding how to develop sustainable research software, such that it is suitable for future reuse, is therefore of interest to both researchers and funders, but how to achieve this remains an open question. Here we report the results of an interview study examining how research software engineers -- the people actively developing software in an academic research environment -- subjectively define software sustainability. Thematic analysis of the data reveals two interacting dimensions: intrinsic sustainability, which relates to internal qualities of software, such as modularity, encapsulation and testability, and extrinsic sustainability, concerning cultural and organisational factors, including how software is resourced, supported and shared. Research software engineers believe that an increased focus on quality and discoverability is key to increasing the sustainability of academic research software.
Does descriptive text change how people look at art? A novel analysis of eye-movements using data-driven Units of Interest
Does reading a description of an artwork affect how a person subsequently views it? In a controlled study, we show that in most cases a textual description does not influence how people subsequently view paintings, contrary to participants' self-reported belief that it did. To examine whether the description affected transition behaviour, we devised a novel analysis method that systematically determines Units of Interest (UOIs) and calculates transitions between them, in order to quantify the effect of an external factor (a descriptive text) on the viewing pattern of a naturalistic stimulus (a painting). UOIs are defined using a grid-based system, where the cell size is determined by a clustering algorithm (DBSCAN). For each painting, the Hellinger distance between the two groups' Markov chains, constructed from their transition matrices (visual shifts between UOIs), is computed and assessed with a permutation test. Results show that the description does not affect the way people transition between UOIs for all but one of the paintings -- an abstract work -- suggesting that description may play more of a role in determining transition behaviour when a lack of semantic cues means it is unclear how the painting should be interpreted. The contribution is twofold: to the domain of art/curation, we provide evidence that descriptive texts do not affect how people view paintings, with the possible exception of some abstract paintings; to the domain of eye-movement research, we provide a method with the potential to answer questions across multiple research areas where the goal is to determine whether a particular factor or condition consistently affects viewing behaviour of naturalistic stimuli.
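The distance computation described above can be sketched minimally, assuming row-stochastic transition matrices over a shared UOI grid; the matrices below are illustrative, not the study's data:

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(p, q))) / math.sqrt(2)

def chain_distance(P, Q):
    """Mean Hellinger distance over corresponding rows of two
    Markov transition matrices, one per viewer group."""
    return sum(hellinger(p, q) for p, q in zip(P, Q)) / len(P)

# Illustrative 2-UOI transition matrices for the two groups
# (row i gives the probability of transitioning from UOI i to each UOI).
P = [[0.7, 0.3], [0.4, 0.6]]
Q = [[0.6, 0.4], [0.5, 0.5]]
d = chain_distance(P, Q)
# In the study, this observed distance would then be compared against a
# null distribution obtained by shuffling group labels (a permutation test)
# to decide whether the description changed transition behaviour.
```

The permutation test matters here because the Hellinger distance alone has no significance threshold; only comparison with label-shuffled data tells us whether an observed distance is larger than chance.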
Assessing the communication gap between AI models and healthcare professionals: explainability, utility and trust in AI-driven clinical decision-making
This paper contributes a pragmatic evaluation framework for explainable Machine Learning (ML) models for clinical decision support. The study revealed a more nuanced role for ML explanation models when these are pragmatically embedded in the clinical context. Despite the generally positive attitude of healthcare professionals (HCPs) towards explanations as a safety and trust mechanism, for a significant set of participants there were negative effects associated with confirmation bias, accentuating model over-reliance and increasing the effort required to interact with the model. Moreover, contradicting one of their main intended functions, standard explanatory models showed limited ability to support a critical understanding of the model's limitations. However, we found significant new positive effects which reposition the role of explanations within a clinical context: these include reduction of automation bias, addressing ambiguous clinical cases (cases where HCPs were not certain about their decision), and supporting less experienced HCPs in the acquisition of new domain knowledge.