Visual comparison of two data sets: do people use the means and the variability?
In our everyday lives, we are required to make decisions based upon our statistical intuitions. Often, these involve the comparison of two groups, such as the suitability of luxury versus family cars. Research has shown that the mean difference affects judgements where two sets of data are compared, but the variability of the data has only a minor influence, if any at all. However, prior research has tended to present raw data as simple lists of values. Here, we investigated whether displaying data visually, in the form of parallel dot plots, would lead viewers to incorporate variability information. In Experiment 1, we asked a large sample of people to compare two fictional groups (children who drank ‘Brain Juice’ versus water) in a one-shot design, where only a single comparison was made. Our results confirmed that only the mean difference between the groups predicted subsequent judgements of how much they differed, in line with previous work using lists of numbers. In Experiment 2, we asked each participant to make multiple comparisons, with both the mean difference and the pooled standard deviation varying across the data sets they were shown. Here, we found that both sources of information were correctly incorporated when making responses. Taken together, we suggest that increasing the salience of variability information, by manipulating this factor across the items seen, encourages viewers to consider it in their judgements. Such findings may have useful applications for best practices when teaching difficult concepts like sampling variation.
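The two quantities manipulated in Experiment 2, the mean difference and the pooled standard deviation, can be illustrated with a minimal Python sketch. The scores below are invented for illustration and are not data from the study. The function names are hypothetical, and the ratio of the two quantities (a standardized difference) is included only as one plausible way of combining them, not as the measure the authors used.

```python
import math
from statistics import mean, variance

def mean_difference(a, b):
    """Difference between the two group means."""
    return mean(a) - mean(b)

def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples.

    Weights each sample variance by its degrees of freedom (n - 1).
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return math.sqrt(pooled_var)

# Invented example scores (assumption, not from the paper).
juice = [104, 110, 98, 107, 101]
water = [99, 103, 95, 100, 98]

diff = mean_difference(juice, water)
spread = pooled_sd(juice, water)

# Standardized difference: the same mean difference looks larger
# when the pooled spread is smaller.
standardized = diff / spread
print(diff, round(spread, 3), round(standardized, 3))
```

Holding the mean difference fixed while increasing the spread of both groups shrinks the standardized difference, which is the kind of variability information the abstract says viewers tend to overlook.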
Measuring Categorical Perception in Color-Coded Scatterplots
Scatterplots commonly use color to encode categorical data. However, as
datasets increase in size and complexity, the efficacy of these channels may
vary. Designers lack insight into how robust different design choices are to
variations in category numbers. This paper presents a crowdsourced experiment
measuring how the number of categories and choice of color encodings used in
multiclass scatterplots influences the viewers' abilities to analyze data
across classes. Participants estimated relative means in a series of
scatterplots with 2 to 10 categories encoded using ten color palettes drawn
from popular design tools. Our results show that the number of categories and
color discriminability within a color palette notably impact people's
perception of categorical data in scatterplots and that the judgments become
harder as the number of categories grows. We examine existing palette design
heuristics in light of our results to help designers make robust color choices
informed by the parameters of their data.
Comment: The paper has been accepted to ACM CHI 2023. 14 pages, 7 figures.
Empirically measuring soft knowledge in visualization
In this paper, we present an empirical study designed to evaluate the hypothesis that humans’ soft knowledge can enhance
the cost-benefit ratio of a visualization process by reducing the potential distortion. In particular, we focused on the impact of
three classes of soft knowledge: (i) knowledge about application contexts, (ii) knowledge about the patterns to be observed (i.e.,
in relation to visualization task), and (iii) knowledge about statistical measures. We mapped these classes into three control
variables, and used real-world time series data to construct stimuli. The results of the study confirmed the positive contribution
of each class of knowledge towards the reduction of the potential distortion, while the knowledge about the patterns prevents
distortion more effectively than the other two classes.
Four types of ensemble coding in data visualizations
Ensemble coding supports rapid extraction of visual statistics about distributed visual information. Researchers typically study this ability with the goal of drawing conclusions about how such coding extracts information from natural scenes. Here we argue that a second domain can serve as another strong inspiration for understanding ensemble coding: graphs, maps, and other visual presentations of data. Data visualizations allow observers to leverage their ability to perform visual ensemble statistics on distributions of spatial or featural visual information to estimate actual statistics on data. We survey the types of visual statistical tasks that occur within data visualizations across everyday examples, such as scatterplots, and more specialized images, such as weather maps or depictions of patterns in text. We divide these tasks into four categories: identification of sets of values, summarization across those values, segmentation of collections, and estimation of structure. We point to unanswered questions for each category and give examples of such cross-pollination in the current literature. Increased collaboration between the data visualization and perceptual psychology research communities can inspire new solutions to challenges in visualization while simultaneously exposing unsolved problems in perception research.
The nature of correlation perception in scatterplots
For scatterplots with Gaussian distributions of dots, the perception of Pearson correlation r can be
described by two simple laws: a linear one for discrimination, and a logarithmic one for
perceived magnitude (Rensink & Baldridge, 2010). The underlying perceptual mechanisms,
however, remain poorly understood. To cast light on these, four different distributions of
datapoints were examined. The first had 100 points with equal variance in both dimensions.
Consistent with earlier results, just noticeable difference (JND) was a linear function of the distance away from r = 1, and the magnitude of perceived correlation a logarithmic function of this quantity. In addition, these laws were linked, with the intercept of the JND line being the inverse of the bias in perceived magnitude. Three other conditions were also examined: a dot cloud with 25 points, a horizontal compression of the cloud, and a cloud with a uniform distribution of dots. Performance was found to be similar in all conditions. The generality and form of these laws suggest that what underlies correlation perception is not a geometric structure such as the shape of the dot cloud, but the shape of the probability distribution of the dots, likely inferred via a form of ensemble coding. It is suggested that this reflects the ability of observers to perceive the information entropy in an image, with this quantity used as a proxy for Pearson correlation.
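The two laws described in this abstract can be sketched in Python. The abstract does not state the equations, so the functional forms below are an assumption: they follow the parameterization commonly attributed to Rensink & Baldridge (2010), with a variability parameter k and a bias parameter b (0 < b < 1). The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def jnd(r, k, b):
    """Assumed linear discrimination law: JND = k * (1/b - r).

    The line reaches zero at r = 1/b, i.e. at the inverse of the bias.
    """
    return k * (1.0 / b - r)

def perceived_correlation(r, b):
    """Assumed logarithmic magnitude law: g(r) = ln(1 - b*r) / ln(1 - b).

    Normalized so that g(0) = 0 and g(1) = 1.
    """
    return math.log(1.0 - b * r) / math.log(1.0 - b)

# Hypothetical parameter values, not estimates from the paper.
k, b = 0.2, 0.9

for r in (0.0, 0.3, 0.6, 0.9):
    print(r, round(jnd(r, k, b), 3), round(perceived_correlation(r, b), 3))
```

Under these assumed forms, discrimination sharpens (the JND shrinks) as r approaches 1, and perceived correlation underestimates r at intermediate values, both governed by the same bias parameter b. This is one way to read the abstract's statement that the two laws are linked.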