
    Are Metrics Enough? Guidelines for Communicating and Visualizing Predictive Models to Subject Matter Experts

    Presenting a predictive model's performance is a communication bottleneck that threatens collaborations between data scientists and subject matter experts (SMEs). Accuracy and error metrics alone fail to tell the whole story of a model - its risks, strengths, and limitations - making it difficult for subject matter experts to feel confident in their decision to use a model. As a result, models may fail in unexpected ways or go entirely unused, as subject matter experts disregard poorly presented models in favor of familiar, yet arguably substandard, methods. In this paper, we describe an iterative study conducted with both subject matter experts and data scientists to understand the gaps in communication between these two groups. We find that, while the two groups share common goals of understanding the data and predictions of the model, friction can stem from unfamiliar terms, metrics, and visualizations - limiting the transfer of knowledge to SMEs and discouraging them from asking clarifying questions during presentations. Based on our findings, we derive a set of communication guidelines that use visualization as a common medium for communicating the strengths and weaknesses of a model. We provide a demonstration of our guidelines in a regression modeling scenario and elicit feedback on their use from subject matter experts. From our demonstration, subject matter experts were more comfortable discussing a model's performance, more aware of the trade-offs for the presented model, and better equipped to assess the model's risks - ultimately informing and contextualizing the model's use beyond text and numbers.
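
    To make the core claim concrete, here is a minimal sketch (mine, not the paper's demonstration, with made-up data and model names): two regression models can report nearly the same aggregate error yet fail in very different ways, which is exactly what a residual visualization exposes and a single metric hides.

```python
# Illustrative sketch (not from the paper): comparable RMSE, very different failure modes.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
y_true = rng.uniform(0, 100, size=200)

# Model A: moderate, uniform noise across the whole range.
pred_a = y_true + rng.normal(0, 8.5, size=200)
# Model B: very accurate for small values, systematically biased for large ones.
pred_b = np.where(y_true < 70,
                  y_true + rng.normal(0, 2, size=200),
                  y_true - 15 + rng.normal(0, 2, size=200))

def rmse(pred):
    return np.sqrt(np.mean((pred - y_true) ** 2))

print(f"RMSE A: {rmse(pred_a):.1f}   RMSE B: {rmse(pred_b):.1f}")  # comparable numbers

# Residual plots reveal the difference the single number conceals.
fig, axes = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
for ax, pred, name in zip(axes, [pred_a, pred_b], ["Model A", "Model B"]):
    ax.scatter(y_true, pred - y_true, s=8)
    ax.axhline(0, color="gray", linewidth=1)
    ax.set(title=name, xlabel="true value", ylabel="residual")
fig.tight_layout()
plt.show()
```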

    Paths Explored, Paths Omitted, Paths Obscured: Decision Points & Selective Reporting in End-to-End Data Analysis

    Drawing reliable inferences from data involves many, sometimes arbitrary, decisions across phases of data collection, wrangling, and modeling. As different choices can lead to diverging conclusions, understanding how researchers make analytic decisions is important for supporting robust and replicable analysis. In this study, we pore over nine published research studies and conduct semi-structured interviews with their authors. We observe that researchers often base their decisions on methodological or theoretical concerns, though subject to constraints arising from the data, their expertise, or perceived interpretability. We confirm that researchers may experiment with choices in search of desirable results, but also identify other reasons why researchers explore alternatives yet omit findings. In concert with our interviews, we also contribute visualizations for communicating decision processes throughout an analysis. Based on our results, we identify design opportunities for strengthening end-to-end analysis, for instance via tracking and meta-analysis of multiple decision paths.
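
    As a hedged illustration of what tracking multiple decision paths can look like (my sketch, not the authors' visualizations or tooling), the snippet below enumerates a small grid of analytic choices and records the resulting estimate for each path - the kind of record a later meta-analysis across paths could summarize. The decisions, data, and column names are assumptions for demonstration only.

```python
# Illustrative sketch (not the study's tooling): run every combination of a few
# analytic decisions and keep the result produced by each decision path.
from itertools import product
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=300)})
df["y"] = 50 + 2 * df["x"] + rng.normal(scale=3, size=300)

def run_path(data, outlier_rule, transform):
    d = data.copy()
    if outlier_rule == "drop_3sd":                 # decision 1: outlier handling
        d = d[np.abs(d["y"] - d["y"].mean()) < 3 * d["y"].std()]
    if transform == "log":                         # decision 2: outcome transform
        d = d.assign(y=np.log1p(d["y"]))
    slope, _ = np.polyfit(d["x"], d["y"], 1)       # fixed model choice: linear fit
    return slope

# One row per decision path, rather than a single silently chosen path.
paths = [(o, t, run_path(df, o, t))
         for o, t in product(["keep_all", "drop_3sd"], ["raw", "log"])]
print(pd.DataFrame(paths, columns=["outliers", "transform", "slope"]))
```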

    Bridging the Human-Machine Gap in Applied Machine Learning with Visual Analytics

    Machine learning is becoming a ubiquitous toolset for analyzing and making use of large collections of data. Advanced learning algorithms are able to learn from complex data to build models that can tackle artificial intelligence tasks previously thought impossible. As a result, organizations from many domains are attempting to apply machine learning to their data analysis problems. In practice, such efforts can suffer from gaps between the goals of the human and the objective being optimized by the machine, resulting in models that perform poorly in deployment. In this dissertation, I present my thesis that visual analytics systems can improve the performance of deployed models in applied machine learning tasks by allowing the user to compensate for vulnerabilities in learning paradigms. First, I outline how learning paradigms used by machine learning algorithms can miss certain aspects of the user's end goal. Then, I describe four different visual analytics systems that allow the user to intervene in the learning process across many types of data and models. In these systems, visualizations help users understand how a model performs on different regions of the data. They can also help a user encode their domain expertise into the learning algorithm, to correct for misalignments between the goals of the machine and the needs of the application scenario. This work offers evidence that advanced machine learning algorithms are applied more effectively by involving a domain user in the learning process, using a visual analytics tool as a medium.
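
    A minimal sketch of the kind of per-region performance summary such systems build on (my illustration with assumed data and column names, not one of the dissertation's four systems): slicing the input space and reporting error per slice surfaces where a model underperforms, which a visualization can then present to the domain user.

```python
# Illustrative sketch (assumed data and names): summarize model error by region
# of a feature to show where the model fails.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame({"feature": rng.uniform(0, 10, size=1000)})
df["target"] = np.sin(df["feature"]) + rng.normal(scale=0.1, size=1000)
df["prediction"] = np.sin(df["feature"].clip(upper=7))  # model breaks down for feature > 7

df["abs_error"] = (df["prediction"] - df["target"]).abs()
df["region"] = pd.cut(df["feature"], bins=5)
print(df.groupby("region", observed=True)["abs_error"].mean())  # error concentrated in the top region
```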

    BEAMES: Interactive Multimodel Steering, Selection, and Inspection for Regression Tasks
