SUBPLEX: Towards a Better Understanding of Black Box Model Explanations at the Subpopulation Level
Interpreting machine learning (ML) models is of paramount importance when they
inform decisions with societal impact, such as transport control, financial
activities, and medical diagnosis. Current interpretation methodologies either
approximate models with locally linear functions or build self-explanatory
models that explain each input instance; they do not address interpretation at
the subpopulation level, i.e., understanding how model explanations aggregate
across different subsets of a dataset. To address the challenge of
providing explanations of an ML model across the whole dataset, we propose
SUBPLEX, a visual analytics system to help users understand black-box model
explanations with subpopulation visual analysis. SUBPLEX is designed through an
iterative design process with machine learning researchers to address three
usage scenarios of real-life machine learning tasks: model debugging, feature
selection, and bias detection. The system applies novel subpopulation analysis
to ML model explanations, with interactive visualizations for exploring the
explanations across a dataset at different levels of granularity. Based on the
system, we conducted a user evaluation to assess how subpopulation-level
interpretation influences users' sense-making when interpreting ML models. Our
results suggest that by providing model explanations for different groups of
data, SUBPLEX encourages users to generate more inventive ideas that enrich
the interpretations. It also helps users integrate the programming workflow
tightly with the visual analytics workflow. Finally, we summarize the
considerations we observed in applying visualization to machine learning
interpretation.
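To make the subpopulation idea concrete, here is a minimal sketch that clusters per-instance explanation vectors (rather than raw features) and aggregates each cluster, which is the level of granularity the abstract describes. It is not the SUBPLEX implementation: the linear feature contributions (coefficient times feature value) and the k-means grouping below are assumptions standing in for whatever explanation method and subpopulation analysis the system actually uses.

    # Minimal sketch, NOT the SUBPLEX implementation: cluster per-instance
    # explanation vectors into subpopulations and aggregate per cluster.
    # Linear contributions and k-means are stand-in assumptions.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)
    X = StandardScaler().fit_transform(X)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    # Per-instance explanation vector: each feature's contribution to the logit.
    contributions = X * model.coef_[0]  # shape (n_samples, n_features)

    # Group instances by the shape of their explanations, not their raw features.
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(contributions)

    # Subpopulation-level view: mean feature contribution per cluster.
    for c in range(4):
        mean_contrib = contributions[labels == c].mean(axis=0)
        top = np.argsort(-np.abs(mean_contrib))[:3]
        print(f"subpopulation {c}: dominant features {top.tolist()}")

Aggregating explanations per cluster, rather than per instance or over the whole dataset, is what distinguishes the subpopulation view from purely local or purely global interpretation.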
Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization
Good figure captions help paper readers understand complex scientific
figures. Unfortunately, even published papers often have poorly written
captions. Automatic caption generation could aid paper writers by providing
good starting captions that can be refined for better quality. Prior work often
treated figure caption generation as a vision-to-language task. In this paper,
we show that it can be more effectively tackled as a text summarization task in
scientific documents. We fine-tuned PEGASUS, a pre-trained abstractive
summarization model, to specifically summarize figure-referencing paragraphs
(e.g., "Figure 3 shows...") into figure captions. Experiments on large-scale
arXiv figures show that our method outperforms prior vision methods in both
automatic and human evaluations. We further conducted an in-depth investigation
focused on two key challenges: (i) the common presence of low-quality
author-written captions and (ii) the lack of clear standards for good captions.
Our code and data are available at:
https://github.com/Crowd-AI-Lab/Generating-Figure-Captions-as-a-Text-Summarization-Task

Comment: Accepted by INLG-2023
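The summarization-as-captioning recipe can be sketched with Hugging Face Transformers. This is an illustration under stated assumptions, not the paper's model: the off-the-shelf google/pegasus-arxiv checkpoint and the sample paragraph are stand-ins, and the fine-tuning on figure-referencing paragraphs that the paper contributes is omitted.

    # Sketch of captioning-as-summarization with an off-the-shelf PEGASUS
    # checkpoint (an assumption; the paper fine-tunes PEGASUS on
    # figure-referencing paragraphs, which this snippet does not reproduce).
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    checkpoint = "google/pegasus-arxiv"
    tokenizer = PegasusTokenizer.from_pretrained(checkpoint)
    model = PegasusForConditionalGeneration.from_pretrained(checkpoint)

    # A figure-referencing paragraph serves as the summarization input.
    paragraph = (
        "Figure 3 shows the accuracy of our method on all five datasets. "
        "Our approach outperforms the vision-to-language baselines by a "
        "wide margin, especially on figures with dense numeric labels."
    )

    inputs = tokenizer(paragraph, return_tensors="pt", truncation=True)
    summary_ids = model.generate(**inputs, num_beams=4, max_length=64)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))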