Embeddings mapping high-dimensional discrete input to lower-dimensional
continuous vector spaces have been widely adopted in machine learning
applications as a way to capture domain semantics. Interviewing 13 embedding
users across disciplines, we find that comparing embeddings is a key task for
deployment or downstream analysis, but that it unfolds in a tedious fashion
that poorly supports systematic exploration. In response, we present the Embedding
Comparator, an interactive system that presents a global comparison of
embedding spaces alongside fine-grained inspection of local neighborhoods. It
systematically surfaces points of comparison by computing the similarity of the
k-nearest neighbors of every embedded object between a pair of spaces.
Through case studies, we demonstrate that our system rapidly reveals insights, such
as semantic changes following fine-tuning, language changes over time, and
differences between seemingly similar models. In evaluations with 15
participants, we find that our system accelerates comparisons by shifting from
laborious manual specification to browsing and manipulating visualizations.

Comment: Equal contribution by first two authors
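
The neighborhood comparison described above (scoring every embedded object by how similar its k-nearest neighbors are in the two spaces) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the abstract does not name the distance or set-similarity measures, so cosine similarity for ranking neighbors and Jaccard overlap for comparing neighbor sets are assumptions here. The function names `knn_indices` and `neighborhood_similarity` are hypothetical, and both spaces are assumed to embed the same objects in the same row order, with no all-zero vectors.

```python
import numpy as np


def knn_indices(emb, k):
    """Return the indices of the k nearest neighbors of every row of `emb`.

    Assumption: neighbors are ranked by cosine similarity.
    """
    # Normalize rows so that dot products equal cosine similarities.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    # Exclude each object from its own neighbor list.
    np.fill_diagonal(sim, -np.inf)
    # Sort by descending similarity and keep the top k columns.
    return np.argsort(-sim, axis=1)[:, :k]


def neighborhood_similarity(emb_a, emb_b, k=5):
    """Score each object by the overlap of its k-NN sets in two spaces.

    Assumption: overlap is measured with Jaccard similarity, so a score
    of 1.0 means identical neighbor sets and 0.0 means disjoint sets.
    """
    nn_a = knn_indices(emb_a, k)
    nn_b = knn_indices(emb_b, k)
    scores = np.empty(len(emb_a))
    for i, (a, b) in enumerate(zip(nn_a, nn_b)):
        set_a, set_b = set(a.tolist()), set(b.tolist())
        scores[i] = len(set_a & set_b) / len(set_a | set_b)
    return scores


if __name__ == "__main__":
    # Synthetic stand-ins for two embedding spaces over the same 100 objects.
    rng = np.random.default_rng(0)
    emb_a = rng.normal(size=(100, 16))
    emb_b = emb_a + rng.normal(scale=0.5, size=emb_a.shape)
    scores = neighborhood_similarity(emb_a, emb_b, k=10)
    # Objects with the lowest scores changed the most between the two spaces.
    print(np.argsort(scores)[:5])
```

Sorting objects by this score gives one way to surface points of comparison: low-scoring objects are those whose local neighborhoods differ most between the pair of spaces. The two spaces need not share a dimensionality, since only neighbor identities are compared.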