1 research outputs found
Learning to Taste: A Multimodal Wine Dataset
We present WineSensed, a large multimodal wine dataset for studying the
relations between visual perception, language, and flavor. The dataset
encompasses 897k images of wine labels and 824k reviews of wines curated from
the Vivino platform. It has over 350k unique vintages, annotated with year,
region, rating, alcohol percentage, price, and grape composition. We obtained
fine-grained flavor annotations on a subset by conducting a wine-tasting
experiment with 256 participants who were asked to rank wines based on their
similarity in flavor, resulting in more than 5k pairwise flavor distances. We
propose a low-dimensional concept embedding algorithm that combines human
experience with automatic machine similarity kernels. We demonstrate that this
shared concept embedding space improves upon separate embedding spaces for
coarse flavor classification (alcohol percentage, country, grape, price,
rating) and aligns with the intricate human perception of flavor.Comment: Accepted to NeurIPS 2023. See project page:
https://thoranna.github.io/learning_to_taste