Integrating real-time artificial intelligence (AI) systems into clinical
practice faces challenges such as scalability and acceptance. These challenges
include data availability, biased outcomes, data quality, lack of transparency,
and underperformance on unseen datasets from different distributions. The
scarcity of large-scale, precisely labeled, and diverse datasets is the major
challenge for clinical integration. This scarcity stems in part from legal
restrictions and the extensive manual effort required for accurate annotation
by clinicians. To address these challenges, we present \textit{GastroVision},
a multi-center open-access gastrointestinal (GI) endoscopy dataset that
includes different anatomical landmarks, pathological abnormalities, polyp
removal cases, and normal findings (a total of 27 classes) from the GI tract.
The dataset comprises 8,000 images acquired from B{\ae}rum Hospital in Norway
and Karolinska University Hospital in Sweden and was annotated and verified by
experienced GI endoscopists. Furthermore, we validate the significance of our
dataset with extensive benchmarking using popular deep-learning-based
baseline models. We believe our dataset can facilitate the development of
AI-based algorithms for GI disease detection and classification. Our dataset is
available at \url{https://osf.io/84e7f/}.