Wireless capsule endoscopy (WCE) is a non-invasive method for visualizing the
gastrointestinal (GI) tract, crucial for diagnosing GI tract diseases. However,
interpreting WCE results can be time-consuming and tiring. Existing studies
have employed deep neural networks (DNNs) for automatic GI tract lesion
detection, but acquiring sufficient training examples remains a challenge,
particularly because of privacy concerns. Public WCE databases are limited in
both diversity and size. To address this, we propose a novel approach that
uses generative models, specifically the diffusion model (DM), to generate
diverse WCE
images. Our model incorporates a semantic map produced by a visualization scale
(VS) engine, which enhances the controllability and diversity of the generated
images.
We evaluate our approach using visual inspection and visual Turing tests,
demonstrating its effectiveness in generating realistic and diverse WCE images