The generation of synthetic medical records using generative adversarial
networks (GANs) has become increasingly important for addressing privacy
concerns and promoting data sharing in the medical field. In this paper, we
propose a novel method for generating synthetic hybrid medical records
consisting of chest X-ray images (CXRs) and structured tabular data (including
anthropometric data and laboratory tests) using an auto-encoding GAN
({\alpha}GAN) and a conditional tabular GAN (CTGAN). Our approach involves
training an {\alpha}GAN model on a large public database (pDB) to reduce the
dimensionality of CXRs. We then applied the trained encoder of this model to
the images in the original database (oDB) to obtain latent vectors. These
latent vectors were combined with the tabular data in the oDB, and the joint data
were used to train the CTGAN model. We successfully generated diverse synthetic
records of hybrid CXR and tabular data, maintaining correspondence between
them. We evaluated this synthetic database (sDB) through visual assessment,
distribution of interrecord distances, and classification tasks. Our evaluation
results showed that the sDB captured the features of the oDB while maintaining
the correspondence between the images and tabular data. Although our approach
relies on the availability of a large-scale pDB containing a substantial number
of images with the same modality and imaging region as those in the oDB, this
method has the potential to enable the public release of synthetic datasets without
compromising the secondary use of data.

Comment: 14 pages
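
The data flow described in the abstract (encode each image to a latent vector, join it with that record's tabular values, train a tabular generator on the joint rows, then split generated rows back into an image latent and tabular values) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `encode` is a random linear projection standing in for the trained {\alpha}GAN encoder, the jittered rows stand in for samples from a trained CTGAN, and the array sizes and column counts are made up. The nearest-record distance shown is one plausible form of an inter-record distance check, not necessarily the one used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the example small.
LATENT_DIM = 8
IMG_SHAPE = (16, 16)

# Stand-in for the trained encoder: a fixed random linear projection.
# A real encoder would be a trained neural network.
_proj = rng.normal(size=(IMG_SHAPE[0] * IMG_SHAPE[1], LATENT_DIM))


def encode(images):
    """Map a batch of images (n, H, W) to latent vectors (n, LATENT_DIM)."""
    flat = images.reshape(len(images), -1)
    return flat @ _proj


def build_joint_table(images, tabular):
    """Concatenate each record's latent vector with its tabular values."""
    return np.hstack([encode(images), tabular])


def split_joint_rows(rows):
    """Split joint rows back into (latent vectors, tabular values)."""
    return rows[:, :LATENT_DIM], rows[:, LATENT_DIM:]


def nearest_record_distances(synthetic, original):
    """Euclidean distance from each synthetic row to its closest original row."""
    diffs = synthetic[:, None, :] - original[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)


# Toy "original database": 20 random images with 3 made-up numeric columns.
n_records = 20
images = rng.random((n_records, *IMG_SHAPE))
tabular = rng.normal(size=(n_records, 3))

joint = build_joint_table(images, tabular)

# Placeholder for sampling from a trained tabular generator:
# here the original rows are simply perturbed with Gaussian noise.
synthetic = joint + rng.normal(scale=0.1, size=joint.shape)

# Each synthetic row yields a latent vector (to be decoded into an image)
# and the tabular values that belong to the same record.
syn_latents, syn_tabular = split_joint_rows(synthetic)
dists = nearest_record_distances(synthetic, joint)
```

Because the latent vector and the tabular values live in the same row, whatever generator is trained on the joint table produces them together, which is how correspondence between image and tabular data is preserved. In practice the columns would likely need to be put on a common scale before computing distances.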