The standard approach to tackling computer vision problems is to train deep
convolutional neural network (CNN) models using large-scale image datasets
that are representative of the target task. However, in many scenarios it is
challenging to obtain sufficient image data for the target task. Data
augmentation is a way to mitigate this challenge. A common practice is to
explicitly transform existing images in desired ways to create the volume and
variability of training data needed to achieve good generalization
performance. In situations where data for the target domain is not accessible,
a viable workaround is to synthesize training data from scratch, i.e.,
synthetic data augmentation. This paper presents an extensive
review of synthetic data augmentation techniques. It covers data synthesis
approaches based on realistic 3D graphics modeling, neural style transfer
(NST), differentiable neural rendering, and generative artificial intelligence
(AI) techniques such as generative adversarial networks (GANs) and variational
autoencoders (VAEs). For each of these classes of methods, we focus on the
important data generation and augmentation techniques, general scope of
application and specific use-cases, as well as existing limitations and
possible workarounds. Additionally, we provide a summary of common synthetic
datasets for training computer vision models, highlighting the main features,
application domains and supported tasks. Finally, we discuss the effectiveness
of synthetic data augmentation methods. Since this is the first paper to
explore synthetic data augmentation methods in great detail, we hope to equip
readers with the necessary background information and in-depth knowledge of
existing methods and their attendant issues.