N-of-1 trials aim to estimate treatment effects at the individual level and
can be applied to personalize a wide range of physical and digital
interventions in mHealth. In this study, we propose and apply a framework for
multimodal N-of-1 trials in order to allow the inclusion of health outcomes
assessed through images, audio or videos. We illustrate the framework in a
series of N-of-1 trials that investigate the effect of acne creams on acne
severity assessed through pictures. For the analysis, we compare an
expert-based manual labelling approach with different deep learning-based
pipelines. In these pipelines, we first train and fine-tune convolutional
neural networks (CNNs) on the images and then fit a linear mixed model to the
resulting scores in order to test the effectiveness of the treatment.
The results show that the CNN-based test on the images leads to conclusions
similar to those of tests based on manual expert ratings of the images, and
identifies a treatment effect in one individual. This illustrates that
multimodal N-of-1 trials can provide a powerful way to identify individual
treatment effects and can enable large-scale studies of a wide variety of
health outcomes that can be actively and passively assessed using technological
advances in order to personalize health interventions.
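
The second analysis step, testing for a treatment effect on per-image severity scores, can be illustrated with a minimal sketch. It does not reproduce the linear mixed model used in the study: as a simpler stand-in it applies a two-sided permutation test to the scores of a single individual, and it ignores time trends, carryover, and autocorrelation. The function name `permutation_test`, its parameters, and the example scores are hypothetical and synthetic; they are not data or code from the study.

```python
import random
from statistics import mean


def permutation_test(scores, treated, n_perm=2000, seed=0):
    """Two-sided permutation test for one individual (hypothetical helper).

    scores  : per-image severity scores, e.g. CNN outputs or expert ratings
    treated : booleans, True if the image was taken in a treatment period
    Returns the observed mean difference (treated minus untreated) and a
    Monte Carlo p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def mean_difference(labels):
        on = [s for s, t in zip(scores, labels) if t]
        off = [s for s, t in zip(scores, labels) if not t]
        return mean(on) - mean(off)

    observed = mean_difference(treated)
    labels = list(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between label and score
        if abs(mean_difference(labels)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Synthetic example: four untreated images followed by four treated images.
example_scores = [3.1, 2.9, 3.0, 3.2, 2.0, 1.8, 2.1, 1.9]
example_treated = [False] * 4 + [True] * 4
effect, p_value = permutation_test(example_scores, example_treated)
print(f"mean difference = {effect:.2f}, p = {p_value:.3f}")
```

A negative mean difference here corresponds to lower severity scores under treatment. A mixed model, as used in the study, would additionally allow pooling across individuals and modelling of within-person structure that this sketch leaves out.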