The menstrual cycle is a key indicator of overall health for women of
reproductive age. Previously, menstruation was primarily studied through survey
results; however, as menstrual tracking mobile apps become more widely adopted,
they provide an increasingly large, content-rich source of menstrual health
experiences and behaviors over time. By exploring a database of user-tracked
observations from the Clue app by BioWink, covering over 378,000 users and 4.9
million natural cycles, we show that self-reported menstrual tracker data can reveal
statistically significant relationships between per-person cycle length
variability and self-reported qualitative symptoms. A concern for self-tracked
data is that they reflect not only physiological behaviors, but also the
engagement dynamics of app users. To mitigate such potential artifacts, we
develop a procedure to exclude cycles lacking user engagement, thereby allowing
us to better distinguish true menstrual patterns from tracking anomalies. We
find that women located at different ends of the menstrual variability
spectrum, based on the consistency of their cycle length statistics, exhibit
statistically significant differences in their cycle characteristics and
symptom tracking patterns. We also find that cycle and period length statistics
are stationary over the app usage timeline across the variability spectrum. The
symptoms that we identify as showing a statistically significant association with
timing data can be useful to clinicians and users for predicting cycle
variability from symptoms or as potential health indicators for conditions like
endometriosis. Our findings showcase the potential of longitudinal,
high-resolution self-tracked data to improve understanding of menstruation and
women's health as a whole.

Comment: The Supplementary Information for this work, as well as the code
required for data pre-processing and producing results, is available at
https://github.com/iurteaga/menstrual_cycle_analysi