We consider the problem of estimating repetition in video, such as performing
push-ups, cutting a melon or playing violin. Existing work shows good results
under the assumption of static and stationary periodicity. As realistic video
is rarely perfectly static and stationary, the often preferred Fourier-based
measurements is inapt. Instead, we adopt the wavelet transform to better handle
non-static and non-stationary video dynamics. From the flow field and its
differentials, we derive three fundamental motion types and three motion
continuities of intrinsic periodicity in 3D. On top of this, the 2D perception
of 3D periodicity considers two extreme viewpoints. What follows are 18
fundamental cases of recurrent perception in 2D. In practice, to deal with the
variety of repetitive appearance, our theory implies measuring time-varying
flow and its differentials (gradient, divergence and curl) over segmented
foreground motion. For experiments, we introduce the new QUVA Repetition
dataset, reflecting reality by including non-static and non-stationary videos.
On the task of counting repetitions in video, we obtain favorable results
compared to a deep learning alternative