Multiple toddler tracking (MTT) involves identifying and differentiating
toddlers in video footage. While conventional multi-object tracking (MOT)
algorithms are adept at tracking diverse objects, toddlers pose unique
challenges due to their unpredictable movements, varied poses, and similar
appearance. Tracking toddlers in indoor environments introduces additional
complexities such as occlusions and limited fields of view. In this paper, we
address the challenges of MTT and propose MTTSort, a customized method built
upon the DeepSort algorithm. MTTSort is designed to track multiple toddlers in
indoor videos accurately. Our contributions include discussing the primary
challenges in MTT, introducing a genetic algorithm to optimize hyperparameters,
proposing an accurate tracking algorithm, and curating the MTTrack dataset
using unbiased AI co-labeling techniques. We quantitatively compare MTTSort to
state-of-the-art MOT methods on the MTTrack, DanceTrack, and MOT15 datasets. In our
evaluation, the proposed method outperformed other MOT methods, achieving 0.98,
0.68, and 0.98 in multiple object tracking accuracy (MOTA), higher-order
tracking accuracy (HOTA), and identification F1 score (IDF1),
respectively.
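
For readers unfamiliar with the reported metrics, the following minimal sketch shows how MOTA and IDF1 are conventionally computed from error counts, using their standard definitions. The function names and the counts in the usage note are hypothetical illustrations and are not taken from the paper's evaluation. HOTA is omitted because it requires averaging over localization thresholds.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with GT the total number of
    ground-truth boxes over all frames. Can be negative if errors exceed GT."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), the F1 score over
    identity-level true positives, false positives, and false negatives."""
    denom = 2 * idtp + idfp + idfn
    # Assumption of this sketch: return 0.0 when there is nothing to score.
    return 2 * idtp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    print(round(mota(10, 5, 5, 1000), 4))
    print(round(idf1(98, 2, 2), 4))
```

With these made-up counts, 20 errors over 1000 ground-truth boxes give a MOTA of 1 - 20/1000, and 98 identity true positives with 2 identity false positives and 2 identity false negatives give an IDF1 of 196/200.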