Infertility is a global health problem, and an increasing number of couples
are seeking medical assistance to conceive; at least half of infertility cases
are attributable to male factors. The success rate of assisted reproductive
technologies depends on sperm assessment, in which experts determine whether
sperm can be used for reproduction based on their morphology and motility. Previous sperm
assessment studies with deep learning have used datasets comprising images that
include only sperm heads, so motility and the morphology of other sperm parts
cannot be taken into account. Furthermore, the labels in these datasets are
one-hot, which provides insufficient support for experts because assessment
results are inconsistent between experts and have no absolute answer. Therefore, we constructed
a video dataset for sperm assessment whose videos include the sperm head as
well as the neck and tail, and whose labels are annotated as soft labels. Furthermore,
we proposed a sperm assessment framework and a neural network, RoSTFine,
for sperm video recognition. Experimental results showed that RoSTFine could
improve sperm assessment performance compared to existing video
recognition models and focus strongly on important sperm parts (i.e., head and
neck).

Comment: Accepted at Winter Conference on Applications of Computer Vision
(WACV) 202