Deep-learning-based age estimation from panoramic radiographs: unraveling the learning process

Abstract

Background: Dental age has been proven to be a good predictor of chronological age, especially in children, whose permanent teeth are still developing.1 In adolescents and young adults, dental age remains a useful predictor of chronological age, albeit less accurate and precise than in children.2 Numerous dental age estimation methods have been described in the literature, both for children and for adolescents and young adults.2 Traditionally, these methods are based on staging dental development, which is done by expert observers. Still, inter-observer variability remains, regardless of the observer's experience. To overcome this issue, automated dental age estimation methods were developed.3-7 Vila-Blanco et al. demonstrated that their convolutional neural network (CNN) for age estimation focused on different parts of the panoramic radiograph depending on the age category of the studied individual.7 This broadly corresponds with how human observers interpret the panoramic radiograph: by focusing on the developing permanent teeth in children and on the developing third molars in adolescents and young adults. At the same time, Banar and Bertels et al. showed that their CNN for third molar development staging benefited from background removal as a potential compensatory mechanism for a limited amount of training data.8 It can therefore be hypothesized that CNNs for age estimation can be aided by directing their focus within the panoramic radiograph in line with a human observer's approach. Providing the network with extra information about where to look may increase accuracy and precision by eliminating redundant information.

Purpose: To study the effect of three preprocessing steps applied to panoramic radiographs on age estimation performance.

Materials and Methods: A set of 3,266 digital panoramic radiographs was collected retrospectively at [blinded for review].
The study population was between 1 and 24 years old, approximately evenly distributed between both sexes and among 1-year age categories. Two preprocessing steps were conducted by a human observer: (1) cropping the panoramic radiographs to display only a rectangle within the third quadrant, and (2) indicating the long axes of the seven permanent teeth and the four third molars on the panoramic radiographs. Based on those long axes, a third preprocessing step was conducted using the Python OpenCV package: (3) masking all redundant information in the panoramic radiographs, keeping only rectangles around the seven permanent teeth and the four third molars. Subsequently, the EfficientNet-B7 CNN pretrained on ImageNet was used in combination with an additional global average pooling layer and two dense layers to predict age.9 Age was predicted based either on the original panoramic radiograph or on one of its three preprocessed variants. In each setup, the CNN was tested on the same random subset of 170 subjects. The remainder, whose size could vary slightly between the preprocessing steps because of technical constraints, was used for model fitting, including an internal validation set to monitor convergence.

Results: The mean absolute error equaled 0.95 years using the original panoramic radiographs, 1.15 years using the cropped images, 1.08 years using the images with the long axes indicated, and 1.22 years using the masked images.

Conclusion: Rather unexpectedly, the CNN benefited from being able to consider the entire original panoramic radiograph. This suggests that significant information is present throughout the full panoramic radiograph and/or that directing a CNN's focus using prior human experience can be harmful outside the limited-data regime.
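The third preprocessing step, masking everything except rectangles around the annotated teeth, could be implemented with OpenCV along the following lines. This is an illustrative sketch, not the authors' exact code: the representation of a long axis as a pair of endpoint coordinates, the `margin` half-width, and the use of a thick line to approximate a rotated rectangle around each tooth are all assumptions.

```python
import numpy as np
import cv2  # OpenCV, as named in the abstract


def mask_around_axes(image, axes, margin=40):
    """Keep rectangular regions around each tooth's long axis; black out the rest.

    image  : 2-D uint8 grayscale panoramic radiograph
    axes   : list of ((x0, y0), (x1, y1)) endpoints of each long axis (assumed format)
    margin : assumed half-width, in pixels, of the region kept around an axis
    """
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    for (x0, y0), (x1, y1) in axes:
        # Drawing a thick line along the axis approximates a rotated
        # rectangle aligned with the tooth.
        cv2.line(mask, (x0, y0), (x1, y1), 255, 2 * margin)
    # Zero out every pixel outside the mask.
    return cv2.bitwise_and(image, image, mask=mask)
```

With eleven annotated axes per radiograph (seven permanent teeth and four third molars), a single call would produce the masked variant used in the third setup.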
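The model described in Materials and Methods, an ImageNet-pretrained EfficientNet-B7 backbone followed by global average pooling and two dense layers, can be sketched in Keras as below. The hidden-layer width (128 units), activations, input resolution, and optimizer are assumptions for illustration; the abstract only specifies the backbone, the pooling layer, and the two dense layers.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import EfficientNetB7


def build_age_model(input_shape=(600, 600, 3), weights="imagenet"):
    """EfficientNet-B7 backbone with a small regression head for age.

    Head sizes and input resolution are illustrative assumptions,
    not the authors' reported configuration.
    """
    backbone = EfficientNetB7(include_top=False, weights=weights,
                              input_shape=input_shape)
    x = layers.GlobalAveragePooling2D()(backbone.output)
    x = layers.Dense(128, activation="relu")(x)   # first dense layer (width assumed)
    age = layers.Dense(1, activation="linear")(x)  # second dense layer: predicted age
    model = Model(backbone.input, age)
    # MAE matches the evaluation metric reported in the Results.
    model.compile(optimizer="adam", loss="mean_absolute_error")
    return model
```

The same network would then be fitted separately on the original radiographs and on each of the three preprocessed variants, with the held-out set of 170 subjects used for testing in every setup.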
