Accurate detection of thyroid lesions is a critical aspect of computer-aided
diagnosis. However, most existing detection methods extract features only once
before fusing multi-scale features, which leaves them vulnerable to the noise
and blurred features of ultrasound images. In this study, we propose a
novel detection network based on a feature feedback mechanism inspired by
clinical diagnosis. The mechanism involves first roughly observing the overall
picture and then focusing on the details of interest. It comprises two parts: a
feedback feature selection module and a feature feedback pyramid. The feedback
feature selection module efficiently selects the features extracted in the
first phase along both the spatial and channel dimensions to generate
high-level semantic prior knowledge, analogous to coarse observation. The
feature feedback pyramid then uses this prior knowledge to enhance feature
extraction in the second phase and adaptively fuses the features of the two
phases, analogous to fine observation. Additionally, since radiologists often
focus on the shape
and size of lesions for diagnosis, we propose an adaptive detection head
strategy to aggregate multi-scale features. Our proposed method achieves an AP
of 70.3% and AP50 of 99.0% on the thyroid ultrasound dataset and meets the
real-time requirement. The code is available at
https://github.com/HIT-wanglingtao/Thinking-Twice.

Comment: 20 pages, 11 figures, released code for
https://github.com/HIT-wanglingtao/Thinking-Twic
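
The two-phase idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function names (`select_features`, `feedback_fuse`), the parameter-free sigmoid gates, and the stand-in for second-phase extraction are all assumptions made for illustration, and the actual modules presumably use learned layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def select_features(feat):
    """Hypothetical selection step: gate a (C, H, W) feature map
    along the channel and spatial dimensions (no learned weights)."""
    # Channel gate: one weight per channel from global average pooling.
    channel_gate = sigmoid(feat.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    # Spatial gate: one weight per location from the channel-wise mean.
    spatial_gate = sigmoid(feat.mean(axis=0, keepdims=True))       # (1, H, W)
    return feat * channel_gate * spatial_gate

def feedback_fuse(first_pass, second_pass, prior):
    """Hypothetical adaptive fusion: blend the two passes with a
    per-element weight derived from the prior."""
    alpha = sigmoid(prior)
    return alpha * second_pass + (1.0 - alpha) * first_pass

rng = np.random.default_rng(0)
first = rng.standard_normal((8, 16, 16))   # first-phase features (coarse look)
prior = select_features(first)             # prior knowledge fed back
second = first + prior                     # stand-in for second-phase extraction
fused = feedback_fuse(first, second, prior)
```

Because both gates lie in (0, 1), the prior is a damped copy of the first-phase features, and the fused map is an element-wise convex combination of the two passes.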