Tuberculosis (TB) is a major global health threat, causing millions of deaths
annually. Although early diagnosis and treatment can greatly improve the
chances of survival, early diagnosis remains a major challenge, especially in developing
countries. Recently, computer-aided tuberculosis diagnosis (CTD) using deep
learning has shown promise, but progress is hindered by limited training data.
To address this, we establish a large-scale dataset, namely the Tuberculosis
X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with
corresponding bounding box annotations for TB areas. This dataset enables the
training of sophisticated detectors for high-quality CTD. Furthermore, we
propose a strong baseline, SymFormer, for simultaneous CXR image classification
and TB infection area detection. SymFormer incorporates Symmetric Search
Attention (SymAttention) to exploit the bilateral symmetry of CXR
images for learning discriminative features. Since CXR images may not strictly
adhere to the bilateral symmetry property, we also propose Symmetric Positional
Encoding (SPE) to facilitate SymAttention through feature recalibration. To
promote future research on CTD, we build a benchmark by introducing evaluation
metrics, evaluating baseline models adapted from existing detectors, and
running an online challenge. Experiments show that SymFormer achieves
state-of-the-art performance on the TBX11K dataset. The data, code, and models
will be released.

Comment: 14 pages