Traffic Sign Recognition Using Local Vision Transformer
Recognition of traffic signs is a crucial aspect of self-driving cars and
driver assistance systems, and machine vision tasks such as traffic sign
recognition have gained significant attention. Convolutional neural networks
(CNNs) are widely used in machine vision, but vision transformers (ViTs) have
provided an alternative approach to global feature learning. This paper
proposes a novel model that blends the advantages of both convolutional and
transformer-based networks for traffic sign recognition. The proposed model
includes convolutional blocks for capturing local correlations and
transformer-based blocks for learning global dependencies. Additionally, a
locality module is incorporated to enhance local perception. The performance of
the proposed model is evaluated on the Persian Traffic Sign Dataset and the
German Traffic Sign Recognition Benchmark and compared with state-of-the-art
convolutional and transformer-based models. The experimental evaluations
demonstrate that the
hybrid network with the locality module outperforms pure transformer-based
models and some of the best convolutional networks in accuracy. Specifically,
the final proposed model reached 99.66% accuracy on the German Traffic Sign
Recognition Benchmark and 99.8% on the Persian Traffic Sign Dataset, higher
than the best convolutional models. Moreover, it outperforms existing CNNs and
ViTs while maintaining fast inference speed. Consequently, the proposed model
is well suited to real-world applications.
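
To make the division of labour described above concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a 1D token sequence, a fixed smoothing kernel in place of learned convolution weights, single-head attention with identity projections, and a second local convolution as a stand-in for the locality module. The names local_conv, self_attention, and hybrid_block are hypothetical and chosen only for this example.

```python
import math


def local_conv(tokens, kernel):
    """Depthwise 1D convolution along the token axis with zero padding.

    Each output token mixes only its immediate neighbours (local correlations).
    """
    n, d = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * d
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:  # positions outside the sequence count as zeros
                for c in range(d):
                    row[c] += w * tokens[j][c]
        out.append(row)
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (assumption).

    Every output token is a weighted average of all tokens (global dependencies).
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out


def add(a, b):
    """Element-wise sum of two token sequences (residual connection)."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def hybrid_block(tokens, kernel=(0.25, 0.5, 0.25)):
    """Convolutional stage, then attention, then a local-mixing stage."""
    x = local_conv(tokens, kernel)      # convolutional stage: local correlations
    x = add(x, self_attention(x))       # transformer stage: global dependencies
    x = add(x, local_conv(x, kernel))   # stand-in for the locality module (assumption)
    return x


if __name__ == "__main__":
    sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    for token in hybrid_block(sequence):
        print([round(v, 3) for v in token])
```

In this sketch the output has the same shape as the input, so blocks of this kind could be stacked. A practical model would use learned 2D convolutions, multi-head attention with learned projections, and normalization layers, none of which are shown here.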