
    Comparative study on the detection of early dental caries using thermo-photonic lock-in imaging and optical coherence tomography

    Early detection of dental caries is known to be key to the effectiveness of therapeutic and preventive approaches in dentistry. However, existing clinical detection techniques, such as radiographs, are not sufficiently sensitive to detect and monitor the progression of caries at early stages. As such, in recent years, several optics-based imaging modalities have been proposed for the early detection of caries. The majority of these techniques rely on the enhancement of light scattering in early carious lesions, while a few are based on the enhancement of light absorption at early caries sites. In this paper, we report on a systematic comparative study of the detection performance of optical coherence tomography (OCT) and thermo-photonic lock-in imaging (TPLI) as representative early caries detection modalities based on light scattering and absorption, respectively. Through controlled demineralization studies on extracted human teeth and µCT validation experiments, several detection performance parameters of the two modalities, such as detection threshold, sensitivity, and specificity, have been qualitatively analyzed and discussed. Our experimental results suggest that both modalities have sufficient sensitivity for the detection of well-developed early caries on occlusal and smooth surfaces; however, TPLI provides a better sensitivity and detection threshold for detecting the very early stages of caries formation, which is deemed critical for the effectiveness of therapeutic and preventive approaches in dentistry. Moreover, because the light-absorption contrast mechanism is more specific than light scattering, TPLI exhibits better detection specificity, which results in fewer false-positive readings and thus allows for the proper differentiation of early caries regions from the surrounding intact areas. The major shortcoming of TPLI is its inherently depth-integrated nature, which prohibits the production of depth-resolved/B-mode-like images. The outcomes of this research justify the need for a light-absorption-based imaging modality with the ability to produce tomographic, depth-resolved images, combining the key advantages of OCT and TPLI.
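The abstract does not spell out the TPLI signal chain, but the lock-in principle it builds on can be illustrated on its own: the modulated thermal response of each pixel is multiplied by in-phase and quadrature references at the modulation frequency and low-pass filtered, recovering per-pixel amplitude and phase images. A minimal NumPy sketch of that demodulation step, under the assumption of a uniformly sampled time series covering an integer number of modulation periods (the function name and parameters are illustrative, not from the paper):

```python
import numpy as np

def lockin_demodulate(signal, fs, f_mod):
    """Lock-in demodulation of a sampled time series.

    Multiplies the signal by cosine (in-phase) and sine (quadrature)
    references at the modulation frequency f_mod, then averages over the
    record (a crude low-pass), recovering the amplitude and phase of the
    component at f_mod. Phase is reported relative to the cosine reference.
    The last axis of `signal` is time, so 2-D pixel stacks work as well.
    """
    t = np.arange(signal.shape[-1]) / fs
    ref_i = np.cos(2 * np.pi * f_mod * t)   # in-phase reference
    ref_q = np.sin(2 * np.pi * f_mod * t)   # quadrature reference
    x = 2 * np.mean(signal * ref_i, axis=-1)
    y = 2 * np.mean(signal * ref_q, axis=-1)
    return np.hypot(x, y), np.arctan2(y, x)
```

Averaging over an integer number of periods rejects the DC component and frequencies away from f_mod, which is what gives lock-in detection its noise immunity.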

    Pyramid Transformer for Traffic Sign Detection

    Traffic sign detection is a vital task in the visual system of self-driving cars and automated driving systems. Recently, novel Transformer-based models have achieved encouraging results for various computer vision tasks. However, we observed that the vanilla ViT could not yield satisfactory results in traffic sign detection, because the overall size of the datasets is very small and the class distribution of traffic signs is extremely imbalanced. To overcome this problem, a novel Pyramid Transformer with locality mechanisms is proposed in this paper. Specifically, the Pyramid Transformer has several spatial pyramid reduction layers that shrink and embed the input image into tokens with rich multi-scale context by using atrous convolutions. Moreover, it inherits an intrinsic scale-invariance inductive bias and is able to learn local feature representations for objects at various scales, thereby enhancing the network's robustness against the size discrepancy of traffic signs. The experiments are conducted on the German Traffic Sign Detection Benchmark (GTSDB). The results demonstrate the superiority of the proposed model in traffic sign detection tasks. More specifically, the Pyramid Transformer achieves 77.8% mAP on GTSDB when applied as the backbone of Cascade R-CNN, surpassing most well-known and widely used state-of-the-art models.
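The abstract describes the pyramid reduction layers only at a high level, but the atrous (dilated) convolution they rely on is standard: spacing the kernel taps by a dilation factor enlarges the receptive field without adding parameters, which is how multi-scale context is gathered cheaply. A small NumPy sketch of a single-channel dilated convolution (correlation-style, as in deep learning frameworks; this is an illustration of the operation, not the paper's layer):

```python
import numpy as np

def atrous_conv2d(image, kernel, dilation=1):
    """'Valid' 2D cross-correlation with a dilated (atrous) kernel.

    A k x k kernel with dilation d covers an effective receptive field of
    (k - 1) * d + 1 pixels per axis, so increasing the dilation captures
    wider context with the same number of weights.
    """
    kh, kw = kernel.shape
    eff_h = (kh - 1) * dilation + 1   # effective kernel extent
    eff_w = (kw - 1) * dilation + 1
    H, W = image.shape
    out = np.zeros((H - eff_h + 1, W - eff_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # stride the patch by the dilation to pick out the kernel taps
            patch = image[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out
```

With a 3x3 kernel, dilation 1 sees a 3x3 window while dilation 2 sees 5x5, so stacking several dilations yields the multi-scale token context the abstract refers to.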

    Ensembles of Deep Neural Networks for Action Recognition in Still Images

    Despite notable recent improvements in feature extraction and classification, human action recognition is still challenging, especially in still images, which, unlike videos, contain no motion. Thus, methods proposed for recognizing human actions in videos cannot be applied to still images. A big challenge in action recognition in still images is the lack of large enough datasets, which is problematic for training deep Convolutional Neural Networks (CNNs) due to overfitting. In this paper, by taking advantage of pre-trained CNNs, we employ the transfer learning technique to tackle the lack of massive labeled action recognition datasets. Furthermore, since the last layer of the CNN carries class-specific information, we apply an attention mechanism to the output feature maps of the CNN to extract more discriminative and powerful features for the classification of human actions. Moreover, we use eight different pre-trained CNNs in our framework and investigate their performance on the Stanford 40 dataset. Finally, we propose using the ensemble learning technique to enhance the overall accuracy of action classification by combining the predictions of multiple models. The best setting of our method achieves 93.17% accuracy on the Stanford 40 dataset.
    Comment: 5 pages, 2 figures, 3 tables, Accepted by ICCKE 201
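The abstract does not state which combination rule the ensemble uses; a common choice for combining the predictions of several classifiers is soft voting, where the per-model class probabilities are averaged before taking the argmax. A minimal NumPy sketch of that scheme (the function names are illustrative, not from the paper):

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the given axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_predict(model_logits):
    """Soft-voting ensemble over several classifiers.

    `model_logits` has shape (n_models, n_samples, n_classes). Each model's
    logits are turned into probabilities, the probabilities are averaged
    across models, and the most probable class is returned per sample.
    """
    probs = softmax(np.asarray(model_logits), axis=-1)
    mean_probs = probs.mean(axis=0)        # average across models
    return mean_probs.argmax(axis=-1)
```

Averaging probabilities rather than hard votes lets a confident model outweigh several uncertain ones, which is typically where an ensemble of differently pre-trained CNNs gains accuracy.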

    Traffic Sign Recognition Using Local Vision Transformer

    Recognition of traffic signs is a crucial aspect of self-driving cars and driver assistance systems, and machine vision tasks such as traffic sign recognition have gained significant attention. CNNs have frequently been used in machine vision, but the introduction of vision transformers has provided an alternative approach to global feature learning. This paper proposes a novel model that blends the advantages of both convolutional and transformer-based networks for traffic sign recognition. The proposed model includes convolutional blocks for capturing local correlations and transformer-based blocks for learning global dependencies. Additionally, a locality module is incorporated to enhance local perception. The performance of the proposed model is evaluated on the Persian Traffic Sign Dataset and the German Traffic Sign Recognition Benchmark and compared with state-of-the-art (SOTA) convolutional and transformer-based models. The experimental evaluations demonstrate that the hybrid network with the locality module outperforms pure transformer-based models and some of the best convolutional networks in accuracy. Specifically, our final model reaches 99.66% accuracy on the German Traffic Sign Recognition Benchmark and 99.8% on the Persian Traffic Sign Dataset, higher than the best convolutional models. Moreover, it outperforms existing CNNs and ViTs while maintaining a fast inference speed. Consequently, the proposed model proves to be significantly faster and more suitable for real-world applications.
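The exact architecture of the hybrid blocks is not given in the abstract, but the division of labor it describes can be sketched: a self-attention block gives every token a global receptive field in one step, while a locality module mixes each token only with its neighbors. A toy NumPy sketch, where the locality module is a simple sliding-window average standing in for the depthwise-convolution-style blocks such hybrids typically use (both functions are illustrative assumptions, not the paper's layers):

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a token sequence.

    Identity Q/K/V projections are used for brevity. Every token attends to
    all others, so global dependencies are captured in a single layer.
    `x` has shape (n_tokens, dim).
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def locality_module(x, window=3):
    """Local mixing: average each token with its neighbors in a sliding
    window, a stand-in for the convolutional locality blocks that give the
    hybrid its local perception."""
    n = len(x)
    half = window // 2
    out = np.empty_like(x)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out[i] = x[lo:hi].mean(axis=0)
    return out
```

Interleaving the two kinds of block is what lets the network learn local correlations (edges, sign borders) and global context (sign layout in the scene) at the same time.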