Comparative study on the detection of early dental caries using thermo-photonic lock-in imaging and optical coherence tomography
Early detection of dental caries is known to be the key to the effectiveness of therapeutic and preventive approaches in dentistry. However, existing clinical detection techniques, such as radiographs, are not sufficiently sensitive to detect and monitor the progression of caries at early stages. As such, in recent years, several optics-based imaging modalities have been proposed for the early detection of caries. The majority of these techniques rely on the enhancement of light scattering in early carious lesions, while a few of them are based on the enhancement of light absorption at early caries sites. In this paper, we report on a systematic comparative study of the detection performance of optical coherence tomography (OCT) and thermophotonic lock-in imaging (TPLI) as representative early caries detection modalities based on light scattering and absorption, respectively. Through controlled demineralization studies on extracted human teeth and µCT validation experiments, several detection performance parameters of the two modalities, such as detection threshold, sensitivity and specificity, have been qualitatively analyzed and discussed. Our experimental results suggest that both modalities have sufficient sensitivity for the detection of well-developed early caries on occlusal and smooth surfaces; however, TPLI provides better sensitivity and detection threshold for detecting very early stages of caries formation, which is deemed to be critical for the effectiveness of therapeutic and preventive approaches in dentistry. Moreover, due to the more specific nature of the light absorption contrast mechanism over light scattering, TPLI exhibits better detection specificity, which results in fewer false-positive readings and thus allows for the proper differentiation of early caries regions from the surrounding intact areas. The major shortcoming of TPLI is its inherent depth-integrated nature, which prohibits the production of depth-resolved, B-mode-like images.
The outcomes of this research justify the need for a light-absorption based imaging modality with the ability to produce tomographic and depth-resolved images, combining the key advantages of OCT and TPLI.
York University Librarie
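The abstract above compares the two modalities by sensitivity and specificity. As a minimal sketch of how those two figures are conventionally computed from per-site readings against a reference standard, the example below uses made-up labels; the function name `sensitivity_specificity` and the sample data are illustrative assumptions and are not taken from the study.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for boolean labels.

    truth:     reference labels per site (True = carious), e.g. from a gold standard
    predicted: detector readings per site (True = flagged as carious)
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # caries correctly flagged
    fn = sum(1 for t, p in pairs if t and not p)      # caries missed
    tn = sum(1 for t, p in pairs if not t and not p)  # intact correctly passed
    fp = sum(1 for t, p in pairs if not t and p)      # intact wrongly flagged

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical readings for eight sites (illustrative only, not study data).
truth = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, True, False, False, True]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A false-positive reading lowers specificity only, which is why a more specific contrast mechanism shows up in that figure rather than in sensitivity.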
Pyramid Transformer for Traffic Sign Detection
Traffic sign detection is a vital task in the visual system of self-driving
cars and the automated driving system. Recently, novel Transformer-based models
have achieved encouraging results for various computer vision tasks. However, we
observed that vanilla ViT could not yield satisfactory results in traffic sign
detection because the overall size of the datasets is very small and the class
distribution of traffic signs is extremely unbalanced. To overcome this
problem, a novel Pyramid Transformer with locality mechanisms is proposed in
this paper. Specifically, Pyramid Transformer has several spatial pyramid
reduction layers to shrink and embed the input image into tokens with rich
multi-scale context by using atrous convolutions. Moreover, it inherits an
intrinsic scale invariance inductive bias and is able to learn local feature
representation for objects at various scales, thereby enhancing the network
robustness against the size discrepancy of traffic signs. The experiments are
conducted on the German Traffic Sign Detection Benchmark (GTSDB). The results
demonstrate the superiority of the proposed model in the traffic sign detection
tasks. More specifically, Pyramid Transformer achieves 77.8% mAP on GTSDB when
applied to the Cascade RCNN as the backbone, which surpasses most well-known
and widely-used state-of-the-art models.
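The abstract above mentions atrous (dilated) convolutions as the source of multi-scale context. The sketch below illustrates the general idea in one dimension with plain Python: the same three-tap kernel covers a wider input span as the dilation rate grows. It is not the Pyramid Transformer layer itself; the helper names, the kernel, and the dilation rates are assumptions chosen for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D convolution with gaps of `dilation` between taps."""
    span = (len(kernel) - 1) * dilation  # distance from first to last tap
    out = []
    for start in range(len(signal) - span):
        out.append(sum(kernel[k] * signal[start + k * dilation]
                       for k in range(len(kernel))))
    return out


def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output value."""
    return (kernel_size - 1) * dilation + 1


signal = list(range(10))  # toy input: 0, 1, ..., 9
kernel = [1, 1, 1]        # hypothetical 3-tap kernel

# Same kernel applied at three dilation rates gives three context scales.
multi_scale = {d: dilated_conv1d(signal, kernel, d) for d in (1, 2, 3)}

for d, values in multi_scale.items():
    print(f"dilation={d} receptive_field={receptive_field(len(kernel), d)} -> {values}")
```

The parameter count stays fixed at three weights while the receptive field grows from 3 to 7 positions, which is the property that makes dilation attractive for objects of varying size.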
Ensembles of Deep Neural Networks for Action Recognition in Still Images
Despite the fact that notable improvements have been made recently in the
field of feature extraction and classification, human action recognition is
still challenging, especially in images, in which, unlike videos, there is no
motion. Thus, the methods proposed for recognizing human actions in videos
cannot be applied to still images. A big challenge in action recognition in
still images is the lack of large enough datasets, which is problematic for
training deep Convolutional Neural Networks (CNNs) due to the overfitting
issue. In this paper, by taking advantage of pre-trained CNNs, we employ the
transfer learning technique to tackle the lack of massive labeled action
recognition datasets. Furthermore, since the last layer of the CNN has
class-specific information, we apply an attention mechanism on the output
feature maps of the CNN to extract more discriminative and powerful features
for classification of human actions. Moreover, we use eight different
pre-trained CNNs in our framework and investigate their performance on the Stanford
40 dataset. Finally, we propose using the Ensemble Learning technique to
enhance the overall accuracy of action classification by combining the
predictions of multiple models. The best setting of our method is able to
achieve 93.17% accuracy on the Stanford 40 dataset.
Comment: 5 pages, 2 figures, 3 tables, Accepted by ICCKE 201
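The abstract above describes combining the predictions of multiple models but does not say how they are combined. One common option, shown here purely as an assumed illustration, is soft voting: average the per-class probabilities across models and pick the highest. The function names and the probability values are hypothetical.

```python
def average_probabilities(model_outputs):
    """Average class probabilities across models (soft voting).

    model_outputs: list of per-model probability lists, all of the same length.
    """
    n_models = len(model_outputs)
    n_classes = len(model_outputs[0])
    return [sum(m[c] for m in model_outputs) / n_models
            for c in range(n_classes)]


def predict(model_outputs):
    """Return the index of the class with the highest averaged probability."""
    avg = average_probabilities(model_outputs)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical softmax outputs of three models for one image, three classes.
outputs = [
    [0.5, 0.4, 0.1],  # model A alone would pick class 0
    [0.2, 0.7, 0.1],  # model B picks class 1
    [0.3, 0.6, 0.1],  # model C picks class 1
]

label = predict(outputs)
print("ensemble prediction:", label)
```

Averaging lets confident models outweigh a single dissenting one, whereas majority voting on hard labels would ignore how confident each model is.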
Traffic Sign Recognition Using Local Vision Transformer
Recognition of traffic signs is a crucial aspect of self-driving cars and
driver assistance systems, and machine vision tasks such as traffic sign
recognition have gained significant attention. CNNs have been frequently used
in machine vision, but introducing vision transformers has provided an
alternative approach to global feature learning. This paper proposes a
novel model that blends the advantages of both convolutional and
transformer-based networks for traffic sign recognition. The proposed model
includes convolutional blocks for capturing local correlations and
transformer-based blocks for learning global dependencies. Additionally, a
locality module is incorporated to enhance local perception. The performance of
the suggested model is evaluated on the Persian Traffic Sign Dataset and German
Traffic Sign Recognition Benchmark and compared with SOTA convolutional and
transformer-based models. The experimental evaluations demonstrate that the
hybrid network with the locality module outperforms pure transformer-based
models and some of the best convolutional networks in accuracy. Specifically,
our proposed final model reached 99.66% accuracy in the German traffic sign
recognition benchmark and 99.8% in the Persian traffic sign dataset, higher
than the best convolutional models. Moreover, it outperforms existing CNNs and
ViTs while maintaining fast inference speed. Consequently, the proposed model
proves to be significantly faster and more suitable for real-world
applications.