103 research outputs found

    Towards End-to-end Car License Plate Location and Recognition in Unconstrained Scenarios

    Full text link
    Benefiting from the rapid development of convolutional neural networks, the performance of car license plate detection and recognition has been largely improved. Nonetheless, challenges still exist especially for real-world applications. In this paper, we present an efficient and accurate framework to solve the license plate detection and recognition tasks simultaneously. It is a lightweight and unified deep neural network, that can be optimized end-to-end and work in real-time. Specifically, for unconstrained scenarios, an anchor-free method is adopted to efficiently detect the bounding box and four corners of a license plate, which are used to extract and rectify the target region features. Then, a novel convolutional neural network branch is designed to further extract features of characters without segmentation. Finally, recognition task is treated as sequence labelling problems, which are solved by Connectionist Temporal Classification (CTC) directly. Several public datasets including images collected from different scenarios under various conditions are chosen for evaluation. A large number of experiments indicate that the proposed method significantly outperforms the previous state-of-the-art methods in both speed and precision

    Semi-supervised Thai Sentence Segmentation Using Local and Distant Word Representations

    Get PDF
    A sentence is typically treated as the minimal syntactic unit used to extract valuable information from long text. However, in written Thai, there are no explicit sentence markers. Some prior works use machine learning; however, a deep learning approach has never been employed. We propose a deep learning model for sentence segmentation that includes three main contributions. First, we integrate n-gram embedding as a local representation to capture word groups near sentence boundaries. Second, to focus on the keywords of dependent clauses, we combine the model with a distant representation obtained from self-attention modules. Finally, due to the scarcity of labeled data, for which annotation is difficult and time-consuming, we also investigate two techniques that allow us to utilize unlabeled data: Cross-View Training (CVT) as a semi-supervised learning technique, and a pre-trained language model (ELMo) to improve word representation. In the experiments, our model reduced the relative error by 7.4% and 18.5% compared with the baseline models on the Orchid and UGWC datasets, respectively. Ablation studies revealed that the main contributing factor was adopting n-gram features, which were further analyzed using the interpretation technique and indicated that the model utilizes the features in the same way that humans do

    Civil infrastructure damage and corrosion detection: an application of machine learning

    Get PDF
    Automatic detection of corrosion and associated damages to civil infrastructures such as bridges, buildings, and roads, from aerial images captured by an Unmanned Aerial Vehicle (UAV), helps one to overcome the challenges and shortcomings (objectivity and reliability) associated with the manual inspection methods. Deep learning methods have been widely reported in the literature for civil infrastructure corrosion detection. Among them, convolutional neural networks (CNNs) display promising applicability for the automatic detection of image features less affected by image noises. Therefore, in the current study, we propose a modified version of deep hierarchical CNN architecture, based on 16 convolution layers and cycle generative adversarial network (CycleGAN), to predict pixel-wise segmentation in an end-to-end manner using the images of Bolte Bridge and sky rail areas in Victoria (Melbourne). The convolutedly designed model network proposed in the study is based on learning and aggregation of multi-scale and multilevel features while moving from the low convolutional layers to the high-level layers, thus reducing the consistency loss in images due to the inclusion of CycleGAN. The standard approaches only use the last convolutional layer, but our proposed architecture differs from these approaches and uses multiple layers. Moreover, we have used guided filtering and Conditional Random Fields (CRFs) methods to refine the prediction results. Additionally, the effectiveness of the proposed architecture was assessed using benchmarking data of 600 images of civil infrastructure. Overall, the results show that the deep hierarchical CNN architecture based on 16 convolution layers produced advanced performances when evaluated for different methods, including the baseline, PSPNet, DeepLab, and SegNet. Overall, the extended method displayed the Global Accuracy (GA); Class Average Accuracy (CAC); mean Intersection Of the Union (IOU); Precision (P); Recall (R); and F-score values of 0.989, 0.931, 0.878, 0.849, 0.818 and 0.833, respectively
    • …
    corecore