14 research outputs found

    Multiclassification of license plate based on deep convolution neural networks

    In the classification of license plates there are challenges such as the different sizes of plate numbers, the plates' backgrounds, and the limited size of the plate dataset. In this paper, a multiclass classification model is established using a deep convolutional neural network (CNN) to classify license plates from three countries (Armenia, Belarus, Hungary), with a dataset of 600 images, 200 per class (160 for the training set and 40 for the validation set). Because the dataset is small, it is preprocessed using pixel normalization and image data augmentation techniques (rotation, horizontal flip, zoom range) to increase the number of samples. The augmented images are then fed into the convolutional model, which consists of four blocks of convolution layers. To compute and optimize the efficiency of the classification model, categorical cross-entropy and the Adam optimizer are used with a learning rate of 0.0001. The model achieved accuracies of 99.17% and 97.50% on the training and validation sets respectively, with a total classification accuracy of 96.66%. Training lasted 12 minutes. Anaconda Python 3.7 and Keras with a TensorFlow backend were used.
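    As a rough illustration of the pipeline this abstract describes, the sketch below builds a four-block Keras CNN with pixel rescaling and rotation/flip/zoom augmentation, compiled with categorical cross-entropy and Adam at a learning rate of 0.0001. The image size, filter counts, dense-layer width, epoch count, augmentation magnitudes, and the plates/ directory are illustrative assumptions, not values taken from the paper.

```python
# Minimal Keras sketch of the described setup (assumed layer sizes and paths).
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE = (128, 128)   # assumed input resolution
NUM_CLASSES = 3         # Armenia, Belarus, Hungary

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,          # pixel normalization
    rotation_range=15,          # rotation (the angle is an assumption)
    horizontal_flip=True,
    zoom_range=0.2,
    validation_split=0.2,       # 160 train / 40 validation images per class
)

train_gen = datagen.flow_from_directory(
    "plates/",                  # hypothetical dataset directory
    target_size=IMG_SIZE, class_mode="categorical", subset="training")
val_gen = datagen.flow_from_directory(
    "plates/",
    target_size=IMG_SIZE, class_mode="categorical", subset="validation")

def conv_block(filters):
    # one convolution block: Conv + ReLU followed by max pooling
    return [layers.Conv2D(filters, 3, padding="same", activation="relu"),
            layers.MaxPooling2D()]

model = models.Sequential(
    [layers.Input(shape=IMG_SIZE + (3,))]
    + conv_block(32) + conv_block(64) + conv_block(128) + conv_block(128)
    + [layers.Flatten(),
       layers.Dense(256, activation="relu"),
       layers.Dense(NUM_CLASSES, activation="softmax")])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])

model.fit(train_gen, validation_data=val_gen, epochs=50)
```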

    HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks

    The behaviors of deep neural networks (DNNs) are notoriously resistant to human interpretations. In this paper, we propose Hypergradient Data Relevance Analysis, or HYDRA, which interprets the predictions made by DNNs as effects of their training data. Existing approaches generally estimate data contributions around the final model parameters and ignore how the training data shape the optimization trajectory. By unrolling the hypergradient of test loss w.r.t. the weights of training data, HYDRA assesses the contribution of training data toward test data points throughout the training trajectory. In order to accelerate computation, we remove the Hessian from the calculation and prove that, under moderate conditions, the approximation error is bounded. Corroborating this theoretical claim, empirical results indicate the error is indeed small. In addition, we quantitatively demonstrate that HYDRA outperforms influence functions in accurately estimating data contribution and detecting noisy data labels. The source code is available at https://github.com/cyyever/aaai_hydra_8686
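    The sketch below illustrates the Hessian-free approximation the abstract refers to: each training example's contribution to a test prediction is accumulated over the optimization trajectory as the dot product between that example's per-step gradient (scaled by the learning rate) and the final test-loss gradient, with the Hessian term of the exact unrolled hypergradient dropped. The tiny linear model, full-batch SGD, learning rate, and step count are assumptions chosen only to keep the example self-contained; this is not the released implementation.

```python
# Hessian-free contribution sketch (illustrative, not the authors' code).
import torch

torch.manual_seed(0)
n_train, dim = 8, 5
X = torch.randn(n_train, dim)
y = torch.randn(n_train, 1)
x_test, y_test = torch.randn(1, dim), torch.randn(1, 1)

theta = torch.zeros(dim, 1, requires_grad=True)   # toy linear model
lr, steps = 0.1, 20

def loss(params, xb, yb):
    return ((xb @ params - yb) ** 2).mean()

# d(theta_T)/d(w_i) is approximated by -sum_t lr * grad_i(theta_t) / n_train,
# i.e. the Hessian term of the exact unrolled hypergradient is dropped.
accum = torch.zeros(n_train, dim)

for _ in range(steps):
    # per-example gradients at the current parameters
    per_ex = torch.stack([
        torch.autograd.grad(loss(theta, X[i:i+1], y[i:i+1]), theta)[0].flatten()
        for i in range(n_train)])
    accum -= lr * per_ex / n_train

    # ordinary full-batch SGD step on the mean training loss
    g = torch.autograd.grad(loss(theta, X, y), theta)[0]
    with torch.no_grad():
        theta -= lr * g

# contribution of each training point = (final test gradient) . (accumulated term);
# a positive value means up-weighting that point would increase the test loss.
g_test = torch.autograd.grad(loss(theta, x_test, y_test), theta)[0].flatten()
contribution = accum @ g_test
print(contribution)
```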

    Mnemosyne: Learning to Train Transformers with Transformers

    In this work, we propose a new class of learnable optimizers, called Mnemosyne. It is based on the novel spatio-temporal low-rank implicit attention Transformers that can learn to train entire neural network architectures, including other Transformers, without any task-specific optimizer tuning. We show that Mnemosyne: (a) outperforms popular LSTM optimizers (also with new feature engineering to mitigate catastrophic forgetting of LSTMs), (b) can successfully train Transformers while using simple meta-training strategies that require minimal computational resources, (c) matches accuracy-wise SOTA hand-designed optimizers with carefully tuned hyper-parameters (often producing top performing models). Furthermore, Mnemosyne provides space complexity comparable to that of its hand-designed first-order counterparts, which allows it to scale to training larger sets of parameters. We conduct an extensive empirical evaluation of Mnemosyne on: (a) fine-tuning a wide range of Vision Transformers (ViTs) from medium-size architectures to massive ViT-Hs (36 layers, 16 heads), (b) pre-training BERT models and (c) soft prompt-tuning large 11B+ T5XXL models. We complement our results with a comprehensive theoretical analysis of the compact associative memory used by Mnemosyne, which we believe was never done before.
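    For readers unfamiliar with learnable optimizers, the sketch below shows the general pattern the abstract builds on: a small neural network maps each parameter's gradient (plus a running temporal state) to an update step, replacing a hand-designed rule such as Adam. Everything here is a generic, assumed illustration of that interface; Mnemosyne's spatio-temporal low-rank attention architecture and its meta-training procedure are abstracted into the update_net placeholder and are not reproduced.

```python
# Generic learned-optimizer interface (illustrative stand-in, not Mnemosyne).
import torch
import torch.nn as nn

class LearnedOptimizer(nn.Module):
    """Per-parameter update rule: (gradient, running state) -> update step."""
    def __init__(self, hidden=16):
        super().__init__()
        # stand-in for the meta-learned model; in practice this network is
        # meta-trained so that the optimizee's loss decreases over the unroll
        self.update_net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def step(self, params, grads, state):
        new_state = []
        for p, g, s in zip(params, grads, state):
            s = 0.9 * s + 0.1 * g                        # simple temporal feature
            feats = torch.stack([g.flatten(), s.flatten()], dim=-1)
            delta = self.update_net(feats).view_as(p)
            p.data.add_(0.01 * delta)                    # scaled learned update
            new_state.append(s)
        return new_state

# Usage: drive a tiny "optimizee" network with the (here untrained) optimizer.
optimizee = nn.Linear(4, 1)
learned_opt = LearnedOptimizer()
state = [torch.zeros_like(p) for p in optimizee.parameters()]
x, y = torch.randn(32, 4), torch.randn(32, 1)

for _ in range(5):
    loss = ((optimizee(x) - y) ** 2).mean()
    grads = torch.autograd.grad(loss, list(optimizee.parameters()))
    state = learned_opt.step(list(optimizee.parameters()), grads, state)
```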