14 research outputs found
Multiclassification of license plate based on deep convolution neural networks
License plate classification faces several challenges, such as the varying sizes of plate numbers, differing plate backgrounds, and the limited number of plate images available. In this paper, a multiclass classification model is established using a deep convolutional neural network (CNN) to classify license plates from three countries (Armenia, Belarus, Hungary) on a dataset of 600 images, 200 per class (160 for training and 40 for validation). Because the dataset is small, it is preprocessed with pixel normalization and image data augmentation (rotation, horizontal flip, zoom range) to increase the number of training images. The augmented images are then fed into the classification model, which consists of four convolutional blocks. The model is trained with categorical cross-entropy loss and the Adam optimizer with a learning rate of 0.0001. It reaches 99.17% and 97.50% accuracy on the training and validation sets respectively, with a total classification accuracy of 96.66%, and training takes 12 minutes. The implementation uses Anaconda Python 3.7 and Keras with a TensorFlow backend.
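As a rough illustration of the pipeline described above, the sketch below builds a four-block Keras CNN with the stated augmentation and optimizer settings. The filter counts, input size, dense-layer width, augmentation magnitudes, and directory layout are assumptions, since the abstract does not specify them; this is a minimal sketch, not the authors' implementation.

```python
# Sketch of the described pipeline: pixel normalization, augmentation,
# a four-block CNN, and Adam (lr=0.0001) with categorical cross-entropy.
# Filter counts, 224x224 input size, and augmentation values are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # pixel normalization
    rotation_range=15,        # rotation (angle is an assumption)
    horizontal_flip=True,
    zoom_range=0.2,           # zoom range (value is an assumption)
    validation_split=0.2,     # 160 train / 40 validation images per class
)

train_gen = datagen.flow_from_directory(
    "plates/", target_size=(224, 224), batch_size=32,
    class_mode="categorical", subset="training")
val_gen = datagen.flow_from_directory(
    "plates/", target_size=(224, 224), batch_size=32,
    class_mode="categorical", subset="validation")

model = models.Sequential()
model.add(layers.Input(shape=(224, 224, 3)))
for filters in (32, 64, 128, 256):        # four convolutional blocks
    model.add(layers.Conv2D(filters, 3, activation="relu", padding="same"))
    model.add(layers.MaxPooling2D())
model.add(layers.Flatten())
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dense(3, activation="softmax"))  # Armenia, Belarus, Hungary

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_gen, validation_data=val_gen, epochs=50)
```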
HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks
The behaviors of deep neural networks (DNNs) are notoriously resistant to
human interpretations. In this paper, we propose Hypergradient Data Relevance
Analysis, or HYDRA, which interprets the predictions made by DNNs as effects of
their training data. Existing approaches generally estimate data contributions
around the final model parameters and ignore how the training data shape the
optimization trajectory. By unrolling the hypergradient of test loss w.r.t. the
weights of training data, HYDRA assesses the contribution of training data
toward test data points throughout the training trajectory. In order to
accelerate computation, we remove the Hessian from the calculation and prove
that, under moderate conditions, the approximation error is bounded.
Corroborating this theoretical claim, empirical results indicate the error is
indeed small. In addition, we quantitatively demonstrate that HYDRA outperforms
influence functions in accurately estimating data contribution and detecting
noisy data labels. The source code is available at
https://github.com/cyyever/aaai_hydra_8686
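The Hessian-free approximation can be sketched as follows: assuming model checkpoints and per-step learning rates are saved along the training trajectory, a training example's contribution to a test prediction is estimated by accumulating inner products between its training gradient at each checkpoint and the test-loss gradient at the final model. The PyTorch sketch below is only illustrative, in the spirit of the abstract; the checkpoint format, helper names, and sign/weighting convention are assumptions rather than the authors' implementation.

```python
# Hedged sketch of a Hessian-free data-contribution estimate: accumulate,
# over saved checkpoints of the training trajectory, inner products of a
# training example's gradient with the test-loss gradient at the final model.
import torch

def flat_grad(loss, params):
    """Gradient of `loss` w.r.t. `params`, flattened into one vector."""
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def contribution(model, loss_fn, train_example, test_example,
                 checkpoints, learning_rates):
    """Estimate how much `train_example` contributed to the test loss.

    checkpoints: state_dicts saved along the training trajectory.
    learning_rates: learning rate used at each corresponding step.
    """
    x_te, y_te = test_example
    x_tr, y_tr = train_example

    # Test-loss gradient at the final parameters.
    model.load_state_dict(checkpoints[-1])
    params = [p for p in model.parameters() if p.requires_grad]
    g_test = flat_grad(loss_fn(model(x_te), y_te), params)

    score = 0.0
    for state, lr in zip(checkpoints, learning_rates):
        model.load_state_dict(state)
        g_train = flat_grad(loss_fn(model(x_tr), y_tr), params)
        # Negative sign: a training point whose gradient pushes parameters
        # toward lower test loss receives a positive contribution score.
        score += -lr * torch.dot(g_test, g_train).item()
    return score
```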
Mnemosyne: Learning to Train Transformers with Transformers
In this work, we propose a new class of learnable optimizers, called
Mnemosyne. It is based on novel spatio-temporal low-rank implicit-attention
Transformers that can learn to train entire neural network
architectures, including other Transformers, without any task-specific
optimizer tuning. We show that Mnemosyne: (a) outperforms popular LSTM
optimizers (also with new feature engineering to mitigate catastrophic
forgetting of LSTMs), (b) can successfully train Transformers while using
simple meta-training strategies that require minimal computational resources,
(c) matches the accuracy of SOTA hand-designed optimizers with carefully tuned
hyper-parameters (often producing top performing models). Furthermore,
Mnemosyne provides space complexity comparable to that of its hand-designed
first-order counterparts, which allows it to scale to training larger sets of
parameters. We conduct an extensive empirical evaluation of Mnemosyne on: (a)
fine-tuning a wide range of Vision Transformers (ViTs) from medium-size
architectures to massive ViT-Hs (36 layers, 16 heads), (b) pre-training BERT
models and (c) soft prompt-tuning large 11B+ T5XXL models. We complement our
results with a comprehensive theoretical analysis of the compact associative
memory used by Mnemosyne, which we believe has not been done before.
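For readers unfamiliar with learnable optimizers, the toy sketch below shows the generic interface such an optimizer exposes: a learned network consumes per-parameter features (here, the current gradient and a running state) and emits parameter updates. It is only a schematic of the learned-optimizer idea under assumed feature choices and step scaling; Mnemosyne's actual spatio-temporal low-rank implicit-attention architecture and meta-training procedure are not reproduced here.

```python
# Schematic of a learned optimizer (not Mnemosyne's architecture): a small
# learned network maps per-parameter features to an additive update.
import torch
import torch.nn as nn

class LearnedOptimizer(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        # In Mnemosyne this role is played by a spatio-temporal low-rank
        # implicit-attention Transformer; an MLP is used here as a stand-in.
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def step(self, params, state):
        """Apply one learned update to each parameter of the trainee model."""
        with torch.no_grad():
            for p in params:
                if p.grad is None:
                    continue
                s = state.setdefault(p, torch.zeros_like(p))
                s.mul_(0.9).add_(p.grad)                       # running state
                feats = torch.stack([p.grad.reshape(-1),
                                     s.reshape(-1)], dim=-1)   # (n, 2) features
                update = self.net(feats).reshape(p.shape)      # learned update
                p.add_(0.01 * update)                          # assumed step scale
```

During meta-training, the weights of the optimizer network itself would be trained so that models optimized with it reach low loss; the abstract's point (b) is that for Mnemosyne this meta-training can use simple strategies and minimal compute.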