Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)

    Off-line identifying Script Writers by Swin Transformers and ResNeSt-50

    In this work, we present two advanced deep learning models for identifying script writers: the Swin Transformer and ResNeSt-50. The Swin Transformer is known for its robustness to variations and its ability to model long-range dependencies, which helps it capture context and make robust predictions. Through extensive training on large datasets of handwritten text samples, the Swin Transformer operates on sequences of image patches and learns a robust representation of each writer's unique style. ResNeSt-50, a residual network with Split-Attention blocks (which generalize Squeeze-and-Excitation, SE), uses its multiple layers to learn complex representations of a writer's style and to distinguish between different writing styles with high precision; its attention mechanism helps the model focus on distinctive handwriting characteristics and suppress noise. The experimental results demonstrate exceptional performance: the Swin Transformer achieves 98.50% accuracy (at patch level) on the CVL database, which consists of images with cursively handwritten German and English texts, and ResNeSt-50 achieves 96.61% accuracy (at page level) on the same database. This research advances writer identification by showcasing the effectiveness of the Swin Transformer and ResNeSt-50; the achieved accuracy underscores the potential of these models to process and understand complex handwriting effectively.
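    A minimal sketch of how patch-level writer identification with a Swin backbone might look, assuming the timm library and page-level aggregation by averaging patch logits; the model variant, patch count, and writer count are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: patch-level writer classification with a Swin backbone,
# aggregated to a page-level decision by averaging patch logits.
import torch
import timm

NUM_WRITERS = 310  # assumption: set to the number of writers in the dataset

# pretrained=False keeps the sketch offline; load pretrained weights in practice
model = timm.create_model("swin_tiny_patch4_window7_224",
                          pretrained=False, num_classes=NUM_WRITERS)
model.eval()

def identify_writer(page_patches: torch.Tensor) -> int:
    """page_patches: (N, 3, 224, 224) patches cropped from one page."""
    with torch.no_grad():
        logits = model(page_patches)          # (N, NUM_WRITERS) patch scores
    return int(logits.mean(dim=0).argmax())   # page-level prediction

# 16 random tensors standing in for cropped handwriting patches
print(identify_writer(torch.randn(16, 3, 224, 224)))
```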

    A Multimodal Biometric Authentication System Using Autoencoders and Siamese Networks for Enhanced Security

    Ensuring secure and reliable identity verification is crucial, and biometric authentication plays a significant role in achieving it. However, relying on a single biometric trait (unimodal authentication) can limit accuracy and leave the system vulnerable to attacks. Multimodal authentication, which combines multiple biometric traits, can enhance accuracy and security by leveraging their complementary strengths. In the literature, different biometric modalities, such as face, voice, fingerprint, and iris, have been studied and used extensively for user authentication. Our research introduces a highly effective multimodal biometric authentication system based on deep learning. We focus on two of the most user-friendly modalities: face and voice recognition. We employ a convolutional autoencoder for face images and an LSTM autoencoder for voice data to extract features, which are then concatenated to form a joint feature representation. A Siamese network carries out the final step of user identification. We evaluated the model on the OMG-Emotion and RAVDESS datasets, achieving accuracies of 89.79% and 95% on RAVDESS and OMG-Emotion, respectively, using the combined face and voice modalities.
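    A minimal sketch of the described fusion, assuming TensorFlow/Keras: a convolutional encoder for face images, an LSTM encoder for voice features, concatenation into a joint embedding, and a Siamese distance head. Input shapes, layer sizes, and the loss are illustrative assumptions, not the paper's settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def face_encoder():
    # Convolutional encoder half of the face autoencoder (decoder omitted)
    inp = layers.Input((64, 64, 3))
    x = layers.Conv2D(32, 3, strides=2, activation="relu")(inp)
    x = layers.Conv2D(64, 3, strides=2, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    return Model(inp, layers.Dense(128)(x), name="face_enc")

def voice_encoder():
    # LSTM encoder half of the voice autoencoder (decoder omitted)
    inp = layers.Input((100, 40))  # assumed: 100 frames of 40 MFCCs
    return Model(inp, layers.Dense(128)(layers.LSTM(64)(inp)), name="voice_enc")

f_enc, v_enc = face_encoder(), voice_encoder()

def joint(face, voice):
    # Concatenate the two embeddings into the joint representation
    return layers.Concatenate()([f_enc(face), v_enc(voice)])

# Siamese pair: L2 distance between two joint (face, voice) embeddings
fa, va = layers.Input((64, 64, 3)), layers.Input((100, 40))
fb, vb = layers.Input((64, 64, 3)), layers.Input((100, 40))
dist = layers.Lambda(lambda t: tf.norm(t[0] - t[1], axis=1, keepdims=True))(
    [joint(fa, va), joint(fb, vb)])
siamese = Model([fa, va, fb, vb], dist)
siamese.compile(optimizer="adam", loss="mse")  # contrastive loss in practice
siamese.summary()
```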

    Multi-Biometric System Based On The Fusion Of Fingerprint And Finger-Vein

    Biometrics is the process of measuring the unique biological traits of an individual for identification and verification purposes. Multiple traits can be combined to enhance the security and robustness of a system. This study concentrates exclusively on the finger and employs two modalities: fingerprint and finger vein. The proposed system extracts finger-vein features and applies two matching algorithms, ridge-based matching and minutiae-based matching, to derive matching scores for both biometrics. The scores from the two modalities are combined using four fusion approaches: holistic fusion, non-linear fusion, sum-rule-based fusion, and Dempster-Shafer theory. The final decision is based on the performance metrics and the Receiver Operating Characteristic (ROC) curve of the best-performing fusion technique. The proposed technique is tested on images from the Nanjing University of Posts and Telecommunications Fingerprint and Finger-vein dataset (NUPT-FPV). According to the results, obtained on 840 input images, the proposed system achieves an Equal Error Rate (EER) of 0% with Dempster-Shafer-based fusion and 14% with the other three fusion techniques. The False Acceptance Rate (FAR) is 0% for all fusion techniques, which is crucial for security and for preventing unauthorized access.
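    A minimal sketch of two of the four score-fusion rules, assuming normalized match scores in [0, 1]. The Dempster-Shafer step is a basic two-hypothesis (genuine/impostor) combination, not necessarily the paper's exact formulation.

```python
# Hedged sketch of score-level fusion for two matchers
# (fingerprint score s_fp and finger-vein score s_fv).
def sum_rule(s_fp: float, s_fv: float, w: float = 0.5) -> float:
    """Weighted sum of normalized match scores in [0, 1]."""
    return w * s_fp + (1 - w) * s_fv

def dempster_shafer(s_fp: float, s_fv: float) -> float:
    """Combine two masses with m(genuine) = score, m(impostor) = 1 - score."""
    m_genuine = s_fp * s_fv                              # both say genuine
    conflict = s_fp * (1 - s_fv) + (1 - s_fp) * s_fv     # matchers disagree
    return m_genuine / (1 - conflict) if conflict < 1 else 0.0

print(sum_rule(0.9, 0.8))         # 0.85
print(dempster_shafer(0.9, 0.8))  # > 0.9: agreement reinforces the decision
```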

    A Deep Learning model based on CNN using Keras and TensorFlow to determine real time melting point of chemical substances

    Deep learning is a subset of machine learning that uses artificial neural networks inspired by human cognitive systems. Although it is a relatively new approach, it has quickly become popular and effective, succeeding in many applications where classical machine learning has had only partial success. Building on this, the proposed deep learning model is designed for melting point detection apparatus, which determines the melting point of chemical substances and is widely used in the pharmaceutical and chemical industries. The model classifies images of a chemical's state (solid or liquid) with a deep neural network (DNN) built on the TensorFlow framework and the Keras library, using activation functions such as ReLU and sigmoid together with MaxPool and Flatten layers. The TensorFlow-based architecture can determine the melting point of chemicals in real time on a single-board computer; the implementation uses Python as the programming language. The input images fall into two classes of chemical state, solid or liquid, and a DNN was chosen for training because it provides high accuracy. The results are discussed in terms of image classification accuracy: across the two class labels, the model reaches a maximum training accuracy of 99.72% and a maximum validation accuracy of 99.37% (on liquid-state images), with an average accuracy of 84.17% or higher after a sufficient number of epochs.
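    A minimal Keras sketch of a binary solid/liquid image classifier built from the layers the abstract names (Conv + ReLU, MaxPool, Flatten, sigmoid); the input size and filter counts are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input((128, 128, 3)),                 # assumed input resolution
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),       # P(liquid); the melting point
])                                               # is read when this flips
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```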

    ERNet: Enhanced ResNet for classification of breast histopathological images

    Despite rapid progress in the field of breast cancer research, histopathological analysis is still considered the gold standard in cancer diagnosis. Researchers are working intensively to automate the detection and analysis of breast histology images, which helps improve accuracy and minimise processing time. Deep learning models have contributed greatly to solving many image classification tasks. In this paper we propose a model to classify breast histological images, redesigned from the existing ResNet architecture so as to reduce model parameters and increase computational efficiency. The approach uses an enhanced ResNet connection instead of the identity shortcut connection used in the original ResNet architecture. We apply the proposed method to the BreakHis dataset and achieve an accuracy of around 95.92%. The numerical results show that our approach outperforms previous methods with respect to sensitivity and accuracy.
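    The abstract does not spell out the enhanced connection, so the following PyTorch sketch shows only one plausible reading: a residual block whose identity shortcut is replaced by a lightweight 1x1 projection. The paper's actual connection may differ.

```python
import torch
import torch.nn as nn

class EnhancedResidualBlock(nn.Module):
    """Residual block with a learned shortcut instead of a pure identity."""
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(ch),
        )
        # Assumed "enhanced" shortcut: a cheap 1x1 conv projection
        self.shortcut = nn.Conv2d(ch, ch, 1, bias=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.shortcut(x))

print(EnhancedResidualBlock(64)(torch.randn(1, 64, 32, 32)).shape)
```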

    Image-based Mangifera Indica Leaf Disease Detection using Transfer Learning for Deep Learning Methods

    Mangifera indica, commonly known as mango, comes from a large tree. The leaf of the mango tree has human health benefits: mango leaf extract is used in treating various diseases, including in patients with cancer and diabetes, and it also has anti-oxidant and anti-microbial biological activity. Leaf disease, including fungal disease, is a severe threat to food security; it can lead to decreased productivity and huge losses for farmers. Observing and determining whether a leaf is infected with the naked eye is unreliable and inconsistent. Technological advancement has helped agriculture in several ways, and deep learning methods are a promising approach to spotting leaf diseases with high accuracy. A mango leaf disease detection model is developed from a pre-trained ResNet18 model, used in transfer learning along with the Fast.ai framework. Around 2000 images were used, including images of healthy and infected leaves. The trained model achieved an accuracy of 99.88% and performed well compared to existing state-of-the-art methods.
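    A minimal sketch of ResNet18 transfer learning with Fast.ai as described; the folder name, layout (one subfolder per class), and hyperparameters are assumptions for illustration.

```python
from fastai.vision.all import (
    ImageDataLoaders, vision_learner, resnet18, accuracy, Resize
)

# Assumed layout: mango_leaves/<class_name>/<image files>
dls = ImageDataLoaders.from_folder(
    "mango_leaves", valid_pct=0.2, item_tfms=Resize(224),
)
learn = vision_learner(dls, resnet18, metrics=accuracy)  # pretrained ResNet18
learn.fine_tune(5)   # train the new head, then unfreeze and fine-tune the body
```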

    Robust fingerprint recognition approach based on diagonal slice of polyspectra in the polar space

    Although fingerprint recognition is a mature technology, and commercial state-of-the-art systems are successfully used in a number of real applications today, not all problems have been solved and research in the field is still very active. This paper presents a new approach for estimating the shift and rotation parameters between fingerprint images stored in a database. It operates on a third-order frequency-domain measure called the auto-bispectrum, which allows the shift and the rotation to be estimated separately. Diagonal slices of the auto- and cross-bispectrum, and their spectra, are proposed. The rotation parameters are estimated from the polar-sampled third-order spectrum using cross-correlation; after compensating for rotation, the translational component can then be easily estimated, e.g., by phase correlation. Experimental evidence of this performance is presented, and the mathematical reasons behind these characteristics are explained in depth. We compare our approach in simulation to other frequency-domain fingerprint recognition algorithms and find that it estimates shift and rotation parameters better than the other methods.
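    A minimal NumPy sketch of the final step only: translation estimation by phase correlation, assuming rotation has already been compensated. The bispectrum-based rotation estimation itself is omitted.

```python
import numpy as np

def phase_correlation_shift(a: np.ndarray, b: np.ndarray):
    """Estimate the (row, col) shift of image a relative to image b."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    cross /= np.abs(cross) + 1e-12              # keep phase information only
    peak = np.fft.ifft2(cross).real             # delta at the shift offset
    dy, dx = np.unravel_index(np.argmax(peak), peak.shape)
    # Wrap indices past the midpoint back to negative shifts
    dy = dy - a.shape[0] if dy > a.shape[0] // 2 else dy
    dx = dx - a.shape[1] if dx > a.shape[1] // 2 else dx
    return dy, dx

img = np.random.rand(128, 128)
shifted = np.roll(img, (5, -9), axis=(0, 1))
print(phase_correlation_shift(shifted, img))    # ~ (5, -9)
```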

    On Performance Analysis Of Diabetic Retinopathy Classification

    This paper describes the classification of bulk OCT retinal fundus images as normal or diabetic retinopathy using intensity histogram features, the Gray Level Co-occurrence Matrix (GLCM), and the Gray Level Run Length Matrix (GLRLM) feature extraction techniques. The three feature sets were extracted and compared on equal terms. A total of 301 bulk OCT retinal fundus color images covering the two classes, normal and diabetic retinopathy, were used. A fourth-order PDE-based filtered image is used as the input for feature extraction and classification. Using these OCT retinal fundus images, the most effective feature extraction method is identified.
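    A minimal sketch of GLCM texture-feature extraction with scikit-image, the kind of features the study compares; scikit-image has no GLRLM builtin, so that part is omitted, and the distances/angles are assumptions.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_u8: np.ndarray) -> dict:
    """Compute a few standard GLCM texture properties of an 8-bit image."""
    glcm = graycomatrix(
        gray_u8, distances=[1], angles=[0, np.pi / 2],
        levels=256, symmetric=True, normed=True,
    )
    # Average each property over the sampled angles
    return {p: graycoprops(glcm, p).mean()
            for p in ("contrast", "homogeneity", "energy", "correlation")}

# Random stand-in for a filtered fundus image
print(glcm_features(np.random.randint(0, 256, (64, 64), dtype=np.uint8)))
```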

    Classification of radiological patterns of tuberculosis with a convolutional neural network in X-ray images

    In this paper we propose the classification of radiological patterns indicating the presence of tuberculosis in X-ray images; two to six patterns (consolidation, fibrosis, opacity, pleural effusion, nodules, and cavitations) were observed in the patients' radiographs. It is important to mention that specialists consider the type of TB pattern in order to provide appropriate treatment, yet not all medical centres have specialists who can immediately interpret radiological patterns. Considering the above, the aim is to classify the patterns with a convolutional neural network to support a more accurate diagnosis on X-rays, so that doctors can recommend immediate treatment and thus avoid infecting more people. For the classification of tuberculosis patterns, a purpose-built convolutional neural network (CNN) was proposed and compared against the VGG16, InceptionV3, and ResNet-50 architectures, which were selected based on the results of other radiograph classification research [1]-[3]. The macro-average AUC-SVM results were 0.80 for both the proposed architecture and InceptionV3, 0.75 for VGG16, and 0.79 for ResNet-50; the proposed architecture thus matches InceptionV3 for the best classification results.
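    A minimal sketch of the reported metric, a macro-averaged one-vs-rest AUC over the six pattern classes, computed with scikit-learn; the scores below are random stand-ins for a classifier's per-class probabilities.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 6, size=200)            # six TB pattern classes
y_score = rng.random((200, 6))
y_score /= y_score.sum(axis=1, keepdims=True)    # rows must be probabilities

macro_auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
print(f"macro-average AUC: {macro_auc:.2f}")     # ~0.5 for random scores
```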

    Deep Learning-based Framework for Math Formulas Understanding

    Extracting mathematical formulas from images of scientific documents and converting them into structured data for storage in a database is essential for their further use. However, recognizing and extracting math formulas automatically, rapidly, and effectively can be challenging. To handle this problem, we propose a system with a deep learning architecture that uses formula combination features to train the YOLOv8 model. This system can detect and classify formulas both inside and outside the running text. Once formulas are extracted, a robust end-to-end math formula recognition system automatically identifies and classifies math symbols using Faster R-CNN object detection, and a Convolutional Graph Neural Network (ConvGNN) then analyzes the formula layout, since a formula is better represented as a graph with complex relationships and object interdependencies. The ConvGNN can predict formula linkages without resorting to laborious feature engineering. Experimental results on the IBEM and CROHME 2019 datasets show that the proposed approach can accurately extract isolated formulas with an mAP of 99.3%, extract embedded formulas with an mAP of 80.3%, detect symbols with an mAP of 87.3%, and analyze formula layout with an accuracy of 92%. We also show that our system is competitive with related work.
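    A minimal sketch of the detection stage with Ultralytics YOLOv8, assuming a dataset YAML that lists the two formula classes (isolated, embedded); the file names, paths, and hyperparameters are illustrative, not the paper's setup.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                    # pretrained YOLOv8 checkpoint
model.train(data="formulas.yaml", epochs=50)  # YAML defines the two classes

results = model("page_scan.png")              # detect formula regions on a page
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)        # class, confidence, bounding box
```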
