
    Deep neural networks for image quality: a comparison study for identification photos

    Many online platforms allow their users to upload images to their account profile. Because a user is free to upload any image of their liking to a university or job platform, some profile images turn out not to be professional or adequate in those contexts. Another problem with submitting a profile image is that, even when each submitted image is reviewed, that review is performed manually, and the process alone can be very tedious and time-consuming, especially when a large influx of new users joins those platforms. Based on the international compliance standards used to validate photographs for machine-readable travel documents, SDKs already exist that automatically classify the quality of such photographs; however, their classification is based on traditional computer vision algorithms. Given the growing popularity and strong performance of deep neural networks, it is worth examining how these would perform in this task. This dissertation proposes a deep neural network model to classify the quality of profile images and compares this model against traditional computer vision algorithms with respect to the complexity of the implementation, the quality of the classifications, and the computation time associated with the classification process. To the best of our knowledge, this dissertation is the first to study the use of deep neural networks for image quality classification.
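
    As an illustration of the kind of model the dissertation compares against traditional algorithms, the sketch below fine-tunes a pretrained CNN as a binary quality classifier. The architecture (ResNet-18), the label set, and the preprocessing are illustrative assumptions, not the dissertation's actual design.

```python
# Minimal sketch, not the dissertation's model: fine-tune a pretrained CNN
# to label profile photos as compliant / non-compliant. Architecture,
# class names, and preprocessing are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

CLASSES = ["compliant", "non_compliant"]  # hypothetical label set

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# New classification head; it still needs fine-tuning on labeled profile photos.
model.fc = nn.Linear(model.fc.in_features, len(CLASSES))
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def classify_profile_photo(path: str) -> str:
    """Return the predicted quality label for a single profile image."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)[0]
    return CLASSES[int(probs.argmax())]
```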

    A Survey on Deep Learning in Medical Image Analysis

    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks, and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.

    Entropy in Image Analysis II

    Image analysis is a fundamental task for any application where information must be extracted from images. The analysis requires highly sophisticated numerical and analytical methods, particularly for applications in medicine, security, and other fields where the results of the processing are data of vital importance. This is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In reading the present volume, the reader will appreciate the richness of the methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas.
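
    A minimal sketch of the quantity that gives the Special Issue its name: the Shannon entropy of a grayscale image computed from its intensity histogram. The 8-bit assumption and library choices are mine, not the editors'.

```python
# Sketch: Shannon entropy of a grayscale image, H = -sum(p * log2 p),
# computed over a 256-bin intensity histogram (8-bit assumption).
import numpy as np
from PIL import Image

def image_entropy(path: str) -> float:
    """Entropy in bits per pixel; higher values indicate richer content."""
    gray = np.asarray(Image.open(path).convert("L"))
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())
```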

    CONCRETE CRACK EVALUATION FOR CIVIL INFRASTRUCTURE USING COMPUTER VISION AND DEEP LEARNING

    Surface cracks in civil infrastructure are an important indicator of structural durability and integrity. Concrete cracks are typically investigated by manual visual observation of the surface, which is intrinsically subjective because it depends heavily on the experience of inspectors. Furthermore, manual visual inspection is time-consuming, expensive, and often unsafe when inaccessible structural components need to be assessed. A computer vision-based approach is recognized as a promising alternative that can automatically extract crack information from images captured by a digital camera. Because text and cracks are similar in that both consist of distinguishable lines and curves, image binarization developed for text detection can be appropriate for crack identification. However, although image binarization is useful for separating cracks from the background, the resulting crack assessment is difficult to standardize because it depends heavily on binarization parameters chosen by users. Another critical challenge in digital image processing for crack detection is automatically distinguishing actual cracks from crack-like noise patterns (e.g., stains, holes, dark shadows, and lumps) that are often seen on the surface of concrete structures. In addition, a tailored camera system and a corresponding strategy are necessary to address practical issues such as skewed viewing angles and the processing of sequential crack images for efficient measurement. This research develops a computer vision-based approach, in conjunction with deep learning, for accurate crack evaluation of civil infrastructure. The main contributions of the proposed approach are: (1) a deep learning-based approach for crack detection, (2) hybrid image processing for crack quantification, and (3) camera systems that address the practical issues of skewed viewing angles and efficient measurement with sequential crack images. The proposed research enables accurate crack evaluation and thereby supports a proper maintenance strategy for civil infrastructure in practice.
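
    The sketch below illustrates the binarization-plus-quantification idea in its simplest form: adaptive thresholding to separate dark crack pixels, removal of small crack-like blobs, and a distance transform to estimate crack width. It is not the thesis pipeline; the thresholds, minimum blob area, and width heuristic are assumptions.

```python
# Illustrative sketch only, not the thesis pipeline: binarize a concrete
# surface image, discard small crack-like noise blobs, and estimate crack
# width from a distance transform. Thresholds and sizes are assumptions.
import cv2
import numpy as np

def measure_cracks(path: str, min_area: int = 200):
    """Return a binary crack mask and a rough maximum crack width in pixels."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Cracks are darker than the surrounding surface: adaptive thresholding
    # keeps thin dark structures while tolerating uneven illumination.
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 31, 10)
    # Reject small blobs (stains, holes) that are not elongated cracks.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    mask = np.zeros_like(binary)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            mask[labels == i] = 255
    # Twice the largest distance-transform value approximates the widest
    # crack opening in pixels.
    dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)
    return mask, 2.0 * float(dist.max())
```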

    Deep learning for food instance segmentation

    Food object detection and instance segmentation are critical in many applications such as dietary management and food intake monitoring. Food image recognition poses distinct challenges, such as a large number of classes, high inter-class similarity, and high intra-class variance. These, along with the traditional problems associated with object detection and instance segmentation, make this a very complex computer vision task. Real-world food datasets generally exhibit long-tailed and fine-grained distributions; however, the recent literature fails to address food detection in this regard. In this research, we propose a novel two-stage object detector, which we call Strong LOng-tailed Food object Detection and instance Segmentation (SLOF-DS), to tackle the long-tailed nature of food images. In addition, a multi-task framework that exploits different sources of prior information is proposed to improve the classification of fine-grained classes. Lastly, we propose a new module based on Graph Neural Networks, which we call Graph Confidence Propagation (GCP), that further improves the performance of both the object detection and instance segmentation modules by combining all the model outputs while considering the global image context. Exhaustive quantitative and qualitative analysis performed on two open-source food benchmarks, namely UECFood-256 (object detection) and the AiCrowd Food Recognition Challenge 2022 dataset (instance segmentation), using different baseline algorithms demonstrates the robust improvements introduced by the different components proposed in this thesis. More concretely, we surpass the state-of-the-art performance on both public datasets.
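
    SLOF-DS and GCP are the thesis's own contributions and are not reproduced here; as a hedged illustration of one standard way to counter a long-tailed class distribution, the sketch below computes class-balanced loss weights from per-class sample counts (the beta value is an assumed hyperparameter).

```python
# Not the SLOF-DS detector itself: a small sketch of one standard remedy for
# long-tailed class distributions, "class-balanced" loss weights derived from
# the effective number of samples per class (beta is an assumed value).
import numpy as np

def class_balanced_weights(samples_per_class, beta: float = 0.999) -> np.ndarray:
    """w_c proportional to (1 - beta) / (1 - beta**n_c), normalized to sum to the class count."""
    n = np.asarray(samples_per_class, dtype=np.float64)
    weights = (1.0 - beta) / (1.0 - np.power(beta, n))
    return weights * len(n) / weights.sum()

# A head class with 5,000 images receives a much smaller loss weight than a
# tail class with only 20 images.
print(class_balanced_weights([5000, 800, 120, 20]))
```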

    License Plate Recognition using Convolutional Neural Networks Trained on Synthetic Images

    In this thesis, we propose a license plate recognition system and study the feasibility of using synthetic training samples to train convolutional neural networks for a practical application. First, we develop a modular framework for synthetic license plate generation; to generate different license plate types (or other objects), only the first module needs to be adapted. The other modules apply variations to the training samples, such as background, occlusions, camera perspective projection, object noise, and camera acquisition noise, with the aim of achieving enough variation of the object that the trained networks will also recognize real objects of the same class. Then we design two low-complexity convolutional neural networks for license plate detection and character recognition. Both are designed for simultaneous classification and localization by branching the networks into a classification branch and a regression branch, and they are trained end-to-end over both branches simultaneously, on only our synthetic training samples. To recognize real license plates, we design a pipeline for scale-invariant license plate detection that uses a scale pyramid and a fully convolutional application of the license plate detection network, in order to detect any number of license plates at any scale in an image. Before character classification is applied, potential plate regions are un-skewed based on the detected plate location in order to achieve as optimal a representation of the characters as possible. Character classification is also performed with a fully convolutional sweep to find all characters at once. Both the plate and the character stages apply a refinement classification in which initial classifications are first centered and rescaled. We show that this simple yet effective trick greatly improves the accuracy of our classifications at a small increase in complexity. To our knowledge, this trick has not been exploited before. To show the effectiveness of our system, we first apply it to a dataset of photos of Italian license plates to evaluate the different stages of the system and the effect the classification thresholds have on accuracy. We also find robust training parameters and thresholds that are reliable for classification without any need for calibration on a validation set of real annotated samples (which may not always be available), and we achieve balanced precision and recall on the set of Italian license plates, both in excess of 98%. Finally, to show that our system generalizes to new plate types, we compare it to two reference systems on a dataset of Taiwanese license plates. For this, we only modify the first module of the synthetic plate generation algorithm to produce Taiwanese license plates and adjust parameters regarding plate dimensions; then we train our networks and apply the classification pipeline, using the robust parameters, to the Taiwanese reference dataset. We achieve state-of-the-art performance on plate detection (99.86% precision and 99.1% recall), single character detection (99.6%), and full license plate reading (98.7%).
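
    The thesis's full synthetic-generation framework is modular and more elaborate; the sketch below only illustrates the augmentation stage it describes, warping a rendered plate with a random perspective, compositing it onto a background, and adding acquisition noise. Parameter ranges and the compositing strategy are assumptions.

```python
# Rough sketch of the augmentation stage only (plate rendering is omitted):
# warp a rendered plate with a random perspective, composite it onto a
# background, and add acquisition noise. Parameter ranges are assumptions.
import cv2
import numpy as np

rng = np.random.default_rng(0)

def augment_synthetic_plate(plate: np.ndarray, background: np.ndarray) -> np.ndarray:
    """plate and background are HxWx3 uint8 images; returns one training sample."""
    h, w = plate.shape[:2]
    bh, bw = background.shape[:2]
    # Random perspective: jitter each plate corner by up to 10% of its size.
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    dst = (src + rng.uniform(-0.1, 0.1, size=(4, 2)) * [w, h]).astype(np.float32)
    M = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(plate, M, (bw, bh))
    mask = cv2.warpPerspective(np.full((h, w), 255, np.uint8), M, (bw, bh))
    # Composite the warped plate over the background, then add sensor noise.
    sample = np.where(mask[..., None] > 0, warped, background)
    noise = rng.normal(0, 8, sample.shape)
    return np.clip(sample.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```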

    Artificial Intelligence Algorithms for Eye Banking

    Eye banking plays a critical role in modern medicine by providing cornea tissues for transplantation to restore vision for millions of people worldwide. The corneal endothelium is evaluated by measuring the corneal endothelial cell density (ECD). Unfortunately, the current system for measuring ECD is manual, time-consuming, and error-prone. Furthermore, the impact of social behaviors and biological conditions on the corneal endothelium and corneal transplant success is largely unexplored. To overcome these challenges, this dissertation aims to develop tools for corneal endothelial image and data analysis that enhance the efficiency and quality of cornea transplants. In the first study, an image processing algorithm is developed to analyze corneal endothelial images captured by a Konan CellChek specular microscope. The algorithm identifies the region of interest, filters the image, and employs stochastic watershed segmentation to determine cell boundaries and evaluate ECD. The proposed algorithm achieves a high correlation with manual counts (R² = 0.98) and has an average analysis time of 2.5 seconds. In the second study, a deep learning-based cell segmentation algorithm called Mobile-CellNet is proposed to estimate ECD. This technique addresses the limitations of classical algorithms and yields a more robust and highly efficient algorithm. The approach achieves a mean absolute error of 4.06% for ECD on the test set, similar to U-Net but with significantly fewer floating-point operations and parameters. The third study explores the correlation between alcohol abuse and corneal endothelial morphology in a donor pool of 5,624 individuals. Multivariable regression analysis shows that alcohol abuse is associated with a reduction in endothelial cell density, an increase in the coefficient of variation, and a decrease in percent hexagonality. These studies highlight the potential of big data and artificial intelligence algorithms to accurately and efficiently analyze corneal images and donor medical data, improving the efficiency of eye banking and patient outcomes. By automating the analysis of corneal images and exploring the impact of social behaviors and biological conditions on corneal endothelial morphology, we can enhance the quality and availability of cornea transplants and ultimately improve the lives of millions of people worldwide.
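
    The dissertation's first study uses stochastic watershed segmentation; the sketch below shows a simplified, deterministic marker-based watershed that turns a specular image into a cell count and an ECD estimate. The smoothing, peak spacing, and pixel-scale parameters are placeholder assumptions.

```python
# Simplified sketch, not the dissertation's stochastic-watershed algorithm:
# marker-based watershed segmentation of endothelial cells and a naive
# ECD estimate. Smoothing, peak spacing, and pixel scale are assumptions.
import numpy as np
from scipy import ndimage as ndi
from skimage import filters, feature, segmentation

def estimate_ecd(gray: np.ndarray, mm2_per_pixel: float) -> float:
    """Estimate endothelial cell density (cells/mm^2) from a grayscale image."""
    smooth = filters.gaussian(gray, sigma=2)
    # Cell interiors are brighter than borders: threshold, then split touching
    # cells with a watershed seeded at local maxima of the distance map.
    mask = smooth > filters.threshold_otsu(smooth)
    distance = ndi.distance_transform_edt(mask)
    peaks = feature.peak_local_max(distance, min_distance=5, labels=mask)
    markers = np.zeros(mask.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    labels = segmentation.watershed(-distance, markers, mask=mask)
    n_cells = len(np.unique(labels)) - 1      # label 0 is background
    area_mm2 = mask.sum() * mm2_per_pixel     # area of the analyzed region
    return n_cells / area_mm2
```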