Towards Universal Object Detection
Object detection is one of the most important and challenging research topics in computer vision. It plays an important role in everyday life and has many applications, e.g. surveillance, autonomous driving, robotics, drones, medical imaging, etc. The ultimate goal of object detection is a universal object detector that works well in any case and under any condition, like the human vision system. However, the universality of object detection faces multiple challenges, e.g. scale variance, high-quality requirements, domain shift, and computational constraints. These prevent an object detector from being widely used across various object scales, critical applications requiring extremely accurate localization, scenarios with changing domain priors, and diverse hardware settings. To address these challenges, multiple solutions are proposed in this thesis: an efficient multi-scale architecture to achieve scale-invariant detection, a robust multi-stage framework effective for high-quality requirements, a cross-domain solution to extend universality over various domains, and a design of complexity-aware cascades together with a novel low-precision network to enhance universality under different computational constraints. All these efforts substantially improve the universality of object detection, and the advanced object detector can be applied to broader environments.
Deep neural networks for image quality: a comparison study for identification photos
Many online platforms allow their users to upload images to their account profile.
Because a user is free to upload any image of their liking to a university or a job
platform, profile images are sometimes unprofessional or inadequate in those
contexts. Another problem associated with submitting a profile image is that even
when there is some kind of control over each submitted image, this control is
performed manually by someone, and that process alone can be very tedious and
time-consuming, especially during a large influx of new users joining those
platforms.
Based on international compliance standards used to validate photographs for
machine-readable travel documents, there are SDKs that already perform automatic
classification of the quality of those photographs; however, the classification
is based on traditional computer vision algorithms.
With the growing popularity and strong performance of deep neural networks,
it is worth examining how these would perform in this task.
This dissertation proposes a deep neural network model to classify the quality of
profile images, and a comparison of this model against traditional computer vision
algorithms, with respect to the complexity of the implementation, the quality of
the classifications, and the computation time associated with the classification
process. To the best of our knowledge, this dissertation is the first to study the use
of deep neural networks on image quality classification.
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Entropy in Image Analysis II
Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This fact is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In reading the present volume, the reader will appreciate the richness of the methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas.
Concrete Crack Evaluation for Civil Infrastructure Using Computer Vision and Deep Learning
Department of Urban and Environmental Engineering (Urban Infrastructure Engineering)
Surface cracks of civil infrastructure are one of the important indicators of structural durability and integrity. Concrete cracks are typically investigated by manual visual observation of the surface, which is intrinsically subjective because it highly depends on the experience of inspectors. Furthermore, manual visual inspection is time-consuming, expensive, and often unsafe when inaccessible structural components need to be assessed. A computer vision-based approach is recognized as a promising alternative that can automatically extract crack information from images captured by a digital camera. As texts and cracks are similar in that both consist of distinguishable lines and curves, image binarization developed for text detection can be appropriate for crack identification. However, although image binarization is useful for separating cracks from backgrounds, the crack assessment is difficult to standardize owing to its high dependence on binarization parameters determined by users. Another critical challenge in digital image processing for crack detection is to automatically distinguish actual cracks from crack-like noise patterns (e.g., stains, holes, dark shadows, and lumps), which are often seen on the surface of concrete structures. In addition, a tailored camera system and a corresponding strategy are necessary to effectively address practical issues such as skewed viewing angles and the processing of sequential crack images for efficient measurement. This research develops a computer vision-based approach in conjunction with deep learning for accurate crack evaluation of civil infrastructure.
The main contributions of the proposed approach can be summarized as follows: (1) a deep learning-based approach for crack detection, (2) a hybrid image processing method for crack quantification, and (3) camera systems addressing the practical issues on civil infrastructure, namely the skewed-angle problem and efficient measurement with sequential crack images. The proposed research allows accurate crack evaluation to support a proper maintenance strategy for civil infrastructure in practice.
Deep learning for food instance segmentation
Food object detection and instance segmentation are critical in many applications, such as dietary management or food intake monitoring. Food image recognition poses distinct challenges, such as a large number of classes, high inter-class similarity, and high intra-class variance. This, along with the traditional problems associated with object detection and instance segmentation, makes this a very complex computer vision task. Real-world food datasets generally suffer from long-tailed and fine-grained distributions; however, the recent literature fails to address food detection in this regard. In this research, we propose a novel two-stage object detector, which we call Strong LOng-tailed Food object Detection and instance Segmentation (SLOF-DS), to tackle the long-tailed nature of food images. In addition, a multi-task framework that exploits different sources of prior information is proposed to improve the classification of fine-grained classes. Lastly, we also propose a new module based on Graph Neural Networks, which we call Graph Confidence Propagation (GCP), that further improves the performance of both the object detection and instance segmentation modules by combining all the model outputs while considering the global image context. Exhaustive quantitative and qualitative analysis performed on two open-source food benchmarks, namely UECFood-256 (object detection) and the AiCrowd Food Recognition Challenge 2022 dataset (instance segmentation), using different baseline algorithms demonstrates the robust improvements introduced by the different components proposed in this thesis. More concretely, we outperform the state of the art on both public datasets.
License Plate Recognition using Convolutional Neural Networks Trained on Synthetic Images
In this thesis, we propose a license plate recognition system and study the feasibility
of using synthetic training samples to train convolutional neural networks for a
practical application.
First, we develop a modular framework for synthetic license plate generation; to
generate different license plate types (or other objects), only the first module needs
to be adapted. The other modules apply variations to the training samples, such as
background, occlusions, camera perspective projection, object noise, and camera
acquisition noise, with the aim of achieving enough variation of the object that the
trained networks will also recognize real objects of the same class.
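The modular design described above can be sketched as a pipeline in which only the first stage is object-specific, while later stages apply generic variations. All function and field names below are illustrative, not taken from the thesis:

```python
import random

def make_plate(rng):
    # Stage 1: the only object-specific module; swapping this out
    # yields a different plate type (or another object class entirely).
    return {"text": "".join(rng.choice("ABC0123456789") for _ in range(7))}

def add_background(sample, rng):
    # Generic variation module: composite the object onto a random scene.
    sample["background"] = rng.choice(["road", "wall", "garage"])
    return sample

def add_acquisition_noise(sample, rng):
    # Generic variation module: simulate camera sensor noise strength.
    sample["noise_sigma"] = rng.uniform(0.0, 0.05)
    return sample

def generate(n, seed=0):
    """Run every sample through the object module and all variation modules."""
    rng = random.Random(seed)  # seeded for reproducible datasets
    pipeline = [add_background, add_acquisition_noise]
    samples = []
    for _ in range(n):
        s = make_plate(rng)
        for stage in pipeline:
            s = stage(s, rng)
        samples.append(s)
    return samples
```

Because the variation modules never inspect the object itself, retargeting the generator to a new plate type only requires replacing `make_plate`, mirroring the framework's stated design goal.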
Then we design two low-complexity convolutional neural networks for license
plate detection and character recognition. Both are designed for simultaneous
classification and localization by branching the networks into a classification and a
regression branch, and are trained end-to-end simultaneously over both branches, on
only our synthetic training samples.
To recognize real license plates, we design a pipeline for scale-invariant license
plate detection with a scale pyramid and a fully convolutional application of the
license plate detection network, in order to detect any number of license plates,
at any scale, in an image. Before character classification is applied, potential plate
regions are un-skewed based on the detected plate location in order to obtain as
clean a representation of the characters as possible. The character classification is
also performed with a fully convolutional sweep to find all characters
simultaneously.
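As a rough illustration of the scale-pyramid idea (our own sketch, not the thesis's implementation): the input is repeatedly downscaled, the detector sweeps each level, and any detection found at a level is mapped back to input-image coordinates by dividing by that level's scale factor. Nearest-neighbour resampling is used here only for brevity:

```python
import numpy as np

def image_pyramid(img, scale_step=0.75, min_size=64):
    """Return progressively downscaled copies of `img` until the
    shorter side would drop below `min_size` pixels."""
    levels = []
    h, w = img.shape[:2]
    s = 1.0
    while min(h * s, w * s) >= min_size:
        nh, nw = int(h * s), int(w * s)
        ys = (np.arange(nh) / s).astype(int)   # nearest-neighbour row indices
        xs = (np.arange(nw) / s).astype(int)   # nearest-neighbour column indices
        levels.append(img[ys][:, xs])
        s *= scale_step
    return levels

def to_original_coords(box, level, scale_step=0.75):
    """Map an (x, y, w, h) box found at pyramid `level` back to the
    coordinate frame of the full-resolution input image."""
    f = scale_step ** level
    return tuple(v / f for v in box)
```

A real pipeline would use anti-aliased resizing and run the fully convolutional detector over every level, but the coordinate bookkeeping is the same.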
Both the plate and the character stages apply a refinement classification in which
initial classifications are first centered and rescaled. We show that this simple yet
effective trick greatly improves the accuracy of our classifications, at only a small
increase in complexity. To our knowledge, this trick has not been exploited before.
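A toy sketch of the recentering step (our own illustration, not the thesis code): crop a fixed-size window around the initial detection, zero-padding at the image borders, so the refinement classifier always sees the object centred and at a fixed scale:

```python
import numpy as np

def recenter_crop(img, cx, cy, size):
    """Extract a size x size crop centred on (cx, cy) from a 2-D image,
    zero-padding wherever the window extends past the image borders."""
    half = size // 2
    out = np.zeros((size, size), dtype=img.dtype)
    y0, y1 = cy - half, cy - half + size   # window in image coordinates
    x0, x1 = cx - half, cx - half + size
    sy0, sx0 = max(y0, 0), max(x0, 0)      # clip the window to the image
    sy1 = min(y1, img.shape[0])
    sx1 = min(x1, img.shape[1])
    # copy the valid region into the matching position of the padded output
    out[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = img[sy0:sy1, sx0:sx1]
    return out
```

The refined crop is then passed through the classifier a second time; the gain comes from removing the translation and scale jitter of the initial detection.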
To show the effectiveness of our system, we first apply it to a dataset of photos
of Italian license plates to evaluate the different stages of our system and the
effect the classification thresholds have on accuracy. We also find robust training
parameters and thresholds that are reliable for classification without any need for
calibration on a validation set of real annotated samples (which may not always be
available), and achieve balanced precision and recall on the set of Italian license
plates, both in excess of 98%.
Finally, to show that our system generalizes to new plate types, we compare our
system to two reference systems on a dataset of Taiwanese license plates. For this, we
only modify the first module of the synthetic plate generation algorithm to produce
Taiwanese license plates and adjust parameters regarding plate dimensions; then we
train our networks and apply the classification pipeline, using the robust parameters,
on the Taiwanese reference dataset. We achieve state-of-the-art performance on plate
detection (99.86% precision and 99.1% recall), single character detection (99.6%),
and full license plate reading (98.7%).
Artificial Intelligence Algorithms for Eye Banking
Eye banking plays a critical role in modern medicine by providing cornea tissues for transplantation to restore vision for millions of people worldwide. The evaluation of the corneal endothelium is done by measuring the corneal endothelial cell density (ECD). Unfortunately, the current system to measure ECD is manual, time-consuming, and error-prone. Furthermore, the impact of social behaviors and biological conditions on the corneal endothelium and corneal transplant success is largely unexplored. To overcome these challenges, this dissertation aims to develop tools for corneal endothelial image and data analysis that enhance the efficiency and quality of cornea transplants.
In the first study, an image processing algorithm is developed to analyze corneal endothelial images captured by a Konan CellChek specular microscope. The algorithm successfully identifies the region of interest, filters the image, and employs stochastic watershed segmentation to determine cell boundaries and evaluate endothelial cell density (ECD). The proposed algorithm achieves a high correlation with manual counts (R2 = 0.98) and has an average analysis time of 2.5 seconds.
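For intuition, once the watershed segmentation has delineated cell boundaries, ECD reduces to a cell count per calibrated area. A minimal sketch (the function name and calibration value are illustrative; the stochastic watershed itself is not reproduced here):

```python
import numpy as np

def endothelial_cell_density(labels, um_per_px):
    """Estimate ECD (cells per mm^2) from a labelled segmentation mask.

    labels    : 2-D integer array, 0 = background, 1..N = one region per cell
    um_per_px : microscope calibration, micrometres per pixel
    """
    has_background = bool((labels == 0).any())
    n_cells = len(np.unique(labels)) - (1 if has_background else 0)
    # total imaged area in mm^2 (1 mm = 1000 um)
    area_mm2 = labels.size * (um_per_px / 1000.0) ** 2
    return n_cells / area_mm2
```

In practice the density would be computed over the analyzed region of interest rather than the whole frame, but the count-over-calibrated-area principle is the same.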
In the second study, a deep learning-based cell segmentation algorithm called Mobile-CellNet is proposed to estimate ECD. This technique addresses the limitations of classical algorithms and creates a more robust and highly efficient algorithm. The approach achieves a mean absolute error of 4.06% for ECD on the test set, similar to U-Net but with significantly fewer floating-point operations and parameters.
The third study explores the correlation between alcohol abuse and corneal endothelial morphology in a donor pool of 5,624 individuals. Multivariable regression analysis shows that alcohol abuse is associated with a reduction in endothelial cell density, an increase in the coefficient of variation, and a decrease in percent hexagonality.
These studies highlight the potential of big data and artificial intelligence algorithms in accurately and efficiently analyzing corneal images and donor medical data to improve the efficiency of eye banking and patient outcomes. By automating the analysis of corneal images and exploring the impact of social behaviors and biological conditions on corneal endothelial morphology, we can enhance the quality and availability of cornea transplants and ultimately improve the lives of millions of people worldwide.