
    Optic nerve head segmentation

    Reliable and efficient optic disk localization and segmentation are important tasks in automated retinal screening. General-purpose edge detection algorithms often fail to segment the optic disk due to fuzzy boundaries, inconsistent image contrast, or missing edge features. This paper presents an algorithm for the localization and segmentation of the optic nerve head boundary in low-resolution images (about 20 μm/pixel). Optic disk localization is achieved using specialized template matching, and segmentation by a deformable contour model. The latter uses a global elliptical model and a local deformable model with variable, edge-strength-dependent stiffness. The algorithm is evaluated against a randomly selected database of 100 images from a diabetic screening programme. Ten images were classified as unusable; the others were of variable quality. The localization algorithm succeeded on all but one usable image; the contour estimation algorithm was qualitatively assessed by an ophthalmologist as having Excellent-Fair performance in 83% of cases, and performs well even on blurred images.
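    The localization step lends itself to a compact illustration. Below is a minimal sketch of optic disc localization by template matching, assuming hypothetical file names and a pre-built disc template; the paper's specialized template and the subsequent deformable-contour segmentation are not reproduced here.

```python
# Minimal sketch of optic disc localization via template matching.
# File names and the disc template are hypothetical; the paper uses a
# specialized template and follows up with deformable-contour segmentation.
import cv2

fundus = cv2.imread("fundus.png", cv2.IMREAD_GRAYSCALE)            # retinal image
template = cv2.imread("disc_template.png", cv2.IMREAD_GRAYSCALE)   # bright-disc patch

# Normalized cross-correlation is robust to global contrast changes.
response = cv2.matchTemplate(fundus, template, cv2.TM_CCOEFF_NORMED)
_, score, _, top_left = cv2.minMaxLoc(response)

h, w = template.shape
center = (top_left[0] + w // 2, top_left[1] + h // 2)
print(f"optic disc candidate at {center}, match score {score:.2f}")
```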

    Detection of Dental Apical Lesions Using CNNs on Periapical Radiograph

    Apical lesions, a general term for chronic infectious diseases of the tooth root, are very common dental diseases in modern life and are caused by various factors. The prevailing endodontic treatment relies on X-ray photographs taken from patients, in which the lesion area is marked manually, a time-consuming process. Additionally, in some images the significant details may not be recognizable due to different shooting angles or doses. To make the diagnosis process shorter and more efficient, repetitive tasks should be performed automatically, allowing dentists to focus on the technical and medical aspects of care, such as treatment, tooth cleaning, or communication with patients. To realize automatic diagnosis, this article proposes and establishes a lesion-area analysis model based on convolutional neural networks (CNNs). To establish a standardized database for clinical application, Institutional Review Board (IRB) approval (application number 202002030B0) was obtained, and the database was built by dentists who provided practical clinical data. In this study, the image data are first preprocessed with a Gaussian high-pass filter. Then, iterative thresholding is applied to slice each X-ray image into several individual tooth sample images. The collection of individual tooth images that comprises the image database is used as input to a CNN transfer-learning model for training. Seventy percent (70%) of the image database is used for training and validating the model, while the remaining 30% is used for testing and estimating its accuracy. The practical diagnosis accuracy of the proposed CNN model is 92.5%. The proposed model successfully facilitates the automatic diagnosis of apical lesions.
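    The described preprocessing (Gaussian high-pass filtering followed by iterative thresholding) can be sketched as follows. The radiograph path, kernel size, and convergence tolerance are illustrative assumptions, not values from the paper.

```python
# Sketch of the described preprocessing: Gaussian high-pass filtering,
# then iterative (isodata-style) thresholding. The file path, kernel
# size, and tolerance are illustrative assumptions.
import cv2
import numpy as np

xray = cv2.imread("periapical.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Gaussian high-pass: subtract a heavily blurred copy to keep fine detail.
low_pass = cv2.GaussianBlur(xray, (31, 31), 0)
high_pass = cv2.normalize(xray - low_pass, None, 0, 255, cv2.NORM_MINMAX)

# Iterative thresholding: move t to the midpoint of the two class means
# until it stabilizes.
t = float(high_pass.mean())
while True:
    fg, bg = high_pass[high_pass > t], high_pass[high_pass <= t]
    t_new = 0.5 * (fg.mean() + bg.mean())
    if abs(t_new - t) < 0.5:
        break
    t = t_new

mask = (high_pass > t).astype(np.uint8) * 255   # candidate tooth regions
```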

    Semantic sentence similarity: size does not always matter

    This study addresses the question of whether visually grounded speech recognition (VGS) models learn to capture sentence semantics without access to any prior linguistic knowledge. We produce synthetic and natural spoken versions of a well-known semantic textual similarity database and show that our VGS model produces embeddings that correlate well with human semantic similarity judgements. Our results show that a model trained on a small image-caption database outperforms two models trained on much larger databases, indicating that database size is not all that matters. We also investigate the importance of having multiple captions per image and find that this is indeed helpful even if the total number of images is lower, suggesting that paraphrasing is a valuable learning signal. While the general trend in the field is to create ever larger datasets to train models on, our findings indicate that other characteristics of the database can be just as important.

    Comment: This paper has been accepted at Interspeech 2021, where it will be presented and appear in the conference proceedings in September 2021.
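    The evaluation implied here, rank-correlating the model's similarity scores with human ratings, can be sketched in a few lines. The embeddings and ratings below are random stand-ins; in the real experiment they come from the trained VGS model and the similarity database.

```python
# Sketch of the evaluation: rank-correlate model similarity scores with
# human ratings. Random stand-ins replace the real VGS embeddings and
# STS gold scores.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
emb_a = rng.normal(size=(100, 512))    # embeddings of first sentences
emb_b = rng.normal(size=(100, 512))    # embeddings of second sentences
human = rng.uniform(0, 5, size=100)    # human similarity judgements

def cosine_rows(a, b):
    return (a * b).sum(axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

rho, _ = spearmanr(cosine_rows(emb_a, emb_b), human)
print(f"Spearman rho vs. human judgements: {rho:.3f}")
```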

    Non-parametric Ensemble Kalman methods for the inpainting of noisy dynamic textures

    In this work, we propose a novel non-parametric method for the temporally consistent inpainting of dynamic texture sequences. The inpainting of texture image sequences is stated as a stochastic assimilation issue, for which a novel model-free, data-driven Ensemble Kalman method is introduced. Our model is inspired by the Analog Ensemble Kalman Filter (AnEnKF) recently proposed for the assimilation of geophysical space-time dynamics, where the physical model is replaced by the use of statistical analogs or nearest neighbours. Such a non-parametric framework is of key interest for image processing applications, as prior models are seldom available in general. We present experimental evidence on real dynamic textures that, using only a catalogue of historical data and without any assumption on the model, the proposed method provides relevant, dynamically consistent interpolation and outperforms the classical parametric (autoregressive) dynamical prior.
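    A toy sketch of the analog forecasting step at the heart of this approach is given below: the physical model is replaced by nearest-neighbour lookups in a catalogue of (state, successor) pairs. The catalogue here is synthetic, and the ensemble Kalman update itself is omitted.

```python
# Toy sketch of analog forecasting: the dynamical model is replaced by
# nearest-neighbour lookups in a catalogue of (state, successor) pairs.
# The catalogue is synthetic; the Kalman update step is omitted.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
catalog_states = rng.normal(size=(1000, 16))       # past states x_t
catalog_successors = rng.normal(size=(1000, 16))   # their successors x_{t+1}

nn = NearestNeighbors(n_neighbors=5).fit(catalog_states)

def analog_forecast(ensemble):
    """Propagate each ensemble member by averaging its analogs' successors."""
    _, idx = nn.kneighbors(ensemble)
    return catalog_successors[idx].mean(axis=1)

ensemble = rng.normal(size=(20, 16))               # 20 ensemble members
forecast = analog_forecast(ensemble)               # model-free prediction
```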

    Blind Quality Assessment for in-the-Wild Images via Hierarchical Feature Fusion and Iterative Mixed Database Training

    Image quality assessment (IQA) is very important for both end users and service providers, since a high-quality image can significantly improve the user's quality of experience (QoE) and also benefit many computer vision algorithms. Most existing blind image quality assessment (BIQA) models were developed for synthetically distorted images; however, they perform poorly on in-the-wild images, which are widespread in practical applications. In this paper, we propose a novel BIQA model for in-the-wild images by addressing two critical problems in this field: how to learn a better quality-aware feature representation, and how to overcome the lack of training samples in terms of content and distortion diversity. Considering that perceptual visual quality is affected by both low-level visual features (e.g. distortions) and high-level semantic information (e.g. content), we first propose a staircase structure that hierarchically integrates the features from intermediate layers into the final feature representation, which enables the model to make full use of visual information from low level to high level. Then, an iterative mixed database training (IMDT) strategy is proposed to train the BIQA model on multiple databases simultaneously, so the model benefits from the increase in both training samples and in image content and distortion diversity, and learns a more general feature representation. Experimental results show that the proposed model outperforms other state-of-the-art BIQA models on six in-the-wild IQA databases by a large margin. Moreover, the proposed model shows excellent performance in cross-database evaluation experiments, which further demonstrates that the learned feature representation is robust to images with diverse distortions and content. The code will be released publicly for reproducible research.
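    A minimal sketch of the multi-level pooling behind such hierarchical fusion is shown below, assuming a ResNet-18 backbone; the paper's staircase integration is more elaborate, so this only illustrates how intermediate-stage features can be pooled and concatenated into one quality-aware representation.

```python
# Sketch of multi-level feature pooling with a ResNet-18 backbone:
# pool each stage's output and concatenate, so low-level distortion
# cues and high-level content cues both reach the quality head.
import torch
import torch.nn.functional as F
import torchvision.models as models

backbone = models.resnet18(weights=None)
stages = [backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4]

feats = []
for s in stages:
    s.register_forward_hook(
        lambda m, inp, out: feats.append(F.adaptive_avg_pool2d(out, 1).flatten(1)))

x = torch.randn(2, 3, 224, 224)        # a batch of two images
with torch.no_grad():
    backbone(x)

fused = torch.cat(feats, dim=1)        # (2, 64 + 128 + 256 + 512)
quality = torch.nn.Linear(fused.shape[1], 1)(fused)   # scalar quality score
```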

    Object Discovery From a Single Unlabeled Image by Mining Frequent Itemset With Multi-scale Features

    The goal of our work is to discover dominant objects in a very general setting where only a single unlabeled image is given. This is far more challenging than typical co-localization or weakly supervised localization tasks. To tackle this problem, we propose a simple but effective pattern-mining-based method, called Object Location Mining (OLM), which exploits the advantages of data mining and the feature representations of pre-trained convolutional neural networks (CNNs). Specifically, we first convert the feature maps from a pre-trained CNN model into a set of transactions, and then discover frequent patterns from the transaction database through pattern mining techniques. We observe that those discovered patterns, i.e., co-occurrence highlighted regions, typically exhibit appearance and spatial consistency. Motivated by this observation, we can easily discover and localize possible objects by merging relevant meaningful patterns. Extensive experiments on a variety of benchmarks demonstrate that OLM achieves competitive localization performance compared with state-of-the-art methods. We also evaluate our approach against unsupervised saliency detection methods and achieve competitive results on seven benchmark datasets. Moreover, we conduct experiments on fine-grained classification to show that our proposed method can accurately locate the entire object and its parts, which can significantly improve classification results.
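    The transaction construction can be illustrated with a toy example: each spatial position of a feature map becomes a transaction whose items are the channels that fire strongly there. The random "feature map", activation threshold, and support threshold below are assumptions, and only 1-itemset support is counted, whereas OLM mines larger co-occurrence patterns.

```python
# Toy transaction construction: every spatial position is a transaction;
# its items are the channels firing strongly there. Thresholds and the
# random "feature map" are assumptions; only 1-itemset support is shown.
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)
fmap = rng.random((256, 14, 14))       # C x H x W activations

c = fmap.shape[0]
flat = fmap.reshape(c, -1).T           # one row per spatial position
transactions = [set(np.flatnonzero(row > 0.9)) for row in flat]

support = Counter(item for t in transactions for item in t)
frequent = [ch for ch, n in support.items() if n / len(transactions) >= 0.05]
print(f"{len(frequent)} channels exceed the support threshold")
```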

    Image Retrieval Using Auto Encoding Features In Deep Learning

    The latest technologies and the growing availability of image storage in daily life have produced vast repositories of images, with the many devices that capture images contributing to them every day. Given this daily inflow, one must consider how to retrieve images according to specified criteria. Several techniques, such as object shape, the Discrete Wavelet Transform (DWT), and texture features, have been used to characterize and classify images. Segmentation also plays a vital role in image retrieval, but it lacks robustness in most cases. The retrieval process depends mainly on the distinctive characteristics of an image rather than on the whole image. Two types of image retrieval can be distinguished: one targeting general objects, and the other specific to a particular application. Modern deep neural networks for unsupervised feature learning, such as the Deep Autoencoder (AE), learn embedded representations by stacking layers on top of each other. These learnt embedded representations, however, may degrade as the AE network deepens due to vanishing gradients, resulting in decreased performance. We introduce here the ResNet Autoencoder (RAE) and its convolutional version (C-RAE) for unsupervised feature-based learning. The proposed model is tested on three distinct databases, Corel1K, Cifar-10, and Cifar-100, which differ in size. The presented algorithms significantly reduce computation time and achieve very high image retrieval accuracy.
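    A compact sketch of the idea behind a convolutional residual autoencoder follows: identity shortcuts inside the encoder keep gradients flowing as the network deepens. The layer sizes are illustrative, not the paper's C-RAE architecture.

```python
# Compact convolutional residual autoencoder: identity shortcuts keep
# gradients flowing as the encoder deepens. Layer sizes are illustrative,
# not the paper's C-RAE architecture.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return torch.relu(x + self.body(x))   # identity shortcut

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(), ResBlock(32),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(), ResBlock(64))
decoder = nn.Sequential(
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid())

x = torch.rand(1, 3, 32, 32)           # a CIFAR-sized image
recon = decoder(encoder(x))            # encoder output doubles as a retrieval embedding
```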

    Using the Natural Scenes’ Edges for Assessing Image Quality Blindly and Efficiently

    Two real blind/no-reference (NR) image quality assessment (IQA) algorithms in the spatial domain are developed. To measure image quality, the introduced approach uses a novel set of features based on the edges of natural scenes. The enhanced sensitivity of the human eye to the information carried by the edges and contours of an image supports this choice. The effectiveness of the proposed technique in quantifying image quality has been studied. The gathered features are formed using both Weibull distribution statistics and two sharpness functions, yielding two separate NR IQA algorithms. The presented algorithms need no training on databases of human judgments, nor prior knowledge about expected distortions, so they are true NR IQA algorithms. In contrast to most general no-reference IQA models, the model used in this study is generic and was designed so that it is not tied to any particular distortion type. When the proposed algorithms are tested on the LIVE database, experiments show that they correlate well with subjective opinion scores. They also show that the introduced methods significantly outperform the popular full-reference peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) methods, and outperform the recently developed NR natural image quality evaluator (NIQE) model.
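    The core feature-extraction idea can be sketched briefly: gradient magnitudes of natural images tend to follow a Weibull distribution, so the fitted shape and scale parameters can serve as quality-aware features. The file name and subsampling below are assumptions, and the paper's full feature set and sharpness functions differ.

```python
# Sketch: fit a Weibull distribution to gradient magnitudes and use the
# shape/scale parameters as quality-aware features. File name and
# subsampling are assumptions; the paper's feature set is richer.
import cv2
import numpy as np
from scipy.stats import weibull_min

img = cv2.imread("test.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
magnitude = np.sqrt(gx**2 + gy**2).ravel()

sample = magnitude[magnitude > 0][::100]           # subsample for speed
shape, _, scale = weibull_min.fit(sample, floc=0)
print(f"Weibull features: shape={shape:.3f}, scale={scale:.1f}")
```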