CG-DIQA: No-reference Document Image Quality Assessment Based on Character Gradient
Document image quality assessment (DIQA) is an important and challenging
problem in real applications. In order to predict the quality scores of
document images, this paper proposes a novel no-reference DIQA method based on
character gradient, where the OCR accuracy is used as a ground-truth quality
metric. Character gradient is computed on character patches detected with the
maximally stable extremal regions (MSER) based method. Character patches are
essential to character recognition and therefore well suited for use
in estimating document image quality. Experiments on a benchmark dataset show
that the proposed method outperforms the state-of-the-art methods in estimating
the quality score of document images.
Comment: To be published in Proc. of ICPR 201
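The character-gradient idea can be sketched as follows. This is an illustrative stand-in, not the authors' code: it scores already-extracted patches with a Sobel gradient, whereas the paper would first detect the patches with an MSER detector.

```python
# Sketch of the character-gradient quality feature (illustrative, not the
# paper's exact pipeline). Character patches would normally come from an
# MSER detector (e.g. OpenCV's MSER); here we only score given patches.
import math

def sobel_magnitude(patch):
    """Mean Sobel gradient magnitude of a 2D grayscale patch (list of lists)."""
    h, w = len(patch), len(patch[0])
    total, count = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (patch[y-1][x+1] + 2*patch[y][x+1] + patch[y+1][x+1]
                  - patch[y-1][x-1] - 2*patch[y][x-1] - patch[y+1][x-1])
            gy = (patch[y+1][x-1] + 2*patch[y+1][x] + patch[y+1][x+1]
                  - patch[y-1][x-1] - 2*patch[y-1][x] - patch[y-1][x+1])
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0

def quality_score(patches):
    """Average character gradient over all detected patches: sharper
    character edges give a higher score, which correlates with OCR accuracy."""
    return sum(sobel_magnitude(p) for p in patches) / len(patches)

# A sharp synthetic edge patch scores higher than a blurred one.
sharp = [[0]*3 + [255]*3 for _ in range(6)]
blurry = [[0, 64, 128, 192, 224, 255] for _ in range(6)]
print(quality_score([sharp]) > quality_score([blurry]))  # True
```

In the paper the resulting gradient statistics are regressed against OCR accuracy; the regression step is omitted here.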
TrueImage: A Machine Learning Algorithm to Improve the Quality of Telehealth Photos
Telehealth is an increasingly critical component of the health care
ecosystem, especially due to the COVID-19 pandemic. Rapid adoption of
telehealth has exposed limitations in the existing infrastructure. In this
paper, we study and highlight photo quality as a major challenge in the
telehealth workflow. We focus on teledermatology, where photo quality is
particularly important; the framework proposed here can be generalized to other
health domains. For telemedicine, dermatologists request that patients submit
images of their lesions for assessment. However, these images are often of
insufficient quality to make a clinical diagnosis since patients do not have
experience taking clinical photos. A clinician has to manually triage poor
quality images and request new images to be submitted, leading to wasted time
for both the clinician and the patient. We propose an automated image
assessment machine learning pipeline, TrueImage, to detect poor quality
dermatology photos and to guide patients in taking better photos. Our
experiments indicate that TrueImage rejects 50% of sub-par quality images
while retaining 80% of the good-quality images patients submit, despite
heterogeneity and limitations in the training data. These promising results
suggest that our solution is feasible and can improve the quality of
teledermatology care.
Comment: 12 pages, 5 figures. Preprint of an article published in Pacific
Symposium on Biocomputing © 2020 World Scientific Publishing Co.,
Singapore, http://psb.stanford.edu
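As a rough illustration of automated photo-quality triage (not TrueImage's actual learned pipeline), a variance-of-Laplacian sharpness check captures the spirit of rejecting low-detail submissions; the threshold here is arbitrary and would be tuned on labeled examples:

```python
# Illustrative proxy for photo-quality triage (not TrueImage's pipeline):
# flag a photo as too blurry when the variance of a discrete Laplacian
# falls below a threshold tuned on good/bad example photos.
def laplacian_variance(img):
    """Variance of the 4-neighbor Laplacian over a 2D grayscale image."""
    h, w = len(img), len(img[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y-1][x] + img[y+1][x] + img[y][x-1] + img[y][x+1]
                   - 4 * img[y][x])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def triage(img, threshold=50.0):
    """Return 'resubmit' for low-detail (likely blurry) photos."""
    return "ok" if laplacian_variance(img) >= threshold else "resubmit"

# High-frequency detail passes; a featureless image is sent back.
checkerboard = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
flat = [[128] * 8 for _ in range(8)]
print(triage(checkerboard), triage(flat))  # ok resubmit
```

A real triage model would also check lighting, framing, and whether a lesion is visible, which is where the learned pipeline goes beyond this sketch.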
Uncovering local aggregated air quality index with smartphone captured images leveraging efficient deep convolutional neural network
The prevalence and mobility of smartphones make them a widely used tool for
environmental health research. However, their potential for determining
aggregated air quality index (AQI) based on PM2.5 concentration in specific
locations remains largely unexplored in the existing literature. In this paper,
we thoroughly examine the challenges associated with predicting
location-specific PM2.5 concentration using images taken with smartphone
cameras. The focus of our study is on Dhaka, the capital of Bangladesh, due to
its significant air pollution levels and the large population exposed to it.
Our research involves the development of a Deep Convolutional Neural Network
(DCNN), which we train on over a thousand outdoor images that we captured and
annotated. These photos were taken at various locations in Dhaka, and their
labels are based on PM2.5 concentration data obtained from the local US
consulate, calculated using the NowCast algorithm. Through supervised learning,
our model establishes a correlation index during training, enhancing its
ability to function as a Picture-based Predictor of PM2.5 Concentration (PPPC).
This enables the algorithm to calculate an equivalent daily averaged AQI index
from a smartphone image. Unlike popular, overly parameterized models, our
model is resource-efficient, using fewer parameters. Furthermore, test
results indicate that our model outperforms popular models like ViT and INN, as
well as popular CNN-based models such as VGG19, ResNet50, and MobileNetV2, in
predicting location-specific PM2.5 concentration. Our dataset is the first
publicly available collection that includes atmospheric images and
corresponding PM2.5 measurements from Dhaka. Our code and dataset will be made
public when the paper is published.
Comment: 18 pages, 7 figures, submitted to Nature Scientific Report
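The NowCast weighting and the standard EPA piecewise-linear AQI conversion used to label the images can be sketched as follows (simplified: pre-2024 PM2.5 breakpoints, 12 valid hourly readings assumed, newest first):

```python
# Sketch of the NowCast weighted average plus the EPA piecewise-linear
# AQI conversion (pre-2024 PM2.5 breakpoints; assumes 12 valid hourly
# readings ordered newest first, omitting EPA's missing-data rules).
def nowcast(hourly_pm25):
    w = min(hourly_pm25) / max(hourly_pm25)  # rate-of-change weight factor
    w = max(w, 0.5)                          # EPA floors the weight at 0.5
    num = sum(c * w**i for i, c in enumerate(hourly_pm25))
    den = sum(w**i for i in range(len(hourly_pm25)))
    return num / den

# (C_lo, C_hi, I_lo, I_hi) breakpoints for 24-h PM2.5 in µg/m³.
BREAKPOINTS = [(0.0, 12.0, 0, 50), (12.1, 35.4, 51, 100),
               (35.5, 55.4, 101, 150), (55.5, 150.4, 151, 200),
               (150.5, 250.4, 201, 300), (250.5, 500.4, 301, 500)]

def pm25_to_aqi(c):
    """Linear interpolation within the matching breakpoint band."""
    for c_lo, c_hi, i_lo, i_hi in BREAKPOINTS:
        if c <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    return 500  # beyond the scale

readings = [35.9, 34.2, 33.0, 38.1, 40.0, 36.5,
            30.2, 28.0, 25.5, 24.0, 22.1, 20.0]  # µg/m³, newest first
print(pm25_to_aqi(nowcast(readings)))
```

The model in the paper then learns to predict this label directly from a smartphone image, bypassing the sensor reading entirely.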
Subjective and Objective Quality Assessment for in-the-Wild Computer Graphics Images
Computer graphics images (CGIs) are artificially generated by means of
computer programs and are widely perceived under various scenarios, such as
games, streaming media, etc. In practice, the quality of CGIs consistently
suffers from poor rendering during the production and inevitable compression
artifacts during the transmission of multimedia applications. However, few
works have been dedicated to the challenge of computer graphics image quality
assessment (CGIQA). Most image quality assessment (IQA) metrics
are developed for natural scene images (NSIs) and validated on the databases
consisting of NSIs with synthetic distortions, which are not suitable for
in-the-wild CGIs. To bridge the gap between evaluating the quality of NSIs and
CGIs, we construct a large-scale in-the-wild CGIQA database consisting of 6,000
CGIs (CGIQA-6k) and carry out the subjective experiment in a well-controlled
laboratory environment to obtain the accurate perceptual ratings of the CGIs.
Then, we propose an effective deep learning-based no-reference (NR) IQA model
by utilizing multi-stage feature fusion strategy and multi-stage channel
attention mechanism. The major motivation of the proposed model is to make full
use of inter-channel information from low-level to high-level since CGIs have
apparent patterns as well as rich interactive semantic content. Experimental
results show that the proposed method outperforms all other state-of-the-art NR
IQA methods on the constructed CGIQA-6k database and other CGIQA-related
databases. The database, along with the code, will be released to facilitate
further research.
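A minimal sketch of squeeze-and-excitation style channel attention, the kind of inter-channel reweighting the proposed model relies on. This is illustrative only: it omits the learned bottleneck layers of a real implementation and is not the paper's architecture.

```python
# Minimal channel-attention sketch: squeeze each channel to a global
# average, gate it with a sigmoid, and rescale the channel. A real
# squeeze-and-excitation block adds two learned fully connected layers.
import math

def channel_attention(feature_maps):
    """feature_maps: list of 2D channels (lists of lists of floats)."""
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    gates = [1.0 / (1.0 + math.exp(-s)) for s in squeezed]  # sigmoid gate
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, gates)]

# Two 2x2 channels: the strongly activated channel keeps more of its signal.
feats = [[[0.1, 0.1], [0.1, 0.1]], [[3.0, 3.0], [3.0, 3.0]]]
out = channel_attention(feats)
```

Fusing such reweighted features across several network stages is what lets low-level pattern cues and high-level semantic cues both contribute to the final quality score.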
Stereoscopic video quality assessment based on 3D convolutional neural networks
Research on stereoscopic video quality assessment (SVQA) plays an important role in promoting the development of stereoscopic video systems. Existing SVQA metrics rely on hand-crafted features, which are inaccurate and time-consuming to design given the diversity and complexity of stereoscopic video distortions. This paper introduces a 3D convolutional neural network (CNN) based SVQA framework that can model not only local spatio-temporal information but also global temporal information, with cubic difference video patches as input. First, instead of using hand-crafted features, we design a 3D CNN architecture to automatically and effectively capture local spatio-temporal features. Then we employ a quality-score fusion strategy that considers global temporal clues to obtain the final video-level predicted score. Extensive experiments conducted on two public stereoscopic video quality datasets show that the proposed method correlates highly with human perception and outperforms state-of-the-art methods by a large margin. We also show that our 3D CNN features have more desirable properties for SVQA than the hand-crafted features used in previous methods, and that our 3D CNN features combined with support vector regression (SVR) can further boost performance. In addition, even without complex preprocessing or GPU acceleration, the proposed method is demonstrated to be computationally efficient and easy to use.
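As a toy illustration of the inputs involved (assuming nothing about the paper's actual architecture), the sketch below builds a cubic difference patch from consecutive frames and applies one 3D convolution over it; a static clip yields a zero response while motion does not:

```python
# Sketch of a "cubic difference patch": stack frame-to-frame differences
# into a small T x H x W cube, the local spatio-temporal unit a 3D CNN
# convolves over (illustrative only; not the paper's network).
def difference_cube(frames):
    """frames: list of 2D grayscale frames -> list of difference frames."""
    return [[[frames[t+1][y][x] - frames[t][y][x]
              for x in range(len(frames[0][0]))]
             for y in range(len(frames[0]))]
            for t in range(len(frames) - 1)]

def conv3d_valid(cube, kernel):
    """Single-channel valid 3D convolution (unflipped, i.e. correlation)."""
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    T, H, W = len(cube), len(cube[0]), len(cube[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                s = sum(cube[t+dt][y+dy][x+dx] * kernel[dt][dy][dx]
                        for dt in range(kt) for dy in range(kh)
                        for dx in range(kw))
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out

kernel = [[[1]*2 for _ in range(2)] for _ in range(2)]  # 2x2x2 of ones
static = [[[5, 5], [5, 5]] for _ in range(3)]           # no motion
moving = [[[0, 0], [0, 0]], [[1, 1], [1, 1]], [[3, 3], [3, 3]]]
print(conv3d_valid(difference_cube(static), kernel),
      conv3d_valid(difference_cube(moving), kernel))  # [[[0]]] [[[12]]]
```

In the actual framework the kernels are learned and stacked into a deep network, and patch-level outputs are fused into a video-level score.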
A blind stereoscopic image quality evaluator with segmented stacked autoencoders considering the whole visual perception route
Most current blind stereoscopic image quality assessment (SIQA) algorithms cannot achieve reliable accuracy. One reason is that they lack deep architectures; the other is that they are designed on a relatively weak biological basis compared with findings on the human visual system (HVS). In this paper, we propose a Deep Edge and COlor Signal INtegrity Evaluator (DECOSINE) based on the whole visual perception route from the eyes to the frontal lobe, with a particular focus on edge and color signal processing in retinal ganglion cells (RGC) and the lateral geniculate nucleus (LGN). Furthermore, to model the complex and deep structure of the visual cortex, a Segmented Stacked Auto-encoder (S-SAE) is used, which has not been utilized for SIQA before. The S-SAE compensates for the weakness of deep learning-based SIQA metrics that require very long training times. Experiments are conducted on popular SIQA databases, and the superiority of DECOSINE in terms of prediction accuracy and monotonicity is demonstrated. The experimental results show that our modeling of the whole visual perception route and our utilization of the S-SAE are effective for SIQA.
Blind Omnidirectional Image Quality Assessment with Viewport Oriented Graph Convolutional Networks
Quality assessment of omnidirectional images has become increasingly urgent
due to the rapid growth of virtual reality applications. Different from
traditional 2D images and videos, omnidirectional contents can provide
consumers with freely changeable viewports and a larger field of view covering
the spherical surface, which makes the objective
quality assessment of omnidirectional images more challenging. In this paper,
motivated by the characteristics of the human vision system (HVS) and the
viewing process of omnidirectional contents, we propose a novel Viewport
oriented Graph Convolution Network (VGCN) for blind omnidirectional image
quality assessment (IQA). Generally, observers tend to give the subjective
rating of a 360-degree image after traversing and aggregating information
from different viewports when browsing the spherical scenery. Therefore, to model
the mutual dependency of viewports in the omnidirectional image, we build a
spatial viewport graph. Specifically, the graph nodes are first defined as
selected viewports with higher probabilities of being seen, inspired by the
finding that the HVS is more sensitive to structural information. Then,
these nodes are connected by spatial relations to capture interactions among
them. Finally, reasoning on the proposed graph is performed via graph
convolutional networks. Moreover, we simultaneously estimate global quality
from the entire omnidirectional image without viewport sampling to further
boost performance, in line with the overall viewing experience. Experimental results
demonstrate that our proposed model outperforms state-of-the-art full-reference
and no-reference IQA metrics on two public omnidirectional IQA databases.
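The reasoning step on the viewport graph can be sketched as a single graph-convolution propagation, H' = ReLU(D⁻¹(A + I)HW), where nodes are viewports and features are quality cues. This is a generic GCN layer under those assumptions, not the paper's implementation:

```python
# One graph-convolution step over a toy "viewport graph" (illustrative
# of the VGCN idea): add self-loops, row-normalize the adjacency matrix,
# aggregate neighbor features, apply a linear map and ReLU.
def gcn_layer(adj, feats, weight):
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]                       # A + I
    norm = [[a_hat[i][j] / sum(a_hat[i]) for j in range(n)]
            for i in range(n)]                        # D^-1 (A + I)
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n))
            for f in range(len(feats[0]))] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o]
                          for f in range(len(weight))))
             for o in range(len(weight[0]))] for i in range(n)]

# Three viewports in a chain (0-1 and 1-2 connected), one feature each:
# the middle node blends quality cues from both neighbors.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0], [0.0], [1.0]]
weight = [[1.0]]
res = gcn_layer(adj, feats, weight)
```

Stacking such layers lets each viewport's quality estimate depend on the viewports a viewer would see around it, which is the mutual dependency the paper sets out to model.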
Recent Advances in Forest Observation with Visual Interpretation of Very High-Resolution Imagery
The land area covered by freely available very high-resolution (VHR) imagery has grown dramatically over recent years, which has considerable relevance for forest observation and monitoring. For example, it is possible to recognize and extract a number of features related to forest type, forest management, degradation and disturbance using VHR imagery. Moreover, time series of medium-to-high-resolution imagery such as MODIS, Landsat or Sentinel have allowed for monitoring of parameters related to forest cover change. Although automatic classification is used regularly to monitor forests using medium-resolution imagery, VHR imagery and changes in web-based technology have opened up new possibilities for the role of visual interpretation in forest observation. Visual interpretation of VHR imagery is typically employed to provide training and/or validation data for other remote sensing-based techniques or to derive statistics directly on forest cover and forest cover change over large regions. Hence, this paper reviews the state of the art in tools designed for visual interpretation of VHR imagery, including Geo-Wiki, LACO-Wiki and Collect Earth, as well as issues related to interpretation of VHR imagery and approaches to quality assurance. We also list a number of success stories where visual interpretation plays a crucial role, including a global forest mask harmonized with FAO FRA country statistics; estimation of dryland forest area; quantification of deforestation; national reporting to the UNFCCC; and drivers of forest change.