25,830 research outputs found
A deep learning framework for quality assessment and restoration in video endoscopy
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. Artifacts such as motion blur, bubbles,
specular reflections, floating objects and pixel saturation impede the visual
interpretation and the automated analysis of endoscopy videos. Given the
widespread use of endoscopy in different clinical applications, we contend that
the robust and reliable identification of such artifacts and the automated
restoration of corrupted video frames is a fundamental medical imaging problem.
Existing state-of-the-art methods only deal with the detection and restoration
of selected artifacts. However, typically endoscopy videos contain numerous
artifacts which motivates to establish a comprehensive solution.
We propose a fully automatic framework that can: 1) detect and classify six
different primary artifacts, 2) provide a quality score for each frame and 3)
restore mildly corrupted frames. To detect different artifacts our framework
exploits fast multi-scale, single stage convolutional neural network detector.
We introduce a quality metric to assess frame quality and predict image
restoration success. Generative adversarial networks with carefully chosen
regularization are finally used to restore corrupted frames.
Our detector yields the highest mean average precision (mAP at 5% threshold)
of 49.0 and the lowest computational time of 88 ms allowing for accurate
real-time processing. Our restoration models for blind deblurring, saturation
correction and inpainting demonstrate significant improvements over previous
methods. On a set of 10 test videos we show that our approach preserves an
average of 68.7% which is 25% more frames than that retained from the raw
videos.Comment: 14 page
Priming Neural Networks
Visual priming is known to affect the human visual system to allow detection
of scene elements, even those that may have been near unnoticeable before, such
as the presence of camouflaged animals. This process has been shown to be an
effect of top-down signaling in the visual system triggered by the said cue. In
this paper, we propose a mechanism to mimic the process of priming in the
context of object detection and segmentation. We view priming as having a
modulatory, cue dependent effect on layers of features within a network. Our
results show how such a process can be complementary to, and at times more
effective than simple post-processing applied to the output of the network,
notably so in cases where the object is hard to detect such as in severe noise.
Moreover, we find the effects of priming are sometimes stronger when early
visual layers are affected. Overall, our experiments confirm that top-down
signals can go a long way in improving object detection and segmentation.Comment: fixed error in author nam
Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained "Hard Faces"
Large-scale variations still pose a challenge in unconstrained face
detection. To the best of our knowledge, no current face detection algorithm
can detect a face as large as 800 x 800 pixels while simultaneously detecting
another one as small as 8 x 8 pixels within a single image with equally high
accuracy. We propose a two-stage cascaded face detection framework, Multi-Path
Region-based Convolutional Neural Network (MP-RCNN), that seamlessly combines a
deep neural network with a classic learning strategy, to tackle this challenge.
The first stage is a Multi-Path Region Proposal Network (MP-RPN) that proposes
faces at three different scales. It simultaneously utilizes three parallel
outputs of the convolutional feature maps to predict multi-scale candidate face
regions. The "atrous" convolution trick (convolution with up-sampled filters)
and a newly proposed sampling layer for "hard" examples are embedded in MP-RPN
to further boost its performance. The second stage is a Boosted Forests
classifier, which utilizes deep facial features pooled from inside the
candidate face regions as well as deep contextual features pooled from a larger
region surrounding the candidate face regions. This step is included to further
remove hard negative samples. Experiments show that this approach achieves
state-of-the-art face detection performance on the WIDER FACE dataset "hard"
partition, outperforming the former best result by 9.6% for the Average
Precision.Comment: 11 pages, 7 figures, to be presented at CRV 201
Recurrent Attentional Networks for Saliency Detection
Convolutional-deconvolution networks can be adopted to perform end-to-end
saliency detection. But, they do not work well with objects of multiple scales.
To overcome such a limitation, in this work, we propose a recurrent attentional
convolutional-deconvolution network (RACDNN). Using spatial transformer and
recurrent network units, RACDNN is able to iteratively attend to selected image
sub-regions to perform saliency refinement progressively. Besides tackling the
scale problem, RACDNN can also learn context-aware features from past
iterations to enhance saliency refinement in future iterations. Experiments on
several challenging saliency detection datasets validate the effectiveness of
RACDNN, and show that RACDNN outperforms state-of-the-art saliency detection
methods.Comment: CVPR 201
Big data analytics:Computational intelligence techniques and application areas
Big Data has significant impact in developing functional smart cities and supporting modern societies. In this paper, we investigate the importance of Big Data in modern life and economy, and discuss challenges arising from Big Data utilization. Different computational intelligence techniques have been considered as tools for Big Data analytics. We also explore the powerful combination of Big Data and Computational Intelligence (CI) and identify a number of areas, where novel applications in real world smart city problems can be developed by utilizing these powerful tools and techniques. We present a case study for intelligent transportation in the context of a smart city, and a novel data modelling methodology based on a biologically inspired universal generative modelling approach called Hierarchical Spatial-Temporal State Machine (HSTSM). We further discuss various implications of policy, protection, valuation and commercialization related to Big Data, its applications and deployment
- …