Hybrid coding of visual content and local image features
Distributed visual analysis applications, such as mobile visual search or
Visual Sensor Networks (VSNs) require the transmission of visual content on a
bandwidth-limited network, from a peripheral node to a processing unit.
Traditionally, a Compress-Then-Analyze approach has been pursued, in which
sensing nodes acquire and encode the pixel-level representation of the visual
content, which is then transmitted to a sink node in order to be
processed. This approach might not represent the most effective solution, since
several analysis applications leverage a compact representation of the content,
thus resulting in an inefficient usage of network resources. Furthermore,
coding artifacts might significantly impact the accuracy of the visual task at
hand. To tackle such limitations, an orthogonal approach named
Analyze-Then-Compress has been proposed. According to such a paradigm, sensing
nodes are responsible for the extraction of visual features, which are encoded
and transmitted to a sink node for further processing. Despite improved
task efficiency, this paradigm leaves the central processing node unable to
reconstruct a pixel-level representation of the visual content. In this
paper we propose an effective compromise between the two paradigms, namely
Hybrid-Analyze-Then-Compress (HATC), which aims at jointly encoding visual
content and local image features. Furthermore, we show how a target tradeoff
between image quality and task accuracy might be achieved by appropriately
allocating the bitrate between visual content and local features.
Comment: submitted to IEEE International Conference on Image Processin
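To illustrate the bitrate-allocation tradeoff mentioned in the abstract above, here is a minimal sketch, not the authors' method. It assumes two caller-supplied curves, `quality(rate)` and `accuracy(rate)`, and a hypothetical `weight` that expresses how much image quality matters relative to task accuracy. The function name `best_split` and the two toy curves are placeholders invented for this example and carry no measured data.

```python
import math


def best_split(total_kbps, quality, accuracy, weight=0.5, steps=20):
    """Grid-search the share of the bit budget given to pixel content.

    quality(rate)  -> image-quality score for the pixel stream (assumed callable)
    accuracy(rate) -> task-accuracy score for the feature stream (assumed callable)
    Returns (score, image_kbps, feature_kbps) for the best split found.
    """
    best = None
    for i in range(steps + 1):
        image_kbps = total_kbps * i / steps
        feature_kbps = total_kbps - image_kbps
        score = weight * quality(image_kbps) + (1 - weight) * accuracy(feature_kbps)
        if best is None or score > best[0]:
            best = (score, image_kbps, feature_kbps)
    return best


# Placeholder saturating curves, chosen only so the example runs.
def toy_quality(kbps):
    return 1.0 - math.exp(-kbps / 40.0)


def toy_accuracy(kbps):
    return 1.0 - math.exp(-kbps / 10.0)


score, image_kbps, feature_kbps = best_split(100.0, toy_quality, toy_accuracy)
print(f"image: {image_kbps:.0f} kbps, features: {feature_kbps:.0f} kbps")
```

Setting `weight` to 1.0 sends the whole budget to the pixel stream, and 0.0 sends it all to the features; values in between trace out the tradeoff.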
PEA265: Perceptual Assessment of Video Compression Artifacts
The most widely used video encoders share a common hybrid coding framework
that includes block-based motion estimation/compensation and block-based
transform coding. Despite their high coding efficiency, the encoded videos
often exhibit visually annoying artifacts, denoted as Perceivable Encoding
Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience
(QoE) of end users. To monitor and improve visual QoE, it is crucial to develop
subjective and objective measures that can identify and quantify various types
of PEAs. In this work, we make the first attempt to build a large-scale
subject-labelled database composed of H.265/HEVC compressed videos containing
various PEAs. The database, namely the PEA265 database, includes 4 types of
spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types
of temporal PEAs (i.e. flickering and floating), each containing at least
60,000 image or video patches with positive and negative labels. To objectively
identify these PEAs, we train Convolutional Neural Networks (CNNs) using the
PEA265 database. It appears that the state-of-the-art ResNeXt is capable of
identifying each type of PEA with high accuracy. Furthermore, we define PEA
pattern and PEA intensity measures to quantify PEA levels of compressed video
sequences. We believe that the PEA265 database and our findings will benefit the
future development of video quality assessment methods and perceptually
motivated video encoders.
Comment: 10 pages, 15 figures, 4 table
Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation
Remote sensing (RS) image retrieval is of great significance for geological
information mining. Over the past two decades, a large amount of research on
this task has been carried out, which mainly focuses on the following three
core issues: feature extraction, similarity metric and relevance feedback. Due
to the complexity and multiformity of ground objects in high-resolution remote
sensing (HRRS) images, there is still room for improvement in the current
retrieval approaches. In this paper, we analyze the three core issues of RS
image retrieval and provide a comprehensive review of existing methods.
Furthermore, to advance the state of the art in HRRS image
retrieval, we focus on the feature extraction issue and examine how to use
powerful deep representations to address this task. We conduct a systematic
investigation of the factors that may affect the performance
of deep features. By optimizing each factor, we obtain remarkable retrieval
results on publicly available HRRS datasets. Finally, we explain the
experimental phenomena in detail and draw conclusions according to our
analysis. Our work can serve as a guide for research on
content-based RS image retrieval.
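As a rough illustration of how feature-based image retrieval of the kind this abstract discusses can work, the sketch below ranks database images by cosine similarity between feature vectors. Cosine similarity is an assumption made for this example; the abstract does not say which similarity metric the authors use. The three-dimensional vectors stand in for deep features, and all names (`l2_normalize`, `rank_by_cosine`, the image labels) are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_by_cosine(query, database):
    """Return database keys ordered from most to least similar to the query.

    database maps an image name to its feature vector (same length as query).
    """
    q = l2_normalize(query)
    scored = []
    for name, feat in database.items():
        f = l2_normalize(feat)
        similarity = sum(a * b for a, b in zip(q, f))
        scored.append((similarity, name))
    scored.sort(reverse=True)
    return [name for _, name in scored]


# Toy feature vectors, invented for the example.
database = {
    "harbor": [1.0, 0.0, 0.0],
    "forest": [0.0, 1.0, 0.0],
    "port": [0.9, 0.1, 0.0],
}
print(rank_by_cosine([1.0, 0.0, 0.0], database))
```

In a real system the vectors would come from a trained network and the linear scan would usually be replaced by an approximate nearest-neighbour index.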