Salient Regions for Query by Image Content
Much previous work on image retrieval has used global features such as colour and texture to describe image content. However, global features are insufficient to describe the content accurately when different parts of the image have different characteristics. This paper discusses how this problem can be circumvented by using salient interest points. It extends previous work by incorporating the concept of scale into the selection of salient regions, so that the most interesting areas of the image are selected and local descriptors are generated to characterise each region. The paper describes two such salient region detectors and compares their repeatability rates under a range of common image transforms. Finally, it investigates the performance of one of the detectors in an image retrieval setting.
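The scale-aware selection idea above can be sketched in a few lines. This is a hedged illustration only: the saliency measure here is plain local variance over candidate window radii (a stand-in for the entropy-based measures used in this line of work), and the toy image, function names, and scale set are invented for the example.

```python
# Illustrative sketch: score every (position, scale) pair by a simple
# saliency measure and keep the best-scoring regions. Local variance is a
# stand-in for the detector's actual saliency measure.

def local_variance(img, cx, cy, r):
    """Variance of pixel values inside a (2r+1)x(2r+1) window, clipped to the image."""
    vals = []
    for y in range(max(0, cy - r), min(len(img), cy + r + 1)):
        for x in range(max(0, cx - r), min(len(img[0]), cx + r + 1)):
            vals.append(img[y][x])
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def salient_regions(img, scales=(1, 2, 3), top_k=3):
    """Return the top_k (score, x, y, r) tuples over all positions and scales."""
    h, w = len(img), len(img[0])
    candidates = []
    for cy in range(h):
        for cx in range(w):
            for r in scales:
                candidates.append((local_variance(img, cx, cy, r), cx, cy, r))
    candidates.sort(reverse=True)
    return candidates[:top_k]

# Toy image: a bright 3x3 blob on a dark 8x8 background.
img = [[0] * 8 for _ in range(8)]
for y in range(3, 6):
    for x in range(3, 6):
        img[y][x] = 255
print(salient_regions(img)[0])  # the best region sits near the blob boundary
```

A real detector would additionally apply non-maximum suppression so that overlapping high-scoring windows collapse to a single salient region per image structure.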
A new Edge Detector Based on Parametric Surface Model: Regression Surface Descriptor
In this paper we present a new methodology for edge detection in digital
images. The first originality of the proposed method is to consider image
content as a parametric surface. Then, an original parametric local model of
this surface representing image content is proposed. The few parameters
involved in the proposed model are shown to be very sensitive to
discontinuities in the surface, which correspond to edges in image content. This
naturally leads to the design of an efficient edge detector. Moreover, a
thorough analysis of the proposed model also allows us to explain how these
parameters can be used to obtain edge descriptors such as orientations and
curvatures.
In practice, the proposed methodology offers two main advantages. First, it
has high customization possibilities in order to be adjusted to a wide range of
different problems, from coarse to fine scale edge detection. Second, it is
very robust to blurring and additive noise. Numerical results are
presented to emphasise these properties and to confirm the efficiency of the
proposed method through a comparative study with other edge detectors.
Comment: 21 pages, 13 figures, and 2 tables
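A minimal sketch of the surface-fitting idea, not the authors' actual Regression Surface Descriptor: a least-squares plane is fitted to each 3×3 neighbourhood, and the fitted slope parameters respond strongly at intensity discontinuities. The window size, toy image, and function name are assumptions for illustration.

```python
# Illustrative sketch: treat the image as a surface z(x, y), fit a local
# parametric model (here a plane), and flag edges where the fitted slope
# parameters are large.

def plane_fit_gradient(img, x, y):
    """Least-squares plane z = a + b*dx + c*dy over the 3x3 window at (x, y).
    For unit offsets dx, dy in {-1, 0, 1}, b = sum(dx*z)/6 and c = sum(dy*z)/6."""
    b = c = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            z = img[y + dy][x + dx]
            b += dx * z
            c += dy * z
    b /= 6.0
    c /= 6.0
    return (b * b + c * c) ** 0.5  # slope magnitude: large at discontinuities

# Toy image: a vertical step edge between columns 3 and 4.
img = [[0] * 4 + [100] * 4 for _ in range(8)]
flat = plane_fit_gradient(img, 1, 4)   # inside a flat region
edge = plane_fit_gradient(img, 4, 4)   # straddling the step
print(flat, edge)  # prints 0.0 50.0
```

The same fitted parameters (b, c) also give the edge orientation, which is one way to read the abstract's claim that the model yields edge descriptors such as orientations.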
Automated pebble mosaic stylization of images
Digital mosaics have usually used regular tiles, simulating the historical
"tessellated" mosaics. In this paper, we present a method for synthesizing
pebble mosaics, a historical mosaic style in which the tiles are rounded
pebbles. We address both the tiling problem, where pebbles are distributed over
the image plane so as to approximate the input image content, and the problem
of geometry, creating a smooth rounded shape for each pebble. We adapt SLIC,
simple linear iterative clustering, to obtain elongated tiles conforming to
image content, and smooth the resulting irregular shapes into shapes resembling
pebble cross-sections. Then, we create an interior and exterior contour for
each pebble and solve a Laplace equation over the region between them to obtain
height-field geometry. The resulting pebble set approximates the input image
while presenting full geometry that can be rendered and textured for a highly
detailed representation of a pebble mosaic.
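The height-field step lends itself to a small sketch. Under the assumption that the exterior contour is pinned at height 0 and the interior contour at height 1 (the grid size, solver, and boundary setup below are illustrative choices, not the paper's implementation), relaxing Laplace's equation between the two contours yields a smooth pebble-like profile:

```python
# Illustrative sketch: solve Laplace's equation by Jacobi relaxation on a
# small grid. Boundary values are pinned; every free cell converges to the
# mean of its four neighbours, giving a smooth height field.

def solve_laplace(h, fixed, iters=500):
    """Jacobi relaxation: each non-fixed interior cell becomes the mean of its neighbours."""
    rows, cols = len(h), len(h[0])
    for _ in range(iters):
        new = [row[:] for row in h]
        for y in range(1, rows - 1):
            for x in range(1, cols - 1):
                if not fixed[y][x]:
                    new[y][x] = 0.25 * (h[y-1][x] + h[y+1][x] + h[y][x-1] + h[y][x+1])
        h = new
    return h

n = 9
h = [[0.0] * n for _ in range(n)]          # exterior "contour": height 0
fixed = [[False] * n for _ in range(n)]
for i in range(n):                          # pin the outer boundary at 0
    for j in (0, n - 1):
        fixed[i][j] = fixed[j][i] = True
h[4][4] = 1.0                               # interior "contour": a single peak at height 1
fixed[4][4] = True
h = solve_laplace(h, fixed)
print(round(h[4][3], 3))                    # neighbour of the peak: strictly between 0 and 1
```

Because harmonic functions obey the maximum principle, the solved heights decrease smoothly and monotonically from the interior contour to the exterior one, which is exactly the rounded cross-section a pebble needs.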
Image Content Generation with Causal Reasoning
The emergence of ChatGPT has once again sparked research in generative
artificial intelligence (GAI). While people have been amazed by the generated
results, they have also noticed the reasoning potential reflected in the
generated textual content. However, this current ability for causal reasoning
is primarily limited to the domain of language generation, such as in models
like GPT-3. In the visual modality, no equivalent research currently exists.
Considering causal reasoning in visual content generation is significant
because visual information contains much finer granularity: images can provide
more intuitive and specific demonstrations for certain reasoning tasks than
coarse-grained text can. Hence, we
propose a new image generation task called visual question answering with image
(VQAI) and establish a dataset of the same name based on the classic
\textit{Tom and Jerry} animated series. Additionally, we develop a new paradigm
for image generation to tackle the challenges of this task. Finally, we perform
extensive experiments and analyses, including visualizations of the generated
content and discussions of its potential and limitations. The code and dataset
are publicly available for academic and non-commercial use under the
CC BY-NC-SA 4.0 license at:
https://github.com/IEIT-AGI/MIX-Shannon/blob/main/projects/VQAI/lgd_vqai.md
Comment: Accepted by the 38th Annual AAAI Conference on Artificial
Intelligence (AAAI 2024) in December 2023
Investigating the impact of image content on the energy efficiency of hardware-accelerated digital spatial filters
Battery-operated low-power portable computing devices have become an inseparable part of daily life, and one of the major design goals is to maximise battery life in such devices. At the same time, the demand for performance in processing multimedia content is ever increasing, and processing image and video content consumes more power than most other applications. A widely used approach to improving energy efficiency is to implement computationally intensive functions as digital hardware accelerators. Spatial filtering is one of the most commonly used methods in digital image processing. By Fourier theory, an image can be considered a two-dimensional signal composed of spatially extended two-dimensional sinusoidal patterns called gratings. Spatial frequency theory states that sinusoidal gratings are characterised by their spatial frequency, phase, amplitude, and orientation. This article presents results from our investigation into how these characteristics of a digital image affect the energy efficiency of hardware-accelerated spatial filters used to process that image. Two greyscale images, each 128 × 128 pixels and comprising two-dimensional sinusoidal gratings at the maximum spatial frequency of 64 cycles per image, orientated at 0° and 90° respectively, were processed by a hardware-implemented Gaussian smoothing filter. The energy efficiency of the filter was compared with the baseline energy efficiency of processing a featureless plain black image. The results show that the energy efficiency of the filter drops to 12.5% when the gratings are orientated at 0°, whilst it rises to 72.38% at 90°.
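The stimuli described above are straightforward to reproduce. The sketch below generates the two gratings; the 128 × 128 size and 64 cycles per image follow the article, while the 0-255 greyscale mapping and the 90° phase offset (needed because 64 cycles per 128 pixels is the Nyquist rate, where a zero-phase sine would sample only zero crossings) are assumptions of this illustration.

```python
# Illustrative sketch: a sinusoidal grating is fully specified by spatial
# frequency, phase, amplitude, and orientation. Pixel value varies along the
# grating direction and is constant perpendicular to it.

import math

def grating(size, cycles, orientation_deg, phase=0.0):
    """Greyscale sinusoidal grating with values in 0..255."""
    theta = math.radians(orientation_deg)
    img = []
    for y in range(size):
        row = []
        for x in range(size):
            # Project (x, y) onto the grating direction, normalised to the image width.
            t = (x * math.cos(theta) + y * math.sin(theta)) / size
            v = 0.5 + 0.5 * math.sin(2 * math.pi * cycles * t + phase)
            row.append(round(255 * v))
        img.append(row)
    return img

# At the Nyquist rate (64 cycles per 128 pixels) a zero phase would sample
# every zero crossing, so a 90-degree phase offset realises the bars.
g0 = grating(128, 64, 0, phase=math.pi / 2)    # varies along x: vertical bars
g90 = grating(128, 64, 90, phase=math.pi / 2)  # varies along y: horizontal bars
print(g0[0][:4], g90[0][0], g90[1][0])
```

At this frequency the 0° grating alternates between black and white on every pixel along each row, which is the worst case for switching activity in the filter datapath; the 90° grating alternates only between rows, consistent with the article's observation that orientation alone changes the measured energy efficiency.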
Content-based Propagation of User Markings for Interactive Segmentation of Patterned Images
Efficient and easy segmentation of images and volumes is of great practical
importance. Segmentation problems that motivate our approach originate from
microscopy imaging commonly used in materials science, medicine, and biology.
We formulate image segmentation as a probabilistic pixel classification
problem, and we apply segmentation as a step towards characterising image
content. Our method allows the user to define structures of interest by
interactively marking a subset of pixels. Thanks to the real-time feedback, the
user can place new markings strategically, depending on the current outcome.
The final pixel classification may be obtained from a very modest user input.
An important ingredient of our method is a graph that encodes image content.
This graph is built in an unsupervised manner during initialisation and is
based on clustering of image features. Since we combine a limited amount of
user-labelled data with the clustering information obtained from the unlabelled
parts of the image, our method fits in the general framework of semi-supervised
learning. We demonstrate how this can be a very efficient approach to
segmentation through pixel classification.
Comment: 9 pages, 7 figures, PDFLaTeX
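The propagation idea can be sketched with a toy example. Here the unsupervised step is 1-D k-means on raw intensity (the paper clusters richer image features), and the pixel values, marking positions, and label names are invented; every unmarked pixel inherits the majority label among the user-marked pixels in its cluster.

```python
# Illustrative sketch of semi-supervised label propagation: cluster pixels
# without supervision, then spread a few user markings cluster-wide.

def kmeans_1d(vals, k=2, iters=25):
    """Tiny 1-D k-means: returns a cluster index for each value."""
    centers = [min(vals) + i * (max(vals) - min(vals)) / (k - 1) for i in range(k)]
    assign = [0] * len(vals)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: abs(v - centers[c])) for v in vals]
        for c in range(k):
            members = [v for v, a in zip(vals, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign

def propagate(pixels, markings, k=2):
    """markings maps pixel index -> user label. Each pixel receives the
    majority label of the marked pixels in its cluster (None if unmarked)."""
    assign = kmeans_1d(pixels, k)
    votes = {}
    for idx, label in markings.items():
        votes.setdefault(assign[idx], []).append(label)
    majority = {c: max(set(v), key=v.count) for c, v in votes.items()}
    return [majority.get(a) for a in assign]

# Toy "image": dark background pixels and bright structure pixels;
# the user marks just one pixel of each kind.
pixels = [10, 12, 11, 200, 205, 9, 198, 13, 202]
labels = propagate(pixels, {0: "background", 3: "structure"})
print(labels)
```

Because the clustering is computed once at initialisation, relabelling after each new marking is cheap, which is what makes the real-time feedback loop described in the abstract practical.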