Spatially organized visualization of image query results
Gianluigi Ciocca, Claudio Cusano, Simone Santini, Raimondo Schettini, "Spatially organized visualization of image query results", Proceedings of SPIE 7881, Multimedia on Mobile Devices 2011; and Multimedia Content Access: Algorithms and Systems V, ed. David Akopian, Reiner Creutzburg, Cees G. M. Snoek, Nicu Sebe, Lyndon Kennedy, SPIE (2011).
In this work we present a system that visualizes the results obtained from image search engines in such a way that users can conveniently browse the retrieved images. The way in which search results are presented allows the user to grasp the composition of the set of images "at a glance". To do so, images are grouped and positioned according to their distribution in a prosemantic feature space, which encodes information about their content at an abstraction level between visual and semantic information. The compactness of the feature space allows a fast analysis of the image distribution, so that all the computation can be performed in real time.
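A minimal sketch of the grouping-and-placement idea, assuming generic prosemantic feature vectors stored as NumPy arrays and using scikit-learn's KMeans for grouping and MDS for 2-D placement; the paper's actual layout algorithm may differ.

```python
# Minimal sketch: group query results by their prosemantic feature vectors
# and compute 2-D positions for an "at a glance" layout.
# Assumptions: KMeans and MDS stand in for the grouping/placement strategy
# described in the paper; feature dimensionality is illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import MDS

def layout_results(features: np.ndarray, n_groups: int = 8, seed: int = 0):
    """Return (group labels, 2-D positions) for each retrieved image."""
    groups = KMeans(n_clusters=n_groups, random_state=seed, n_init=10).fit_predict(features)
    positions = MDS(n_components=2, random_state=seed).fit_transform(features)
    return groups, positions

if __name__ == "__main__":
    feats = np.random.rand(200, 14)   # e.g. 200 results, 14-D prosemantic scores
    labels, xy = layout_results(feats)
    print(labels[:10], xy[:3])
```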
Predicting complexity perception of real world images
The aim of this work is to predict the complexity perception of real world images. We propose a new complexity measure in which different image features, based on spatial, frequency, and color properties, are linearly combined. To find the optimal set of weighting coefficients we applied Particle Swarm Optimization. The optimal linear combination is the one that best fits the subjective data obtained in an experiment where observers evaluated the complexity of real world scenes on a web-based interface. To test the proposed complexity measure we performed a second experiment on a different database of real world scenes, where the linear combination previously obtained was correlated with the new subjective data. Our complexity measure outperforms not only each single visual feature but also two visual clutter measures frequently used in the literature to predict image complexity. To analyze the usefulness of our proposal, we also considered two different sets of stimuli composed of real texture images. Tuning the parameters of our measure for this kind of stimuli, we obtained a linear combination that still outperforms the single measures. In conclusion, our measure, properly tuned, can predict complexity perception of different kinds of images.
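A minimal sketch of the weight-fitting idea, assuming a hypothetical feature matrix and subjective ratings; a small hand-rolled particle swarm maximizing Pearson correlation stands in for the optimization setup actually used in the paper.

```python
# Minimal sketch: fit the weights of a linear combination of image features
# so that the combined score correlates with subjective complexity ratings.
# Assumptions: `features` (n_images x n_features) and `ratings` are given;
# the PSO hyperparameters are illustrative, not those of the paper.
import numpy as np

def pearson(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def pso_fit(features, ratings, n_particles=30, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    dim = features.shape[1]
    pos = rng.uniform(-1, 1, (n_particles, dim))     # candidate weight vectors
    vel = np.zeros_like(pos)
    fitness = lambda w: pearson(features @ w, ratings)
    pbest = pos.copy()
    pbest_f = np.array([fitness(w) for w in pos])
    gbest = pbest[pbest_f.argmax()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos += vel
        f = np.array([fitness(w) for w in pos])
        improved = f > pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmax()].copy()
    return gbest, pbest_f.max()
```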
State Recognition of Food Images Using Deep Features
State recognition of food images is a recent topic that is gaining considerable interest in the Computer Vision community. Recently, researchers presented a dataset of food images at different states in which, unfortunately, no information regarding the food category was included. In practical food monitoring applications it is important to be able to recognize a peeled tomato rather than a generic peeled item. To this end, in this paper we introduce a new dataset containing 20 different food categories taken from fruits and vegetables at 11 different states, ranging from solid and sliced to creamy paste. We experiment with the most common Convolutional Neural Network (CNN) architectures on three different recognition tasks: food categories, food states, and both food categories and states. Since lack of labeled data is a common situation in practical applications, we exploit deep features extracted from CNNs combined with Support Vector Machines (SVMs) as an alternative to end-to-end classification. We also compare deep features with several hand-crafted features. These experiments confirm that deep features outperform hand-crafted features on all three classification tasks, regardless of the food category or food state considered. Finally, we test the generalization capability of the best-performing deep features by using another, publicly available, dataset of food states. This last experiment shows that the features extracted from a CNN trained on our proposed dataset achieve performance close to that of the state-of-the-art method. This confirms that our deep features are robust with respect to data never seen by the CNN.
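A minimal sketch of the deep-features-plus-SVM pipeline, assuming torchvision's ResNet-18 as a stand-in backbone, an ImageFolder-style dataset at hypothetical paths, and a linear SVM from scikit-learn; the paper evaluates several CNN architectures.

```python
# Minimal sketch: use a pretrained CNN as a fixed feature extractor and train
# an SVM on the extracted features, as an alternative to end-to-end training.
# Assumptions: ResNet-18 backbone and dataset paths are illustrative.
import torch
import torch.nn as nn
from torchvision import models, transforms, datasets
from sklearn.svm import LinearSVC

preprocess = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()          # keep the 512-D pooled features
backbone.eval()

def extract_features(folder):
    ds = datasets.ImageFolder(folder, transform=preprocess)
    loader = torch.utils.data.DataLoader(ds, batch_size=32)
    feats, labels = [], []
    with torch.no_grad():
        for x, y in loader:
            feats.append(backbone(x))
            labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

X_train, y_train = extract_features("food_states/train")   # hypothetical paths
X_test, y_test = extract_features("food_states/test")
clf = LinearSVC(C=1.0).fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```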
Searching through photographic databases with QuickLook
G. Ciocca, C. Cusano, R. Schettini, S. Santini, A. de Polo, F. Tavanti, "Searching through photographic databases with QuickLook", Proceedings of SPIE 8304, Multimedia on Mobile Devices 2012; and Multimedia Content Access: Algorithms and Systems VI, ed. Reiner Creutzburg, David Akopian, Cees G. M. Snoek, Nicu Sebe, Lyndon Kennedy, SPIE, 83040V (2012).
We present here the results obtained by including a new image descriptor, which we call the prosemantic feature vector, within the framework of the QuickLook2 image retrieval system. By coupling the prosemantic features with the relevance feedback mechanism provided by QuickLook2, the user can move more rapidly and precisely through the feature space toward the intended goal. The prosemantic features are obtained by a two-step feature extraction process. In the first step, low-level features related to image structure and color distribution are extracted from the images. In the second step, these features are used as input to a bank of classifiers, each one trained to recognize a given semantic category, to produce score vectors. We evaluated the efficacy of the prosemantic features in search tasks on a dataset provided by the Fratelli Alinari Photo Archive.
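A minimal sketch of the two-step prosemantic descriptor, assuming precomputed low-level features, a hypothetical list of semantic categories, and logistic regression as a stand-in for the classifiers actually used.

```python
# Minimal sketch of a two-step "prosemantic" descriptor: low-level features
# feed a bank of per-category classifiers whose scores form the final vector.
# Assumptions: the category list and the classifier choice are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

CATEGORIES = ["portrait", "landscape", "cityscape", "animals", "night", "flowers"]

def train_bank(low_level_feats, category_labels):
    """Train one binary classifier per semantic category."""
    bank = {}
    for c in CATEGORIES:
        y = (category_labels == c).astype(int)
        bank[c] = LogisticRegression(max_iter=1000).fit(low_level_feats, y)
    return bank

def prosemantic_vector(bank, low_level_feat):
    """Stack the per-category scores into a compact prosemantic descriptor."""
    x = low_level_feat.reshape(1, -1)
    return np.array([bank[c].predict_proba(x)[0, 1] for c in CATEGORIES])
```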
A Robust Multi-Feature Cut Detection Algorithm for Video Segmentation
Video segmentation is the first task in almost all video analysis applications. It consists of identifying the boundaries of the meaningful video units (shots). Cuts are by far the most common of the production effects that characterize shot boundaries. In this paper we propose an algorithm for cut detection that exploits an innovative, robust frame difference measure based on a combination of different visual features. To improve the precision of the cut detection algorithm, a temporal pattern analysis model and a flash-removal step are also proposed. Experimental results proving the effectiveness of the proposed measure, coupled with the temporal pattern analysis model, on very heterogeneous and complex sets of videos are critically reported.
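A minimal sketch of multi-feature cut detection, assuming OpenCV for decoding and feature extraction; the combination of a color-histogram distance with an edge-density difference, the threshold, and the simple flash filter are illustrative rather than the measure and temporal model of the paper.

```python
# Minimal sketch: detect hard cuts from a multi-feature frame difference
# (color-histogram distance combined with an edge-density difference) and
# discard detections that revert within a few frames, a crude flash filter.
# Assumptions: weights, threshold, and window are illustrative values.
import cv2

def frame_features(frame):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [32, 32], [0, 180, 0, 256])
    hist = cv2.normalize(hist, None).flatten()
    edges = cv2.Canny(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 100, 200)
    return hist, edges.mean() / 255.0

def detect_cuts(path, threshold=0.5, flash_window=3):
    cap = cv2.VideoCapture(path)
    diffs, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist, edge_density = frame_features(frame)
        if prev is not None:
            hist_diff = cv2.compareHist(prev[0], hist, cv2.HISTCMP_BHATTACHARYYA)
            diffs.append(0.7 * hist_diff + 0.3 * abs(edge_density - prev[1]))
        prev = (hist, edge_density)
    cap.release()
    candidates = [i + 1 for i, d in enumerate(diffs) if d > threshold]
    # Flash filter: a large difference followed by another one a few frames
    # later (back to the original content) is likely a flash, not a cut.
    cuts = [c for c in candidates
            if not any(0 < c2 - c <= flash_window for c2 in candidates)]
    return cuts
```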