A Study of the Role of Visual Information in Supporting Ideation in Graphic Design
Existing computer technologies poorly support the ideation phase common to graphic design practice. Finding and indexing visual material to assist the process of ideation often fall to the designer, leading to user experiences that are less than ideal. To inform development of computer systems to assist graphic designers in the ideation phase of the design process, we conducted interviews with 15 professional graphic designers about their design process and visual information needs. Based on the study, we propose a set of requirements for an ideation-support system for graphic design.
Exploration of a Polarized Surface Bidirectional Reflectance Model Using the Ground-Based Multiangle Spectropolarimetric Imager
Accurate characterization of surface reflection is essential for retrieval of aerosols using downward-looking remote sensors. In this paper, observations from the Ground-based Multiangle SpectroPolarimetric Imager (GroundMSPI) are used to evaluate a surface polarized bidirectional reflectance distribution function (PBRDF) model. GroundMSPI is an eight-band spectropolarimetric camera mounted on a rotating gimbal to acquire pushbroom imagery of outdoor landscapes. The camera uses a very accurate photoelastic-modulator-based polarimetric imaging technique to acquire Stokes vector measurements in three of the instrument's bands (470, 660, and 865 nm). A description of the instrument is presented, and observations of selected targets within a scene acquired on 6 January 2010 are analyzed. Data collected during the course of the day as the Sun moved across the sky provided a range of illumination geometries that facilitated evaluation of the surface model, which comprises a volumetric reflection term represented by the modified Rahman-Pinty-Verstraete function plus a specular reflection term generated by a randomly oriented array of Fresnel-reflecting microfacets. While the model is fairly successful in predicting the polarized reflection from two grass targets in the scene, it does a poorer job for two manmade targets (a parking lot and a truck roof), possibly due to their greater degree of geometric organization. Several empirical adjustments to the model are explored and lead to improved fits to the data. For all targets, the data support the notion of spectral invariance in the angular shape of the unpolarized and polarized surface reflection. As noted by others, this behavior provides valuable constraints on the aerosol retrieval problem, and highlights the importance of multiangle observations. NASA; JPL; Center for Space Research
Recent Developments in Cultural Heritage Image Databases: Directions for User-Centered Design
published or submitted for publication
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval. Published or submitted for publication.
Towards Diverse and Natural Image Descriptions via a Conditional GAN
Despite the substantial progress in recent years, the image captioning
techniques are still far from being perfect. Sentences produced by existing
methods, e.g. those based on RNNs, are often overly rigid and lacking in
variability. This issue is related to a learning principle widely used in
practice, that is, to maximize the likelihood of training samples. This
principle encourages high resemblance to the "ground-truth" captions while
suppressing other reasonable descriptions. Conventional evaluation metrics,
e.g. BLEU and METEOR, also favor such restrictive methods. In this paper, we
explore an alternative approach, with the aim to improve the naturalness and
diversity -- two essential properties of human expression. Specifically, we
propose a new framework based on Conditional Generative Adversarial Networks
(CGAN), which jointly learns a generator to produce descriptions conditioned on
images and an evaluator to assess how well a description fits the visual
content. It is noteworthy that training a sequence generator is nontrivial. We
overcome the difficulty by Policy Gradient, a strategy stemming from
Reinforcement Learning, which allows the generator to receive early feedback
along the way. We tested our method on two large datasets, where it performed
competitively against real people in our user study and outperformed other
methods on various tasks. Comment: accepted in ICCV2017 as an Oral paper
Balancing the power of multimedia information retrieval and usability in designing interactive TV
Steady progress in the field of multimedia information retrieval (MMIR) promises a useful set of tools that could provide new usage scenarios and features to enhance the user experience in today's digital media applications. In the interactive TV domain, the simplicity of interaction is more crucial than in any other digital media domain and ultimately determines the success or otherwise of any new applications. Thus when integrating emerging tools like MMIR into interactive TV, the increase in interface complexity and sophistication resulting from these features can easily reduce its actual usability. In this paper we describe a design strategy we developed as a result of our effort in balancing the power of emerging multimedia information retrieval techniques and maintaining the simplicity of the interface in interactive TV. By providing multiple levels of interface sophistication in increasing order as a viewer repeatedly presses the same button on their remote control, we provide a layered interface that can accommodate viewers requiring varying degrees of power and simplicity. A series of screen shots from the system we have actually developed and built illustrates how this is achieved.
Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
Recognizing scene text is a challenging problem, even more so than the
recognition of scanned documents. This problem has gained significant attention
from the computer vision community in recent years, and several methods based
on energy minimization frameworks and deep learning approaches have been
proposed. In this work, we focus on the energy minimization framework and
propose a model that exploits both bottom-up and top-down cues for recognizing
cropped words extracted from street images. The bottom-up cues are derived from
individual character detections from an image. We build a conditional random
field model on these detections to jointly model the strength of the detections
and the interactions between them. These interactions are top-down cues
obtained from a lexicon-based prior, i.e., language statistics. The optimal
word represented by the text image is obtained by minimizing the energy
function corresponding to the random field model. We evaluate our proposed
algorithm extensively on a number of cropped scene text benchmark datasets,
namely Street View Text, ICDAR 2003, 2011 and 2013 datasets, and IIIT 5K-word,
and show better performance than comparable methods. We perform a rigorous
analysis of all the steps in our approach and analyze the results. We also show
that state-of-the-art convolutional neural network features can be integrated
in our framework to further improve the recognition performance.
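The energy-minimization idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a simple left-to-right chain over a fixed set of character windows, uses negative log detection scores as bottom-up (unary) costs and smoothed bigram statistics from a word list as top-down (pairwise) costs, and swaps in exact dynamic programming as the minimizer. The names `bigram_costs` and `min_energy_word`, the `weight` parameter, and the toy scores are all hypothetical.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def bigram_costs(lexicon, alphabet=ALPHABET, smoothing=1.0):
    """Pairwise costs -log P(b | a), estimated from a word list (assumed prior)."""
    pair_counts, first_counts = Counter(), Counter()
    for word in lexicon:
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    costs = {}
    for a in alphabet:
        denom = first_counts[a] + smoothing * len(alphabet)
        for b in alphabet:
            costs[(a, b)] = -math.log((pair_counts[(a, b)] + smoothing) / denom)
    return costs

def min_energy_word(detections, pairwise, weight=1.0):
    """Minimize sum of unary + weight * pairwise costs over a chain of windows.

    detections: one dict per character window, mapping candidate character
    to a detection score in (0, 1].
    """
    # best[c] = (lowest energy of any labeling ending in c, that labeling)
    best = {c: (-math.log(s), c) for c, s in detections[0].items()}
    for node in detections[1:]:
        new_best = {}
        for c, s in node.items():
            energy, prefix = min(
                (e + weight * pairwise[(p[-1], c)], p) for e, p in best.values()
            )
            new_best[c] = (energy - math.log(s), prefix + c)
        best = new_best
    energy, word = min(best.values())
    return word, energy

# Toy example: the second window slightly prefers "q" over "p".
pairwise = bigram_costs(["open", "pen", "opera", "spend"])
detections = [{"o": 0.9}, {"p": 0.4, "q": 0.6}, {"e": 0.9}, {"n": 0.9}]
print(min_energy_word(detections, pairwise))
print(min_energy_word(detections, pairwise, weight=0.0))
```

Setting `weight=0.0` drops the lexicon prior, so the two calls show how the top-down term can override a locally stronger but linguistically unlikely detection.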