ImageSpirit: Verbal Guided Image Parsing
Humans describe images in terms of nouns and adjectives while algorithms
operate on images represented as sets of pixels. Bridging this gap between how
humans would like to access images versus their typical representation is the
goal of image parsing, which involves assigning object and attribute labels to
pixels. In this paper we propose treating nouns as object labels and adjectives
as visual attribute labels. This allows us to formulate the image parsing
problem as one of jointly estimating per-pixel object and attribute labels from
a set of training images. We propose an efficient (interactive time) solution.
Using the extracted labels as handles, our system empowers a user to verbally
refine the results. This enables hands-free parsing of an image into pixel-wise
object/attribute labels that correspond to human semantics. Verbally selecting
objects of interest enables a novel and natural interaction modality that can
possibly be used to interact with new generation devices (e.g. smart phones,
Google Glass, living room devices). We demonstrate our system on a large number
of real-world images with varying complexity. To help understand the tradeoffs
compared to traditional mouse based interactions, results are reported for both
a large-scale quantitative evaluation and a user study.
Comment: http://mmcheng.net/imagespirit
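The abstract above describes jointly estimating per-pixel object (noun) and attribute (adjective) labels. The paper does not give its formulation here, but the idea can be sketched minimally: pick each pixel's object label, then score attributes with a bonus from an assumed object-attribute compatibility table. The function name `parse_image` and the `compat` table are hypothetical illustrations, not the paper's method.

```python
import numpy as np

def parse_image(obj_scores, attr_scores, compat, attr_thresh=0.0):
    """Jointly assign per-pixel object and attribute labels (hypothetical sketch).

    obj_scores : (H, W, O) per-pixel object (noun) scores
    attr_scores: (H, W, A) per-pixel attribute (adjective) scores
    compat     : (O, A) object-attribute compatibility table, assumed learned
                 offline, boosting attributes that co-occur with each object
    """
    # Each pixel takes its highest-scoring object label.
    obj_labels = obj_scores.argmax(axis=-1)            # (H, W)
    # Add the chosen object's compatibility bonus to the attribute scores,
    # then threshold: a pixel may carry several attributes at once.
    joint = attr_scores + compat[obj_labels]           # (H, W, A)
    attr_labels = joint > attr_thresh                  # (H, W, A) boolean
    return obj_labels, attr_labels
```

In this toy form the object decision is independent per pixel; the paper's "interactive time" claim suggests a comparably cheap joint inference, but the exact model is not specified in the abstract.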
Creating virtual models from uncalibrated camera views
The reconstruction of photorealistic 3D models from camera views is becoming a ubiquitous element in many applications that simulate physical interaction with the real world. In this paper, we present a low-cost, interactive pipeline aimed at non-expert users that achieves 3D reconstruction from multiple views acquired with a standard digital camera. 3D models are amenable to access through diverse representation modalities that typically imply trade-offs between level of detail, interaction, and computational costs. Our approach allows users to selectively control the complexity of different surface regions, while requiring only simple 2D image editing operations. An initial reconstruction at coarse resolution is followed by iterative refinement of the surface areas corresponding to the selected regions.
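The selective-complexity idea above can be illustrated with a tiny budget allocator: user-selected regions receive more of the model's total triangle budget than unselected ones. The function `allocate_detail` and its weighting scheme are illustrative assumptions, not the paper's pipeline.

```python
def allocate_detail(regions, budget):
    """Split a global triangle budget across surface regions in proportion to
    user-assigned detail weights (hypothetical sketch of selective refinement).

    regions: dict mapping region name -> detail weight (higher = more triangles)
    budget : total triangle count allowed for the refined model
    """
    total = sum(regions.values())
    # Each region's share of the budget is proportional to its weight.
    return {name: int(budget * w / total) for name, w in regions.items()}
```

A refinement loop would then subdivide or decimate each region until its triangle count meets the allocated share.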
Automating joiners
Pictures taken from different viewpoints cannot be stitched into a geometrically consistent mosaic unless the structure of the scene is very special. However, geometrical consistency is not the only criterion for success: incorporating multiple viewpoints into the same picture may produce compelling and informative representations. A multi-viewpoint form of visual expression that has recently become highly popular is that of joiners (a term coined by artist David Hockney). Joiners are compositions where photographs are layered on a 2D canvas, with some photographs occluding others and boundaries fully visible.
Composing joiners is currently a tedious manual process, especially when a large number of photographs is involved. We are thus interested in automating their construction. Our approach is based on optimizing a cost function that encourages image-to-image consistency, measured on point features and along picture boundaries. The optimization looks for consistency in the 2D composition rather than 3D geometrical scene consistency, and explicitly considers occlusion between pictures. We illustrate our ideas with a number of experiments on collections of images of objects, people, and outdoor scenes.
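The point-feature part of such a consistency cost can be sketched directly: matched features in two photos should land on the same canvas spot once each photo's placement is applied. For brevity this sketch uses pure 2D translations and omits the boundary term; the function name `joiner_cost` and the weight parameters are illustrative assumptions.

```python
import numpy as np

def joiner_cost(matches, translations, w_feat=1.0):
    """2D composition cost for a joiner layout (hypothetical sketch).

    matches     : list of (i, j, pi, pj) where pi, pj are (N, 2) arrays of
                  matched feature points in photos i and j, in local coordinates
    translations: dict mapping photo index -> 2D offset on the canvas
    The feature term penalizes matched points landing on different canvas
    positions; the paper's boundary term is omitted here for brevity.
    """
    cost = 0.0
    for i, j, pi, pj in matches:
        # Canvas-space disagreement between the two copies of each match.
        d = (pi + translations[i]) - (pj + translations[j])
        cost += w_feat * np.sum(d ** 2)
    return cost
```

An optimizer would search over the placements (and an occlusion ordering) to minimize this cost, which is exactly 2D-composition consistency rather than 3D scene consistency.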
Capturing natural-colour 3D models of insects for species discovery
Collections of biological specimens are fundamental to scientific
understanding and characterization of natural diversity. This paper presents a
system for liberating useful information from physical collections by bringing
specimens into the digital domain so they can be more readily shared, analyzed,
annotated and compared. It focuses on insects and is strongly motivated by the
desire to accelerate and augment current practices in insect taxonomy which
predominantly use text, 2D diagrams and images to describe and characterize
species. While these traditional kinds of descriptions are informative and
useful, they cannot cover insect specimens "from all angles" and precious
specimens are still exchanged between researchers and collections for this
reason. Furthermore, insects can be complex in structure and pose many
challenges to computer vision systems. We present a new prototype for a
practical, cost-effective system of off-the-shelf components to acquire
natural-colour 3D models of insects from around 3mm to 30mm in length. Colour
images are captured from different angles and focal depths using a digital
single lens reflex (DSLR) camera rig and two-axis turntable. These 2D images
are processed into 3D reconstructions using software based on a visual hull
algorithm. The resulting models are compact (around 10 megabytes), afford
excellent optical resolution, and can be readily embedded into documents and
web pages, as well as viewed on mobile devices. The system is portable, safe,
relatively affordable, and complements the sort of volumetric data that can be
acquired by computed tomography. This system provides a new way to augment the
description and documentation of insect species holotypes, reducing the need to
handle or ship specimens. It opens up new opportunities to collect data for
research, education, art, entertainment, biodiversity assessment and
biosecurity control.
Comment: 24 pages, 17 figures, PLOS ONE journal
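The visual hull algorithm mentioned above admits a compact sketch: a candidate 3D point belongs to the hull only if it projects inside the object's silhouette in every view. This voxel-carving version is a minimal illustration of the general idea, not the system's actual calibrated multi-view software; the `project` callback stands in for assumed camera calibration.

```python
import numpy as np

def visual_hull(silhouettes, project, grid):
    """Carve a voxel visual hull from binary silhouettes (illustrative sketch).

    silhouettes: list of (H, W) boolean masks, one per camera view
    project    : function (view index, (N, 3) points) -> (N, 2) integer pixel
                 coordinates (x, y); stands in for calibrated camera projection
    grid       : (N, 3) candidate voxel centres
    A voxel survives only if it projects inside the silhouette in every view.
    """
    keep = np.ones(len(grid), dtype=bool)
    for v, sil in enumerate(silhouettes):
        px = project(v, grid)                      # (N, 2) pixel coords
        h, w = sil.shape
        # Voxels projecting outside the image cannot be inside the silhouette.
        inside = (px[:, 0] >= 0) & (px[:, 0] < w) & (px[:, 1] >= 0) & (px[:, 1] < h)
        hit = np.zeros(len(grid), dtype=bool)
        hit[inside] = sil[px[inside, 1], px[inside, 0]]
        keep &= hit
    return grid[keep]
```

With the turntable setup described in the abstract, each of the many view angles carves away more background, which is why concave detail is limited but silhouette-accurate shape and excellent colour texture are recovered.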
Search-and-replace editing for personal photo collections
We propose a new system for editing personal photo collections, inspired by search-and-replace editing for text. In our system, local edits specified by the user in a single photo (e.g., using the "clone brush" tool) can be propagated automatically to other photos in the same collection, by matching the edited region across photos. To achieve this, we build on tools from computer vision for image matching. Our experimental results on real photo collections demonstrate the feasibility and potential benefits of our approach.
Funding: Natural Sciences and Engineering Research Council of Canada Postdoctoral Fellowship; Massachusetts Institute of Technology Undergraduate Research Opportunities Program; National Science Foundation (U.S.) CAREER award 0447561; T-Party Project; United States National Geospatial-Intelligence Agency (NGA NEGI-1582-04-0004); United States Office of Naval Research Multidisciplinary University Research Initiative (Grant N00014-06-1-0734); Microsoft Research; Alfred P. Sloan Foundation
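The propagation step above (match the edited region in other photos, then apply the edit there) can be sketched with a brute-force patch search. The paper builds on stronger image-matching tools; this sum-of-squared-differences version, and the function name `propagate_edit`, are simplifying assumptions for illustration only.

```python
import numpy as np

def propagate_edit(src, edited, mask, dst):
    """Propagate a local edit from one photo to another (hypothetical sketch).

    src, dst: (H, W) grayscale images; edited: src after the user's local edit;
    mask    : boolean array marking the edited region (a box) in src.
    Finds the best-matching location of the *original* patch in dst by
    sum-of-squared-differences search, then pastes the edited pixels there.
    """
    ys, xs = np.where(mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    patch = src[y0:y1, x0:x1]
    ph, pw = patch.shape
    best, best_yx = np.inf, (0, 0)
    # Exhaustive search over all placements of the patch in dst.
    for y in range(dst.shape[0] - ph + 1):
        for x in range(dst.shape[1] - pw + 1):
            ssd = np.sum((dst[y:y + ph, x:x + pw] - patch) ** 2)
            if ssd < best:
                best, best_yx = ssd, (y, x)
    out = dst.copy()
    y, x = best_yx
    out[y:y + ph, x:x + pw] = edited[y0:y1, x0:x1]
    return out
```

A real system would use robust feature matching and geometric alignment instead of raw SSD, so that the edit survives viewpoint and lighting changes between photos in the collection.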