Mosaic: Designing Online Creative Communities for Sharing Works-in-Progress
Online creative communities allow creators to share their work with a large
audience, maximizing opportunities to showcase their work and connect with fans
and peers. However, sharing in-progress work can be technically and socially
challenging in environments designed for sharing completed pieces. We propose
an online creative community where sharing process, rather than showcasing
outcomes, is the main method of sharing creative work. Based on this, we
present Mosaic---an online community where illustrators share work-in-progress
snapshots showing how an artwork was completed from start to finish. In an
online deployment and observational study, artists used Mosaic as a vehicle for
reflecting on how they can improve their own creative process, developed a
social norm of detailed feedback, and became less apprehensive of sharing early
versions of artwork. Through Mosaic, we argue that communities oriented around
sharing creative process can create a collaborative environment that is
beneficial for creative growth.
Searching the Visual Style and Structure of D3 Visualizations
We present a search engine for D3 visualizations that allows queries based on
their visual style and underlying structure. To build the engine we crawl a
collection of 7860 D3 visualizations from the Web and deconstruct each one to
recover its data, its data-encoding marks and the encodings describing how the
data is mapped to visual attributes of the marks. We also extract axes and
other non-data-encoding attributes of marks (e.g., typeface, background color).
Our search engine indexes this style and structure information as well as
metadata about the webpage containing the chart. We show how visualization
developers can search the collection to find visualizations that exhibit
specific design characteristics and thereby explore the space of possible
designs. We also demonstrate how researchers can use the search engine to
identify commonly used visual design patterns and we perform such a demographic
design analysis across our collection of D3 charts. A user study reveals that
visualization developers found our style and structure based search engine to
be significantly more useful and satisfying for finding different designs of D3
charts than a baseline search engine that only allows keyword search over the
webpage containing a chart.
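The style-and-structure index described above can be caricatured in a few lines of Python. This is an illustrative sketch only: the real system deconstructs live D3 pages to recover data, marks, and encodings, whereas the feature names and chart records below are hypothetical.

```python
# Hypothetical toy index of recovered chart features. In the paper, these
# features come from deconstructing crawled D3 visualizations; here they
# are hand-written stand-ins.
charts = [
    {"id": "c1", "mark": "rect",   "background": "#ffffff", "typeface": "Helvetica"},
    {"id": "c2", "mark": "circle", "background": "#000000", "typeface": "Georgia"},
    {"id": "c3", "mark": "rect",   "background": "#000000", "typeface": "Helvetica"},
]

def search(index, **query):
    """Return ids of charts whose indexed features match every query key."""
    return [c["id"] for c in index
            if all(c.get(k) == v for k, v in query.items())]

# e.g. "bar-chart-like marks on a dark background"
matches = search(charts, mark="rect", background="#000000")
```

Querying on structural features (mark type) and stylistic features (background color, typeface) rather than page keywords is what lets such an engine surface charts with a specific visual design.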
Adding Conditional Control to Text-to-Image Diffusion Models
We present ControlNet, a neural network architecture to add spatial
conditioning controls to large, pretrained text-to-image diffusion models.
ControlNet locks the production-ready large diffusion models, and reuses their
deep and robust encoding layers pretrained with billions of images as a strong
backbone to learn a diverse set of conditional controls. The neural
architecture is connected with "zero convolutions" (zero-initialized
convolution layers) that progressively grow the parameters from zero and ensure
that no harmful noise can affect the finetuning. We test various conditioning
controls (e.g., edges, depth, segmentation, human pose) with Stable
Diffusion, using single or multiple conditions, with or without prompts. We
show that the training of ControlNets is robust with small (<50k) and large
(>1M) datasets. Extensive results show that ControlNet may facilitate wider
applications to control image diffusion models.
Comment: Code and supplementary material: https://github.com/lllyasviel/ControlNe
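The zero-convolution idea can be illustrated with a minimal sketch in plain Python. This is a hypothetical toy version operating on single feature vectors; the actual ControlNet uses zero-initialized convolution layers inside a deep network. The point it demonstrates: a branch whose weights start at exactly zero contributes nothing at initialization, so the frozen backbone's output is untouched before finetuning begins.

```python
# Toy "zero convolution": a 1x1 convolution whose weights and bias
# start at exactly zero, so the control branch is initially a no-op.

def zero_conv_init(n_in, n_out):
    """Weights and bias initialized to exactly zero."""
    return [[0.0] * n_in for _ in range(n_out)], [0.0] * n_out

def conv1x1(x, weights, bias):
    """Apply a 1x1 convolution to a single feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def controlled_block(backbone_out, control_feat, weights, bias):
    """Frozen backbone output plus the zero-conv'd control branch."""
    delta = conv1x1(control_feat, weights, bias)
    return [y + d for y, d in zip(backbone_out, delta)]

w, b = zero_conv_init(n_in=3, n_out=3)
# At initialization the control branch adds exactly zero, regardless of
# the conditioning features, so pretrained behavior is preserved.
y = controlled_block([0.5, -1.0, 2.0], [9.0, 9.0, 9.0], w, b)
```

As training updates the weights away from zero, the control signal grows in gradually, which is why the paper describes the parameters as "progressively" growing from zero.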
Efficient Shadows from Sampled Environment Maps
This paper addresses the problem of efficiently calculating shadows from environment maps. Since accurate rendering of shadows from environment maps requires hundreds of lights, the expensive computation is determining visibility from each pixel to each light direction, such as by ray-tracing. We show that coherence in both spatial and angular domains can be used to reduce the number of shadow rays that need to be traced. Specifically, we use a coarse-to-fine evaluation of the image, predicting visibility by reusing visibility calculations from four nearby pixels that have already been evaluated. This simple method allows us to explicitly mark regions of uncertainty in the prediction. By only tracing rays in these and neighboring directions, we are able to reduce the number of shadow rays traced by up to a factor of 20 while maintaining error rates below 0.01%. For many scenes, our algorithm can add shadowing from hundreds of lights at twice the cost of rendering without shadows.
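The coherence idea above can be sketched as follows. This is a simplified illustration, not the authors' renderer: `trace_ray` stands in for an actual ray tracer, and each "neighbor" is just a per-light visibility bitmask from an already-evaluated pixel.

```python
# Sketch: predict a pixel's visibility to each light from four nearby
# evaluated pixels, and trace a shadow ray only for lights where the
# neighbors disagree (the "uncertain" directions).

def predict_visibility(neighbors):
    """neighbors: four visibility lists (one bool per light).
    Returns (prediction, uncertain) per light."""
    n_lights = len(neighbors[0])
    prediction, uncertain = [], []
    for i in range(n_lights):
        votes = [nb[i] for nb in neighbors]
        agree = all(votes) or not any(votes)
        prediction.append(votes.count(True) >= 2)  # majority vote
        uncertain.append(not agree)
    return prediction, uncertain

def shade_pixel(neighbors, trace_ray):
    """Reuse neighbor visibility; call trace_ray(light) only when uncertain."""
    prediction, uncertain = predict_visibility(neighbors)
    return [trace_ray(i) if u else p
            for i, (p, u) in enumerate(zip(prediction, uncertain))]
```

When all four neighbors agree on a light direction, the prediction is trusted and no ray is traced; shadow rays are spent only on the uncertain directions, which is the source of the reported ray-count savings.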
Tree-Structured Shading Decomposition
We study inferring a tree-structured representation from a single image for
object shading. Prior work typically uses the parametric or measured
representation to model shading, which is neither interpretable nor easily
editable. We propose using the shade tree representation, which combines basic
shading nodes and compositing methods to factorize object surface shading. The
shade tree representation enables novice users who are unfamiliar with the
physical shading process to edit object shading in an efficient and intuitive
manner. A main challenge in inferring the shade tree is that the inference
problem involves both the discrete tree structure and the continuous parameters
of the tree nodes. We propose a hybrid approach to address this issue. We
introduce an auto-regressive inference model to generate a rough estimation of
the tree structure and node parameters, and then we fine-tune the inferred
shade tree through an optimization algorithm. We show experiments on synthetic
images, captured reflectance, real images, and non-realistic vector drawings,
allowing downstream applications such as material editing, vectorized shading,
and relighting. Project website: https://chen-geng.com/inv-shade-trees
Comment: Accepted at ICCV 2023.
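A toy version of the shade tree representation helps make the abstract concrete. This is a hedged illustration: the node kinds and compositing operators below are hypothetical simplifications of the paper's shading nodes, reduced to scalars for clarity.

```python
# Toy shade tree: leaves are basic shading components, interior nodes
# composite their children. Editing one node changes the final shading
# in an interpretable way.

def evaluate(node):
    """Recursively evaluate a shade tree to a scalar shading value."""
    kind = node["kind"]
    if kind == "const":                 # leaf, e.g. an albedo or highlight value
        return node["value"]
    children = [evaluate(c) for c in node["children"]]
    if kind == "multiply":              # modulate, e.g. albedo * diffuse
        out = 1.0
        for c in children:
            out *= c
        return out
    if kind == "add":                   # composite, e.g. adding a specular term
        return sum(children)
    raise ValueError(f"unknown node kind: {kind}")

# albedo * diffuse + highlight, editable node-by-node
tree = {"kind": "add", "children": [
    {"kind": "multiply", "children": [
        {"kind": "const", "value": 0.8},   # albedo
        {"kind": "const", "value": 0.5},   # diffuse shading
    ]},
    {"kind": "const", "value": 0.1},       # specular highlight
]}
```

The inference problem the paper tackles is the inverse of `evaluate`: given only the final shaded image, recover both the discrete tree structure and the continuous leaf values.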
Bridging the Gulf of Envisioning: Cognitive Design Challenges in LLM Interfaces
Large language models (LLMs) exhibit dynamic capabilities and appear to
comprehend complex and ambiguous natural language prompts. However, calibrating
LLM interactions is challenging for interface designers and end-users alike. A
central issue is our limited grasp of how human cognitive processes begin with
a goal and form intentions for executing actions, a blindspot even in
established interaction models such as Norman's gulfs of execution and
evaluation. To address this gap, we theorize how end-users 'envision'
translating their goals into clear intentions and craft prompts to obtain the
desired LLM response. We define a process of Envisioning by highlighting three
misalignments: (1) whether LLMs can accomplish the task, (2) how to
instruct the LLM to do the task, and (3) how to evaluate the success of the
LLM's output in meeting the goal. Finally, we make recommendations to narrow
the envisioning gulf in human-LLM interactions.
De-emphasis of distracting image regions using texture power maps
We present a post-processing technique that selectively reduces the salience of distracting regions in an image. Computational models of attention predict that texture variation influences bottom-up attention mechanisms. Our method reduces the spatial variation of texture using power maps, high-order features describing local frequency content in an image. Modification of power maps results in effective regional de-emphasis. We validate our results quantitatively via a human subject search experiment and qualitatively with eye tracking data.
Singapore-MIT Alliance (SMA)
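The de-emphasis idea can be caricatured in one dimension. This is a hypothetical sketch under strong simplifications: the paper modifies 2D power maps of local frequency content, whereas the code below treats local deviation from a windowed mean as a stand-in for texture "power" and attenuates it inside a marked region.

```python
# Toy 1D de-emphasis: separate a signal into a smooth local mean and a
# local detail ("texture") component, then scale the detail down only
# inside the distractor region.

def local_mean(signal, i, radius=1):
    """Mean of the samples within `radius` of index i."""
    lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
    window = signal[lo:hi]
    return sum(window) / len(window)

def de_emphasize(signal, region, strength=0.5):
    """Attenuate the high-frequency component inside `region` (a set of indices)."""
    out = []
    for i, x in enumerate(signal):
        m = local_mean(signal, i)
        texture = x - m                      # local detail component
        keep = strength if i in region else 1.0
        out.append(m + keep * texture)
    return out
```

Outside the marked region the signal is reconstructed exactly (mean plus full detail); inside it, the detail is damped, which lowers texture contrast and, per the attention models cited in the abstract, the region's bottom-up salience.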