13 research outputs found

    A Dataset of Multi-Illumination Images in the Wild

    Full text link
    Collections of images under a single, uncontrolled illumination have enabled the rapid advancement of core computer vision tasks like classification, detection, and segmentation. But even with modern learning techniques, many inverse problems involving lighting and material understanding remain too severely ill-posed to be solved with single-illumination datasets. To fill this gap, we introduce a new multi-illumination dataset of more than 1000 real scenes, each captured under 25 lighting conditions. We demonstrate the richness of this dataset by training state-of-the-art models for three challenging applications: single-image illumination estimation, image relighting, and mixed-illuminant white balance. Comment: ICCV 2019
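
    Because light transport is linear in the illumination, the relighting application above can be grounded in a weighted blend of a scene's 25 single-light captures. Below is a minimal sketch of that idea, assuming the captures are already loaded as a numpy array; the names `captures` and `weights` are illustrative, not part of the dataset's tooling.

    ```python
    import numpy as np

    # Light transport is linear: an image under any non-negative mix of
    # the 25 capture lights is the same mix of the 25 captured images.
    def relight(captures: np.ndarray, weights: np.ndarray) -> np.ndarray:
        """captures: (25, H, W, 3) linear-RGB images; weights: (25,)."""
        assert captures.shape[0] == weights.shape[0]
        return np.tensordot(weights, captures, axes=1)  # -> (H, W, 3)

    # Example: an even mix of two of the flash directions (stand-in data).
    captures = np.random.rand(25, 4, 6, 3)
    weights = np.zeros(25)
    weights[[3, 17]] = 0.5
    relit = relight(captures, weights)
    ```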

    BiDi screen: a thin, depth-sensing LCD for 3D interaction using light fields

    Get PDF
    We transform an LCD into a display that supports both 2D multi-touch and unencumbered 3D gestures. Our BiDirectional (BiDi) screen, capable of both image capture and display, is inspired by emerging LCDs that use embedded optical sensors to detect multiple points of contact. Our key contribution is to exploit the spatial light modulation capability of LCDs to allow lensless imaging without interfering with display functionality. We switch between a display mode showing traditional graphics and a capture mode in which the backlight is disabled and the LCD displays a pinhole array or an equivalent tiled-broadband code. A large-format image sensor is placed slightly behind the liquid crystal layer. Together, the image sensor and LCD form a mask-based light field camera, capturing an array of images equivalent to that produced by a camera array spanning the display surface. The recovered multi-view orthographic imagery is used to passively estimate the depth of scene points. Two motivating applications are described: a hybrid touch plus gesture interaction and a light-gun mode for interacting with external light-emitting widgets. We show a working prototype that simulates the image sensor with a camera and diffuser, allowing interaction up to 50 cm in front of a modified 20.1 inch LCD. Funding: National Science Foundation (U.S.) (Grant CCF-0729126); Alfred P. Sloan Foundation
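
    The depth estimation from the recovered imagery can be illustrated with a standard depth-from-focus computation. The sketch below is an assumption-laden illustration of that idea (the view layout, shift model, and focus measure are ours), not the authors' implementation: it synthetically refocuses the orthographic views by shift-and-average at candidate depths and keeps, per pixel, the depth giving the sharpest result.

    ```python
    import numpy as np

    def refocus(views, depth):
        """views: (ny, nx, H, W) orthographic views; shift grows with depth."""
        ny, nx, H, W = views.shape
        acc = np.zeros((H, W))
        for iy in range(ny):
            for ix in range(nx):
                dy = int(round((iy - ny // 2) * depth))
                dx = int(round((ix - nx // 2) * depth))
                acc += np.roll(views[iy, ix], (dy, dx), axis=(0, 1))
        return acc / (ny * nx)

    def depth_map(views, depths):
        """Per pixel, pick the depth with the highest gradient-energy focus."""
        scores = []
        for d in depths:
            img = refocus(views, d)
            gy, gx = np.gradient(img)
            scores.append(gy ** 2 + gx ** 2)
        return np.asarray(depths)[np.argmax(scores, axis=0)]

    # Usage with stand-in data:
    views = np.random.rand(3, 3, 32, 32)
    dm = depth_map(views, depths=[0.0, 1.0, 2.0])
    ```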

    Beyond the Appraisal Framework: Evaluation of Can and May in Introductions and Conclusions to Computing Research Articles

    Get PDF
    This paper attempts to analyse the presence of the modal auxiliaries can and may as markers of authorial evaluation in a corpus of introductions and conclusions to computing research articles. Bearing in mind the semantic similarity of these two modals, we start from Martin and White's Appraisal framework, whose focus is on the interpersonal in language, the subjective presence of authors in their texts, and the stances they take both towards those texts and their readers. In particular, we extend Martin and White's notions of epistemic modality and evidentiality, which they interpret from a co-textual and contextual point of view, and use Alonso-Almeida's views on epistemicity as a pragmatic effect of evidential strategies. An important conclusion points to a functional variation of epistemic and evidential readings in these two sections of research articles, with a predominant occurrence of epistemic attributions in introductions and evidential interpretations in conclusions. This result is in consonance with the selected genre and its authors' aims.

    Handheld reflectance acquisition of paintings

    Get PDF
    Relightable photographs are an alternative to traditional photographs as they provide a richer viewing experience. However, the complex acquisition systems of existing techniques have restricted their usage to specialized setups. We introduce an easy-to-use and affordable solution for using smartphones to acquire the reflectance of paintings and similar almost-planar objects like tablets, engravings, and textiles. Our goal is to enable interactive relighting of such artifacts by everyone. In our approach, we non-uniformly sample the reflectance functions by moving the LED light of a smartphone while simultaneously tracking the position of the smartphone with its camera. We then propose a compressive-sensing-based approach for reconstructing the light transport matrix from the non-uniformly sampled data. As shown in our experiments, we accurately reconstruct the light transport matrix, which can then be used to create relightable photographs.
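
    The reconstruction step can be pictured with a generic compressive-sensing solver. Here is a minimal sketch, assuming each pixel's reflectance function over light positions is sparse in a DCT basis and recovering it with iterative soft thresholding (ISTA); the basis choice, sampling pattern, and parameters are illustrative, not the paper's exact pipeline.

    ```python
    import numpy as np

    def dct_basis(n):
        """Orthonormalized DCT-II basis as columns of an n x n matrix."""
        k, x = np.meshgrid(np.arange(n), np.arange(n))
        B = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
        return B / np.linalg.norm(B, axis=0)

    def ista(A, y, lam=0.05, iters=300):
        """Minimize 0.5*||A c - y||^2 + lam*||c||_1 by soft thresholding."""
        step = 1.0 / np.linalg.norm(A, 2) ** 2
        c = np.zeros(A.shape[1])
        for _ in range(iters):
            c = c - step * A.T @ (A @ c - y)
            c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)
        return c

    n, m = 64, 20                              # light positions, samples taken
    B = dct_basis(n)
    sampled = np.sort(np.random.choice(n, m, replace=False))
    true_refl = B[:, :4] @ np.random.rand(4)   # smooth, sparse ground truth
    coeffs = ista(B[sampled], true_refl[sampled])
    recovered = B @ coeffs                     # full reflectance function
    ```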

    Perception of Lighting and Reflectance in Real and Synthetic Stimuli

    Get PDF
    The human visual system estimates the proportion of light reflected off of a surface despite variable lighting in a scene, a phenomenon known as lightness constancy. Classically, lightness constancy has been explained as a 'discounting' of the lighting intensity (Helmholtz, 1866), and this continues to be a common view today (e.g., Brainard & Maloney, 2011). However, Logvinenko and Maloney (2006) have made the radically different claim that the human visual system does not have any perceptual access to an estimate of lightness. The experiments described in Chapter 2 use a novel experimental paradigm to test this theory. We provide evidence against Logvinenko and Maloney's theory of lightness perception while adding to existing evidence that the visual system has good lightness constancy. In Chapter 3, we manipulate screen colour and texture cues to test the realism of computer-generated stimuli. We find that by matching the chromaticity of an LCD screen to the surrounding lighting and using a realistic texture, LCD screens can be made to appear similar to physical paper. Finally, Chapter 4 extends the ideas of Chapter 3: the knowledge about how to adjust colour and texture cues on an LCD monitor is applied to a lightness matching task in which the LCD screen is a small part of a larger physical setup. Additionally, levels of lightness constancy are compared across physical and simulated surfaces using the same novel experimental paradigm in Chapters 2 and 4. We find that physical and simulated surfaces elicit different levels of lightness constancy on the same task.
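
    As background for the abstract above, the standard formulation of the problem (assumed here; the abstract does not spell it out) factors the luminance reaching the eye into reflectance and illuminance:

    ```latex
    \[
      I(x) \;=\; R(x)\,E(x), \qquad 0 \le R(x) \le 1,
    \]
    % I: luminance at location x, R: surface reflectance (lightness),
    % E: illuminance. Lightness constancy is the recovery of R from I
    % alone, which is ill-posed without assumptions about the lighting E.
    ```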

    Multiple cue integration for robust tracking in dynamic environments: application to video relighting

    Get PDF
    Motion analysis and object tracking have been one of the principal focuses of attention within the computer vision community over the past two decades. The interest in this research area lies in its wide range of applicability, extending from autonomous vehicle and robot navigation tasks to entertainment and virtual reality applications. Even though impressive results have been obtained in specific problems, object tracking is still an open problem, since available methods are sensitive to several artifacts and non-stationary environment conditions, such as unpredictable target movements, gradual or abrupt changes of illumination, proximity of similar objects, or cluttered backgrounds. Multiple cue integration has been shown to enhance the robustness of tracking algorithms against such disturbances. In recent years, due to the increasing power of computers, there has been significant interest in building complex tracking systems that simultaneously consider multiple cues. However, most of these algorithms are based on heuristics and ad-hoc rules formulated for specific applications, making it impossible to extrapolate them to new environment conditions. In this dissertation we propose a general probabilistic framework to integrate as many object features as necessary, permitting them to mutually interact in order to obtain a precise estimate of the target's state, and thus of its position. This framework is used to design a tracking algorithm, which is validated on several video sequences involving abrupt position and illumination changes, target camouflaging, and non-rigid deformations. Among the features used to represent the target, we highlight a robust parameterization of the target color in an object-dependent colorspace, which distinguishes the object from the background more clearly than the colorspaces commonly used in the literature.
    In the last part of the dissertation, we design an approach for relighting static and moving scenes with unknown geometry. The relighting is performed with an image-based methodology, where rendering under new lighting conditions is achieved by linear combinations of a set of pre-acquired reference images of the scene illuminated by known light patterns. Since the placement and brightness of the light sources composing such light patterns can be controlled, it is natural to ask: what is the optimal way to illuminate the scene so as to reduce the number of reference images that are needed? We show that the best way to light the scene (i.e., the way that minimizes the number of reference images) is not to use a sequence of single, compact light sources, as is most commonly done, but rather a sequence of lighting patterns given by an object-dependent lighting basis. It is important to note that when relighting video sequences, consecutive images need to be aligned with respect to a common coordinate frame. However, since each frame is generated by a different light pattern illuminating the scene, abrupt illumination changes occur between consecutive reference images. Under these circumstances, the tracking framework designed in this dissertation plays a central role. Finally, we present several relighting results on real video sequences of moving objects, moving faces, and scenes containing both. In each case, although a single video clip was captured, we are able to relight again and again, controlling the lighting direction, extent, and color.
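
    The object-dependent lighting basis admits a compact linear-algebra reading: stack the single-light images as columns of a matrix and take a truncated SVD; the top right-singular vectors are lighting patterns whose span best reproduces the scene's appearance from the fewest reference images. A hedged numpy sketch with stand-in shapes and data, not the thesis code:

    ```python
    import numpy as np

    n_lights, n_pix = 40, 32 * 32
    M = np.random.rand(n_pix, n_lights)        # column j: image under light j
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    k = 6                                      # images we can afford to capture
    patterns = Vt[:k]                          # k optimal lighting patterns
    pattern_images = M @ patterns.T            # what capturing them would yield

    # Relight under an arbitrary new lighting vector using only the k images:
    new_light = np.random.rand(n_lights)
    coeffs = patterns @ new_light              # project lighting onto the basis
    approx = pattern_images @ coeffs           # approximate relit image
    exact = M @ new_light                      # reference using all 40 lights
    ```

    Because the truncated SVD is the optimal rank-k approximation in the least-squares sense, a pattern basis of this form beats a sequence of single point lights for any fixed budget of reference images, which is the thesis's claim in matrix terms.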

    BiDi screen: depth and lighting aware interaction and display

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 75-79). In this thesis, I describe a new type of interactive display that supports both on-screen multi-touch interactions and off-screen hover-based gestures. This BiDirectional (BiDi) screen, capable of both image capture and display, is inspired by emerging LCDs that use embedded optical sensors to detect multiple points of direct contact. The key contribution of this thesis is to exploit the spatial light modulation capability of LCDs to allow dynamic mask-based scene capture without interfering with display functionality. A large-format image sensor is placed slightly behind the liquid crystal layer. By alternately switching the liquid crystal between a display mode showing traditional graphics and a capture mode in which the backlight is disabled and a pinhole array or an equivalent tiled-broadband code is displayed, the BiDi Screen can recover multi-view orthographic imagery while functioning as a 2D display. The recovered imagery is used to passively estimate the depth of scene points from focus. I discuss the design and construction of a prototype to demonstrate these capabilities in two motivating applications: a hybrid touch plus gesture interaction and a light-gun mode for interacting with external light-emitting widgets. The working prototype simulates the large-format light sensor with a camera and diffuser, supporting interaction up to 50 cm in front of a modified 20.1 inch LCD. By Matthew W. Hirsch. S.M.

    Advanced methods for relightable scene representations in image space

    Get PDF
    The realistic reproduction of the visual appearance of real-world objects requires accurate computer graphics models that describe the optical interaction of a scene with its surroundings. Data-driven approaches that model the scene globally as a reflectance field function in eight parameters deliver high quality and work for most material combinations, but are costly to acquire and store. Image-space relighting, which constrains the application to creating photos with a virtual, fixed camera under freely chosen illumination, requires only a 4D data structure to provide full fidelity. This thesis contributes to image-space relighting in four ways: (1) We investigate the acquisition of 4D reflectance fields in the context of sampling theory and propose a practical setup for pre-filtering reflectance data during recording, which we apply in an adaptive sampling scheme. (2) We introduce a feature-driven image synthesis algorithm for the interpolation of coarsely sampled reflectance data in software to achieve highly realistic images. (3) We propose an implicit reflectance data representation, which uses a Bayesian approach to relight complex scenes from the example of much simpler reference objects. (4) Finally, we construct novel, passive devices out of optical components that render reflectance field data in real time, shaping the incident illumination into the desired image.