24 research outputs found

    Phenomenological modeling of image irradiance for non-Lambertian surfaces under natural illumination.

    Get PDF
    Various vision tasks are usually confronted by appearance variations due to changes of illumination. For instance, in a recognition system, it has been shown that the variability in human face appearance is owed to changes to lighting conditions rather than person\u27s identity. Theoretically, due to the arbitrariness of the lighting function, the space of all possible images of a fixed-pose object under all possible illumination conditions is infinite dimensional. Nonetheless, it has been proven that the set of images of a convex Lambertian surface under distant illumination lies near a low dimensional linear subspace. This result was also extended to include non-Lambertian objects with non-convex geometry. As such, vision applications, concerned with the recovery of illumination, reflectance or surface geometry from images, would benefit from a low-dimensional generative model which captures appearance variations w.r.t. illumination conditions and surface reflectance properties. This enables the formulation of such inverse problems as parameter estimation. Typically, subspace construction boils to performing a dimensionality reduction scheme, e.g. Principal Component Analysis (PCA), on a large set of (real/synthesized) images of object(s) of interest with fixed pose but different illumination conditions. However, this approach has two major problems. First, the acquired/rendered image ensemble should be statistically significant vis-a-vis capturing the full behavior of the sources of variations that is of interest, in particular illumination and reflectance. Second, the curse of dimensionality hinders numerical methods such as Singular Value Decomposition (SVD) which becomes intractable especially with large number of large-sized realizations in the image ensemble. One way to bypass the need of large image ensemble is to construct appearance subspaces using phenomenological models which capture appearance variations through mathematical abstraction of the reflection process. In particular, the harmonic expansion of the image irradiance equation can be used to derive an analytic subspace to represent images under fixed pose but different illumination conditions where the image irradiance equation has been formulated in a convolution framework. Due to their low-frequency nature, irradiance signals can be represented using low-order basis functions, where Spherical Harmonics (SH) has been extensively adopted. Typically, an ideal solution to the image irradiance (appearance) modeling problem should be able to incorporate complex illumination, cast shadows as well as realistic surface reflectance properties, while moving away from the simplifying assumptions of Lambertian reflectance and single-source distant illumination. By handling arbitrary complex illumination and non-Lambertian reflectance, the appearance model proposed in this dissertation moves the state of the art closer to the ideal solution. This work primarily addresses the geometrical compliance of the hemispherical basis for representing surface reflectance while presenting a compact, yet accurate representation for arbitrary materials. To maintain the plausibility of the resulting appearance, the proposed basis is constructed in a manner that satisfies the Helmholtz reciprocity property while avoiding high computational complexity. It is believed that having the illumination and surface reflectance represented in the spherical and hemispherical domains respectively, while complying with the physical properties of the surface reflectance would provide better approximation accuracy of image irradiance when compared to the representation in the spherical domain. Discounting subsurface scattering and surface emittance, this work proposes a surface reflectance basis, based on hemispherical harmonics (HSH), defined on the Cartesian product of the incoming and outgoing local hemispheres (i.e. w.r.t. surface points). This basis obeys physical properties of surface reflectance involving reciprocity and energy conservation. The basis functions are validated using analytical reflectance models as well as scattered reflectance measurements which might violate the Helmholtz reciprocity property (this can be filtered out through the process of projecting them on the subspace spanned by the proposed basis, where the reciprocity property is preserved in the least-squares sense). The image formation process of isotropic surfaces under arbitrary distant illumination is also formulated in the frequency space where the orthogonality relation between illumination and reflectance bases is encoded in what is termed as irradiance harmonics. Such harmonics decouple the effect of illumination and reflectance from the underlying pose and geometry. Further, a bilinear approach to analytically construct irradiance subspace is proposed in order to tackle the inherent problem of small-sample-size and curse of dimensionality. The process of finding the analytic subspace is posed as establishing a relation between its principal components and that of the irradiance harmonics basis functions. It is also shown how to incorporate prior information about natural illumination and real-world surface reflectance characteristics in order to capture the full behavior of complex illumination and non-Lambertian reflectance. The use of the presented theoretical framework to develop practical algorithms for shape recovery is further presented where the hitherto assumed Lambertian assumption is relaxed. With a single image of unknown general illumination, the underlying geometrical structure can be recovered while accounting explicitly for object reflectance characteristics (e.g. human skin types for facial images and teeth reflectance for human jaw reconstruction) as well as complex illumination conditions. Experiments on synthetic and real images illustrate the robustness of the proposed appearance model vis-a-vis illumination variation. Keywords: computer vision, computer graphics, shading, illumination modeling, reflectance representation, image irradiance, frequency space representations, {hemi)spherical harmonics, analytic bilinear PCA, model-based bilinear PCA, 3D shape reconstruction, statistical shape from shading

    Robust Face Recognition based on Color and Depth Information

    Get PDF
    One of the most important advantages of automatic human face recognition is its nonintrusiveness property. Face images can sometime be acquired without user's knowledge or explicit cooperation. However, face images acquired in an uncontrolled environment can appear with varying imaging conditions. Traditionally, researchers focus on tackling this problem using 2D gray-scale images due to the wide availability of 2D cameras and the low processing and storage cost of gray-scale data. Nevertheless, face recognition can not be performed reliably with 2D gray-scale data due to insu_cient information and its high sensitivity to pose, expression and illumination variations. Recent rapid development in hardware makes acquisition and processing of color and 3D data feasible. This thesis aims to improve face recognition accuracy and robustness using color and 3D information.In terms of color information usage, this thesis proposes several improvements over existing approaches. Firstly, the Block-wise Discriminant Color Space is proposed, which learns the discriminative color space based on local patches of a human face image instead of the holistic image, as human faces display different colors in different parts. Secondly, observing that most of the existing color spaces consist of at most three color components, while complementary information can be found in multiple color components across multiple color spaces and therefore the Multiple Color Fusion model is proposed to search and utilize multiple color components effectively. Lastly, two robust color face recognition algorithms are proposed. The Color Sparse Coding method can deal with face images with noise and occlusion. The Multi-linear Color Tensor Discriminant method harnesses multi-linear technique to handle non-linear data. Experiments show that all the proposed methods outperform their existing competitors.In terms of 3D information utilization, this thesis investigates the feasibility of face recognition using Kinect. Unlike traditional 3D scanners which are too slow in speed and too expensive in cost for broad face recognition applications, Kinect trades data quality for high speed and low cost. An algorithm is proposed to show that Kinect data can be used for face recognition despite its noisy nature. In order to fully utilize Kinect data, a more sophisticated RGB-D face recognition algorithm is developed which harnesses theColor Sparse Coding framework and 3D information to perform accurate face recognition robustly even under simultaneous varying conditions of poses, illuminations, expressionsand disguises

    State of the Art in Face Recognition

    Get PDF
    Notwithstanding the tremendous effort to solve the face recognition problem, it is not possible yet to design a face recognition system with a potential close to human performance. New computer vision and pattern recognition approaches need to be investigated. Even new knowledge and perspectives from different fields like, psychology and neuroscience must be incorporated into the current field of face recognition to design a robust face recognition system. Indeed, many more efforts are required to end up with a human like face recognition system. This book tries to make an effort to reduce the gap between the previous face recognition research state and the future state

    Data-driven approaches for interactive appearance editing

    Get PDF
    This thesis proposes several techniques for interactive editing of digital content and fast rendering of virtual 3D scenes. Editing of digital content - such as images or 3D scenes - is difficult, requires artistic talent and technical expertise. To alleviate these difficulties, we exploit data-driven approaches that use the easily accessible Internet data (e. g., images, videos, materials) to develop new tools for digital content manipulation. Our proposed techniques allow casual users to achieve high-quality editing by interactively exploring the manipulations without the need to understand the underlying physical models of appearance. First, the thesis presents a fast algorithm for realistic image synthesis of virtual 3D scenes. This serves as the core framework for a new method that allows artists to fine tune the appearance of a rendered 3D scene. Here, artists directly paint the final appearance and the system automatically solves for the material parameters that best match the desired look. Along this line, an example-based material assignment approach is proposed, where the 3D models of a virtual scene can be "materialized" simply by giving a guidance source (image/video). Next, the thesis proposes shape and color subspaces of an object that are learned from a collection of exemplar images. These subspaces can be used to constrain image manipulations to valid shapes and colors, or provide suggestions for manipulations. Finally, data-driven color manifolds which contain colors of a specific context are proposed. Such color manifolds can be used to improve color picking performance, color stylization, compression or white balancing.Diese Dissertation stellt Techniken zum interaktiven Editieren von digitalen Inhalten und zum schnellen Rendering von virtuellen 3D Szenen vor. Digitales Editieren - seien es Bilder oder dreidimensionale Szenen - ist kompliziert, benötigt künstlerisches Talent und technische Expertise. Um diese Schwierigkeiten zu relativieren, nutzen wir datengesteuerte Ansätze, die einfach zugängliche Internetdaten, wie Bilder, Videos und Materialeigenschaften, nutzen um neue Werkzeuge zur Manipulation von digitalen Inhalten zu entwickeln. Die von uns vorgestellten Techniken erlauben Gelegenheitsnutzern das Editieren in hoher Qualität, indem Manipulationsmöglichkeiten interaktiv exploriert werden können ohne die zugrundeliegenden physikalischen Modelle der Bildentstehung verstehen zu müssen. Zunächst stellen wir einen effizienten Algorithmus zur realistischen Bildsynthese von virtuellen 3D Szenen vor. Dieser dient als Kerngerüst einer Methode, die Nutzern die Feinabstimmung des finalen Aussehens einer gerenderten dreidimensionalen Szene erlaubt. Hierbei malt der Künstler direkt das beabsichtigte Aussehen und das System errechnet automatisch die zugrundeliegenden Materialeigenschaften, die den beabsichtigten Eigenschaften am nahesten kommen. Zu diesem Zweck wird ein auf Beispielen basierender Materialzuordnungsansatz vorgestellt, für den das 3D Model einer virtuellen Szene durch das simple Anführen einer Leitquelle (Bild, Video) in Materialien aufgeteilt werden kann. Als Nächstes schlagen wir Form- und Farbunterräume von Objektklassen vor, die aus einer Sammlung von Beispielbildern gelernt werden. Diese Unterräume können genutzt werden um Bildmanipulationen auf valide Formen und Farben einzuschränken oder Manipulationsvorschläge zu liefern. Schließlich werden datenbasierte Farbmannigfaltigkeiten vorgestellt, die Farben eines spezifischen Kontexts enthalten. Diese Mannigfaltigkeiten ermöglichen eine Leistungssteigerung bei Farbauswahl, Farbstilisierung, Komprimierung und Weißabgleich

    Specialized Signals for Spatial Attention in the Ventral and Dorsal Visual Streams

    Get PDF
    Neuroscientists have traditionally conceived the visual system as having a ventral stream of vision for perception and a dorsal one associated with vision for action. However functional differences between them have become relatively blurred in recent years, not the least by the systematic parallel mapping of functions allowed by functional magnetic resonance imaging (fMRI). Here, using fMRI to simultaneously monitor several brain regions, we first studied a hallmark ventral stream computation: the processing of faces. We did so by probing responses to motion, an attribute whose processing is typically associated with the dorsal stream. In humans, it is known that face-selective regions in the superior temporal sulcus (STS) show enhanced responses to facial motion that are absent in the rest of the face-processing system. In macaques, face areas also exist, but their functional specializations for facial motion are unknown. We showed static and moving face and non-face objects to macaques and humans in an fMRI experiment in order to isolate potential functional specializations in the ventral stream face-processing system and to motivate putative homologies across species. Our results revealed all macaque face areas showed enhanced responses to moving faces. There was a difference between more dorsal face areas in the fundus of the STS, which are embedded in motion responsive cortex and ventral ones, where enhanced responses to motion interacted with object category and could not be explained by their proximity to motion responsive cortex. In humans watching the same stimuli, only the STS face area showed an enhancement for motion. These results suggest specializations for motion exist in the macaque face-processing network but they do not lend themselves to a direct equalization between human and macaque face areas. We then proceeded to compare ventral and dorsal stream functions in terms of their code for spatial attention, whose control was typically associated with the dorsal stream and prefrontal areas. We took advantage of recent fMRI studies that provide a systematic map of cortical areas modulated by spatial attention and suggest PITd, a ventral stream area in the temporal lobe, can support endogenous attention control. Covert attention and stimulus selection by saccades are represented in the same maps of visual space in attention control areas. Difficulties interpreting this multiplicity of functions led to the proposal that they encode priority maps, where multiple sources are summed to form a single priority signal, agnostic as to its eventual use by downstream areas. Using a paradigm that dissociates covert attention and response selection, we test this hypothesis with fMRI-guided electrophysiology in two cortical areas: parietal area LIP, where the priority map was first proposed to apply, and temporal area PITd. Our results indicate LIP sums disparate signals, but as a consequence independent channels of spatial information exist for attention and response planning. PITd represents relevant locations and, rather than summing signals, contains a single map for covert attention. Our findings have the potential to resolve a longstanding controversy about the nature of spatial signals in LIP and establish PITd as a robust map for covert attention in the ventral stream. Together, our results suggest that while the distribution of labor between ventral stream and dorsal stream areas is less linear than what a what a rough depiction of them can suggest, it is illuminated by their proposed function as supporting vision for perception and vision for action respectively

    Neuroinformatics in Functional Neuroimaging

    Get PDF
    This Ph.D. thesis proposes methods for information retrieval in functional neuroimaging through automatic computerized authority identification, and searching and cleaning in a neuroscience database. Authorities are found through cocitation analysis of the citation pattern among scientific articles. Based on data from a single scientific journal it is shown that multivariate analyses are able to determine group structure that is interpretable as particular “known ” subgroups in functional neuroimaging. Methods for text analysis are suggested that use a combination of content and links, in the form of the terms in scientific documents and scientific citations, respectively. These included context sensitive author ranking and automatic labeling of axes and groups in connection with multivariate analyses of link data. Talairach foci from the BrainMap ™ database are modeled with conditional probability density models useful for exploratory functional volumes modeling. A further application is shown with conditional outlier detection where abnormal entries in the BrainMap ™ database are spotted using kernel density modeling and the redundancy between anatomical labels and spatial Talairach coordinates. This represents a combination of simple term and spatial modeling. The specific outliers that were found in the BrainMap ™ database constituted among others: Entry errors, errors in the article and unusual terminology

    Compression, Modeling, and Real-Time Rendering of Realistic Materials and Objects

    Get PDF
    The realism of a scene basically depends on the quality of the geometry, the illumination and the materials that are used. Whereas many sources for the creation of three-dimensional geometry exist and numerous algorithms for the approximation of global illumination were presented, the acquisition and rendering of realistic materials remains a challenging problem. Realistic materials are very important in computer graphics, because they describe the reflectance properties of surfaces, which are based on the interaction of light and matter. In the real world, an enormous diversity of materials can be found, comprising very different properties. One important objective in computer graphics is to understand these processes, to formalize them and to finally simulate them. For this purpose various analytical models do already exist, but their parameterization remains difficult as the number of parameters is usually very high. Also, they fail for very complex materials that occur in the real world. Measured materials, on the other hand, are prone to long acquisition time and to huge input data size. Although very efficient statistical compression algorithms were presented, most of them do not allow for editability, such as altering the diffuse color or mesostructure. In this thesis, a material representation is introduced that makes it possible to edit these features. This makes it possible to re-use the acquisition results in order to easily and quickly create deviations of the original material. These deviations may be subtle, but also substantial, allowing for a wide spectrum of material appearances. The approach presented in this thesis is not based on compression, but on a decomposition of the surface into several materials with different reflection properties. Based on a microfacette model, the light-matter interaction is represented by a function that can be stored in an ordinary two-dimensional texture. Additionally, depth information, local rotations, and the diffuse color are stored in these textures. As a result of the decomposition, some of the original information is inevitably lost, therefore an algorithm for the efficient simulation of subsurface scattering is presented as well. Another contribution of this work is a novel perception-based simplification metric that includes the material of an object. This metric comprises features of the human visual system, for example trichromatic color perception or reduced resolution. The proposed metric allows for a more aggressive simplification in regions where geometric metrics do not simplif
    corecore