Reflectance Hashing for Material Recognition
We introduce a novel method for using reflectance to identify materials.
Reflectance offers a unique signature of the material but is challenging to
measure and use for recognizing materials due to its high-dimensionality. In
this work, one-shot reflectance is captured using a unique optical camera
measuring reflectance disks, where the pixel coordinates correspond to
surface viewing angles. The reflectance has class-specific structure, and angular
gradients computed in this reflectance space reveal the material class.
These reflectance disks encode discriminative information for efficient and
accurate material recognition. We introduce a framework called reflectance
hashing that models the reflectance disks with dictionary learning and binary
hashing. We demonstrate the effectiveness of reflectance hashing for material
recognition with a number of real-world materials.
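
A minimal sketch of the dictionary-learning-plus-binary-hashing idea, assuming reflectance disks arrive as flattened feature vectors; the dictionary size, the random-hyperplane (LSH-style) hashing, and the Hamming nearest-neighbour classifier are illustrative assumptions, not the authors' exact formulation.

    # Sketch: sparse-code reflectance disks, binarize the codes with random
    # hyperplane hashing, and classify by Hamming-distance nearest neighbour.
    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.default_rng(0)

    # Toy stand-ins for flattened reflectance disks (n_samples x n_pixels).
    X_train = rng.random((200, 1024))
    y_train = rng.integers(0, 5, size=200)
    X_test = rng.random((20, 1024))

    # 1) Learn a dictionary and sparse-code each disk.
    dico = MiniBatchDictionaryLearning(n_components=64, alpha=1.0, random_state=0)
    codes_train = dico.fit_transform(X_train)
    codes_test = dico.transform(X_test)

    # 2) Binary hashing via random hyperplanes (sign bits).
    n_bits = 32
    planes = rng.standard_normal((codes_train.shape[1], n_bits))
    H_train = codes_train @ planes > 0
    H_test = codes_test @ planes > 0

    # 3) Nearest neighbour in Hamming space gives the predicted material class.
    for h in H_test:
        dists = np.count_nonzero(H_train != h, axis=1)
        pred = y_train[np.argmin(dists)]
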
BRDF Estimation of Complex Materials with Nested Learning
The estimation of the optical properties of a material from RGB-images is an
important but extremely ill-posed problem in Computer Graphics. While recent
works have successfully approached this problem even from just a single
photograph, significant simplifications of the material model are assumed,
limiting the usability of such methods. The detection of complex material
properties, such as anisotropy or the Fresnel effect, remains an unsolved challenge.
We propose a novel method that predicts the model parameters of an
artist-friendly, physically-based BRDF, from only two low-resolution shots of
the material. Thanks to a novel combination of deep neural networks in a nested
architecture, we are able to handle the ambiguities given by the
non-orthogonality and non-convexity of the parameter space. To train the
network, we generate a novel dataset of physically-based synthetic images. We
prove that our model can recover new properties like anisotropy, index of
refraction and a second reflectance color, for materials that have tinted
specular reflections or whose albedo changes at glancing angles. Comment: Accepted to IEEE Winter Conference on Applications of Computer Vision
2019 (WACV 2019).
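
A minimal sketch of what a nested predictor could look like, assuming the two low-resolution shots are stacked along the channel axis and the BRDF parameters are split into a coarse group and a second group conditioned on it; the layer sizes and parameter groupings are assumptions, not the published architecture.

    # Sketch: an outer CNN estimates a first block of BRDF parameters; an inner
    # CNN, conditioned on that estimate, resolves the remaining, more ambiguous ones.
    import torch
    import torch.nn as nn

    class ParamHead(nn.Module):
        def __init__(self, in_ch, out_dim):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.fc = nn.Linear(64, out_dim)

        def forward(self, x):
            return self.fc(self.features(x))

    class NestedBRDFNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.outer = ParamHead(in_ch=6, out_dim=4)      # e.g. diffuse RGB + roughness
            self.inner = ParamHead(in_ch=6 + 4, out_dim=5)  # e.g. anisotropy, IoR, specular tint

        def forward(self, two_shots):                       # (B, 6, H, W): two RGB shots stacked
            coarse = self.outer(two_shots)                  # first, less ambiguous parameter block
            b, _, h, w = two_shots.shape
            cond = coarse[:, :, None, None].expand(-1, -1, h, w)
            fine = self.inner(torch.cat([two_shots, cond], dim=1))  # conditioned on coarse estimate
            return torch.cat([coarse, fine], dim=1)                  # (B, 9) parameter vector

    model = NestedBRDFNet()
    params = model(torch.rand(2, 6, 128, 128))              # -> torch.Size([2, 9])
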
Texture image analysis and texture classification methods - A review
Tactile texture refers to the tangible feel of a surface, while visual texture
refers to the perceived appearance of shapes and contents in an image. In image
processing, texture can be defined as a function of the spatial variation of
pixel brightness intensity. Texture is one of the main terms used to characterize
objects or regions in an image. Texture analysis plays an important role in
computer vision tasks such as object recognition, surface defect detection,
pattern recognition, medical image analysis, etc. To date, many approaches have
been proposed to describe texture images accurately. Texture analysis methods
are usually classified into four categories: statistical, structural,
model-based, and transform-based methods. This paper discusses the various
methods used for texture analysis in detail. Recent research shows the power of
combinational methods for texture analysis, which do not fit into a single
category; this paper reviews well-known combinational methods in detail in a
dedicated section. The advantages and disadvantages of well-known texture image
descriptors are summarized in the results section. The main focus across all
surveyed methods is on discrimination performance, computational complexity,
and robustness to challenges such as noise and rotation. A brief review is also
given of the common classifiers used for texture image classification, and a
survey of texture image benchmark datasets is included. Comment: 29 Pages, Keywords: Texture Image, Texture Analysis, Texture
classification, Feature extraction, Image processing, Local Binary Patterns,
Benchmark texture image dataset
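
Since Local Binary Patterns appear among the keywords, a minimal 8-neighbour LBP histogram, one of the classic statistical descriptors such surveys cover, is sketched below; this basic variant (no uniform-pattern mapping, no rotation invariance) is purely illustrative.

    # Basic 8-neighbour Local Binary Patterns: threshold each pixel's ring of
    # neighbours against the centre, pack the bits into a code, and histogram.
    import numpy as np

    def lbp_histogram(gray):
        """gray: 2-D array; returns a 256-bin normalized LBP histogram."""
        c = gray[1:-1, 1:-1]
        neighbours = [gray[0:-2, 0:-2], gray[0:-2, 1:-1], gray[0:-2, 2:],
                      gray[1:-1, 2:],   gray[2:,   2:],   gray[2:,   1:-1],
                      gray[2:,   0:-2], gray[1:-1, 0:-2]]
        codes = np.zeros_like(c, dtype=np.uint8)
        for bit, n in enumerate(neighbours):
            codes |= (n >= c).astype(np.uint8) << bit
        hist = np.bincount(codes.ravel(), minlength=256).astype(float)
        return hist / hist.sum()

    # Histograms of two patches can then be compared with, e.g., a chi-square distance.
    patch = np.random.default_rng(0).random((64, 64))
    h = lbp_histogram(patch)
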
Learning a Reinforced Agent for Flexible Exposure Bracketing Selection
Automatically selecting an exposure bracketing (a set of images exposed differently) is
important for obtaining a high dynamic range image via multi-exposure fusion.
Unlike previous methods that have many restrictions, such as requiring a camera
response function, a sensor noise model, and a stream of preview images with
different exposures (not accessible in some scenarios, e.g., some mobile
applications), we propose a novel deep neural network to automatically select
exposure bracketing, named EBSNet, which is sufficiently flexible without
having the above restrictions. EBSNet is formulated as a reinforced agent that
is trained by maximizing rewards provided by a multi-exposure fusion network
(MEFNet). By utilizing the illumination and semantic information extracted from
just a single auto-exposure preview image, EBSNet can select an optimal
exposure bracketing for multi-exposure fusion. EBSNet and MEFNet can be jointly
trained to produce favorable results against recent state-of-the-art
approaches. To facilitate future research, we provide a new benchmark dataset
for multi-exposure selection and fusion. Comment: to be published in CVPR 2020
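
A minimal sketch of reward-driven bracketing selection in the spirit described above, using a REINFORCE-style policy-gradient update; the tiny policy network, the fixed set of K candidate bracketings, and the stand-in reward function are assumptions, not the actual EBSNet/MEFNet design.

    # Sketch: a policy network picks one of K candidate exposure bracketings from
    # a single auto-exposure preview; the reward would come from the fusion result.
    import torch
    import torch.nn as nn

    K = 8                                             # number of candidate bracketings
    policy = nn.Sequential(
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(16, K),
    )
    opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

    def fusion_reward(action):
        # Stand-in for the reward a multi-exposure fusion network would provide,
        # e.g. a quality score of the fused HDR result for the chosen bracketing.
        return torch.rand(action.shape[0])

    preview = torch.rand(4, 3, 128, 128)              # batch of auto-exposure previews
    logits = policy(preview)
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()                            # chosen bracketing per image
    reward = fusion_reward(action)
    loss = -(dist.log_prob(action) * reward).mean()   # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()
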
Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image
We propose a material acquisition approach to recover the spatially-varying
BRDF and normal map of a near-planar surface from a single image captured by a
handheld mobile phone camera. Our method images the surface under arbitrary
environment lighting with the flash turned on, thereby avoiding shadows while
simultaneously capturing high-frequency specular highlights. We train a CNN to
regress an SVBRDF and surface normals from this image. Our network is trained
using a large-scale SVBRDF dataset and designed to incorporate physical
insights for material estimation, including an in-network rendering layer to
model appearance and a material classifier to provide additional supervision
during training. We refine the results from the network using a dense CRF
module whose terms are designed specifically for our task. The framework is
trained end-to-end and produces high quality results for a variety of
materials. We provide extensive ablation studies to evaluate our network on
both synthetic and real data, while demonstrating significant improvements in
comparisons with prior works. Comment: submitted to European Conference on Computer Vision
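
A minimal sketch of the in-network rendering idea: re-render the predicted per-pixel maps under a collocated flash and penalize the difference from the captured photo. The Lambertian-plus-Blinn-Phong shading and the roughness-to-shininess mapping are simplifying assumptions, not the paper's rendering layer.

    # Sketch: differentiable re-rendering of predicted albedo, roughness and
    # normals under a collocated point flash, used as a reconstruction loss.
    import torch
    import torch.nn.functional as F

    def render(albedo, roughness, normals, light_dir):
        # albedo (B,3,H,W), roughness (B,1,H,W), normals (B,3,H,W) unit vectors,
        # light_dir (3,) unit vector, also the view direction (collocated flash).
        l = light_dir.view(1, 3, 1, 1)
        n_dot_l = (normals * l).sum(dim=1, keepdim=True).clamp(min=0.0)
        shininess = 2.0 / roughness.clamp(min=1e-3) ** 2   # crude roughness -> exponent mapping
        specular = n_dot_l ** shininess                    # half vector == l when view == light
        return albedo * n_dot_l + specular

    def rendering_loss(albedo, roughness, normals, photo, light_dir):
        # Supervision: the re-rendered prediction should match the captured photo.
        return F.l1_loss(render(albedo, roughness, normals, light_dir), photo)

    B, H, W = 2, 64, 64
    albedo, roughness = torch.rand(B, 3, H, W), torch.rand(B, 1, H, W)
    normals = F.normalize(torch.rand(B, 3, H, W), dim=1)
    photo = torch.rand(B, 3, H, W)
    loss = rendering_loss(albedo, roughness, normals, photo, torch.tensor([0.0, 0.0, 1.0]))
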
Distinguishing mirror from glass: A 'big data' approach to material perception
Visually identifying materials is crucial for many tasks, yet material
perception remains poorly understood. Distinguishing mirror from glass is
particularly challenging as both materials derive their appearance from their
surroundings, yet we rarely experience difficulties telling them apart. Here we
took a 'big data' approach to uncovering the underlying visual cues and
processes, leveraging recent advances in neural network models of vision. We
trained thousands of convolutional neural networks on >750,000 simulated mirror
and glass objects, and compared their performance with human judgments, as well
as alternative classifiers based on 'hand-engineered' image features. For
randomly chosen images, all classifiers and humans performed with high
accuracy, and therefore correlated highly with one another. To tease the models
apart, we then painstakingly assembled a diagnostic image set for which humans
make highly systematic errors, allowing us to decouple accuracy from human-like
performance. A large-scale, systematic search through feedforward neural
architectures revealed that relatively shallow networks predicted human
judgments better than any other models. However, surprisingly, no network
correlated better than 0.6 with humans (below inter-human correlations). Thus,
although the model sets new standards for simulating human vision in a
challenging material perception task, the results cast doubt on recent claims
that such architectures are generally good models of human vision. Comment: 40 pages, 5 figures, 7 supplement figures
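
A minimal sketch of how such a comparison can be summarized, decoupling raw accuracy from human-likeness measured as per-image correlation with human responses; the arrays below are random placeholders, not the study's data.

    # Sketch: overall accuracy can be high while per-image agreement with human
    # observers (Pearson correlation of response rates) stays modest.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)
    n_images = 500
    human_p_mirror = rng.random(n_images)        # fraction of observers answering "mirror"
    model_p_mirror = rng.random(n_images)        # model's P(mirror) per image
    ground_truth = rng.integers(0, 2, n_images)  # 1 = mirror, 0 = glass

    accuracy = np.mean((model_p_mirror > 0.5).astype(int) == ground_truth)
    r, _ = pearsonr(model_p_mirror, human_p_mirror)   # "human-likeness" of the model
    print(f"accuracy={accuracy:.2f}, correlation with humans r={r:.2f}")
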
Visual Tracking via Shallow and Deep Collaborative Model
In this paper, we propose a robust tracking method based on the collaboration
of a generative model and a discriminative classifier, where features are
learned by shallow and deep architectures, respectively. For the generative
model, we introduce a block-based incremental learning scheme, in which a local
binary mask is constructed to deal with occlusion. The similarity degrees
between the local patches and their corresponding subspace are integrated to
formulate a more accurate global appearance model. In the discriminative model,
we exploit the advances of deep learning architectures to learn generic
features which are robust to both background clutters and foreground appearance
variations. To this end, we first construct a discriminative training set from
auxiliary video sequences. A deep classification neural network is then trained
offline on this training set. Through online fine-tuning, both the hierarchical
feature extractor and the classifier can be adapted to the appearance change of
the target for effective online tracking. The collaboration of these two models
achieves a good balance in handling occlusion and target appearance change,
which are two contradictory challenging factors in visual tracking. Both
quantitative and qualitative evaluations against several state-of-the-art
algorithms on challenging image sequences demonstrate the accuracy and the
robustness of the proposed tracker. Comment: Undergraduate Thesis, appearing in Pattern Recognition
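
A minimal sketch of combining a generative score (subspace reconstruction quality of a candidate patch) with a discriminative confidence; the PCA-style orthonormal subspace and the multiplicative fusion rule are assumptions for illustration, not the paper's exact formulation.

    # Sketch: score each candidate window by how well the learned appearance
    # subspace reconstructs it, weighted by a discriminative classifier confidence.
    import numpy as np

    def generative_score(patch, mean, basis):
        # Reconstruction-based confidence: small residual -> high score.
        coeffs = basis.T @ (patch - mean)
        residual = np.linalg.norm((patch - mean) - basis @ coeffs)
        return np.exp(-residual)

    def combined_score(patch, mean, basis, clf_confidence):
        return generative_score(patch, mean, basis) * clf_confidence

    rng = np.random.default_rng(0)
    mean = rng.random(256)                                    # mean of flattened target patches
    basis = np.linalg.qr(rng.standard_normal((256, 8)))[0]    # orthonormal appearance subspace
    candidates = rng.random((50, 256))                        # flattened candidate windows
    clf_conf = rng.random(50)                                 # discriminative (e.g. CNN) confidences
    best = np.argmax([combined_score(c, mean, basis, p) for c, p in zip(candidates, clf_conf)])
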
Semantic Hierarchical Priors for Intrinsic Image Decomposition
Intrinsic Image Decomposition (IID) is a challenging and interesting computer
vision problem with various applications in several fields. We present novel
semantic priors and an integrated approach for single-image IID that involves
analyzing the image at three hierarchical context levels. Local context priors
capture scene properties at each pixel within a small neighbourhood. Mid-level
context priors encode object level semantics. Global context priors establish
correspondences at the scene level. Our semantic priors are designed on both
fixed and flexible regions, using selective search method and Convolutional
Neural Network features. Our IID method is an iterative multistage optimization
scheme and consists of two complementary formulations: smoothing for
shading and sparsity for reflectance. Experiments and analysis of our
method indicate the utility of our semantic priors and structured hierarchical
analysis in an IID framework. We compare our method with other contemporary IID
solutions and show results with fewer artifacts. Finally, we highlight that
proper choice and encoding of prior knowledge can produce competitive results
even when compared to end-to-end deep learning IID methods, signifying the
importance of such priors. We believe that the insights and techniques
presented in this paper would be useful for future IID research.
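
For reference, the two complementary formulations can be written as a generic intrinsic-decomposition energy; the terms and weights below are a textbook-style illustration rather than the authors' precise objective, which additionally incorporates the hierarchical semantic priors:

    I = R \cdot S, \qquad
    \min_{R,\,S}\; \| I - R \cdot S \|_2^2
        \;+\; \lambda_S \, \| \nabla S \|_2^2
        \;+\; \lambda_R \, \| \nabla R \|_1

Here the quadratic gradient penalty on the shading S encourages smoothness, while the L1 gradient penalty on the reflectance R encourages sparsity (piecewise-constant reflectance); \lambda_S and \lambda_R are weighting parameters.
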
Watermark Retrieval from 3D Printed Objects via Convolutional Neural Networks
We present a method for reading digital data embedded in planar 3D printed
surfaces. The data are organised in binary arrays and embedded as surface
textures in a way inspired by QR codes. At the core of the retrieval method
lies a Convolutional Neural Network, outputting a confidence map of the
location of the surface textures encoding value 1 bits. Subsequently, the bit
array is retrieved through a series of simple image processing and statistical
operations applied on the confidence map. Extensive experimentation with images
captured from various camera views, under various illumination conditions and
from objects printed with various material colours, shows that the proposed
method generalizes well and achieves the level of accuracy required in
practical applications.
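
A minimal sketch of the post-processing stage: once the network outputs a confidence map over a rectified view of the surface, the bit array can be recovered by pooling the confidence inside each cell of the known grid and thresholding; the grid size and the 0.5 threshold are assumptions for this sketch.

    # Sketch: average the confidence map inside each cell of the known bit grid
    # and threshold to recover the embedded binary array.
    import numpy as np

    def read_bits(confidence_map, grid_rows, grid_cols, threshold=0.5):
        h, w = confidence_map.shape
        bits = np.zeros((grid_rows, grid_cols), dtype=np.uint8)
        for r in range(grid_rows):
            for c in range(grid_cols):
                cell = confidence_map[r * h // grid_rows:(r + 1) * h // grid_rows,
                                      c * w // grid_cols:(c + 1) * w // grid_cols]
                bits[r, c] = cell.mean() > threshold
        return bits

    conf = np.random.default_rng(0).random((240, 240))   # stand-in confidence map
    bit_array = read_bits(conf, grid_rows=8, grid_cols=8)
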
Human Emotional Facial Expression Recognition
An automatic Facial Expression Recognition (FER) model with an Adaboost face
detector, feature selection based on manifold learning, and a synergetic
prototype-based classifier has been proposed. The improved feature selection
method and the proposed classifier achieve favorable FER performance in
reasonable processing time.