9,252 research outputs found
Robust Face Representation and Recognition Under Low Resolution and Difficult Lighting Conditions
This dissertation focuses on different aspects of face image analysis for accurate face recognition under low resolution and poor lighting conditions. A novel resolution enhancement technique is proposed for enhancing a low resolution face image into a high resolution image for better visualization and improved feature extraction, especially in a video surveillance environment. This method performs kernel regression and component feature learning in local neighborhood of the face images. It uses directional Fourier phase feature component to adaptively lean the regression kernel based on local covariance to estimate the high resolution image. For each patch in the neighborhood, four directional variances are estimated to adapt the interpolated pixels. A Modified Local Binary Pattern (MLBP) methodology for feature extraction is proposed to obtain robust face recognition under varying lighting conditions. Original LBP operator compares pixels in a local neighborhood with the center pixel and converts the resultant binary string to 8-bit integer value. So, it is less effective under difficult lighting conditions where variation between pixels is negligible. The proposed MLBP uses a two stage encoding procedure which is more robust in detecting this variation in a local patch. A novel dimensionality reduction technique called Marginality Preserving Embedding (MPE) is also proposed for enhancing the face recognition accuracy. Unlike Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which project data in a global sense, MPE seeks for a local structure in the manifold. This is similar to other subspace learning techniques but the difference with other manifold learning is that MPE preserves marginality in local reconstruction. Hence it provides better representation in low dimensional space and achieves lower error rates in face recognition. Two new concepts for robust face recognition are also presented in this dissertation. In the first approach, a neural network is used for training the system where input vectors are created by measuring distance from each input to its class mean. In the second approach, half-face symmetry is used, realizing the fact that the face images may contain various expressions such as open/close eye, open/close mouth etc., and classify the top half and bottom half separately and finally fuse the two results. By performing experiments on several standard face datasets, improved results were observed in all the new proposed methodologies. Research is progressing in developing a unified approach for the extraction of features suitable for accurate face recognition in a long range video sequence in complex environments
Recovering refined surface normals for relighting clothing in dynamic scenes
In this paper we present a method to relight captured 3D video sequences of non-rigid, dynamic scenes, such as clothing of real actors, reconstructed from multiple view video. A view-dependent approach is introduced to refine an initial coarse surface reconstruction using shape-from-shading to estimate detailed surface normals. The prior surface approximation is used to constrain the simultaneous estimation of surface normals and scene illumination, under the assumption of Lambertian surface reflectance. This approach enables detailed surface normals of a moving non-rigid object to be estimated from a single image frame. Refined normal estimates from multiple views are integrated into a single surface normal map. This approach allows highly non-rigid surfaces, such as creases in clothing, to be relit whilst preserving the detailed dynamics observed in video
A Novel Framework for Highlight Reflectance Transformation Imaging
We propose a novel pipeline and related software tools for processing the multi-light image collections (MLICs) acquired in different application contexts to obtain shape and appearance information of captured surfaces, as well as to derive compact relightable representations of them. Our pipeline extends the popular Highlight Reflectance Transformation Imaging (H-RTI) framework, which is widely used in the Cultural Heritage domain. We support, in particular, perspective camera modeling, per-pixel interpolated light direction estimation, as well as light normalization correcting vignetting and uneven non-directional illumination. Furthermore, we propose two novel easy-to-use software tools to simplify all processing steps. The tools, in addition to support easy processing and encoding of pixel data, implement a variety of visualizations, as well as multiple reflectance-model-fitting options. Experimental tests on synthetic and real-world MLICs demonstrate the usefulness of the novel algorithmic framework and the potential benefits of the proposed tools for end-user applications.Terms: "European Union (EU)" & "Horizon 2020" / Action: H2020-EU.3.6.3. - Reflective societies - cultural heritage and European identity / Acronym: Scan4Reco / Grant number: 665091DSURF project (PRIN 2015) funded by the Italian Ministry of University and ResearchSardinian Regional Authorities under projects VIGEC and Vis&VideoLa
Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
We propose a real-time RGB-based pipeline for object detection and 6D pose
estimation. Our novel 3D orientation estimation is based on a variant of the
Denoising Autoencoder that is trained on simulated views of a 3D model using
Domain Randomization. This so-called Augmented Autoencoder has several
advantages over existing methods: It does not require real, pose-annotated
training data, generalizes to various test sensors and inherently handles
object and view symmetries. Instead of learning an explicit mapping from input
images to object poses, it provides an implicit representation of object
orientations defined by samples in a latent space. Our pipeline achieves
state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D
domain. We also evaluate on the LineMOD dataset where we can compete with other
synthetically trained approaches. We further increase performance by correcting
3D orientation estimates to account for perspective errors when the object
deviates from the image center and show extended results.Comment: Code available at: https://github.com/DLR-RM/AugmentedAutoencode
Learning Multi-Scale Representations for Material Classification
The recent progress in sparse coding and deep learning has made unsupervised
feature learning methods a strong competitor to hand-crafted descriptors. In
computer vision, success stories of learned features have been predominantly
reported for object recognition tasks. In this paper, we investigate if and how
feature learning can be used for material recognition. We propose two
strategies to incorporate scale information into the learning procedure
resulting in a novel multi-scale coding procedure. Our results show that our
learned features for material recognition outperform hand-crafted descriptors
on the FMD and the KTH-TIPS2 material classification benchmarks
Comparing Evolutionary Operators, Search Spaces, and Evolutionary Algorithms in the Construction of Facial Composites
Facial composite construction is one of the most successful applications of interactive evolutionary computation.
In spite of this, previous work in the area of composite construction has not investigated the
algorithm design options in detail. We address this issue with four experiments. In the first experiment a
sorting task is used to identify the 12 most salient dimensions of a 30-dimensional search space. In the second
experiment the performances of two mutation and two recombination operators for interactive genetic
algorithms are compared. In the third experiment three search spaces are compared: a 30-dimensional
search space, a mathematically reduced 12-dimensional search space, and a 12-dimensional search space
formed from the 12 most salient dimensions. Finally, we compare the performances of an interactive
genetic algorithm to interactive differential evolution. Our results show that the facial composite construction
process is remarkably robust to the choice of evolutionary operator(s), the dimensionality of the search
space, and the choice of interactive evolutionary algorithm. We attribute this to the imprecise nature of human
face perception and differences between the participants in how they interact with the algorithms.
Povzetek: Kompozitna gradnja obrazov je ena izmed najbolj uspešnih aplikacij interaktivnega evolucijskega
ra?cunanja. Kljub temu pa do zdaj na podro?cju kompozitne gradnje niso bile podrobno raziskane
možnosti snovanja algoritma. To vprašanje smo obravnavali s štirimi poskusi. V prvem je uporabljeno
sortiranje za identifikacijo 12 najbolj izstopajo?cih dimenzij 30-dimenzionalnega preiskovalnega prostora.
V drugem primerjamo u?cinkovitost dveh mutacij in dveh rekombinacijskih operaterjev za interaktivni
genetski algoritem. V tretjem primerjamo tri preiskovalne prostore: 30-dimenzionalni, matemati?cno reducirani
12-dimenzionalni in 12-dimenzionalni prostor sestavljen iz 12 najpomembnejših dimenzij. Na
koncu smo primerjali uspešnost interaktivnega genetskega algoritma z interaktivno diferencialno evolucijo.
Rezultati kažejo, da je proces kompozitne gradnje obrazov izredno robusten glede na izbiro evolucijskega
operatorja(-ev), dimenzionalnost preiskovalnega prostora in izbiro interaktivnega evolucijskega algoritma.
To pripisujemo nenatan?cni naravi percepcije in razlikam med interakcijami uporabnikov z algoritmom
- …