Face Identification and Clustering
In this thesis, we study two problems based on clustering algorithms. In the
first problem, we study the role of visual attributes, using an agglomerative
clustering algorithm to narrow the search space when the number of classes is
large and thereby improve clustering performance. We observe that overall
clustering performance increases as more attributes are added. In the second
problem, we study the role of clustering in aggregating templates in a 1:N
open-set protocol using multi-shot video as a probe. We observe that
increasing the number of clusters improves performance over the baseline up
to a peak, beyond which further increases degrade performance. Experiments
are conducted on the recently introduced unconstrained IARPA Janus IJB-A,
CS2, and CS3 face recognition datasets.
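Agglomerative clustering of the kind the abstract builds on can be sketched in a few lines. The following toy single-linkage implementation on made-up two-attribute vectors is only an illustration of the general technique, not the thesis's actual pipeline or features:

```python
from itertools import combinations

def agglomerative(points, n_clusters):
    """Single-linkage agglomerative clustering on feature vectors.

    Starts with each point in its own cluster and repeatedly merges the
    two closest clusters until n_clusters remain.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest single-linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: min(dist(points[a], points[b])
                              for a in clusters[p[0]] for b in clusters[p[1]]),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

# Toy example: two hypothetical attribute values per face, two separated groups.
faces = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
print(sorted(sorted(c) for c in agglomerative(faces, 2)))
# → [[0, 1], [2, 3]]
```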
Illumination Condition Effect on Object Tracking: A Review
Illumination is an important concept in computer vision applications. A good tracker should perform well across a large number of videos involving illumination changes, occlusion, clutter, camera motion, low contrast, specularities, and several other challenging conditions. The trackers surveyed here aim to adapt to irregular illumination variations and abrupt changes in brightness. In a static environment, object segmentation is not complex; in a dynamic environment, conditions such as tree branches waving in the wind, shadows, and illumination changes make object segmentation a difficult and major problem that must be handled well for a robust surveillance system. In this paper, we survey various tracking algorithms under changing lighting conditions.
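The survey itself presents no code, but the adaptation to gradual illumination drift that it discusses is commonly achieved with a running-average background model. The sketch below is a generic illustration of that idea, not any particular surveyed tracker; the pixel values, rates, and threshold are made up:

```python
def update_background(bg, frame, alpha=0.05):
    """Exponential running average: bg <- (1-alpha)*bg + alpha*frame.

    A small alpha lets the model absorb gradual illumination drift while
    still flagging fast-changing objects as foreground.
    """
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]

def foreground_mask(bg, frame, threshold=20.0):
    """Pixels that deviate strongly from the background model."""
    return [abs(f - b) > threshold for b, f in zip(bg, frame)]

# A 4-pixel "frame" whose global brightness slowly rises (illumination change).
bg = [100.0] * 4
for step in range(50):
    frame = [100.0 + step * 0.5] * 4      # gradual global brightening
    bg = update_background(bg, frame)
# After adapting, a frame at the current brightness is not foreground...
assert not any(foreground_mask(bg, [125.0] * 4))
# ...but a sudden bright object still is.
assert any(foreground_mask(bg, [125.0, 125.0, 200.0, 125.0]))
```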
Consensus Message Passing for Layered Graphical Models
Generative models provide a powerful framework for probabilistic reasoning.
However, in many domains their use has been hampered by the practical
difficulties of inference. This is particularly the case in computer vision,
where models of the imaging process tend to be large, loopy and layered. For
this reason bottom-up conditional models have traditionally dominated in such
domains. We find that widely-used, general-purpose message passing inference
algorithms such as Expectation Propagation (EP) and Variational Message Passing
(VMP) fail on the simplest of vision models. With these models in mind, we
introduce a modification to message passing that learns to exploit their
layered structure by passing 'consensus' messages that guide inference towards
good solutions. Experiments on a variety of problems show that the proposed
technique leads to significantly more accurate inference results, not only when
compared to standard EP and VMP, but also when compared to competitive
bottom-up conditional models.
Comment: Appearing in Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS) 2015
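The consensus messages introduced in the paper are not reproduced here. As background, the following sketch shows plain sum-product message passing on a tiny 3-variable chain, where inference is exact because there are no loops; the paper's point is that loopy, layered models are exactly where standard schemes like EP and VMP struggle. All potentials are made up for illustration:

```python
import itertools

# Sum-product message passing on a binary chain x0 - x1 - x2.
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]   # node potentials
pair = [[2.0, 1.0], [1.0, 2.0]]                # shared edge potential

def normalize(m):
    s = sum(m)
    return [v / s for v in m]

# Forward messages: x0 -> x1, then x1 -> x2.
m01 = normalize([sum(unary[0][a] * pair[a][b] for a in range(2))
                 for b in range(2)])
m12 = normalize([sum(unary[1][a] * m01[a] * pair[a][b] for a in range(2))
                 for b in range(2)])

# Marginal of x2 from message passing.
bp_marg = normalize([unary[2][b] * m12[b] for b in range(2)])

# Brute-force check over all joint states confirms exactness on a chain.
brute = [0.0, 0.0]
for x0, x1, x2 in itertools.product(range(2), repeat=3):
    brute[x2] += (unary[0][x0] * unary[1][x1] * unary[2][x2]
                  * pair[x0][x1] * pair[x1][x2])
brute = normalize(brute)
assert all(abs(a - b) < 1e-9 for a, b in zip(bp_marg, brute))
```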
Analysis and approximation of some Shape-from-Shading models for non-Lambertian surfaces
The reconstruction of a 3D object or a scene is a classical inverse problem
in Computer Vision. In the case of a single image this is called the
Shape-from-Shading (SfS) problem and it is known to be ill-posed even in a
simplified version like the vertical light source case. A huge number of works
deals with the orthographic SfS problem based on the Lambertian reflectance
model, the most common and simplest model which leads to an eikonal type
equation when the light source is on the vertical axis. In this paper we want
to study non-Lambertian models since they are more realistic and suitable
whenever one has to deal with different kinds of surfaces, rough or specular. We
will present a unified mathematical formulation of some popular orthographic
non-Lambertian models, considering vertical and oblique light directions as
well as different viewer positions. These models lead to more complex
stationary nonlinear partial differential equations of Hamilton-Jacobi type
which can be regarded as the generalization of the classical eikonal equation
corresponding to the Lambertian case. However, all the equations corresponding
to the models considered here (Oren-Nayar and Phong) have a similar structure
so we can look for weak solutions to this class in the viscosity solution
framework. Via this unified approach, we are able to develop a semi-Lagrangian
approximation scheme for the Oren-Nayar and the Phong model and to prove a
general convergence result. Numerical simulations on synthetic and real images
will illustrate the effectiveness of this approach and the main features of the
scheme, also comparing the results with previous results in the literature.Comment: Accepted version to Journal of Mathematical Imaging and Vision, 57
page
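The paper's semi-Lagrangian scheme for the non-Lambertian Hamilton-Jacobi equations is not reproduced here. As background for the classical Lambertian case it generalizes, the sketch below solves the eikonal equation |∇u| = 1 on a grid with a standard fast-sweeping method, a different numerical scheme shown only to illustrate the type of equation involved:

```python
import math

def eikonal_fast_sweeping(n, seeds, h=1.0, sweeps=4):
    """Solve |grad u| = 1 on an n x n grid with u = 0 at the seed cells.

    Gauss-Seidel sweeps in alternating directions with the upwind Godunov
    update; u converges to the distance to the seed set.
    """
    INF = float("inf")
    u = [[INF] * n for _ in range(n)]
    for i, j in seeds:
        u[i][j] = 0.0

    def update(i, j):
        a = min(u[i - 1][j] if i > 0 else INF,
                u[i + 1][j] if i < n - 1 else INF)
        b = min(u[i][j - 1] if j > 0 else INF,
                u[i][j + 1] if j < n - 1 else INF)
        if abs(a - b) >= h:                  # one-sided update
            new = min(a, b) + h
        else:                                # two-sided (quadratic) update
            new = 0.5 * (a + b + math.sqrt(2 * h * h - (a - b) ** 2))
        u[i][j] = min(u[i][j], new)

    orders = [(range(n), range(n)),
              (range(n - 1, -1, -1), range(n)),
              (range(n), range(n - 1, -1, -1)),
              (range(n - 1, -1, -1), range(n - 1, -1, -1))]
    for _ in range(sweeps):
        for rows, cols in orders:
            for i in rows:
                for j in cols:
                    if (i, j) not in seeds:
                        update(i, j)
    return u

u = eikonal_fast_sweeping(9, {(0, 0)})
# Along the grid axes the solution equals the exact Euclidean distance.
assert abs(u[0][8] - 8.0) < 1e-6 and abs(u[8][0] - 8.0) < 1e-6
```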
A Computer Vision Non-Contact 3D System to Improve Fingerprint Acquisition
The fingerprint is one of the most important biometrics, with many acquisition methods developed over the years. Traditional 2D acquisition techniques produce nonlinear distortions due to the forced flattening of the finger onto a 2D surface. These random elastic deformations often introduce matching errors, making 2D techniques less reliable. Inevitably, non-contact 3D capturing techniques were developed to deal with these problems. In this study we present a novel non-contact, single-camera 3D fingerprint reconstruction system based on fringe projection and a new model for approximating the epidermal ridges. The 3D shape of the fingerprint is reconstructed from a single 2D shading image in two steps. First, the original image is decomposed into structure and texture components by an advanced Meyer algorithm. The structural component is reconstructed by a classical fringe projection technique. The textural component, containing the fingerprint information, is restored using a specialized photometric algorithm we call the Cylindrical Ridge Model (CRM), which takes advantage of the axial symmetry of the ridges to integrate the illumination equation. The two results are combined to form the 3D fingerprint, which is then digitally unfolded onto a 2D plane for compatibility with traditional 2D impressions. This paper describes the prototype 3D imaging system developed, along with the calibration procedure, the reconstruction algorithm, and the unwrapping process of the resulting 3D fingerprint, necessary for the performance evaluation of the method.
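Meyer's structure-texture decomposition is considerably more sophisticated than the following, but a crude low-pass/residual split conveys the core idea of separating slow finger geometry from fast ridge texture. The 1D signal and the box filter here are illustrative assumptions, not the paper's method:

```python
import math

def box_blur_1d(signal, radius=2):
    """Simple moving-average low-pass filter (edge-clamped)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def decompose(signal, radius=2):
    """Split a signal into a smooth 'structure' part and an oscillatory
    'texture' residual, so that signal = structure + texture exactly."""
    structure = box_blur_1d(signal, radius)
    texture = [s - b for s, b in zip(signal, structure)]
    return structure, texture

# A slow ramp (finger geometry) plus a fast oscillation (ridges).
signal = [0.1 * i + 0.5 * math.sin(2.5 * i) for i in range(40)]
structure, texture = decompose(signal, radius=3)
# The decomposition reconstructs the input exactly by construction.
assert all(abs(s - (a + b)) < 1e-12
           for s, a, b in zip(signal, structure, texture))
```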
Correction of Errors in Time of Flight Cameras
This thesis addresses the correction of errors in Time of Flight (ToF) depth cameras. Among the most recent technologies, Continuous Wave Modulation (CWM) ToF cameras are a promising alternative for building compact, fast sensors. However, a wide variety of errors notably affect the depth measurement, compromising potential applications. Correcting these errors poses a challenging problem. Two main sources of error are currently considered: i) systematic and ii) non-systematic. While the former admits calibration, the latter depends on the geometry and relative motion of the scene. This thesis proposes methods that address i) systematic depth distortion and two of the most relevant sources of non-systematic error: ii.a) Multipath Interference (MpI) and ii.b) motion artifacts. Systematic depth distortion in ToF cameras arises mainly from the use of imperfect sinusoidal modulation signals. As a result, depth measurements appear distorted, and this distortion can be reduced with a calibration stage. This thesis proposes a calibration method based on showing the camera a plane at different positions and orientations. The method does not require calibration patterns and can therefore use the planes that naturally appear in the scene. The proposed method finds a function that yields the depth correction for each pixel, improving on existing methods in accuracy, efficiency, and suitability. Multipath interference arises from the superposition of signal reflected along different paths with the direct reflection, producing distortions that are most noticeable on convex surfaces.
MpI causes significant errors in depth estimation with CWM ToF cameras. This thesis proposes a method that removes MpI from a single depth map. The proposed approach requires no information about the scene beyond the ToF measurements themselves. The method is based on a radiometric model of the measurements, which is used to estimate the undistorted depth map very accurately. One of the leading technologies for ToF depth imaging is based on the Photonic Mixer Device (PMD), which obtains depth by sequentially sampling the correlation between the modulation signal and the signal returning from the scene at different phase shifts. Under motion, PMD pixels capture different depths at each sampling stage, producing motion artifacts. The method proposed in this thesis for correcting these artifacts stands out for its speed and simplicity, and can easily be included in the camera hardware. The depth of each pixel is recovered from the consistency between the correlation samples at the PMD pixel and in its local neighborhood. This method obtains accurate corrections, greatly reducing motion artifacts. Moreover, as a by-product, the optical flow at moving contours can be obtained from a single capture. Despite being a very promising alternative for depth acquisition, ToF cameras must still solve challenging problems regarding the correction of systematic and non-systematic errors. This thesis proposes effective methods to deal with these errors.
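The thesis's correction methods are not reproduced here, but the four-phase PMD depth recovery it describes (sequential sampling of the correlation at phase offsets of 0°, 90°, 180°, and 270°) follows a standard formula. The sketch below assumes ideal sinusoidal correlation samples, which is precisely the assumption whose violation causes the systematic depth distortion discussed above; the modulation frequency and sample values are illustrative:

```python
import math

C_LIGHT = 299_792_458.0          # speed of light, m/s

def pmd_depth(c0, c1, c2, c3, f_mod):
    """Recover depth from four sequential PMD correlation samples.

    The samples are taken at phase offsets 0, 90, 180 and 270 degrees;
    the phase shift of the returned light encodes the round-trip time.
    """
    phase = math.atan2(c3 - c1, c0 - c2) % (2 * math.pi)
    return C_LIGHT * phase / (4 * math.pi * f_mod)

def simulate_samples(depth, f_mod, amplitude=1.0, offset=2.0):
    """Ideal (noise-free, perfectly sinusoidal) correlation samples."""
    phase = 4 * math.pi * f_mod * depth / C_LIGHT
    return [offset + amplitude * math.cos(phase + k * math.pi / 2)
            for k in range(4)]

f = 20e6                          # 20 MHz modulation -> ~7.5 m range
for true_depth in (0.5, 2.0, 6.0):
    c0, c1, c2, c3 = simulate_samples(true_depth, f)
    assert abs(pmd_depth(c0, c1, c2, c3, f) - true_depth) < 1e-6
```

Under motion, the four samples come from different scene depths, so this closed-form inversion produces the motion artifacts the thesis corrects.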