3 research outputs found

    Pattern matching of footwear Impressions

    Footwear impressions are among the most frequently secured types of evidence at crime scenes. Identifying the brand and model of the footwear can be crucial to narrowing the search for suspects. Forensic experts do this by comparing the evidence found at the crime scene against a large collection of reference impressions. To support them, automatic retrieval of the most likely matches is desirable. In this thesis, different techniques are evaluated to recognize and match footwear impressions, using reference and real crime scene shoeprint images. Because of the conditions in which shoeprints are found (partial occlusions, variation in shape), a translation-, rotation- and scale-invariant system is needed. A VLAD (Vector of Locally Aggregated Descriptors) encoder is used to cluster descriptors obtained with different approaches: SIFT (Scale-Invariant Feature Transform), Dense SIFT and a Triplet CNN (Convolutional Neural Network). The last two approaches give the best performance once their parameters are correctly tuned, as evaluated with the Cumulative Matching Characteristic (CMC) curve.
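As an illustration of the two building blocks named in this abstract, the following is a minimal sketch of VLAD encoding and of a CMC curve. It is not the thesis implementation: the function names, the use of NumPy, the Euclidean distance for ranking, and the signed square-root normalization are all assumptions made for the example, and the cluster centers are assumed to come from a separately trained k-means codebook.

```python
import numpy as np

def vlad_encode(descriptors, centers):
    """Aggregate local descriptors (n, d) into one VLAD vector of length k*d.

    `centers` (k, d) is assumed to be a precomputed k-means codebook.
    """
    # Assign each descriptor to its nearest cluster center.
    dists = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    k, d = centers.shape
    vlad = np.zeros((k, d))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members):
            # Sum of residuals between the descriptors and their center.
            vlad[i] = (members - centers[i]).sum(axis=0)
    vlad = vlad.ravel()
    # Assumed normalization: signed square root, then L2.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

def cmc_curve(query_vecs, query_labels, ref_vecs, ref_labels):
    """Fraction of queries whose correct reference is within the top r ranks.

    Assumes every query label occurs among the reference labels.
    """
    ref_labels = np.asarray(ref_labels)
    hits = np.zeros(len(ref_labels))
    for q, label in zip(query_vecs, query_labels):
        # Rank references by increasing distance to the query.
        order = np.argsort(np.linalg.norm(ref_vecs - q, axis=1))
        rank = int(np.where(ref_labels[order] == label)[0][0])
        # A hit at this rank counts for all larger ranks too.
        hits[rank:] += 1
    return hits / len(query_labels)
```

In a retrieval setting, each shoeprint image would be encoded once with `vlad_encode`, and `cmc_curve` would then report, for each rank r, how often the correct reference model appears among the r closest reference encodings.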

    A Survey of Face Recognition based on Convolutional Neural Network

    Face recognition is an active research topic in computer vision. In recent years, deep learning methods, especially the Convolutional Neural Network (CNN), have advanced considerably, and face recognition is one of their successes. Automatic face recognition is a technique that lets a computer recognize faces in an image. This survey reviews research on face recognition based on Convolutional Neural Networks, covering studies published in the last five years, in order to identify the advances that have emerged in the area. The basic theory of the Convolutional Neural Network, face recognition, and the databases used in the various studies are also discussed. We hope this survey provides additional knowledge about face recognition based on the Convolutional Neural Network.

    Deep Learning with Constraints and Priors for Improved Subject Clustering, Medical Imaging, and Robust Inference

    Deep neural networks (DNNs) have achieved significant success in several fields including computer vision, natural language processing, and robot control. The common philosophy behind these successes is the use of large amounts of annotated data and end-to-end networks with task-specific constraints and priors implicitly incorporated into the trained model without the need for careful feature engineering. However, DNNs have been shown to be vulnerable to distribution shifts and adversarial perturbations, which indicates that such implicit priors and constraints are not sufficient for real-world applications. In this dissertation, we target three applications and design task-specific constraints and priors for improved performance of deep neural networks. We first study the problem of subject clustering, the task of grouping face images of the same person together. We propose to utilize the prior structure in the feature space of DNNs trained for face identification to design a novel clustering algorithm. Specifically, the clustering algorithm exploits the local neighborhood structure of deep representations by exemplar-based learning based on k-nearest neighbors (k-NN). Extensive experiments show promising results for grouping face images according to subject identity. As an example, we apply the proposed clustering algorithm to automatically curate a large-scale face dataset with noisy labels and show that the performance of face recognition DNNs can be significantly improved by training on the curated dataset. Furthermore, we empirically find that the k-NN rule does not capture proper local structures for deep representations when each subject has very few face images. We then propose to improve upon the exemplar-based approach with a density-aware similarity measure and theoretically show its asymptotic convergence to a density estimator. We conduct experiments on challenging face datasets that show promising results.
Second, we study the problem of metal artifact reduction in computed tomography (CT). Unlike typical image restoration tasks such as super-resolution and denoising, metal artifacts in CT images are structured and non-local. Conventional DNNs do not generalize well when metal implants with unseen shapes are presented. We find that the imaging process of CT induces a data consistency prior that can be exploited for image enhancement. Based on this observation, we propose a dual-domain learning approach to CT metal artifact reduction. We design and implement a novel Radon inversion layer that allows gradients in the image domain to be backpropagated to the projection domain. Experiments conducted on both simulated and clinical datasets show promising results. Compared to conventional DNN-based models, the proposed dual-domain approach leads to impressive metal artifact reduction and has improved generalization capability. Finally, we study the problem of robust classification. In the past few years, the vulnerability of DNNs to small imperceptible perturbations has been widely studied, which raises concerns about the security and robustness of DNNs against possible threat models. To defend against threat models, Samangouei et al. proposed DefenseGAN, a preprocessing approach which removes adversarial perturbations by projecting the input images onto the learned data prior. However, the projection operation in DefenseGAN is time-consuming and may not yield proper reconstruction when images have complicated textures. We propose an inversion network to constrain the initial estimates of the latent code for input images. With the proposed constraint, the number of optimization steps in DefenseGAN can be reduced while achieving improved accuracy and robustness. Furthermore, we conduct empirical studies on attack methods that have claimed to break DefenseGAN, which show that on-manifold robustness might be the key factor for ensuring adversarial robustness.
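The effect of a better initial latent estimate on the projection step can be pictured with a toy example. The sketch below is not DefenseGAN or the dissertation's inversion network: it assumes a hypothetical linear "generator" `W` in place of a trained GAN, plain gradient descent on the squared reconstruction error, and a hand-picked warm start standing in for the output of an inversion network. It only shows why starting closer to the solution needs fewer optimization steps.

```python
import numpy as np

def project_onto_generator(x, W, z0, lr, tol=1e-6, max_steps=10_000):
    """Gradient descent on ||W z - x||^2; returns (z, number of steps used)."""
    z = np.array(z0, dtype=float)
    for step in range(max_steps):
        residual = W @ z - x
        # Stop once the squared reconstruction error is below the tolerance.
        if residual @ residual < tol:
            return z, step
        # Gradient of ||W z - x||^2 with respect to z is 2 W^T (W z - x).
        z -= lr * 2 * W.T @ residual
    return z, max_steps

# Toy linear generator (an assumption for illustration, not a trained GAN).
W = np.array([[1.0, 0.0],
              [0.0, 0.5],
              [0.0, 0.0]])
z_true = np.array([1.0, -2.0])
x = W @ z_true  # an input that lies exactly on the generator's range

# Cold start from the origin versus a warm start near the solution,
# the latter standing in for an inversion network's estimate.
z_cold, cold_steps = project_onto_generator(x, W, np.zeros(2), lr=0.5)
z_warm, warm_steps = project_onto_generator(x, W, z_true + np.array([0.05, -0.05]), lr=0.5)
print(cold_steps, warm_steps)
```

Both runs converge to the same latent code, but the warm start reaches the tolerance in fewer iterations, which is the intuition behind constraining the initial estimate.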