126 research outputs found

    A Review of Recent Advances in Surface Defect Detection using Texture analysis Techniques

    In this paper, we systematically review recent advances in surface inspection using computer vision and image processing techniques, particularly those based on texture analysis methods. The aim is to review the state-of-the-art techniques for the purposes of visual inspection and decision making schemes that are able to discriminate the features extracted from normal and defective regions. This field is so vast that it is impossible to cover all the aspects of visual inspection. This paper focuses on a particular but important subset which generally treats visual surface inspection as a texture analysis problem. Other topics related to visual inspection, such as imaging systems and data acquisition, are out of the scope of this survey.
    The surface defects are loosely separated into two types. One is local textural irregularities, which are the main concern for most visual surface inspection applications. The other is global deviation of colour and/or texture, where local pattern or texture does not exhibit abnormalities. We refer to this type of defect as the shade or tonality problem. The second type of defect was largely neglected until recently, particularly when colour imaging systems became widely used in visual inspection and where chromatic consistency plays an important role in quality control. The emphasis of this survey, though, is still on detecting local abnormalities, given that the majority of the reported works deal with the first type of defect.
    The techniques used to inspect textural abnormalities are discussed in four categories: statistical approaches, structural approaches, filter based methods, and model based approaches, with a comprehensive list of references to some recent works. Due to the rising demand for and practice of colour texture analysis in application to visual inspection, those works that deal with colour texture analysis are discussed separately. It is also worth noting that processing vector-valued data has its unique challenges, which conventional surface inspection methods have often ignored or do not encounter.
    We also compare classification approaches with novelty detection approaches at the decision making stage. Classification approaches often require supervised training and usually provide better performance than novelty detection based approaches, where training is only carried out on defect-free samples. However, novelty detection is relatively easier to adapt and is particularly desirable when training samples are incomplete.

    Deep Learning Approaches to Image Texture Analysis in Material Processing

    Texture analysis is key to a better understanding of the relationships between the microstructures of materials and their properties, as well as to the use of models in process systems that take raw signals or images as input. Recently, new methods based on transfer learning with deep neural networks have become established as highly competitive alternatives to classical texture analysis. In this study, three traditional approaches, based on the use of grey level co-occurrence matrices (GLCM), local binary patterns (LBP) and textons, are compared with five transfer learning approaches, based on the use of AlexNet, VGG19, ResNet50, GoogLeNet and MobileNetV2. This is done based on two simulated case studies and one real-world case study. In the simulated case studies, material microstructures were simulated with Voronoi graphic representations, and in the real-world case study, the appearance of ultrahigh carbon steel is cast as a textural pattern recognition problem. The ability of random forest models, as well as the convolutional neural networks themselves, to discriminate between different textures with the image features as input was used as the basis for comparison. The texton algorithm performed better than the LBP and GLCM algorithms and similarly to the deep learning approaches when these were used directly, without any retraining. Partial or full retraining of the convolutional neural networks yielded considerably better results, with GoogLeNet and MobileNetV2 yielding the best results.
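    To make the grey level co-occurrence matrix mentioned in this abstract concrete, the following is a minimal sketch, not the implementation used in the study. It assumes a small image already quantized to integer grey levels; the helper names `glcm` and `contrast`, the default offset and the number of levels are illustrative choices.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Count how often grey level i has grey level j at offset (dx, dy).

    `image` is a list of rows of integers in range(levels).
    The matrix is not symmetrized or normalized.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: mean squared grey-level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        count * (i - j) ** 2
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


if __name__ == "__main__":
    sample = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    m = glcm(sample)
    for row in m:
        print(row)
    print("contrast:", contrast(m))
```

    In practice, several offsets (for example 0, 45, 90 and 135 degrees) are usually combined, and features such as contrast are fed to a classifier such as the random forest models mentioned above.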

    Road Condition Mapping by Integration of Laser Scanning, RGB Imaging and Spectrometry

    Roads are important infrastructure and are a primary means of transportation. Control and maintenance of roads require substantial effort, as the pavement surface deforms and deteriorates due to heavy loads and the influence of weather. Acquiring detailed information about the pavement condition is a prerequisite for proper planning of road pavement maintenance and rehabilitation. Many companies detect and localize road pavement distresses manually, either by on-site inspection or by digitizing laser data and imagery captured by mobile mapping. The automation of road condition mapping using laser data and colour images is a challenge. Beyond that, the mapping of material properties of the road pavement surface with spectrometers has not yet been investigated. This study aims at automatic mapping of road surface condition, including distress and material properties, by integrating laser scanning, RGB imaging and spectrometry. All recorded data are geo-referenced by means of GNSS/INS. Methods are developed for pavement distress detection that cope with a variety of weather and asphalt conditions. A further objective is to analyse and map the material properties of the pavement surface using spectrometry data. No standard test data sets are available for benchmarking developments in road condition mapping. Therefore, all data have been recorded with a mobile mapping van set up for the purpose of this research. The concept for detecting and localizing the four main pavement distresses, i.e. ruts, potholes, cracks and patches, is the following: ruts and potholes are detected using laser scanning data, cracks and patches using RGB images. For each of these pavement distresses, two or more methods are developed, implemented, compared to each other and evaluated to identify the most successful method. With respect to the material characteristics, spectrometer data of road sections are classified to indicate pavement quality.
As a spectrometer registers an almost continuous reflectivity curve in the VIS, NIR and SWIR wavelength ranges, indications of aging can be derived. After detection and localization of the pavement distresses and pavement quality classes, the road condition map is generated by overlaying all distresses and quality classes. As a preparatory step for rut and pothole detection, the road surface is extracted from mobile laser scanning data based on a height jump criterion. For the investigation of rut detection, all scanlines are processed. With an approach based on iterative 1D polynomial fitting, ruts are successfully detected. For streets with a width of 6 m to 10 m, a 6th order polynomial is found to be most suitable. By 1D cross-correlation, the centre of the rut is localized. An alternative method using local curvature shows a high sensitivity to the shape and width of a rut and is less successful. For pothole detection, the approach based on polynomial fitting is generalized to two dimensions. As an alternative, a procedure using geodesic morphological reconstruction is investigated. Bivariate polynomial fitting encounters problems with overshoot at the boundary of the regions. The detection is very successful using geodesic morphology. For the detection of pavement cracks, three methods using rotation invariant kernels are investigated. Line Filter, High-pass Filter and Modified Local Binary Pattern kernels are implemented. A conceptual aim of the procedure is to achieve a high degree of completeness. The most successful variant is the Line Filter, for which the highest degree of completeness, 81.2 %, is achieved. Two texture measures, the gradient magnitude and the local standard deviation, are employed to detect pavement patches. As patches may differ with respect to homogeneity and may not always have a dark border with the intact pavement surface, the method using the local standard deviation is more suitable for detecting the patches.
Linear discriminant analysis is utilized for asphalt pavement quality analysis and classification. Road pavement sections of ca. 4 m length are classified into two classes, namely "Good" and "Bad", with an overall accuracy of 77.6 %. The experimental investigations show that the developed methods for automatic distress detection are very successful. By 1D polynomial fitting on laser scanlines, ruts are detected. In addition to ruts, pavement depressions such as shoving can also be revealed. The extraction of potholes is less demanding. As potholes appear relatively rarely in the road networks of a city, the road segments which are affected by potholes are selected interactively. While crack detection by Line Filter works very well, patch detection is more challenging, as patches sometimes look very similar to the intact surface. The spectral classification of pavement sections contributes to road condition mapping as it gives hints on aging of the road pavement.

    Roads are the primary transport routes for people and goods and are therefore an important part of the infrastructure. The effort required to maintain and service roads is considerable, because the pavement surface deforms and deteriorates under heavy loads and the influence of weather. Acquiring detailed information about the pavement condition is a prerequisite for proper planning of pavement repair and rehabilitation. Many companies detect and localize pavement distresses manually, either by on-site inspection or by digitizing laser data and images from mobile data acquisition. Automation of road mapping with laser data and colour images is still in its infancy. Moreover, the aging state of the asphalt surface has not yet been assessed by means of spectrometry.
This study aims at an automatic process for road condition mapping, including road distresses and material properties, by integrating laser scanning, RGB imaging and spectrometry. All recorded data are georeferenced with GNSS/INS. Methods for detecting road distresses are developed that can adapt to different data sources under varying weather and asphalt conditions. A further aim is to analyse and map the material properties of the pavement surface using spectrometry data. There are currently no standardized test data sets for evaluating methods that describe road condition. Therefore, all data used in this study were recorded with a survey vehicle configured specifically for this research. The concept for detecting and localizing the four most important types of road distress, namely ruts, potholes, cracks and patches, is the following: ruts and potholes are extracted from laser data, cracks and patches from RGB images. For each of these distresses, at least two methods are developed, implemented, compared with each other and evaluated to determine which method is the most successful. With regard to material properties, spectrometry data of the road sections are classified to assess the quality of the pavement. Since a spectrometer records an almost continuous reflectivity curve in the VIS, NIR and SWIR wavelength ranges, features of asphalt aging can be derived. After the detection and localization of the road distresses and the pavement quality class, the overall road condition is determined using override rules (Durchschlagsregeln) as a combination of all condition values and quality classes.
In a preparatory step for rut and pothole detection, the road surface is extracted from mobile laser scanning data based on a height jump criterion. For the investigation of rut detection, all scanlines are processed. With an approach based on iterative 1D polynomial fitting, ruts are detected successfully. For a road width of 8-10 m, a sixth-degree polynomial proves most suitable. The centre of the rut is identified by 1D cross-correlation. An alternative method that uses the local curvature of the cross profile proves sensitive to the shape and width of a rut and is less successful. For pothole detection, the polynomial fitting approach is generalized to two dimensions. As an alternative, a method based on geodesic morphological reconstruction is investigated. Bivariate polynomial fitting leads to overshoot at the boundaries of the regions. Detection using geodesic morphological reconstruction, by contrast, is very successful. For crack detection, three methods using rotation-invariant kernels are investigated. Line filters, high-pass filters and local binary patterns are implemented. One aim of the crack detection concept is to achieve high completeness. The most successful variant is the line filter, for which the highest degree of completeness, 81.2 %, was achieved. Two texture measures, namely the magnitude of the grey-value gradient and the local standard deviation, are used to detect patches. Since patches can vary in homogeneity and do not always have a dark border with the intact pavement, the method using the local standard deviation is better suited to detecting patches. Linear discriminant analysis is used for asphalt quality analysis and classification.
Road sections of approx. 4 m length are assigned to two classes ("Good" and "Bad") with an overall accuracy of 77.6 %. The experimental investigations show that the developed methods for automatic detection of road distresses are very successful. Ruts are detected by 1D polynomial fitting on laser scanlines. In addition to ruts, pavement unevenness such as shoving is also detected. The extraction of potholes is less demanding. Since potholes occur relatively rarely in urban road networks, the road sections with potholes are selected interactively. While crack detection with line filters works very well, detecting patches is a greater challenge, because patches sometimes look very similar to the intact road surface. The spectral classification of the road sections contributes to road condition assessment by providing indications of the aging state of the pavement.
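    The abstract above names the local standard deviation as the more suitable texture measure for patch detection. The sketch below shows one plain way to compute such a map; it is an illustration under stated assumptions, not the thesis's procedure. The function name `local_std`, the square window and the `radius` parameter are hypothetical, and windows are simply clipped at the image border.

```python
from statistics import pstdev


def local_std(image, radius=1):
    """Return a map of the standard deviation inside a square window.

    `image` is a list of rows of grey values. The window spans
    (2 * radius + 1) pixels per side and is clipped at the borders.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            # Population standard deviation of the grey values in the window.
            out[r][c] = pstdev(window)
    return out


if __name__ == "__main__":
    grey = [
        [10, 10, 10],
        [10, 19, 10],
        [10, 10, 10],
    ]
    for row in local_std(grey):
        print([round(v, 3) for v in row])
```

    Homogeneous regions give values near zero, so thresholding the map separates smooth areas from rough ones. How the thesis selects the window size and threshold is not stated in the abstract.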

    Smart Road Danger Detection and Warning

    Road dangers have caused numerous accidents, so detecting them and warning users are critical to improving traffic safety. However, it is challenging to recognize road dangers among large amounts of normal data and to warn road users, due to cluttered real-world backgrounds, ever-changing road danger appearances, high intra-class differences, limited data for any one party, and a high risk of leaking sensitive information. To address these challenges, in this thesis, three novel road danger detection and warning frameworks are proposed to improve the performance of real-time road danger prediction and notification in challenging real-world environments in four main aspects, i.e., accuracy, latency, communication efficiency, and privacy. Firstly, many existing road danger detection systems mainly process data in the cloud. However, they cannot warn users about road dangers in a timely manner due to long distances. Meanwhile, the supervised machine learning algorithms usually used in these systems require large and precisely labeled datasets to perform well. EcRD is proposed to improve latency and reduce labeling cost. It is an Edge-cloud-based Road Damage detection and warning framework that leverages the fast response of edges and the large storage and computation resources of the cloud. In EcRD, a simple yet efficient road segmentation algorithm is introduced for fast and accurate road area detection by filtering out noisy backgrounds. Additionally, a lightweight road damage detector is developed based on Gray Level Co-occurrence Matrix (GLCM) features on edges for rapid hazardous road damage detection and warning. Further, a multi-type road damage detection model is proposed for long-term road management on the cloud, embedded with a novel image-label generator based on Cycle-Consistent Adversarial Networks, which automatically generates images with corresponding labels to further improve road damage detection accuracy.
EcRD achieves 91.96% accuracy with only 0.0043s latency, which is around 579 times faster than cloud-based approaches, without affecting users' experience and while requiring very low storage and labeling cost. Secondly, although EcRD relieves the problem of high latency through edge computing techniques, road users can only receive warnings of hazardous road damage within a small area due to the limited communication range of edges. Besides, untrusted edges might misuse users' personal information. A novel framework named FedRD is developed to improve the coverage range of warning information and protect data privacy. In FedRD, a new hazardous road damage detection model is proposed that leverages the advantages of feature fusion. A novel adaptive federated learning strategy is designed for high-performance model learning from different edges. A new individualized differential privacy approach with pixelization is proposed to protect users' privacy before sharing data. Simulation results show that FedRD achieves similarly high detection performance (i.e., 90.32% accuracy) but with more than 1000 times wider coverage than the state-of-the-art, and works well when some edges only have limited samples; besides, it largely preserves users' privacy. Finally, despite the success of EcRD and FedRD in improving latency and protecting privacy, they are based on only a single modality (i.e., image/video), while data of different modalities have become ubiquitous. Also, the communication costs of EcRD and FedRD are very high due to undifferentiated data transmission (both normal and dangerous data) and frequent model exchanges in the federated learning setting, respectively. A novel edge-cloud-based privacy-preserving Federated Multimodal learning framework for Road Danger detection and warning, named FedMRD, is introduced to leverage multi-modality data in the real world and reduce communication costs.
In FedMRD, a novel multimodal road danger detection model considering both inter- and intra-class relations is developed. A communication-efficient federated learning strategy is proposed for collaborative model learning from edges with non-iid and imbalanced data. Further, a new multimodal differential privacy technique for high dimensional multimodal data with multiple attributes is introduced to protect data privacy directly on users' devices before uploading to edges. Experimental results demonstrate that FedMRD achieves around 96.42% higher accuracy with only 0.0351s latency and up to 250 times less communication cost compared with the state-of-the-art, and enables collaborative learning from multiple edges with non-iid and imbalanced data in different modalities while preserving users' privacy. 2021-11-2

    Texture Structure Analysis

    abstract: Texture analysis plays an important role in applications like automated pattern inspection, image and video compression, content-based image retrieval, remote-sensing, medical imaging and document processing, to name a few. Texture Structure Analysis is the process of studying the structure present in the textures. This structure can be expressed in terms of perceived regularity. Our human visual system (HVS) uses the perceived regularity as one of the important pre-attentive cues in low-level image understanding. Similar to the HVS, image processing and computer vision systems can make fast and efficient decisions if they can quantify this regularity automatically. In this work, the problem of quantifying the degree of perceived regularity when looking at an arbitrary texture is introduced and addressed. One key contribution of this work is in proposing an objective no-reference perceptual texture regularity metric based on visual saliency. Other key contributions include an adaptive texture synthesis method based on texture regularity, and a low-complexity reduced-reference visual quality metric for assessing the quality of synthesized textures. In order to use the best performing visual attention model on textures, the performance of the most popular visual attention models to predict the visual saliency on textures is evaluated. Since there is no publicly available database with ground-truth saliency maps on images with exclusive texture content, a new eye-tracking database is systematically built. Using the Visual Saliency Map (VSM) generated by the best visual attention model, the proposed texture regularity metric is computed. The proposed metric is based on the observation that VSM characteristics differ between textures of differing regularity. The proposed texture regularity metric is based on two texture regularity scores, namely a textural similarity score and a spatial distribution score. 
In order to evaluate the performance of the proposed regularity metric, a texture regularity database called RegTEX, is built as a part of this work. It is shown through subjective testing that the proposed metric has a strong correlation with the Mean Opinion Score (MOS) for the perceived regularity of textures. The proposed method is also shown to be robust to geometric and photometric transformations and outperforms some of the popular texture regularity metrics in predicting the perceived regularity. The impact of the proposed metric to improve the performance of many image-processing applications is also presented. The influence of the perceived texture regularity on the perceptual quality of synthesized textures is demonstrated through building a synthesized textures database named SynTEX. It is shown through subjective testing that textures with different degrees of perceived regularities exhibit different degrees of vulnerability to artifacts resulting from different texture synthesis approaches. This work also proposes an algorithm for adaptively selecting the appropriate texture synthesis method based on the perceived regularity of the original texture. A reduced-reference texture quality metric for texture synthesis is also proposed as part of this work. The metric is based on the change in perceived regularity and the change in perceived granularity between the original and the synthesized textures. The perceived granularity is quantified through a new granularity metric that is proposed in this work. It is shown through subjective testing that the proposed quality metric, using just 2 parameters, has a strong correlation with the MOS for the fidelity of synthesized textures and outperforms the state-of-the-art full-reference quality metrics on 3 different texture databases. 
Finally, the ability of the proposed regularity metric to predict the perceived degradation of textures due to compression and blur artifacts is also established. Dissertation/Thesis, Ph.D. Electrical Engineering 201

    Robust texture classification based on machine learning


    Study on Co-occurrence-based Image Feature Analysis and Texture Recognition Employing Diagonal-Crisscross Local Binary Pattern

    In this thesis, we focus on several important topics in real-world image texture analysis and recognition. We survey various important features that are suitable for texture analysis. Apart from the variety of features, different types of texture datasets are also discussed in depth. There is no thorough prior work covering the important databases and analyzing them from various viewpoints. We categorize texture databases based on many references. In this survey, we split these texture datasets into a few basic groups and then place related datasets within them. Next, we exhaustively analyze eleven second-order statistical features or cues based on co-occurrence matrices to understand image texture surfaces. These features are exploited to analyze properties of image texture. The features are also categorized based on their angular orientations and their applicability. Finally, we propose a method called diagonal-crisscross local binary pattern (DCLBP) for texture recognition. We also propose two other extensions of the local binary pattern. Compared to the local binary pattern and a few other extensions, our proposed method performs satisfactorily on two very challenging benchmark datasets, the KTH-TIPS (Textures under varying Illumination, Pose and Scale) database and the USC-SIPI (University of Southern California - Signal and Image Processing Institute) Rotations Texture dataset.
    Kyushu Institute of Technology doctoral dissertation. Degree number: 工博甲第354号 (Engineering Doctorate, Kō No. 354). Date of conferral: September 27, 2013 (Heisei 25).
    CHAPTER 1 INTRODUCTION | CHAPTER 2 FEATURES FOR TEXTURE ANALYSIS | CHAPTER 3 IN-DEPTH ANALYSIS OF TEXTURE DATABASES | CHAPTER 4 ANALYSIS OF FEATURES BASED ON CO-OCCURRENCE IMAGE MATRIX | CHAPTER 5 CATEGORIZATION OF FEATURES BASED ON CO-OCCURRENCE IMAGE MATRIX | CHAPTER 6 TEXTURE RECOGNITION BASED ON DIAGONAL-CRISSCROSS LOCAL BINARY PATTERN | CHAPTER 7 CONCLUSIONS AND FUTURE WORK
    Kyushu Institute of Technology, 2013 (Heisei 25)
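    For readers unfamiliar with the baseline that DCLBP extends, the following is a minimal sketch of the basic 8-neighbour local binary pattern. It is not the diagonal-crisscross variant proposed in the thesis, whose definition is not given in this abstract. The names `lbp_code` and `lbp_histogram`, the neighbour ordering and the `>=` comparison are conventions assumed here for illustration.

```python
# Clockwise neighbour offsets (row, column), starting at the top-left pixel.
# The starting point and direction are a convention chosen for this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


if __name__ == "__main__":
    patch = [
        [1, 2, 3],
        [8, 5, 4],
        [7, 6, 9],
    ]
    print(lbp_code(patch, 1, 1))
```

    The histogram of codes over an image or region is the usual texture descriptor passed to a classifier.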

    Modelling visual search for surface defects

    Much work has been done on developing algorithms for automated surface defect detection. However, comparisons between these models and human perception are rarely carried out. This thesis aims to investigate how well human observers can find defects in textured surfaces, over a wide range of task difficulties. Stimuli for experiments will be generated using texture synthesis methods and human search strategies will be captured by use of an eye tracker. Two different modelling approaches will be explored. A computational LNL-based model will be developed and compared to human performance in terms of the number of fixations required to find the target. Secondly, a stochastic simulation, based on empirical distributions of saccades, will be compared to human search strategies.

    QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios

    Concerns about the security of individuals have justified the increasing number of surveillance cameras deployed in both private and public spaces. However, contrary to popular belief, these devices are in most cases used solely for recording, instead of feeding intelligent analysis processes capable of extracting information about the observed individuals. Thus, even though video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant details about the subjects that took part in a crime depends on the manual inspection of recordings. As such, the current goal of the research community is the development of automated surveillance systems capable of monitoring and identifying subjects in surveillance scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric recognition algorithms on data acquired from surveillance scenarios. In particular, we aim at designing a visual surveillance system capable of acquiring biometric data at a distance (e.g., face, iris or gait) without requiring human intervention in the process, as well as devising biometric recognition methods robust to the degradation factors resulting from the unconstrained acquisition process. Regarding the first goal, the analysis of the data acquired by typical surveillance systems shows that large acquisition distances significantly decrease the resolution of biometric samples, and thus their discriminability is not sufficient for recognition purposes. In the literature, diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way of acquiring high-resolution imagery at a distance, particularly when using a master-slave configuration. In the master-slave configuration, the video acquired by a typical surveillance camera is analyzed to obtain regions of interest (e.g., car, person) and these regions are subsequently imaged at high resolution by the PTZ camera.
Several methods have already shown that this configuration can be used for acquiring biometric data at a distance. Nevertheless, these methods failed to provide effective solutions to the typical challenges of this strategy, restraining its use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development of a biometric data acquisition system based on the cooperation of a PTZ camera with a typical surveillance camera. The first proposal is a camera calibration method capable of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ camera. The second proposal is a camera scheduling method for determining, in real time, the sequence of acquisitions that maximizes the number of different targets obtained while minimizing the cumulative transition time. In order to achieve the first goal of this thesis, both methods were combined with state-of-the-art approaches from the human monitoring field to develop a fully automated surveillance system capable of acquiring biometric data at a distance and without human cooperation, designated the QUIS-CAMPI system. The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis of the performance of state-of-the-art biometric recognition approaches shows that these approaches attain almost ideal recognition rates on unconstrained data. However, this performance is incongruous with the recognition rates observed in surveillance scenarios. Taking into account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a distance ranging from 5 to 40 meters and without human intervention in the acquisition process. This set allows an objective assessment of the performance of state-of-the-art biometric recognition methods on data that truly encompass the covariates of surveillance scenarios.
As such, this set was exploited for promoting the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained by the nine methods specially designed for this competition. In addition, the data acquired by the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the development of methods robust to the covariates of surveillance scenarios. The first proposal regards a method for detecting corrupted features in biometric signatures inferred by a redundancy analysis algorithm. The second proposal is a caricature-based face recognition approach capable of enhancing the recognition performance by automatically generating a caricature from a 2D photo. The experimental evaluation of these methods shows that both approaches contribute to improve the recognition performance in unconstrained data.A crescente preocupação com a segurança dos indivíduos tem justificado o crescimento do número de câmaras de vídeo-vigilância instaladas tanto em espaços privados como públicos. Contudo, ao contrário do que normalmente se pensa, estes dispositivos são, na maior parte dos casos, usados apenas para gravação, não estando ligados a nenhum tipo de software inteligente capaz de inferir em tempo real informações sobre os indivíduos observados. Assim, apesar de a vídeo-vigilância ter provado ser essencial na resolução de diversos crimes, o seu uso está ainda confinado à disponibilização de vídeos que têm que ser manualmente inspecionados para extrair informações relevantes dos sujeitos envolvidos no crime. Como tal, atualmente, o principal desafio da comunidade científica é o desenvolvimento de sistemas automatizados capazes de monitorizar e identificar indivíduos em ambientes de vídeo-vigilância. Esta tese tem como principal objetivo estender a aplicabilidade dos sistemas de reconhecimento biométrico aos ambientes de vídeo-vigilância. 
More specifically, the aim is 1) to design a video surveillance system capable of acquiring biometric data at long distances (e.g., face images, iris images, or gait videos) without requiring the cooperation of the individuals in the process; and 2) to develop biometric recognition methods robust to the degradation factors inherent to the data acquired by this type of system. Regarding the first goal, the analysis of the data acquired by typical video surveillance systems shows that, due to the capture distance, the sampled biometric traits are not sufficiently discriminative to guarantee acceptable recognition rates. In the literature, several works advocate the use of Pan Tilt Zoom (PTZ) cameras to acquire high-resolution images at a distance, particularly the use of these devices in master-slave mode. In the master-slave configuration, an intelligent analysis module selects regions of interest (e.g., cars, people) from the video acquired by a video surveillance camera, and the PTZ camera is then pointed at these regions to acquire them in high resolution. Several methods have already shown that this configuration can be used to acquire biometric data at a distance; nevertheless, they were unable to solve some of the problems related to this strategy, thus preventing its use in video surveillance environments. Accordingly, this thesis proposes two methods to enable the acquisition of biometric data in video surveillance environments using a PTZ camera assisted by a typical video surveillance camera. The first is a calibration method capable of accurately mapping the coordinates of the master camera to the angle of the PTZ (slave) camera without the aid of other optical devices. The second method determines the order in which a set of subjects will be observed by the PTZ camera.
The proposed method can determine, in real time, the sequence of observations that maximizes the number of different subjects observed while simultaneously minimizing the total transition time between subjects. To achieve the first goal of this thesis, the two proposed methods were combined with advances in the field of human monitoring to develop the first fully automated video surveillance system capable of acquiring biometric data at long distances without requiring the cooperation of the individuals in the process, designated the QUIS-CAMPI system. The QUIS-CAMPI system is the starting point for the research related to the second goal of this thesis. The analysis of the performance of state-of-the-art biometric recognition methods shows that they can attain almost perfect recognition rates on data acquired without constraints (e.g., recognition rates above 99% on the LFW dataset). However, this performance is not corroborated by the results observed in video surveillance environments, which suggests that current datasets do not truly contain the degradation factors typical of video surveillance environments. Taking into account the weaknesses of current biometric datasets, this thesis introduces a new biometric dataset (face images and gait videos) acquired by the QUIS-CAMPI system at a maximum distance of 40 m and without the cooperation of the subjects in the acquisition process. This dataset makes it possible to objectively evaluate the performance of state-of-the-art methods in recognizing individuals in images and videos captured in a real video surveillance environment. As such, this dataset was used to promote the first competition on biometric recognition in uncontrolled environments.
This thesis describes the evaluation protocols used, as well as the results obtained by nine methods specially designed for this competition. In addition, the data acquired by the QUIS-CAMPI system were essential for the development of two methods to increase robustness to the degradation factors observed in video surveillance environments. The first is a method for detecting corrupted features in biometric signatures through the analysis of redundancy between subsets of features. The second is a face recognition method based on caricatures automatically generated from a single photo of the subject. The experiments carried out show that both methods can reduce error rates on data acquired in an uncontrolled manner.