
    Population-Based Evolutionary Gaming for Unsupervised Person Re-identification

    Unsupervised person re-identification has achieved great success through the self-improvement of individual neural networks. However, limited by the lack of diversity in discriminative information, a single network has difficulty learning sufficient discrimination ability on its own under unsupervised conditions. To address this limitation, we develop a population-based evolutionary gaming (PEG) framework in which a population of diverse neural networks is trained concurrently through iterative selection, reproduction, mutation, and population mutual learning. Specifically, the selection of networks to preserve is modeled as a cooperative game and solved by best-response dynamics; reproduction and mutation are implemented by cloning networks and perturbing their hyper-parameters to increase diversity; and population mutual learning improves the discrimination of networks through knowledge distillation among members of the population. In addition, we propose a cross-reference scatter (CRS) to approximately evaluate re-ID models without labeled samples and adopt it as the criterion for network selection in PEG. CRS measures a model's performance by indirectly estimating the accuracy of its predicted pseudo-labels from the cohesion and separation of the feature space. Extensive experiments demonstrate that (1) CRS approximately measures the performance of models without labeled samples, and (2) PEG produces new state-of-the-art accuracy for person re-identification, indicating the great potential of population-based cooperative network training for unsupervised learning.
    Comment: Accepted in IJC
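    A minimal sketch of a cohesion/separation criterion in the spirit of CRS is shown below. The function name and the exact scoring formula are illustrative assumptions rather than the paper's definition; the sketch only shows how the compactness and separation of pseudo-label clusters in feature space can be used to rank unlabeled re-ID models.

```python
# Sketch of a cohesion/separation score for ranking re-ID models without labels.
# The formula (separation / cohesion over pseudo-label clusters) is an assumption,
# not the paper's exact CRS definition.
import numpy as np

def pseudo_label_scatter(features: np.ndarray, pseudo_labels: np.ndarray) -> float:
    """Higher is better: tight clusters (cohesion) that sit far apart (separation)."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)   # L2-normalize
    labels = np.unique(pseudo_labels[pseudo_labels >= 0])                # ignore outliers (-1)
    centroids = np.stack([feats[pseudo_labels == c].mean(axis=0) for c in labels])

    # Cohesion: mean distance of samples to their own cluster centroid (smaller = tighter).
    cohesion = np.mean([
        np.linalg.norm(feats[pseudo_labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(labels)
    ])

    # Separation: mean pairwise distance between cluster centroids (larger = better).
    diffs = centroids[:, None, :] - centroids[None, :, :]
    pair_dists = np.linalg.norm(diffs, axis=-1)
    separation = pair_dists[np.triu_indices(len(labels), k=1)].mean()

    return separation / (cohesion + 1e-12)

# Usage: score each candidate network's features and pseudo-labels, keep the best.
# scores = {name: pseudo_label_scatter(f, y) for name, (f, y) in candidates.items()}
```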

    Informationsrouting, Korrespondenzfindung und Objekterkennung im Gehirn (Information routing, correspondence finding, and object recognition in the brain)

    The dissertation deals with the general problem of how the brain can establish correspondences between neural patterns stored in different cortical areas. Although correspondence finding is an important capability in many cognitive domains, such as language understanding, abstract reasoning, and motor control, this thesis concentrates on invariant object recognition as its application. One part of the work presents a correspondence-based, neurally plausible system for face recognition. Other parts address the question of visual information routing over several stages by proposing optimal architectures for such routing ('switchyards') and deriving ontogenetic mechanisms for the growth of switchyards. Finally, the idea of multi-stage routing is united with the object recognition system introduced before, suggesting how the hitherto distinct feature-based and correspondence-based approaches to object recognition could be reconciled.
    Generally speaking, this thesis addresses the question of how the brain can find correspondences between activity patterns. This is a central topic in visual object recognition, but it matters for all areas of neural information processing, from hearing to abstract reasoning. Correspondence finding should be invariant to changes that alter the appearance, but not the meaning, of the patterns. It should also work when the two patterns are not connected directly but only via intermediate stages. Prerequisites for invariant correspondence finding between patterns are, on the one hand, the existence of suitable connection structures and, on the other hand, a fundamental neural mechanism for finding correspondences. Chapter 2 of the thesis is concerned with such a fundamental correspondence-finding mechanism. It is based on dynamic links between the points of the two patterns, which are activated by point-wise similarity of the patterns and global consistency with neighboring links. In multi-layer systems, dynamic links can be used not only for correspondence finding but also for controlled routing of information. Exploiting this property, Chapter 2 develops a face recognition system that is invariant to translation, robust to deformations, and achieves good performance on benchmark databases. Chapter 3 investigates the most economical way of connecting neural patterns such that there is a path from every point of one pattern to every point of the other, so that visual information can be routed between them. The total amount of required neural resources, i.e. both connections and feature-representing units in the intermediate layers, is minimized. This leads to multi-stage structures with widely spread but sparsely populated branchings, which we call switchyards. Interpreting the results shows that switchyards are compatible with the qualitative and quantitative properties of the primate brain, as far as these are known. Chapter 4 addresses the question of how such rather complicated neural connection structures can arise ontogenetically. A possible mechanism based on chemical markers is presented: the markers are produced by the units of the lowest layer and diffuse upward through the developing connections, and connections grow preferentially between units containing very dissimilar chemical markers. The resulting connection structures are nearly identical to the architectures derived analytically in Chapter 3 and are biologically even more plausible. Chapter 5 brings the ideas of the preceding chapters together in order to realize correspondence finding between patterns across multi-stage routing structures. It is shown how switchyards can be used to find correspondences between 'ordinary' visual patterns, even though initially none of the individual stages of the switchyard has patterns on both sides that could be matched against each other. The principle is then extended into a complete recognition system that, across several routing stages, assigns a given input pattern in a position-invariant way to one of several stored patterns.
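    The dynamic-link idea from Chapter 2 can be illustrated with a simple relaxation scheme: candidate links are seeded by point-wise feature similarity and then reinforced when they are consistent with neighboring links. The sketch below is an illustrative stand-in under those assumptions, not the thesis's actual link dynamics; all names and parameters are placeholders.

```python
# Toy correspondence finding via "dynamic links": similarity seeds the links,
# neighborhood consistency reinforces them, row-normalization enforces competition.
import numpy as np

def find_correspondences(feat_a, feat_b, pos_a, pos_b, iters=50, sigma=1.0):
    """feat_a: (N, d), feat_b: (M, d) features; pos_a: (N, 2), pos_b: (M, 2) grid positions.
    Returns, for each point of pattern A, the index of its matched point in pattern B."""
    sim = feat_a @ feat_b.T                               # point-wise similarity
    links = np.exp(sim - sim.max())                       # initial link strengths

    # Neighborhood kernels: nearby points should map to nearby points.
    na = np.exp(-np.sum((pos_a[:, None] - pos_a[None]) ** 2, -1) / (2 * sigma ** 2))
    nb = np.exp(-np.sum((pos_b[:, None] - pos_b[None]) ** 2, -1) / (2 * sigma ** 2))

    for _ in range(iters):
        support = na @ links @ nb                         # support from neighboring links
        links = links * support                           # reinforce consistent links
        links /= links.sum(axis=1, keepdims=True) + 1e-12 # competition among links per point

    return links.argmax(axis=1)                           # winning link per point of pattern A
```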

    Taming Wild Faces: Web-Scale, Open-Universe Face Identification in Still and Video Imagery

    With the increasing pervasiveness of digital cameras, the Internet, and social networking, there is a growing need to catalog and analyze large collections of photos and videos. In this dissertation, we explore unconstrained still-image and video-based face recognition in real-world scenarios, e.g. social photo sharing and movie trailers, where people of interest are recognized and all others are ignored. In such a scenario, we must obtain high precision in recognizing the known identities while accurately rejecting those of no interest. Recent advancements in face recognition research have seen Sparse Representation-based Classification (SRC) advance to the forefront of competing methods. However, its drawbacks, namely slow speed and sensitivity to variations in pose, illumination, and occlusion, have hindered its widespread applicability. The contributions of this dissertation are three-fold: 1. For still-image data, we propose a novel Linearly Approximated Sparse Representation-based Classification (LASRC) algorithm that uses linear regression to perform sample selection for l1-minimization, thus harnessing the speed of least-squares and the robustness of SRC. On our large dataset collected from Facebook, LASRC performs on par with standard SRC with a speedup of 100-250x. 2. For video, applying the popular l1-minimization for face recognition on a frame-by-frame basis is computationally prohibitive, so we propose a new algorithm, Mean Sequence SRC (MSSRC), that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face-track frames belong to the same individual. Employing MSSRC results in an average speedup of 5x over frame-by-frame SRC. 3. Finally, we observe that MSSRC sometimes assigns inconsistent identities to the same individual in a scene, which could be corrected based on visual similarity. Therefore, we construct a probabilistic affinity graph combining appearance and co-occurrence similarities to model the relationship between face tracks in a video. Using this relationship graph, we employ random walk analysis to propagate strong class predictions among similar face tracks while dampening weak predictions. Our method results in a performance gain of 15.8% in average precision over using MSSRC alone.
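    The core of the MSSRC idea, averaging a face track's frame features and classifying by the sparse reconstruction residual over the gallery, can be sketched roughly as below. The function name, the use of scikit-learn's Lasso solver, and the rejection heuristic are assumptions for illustration and do not reproduce the dissertation's joint optimization.

```python
# Sketch of Mean Sequence SRC: one sparse code for the whole face track,
# identity assigned by the smallest class-wise reconstruction residual.
import numpy as np
from sklearn.linear_model import Lasso

def mssrc_identify(track_feats, gallery, gallery_ids, alpha=0.01, reject_margin=0.1):
    """track_feats: (T, d) frame features; gallery: (G, d); gallery_ids: (G,).
    Returns a gallery identity, or None for an open-universe rejection."""
    y = track_feats.mean(axis=0)                          # mean feature of the track
    y /= np.linalg.norm(y) + 1e-12
    D = gallery / (np.linalg.norm(gallery, axis=1, keepdims=True) + 1e-12)

    lasso = Lasso(alpha=alpha, max_iter=5000)
    lasso.fit(D.T, y)                                     # sparse code over gallery samples
    x = lasso.coef_

    # Class-wise residuals: reconstruct y using only each identity's coefficients.
    residuals = {}
    for cid in np.unique(gallery_ids):
        mask = gallery_ids == cid
        residuals[cid] = np.linalg.norm(y - D[mask].T @ x[mask])

    ordered = sorted(residuals.values())
    # Reject weak, ambiguous matches instead of forcing a known identity.
    if len(ordered) > 1 and ordered[1] - ordered[0] < reject_margin:
        return None
    return min(residuals, key=residuals.get)
```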

    Detection of grapevine yellows symptoms in Vitis vinifera L. with artificial intelligence

    Grapevine yellows (GY) are a significant threat to grapes due to their severe symptoms and the lack of treatments. Conventional diagnosis of the phytoplasmas associated with GY relies on symptom identification, because of the sensitivity limits of diagnostic tools (e.g. real-time PCR) in asymptomatic vines, where the low concentration of the pathogen or its erratic distribution can lead to a high rate of false negatives. GY's primary symptoms are leaf discoloration and irregular wood ripening, which can easily be confused with symptoms of other diseases, making recognition difficult. Herein, we present a novel system, based on convolutional neural networks, for end-to-end detection of GY in red grapevine (cv. Sangiovese) using color images of leaf clippings. The diagnostic test detailed in this work does not require the user to be an expert at identifying GY. Data augmentation strategies make the system robust to alignment errors during data capture. When applied to the task of recognizing GY from digital images of leaf clippings, amongst many other diseases and a healthy control, the system has a sensitivity of 98.96% and a specificity of 99.40%. When recognizing GY from sight, deep learning achieves a positive predictive value (PPV) that is 35.97% better than a baseline system without deep learning and 9.88% better than trained humans. We evaluate six neural network architectures: AlexNet, GoogLeNet, Inception v3, ResNet-50, ResNet-101, and SqueezeNet, and find ResNet-50 to be the best compromise between accuracy and training cost. The trained neural networks, code to reproduce the experiments, and the leaf-clipping image data are available online. This work will advance the frontier of GY detection by improving detection speed, enabling a more effective response to the disease.
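    A pipeline of the kind described, fine-tuning an ImageNet-pretrained ResNet-50 on leaf-clipping images with augmentation to tolerate alignment errors, might look roughly like the sketch below. The dataset path, class layout, and hyper-parameters are placeholders, not the authors' released setup.

```python
# Minimal fine-tuning sketch with torchvision; paths and hyper-parameters are illustrative.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),   # tolerate framing/alignment errors
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: one sub-directory per class (GY, other diseases, healthy).
train_set = datasets.ImageFolder("leaf_clippings/train", transform=train_tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))  # replace the classifier head

opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```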
