19 research outputs found

Efficient face recognition based on sequential analysis of neural network descriptors and detection of minority classes

Author: Савченко А. В.
Соколова А. Д.
Publication date: 01/01/2023

This paper studies ways to improve face recognition accuracy by detecting input images that rarely occur in the datasets used to train neural network descriptors. Modern freely available training sets usually contain images of people who are mostly middle-aged and Caucasian, so most algorithms make errors on images of elderly people or children, faces of less common ethnicities, and so on. The paper proposes an algorithm that detects such data and then rejects them; its first stage uses a convolutional neural network pre-trained on a specially created dataset of rare data. The second stage applies sequential analysis of descriptors to improve the computational efficiency of classification. An experimental study on the VGGFace2 dataset using neural network descriptors, including modern InsightFace models, demonstrated the improved efficiency of the proposed algorithm...

Samara University

Balancing Biases and Preserving Privacy on Balanced Faces in the Wild

Author: Fu Yun
Henon Yann
Qin Can
Robinson Joseph P
Timoner Samson
Publication date: 21/11/2022

Demographic biases exist in current models used for facial recognition (FR). Our Balanced Faces in the Wild (BFW) dataset is a proxy to measure bias across ethnicity and gender subgroups, allowing one to characterize FR performances per subgroup. We show that results are non-optimal when a single score threshold determines whether sample pairs are genuine or imposters. Furthermore, within subgroups, performance often varies significantly from the global average. Thus, specific error rates only hold for populations matching the validation data. We mitigate the imbalanced performances using a novel domain adaptation learning scheme on the facial features extracted from state-of-the-art neural networks, boosting the average performance. The proposed method also preserves identity information while removing demographic knowledge. The removal of demographic knowledge prevents potential biases from being injected into decision-making and protects privacy since demographic information is no longer available. We explore the proposed method and show that subgroup classifiers can no longer learn from the features projected using our domain adaptation scheme. For source code and data, see https://github.com/visionjo/facerec-bias-bfw.
Comment: arXiv admin note: text overlap with arXiv:2102.0894

arXiv.org e-Print Archive
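
The abstract above notes that a single score threshold gives different error rates per subgroup. A minimal sketch of that effect, assuming toy similarity scores and hypothetical subgroup labels "A" and "B" (none of the names or numbers below come from the BFW paper or its repository):

```python
from collections import defaultdict


def false_accept_rate(scores, genuine_flags, threshold):
    """Fraction of imposter pairs whose score is at or above the threshold."""
    imposters = [s for s, g in zip(scores, genuine_flags) if not g]
    if not imposters:
        return 0.0
    return sum(s >= threshold for s in imposters) / len(imposters)


def per_subgroup_far(pairs, threshold):
    """pairs: iterable of (subgroup, score, is_genuine) -> {subgroup: FAR}."""
    grouped = defaultdict(lambda: ([], []))
    for subgroup, score, is_genuine in pairs:
        grouped[subgroup][0].append(score)
        grouped[subgroup][1].append(is_genuine)
    return {
        group: false_accept_rate(scores, flags, threshold)
        for group, (scores, flags) in grouped.items()
    }


# Hypothetical toy data: (subgroup, similarity score, is the pair genuine?)
pairs = [
    ("A", 0.9, True), ("A", 0.2, False), ("A", 0.3, False),
    ("B", 0.8, True), ("B", 0.6, False), ("B", 0.4, False),
]

# One global threshold, evaluated separately for each subgroup.
far = per_subgroup_far(pairs, threshold=0.5)
```

With this toy data the same threshold accepts no imposter pairs in subgroup "A" but half of those in subgroup "B", which is the kind of gap a per-subgroup evaluation is meant to expose.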

Detection of personal data in a photo album based on face clustering and text classification of scanned documents

Author: Копейкина Л. Н.
Савченко А. В.
Publication date: 01/01/2020

Samara University

Open-set face identification with automatic detection of out-of-distribution images

Author: Николенко С.И.
Савченко А.В.
Соколова А.Д.
Publication venue: Samara National Research University
Publication date: 01/10/2022

One of the main problems of modern neural network descriptors in face identification is the small number of training examples of certain types: low-quality images, varying scale or illumination, faces of children and elderly people, rare races. As a result, recognition accuracy is low for input images that differ from most of the images in the dataset used to tune the feature extraction method. This paper proposes a way to overcome this problem by automatically detecting atypical input images through a preliminary stage of automatic rejection. To this end, a dedicated convolutional network is used, trained on a dataset of rare data processed with well-known image transformation algorithms. To improve computational efficiency, the decision on whether an image is rare is made using the same face descriptor that is used in the classifier. An experimental study confirmed the accuracy advantages of the proposed approach for several face datasets and modern neural network descriptors. The research was funded by a grant from the Russian Science Foundation (project No. 20-71-10010). The research of Николенко С.И. was supported by Saint Petersburg State University, project No. 73555239 "Artificial Intelligence and Data Science: Theory, Technology, Industrial and Interdisciplinary Research and Applications".

Samara University

Self-supervised Face Representation Learning

Author: Sharma Vivek
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2020

This thesis investigates fine-tuning deep face features in a self-supervised manner for discriminative face representation learning, wherein we develop methods to automatically generate pseudo-labels for training a neural network. Most importantly, solving this problem helps us to advance the state-of-the-art in representation learning and can be beneficial to a variety of practical downstream tasks. Fortunately, there is a vast amount of videos on the internet that can be used by machines to learn an effective representation. We present methods that can learn a strong face representation from large-scale data, whether in the form of images or video. However, while learning a good representation using a deep learning algorithm requires a large-scale dataset with manually curated labels, we propose self-supervised approaches to generate pseudo-labels utilizing the temporal structure of the video data and similarity constraints to get supervision from the data itself. We aim to learn a representation that exhibits small distances between samples from the same person, and large inter-person distances in feature space. Metric learning can achieve this, as it comprises a pull term, pulling data points from the same class closer, and a push term, pushing data points from a different class further away. Metric learning for improving feature quality is useful but requires some form of external supervision to provide labels for the same or different pairs. In the case of face clustering in TV series, we may obtain this supervision from tracks and other cues. The tracking acts as a form of high-precision clustering (grouping detections within a shot) and is used to automatically generate positive and negative pairs of face images. Inspired by this, we propose two variants of discriminative approaches: Track-supervised Siamese network (TSiam) and Self-supervised Siamese network (SSiam).
In TSiam, we utilize the tracking supervision to obtain the pairs; additionally, we include negative training pairs for singleton tracks, that is, tracks that are not temporally co-occurring. As supervision from tracking may not always be available, to enable the use of metric learning without any supervision we propose an effective approach, SSiam, that can generate the required pairs automatically during training. In SSiam, we leverage dynamic generation of positive and negative pairs based on sorting distances (i.e. ranking) on a subset of frames and do not have to rely only on video/track-based supervision. Next, we present Clustering-based Contrastive Learning (CCL), a new clustering-based representation learning approach that utilizes automatically discovered partitions obtained from a clustering algorithm (FINCH) as weak supervision along with inherent video constraints to learn discriminative face features. As annotating datasets is costly and difficult, using label-free and weak supervision obtained from a clustering algorithm as a proxy learning task is promising. Through our analysis, we show that creating positive and negative training pairs using clustering predictions helps to improve the performance for video face clustering. We then propose Face Grouping on Graphs (FGG), a method for unsupervised fine-tuning of deep face feature representations. We utilize a graph structure with positive and negative edges over a set of face tracks based on the temporal structure of the video data and similarity-based constraints. Using graph neural networks, the features communicate over the edges, allowing each track's feature to exchange information with its neighbors, and thus push each representation in a direction in feature space that groups all representations of the same person together and separates representations of a different person.
Having developed these methods to generate weak labels for face representation learning, we next propose to learn a compact yet effective representation that encodes face tracks in videos into compact descriptors and can complement the previous methods towards learning a more powerful face representation. Specifically, we propose Temporal Compact Bilinear Pooling (TCBP) to encode the temporal segments in videos into a compact descriptor. TCBP possesses the ability to capture interactions between each element of the feature representation with one another over a long-range temporal context. We integrated our previous methods TSiam, SSiam and CCL with TCBP and demonstrated that TCBP has excellent capabilities in learning a strong face representation. We further show TCBP has exceptional transfer abilities to applications such as multimodal video clip representation that jointly encodes images, audio, video and text, and video classification. All of these contributions are demonstrated on benchmark video clustering datasets: The Big Bang Theory, Buffy the Vampire Slayer and Harry Potter 1. We provide extensive evaluations on these datasets achieving a significant boost in performance over the base features, and in comparison to the state-of-the-art results

KITopen
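
The pull and push terms mentioned in the thesis abstract above can be illustrated with a standard pairwise contrastive loss. This is a minimal sketch under stated assumptions: the function names, the Euclidean distance, and the `margin` value are illustrative choices, and the thesis may use a different formulation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_identity, margin=1.0):
    """Pairwise contrastive loss (assumed form, not taken from the thesis).

    Pull term: a same-identity pair is penalised by its squared distance.
    Push term: a different-identity pair is penalised only while it is
    closer than `margin`.
    """
    d = euclidean(a, b)
    if same_identity:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D features for illustration only.
pull = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=True)
push_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=False)
push_near = contrastive_loss([0.0, 0.0], [0.0, 0.5], same_identity=False)
```

Here a distant same-identity pair incurs a large loss, a distant different-identity pair incurs none, and a different-identity pair inside the margin is pushed apart.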

Introduction: Ways of Machine Seeing

Author: Azar M.
Cox G.
Impett L.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

How do machines, and, in particular, computational technologies, change the way we see the world? This special issue brings together researchers from a wide range of disciplines to explore the entanglement of machines and their ways of seeing from new critical perspectives. This 'editorial' is for a special issue of AI & Society, which includes contributions from: María Jesús Schultz Abarca, Peter Bell, Tobias Blanke, Benjamin Bratton, Claudio Celis Bueno, Kate Crawford, Iain Emsley, Abelardo Gil-Fournier, Daniel Chávez Heras, Vladan Joler, Nicolas Malevé, Lev Manovich, Nicholas Mirzoeff, Perle Møhl, Bruno Moreschi, Fabian Offert, Trevor Paglen, Jussi Parikka, Luciana Parisi, Matteo Pasquinelli, Gabriel Pereira, Carloalberto Treccani, Rebecca Uliasz, and Manuel van der Veen

LSBU Research Open

Image and Video Forensics

Publication venue: 'MDPI AG'
Publication date: 24/02/2022

Nowadays, images and videos have become the main modalities of information being exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on digital social platforms, producing a large volume of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity

Directory of Open Access Books (DOAB)