9 research outputs found

    Self-supervised Face Representation Learning

    This thesis investigates fine-tuning deep face features in a self-supervised manner for discriminative face representation learning, wherein we develop methods to automatically generate pseudo-labels for training a neural network. Most importantly, solving this problem helps us advance the state of the art in representation learning and can benefit a variety of practical downstream tasks. Fortunately, there is a vast amount of video on the internet that machines can use to learn an effective representation. We present methods that can learn a strong face representation from large-scale data, be it in the form of images or video. While learning a good representation with a deep learning algorithm normally requires a large-scale dataset with manually curated labels, we propose self-supervised approaches that generate pseudo-labels using the temporal structure of the video data and similarity constraints, obtaining supervision from the data itself. We aim to learn a representation that exhibits small distances between samples from the same person and large inter-person distances in feature space. Metric learning can achieve this, as it comprises a pull term, which pulls data points from the same class closer, and a push term, which pushes data points from different classes further apart. Metric learning is useful for improving feature quality but requires some form of external supervision to label pairs as same or different. In the case of face clustering in TV series, we may obtain this supervision from tracks and other cues. Tracking acts as a form of high-precision clustering (grouping detections within a shot) and is used to automatically generate positive and negative pairs of face images. Inspired by this, we propose two variants of discriminative approaches: Track-supervised Siamese network (TSiam) and Self-supervised Siamese network (SSiam).
In TSiam, we utilize the tracking supervision to obtain the pairs; additionally, we include negative training pairs for singleton tracks -- tracks that are not temporally co-occurring. As supervision from tracking may not always be available, we propose SSiam, an effective approach that generates the required pairs automatically during training and so enables metric learning without any supervision. In SSiam, we dynamically generate positive and negative pairs by sorting distances (i.e. ranking) on a subset of frames, and so do not have to rely only on video- or track-based supervision. Next, we present Clustering-based Contrastive Learning (CCL), a new clustering-based representation learning approach that uses automatically discovered partitions obtained from a clustering algorithm (FINCH) as weak supervision, along with inherent video constraints, to learn discriminative face features. As annotating datasets is costly and difficult, using label-free weak supervision obtained from a clustering algorithm as a proxy learning task is promising. Through our analysis, we show that creating positive and negative training pairs from clustering predictions helps improve performance for video face clustering. We then propose face grouping on graphs (FGG), a method for unsupervised fine-tuning of deep face feature representations. We build a graph with positive and negative edges over a set of face tracks, based on the temporal structure of the video data and similarity-based constraints. Using graph neural networks, the features communicate over the edges, allowing each track's feature to exchange information with its neighbors and thus pushing each representation in a direction in feature space that groups all representations of the same person together and separates representations of different people.
Having developed these methods to generate weak labels for face representation learning, we next propose to learn compact yet effective descriptors for face tracks in videos, which can complement the previous methods in learning a more powerful face representation. Specifically, we propose Temporal Compact Bilinear Pooling (TCBP) to encode the temporal segments in videos into a compact descriptor. TCBP can capture interactions between every element of the feature representation and every other element over a long-range temporal context. We integrated our previous methods TSiam, SSiam and CCL with TCBP and demonstrated that TCBP has excellent capabilities in learning a strong face representation. We further show that TCBP has exceptional transfer abilities to applications such as video classification and multimodal video clip representation that jointly encodes images, audio, video and text. All of these contributions are demonstrated on benchmark video clustering datasets: The Big Bang Theory, Buffy the Vampire Slayer and Harry Potter 1. We provide extensive evaluations on these datasets, achieving a significant boost in performance over the base features and in comparison to state-of-the-art results.
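The pull and push terms described in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the margin value, the squared-distance form of the loss, and the rule that temporally co-occurring tracks yield negative pairs are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_person, margin=1.0):
    """Generic contrastive loss (assumed form, not taken from the thesis).

    Pull term: positive pairs are penalised by their squared distance.
    Push term: negative pairs are penalised only while closer than `margin`.
    """
    d = euclidean(a, b)
    if same_person:
        return d ** 2
    return max(0.0, margin - d) ** 2


def pairs_from_tracks(tracks, co_occurring):
    """Build training pairs from face tracks (hypothetical helper).

    tracks: dict mapping track id -> list of feature vectors.
    co_occurring: iterable of (track_id, track_id) tuples for tracks that
        appear on screen at the same time; assumed to show different people.
    """
    positives = []
    for feats in tracks.values():
        # Any two faces within one track are treated as the same person.
        for i in range(len(feats)):
            for j in range(i + 1, len(feats)):
                positives.append((feats[i], feats[j]))
    negatives = []
    for t1, t2 in co_occurring:
        for f1 in tracks[t1]:
            for f2 in tracks[t2]:
                negatives.append((f1, f2))
    return positives, negatives
```

In a real pipeline the feature vectors would come from a pretrained face network and the loss would be minimised by gradient descent; the sketch only shows how the two terms treat positive and negative pairs differently.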

    GPT Semantic Networking: A Dream of the Semantic Web – The Time is Now

    The book presents research and practical implementations related to natural language processing (NLP) technologies based on artificial intelligence, generative AI, and Complex Networks, aimed at creating Semantic Networks. It covers the main principles of NLP, the training of models on large volumes of text data, and new universal and multi-purpose language processing systems. It shows how the combination of NLP and Semantic Network technologies opens up new horizons for text analysis, context understanding, and the formation of domain models, causal networks, etc. The book presents methods for creating Semantic Networks based on prompt engineering, along with practices that will help build semantic networks capable of solving complex problems and making revolutionary changes in analytical activity. The publication is intended for those who are going to use large language models for the construction and analysis of semantic networks in order to solve applied problems, in particular in the field of decision making.

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    On metrics and models for multiplex networks

    In this thesis, we extend the concept of null models as canonical ensembles of multi-graphs with given constraints and present new metrics able to characterize real-world layered systems based on their correlation patterns. We make extensive use of the maximum-entropy method in order to find the analytical expression of the expectation values of several topological quantities; furthermore, we employ the maximum-likelihood method to fit the models to real datasets. One of the main contributions of the present work is providing models and metrics that can be directly applied to real data. We introduce improved measures of overlap between layers of a multiplex and exploit such quantities to provide a new network reconstruction method applicable to multi-layer graphs. It turns out that this methodology, applicable to a specific class of multi-layer networks, can be successfully employed to reconstruct the World Trade Multiplex. Furthermore, we illustrate that the maximum-entropy models also allow us to find the so-called backbone of a real network, i.e. the information which is irreducible to the single-node properties and is therefore peculiar to the network itself. We conclude the thesis by moving our attention to a different dataset, namely the scientific publication system. Theoretical Physics

    25th Annual Computational Neuroscience Meeting: CNS-2016

    Abstracts of the 25th Annual Computational Neuroscience Meeting: CNS-2016, Seogwipo City, Jeju-do, South Korea. 2–7 July 2016

    25th annual computational neuroscience meeting: CNS-2016

    The same neuron may play different functional roles in the neural circuits to which it belongs. For example, neurons in the Tritonia pedal ganglia may participate in variable phases of the swim motor rhythms [1]. While such neuronal functional variability is likely to play a major role in the delivery of the functionality of neural systems, it is difficult to study in most nervous systems. We work on the pyloric rhythm network of the crustacean stomatogastric ganglion (STG) [2]. Typically, network models of the STG treat neurons of the same functional type (e.g. PD neurons) as a single model neuron, assuming the same conductance parameters for these neurons and implying their synchronous firing [3, 4]. However, simultaneous recording of PD neurons shows differences between the timings of spikes of these neurons. This may indicate functional variability of these neurons. Here we modelled the two PD neurons of the STG separately in a multi-neuron model of the pyloric network. Our neuron models comply with known correlations between conductance parameters of ionic currents. Our results reproduce the experimental finding of increasing spike time distance between spikes originating from the two model PD neurons during their synchronised burst phase. The PD neuron with the larger calcium conductance generates its spikes before the other PD neuron. Larger potassium conductance values in the follower neuron imply longer delays between spikes, see Fig. 17. Neuromodulators change the conductance parameters of neurons and maintain the ratios of these parameters [5]. Our results show that such changes may shift the individual contribution of the two PD neurons to the PD-phase of the pyloric rhythm, altering their functionality within this rhythm. Our work paves the way towards an accessible experimental and computational framework for the analysis of the mechanisms and impact of functional variability of neurons within the neural circuits to which they belong.

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    Collected Papers (on Physics, Artificial Intelligence, Health Issues, Decision Making, Economics, Statistics), Volume XI

    This eleventh volume of Collected Papers includes 90 papers comprising 988 pages on Physics, Artificial Intelligence, Health Issues, Decision Making, Economics, Statistics, written between 2001-2022 by the author alone or in collaboration with the following 84 co-authors (alphabetically ordered) from 19 countries: Abhijit Saha, Abu Sufian, Jack Allen, Shahbaz Ali, Ali Safaa Sadiq, Aliya Fahmi, Atiqa Fakhar, Atiqa Firdous, Sukanto Bhattacharya, Robert N. Boyd, Victor Chang, Victor Christianto, V. Christy, Dao The Son, Debjit Dutta, Azeddine Elhassouny, Fazal Ghani, Fazli Amin, Anirudha Ghosha, Nasruddin Hassan, Hoang Viet Long, Jhulaneswar Baidya, Jin Kim, Jun Ye, Darjan Karabašević, Vasilios N. Katsikis, Ieva Meidutė-Kavaliauskienė, F. Kaymarm, Nour Eldeen M. Khalifa, Madad Khan, Qaisar Khan, M. Khoshnevisan, Kifayat Ullah, Volodymyr Krasnoholovets, Mukesh Kumar, Le Hoang Son, Luong Thi Hong Lan, Tahir Mahmood, Mahmoud Ismail, Mohamed Abdel-Basset, Siti Nurul Fitriah Mohamad, Mohamed Loey, Mai Mohamed, K. Mohana, Kalyan Mondal, Muhammad Gulfam, Muhammad Khalid Mahmood, Muhammad Jamil, Muhammad Yaqub Khan, Muhammad Riaz, Nguyen Dinh Hoa, Cu Nguyen Giap, Nguyen Tho Thong, Peide Liu, Pham Huy Thong, Gabrijela Popović, Surapati Pramanik, Dmitri Rabounski, Roslan Hasni, Rumi Roy, Tapan Kumar Roy, Said Broumi, Saleem Abdullah, Muzafer Saračević, Ganeshsree Selvachandran, Shariful Alam, Shyamal Dalapati, Housila P. Singh, R. Singh, Rajesh Singh, Predrag S. Stanimirović, Kasan Susilo, Dragiša Stanujkić, Alexandra Şandru, Ovidiu Ilie Şandru, Zenonas Turskis, Yunita Umniyati, Alptekin Ulutaș, Maikel Yelandi Leyva Vázquez, Binyamin Yusoff, Edmundas Kazimieras Zavadskas, Zhao Loon Wang.