7 research outputs found

    Recognizing Characters in Art History Using Deep Learning

    Full text link
    In the field of Art History, images of artworks and their contexts are core to understanding the underlying semantic information. However, the highly complex and sophisticated representation of these artworks makes it difficult, even for experts, to analyze the scene. From the computer vision perspective, the task of analyzing such artworks can be divided into sub-problems by taking a bottom-up approach. In this paper, we focus on the problem of recognizing the characters in Art History. From the iconography of the Annunciation of the Lord (Figure 1), we consider the representation of the main protagonists, Mary and Gabriel, across different artworks and styles. We investigate and present the findings of training a character classifier on features extracted from their face images. The limitations of this method, and the inherent ambiguity in the representation of Gabriel, motivated us to analyze their bodies (a larger context) in order to recognize the characters. Convolutional Neural Networks (CNNs) trained on the bodies of Mary and Gabriel are able to learn person-related features and ultimately improve the performance of character recognition. We introduce a new technique that generates more data with similar styles, effectively creating data in a similar domain. We present experiments and analysis on three different models and show that the model trained on domain-related data gives the best performance for recognizing characters. Additionally, we analyze the localized image regions for the network predictions. Code is open-sourced and available at https://github.com/prathmeshrmadhu/recognize_characters_art_history and the link to the published peer-reviewed article is https://dl.acm.org/citation.cfm?id=3357242
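    The repository linked above is the authoritative implementation; purely as a hedged illustration of the transfer-learning setup the abstract describes, the sketch below fine-tunes an ImageNet-pretrained CNN as a two-class body classifier for Mary and Gabriel. The ResNet-50 backbone, the body_crops folder layout, and the hyperparameters are assumptions made for this example, not the authors' choices.

```python
# Minimal sketch (not the authors' released code): fine-tune an ImageNet-pretrained
# ResNet-50 on body crops of the two characters. Folder layout, backbone, and
# hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Hypothetical dataset layout: body_crops/train/{mary,gabriel}/*.jpg
train_set = datasets.ImageFolder("body_crops/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 2)  # two characters: Mary, Gabriel

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```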

    There Is a Digital Art History

    Full text link
    In this paper, we revisit Johanna Drucker's question, "Is there a digital art history?" -- posed exactly a decade ago -- in the light of the emergence of large-scale, transformer-based vision models. While more traditional types of neural networks have long been part of digital art history, and digital humanities projects have recently begun to use transformer models, their epistemic implications and methodological affordances have not yet been systematically analyzed. We focus our analysis on two main aspects that, together, seem to suggest a coming paradigm shift towards a "digital" art history in Drucker's sense. On the one hand, the visual-cultural repertoire newly encoded in large-scale vision models has an outsized effect on digital art history. The inclusion of significant numbers of non-photographic images allows for the extraction and automation of different forms of visual logics. Large-scale vision models have "seen" large parts of the Western visual canon mediated by Net visual culture, and they continuously solidify and concretize this canon through their already widespread application in all aspects of digital life. On the other hand, based on two technical case studies of utilizing a contemporary large-scale visual model to investigate basic questions from the fields of art history and urbanism, we suggest that such systems require a new critical methodology that takes into account the epistemic entanglement of a model and its applications. This new methodology reads its corpora through a neural model's training data, and vice versa: the visual ideologies of research datasets and training datasets become entangled.
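    The abstract does not name the model used in the two case studies, so the following is only a hedged sketch of the kind of zero-shot querying a contemporary large-scale vision model makes possible over an art-historical corpus; the CLIP checkpoint, prompts, and image path are assumptions for illustration.

```python
# Minimal sketch, assuming a CLIP-style model via Hugging Face transformers
# (the paper's actual model and corpus are not specified in the abstract).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("artwork.jpg")  # hypothetical reproduction from a research corpus
prompts = ["an Annunciation scene", "a cityscape", "a portrait"]

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity used as zero-shot scores over the candidate descriptions.
probs = outputs.logits_per_image.softmax(dim=-1)
for label, p in zip(prompts, probs[0]):
    print(f"{label}: {p:.3f}")
```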

    ARTIFICIAL INTELLIGENCE RATERS: NEURAL NETWORKS FOR RATING PICTORIAL EXPRESSION

    Get PDF
    Previous studies on the classification of fine art show that features of paintings can be captured and categorized using machine learning approaches. This progress can also benefit art psychology by facilitating data collection on artworks without the need to recruit experts as raters. In this study, a machine learning approach is used to predict the ratings of RizbA, a rating instrument for two-dimensional pictorial works. Based on a pre-trained model, the algorithm was fine-tuned via transfer learning on 886 pictorial works by contemporary professional artists and non-professionals. As a quality criterion, artificial intelligence raters (ART) are compared with generic raters (GR) created from the real human expert raters, using error rate and mean squared error (MSE). ART ratings have been found to have the same error range as randomly chosen human ratings. Therefore, they can be seen as equivalent to real human expert raters for almost all items in RizbA. Further training with more data will close the gap to the human raters on all items.
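    As a hedged illustration of the transfer-learning setup described above, the sketch below attaches a regression head to a pretrained backbone so that each output unit predicts one RizbA item, and evaluates predictions with a per-item MSE in the spirit of the paper's quality criterion. The item count, backbone, and training details are assumptions made for the example.

```python
# Minimal sketch (not the study's code): transfer learning for rating prediction.
import torch
import torch.nn as nn
from torchvision import models

NUM_ITEMS = 26  # hypothetical number of RizbA rating items

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_ITEMS)  # regression head

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)

def train_step(images, expert_ratings):
    """images: (B, 3, 224, 224); expert_ratings: (B, NUM_ITEMS) mean expert scores."""
    optimizer.zero_grad()
    loss = criterion(backbone(images), expert_ratings)
    loss.backward()
    optimizer.step()
    return loss.item()

def per_item_mse(pred, target):
    # Per-item error, comparable between AI raters and generic human raters.
    return ((pred - target) ** 2).mean(dim=0)
```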

    An analysis of the transfer learning of convolutional neural networks for artistic images

    Full text link
    Transfer learning from huge natural image datasets, fine-tuning of deep neural networks and the use of the corresponding pre-trained networks have become the de facto core of art analysis applications. Nevertheless, the effects of transfer learning are still poorly understood. In this paper, we first use techniques for visualizing the network's internal representations in order to provide clues to the understanding of what the network has learned on artistic images. Then, we provide a quantitative analysis of the changes introduced by the learning process, using metrics in both the feature and parameter spaces, as well as metrics computed on the set of maximal activation images. These analyses are performed on several variations of the transfer learning procedure. In particular, we observed that the network could specialize some pre-trained filters to the new image modality and also that higher layers tend to concentrate classes. Finally, we show that a double fine-tuning involving a medium-size artistic dataset can improve the classification on smaller datasets, even when the task changes. Comment: Accepted at Workshop on Fine Art Pattern Extraction and Recognition (FAPER), ICPR, 202
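    The abstract does not spell out the exact metrics, so the sketch below only shows one plausible way to quantify the changes introduced by fine-tuning: a per-layer distance between pretrained and fine-tuned weights (parameter space) and an activation distance on a probe image (feature space). The VGG-16 backbone, checkpoint path, and probe are assumptions for illustration.

```python
# Minimal sketch (assumptions, not the paper's exact metrics): compare a pretrained
# network with its fine-tuned counterpart in parameter space and feature space.
import torch
from torchvision import models

pretrained = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
finetuned = models.vgg16(weights=None).eval()
finetuned.load_state_dict(torch.load("vgg16_art_finetuned.pt"))  # hypothetical checkpoint

# Parameter-space change: relative L2 distance per convolutional weight tensor.
for (name, p0), (_, p1) in zip(pretrained.named_parameters(), finetuned.named_parameters()):
    if "features" in name and "weight" in name:
        rel = (torch.norm(p1 - p0) / torch.norm(p0)).item()
        print(f"{name}: relative change {rel:.3f}")

# Feature-space change: distance between intermediate activations for one probe image.
def features(model, x, upto=16):
    return model.features[:upto](x)

probe = torch.randn(1, 3, 224, 224)  # stand-in for an artistic probe image
with torch.no_grad():
    d = torch.norm(features(finetuned, probe) - features(pretrained, probe))
print("feature-space distance:", d.item())
```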

    Deep learning structure for directed graph data

    Get PDF
    Deep learning structures have achieved outstanding success in many different domains. Existing research has proposed many state-of-the-art deep neural networks to solve different learning tasks in various research fields such as speech processing and image recognition. Graph neural networks (GNNs) are considered a type of deep neural network, and the numerical representations they derive from the graph improve network performance. In real-world cases, data does not only come in the form of simple graphs; it can also contain direction information, resulting in so-called directed graph data. This thesis introduces and explains the first attempt in this domain to apply Singular Value Decomposition (SVD) to the adjacency matrix for graph convolutional neural networks, and proposes SVD-GCN. The thesis also utilizes framelet decomposition to better filter the graph signals, thereby improving the novel structure's performance on the node classification task and enhancing the robustness of the model when encountering high-level noise attacks. The thesis also applies the new model to link prediction tasks. All experimental results demonstrate SVD-GCN's outstanding performance in both node-level and edge-level learning tasks.
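    The thesis's precise formulation is not given in the abstract; the sketch below only illustrates the core idea of decomposing an asymmetric (directed) adjacency matrix with SVD and using the resulting factors as propagation operators in a graph-convolution-style layer. The layer design, rank truncation, and toy graph are assumptions made for illustration.

```python
# Minimal sketch of the SVD-on-adjacency idea for directed graphs
# (an illustrative layer, not the thesis's SVD-GCN implementation).
import torch
import torch.nn as nn

class SVDGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim, rank):
        super().__init__()
        self.weight_out = nn.Linear(in_dim, out_dim)
        self.weight_in = nn.Linear(in_dim, out_dim)
        self.rank = rank

    def forward(self, adj, x):
        # Truncated SVD of the (possibly asymmetric) adjacency matrix.
        U, S, Vh = torch.linalg.svd(adj)
        U, S, Vh = U[:, :self.rank], S[:self.rank], Vh[:self.rank, :]
        prop_out = U @ torch.diag(S) @ Vh     # propagation along edge direction
        prop_in = prop_out.T                  # propagation against edge direction
        return torch.relu(self.weight_out(prop_out @ x) + self.weight_in(prop_in @ x))

# Toy usage on a small random directed graph.
n, feat = 6, 8
adj = (torch.rand(n, n) < 0.3).float()
x = torch.randn(n, feat)
layer = SVDGraphConv(feat, out_dim=4, rank=3)
print(layer(adj, x).shape)  # torch.Size([6, 4])
```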