MCCFNet: multi-channel color fusion network for cognitive classification of traditional Chinese paintings.
The computational modeling and analysis of traditional Chinese painting (TCP) rely heavily on cognitive classification based on visual perception. This approach is crucial for understanding and identifying artworks created by different artists. However, the effective integration of visual perception into artificial intelligence (AI) models remains largely unexplored. In addition, research on classifying Chinese paintings faces open challenges, such as insufficient investigation of the image characteristics that are specific to paintings and relevant to author classification and recognition. To address these issues, we propose a novel framework called multi-channel color fusion network (MCCFNet), which extracts visual features from diverse color perspectives. By considering multiple color channels, MCCFNet enhances the ability of AI models to capture the intricate details and nuances present in Chinese painting. To improve the performance of the DenseNet model, we introduce a regional weighted pooling (RWP) strategy specifically designed for the DenseNet169 architecture. This strategy enhances the extraction of highly discriminative features. In our experimental evaluation, we compared the performance of the proposed MCCFNet model against six state-of-the-art models. The comparison was conducted on a dataset of 2436 TCP samples derived from the works of 10 renowned Chinese artists. The evaluation metrics were Top-1 Accuracy and the area under the curve (AUC). The experimental results show that the proposed MCCFNet model significantly outperforms all other benchmark methods, with the highest classification accuracy of 98.68%. Moreover, the classification accuracy of any deep learning model on TCP can be substantially improved by adopting the proposed framework.
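The idea of viewing one image from several color perspectives can be illustrated with a minimal sketch. The abstract does not say which color representations MCCFNet uses or how they are fused, so the choice of RGB, HSV, and grayscale, the function names `color_views` and `fuse`, and the simple concatenation are all assumptions made for illustration. In a real network each view would more plausibly feed its own feature extractor before fusion.

```python
import colorsys


def color_views(pixels):
    """Return several color representations of the same pixels.

    `pixels` is a list of (r, g, b) tuples with components in [0, 1].
    The three views chosen here (RGB, HSV, grayscale) are an assumption,
    not the set of channels used by MCCFNet.
    """
    rgb = [tuple(p) for p in pixels]
    hsv = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels]
    # ITU-R BT.601 luma weights for a single-channel grayscale view.
    gray = [(0.299 * r + 0.587 * g + 0.114 * b,) for r, g, b in pixels]
    return {"rgb": rgb, "hsv": hsv, "gray": gray}


def fuse(views):
    """Concatenate the channels of every view into one tuple per pixel.

    Views are taken in sorted name order so the layout is deterministic.
    """
    names = sorted(views)
    n_pixels = len(views[names[0]])
    return [sum((views[name][i] for name in names), ()) for i in range(n_pixels)]


if __name__ == "__main__":
    fused = fuse(color_views([(1.0, 0.0, 0.0), (0.2, 0.4, 0.6)]))
    print(len(fused[0]), "channels per pixel")
```

Here every pixel ends up with seven values (one grayscale, three HSV, three RGB), which is the raw-input analogue of combining features drawn from several color channels.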
Adding New Tasks to a Single Network with Weight Transformations using Binary Masks
Visual recognition algorithms are required today to exhibit adaptive
abilities. Given a deep model trained on a specific, given task, it would be
highly desirable to be able to adapt incrementally to new tasks, preserving
scalability as the number of new tasks increases, while at the same time
avoiding catastrophic forgetting issues. Recent work has shown that masking the
internal weights of a given original conv-net through learned binary variables
is a promising strategy. We build upon this intuition and take into account
more elaborate affine transformations of the convolutional weights that
include learned binary masks. We show that with our generalization it is
possible to achieve significantly higher levels of adaptation to new tasks,
enabling the approach to compete with fine tuning strategies by requiring
slightly more than 1 bit per network parameter per additional task. Experiments
on two popular benchmarks showcase the power of our approach, which achieves a
new state of the art on the Visual Decathlon Challenge.
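A minimal sketch can show how a binary mask combined with an affine transformation yields task-specific weights from a shared network, and why the per-task cost stays close to 1 bit per parameter. The abstract does not give the exact transformation, so the form `k0*w + k1 + k2*m*w`, the scalar names, and the thresholding rule below are assumptions for illustration only.

```python
def binarize(real_mask, threshold=0.0):
    """Turn real-valued mask parameters into a 0/1 mask by hard thresholding."""
    return [1 if m > threshold else 0 for m in real_mask]


def task_weights(base, mask, k0, k1, k2):
    """Apply an assumed affine transformation to frozen base weights.

    Each task-specific weight is k0*w + k1 + k2*m*w, where w is the shared
    base weight and m is the task's binary mask entry. Setting k0=0, k1=0,
    k2=1 reduces this to plain masking of the base weights.
    """
    return [k0 * w + k1 + k2 * m * w for w, m in zip(base, mask)]


def extra_bits_per_param(n_params, n_scalars, scalar_bits=32):
    """Per-task storage: one bit per mask entry plus a few full-precision scalars."""
    return (n_params + n_scalars * scalar_bits) / n_params


if __name__ == "__main__":
    base = [0.5, -1.0, 0.25, 2.0]             # frozen weights of the original network
    mask = binarize([0.3, -0.2, 0.7, -0.9])   # mask learned for the new task
    print(task_weights(base, mask, k0=1.0, k1=0.1, k2=-0.5))
    print(extra_bits_per_param(n_params=1000, n_scalars=3))
```

Because only the mask and a handful of scalars are stored for each new task, the base weights never change, which is what avoids catastrophic forgetting on earlier tasks.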
Cross Pixel Optical Flow Similarity for Self-Supervised Learning
We propose a novel method for learning convolutional neural image
representations without manual supervision. We use motion cues, in the form of
optical flow, to supervise representations of static images. The obvious
approach of training a network to predict flow from a single image can be
needlessly difficult due to intrinsic ambiguities in this prediction task. We
instead propose a much simpler learning goal: embed pixels such that the
similarity between their embeddings matches that between their optical flow
vectors. At test time, the learned deep network can be used without access to
video or flow information and transferred to tasks such as image
classification, detection, and segmentation. Our method, which significantly
simplifies previous attempts at using motion for self-supervision, achieves
state-of-the-art results in self-supervision using motion cues, competitive
results for self-supervision in general, and is overall state of the art in
self-supervised pretraining for semantic image segmentation, as demonstrated on
standard benchmarks.
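The learning goal described above, matching pairwise similarities of pixel embeddings to pairwise similarities of flow vectors, can be sketched in a few lines. The abstract does not specify the similarity function or the loss, so cosine similarity and a mean squared difference are assumptions here, as are the function names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarities between all vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


def similarity_matching_loss(embeddings, flows):
    """Mean squared gap between embedding similarities and flow similarities.

    `embeddings[i]` and `flows[i]` belong to the same pixel. The embedding
    dimension may differ from the flow dimension, since only pairwise
    similarities are compared.
    """
    sim_e = similarity_matrix(embeddings)
    sim_f = similarity_matrix(flows)
    n = len(embeddings)
    total = sum((sim_e[i][j] - sim_f[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


if __name__ == "__main__":
    flows = [(1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]  # pixels 0 and 2 move the same way
    good = [(0.0, 3.0, 0.0), (4.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    bad = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
    print(similarity_matching_loss(good, flows))
    print(similarity_matching_loss(bad, flows))
```

The embeddings never have to reproduce the flow itself. They only need to place pixels that move together close to each other, which sidesteps the ambiguity of predicting flow from a single image.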
A Data Set and a Convolutional Model for Iconography Classification in Paintings
Iconography in art is the discipline that studies the visual content of
artworks to determine their motifs and themes and to characterize the way these
are represented. It is a subject of active research for a variety of purposes,
including the interpretation of meaning, the investigation of the origin and
diffusion in time and space of representations, and the study of influences
across artists and art works. With the proliferation of digital archives of art
images, the possibility arises of applying Computer Vision techniques to the
analysis of art images at an unprecedented scale, which may support iconography
research and education. In this paper we introduce a novel paintings data set
for iconography classification and present the quantitative and qualitative
results of applying a Convolutional Neural Network (CNN) classifier to the
recognition of the iconography of artworks. The proposed classifier achieves
good performance (71.17% Precision, 70.89% Recall, 70.25% F1-Score and 72.73%
Average Precision) in the task of identifying saints in Christian religious
paintings, a task made difficult by the presence of classes with very similar
visual features. Qualitative analysis of the results shows that the CNN focuses
on the traditional iconic motifs that characterize the representation of each
saint and exploits such hints to attain correct identification. The ultimate
goal of our work is to enable the automatic extraction, decomposition, and
comparison of iconography elements to support iconographic studies and
automatic art work annotation.
Comment: Published at ACM Journal on Computing and Cultural Heritage (JOCCH)
https://doi.org/10.1145/345888
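For readers less familiar with the metrics quoted in this abstract, the following sketch computes precision, recall, and F1 per class and then averages them. The abstract does not state how the reported figures were averaged, so macro averaging is an assumption, and the labels in the example are placeholders rather than classes from the data set.

```python
def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    y_true = ["a", "a", "b", "b"]
    y_pred = ["a", "b", "b", "b"]
    print(macro_prf(y_true, y_pred))
```

With per-class averaging, the averaged F1 is not the harmonic mean of the averaged precision and recall and can fall below both. That would be consistent with the reported F1-Score of 70.25% sitting under both the 71.17% Precision and the 70.89% Recall.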