153 research outputs found

    Recognizing the artistic style of fine art paintings with deep learning for an augmented reality application

    The rapid digitalization of artwork collections in libraries, museums, galleries, and art centers has made manual handling of their content impractical, driving growing interest in autonomous systems that can understand art concepts and categorize fine art paintings. However, automatic categorization poses significant challenges because the interpretation and perception of art elements are subjective and the task relies on accurate annotations provided by art experts. Because deep learning and computer vision techniques have shown remarkable performance in automating painting classification in recent years, this research aims to develop efficient deep learning systems that automatically classify the artistic style of fine art paintings. In this thesis, we investigate the effectiveness of seven pre-trained EfficientNet models for identifying the style of a painting and propose custom models based on pre-trained EfficientNet architectures. In addition, we analyze the impact of deep retraining of the last eight layers on the performance of the custom models. Experimental results on the standard fine art painting classification dataset Painting-91 indicate that deep retraining of the last eight layers of the custom models yields the best performance, achieving a 5% improvement over the base models. This demonstrates the effectiveness of leveraging pre-trained EfficientNet models for automatic artistic style identification in paintings. Moreover, the study presents a framework that compares the performance of six pre-trained convolutional neural networks (Xception, ResNet50, InceptionV3, InceptionResNetV2, DenseNet121, and EfficientNet B3) for identifying artistic styles in paintings. Notably, the Xception architecture is employed for this purpose for the first time.
Furthermore, the impact of three optimizers (SGD, RMSprop, and Adam) and two learning rates (1e-2 and 1e-4) on model performance is studied using transfer learning. Experiments on two art classification datasets, Pandora18k and Painting-91, revealed that InceptionResNetV2 achieves the highest accuracy for style classification on both datasets when trained with the Adam optimizer and a learning rate of 1e-4. Integrating deep learning algorithms and transfer learning techniques into fine art painting analysis and classification offers promising avenues for automating style identification tasks. The proposed models and findings contribute to the development of automatic methods that enable the art community to efficiently analyze and categorize the vast number of digital paintings available on the internet.
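
The comparison described in this abstract can be pictured as a grid of training configurations. The sketch below is a minimal illustration, not the authors' code: it assumes every architecture was paired with every optimizer and learning rate (the abstract does not state this), and the names `experiment_grid` and `best_run` are hypothetical helpers introduced only for this example.

```python
from itertools import product

# Values taken from the abstract; the full cross product is an assumption.
ARCHITECTURES = [
    "Xception", "ResNet50", "InceptionV3",
    "InceptionResNetV2", "DenseNet121", "EfficientNetB3",
]
OPTIMIZERS = ["SGD", "RMSprop", "Adam"]
LEARNING_RATES = [1e-2, 1e-4]


def experiment_grid():
    """Enumerate one configuration per (architecture, optimizer, learning rate)."""
    return [
        {"architecture": arch, "optimizer": opt, "learning_rate": lr}
        for arch, opt, lr in product(ARCHITECTURES, OPTIMIZERS, LEARNING_RATES)
    ]


def best_run(accuracies):
    """Return the configuration key with the highest accuracy.

    `accuracies` is a hypothetical mapping from
    (architecture, optimizer, learning_rate) tuples to validation accuracy.
    """
    return max(accuracies, key=accuracies.get)


if __name__ == "__main__":
    grid = experiment_grid()
    print(len(grid))  # 6 architectures x 3 optimizers x 2 learning rates = 36
```

Under this assumption, the configuration the abstract reports as best (InceptionResNetV2, Adam, 1e-4) is one of the 36 entries in the grid.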

    Computational Emotion Analysis From Images: Recent Advances and Future Directions

    Images often evoke emotions in humans. Recently, extensive research efforts have been dedicated to understanding the emotions of images. In this chapter, we introduce image emotion analysis (IEA) from a computational perspective, with a focus on summarizing recent advances and suggesting future directions. We begin with commonly used emotion representation models from psychology. We then define the key computational problems that researchers have been trying to solve and provide the supervised frameworks that are generally used for different IEA tasks. After introducing the major challenges in IEA, we present some representative methods for emotion feature extraction, supervised classifier learning, and domain adaptation. Furthermore, we introduce available datasets for evaluation and summarize some main results. Finally, we discuss some open questions and future directions that researchers can pursue. Comment: Accepted chapter in the book "Human Perception of Visual Information Psychological and Computational Perspective".

    Affective Image Content Analysis: Two Decades Review and New Perspectives

    Images can convey rich semantics and induce various emotions in viewers. Recently, with the rapid advancement of emotional intelligence and the explosive growth of visual data, extensive research efforts have been dedicated to affective image content analysis (AICA). In this survey, we comprehensively review the development of AICA over the past two decades, focusing especially on state-of-the-art methods with respect to three main challenges: the affective gap, perception subjectivity, and label noise and absence. We begin with an introduction to the key emotion representation models that have been widely employed in AICA and a description of the available datasets for evaluation, with a quantitative comparison of label noise and dataset bias. We then summarize and compare the representative approaches on (1) emotion feature extraction, including both handcrafted and deep features, (2) learning methods for dominant emotion recognition, personalized emotion prediction, emotion distribution learning, and learning from noisy data or few labels, and (3) AICA-based applications. Finally, we discuss some challenges and promising future research directions, such as image content and context understanding, group emotion clustering, and viewer-image interaction. Comment: Accepted by IEEE TPAM

    A solution to the hyper complex, cross domain reality of artificial intelligence: The hierarchy of AI

    Artificial Intelligence (AI) is an umbrella term used to describe machine-based forms of learning. It can encapsulate anything from Siri, Apple's smartphone-based assistant, to Tesla's autonomous vehicles (self-driving cars). At present, there are no set criteria to classify AI, and the implications include public uncertainty, corporate scepticism, diminished confidence, insufficient funding and limited progress. Substantial challenges currently exist with AI, such as combinatorially large search spaces, prediction errors against ground-truth values, and the use of quantum error correction strategies. These are discussed in addition to fundamental data issues across collection, sample error and quality. The concept of cross realms and domains used to inform AI is considered. Furthermore, there is the issue of the confusing range of current AI labels. This paper aims to provide a more consistent form of classification, to be used by institutions and organisations alike as they endeavour to make AI part of their practice. In turn, this seeks to promote transparency and increase trust. This has been done through primary research, including a panel of data scientists and experts in the field, and through a literature review of existing research. The authors propose a model solution: the Hierarchy of AI.

    14th Conference on DATA ANALYSIS METHODS for Software Systems

    DAMSS-2023 is the 14th International Conference on Data Analysis Methods for Software Systems, held in Druskininkai, Lithuania. The conference takes place every year at the same venue and time. The exception was 2020, when the world was gripped by the Covid-19 pandemic and the movement of people was severely restricted. After a year's break, the conference was back on track, and the next conference succeeded in achieving its primary goal of lively scientific communication. The conference focuses on live interaction among participants. To make communication among participants more efficient, most of the presentations are poster presentations. This format has proven to be highly effective. However, there are several oral sessions, too. The history of the conference dates back to 2009, when 16 papers were presented. It began as a workshop and has evolved into a well-known conference. The idea of such a workshop originated at the Institute of Mathematics and Informatics, now the Institute of Data Science and Digital Technologies of Vilnius University. The Lithuanian Academy of Sciences and the Lithuanian Computer Society supported this idea, which gained enthusiastic acceptance from both the Lithuanian and international scientific communities. This year's conference features 84 presentations, with 137 registered participants from 11 countries. The conference serves as a gathering point for researchers from six Lithuanian universities, making it the main annual meeting for Lithuanian computer scientists. The primary aim of the conference is to showcase research conducted at Lithuanian and foreign universities in the fields of data science and software engineering. The annual organization of the conference facilitates the rapid exchange of new ideas within the scientific community. Seven IT companies supported the conference this year, indicating the relevance of the conference topics to the business sector.
In addition, the conference is supported by the Lithuanian Research Council and the National Science and Technology Council (Taiwan, R.O.C.). The conference covers a wide range of topics, including Applied Mathematics, Artificial Intelligence, Big Data, Bioinformatics, Blockchain Technologies, Business Rules, Software Engineering, Cybersecurity, Data Science, Deep Learning, High-Performance Computing, Data Visualization, Machine Learning, Medical Informatics, Modelling Educational Data, Ontological Engineering, Optimization, Quantum Computing, and Signal Processing. This book provides an overview of all presentations from the DAMSS-2023 conference.

    Morphological Analysis for Object Recognition, Matching, and Applications

    This thesis deals with the detection and classification of objects in visual images and with the analysis of shape changes between object instances. Whereas the task of object recognition focuses on learning models which describe common properties between instances of a specific category, the analysis of the specific differences between instances is also relevant to understanding the objects and the categories themselves. This research is governed by the idea that important properties for the automatic perception and understanding of objects are transmitted through their geometry or shape. Therefore, models for object recognition and shape matching are devised which exploit the geometry and properties of the objects, using as little user supervision as possible. In order to learn object models for detection in a reliable manner, suitable object representations are required. The key idea in this work is to use a richer representation of the object shape within the object model in order to increase the descriptive power and thus the performance of the whole system. For this purpose, we first investigate the integration of curvature information of shapes into the learned object model. Since natural objects intrinsically exhibit curved boundaries, an object is better described if this shape cue is integrated. This work extends the widely used object representation based on gradient orientation histograms by incorporating a robust histogram-based description of curvature. We show that integrating this information substantially improves detection results over descriptors that rely solely upon histograms of oriented gradients. The impact of using richer shape representations for object recognition is further investigated through a novel method which goes beyond traditional bounding-box representations for objects. Visual recognition requires learning object models from training data.
Commonly, training samples are annotated by marking only the bounding box of objects, since this appears to be the best trade-off between labeling information and effectiveness. However, objects are typically not box-shaped. Thus, the usual parametrization of objects using a bounding box seems inappropriate, since such a box contains a significant amount of background clutter. Therefore, the presented approach learns object models for detection while simultaneously learning to segregate objects from clutter and extracting their overall shape, without, however, requiring manual segmentation of the training samples. Shape equivalence is another interesting property related to shape. It refers to the ability to perceive two distinct objects as having the same or similar shape. This thesis also explores the use of this ability to detect objects in unsupervised scenarios, that is, where no annotation of training data is available for learning a statistical model. For this purpose, a dataset of historical Chinese cartoons drawn during the Cultural Revolution and immediately thereafter is analyzed. Relevant objects in this dataset are emphasized through annuli of light rays. The idea of our method is to consider the different annuli as shape-equivalent objects, that is, as objects sharing the same shape, and to devise a method to detect them. Thereafter, it is possible to indirectly infer the position, size and scale of the emphasized objects using the annuli detections. Not only commonalities among objects, but also the specific differences between them are perceived by a visual system. These differences can be understood through the analysis of how objects and their shape change. For this reason, this thesis also develops a novel methodology for analyzing the shape deformation between a single pair of images under missing correspondences.
The key observation is that objects cannot deform arbitrarily; rather, the deformation follows the geometry and constraints imposed by the object itself. We describe the overall complex object deformation using a piecewise linear model. Thereby, we are able to identify the parts of the shape which share the same deformation. Thus, we are able to understand how an object and its parts were transformed. A remarkable property of the algorithm is its ability to automatically estimate the model complexity according to the overall complexity of the shape deformation. Specifically, the introduced methodology is used to analyze the deformation between original instances and reproductions of artworks. The nature of the analyzed alterations ranges from deliberate modifications by the artist to geometrical errors accumulated during the reproduction process of the image. The use of this method within this application shows how productive the interaction between computer vision and the humanities is. The goal is not to supplant human expertise, but to enhance and deepen connoisseurship about a given problem.

    Convergence of Intelligent Data Acquisition and Advanced Computing Systems

    This book is a collection of published articles from the Sensors Special Issue on "Convergence of Intelligent Data Acquisition and Advanced Computing Systems". It includes extended versions of the conference contributions from the 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS’2019), Metz, France, as well as external contributions.

    Natural Language Processing using Deep Learning in Social Media

    In recent years, Deep Learning (DL) has revolutionised the potential of automatic systems that handle Natural Language Processing (NLP) tasks. We have witnessed a tremendous advance in the performance of these systems. Nowadays, such systems are embedded ubiquitously, determining the intent of the text we write, the sentiment of our tweets or our political views, to cite some examples. In this thesis, we propose several NLP models for tasks that deal with social media text. Concretely, this work focuses mainly on Sentiment Analysis and Personality Recognition. Sentiment Analysis, one of the leading problems in NLP, consists of determining the polarity of a text; it is a well-known task with a vast number of proposed resources and models. In contrast, Personality Recognition is a breakthrough task that aims to determine users' personality from their writing style; it is a more niche task with fewer dedicated resources, but with great potential. Although the principal focus of this work was the development of Deep Learning models, we also proposed models based on linguistic resources and classical Machine Learning models. In this more straightforward setup, we explored the nuances of different language devices, such as the impact of emotions on the correct classification of the sentiment expressed in a text. Afterwards, DL models were developed, particularly Convolutional Neural Networks (CNNs), to address the previously described tasks. In the case of Personality Recognition, we explored both approaches, which allowed us to compare the models under the same conditions. Notably, NLP has evolved dramatically in recent years through the development of public evaluation campaigns, where multiple research teams compare the performance of their approaches under the same conditions. Most of the models presented here were either assessed in a public evaluation campaign or used the setup of one. Recognising the importance of this effort, we curated and developed an evaluation campaign for classifying political tweets. In addition, as this work advanced, we decided to study in depth CNNs applied to NLP tasks. Two lines of work were explored in this regard. Firstly, we proposed a semantic-based padding method for CNNs, which addresses how to represent text more appropriately for solving NLP tasks. Secondly, a theoretical framework was introduced for tackling one of the most frequent criticisms of Deep Learning: the lack of interpretability. This framework seeks to visualise what lexical patterns, if any, the CNN is learning in order to classify a sentence. In summary, the main achievements presented in this thesis are:
    - The organisation of an evaluation campaign for Topic Classification of texts gathered from social media.
    - The proposal of several Machine Learning models tackling the Sentiment Analysis task on social media, together with a study of the impact of linguistic devices such as figurative language on the task.
    - The development of a model for inferring the personality of a developer given the source code that they have written.
    - The study of Personality Recognition tasks on social media following two different approaches, models based on machine learning algorithms with handcrafted features and models based on CNNs, and a comparison of both approaches.
    - The introduction of new semantic-based paddings for optimising how text is represented in CNNs.
    - The definition of a theoretical framework to provide interpretable information about what CNNs learn internally.
    Giménez Fayos, MT. (2021). Natural Language Processing using Deep Learning in Social Media [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/172164