
    Real-time logo detection in brand-related social media images

    This paper presents the use of deep convolutional neural networks (CNNs) for real-time logo detection in brand-related social media images. The final goal is to facilitate searching and discovering user-generated content (UGC) with potential value for digital marketing tasks. The images are captured in real time and automatically annotated with two CNNs designed for object detection: SSD InceptionV2 and Faster Atrous InceptionV4 (which provides better performance on small objects). We report experiments with two real brands, Estrella Damm and Futbol Club Barcelona. We examine the impact of different configurations and derive conclusions aiming to pave the way towards systematic and optimized methodologies for automatic logo detection in UGC. This work is partially supported by the Spanish Ministry of Economy and Competitiveness under contract TIN2015-65316-P and by the SGR programme (2014-SGR-1051 and 2017-SGR-962) of the Catalan Government.
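    Logo detectors such as the two models above are typically scored by matching predicted bounding boxes to annotations via Intersection-over-Union (IoU). The abstract does not give the paper's evaluation code, so the following is only a minimal illustrative sketch of the standard IoU computation, with made-up example boxes:

    ```python
    def iou(box_a, box_b):
        """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
        x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
        x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)          # overlap area
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter)

    # A predicted logo box is usually counted as a hit when IoU >= 0.5.
    pred, truth = (10, 10, 50, 50), (20, 20, 60, 60)       # hypothetical boxes
    print(round(iou(pred, truth), 3))
    ```

    Small logos produce small boxes, where a few pixels of misalignment cost a large fraction of IoU, which is one reason the atrous variant's behaviour on small objects matters.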

    Improving Facial Emotion Recognition with Image processing and Deep Learning

    Humans often use facial expressions along with words in order to communicate effectively. There has been extensive study of how we can classify facial emotion with computer vision methodologies. These have had varying levels of success given challenges and the limitations of databases, such as static data or facial capture in non-real environments. Given this, we believe that new preprocessing techniques are required to improve the accuracy of facial detection models. In this paper, we propose a new yet simple method for facial expression recognition that enhances accuracy. We conducted our experiments on the FER-2013 dataset, which contains static facial images. We utilized Unsharp Mask and Histogram Equalization to emphasize the texture and details of the images. We implemented Convolutional Neural Networks (CNNs) to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. We also employed pre-trained models such as ResNet-50, SENet-50, VGG16, and FaceNet, and applied transfer learning to achieve an accuracy of 76.01% using an ensemble of seven models.
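    The two preprocessing steps named above are standard image operations. The paper's own implementation is not shown, so the following is a NumPy-only sketch of both on a grayscale image, using a simple 3x3 box blur as a stand-in for the Gaussian blur usually inside an unsharp mask:

    ```python
    import numpy as np

    def histogram_equalization(img):
        """Spread the intensities of a uint8 grayscale image over 0-255."""
        hist = np.bincount(img.ravel(), minlength=256)
        cdf = hist.cumsum()
        cdf_min = cdf[cdf > 0][0]
        lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
        return lut[img]                      # remap every pixel via the lookup table

    def box_blur(img):
        """3x3 mean filter with edge padding (stand-in for a Gaussian blur)."""
        p = np.pad(img.astype(np.float64), 1, mode="edge")
        h, w = img.shape
        return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

    def unsharp_mask(img, amount=1.0):
        """Sharpen by adding back the high-frequency residual (img - blurred)."""
        sharp = img + amount * (img - box_blur(img))
        return np.clip(sharp, 0, 255).astype(np.uint8)

    # Demo on a synthetic low-contrast gradient "face crop"
    img = np.tile(np.linspace(100, 150, 64, dtype=np.uint8), (64, 1))
    out = unsharp_mask(histogram_equalization(img))
    print(out.shape, int(out.min()), int(out.max()))
    ```

    Equalization stretches a narrow intensity band (here 100-150) to the full range, and the unsharp mask then amplifies local edges, which is the "texture and details" emphasis the abstract describes.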

    Investigating the Robustness of Sequential Recommender Systems Against Training Data Perturbations: an Empirical Study

    Sequential Recommender Systems (SRSs) have been widely used to model user behavior over time, but their robustness in the face of perturbations to training data is a critical issue. In this paper, we conduct an empirical study to investigate the effects of removing items at different positions within a temporally ordered sequence. We evaluate two different SRS models on multiple datasets, measuring their performance using Normalized Discounted Cumulative Gain (NDCG) and Rank Sensitivity List metrics. Our results demonstrate that removing items at the end of the sequence significantly impacts performance, with NDCG decreasing by up to 60%, while removing items from the beginning or middle has no significant effect. These findings highlight the importance of considering the position of the perturbed items in the training data and should inform the design of more robust SRSs.
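    NDCG, the headline metric above, rewards placing relevant items near the top of a ranked list, with a logarithmic discount by rank. As a minimal sketch (not the paper's evaluation code), for binary relevance:

    ```python
    import math

    def dcg(relevances):
        """Discounted Cumulative Gain: relevance discounted by log2(rank + 2)."""
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

    def ndcg(ranked_relevances):
        """Normalize DCG by the ideal (sorted) ordering so scores fall in [0, 1]."""
        ideal = dcg(sorted(ranked_relevances, reverse=True))
        return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

    # Ranking the single relevant item first scores 1.0; demoting it to
    # rank 3 halves the score -- the kind of drop the perturbation study measures.
    print(ndcg([1, 0, 0]))  # 1.0
    print(ndcg([0, 0, 1]))  # 0.5
    ```

    A 60% NDCG drop therefore means the relevant items have been pushed far down the recommendation lists, not merely shuffled near the top.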

    A Study on Human Face Expressions using Convolutional Neural Networks and Generative Adversarial Networks

    Human beings express themselves via words, signs, gestures, and facial emotions. Previous research using pre-trained convolutional models was done by freezing the entire network and running the models without the use of any image processing techniques. In this research, we attempt to enhance the accuracy of several deep CNN architectures, such as ResNet and SENet, using a variety of image processing techniques like Image Data Generator, Histogram Equalization, and Unsharp Mask. We used FER 2013, which is a dataset containing multiple classes of images. While working on these models, we decided to take things to the next level and attempted to make changes to the models themselves to improve their accuracy. While working on this research, we were introduced to another concept in Deep Learning known as Generative Adversarial Networks (GANs). They are generative deep learning models based on deep CNN models, and they comprise two CNN models: a Generator and a Discriminator. The primary task of the former is to generate images from random noise and pass them to the latter. The Discriminator compares the generated image with real input images and accepts or rejects it based on the similarity. Over the years, there have been various distinguished GAN architectures, namely CycleGAN, StyleGAN, etc., which have allowed us to create sophisticated models that not only generate the same image as the original input but also make changes to it and generate different images. For example, CycleGAN allows us to change the season of scenery from Summer to Winter or change the emotion on a person's face from happy to sad. Though these sophisticated models are good, we are working with an architecture that has two deep neural networks, which essentially creates problems with hyperparameter tuning and overfitting.
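    The Generator/Discriminator interplay described above can be sketched structurally in a few lines of NumPy. This is a deliberately tiny illustration with random, untrained linear "networks" and hypothetical sizes, just to show how the two adversarial losses pull the discriminator's score in opposite directions; a real GAN would use deep CNNs and train both networks by gradient descent:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Toy generator: maps a 16-dim noise vector to a flat 64-dim "image".
    W_g = rng.normal(size=(16, 64))
    def generator(z):
        return np.tanh(z @ W_g)           # fake samples squashed into [-1, 1]

    # Toy discriminator: maps a sample to a single realness probability.
    W_d = rng.normal(size=(64, 1))
    def discriminator(x):
        return sigmoid(x @ W_d)

    z = rng.normal(size=(8, 16))          # batch of noise vectors
    fake = generator(z)                   # generated batch
    p_fake = discriminator(fake)          # discriminator's realness scores

    # Adversarial objectives: the discriminator wants p_fake -> 0 on fakes,
    # while the generator wants p_fake -> 1 (non-saturating generator loss).
    d_loss = -np.mean(np.log(1.0 - p_fake + 1e-8))
    g_loss = -np.mean(np.log(p_fake + 1e-8))
    print(fake.shape, float(d_loss) > 0, float(g_loss) > 0)
    ```

    Because improving one loss worsens the other, the two networks must be balanced during training, which is exactly the hyperparameter-tuning difficulty the abstract points to.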

    Mining Primary Care Electronic Health Records for Automatic Disease Phenotyping: A Transparent Machine Learning Framework

    (1) Background: We aimed to develop a transparent machine-learning (ML) framework to automatically identify patients with a condition from electronic health records (EHRs) via a parsimonious set of features. (2) Methods: We linked multiple sources of EHRs, including 917,496,869 primary care records, 40,656,805 secondary care records, and 694,954 records from specialist surgeries between 2002 and 2012, to generate a unique dataset. Then, we treated patient identification as a problem of text classification and proposed a transparent disease-phenotyping framework. This framework comprises patient-representation generation, feature selection, and optimal phenotyping-algorithm development to tackle the imbalanced nature of the data. The framework was extensively evaluated by identifying rheumatoid arthritis (RA) and ankylosing spondylitis (AS). (3) Results: Applied to the linked dataset of 9657 patients with 1484 cases of RA and 204 cases of AS, the framework achieved accuracy and positive predictive values of 86.19% and 88.46%, respectively, for RA and 99.23% and 97.75% for AS, comparable with expert knowledge-driven methods. (4) Conclusions: This framework could potentially be used as an efficient tool for identifying patients with a condition of interest from EHRs, supporting clinicians in the clinical decision-making process.
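    The pipeline above (patient representation, feature selection, classification) can be illustrated at toy scale. The records, codes, and selection rule below are entirely hypothetical, and the frequency-gap scorer is only a stand-in for the paper's actual feature-selection method; the point is the parsimonious-feature idea, where a handful of discriminative codes suffice to flag a phenotype:

    ```python
    from collections import Counter

    # Toy "patient representations": each patient is a set of clinical codes.
    cases    = [{"dmard_rx", "joint_pain", "ra_code"},
                {"dmard_rx", "ra_code", "fatigue"}]
    controls = [{"flu_visit", "fatigue"},
                {"joint_pain", "flu_visit"}]

    def select_features(pos, neg, k=2):
        """Keep the k codes with the largest case-vs-control frequency gap
        (a simple stand-in for the framework's feature-selection step)."""
        pos_f = Counter(c for rec in pos for c in rec)
        neg_f = Counter(c for rec in neg for c in rec)
        codes = set(pos_f) | set(neg_f)
        gap = lambda c: pos_f[c] / len(pos) - neg_f[c] / len(neg)
        return sorted(codes, key=gap, reverse=True)[:k]

    features = select_features(cases, controls)

    def predict(record, features):
        """Flag a patient when any selected phenotype feature is present."""
        return any(f in record for f in features)

    print(sorted(features), predict({"dmard_rx", "cough"}, features))
    ```

    Keeping the feature set this small is also what makes the framework transparent: a clinician can read off exactly which codes drove each prediction.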