3,873 research outputs found

    Multi-Label Logo Classification using Convolutional Neural Networks

    The classification of logos is a particular case within computer vision since they have their own characteristics. Logos can contain only text, iconic images or a combination of both, and they usually include figurative symbols designed by experts that vary substantially even though they may share the same semantics. This work presents a method for multi-label classification and retrieval of logo images. For this, Convolutional Neural Networks (CNN) are trained to classify logos from the European Union TradeMark (EUTM) dataset according to their colors, shapes, sectors and figurative designs. An auto-encoder is also trained to learn representations of the input images. Once trained, the neural codes from the last convolutional layers in the CNN and the central layer of the auto-encoder can be used to perform similarity search through kNN, allowing us to obtain the most similar logos based on their color, shape, sector, figurative elements, overall features, or a weighted combination of them provided by the user. To the best of our knowledge, this is the first multi-label classification method for logos, and the only one that allows retrieving a ranking of images with these criteria provided by the user. This work is supported by the Spanish Ministry HISPAMUS project with code TIN2017-86576-R, partially funded by the EU
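
The weighted similarity search described in this abstract can be sketched as follows. This is only an illustrative sketch: the abstract does not say how the per-attribute neural codes are combined, so the weighted sum of Euclidean distances, the function names, and the toy vectors are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_knn(query, gallery, weights, k=3):
    """Return the ids of the k gallery items closest to the query.

    query:   dict mapping attribute name -> feature vector (hypothetical codes)
    gallery: dict mapping item id -> dict of attribute name -> feature vector
    weights: dict mapping attribute name -> user-chosen weight
    The fused distance is an assumed weighted sum of per-attribute distances.
    """
    scored = []
    for item_id, codes in gallery.items():
        dist = sum(w * euclidean(query[attr], codes[attr])
                   for attr, w in weights.items())
        scored.append((dist, item_id))
    scored.sort()  # smallest fused distance first
    return [item_id for _, item_id in scored[:k]]


# Toy gallery with made-up two-dimensional "neural codes".
gallery = {
    "logo_a": {"color": [1.0, 0.0], "shape": [0.0, 1.0]},
    "logo_b": {"color": [0.0, 1.0], "shape": [1.0, 0.0]},
    "logo_c": {"color": [0.9, 0.1], "shape": [1.0, 0.0]},
}
query = {"color": [1.0, 0.0], "shape": [1.0, 0.0]}

by_color = weighted_knn(query, gallery, {"color": 1.0, "shape": 0.0}, k=2)
by_both = weighted_knn(query, gallery, {"color": 0.5, "shape": 0.5}, k=1)
```

Changing the weights changes the ranking: searching by colour alone favours `logo_a`, while an equal mix of colour and shape favours `logo_c`.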

    Proceedings of the 2nd Computer Science Student Workshop: Microsoft Istanbul, Turkey, April 9, 2011

    Multi-label logo recognition and retrieval based on weighted fusion of neural features

    Classifying logo images is a challenging task as they contain elements such as text or shapes that can represent anything from known objects to abstract shapes. While the current state of the art for logo classification addresses the problem as a multi-class task focusing on a single characteristic, logos can have several simultaneous labels, such as different colours. This work proposes a method that allows visually similar logos to be classified and searched from a set of data according to their shape, colour, commercial sector, semantics, general characteristics, or a combination of features selected by the user. Unlike previous approaches, the proposal employs a series of multi-label deep neural networks specialized in specific attributes and combines the obtained features to perform the similarity search. To delve into the classification system, different existing logo topologies are compared and some of their problems are analysed, such as the incomplete labelling that trademark registration databases usually contain. The proposal is evaluated considering 76,000 logos (seven times more than previous approaches) from the European Union Trademarks dataset, which is organized hierarchically using the Vienna ontology. Overall, experimentation attains reliable quantitative and qualitative results, reducing the normalized average rank error of the state-of-the-art from 0.040 to 0.018 for the Trademark Image Retrieval task. Finally, given that the semantics of logos can often be subjective, graphic design students and professionals were surveyed. Results show that the proposed methodology provides better labelling than a human expert operator, improving the label ranking average precision from 0.53 to 0.68. This work was supported by the Pattern Recognition and Artificial Intelligence Group (PRAIG) from the University of Alicante and the University Institute for Computing Research (IUII). 
The Conselleria d'Innovació, Universitats, Ciència I Societat Digital from Generalitat Valenciana and FEDER provided some of the computing resources used in this project through IDIFEDER/2020/003. This research was partially supported by the Conselleria de Educación, Universidades y Empleo, for the project "clasifIA" of the Escola Superior d'Art i Disseny d'Alacant
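
The normalized average rank reported in this abstract can be illustrated with a short sketch. The abstract does not give the formula, so the definition below (the one commonly used in the trademark-retrieval literature, with 1-based ranks) is an assumption, as are the function name and the example ranks.

```python
def normalized_average_rank(relevant_ranks, collection_size):
    """Normalized average rank for one query (assumed standard definition).

    relevant_ranks:  1-based positions at which the relevant items were retrieved
    collection_size: total number of items in the searched collection
    Returns 0.0 when all relevant items are ranked first; values near 0.5
    correspond to random ordering, and lower is better.
    """
    n_rel = len(relevant_ranks)
    ideal_rank_sum = n_rel * (n_rel + 1) / 2  # rank sum of a perfect result
    return (sum(relevant_ranks) - ideal_rank_sum) / (collection_size * n_rel)


perfect = normalized_average_rank([1, 2, 3], collection_size=100)
slightly_off = normalized_average_rank([1, 3], collection_size=10)
```

Under this definition, the reported drop from 0.040 to 0.018 would mean that relevant logos appear, on average, noticeably closer to the top of the ranking.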

    One-Class Subject Identification From Smartphone-Acquired Walking Data

    In this work, a novel type of human identification system is proposed, which aims to recognize a user from the biometric traits of his way of walking. A smartphone is utilized to acquire motion data from the built-in sensors. Data from accelerometer and gyroscope are processed through a cycle extraction phase, a Convolutional Neural Network for feature extraction and a One-Class SVM classifier for identification. From quantitative results the system achieves an Equal Error Rate close to 1
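
The Equal Error Rate mentioned in this abstract is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be estimated from classifier scores follows; the scores are invented, and the convention that a higher score means "more likely the genuine user" is an assumption, not something stated in the abstract.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER by sweeping a threshold over all observed scores.

    genuine:  scores produced for the enrolled user (assumed: higher = better match)
    impostor: scores produced for other people
    A sample is accepted when its score is >= the threshold.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptance rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejection rate
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]


separable = equal_error_rate([0.9, 0.8], [0.1, 0.2])
overlapping = equal_error_rate([0.9, 0.4], [0.5, 0.1])
```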

    Vehicle make and model recognition for intelligent transportation monitoring and surveillance.

    Vehicle Make and Model Recognition (VMMR) has evolved into a significant subject of study due to its importance in numerous Intelligent Transportation Systems (ITS), such as autonomous navigation, traffic analysis, traffic surveillance and security systems. A highly accurate and real-time VMMR system significantly reduces the overhead cost of resources otherwise required. The VMMR problem is a multi-class classification task with a peculiar set of issues and challenges like multiplicity, inter- and intra-make ambiguity among various vehicle makes and models, which need to be solved in an efficient and reliable manner to achieve a highly robust VMMR system. In this dissertation, facing the growing importance of make and model recognition of vehicles, we present a VMMR system that provides very high accuracy rates and is robust to several challenges. We demonstrate that the VMMR problem can be addressed by locating discriminative parts where the most significant appearance variations occur in each category, and learning expressive appearance descriptors. Given these insights, we consider two data driven frameworks: a Multiple-Instance Learning-based (MIL) system using hand-crafted features and an extended application of deep neural networks using MIL. Our approach requires only image level class labels, and the discriminative parts of each target class are selected in a fully unsupervised manner without any use of part annotations or segmentation masks, which may be costly to obtain. This advantage makes our system more intelligent, scalable, and applicable to other fine-grained recognition tasks. We constructed a dataset with 291,752 images representing 9,170 different vehicles to validate and evaluate our approach. Experimental results demonstrate that the localization of parts and distinguishing their discriminative powers for categorization improve the performance of fine-grained categorization. 
Extensive experiments conducted using our approaches yield superior results for images that were occluded, under low illumination, partial camera views, or even non-frontal views, available in our real-world VMMR dataset. The approaches presented herewith provide a highly accurate VMMR system for real-time applications in realistic environments. We also validate our system with a significant application of VMMR to ITS that involves automated vehicular surveillance. We show that our application can provide law enforcement agencies with efficient tools to search for a specific vehicle type, make, or model, and to track the path of a given vehicle using the position of multiple cameras

    The 2nd Conference of PhD Students in Computer Science

    Gradient metasurfaces: a review of fundamentals and applications

    In the wake of intense research on metamaterials, the two-dimensional analogue, known as metasurfaces, has attracted progressively increasing attention in recent years due to the ease of fabrication and smaller insertion losses, while enabling an unprecedented control over spatial distributions of transmitted and reflected optical fields. Metasurfaces represent optically thin planar arrays of resonant subwavelength elements that can be arranged in a strictly or quasi periodic fashion, or even in an aperiodic manner, depending on targeted optical wavefronts to be molded with their help. This paper reviews a broad subclass of metasurfaces, viz. gradient metasurfaces, which are devised to exhibit spatially varying optical responses resulting in spatially varying amplitudes, phases and polarizations of scattered fields. Starting with introducing the concept of gradient metasurfaces, we present a classification of different metasurfaces from the viewpoint of their responses, differentiating electrical-dipole, geometric, reflective and Huygens' metasurfaces. The fundamental building blocks essential for the realization of metasurfaces are then discussed in order to elucidate the underlying physics of various physical realizations of both plasmonic and purely dielectric metasurfaces. We then overview the main applications of gradient metasurfaces, including waveplates, flat lenses, spiral phase plates, broadband absorbers, color printing, holograms, polarimeters and surface wave couplers. The review concludes with a short section on recently developed nonlinear metasurfaces, followed by the outlook presenting our view on possible future developments and perspectives for future applications. Comment: Accepted for publication in Reports on Progress in Physic

    Enhanced iris recognition: Algorithms for segmentation, matching and synthesis

    This thesis addresses the issues of segmentation, matching, fusion and synthesis in the context of irises and makes a four-fold contribution. The first contribution of this thesis is a post matching algorithm that observes the structure of the differences in feature templates to enhance recognition accuracy. The significance of the scheme is its robustness to inaccuracies in the iris segmentation process. Experimental results on the CASIA database indicate the efficacy of the proposed technique. The second contribution of this thesis is a novel iris segmentation scheme that employs Geodesic Active Contours to extract the iris from the surrounding structures. The proposed scheme elicits the iris texture in an iterative fashion depending upon both the local and global conditions of the image. The performance of an iris recognition algorithm on both the WVU non-ideal and CASIA iris databases is observed to improve upon application of the proposed segmentation algorithm. The third contribution of this thesis is the fusion of multiple instances of the same iris and multiple iris units of the eye, i.e., the left and right iris, at the match score level. Using the simple sum rule, it is demonstrated that both multi-instance and multi-unit fusion of iris can lead to a significant improvement in matching accuracy. The final contribution is a technique to create a large database of digital renditions of iris images that can be used to evaluate the performance of iris recognition algorithms. This scheme is implemented in two stages. In the first stage, a Markov Random Field model is used to generate a background texture representing the global iris appearance. In the next stage a variety of iris features, viz., radial and concentric furrows, collarette and crypts, are generated and embedded in the texture field. Experimental results confirm the validity of the synthetic irises generated using this technique
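
The sum-rule fusion at the match score level mentioned in this abstract can be sketched as below. The abstract names only the sum rule, so the min-max normalization step, the function names, and the example scores are assumptions added for illustration.

```python
def min_max(scores):
    """Rescale a list of match scores to the range [0, 1] (assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)  # all scores identical: no ranking information
    return [(s - lo) / (hi - lo) for s in scores]


def sum_rule(score_lists):
    """Fuse several matchers' scores for the same candidates by simple addition.

    score_lists: one list per matcher (e.g. left iris, right iris), all aligned
                 so that position i refers to the same candidate identity.
    """
    normalized = [min_max(scores) for scores in score_lists]
    return [sum(per_candidate) for per_candidate in zip(*normalized)]


# Hypothetical scores for three candidates from two iris units.
left_iris = [0, 5, 10]
right_iris = [2, 2, 4]
fused = sum_rule([left_iris, right_iris])
```

Here the third candidate receives the highest fused score because both units agree on it, which is the effect multi-unit fusion relies on.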

    A Review of Findings from Neuroscience and Cognitive Psychology as Possible Inspiration for the Path to Artificial General Intelligence

    This review aims to contribute to the quest for artificial general intelligence by examining neuroscience and cognitive psychology methods for potential inspiration. Despite the impressive advancements achieved by deep learning models in various domains, they still have shortcomings in abstract reasoning and causal understanding. Such capabilities should be ultimately integrated into artificial intelligence systems in order to surpass data-driven limitations and support decision making in a way more similar to human intelligence. This work is a vertical review that attempts a wide-ranging exploration of brain function, spanning from lower-level biological neurons, spiking neural networks, and neuronal ensembles to higher-level concepts such as brain anatomy, vector symbolic architectures, cognitive and categorization models, and cognitive architectures. The hope is that these concepts may offer insights for solutions in artificial general intelligence. Comment: 143 pages, 49 figures, 244 reference