130,898 research outputs found

    Symbolic inductive bias for visually grounded learning of spoken language

    Full text link
    A widespread approach to processing spoken language is to first automatically transcribe it into text. An alternative is to use an end-to-end approach: recent works have proposed to learn semantic embeddings of spoken language from images with spoken captions, without an intermediate transcription step. We propose to use multitask learning to exploit existing transcribed speech within the end-to-end setting. We describe a three-task architecture which combines the objectives of matching spoken captions with corresponding images, speech with text, and text with images. We show that the addition of the speech/text task leads to substantial performance improvements on image retrieval when compared to training the speech/image task in isolation. We conjecture that this is due to a strong inductive bias transcribed speech provides to the model, and offer supporting evidence for this.Comment: ACL 201

    Model for an Intelligent Operating System for Executing Tasks on a Reconfigurable Parallel Architecture

    Get PDF
    Parallel processing is one approach to achieve the large computational processing capabilities required by many real-time computing tasks. One of the problems that must be addressed in the use of reconfigurable multiprocessor systems is matching the architecture configuration to the algorithms to be executed. This paper presents a conceptual model that explores the potential of artificial intelligence tools, specifically expert systems, to design an Intelligent Operating System for multiprocessor systems. The target task is the implementation of image understanding systems on multiprocessor architectures. PASM is used as an example multiprocessor. The Intelligent Operating System concepts developed here could also be used to address other problems requiring real-time processing. An example image understanding task is presented to illustrate the concept of intelligent scheduling by the Intelligent Operating System. Also considered is the use of the conceptual model when developing an image understanding system in order to test different strategies for choosing algorithms, imposing execution order constraints, and integrating results from various algorithms

    Fingerprint image enhancement using fully convolutional deep autoencoders / Destaque de imagens de impressão digital utilizando autoencoders profundos totalmente convolucionais

    Get PDF
    Image quality for fingerprint samples is critical for the matching process. Novel methods introduce deep learning matching techniques based on convolutions neural networks to enhance degraded fingerprint images. However, due to the nature of the enhanced image problem, these methods tend to rely on processing small image patches to achieve their goal. Such an approach may often yield satisfactory results while having high computational costs due to overlapping in patches. In this paper, we propose a fast and accurate fully convolutional neural network based on an auto-encoder architecture to enhance the quality of fingerprint images. We do not use the patch processing method and instead train a model to enhance the image as a whole. After exhaustive testing, we achieve a model that can quickly perform the desired task, while achieving an average of 97.956% and 83.748% per pixel accuracy on the easiest and hardest dataset respectively. The models were trained on the publicly available Fingerprint Verification Competition datasets. We then highlight the most general model that can best enhance the quality of all datasets

    Neighbourhood Consensus Networks

    Get PDF
    We address the problem of finding reliable dense correspondences between a pair of images. This is a challenging task due to strong appearance differences between the corresponding scene elements and ambiguities generated by repetitive patterns. The contributions of this work are threefold. First, inspired by the classic idea of disambiguating feature matches using semi-local constraints, we develop an end-to-end trainable convolutional neural network architecture that identifies sets of spatially consistent matches by analyzing neighbourhood consensus patterns in the 4D space of all possible correspondences between a pair of images without the need for a global geometric model. Second, we demonstrate that the model can be trained effectively from weak supervision in the form of matching and non-matching image pairs without the need for costly manual annotation of point to point correspondences. Third, we show the proposed neighbourhood consensus network can be applied to a range of matching tasks including both category- and instance-level matching, obtaining the state-of-the-art results on the PF Pascal dataset and the InLoc indoor visual localization benchmark.Comment: In Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018

    Biometrics-as-a-Service: A Framework to Promote Innovative Biometric Recognition in the Cloud

    Full text link
    Biometric recognition, or simply biometrics, is the use of biological attributes such as face, fingerprints or iris in order to recognize an individual in an automated manner. A key application of biometrics is authentication; i.e., using said biological attributes to provide access by verifying the claimed identity of an individual. This paper presents a framework for Biometrics-as-a-Service (BaaS) that performs biometric matching operations in the cloud, while relying on simple and ubiquitous consumer devices such as smartphones. Further, the framework promotes innovation by providing interfaces for a plurality of software developers to upload their matching algorithms to the cloud. When a biometric authentication request is submitted, the system uses a criteria to automatically select an appropriate matching algorithm. Every time a particular algorithm is selected, the corresponding developer is rendered a micropayment. This creates an innovative and competitive ecosystem that benefits both software developers and the consumers. As a case study, we have implemented the following: (a) an ocular recognition system using a mobile web interface providing user access to a biometric authentication service, and (b) a Linux-based virtual machine environment used by software developers for algorithm development and submission
    corecore