Symbolic inductive bias for visually grounded learning of spoken language
A widespread approach to processing spoken language is to first automatically
transcribe it into text. An alternative is to use an end-to-end approach:
recent works have proposed to learn semantic embeddings of spoken language from
images with spoken captions, without an intermediate transcription step. We
propose to use multitask learning to exploit existing transcribed speech within
the end-to-end setting. We describe a three-task architecture which combines
the objectives of matching spoken captions with corresponding images, speech
with text, and text with images. We show that the addition of the speech/text
task leads to substantial performance improvements on image retrieval when
compared to training the speech/image task in isolation. We conjecture that
this is due to a strong inductive bias transcribed speech provides to the
model, and offer supporting evidence for this. Comment: ACL 201
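
The three matching objectives named in this abstract can be illustrated with a minimal sketch. The abstract does not state the loss function, so the margin-based ranking loss, the cosine similarity, the margin of 0.2, the equal task weights, and all function names below are assumptions made for illustration. The sketch also treats the embeddings as already computed, whereas the paper trains the encoders that produce them.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranking_loss(xs, ys, margin=0.2):
    """Hinge ranking loss (assumed, not taken from the abstract).

    The matched pair (xs[i], ys[i]) should score higher than every
    mismatched pair by at least `margin`.
    """
    total = 0.0
    for i in range(len(xs)):
        pos = cosine(xs[i], ys[i])
        for j in range(len(ys)):
            if j != i:
                # Wrong y paired with the correct x.
                total += max(0.0, margin + cosine(xs[i], ys[j]) - pos)
                # Wrong x paired with the correct y.
                total += max(0.0, margin + cosine(xs[j], ys[i]) - pos)
    return total

def three_task_loss(speech, image, text, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the speech/image, speech/text and text/image losses."""
    return (weights[0] * ranking_loss(speech, image)
            + weights[1] * ranking_loss(speech, text)
            + weights[2] * ranking_loss(text, image))
```

Training the speech/image task in isolation corresponds to `weights=(1.0, 0.0, 0.0)` in this sketch.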
Model for an Intelligent Operating System for Executing Tasks on a Reconfigurable Parallel Architecture
Parallel processing is one approach to achieving the large computational processing capabilities required by many real-time computing tasks. One of the problems that must be addressed in the use of reconfigurable multiprocessor systems is matching the architecture configuration to the algorithms to be executed. This paper presents a conceptual model that explores the potential of artificial intelligence tools, specifically expert systems, to design an Intelligent Operating System for multiprocessor systems. The target task is the implementation of image understanding systems on multiprocessor architectures. PASM is used as an example multiprocessor. The Intelligent Operating System concepts developed here could also be used to address other problems requiring real-time processing. An example image understanding task is presented to illustrate the concept of intelligent scheduling by the Intelligent Operating System. Also considered is the use of the conceptual model when developing an image understanding system in order to test different strategies for choosing algorithms, imposing execution order constraints, and integrating results from various algorithms.
Fingerprint image enhancement using fully convolutional deep autoencoders / Destaque de imagens de impressão digital utilizando autoencoders profundos totalmente convolucionais
Image quality for fingerprint samples is critical for the matching process. Novel methods introduce deep learning matching techniques based on convolutional neural networks to enhance degraded fingerprint images. However, due to the nature of the image enhancement problem, these methods tend to rely on processing small image patches to achieve their goal. Such an approach may often yield satisfactory results, but at a high computational cost because the patches overlap. In this paper, we propose a fast and accurate fully convolutional neural network based on an autoencoder architecture to enhance the quality of fingerprint images. We do not use the patch processing method and instead train a model to enhance the image as a whole. After exhaustive testing, we obtain a model that can quickly perform the desired task, while achieving an average per-pixel accuracy of 97.956% and 83.748% on the easiest and hardest datasets, respectively. The models were trained on the publicly available Fingerprint Verification Competition datasets. We then highlight the most general model that can best enhance the quality of all datasets.
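
The abstract reports per-pixel accuracy without defining it. One plausible reading, sketched below, is the fraction of pixels whose binarized value in the enhanced image agrees with the ground-truth image. The 0.5 threshold, the nested-list image representation, and the function name are assumptions, not details from the paper.

```python
def per_pixel_accuracy(predicted, target, threshold=0.5):
    """Fraction of pixels where predicted and target agree after thresholding.

    `predicted` and `target` are equally sized 2D lists of intensities
    in [0, 1]; the threshold value is an assumption of this sketch.
    """
    matches = 0
    total = 0
    for pred_row, target_row in zip(predicted, target):
        for p, t in zip(pred_row, target_row):
            if (p >= threshold) == (t >= threshold):
                matches += 1
            total += 1
    return matches / total
```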
Neighbourhood Consensus Networks
We address the problem of finding reliable dense correspondences between a
pair of images. This is a challenging task due to strong appearance differences
between the corresponding scene elements and ambiguities generated by
repetitive patterns. The contributions of this work are threefold. First,
inspired by the classic idea of disambiguating feature matches using semi-local
constraints, we develop an end-to-end trainable convolutional neural network
architecture that identifies sets of spatially consistent matches by analyzing
neighbourhood consensus patterns in the 4D space of all possible
correspondences between a pair of images without the need for a global
geometric model. Second, we demonstrate that the model can be trained
effectively from weak supervision in the form of matching and non-matching
image pairs without the need for costly manual annotation of point to point
correspondences. Third, we show the proposed neighbourhood consensus network
can be applied to a range of matching tasks including both category- and
instance-level matching, obtaining state-of-the-art results on the PF
Pascal dataset and the InLoc indoor visual localization benchmark. Comment: In Proceedings of the 32nd Conference on Neural Information
Processing Systems (NeurIPS 2018)
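
The "4D space of all possible correspondences" mentioned in this abstract can be pictured as a tensor holding one score for every pair of positions, one in each image. The sketch below only builds such a tensor from dense descriptors and reads off the best match for a position. It does not implement the neighbourhood consensus network itself. The dot-product score, the nested-list layout, and the function names are assumptions.

```python
def correlation_4d(feat_a, feat_b):
    """Build c with c[i][j][k][l] = dot(feat_a[i][j], feat_b[k][l]).

    feat_a[i][j] and feat_b[k][l] are descriptor vectors (lists of floats)
    at grid position (i, j) of image A and (k, l) of image B.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    return [[[[dot(fa, fb) for fb in row_b] for row_b in feat_b]
             for fa in row_a]
            for row_a in feat_a]

def best_match(corr, i, j):
    """Position (k, l) in image B with the highest score for (i, j) in image A."""
    scores = corr[i][j]
    positions = [(k, l) for k in range(len(scores)) for l in range(len(scores[k]))]
    return max(positions, key=lambda kl: scores[kl[0]][kl[1]])
```

Taking the raw maximum, as `best_match` does, is exactly the step that repetitive patterns make ambiguous. The abstract's contribution is to analyze this 4D space for spatially consistent matches rather than scoring each pair independently.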
Biometrics-as-a-Service: A Framework to Promote Innovative Biometric Recognition in the Cloud
Biometric recognition, or simply biometrics, is the use of biological
attributes such as face, fingerprints or iris in order to recognize an
individual in an automated manner. A key application of biometrics is
authentication; i.e., using said biological attributes to provide access by
verifying the claimed identity of an individual. This paper presents a
framework for Biometrics-as-a-Service (BaaS) that performs biometric matching
operations in the cloud, while relying on simple and ubiquitous consumer
devices such as smartphones. Further, the framework promotes innovation by
providing interfaces for a plurality of software developers to upload their
matching algorithms to the cloud. When a biometric authentication request is
submitted, the system uses a criterion to automatically select an appropriate
matching algorithm. Every time a particular algorithm is selected, the
corresponding developer is rendered a micropayment. This creates an innovative
and competitive ecosystem that benefits both software developers and the
consumers. As a case study, we have implemented the following: (a) an ocular
recognition system using a mobile web interface providing user access to a
biometric authentication service, and (b) a Linux-based virtual machine
environment used by software developers for algorithm development and
submission.
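
The select-then-pay flow described in this abstract can be sketched as follows. The abstract does not say what the selection criterion is or how micropayments are recorded, so the `accuracy` field, the fixed `fee`, the in-memory `balances` dictionary, and all names below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Algorithm:
    name: str
    developer: str
    modality: str    # e.g. "ocular"
    accuracy: float  # hypothetical selection criterion

def select_and_pay(algorithms, modality, balances, fee=0.01):
    """Pick the best registered algorithm for a modality and credit its developer.

    `balances` maps developer names to accumulated micropayments and is
    updated in place; `fee` is an assumed fixed amount per selection.
    """
    candidates = [a for a in algorithms if a.modality == modality]
    if not candidates:
        raise LookupError(f"no algorithm registered for {modality!r}")
    chosen = max(candidates, key=lambda a: a.accuracy)
    balances[chosen.developer] = balances.get(chosen.developer, 0.0) + fee
    return chosen
```

A real service would presumably weigh more than a single score, but the shape stays the same: filter by request, rank by the criterion, then record the payment for the selected developer.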