Language Models for Image Captioning: The Quirks and What Works
Two recent approaches have achieved state-of-the-art results in image
captioning. The first uses a pipelined process where a set of candidate words
is generated by a convolutional neural network (CNN) trained on images, and
then a maximum entropy (ME) language model is used to arrange these words into
a coherent sentence. The second uses the penultimate activation layer of the
CNN as input to a recurrent neural network (RNN) that then generates the
caption sequence. In this paper, we compare the merits of these different
language modeling approaches for the first time by using the same
state-of-the-art CNN as input. We examine issues in the different approaches,
including linguistic irregularities, caption repetition, and data set overlap.
By combining key aspects of the ME and RNN methods, we achieve a new record
performance over previously published results on the benchmark COCO dataset.
However, the gains we see in BLEU do not translate to human judgments.
Comment: See http://research.microsoft.com/en-us/projects/image_captioning for
project information
Time fractals and discrete scale invariance with trapped ions
We show that a one-dimensional chain of trapped ions can be engineered to
produce a quantum mechanical system with discrete scale invariance and
fractal-like time dependence. By discrete scale invariance we mean a system
that replicates itself under a rescaling of distance for some scale factor, and
a time fractal is a signal that is invariant under the rescaling of time. These
features are reminiscent of the Efimov effect, which has been predicted and
observed in bound states of three-body systems. We demonstrate that discrete
scale invariance in the trapped ion system can be controlled with two
independently tunable parameters. We also discuss the extension to n-body
states where the discrete scaling symmetry has an exotic heterogeneous
structure. The results we present can be realized using currently available
technologies developed for trapped ion quantum systems.
Comment: 4 + 5 pages (main + supplemental materials), 2 + 3 figures (main +
supplemental materials), version to appear in Physical Review A Rapid
Communications
Generating Natural Questions About an Image
There has been an explosion of work in the vision & language community during
the past few years from image captioning to video transcription, and answering
questions about images. These tasks have focused on literal descriptions of the
image. To move beyond the literal, we choose to explore how questions about an
image are often directed at commonsense inference and the abstract events
evoked by objects in the image. In this paper, we introduce the novel task of
Visual Question Generation (VQG), where the system is tasked with asking a
natural and engaging question when shown an image. We provide three datasets
which cover a variety of images from object-centric to event-centric, with
considerably more abstract training data than provided to state-of-the-art
captioning systems thus far. We train and test several generative and retrieval
models to tackle the task of VQG. Evaluation results show that while such
models ask reasonable questions for a variety of images, there is still a wide
gap with human performance which motivates further work on connecting images
with commonsense knowledge and pragmatics. Our proposed task offers a new
challenge to the community which we hope furthers interest in exploring deeper
connections between vision & language.
Comment: Proceedings of the 54th Annual Meeting of the Association for
Computational Linguistics
Nanoscale Metamaterial Optical Waveguides with Ultrahigh Refractive Indices
We propose deep-subwavelength optical waveguides based on metal-dielectric
multilayer indefinite metamaterials with ultrahigh effective refractive
indices. Waveguide modes with different mode orders are systematically analyzed
with numerical simulations based on both metal-dielectric multilayer structures
and the effective medium approach. The dependences of waveguide mode indices,
propagation lengths and mode areas on different mode orders, free space
wavelengths and sizes of waveguide cross sections are studied. Furthermore,
waveguide modes are also illustrated with iso-frequency contours in the wave
vector space in order to investigate the mechanism of waveguide mode cutoff for
high order modes. The deep-subwavelength optical waveguide with a size smaller
than $\lambda_0/50$ and a mode area in the order of $10^{-4}\,\lambda_0^2$ is
realized, and an ultrahigh effective refractive index up to 62.0 is achieved at
the telecommunication wavelength. This new type of metamaterial optical
waveguide opens up opportunities for various applications in enhanced
light-matter interactions.
Comment: 22 pages, 8 figures
Stochastic Substitute Training: A Gray-box Approach to Craft Adversarial Examples Against Gradient Obfuscation Defenses
It has been shown that adversaries can craft example inputs to neural
networks which are similar to legitimate inputs but have been created to
purposely cause the neural network to misclassify the input. These adversarial
examples are crafted, for example, by calculating gradients of a carefully
defined loss function with respect to the input. As a countermeasure, some
researchers have tried to design robust models by blocking or obfuscating
gradients, even in white-box settings. Another line of research proposes
introducing a separate detector to attempt to detect adversarial examples. This
approach also makes use of gradient obfuscation techniques, for example, to
prevent the adversary from trying to fool the detector. In this paper, we
introduce stochastic substitute training, a gray-box approach that can craft
adversarial examples for defenses which obfuscate gradients. For those defenses
that have tried to make models more robust, with our technique, an adversary
can craft adversarial examples with no knowledge of the defense. For defenses
that attempt to detect the adversarial examples, with our technique, an
adversary only needs very limited information about the defense to craft
adversarial examples. We demonstrate our technique by applying it against two
defenses which make models more robust and two defenses which detect
adversarial examples.
Comment: Accepted by AISec '18: 11th ACM Workshop on Artificial Intelligence
and Security. Source code at https://github.com/S-Mohammad-Hashemi/SS
Maximally localized Wannier functions in LaMnO3 within PBE+U, hybrid functionals, and partially self-consistent GW: an efficient route to construct ab-initio tight-binding parameters for e_g perovskites
Using the newly developed VASP2WANNIER90 interface we have constructed
maximally localized Wannier functions (MLWFs) for the e_g states of the
prototypical Jahn-Teller magnetic perovskite LaMnO3 at different levels of
approximation for the exchange-correlation kernel. These include conventional
density functional theory (DFT) with and without additional on-site Hubbard U
term, hybrid-DFT, and partially self-consistent GW. By suitably mapping the
MLWFs onto an effective e_g tight-binding (TB) Hamiltonian we have computed a
complete set of TB parameters which should serve as guidance for more elaborate
treatments of correlation effects in effective Hamiltonian-based approaches.
The method-dependent changes of the calculated TB parameters and their
interplay with the electron-electron (el-el) interaction term are discussed and
interpreted. We discuss two alternative model parameterizations: one in which
the effects of the el-el interaction are implicitly incorporated in the
otherwise "noninteracting" TB parameters, and a second where we include an
explicit mean-field el-el interaction term in the TB Hamiltonian. Both models
yield a set of tabulated TB parameters which provide the band dispersion in
excellent agreement with the underlying ab initio and MLWF bands.
Comment: 30 pages, 7 figures
