Move Forward and Tell: A Progressive Generator of Video Descriptions
We present an efficient framework that can generate a coherent paragraph to
describe a given video. Previous works on video captioning usually focus on
video clips. They typically treat an entire video as a whole and generate the
caption conditioned on a single embedding. On the contrary, we consider videos
with rich temporal structures and aim to generate paragraph descriptions that
can preserve the story flow while being coherent and concise. Towards this
goal, we propose a new approach, which produces a descriptive paragraph by
assembling temporally localized descriptions. Given a video, it selects a
sequence of distinctive clips and generates sentences thereon in a coherent
manner. Particularly, the selection of clips and the production of sentences
are done jointly and progressively driven by a recurrent network -- what to
describe next depends on what has been said before. Here, the recurrent
network is learned via self-critical sequence training with both sentence-level
and paragraph-level rewards. On the ActivityNet Captions dataset, our method
demonstrated the capability of generating high-quality paragraph descriptions
for videos. Compared to those by other methods, the descriptions produced by
our method are often more relevant, more coherent, and more concise.
Comment: Accepted by ECCV 201
Conditional Image-Text Embedding Networks
This paper presents an approach for grounding phrases in images which jointly
learns multiple text-conditioned embeddings in a single end-to-end model. In
order to differentiate text phrases into semantically distinct subspaces, we
propose a concept weight branch that automatically assigns phrases to
embeddings, whereas prior works predefine such assignments. Our proposed
solution simplifies the representation requirements for individual embeddings
and allows the underrepresented concepts to take advantage of the shared
representations before feeding them into concept-specific layers. Comprehensive
experiments verify the effectiveness of our approach across three phrase
grounding datasets, Flickr30K Entities, ReferIt Game, and Visual Genome, where
we obtain improvements of 4%, 3%, and 4%, respectively, in grounding performance
over a strong region-phrase embedding baseline.
Comment: ECCV 2018 accepted paper
Activation of the receptor protein tyrosine kinase EphB4 in endometrial hyperplasia and endometrial carcinoma
Background: Members of the Eph family of tyrosine kinases have been implicated in embryonic pattern formation and vascular development; however, little is known about their role in the adult organism. We have observed estrogen-dependent EphB4 expression in the normal breast suggesting its implication in the hormone-controlled homeostasis of this organ. Since the endometrium is a similarly hormone dependent organ and endometrial carcinoma is thought to result from estrogenic stimulation, we have investigated EphB4 expression in normal human endometrium and during its carcinogenesis. Patients and methods: EphB4 expression was analyzed immunohistochemically in 26 normal endometrium specimens, 15 hyperplasias and 102 endometrioid adenocarcinomas and correlated with clinical and prognostic tumor characteristics. Results: In normal endometrial tissue no EphB4 protein was detected. Strikingly, we observed a drastic increase (P < 0.0001) in the number of EphB4 protein-expressing glandular epithelial cells in the majority of hyperplasias and carcinomas. Moreover, we found a statistically highly significant positive correlation between EphB4 expression and post-menopausal stage of the patient (P = 0.007). Conclusions: These findings indicate that in the endometrium, EphB4 is an early indicator of malignant development and, thus, EphB4 may represent a potent tool for diagnosis and therapeutic intervention.
Nano-displacement measurements using spatially multimode squeezed light
We demonstrate the possibility of surpassing the quantum noise limit for
simultaneous multi-axis spatial displacement measurements that have zero mean
values. The requisite resources for these measurements are squeezed light beams
with exotic transverse mode profiles. We show that, in principle, lossless
combination of these modes can be achieved using the non-degenerate Gouy phase
shift of optical resonators. When the combined squeezed beams are measured with
quadrant detectors, we experimentally demonstrate a simultaneous reduction in
the transverse x- and y-displacement fluctuations of 2.2 dB and 3.1 dB below
the quantum noise limit.
Comment: 21 pages, 9 figures, submitted to "Special Issue on Fluctuations &
Noise in Photonics & Quantum Optics" of J. Opt.
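To put the quoted squeezing figures in linear terms, the following minimal sketch converts "dB below the quantum noise limit" into a fraction of the quantum-limited noise. It assumes the dB values refer to noise power (variance), which the abstract does not state explicitly; the function name is hypothetical.

```python
def db_below_to_variance_ratio(db_below):
    """Convert a noise reduction in dB into a linear power (variance) ratio.

    Assumption: the dB figure describes noise power, so the ratio is
    10 ** (-dB / 10) relative to the quantum noise limit.
    """
    return 10 ** (-db_below / 10)


# Values quoted in the abstract for the two transverse axes.
for axis, db in (("x", 2.2), ("y", 3.1)):
    ratio = db_below_to_variance_ratio(db)
    print(f"{axis}: noise variance is about {ratio:.2f} of the quantum noise limit")
```

Under that assumption, 2.2 dB and 3.1 dB correspond to roughly 60% and 49% of the quantum-limited noise variance, respectively.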
Visual Reasoning with Multi-hop Feature Modulation
Recent breakthroughs in computer vision and natural language processing have
spurred interest in challenging multi-modal tasks such as visual
question-answering and visual dialogue. For such tasks, one successful approach
is to condition image-based convolutional network computation on language via
Feature-wise Linear Modulation (FiLM) layers, i.e., per-channel scaling and
shifting. We propose to generate the parameters of FiLM layers going up the
hierarchy of a convolutional network in a multi-hop fashion rather than all at
once, as in prior work. By alternating between attending to the language input
and generating FiLM layer parameters, this approach is better able to scale to
settings with longer input sequences such as dialogue. We demonstrate that
multi-hop FiLM generation achieves state-of-the-art results on the short input
sequence task ReferIt, on par with single-hop FiLM generation, while also
significantly outperforming prior state-of-the-art and single-hop FiLM
generation on the GuessWhat?! visual dialogue task.
Comment: In Proc of ECCV 201
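The per-channel scaling and shifting that the abstract attributes to FiLM layers can be illustrated with a minimal sketch in plain Python. It shows only the modulation step; the `gamma` and `beta` values are hard-coded here for illustration, whereas in FiLM they would be predicted from the language input. The function and variable names are hypothetical, and the multi-hop generation of the parameters is not shown.

```python
def film(feature_maps, gamma, beta):
    """Apply a per-channel scale (gamma) and shift (beta) to feature maps.

    feature_maps: list of channels, each a 2-D list (H x W) of floats.
    gamma, beta: one float per channel (illustrative constants here).
    """
    if not (len(feature_maps) == len(gamma) == len(beta)):
        raise ValueError("need one gamma and one beta per channel")
    return [
        [[g * x + b for x in row] for row in channel]
        for channel, g, b in zip(feature_maps, gamma, beta)
    ]


# Two channels of a 2 x 2 feature map.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [1.0, 1.0]],
]
modulated = film(features, gamma=[2.0, 0.0], beta=[1.0, -1.0])
print(modulated)
# [[[3.0, 5.0], [7.0, 9.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
```

Setting a channel's scale to zero, as for the second channel above, replaces that channel with a constant, which shows how the conditioning signal can suppress individual feature maps.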
NODIS: Neural Ordinary Differential Scene Understanding
Semantic image understanding is a challenging topic in computer vision. It
requires not only detecting all objects in an image, but also identifying all
the relations between them. Detected objects, their labels and the discovered
relations can be used to construct a scene graph which provides an abstract
semantic interpretation of an image. In previous works, relations were
identified by solving an assignment problem formulated as Mixed-Integer Linear
Programs. In this work, we interpret that formulation as an Ordinary
Differential Equation (ODE). The proposed architecture performs scene graph
inference by
solving a neural variant of an ODE by end-to-end learning. It achieves
state-of-the-art results on all three benchmark tasks: scene graph generation
(SGGen), classification (SGCls) and visual relationship detection (PredCls) on
the Visual Genome benchmark.
Reflectance spectra of synthetic Fe-free ortho- and clinoenstatites in the UV/VIS/IR and implications for remote sensing detection of Fe-free pyroxenes on planetary surfaces
For a better spectral characterization of planetary bodies with enstatite-rich surfaces like Mercury or E-type asteroids, we synthesized two different enstatite (Mg2Si2O6) polymorphs: orthoenstatite and clinoenstatite. Both enstatite polymorphs are known from the meteorite record and are commonly observed in aubrites and enstatite chondrites. The synthesized enstatites are particulate samples suitable for laboratory reflectance measurements and can be used for compositional modelling by preparing mixtures of samples in the laboratory or by using the sample's spectra in mathematical models. We report on the synthesis process, chemical composition, grain size distribution, and reflectance spectra of these synthetic enstatites covering the wavelength range from 0.25 to 17 μm, compare them to other pyroxenes (meteoritic enstatite and other synthetic enstatites and diopside), and discuss the implications of retrieving surface compositions of planetary bodies like E-type asteroids, comets, or Mercury. Both enstatite spectra are very bright in the VIS and NIR and show almost neutral to slightly bluish spectral slopes with a steep absorption in the UV. Very low iron in the enstatites (below ~0.04 wt% FeO) already results in weak albeit noticeable absorptions in the VNIR between 0.4 and 0.9 μm. Orthoenstatite and clinoenstatite are not distinguishable based only on their spectra in the VIS and NIR. At the Reststrahlen bands in the MIR a systematic difference in the number and exact position of local minima at ~10 μm between clinoenstatite and orthoenstatite is evident. This can be used to discern between the polymorphs in this wavelength range. Additionally, we can distinguish between Fe-free low- and high-Ca pyroxenes in the MIR.
Non-stoichiometric oxide and metal interfaces and reactions
We have employed a combination of experimental surface science techniques and density functional calculations to study the reduction of TiO2(110) surfaces through doping with submonolayer transition metals. We concentrate on the role of Ti adatoms in self doping of rutile and contrast the behaviour to that of Cr. DFT+U calculations enable identification of probable adsorption structures and their spectroscopic characteristics. Adsorption of both metals leads to a broken symmetry and an asymmetric charge transfer localised around the defect site of a mixed localised/delocalised character. Charge transfer creates defect states with Ti 3d character in the band gap at ~1 eV binding energy. Cr adsorption, however, leads to a very large shift in the valence-band edge to higher binding energy and the creation of Cr 3d states at 2.8-eV binding energy. Low-temperature oxidation lifts the Ti-derived band-gap states and modifies the intensity of the Cr features, indicative of a change of oxidation state from Cr3+ to Cr4+. Higher temperature processing leads to a loss of Cr from the surface region, indicative of its substitution into the bulk.