
    Move Forward and Tell: A Progressive Generator of Video Descriptions

    We present an efficient framework that can generate a coherent paragraph to describe a given video. Previous works on video captioning usually focus on video clips. They typically treat an entire video as a whole and generate the caption conditioned on a single embedding. In contrast, we consider videos with rich temporal structures and aim to generate paragraph descriptions that preserve the story flow while being coherent and concise. Towards this goal, we propose a new approach, which produces a descriptive paragraph by assembling temporally localized descriptions. Given a video, it selects a sequence of distinctive clips and generates sentences thereon in a coherent manner. In particular, the selection of clips and the production of sentences are done jointly and progressively, driven by a recurrent network: what to describe next depends on what has been said before. Here, the recurrent network is learned via self-critical sequence training with both sentence-level and paragraph-level rewards. On the ActivityNet Captions dataset, our method demonstrated the capability of generating high-quality paragraph descriptions for videos. Compared to those by other methods, the descriptions produced by our method are often more relevant, more coherent, and more concise. Comment: Accepted by ECCV 201

    Conditional Image-Text Embedding Networks

    This paper presents an approach for grounding phrases in images which jointly learns multiple text-conditioned embeddings in a single end-to-end model. In order to differentiate text phrases into semantically distinct subspaces, we propose a concept weight branch that automatically assigns phrases to embeddings, whereas prior works predefine such assignments. Our proposed solution simplifies the representation requirements for individual embeddings and allows the underrepresented concepts to take advantage of the shared representations before feeding them into concept-specific layers. Comprehensive experiments verify the effectiveness of our approach across three phrase grounding datasets, Flickr30K Entities, ReferIt Game, and Visual Genome, where we obtain improvements of 4%, 3%, and 4%, respectively, in grounding performance over a strong region-phrase embedding baseline. Comment: ECCV 2018 accepted paper

    Activation of the receptor protein tyrosine kinase EphB4 in endometrial hyperplasia and endometrial carcinoma

    Background: Members of the Eph family of tyrosine kinases have been implicated in embryonic pattern formation and vascular development; however, little is known about their role in the adult organism. We have observed estrogen-dependent EphB4 expression in the normal breast, suggesting its implication in the hormone-controlled homeostasis of this organ. Since the endometrium is a similarly hormone-dependent organ and endometrial carcinoma is thought to result from estrogenic stimulation, we have investigated EphB4 expression in normal human endometrium and during its carcinogenesis. Patients and methods: EphB4 expression was analyzed immunohistochemically in 26 normal endometrium specimens, 15 hyperplasias and 102 endometrioid adenocarcinomas and correlated with clinical and prognostic tumor characteristics. Results: In normal endometrial tissue no EphB4 protein was detected. Strikingly, we observed a drastic increase (P < 0.0001) in the number of EphB4 protein-expressing glandular epithelial cells in the majority of hyperplasias and carcinomas. Moreover, we found a statistically highly significant positive correlation between EphB4 expression and post-menopausal stage of the patient (P = 0.007). Conclusions: These findings indicate that in the endometrium, EphB4 is an early indicator of malignant development and, thus, EphB4 may represent a potent tool for diagnosis and therapeutic intervention.

    Nano-displacement measurements using spatially multimode squeezed light

    We demonstrate the possibility of surpassing the quantum noise limit for simultaneous multi-axis spatial displacement measurements that have zero mean values. The requisite resources for these measurements are squeezed light beams with exotic transverse mode profiles. We show that, in principle, lossless combination of these modes can be achieved using the non-degenerate Gouy phase shift of optical resonators. When the combined squeezed beams are measured with quadrant detectors, we experimentally demonstrate a simultaneous reduction in the transverse x- and y-displacement fluctuations of 2.2 dB and 3.1 dB below the quantum noise limit. Comment: 21 pages, 9 figures, submitted to "Special Issue on Fluctuations & Noise in Photonics & Quantum Optics" of J. Opt.

    Visual Reasoning with Multi-hop Feature Modulation

    Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue. For such tasks, one successful approach is to condition image-based convolutional network computation on language via Feature-wise Linear Modulation (FiLM) layers, i.e., per-channel scaling and shifting. We propose to generate the parameters of FiLM layers going up the hierarchy of a convolutional network in a multi-hop fashion rather than all at once, as in prior work. By alternating between attending to the language input and generating FiLM layer parameters, this approach is better able to scale to settings with longer input sequences such as dialogue. We demonstrate that multi-hop FiLM generation achieves state-of-the-art results on the short input sequence task ReferIt, on par with single-hop FiLM generation, while also significantly outperforming prior state-of-the-art and single-hop FiLM generation on the GuessWhat?! visual dialogue task. Comment: In Proc of ECCV 201
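
The per-channel scaling and shifting that the abstract attributes to FiLM layers can be illustrated with a minimal sketch. The function name `film`, the nested-list layout, and the hard-coded `gamma` and `beta` values are assumptions made for illustration only; in the work described they would be generated from the language input, and the multi-hop generation itself is not shown here.

```python
def film(features, gamma, beta):
    """Apply feature-wise linear modulation to one feature map.

    features: list of channels, each channel a 2D list (height x width)
    gamma, beta: one scale and one shift per channel (hypothetical values here)
    """
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need exactly one gamma and one beta per channel")
    # Every spatial position in a channel shares that channel's scale and shift.
    return [
        [[g * value + b for value in row] for row in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Three channels of a 2x2 feature map, all ones.
features = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
gamma = [1.0, 2.0, -1.0]  # per-channel scale (assumed, not from the paper)
beta = [0.0, 0.5, 1.0]    # per-channel shift (assumed, not from the paper)

modulated = film(features, gamma, beta)
print(modulated[1])  # [[2.5, 2.5], [2.5, 2.5]]
```

Channel 0 passes through unchanged, channel 1 is scaled and shifted to 2.5, and channel 2 is cancelled to 0.0, which shows how conditioning can amplify or suppress individual feature maps.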

    NODIS: Neural Ordinary Differential Scene Understanding

    Semantic image understanding is a challenging topic in computer vision. It requires detecting all objects in an image and identifying all the relations between them. Detected objects, their labels and the discovered relations can be used to construct a scene graph, which provides an abstract semantic interpretation of an image. In previous works, relations were identified by solving an assignment problem formulated as a Mixed-Integer Linear Program. In this work, we interpret that formulation as an Ordinary Differential Equation (ODE). The proposed architecture performs scene graph inference by solving a neural variant of an ODE by end-to-end learning. It achieves state-of-the-art results on all three benchmark tasks: scene graph generation (SGGen), classification (SGCls) and visual relationship detection (PredCls) on the Visual Genome benchmark.

    Reflectance spectra of synthetic Fe-free ortho- and clinoenstatites in the UV/VIS/IR and implications for remote sensing detection of Fe-free pyroxenes on planetary surfaces

    For a better spectral characterization of planetary bodies with enstatite-rich surfaces like Mercury or E-type asteroids, we synthesized two different enstatite (Mg2Si2O6) polymorphs: orthoenstatite and clinoenstatite. Both enstatite polymorphs are known from the meteorite record and are commonly observed in aubrites and enstatite chondrites. The synthesized enstatites are particulate samples suitable for laboratory reflectance measurements and can be used for compositional modelling by preparing mixtures of samples in the laboratory or by using the samples' spectra in mathematical models. We report on the synthesis process, chemical composition, grain size distribution, and reflectance spectra of these synthetic enstatites covering the wavelength range from 0.25 to 17 μm, compare them to other pyroxenes (meteoritic enstatite and other synthetic enstatites and diopside), and discuss the implications for retrieving surface compositions of planetary bodies like E-type asteroids, comets, or Mercury. Both enstatite spectra are very bright in the VIS and NIR and show almost neutral to slightly bluish spectral slopes with a steep absorption in the UV. Even very low iron content in the enstatites (below ~0.04 wt% FeO) results in weak but noticeable absorptions in the VNIR between 0.4 and 0.9 μm. Orthoenstatite and clinoenstatite are not distinguishable based only on their spectra in the VIS and NIR. At the Reststrahlen bands in the MIR, a systematic difference in the number and exact position of local minima at ~10 μm between clinoenstatite and orthoenstatite is evident. This can be used to distinguish the polymorphs in this wavelength range. Additionally, we can distinguish between Fe-free low- and high-Ca pyroxenes in the MIR.