Video Captioning with Guidance of Multimodal Latent Topics
The topic diversity of open-domain videos leads to various vocabularies and
linguistic expressions in describing video contents, and therefore, makes the
video captioning task even more challenging. In this paper, we propose a
unified caption framework, M&M TGM, which mines multimodal topics in an
unsupervised fashion from data and guides the caption decoder with these
topics. Compared to pre-defined topics, the mined multimodal topics are more
semantically and visually coherent and can reflect the topic distribution of
videos better. We formulate the topic-aware caption generation as a multi-task
learning problem, in which we add a parallel task, topic prediction, in
addition to the caption task. For the topic prediction task, we use the mined
topics as the teacher to train a student topic prediction model, which learns
to predict the latent topics from multimodal contents of videos. The topic
prediction provides intermediate supervision to the learning process. As for
the caption task, we propose a novel topic-aware decoder to generate more
accurate and detailed video descriptions with the guidance from latent topics.
The entire learning procedure is end-to-end and it optimizes both tasks
simultaneously. The results from extensive experiments conducted on the MSR-VTT
and Youtube2Text datasets demonstrate the effectiveness of our proposed model.
M&M TGM not only outperforms prior state-of-the-art methods on multiple
evaluation metrics and on both benchmark datasets, but also achieves better
generalization ability. Comment: ACM Multimedia 201
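The multi-task formulation described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the KL-divergence form of the topic loss, the weighting parameter `topic_weight`, and all function names are assumptions made for illustration, since the abstract only states that a caption task and a teacher-supervised topic prediction task are optimized jointly.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def caption_loss(word_logits, target_ids):
    # Mean token-level cross-entropy; word_logits has shape (T, V).
    probs = softmax(word_logits)
    picked = probs[np.arange(len(target_ids)), target_ids]
    return float(-np.log(picked + 1e-12).mean())

def topic_loss(topic_logits, teacher_topics):
    # Assumed form: KL(teacher || student) over K mined topics.
    student = softmax(topic_logits)
    teacher = np.asarray(teacher_topics, dtype=float)
    return float(np.sum(teacher * (np.log(teacher + 1e-12) - np.log(student + 1e-12))))

def joint_loss(word_logits, target_ids, topic_logits, teacher_topics, topic_weight=0.5):
    # Both tasks contribute to one objective, so a single backward pass
    # would update shared parameters for captioning and topic prediction.
    return caption_loss(word_logits, target_ids) + topic_weight * topic_loss(topic_logits, teacher_topics)

# Hypothetical toy inputs: 3 caption tokens, vocabulary of 4, 2 topics.
word_logits = np.zeros((3, 4))
target_ids = np.array([0, 2, 1])
topic_logits = np.zeros(2)
teacher_topics = np.array([0.5, 0.5])
total = joint_loss(word_logits, target_ids, topic_logits, teacher_topics)
```

With uniform word logits and a student that already matches the teacher, the topic term vanishes and the total reduces to the caption cross-entropy alone.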
Low-cost and low-topography fabrication of multilayer interconnections for microfluidic devices
Multilayer interconnections are needed for microdevices with a large number of independent electrodes. A multi-level photolithographic process is commonly employed to provide multilayer interconnections in integrated circuit (IC) devices, but it is often too expensive for the large-area or disposable devices frequently needed for microfluidics. The printed circuit board (PCB) can provide multilayer interconnection at low cost, but its rough topography makes it difficult for small droplets to slide over. Here we report a low-cost fabrication of low-topography multilayer interconnects by selective and controlled anodization of thin-film metal layers. The process utilizes anodization of metal (tantalum in this paper) or, more specifically, repetitions of a partial anodization to form insulation layers between conductive layers and a full anodization to form isolating regions between electrodes, replacing the usual process of depositing, planarizing, and etching insulation layers. After verifying that the electrical connections and insulation work as intended, the developed method is applied to electrowetting-on-dielectric (EWOD), whose complex microfluidic products are currently built on PCB or thin-film transistor (TFT) substrates. To demonstrate the utility, we fabricated a three-metal-layer EWOD device with steps (surface topography) of less than 1 micrometer (vs. > 10 micrometers for PCB EWOD devices) and confirmed basic digital microfluidic operations.
Economic feasibility analysis of a renewable energy project in the rural China
In this paper, an economic feasibility analysis of a wind farm is presented. Three situations are considered: a cost-benefit analysis of the current situation, a government subsidy on the wind power price, and the Clean Development Mechanism (CDM) applied to the wind farm. Results show that a wind power generating system is a good choice for both energy saving and GHG emission reduction compared with the other power generation systems. It is also shown that the construction of a wind farm is an attractive choice for investors. Finally, the CDM program and the government subsidy on wind power are suggested as two efficient approaches to boost wind power development.
Information Geometry Theoretic Measures for Characterizing Neural Information Processing from Simulated EEG Signals
In this work, we explore information geometry theoretic measures for characterizing neural information processing from EEG signals simulated by stochastic nonlinear coupled oscillator models for both healthy subjects and Alzheimer’s disease (AD) patients under both eyes-closed and eyes-open conditions. In particular, we employ information rates to quantify the time evolution of probability density functions of simulated EEG signals, and employ causal information rates to quantify one signal’s instantaneous influence on another signal’s information rate. These two measures help us find significant and interesting distinctions between healthy subjects and AD patients when they open or close their eyes. These distinctions may be further related to differences in neural information processing activities of the corresponding brain regions, and to differences in connectivities among these brain regions. Our results show that information rate and causal information rate are superior to their more traditional or established information-theoretic counterparts, i.e., differential entropy and transfer entropy, respectively. Since these novel information geometry theoretic measures can be applied to experimental EEG signals in a model-free manner, and they are capable of quantifying the non-stationary time-varying effects, nonlinearity, and non-Gaussian stochasticity present in real-world EEG signals, we believe that they can form an important and powerful tool-set both for understanding neural information processing in the brain and for the diagnosis of neurological disorders, such as Alzheimer’s disease as presented in this work.
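As a rough illustration of what an information rate measures, the sketch below evaluates it numerically for a time-dependent probability density. It assumes the definition commonly used in the information geometry literature, Γ(t)² = ∫ (∂p/∂t)² / p dx, which the abstract itself does not state. The drifting Gaussian test density, the grid, and all function names are hypothetical choices for illustration and are unrelated to the EEG models in the paper.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    # Normal density evaluated on a grid.
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def information_rate_squared(pdf_at, x, t, dt=1e-4):
    # Assumed definition: integral over x of (dp/dt)^2 / p,
    # with dp/dt estimated by a central finite difference in time.
    p = pdf_at(x, t)
    dp_dt = (pdf_at(x, t + dt) - pdf_at(x, t - dt)) / (2.0 * dt)
    mask = p > 1e-300  # skip grid points where the density underflows
    integrand = np.zeros_like(p)
    integrand[mask] = dp_dt[mask] ** 2 / p[mask]
    dx = x[1] - x[0]
    return float(np.sum(integrand) * dx)  # Riemann sum on a uniform grid

def drifting_gaussian(x, t, velocity=1.0, std=0.5):
    # Hypothetical test density: fixed width, mean moving at constant velocity.
    return gaussian_pdf(x, velocity * t, std)

x = np.linspace(-10.0, 10.0, 4001)
gamma_sq = information_rate_squared(drifting_gaussian, x, t=0.0)
analytic = (1.0 / 0.5) ** 2  # velocity^2 / std^2 for this test density
```

For this test density the integral has the closed form velocity² / std², so `gamma_sq` should agree closely with `analytic`. A faster-moving or narrower density yields a larger rate, reflecting a density that changes more quickly relative to its own uncertainty.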