Video Question Answering via Attribute-Augmented Attention Network Learning
Video Question Answering is a challenging problem in visual information
retrieval, which provides the answer to the referenced video content according
to the question. However, the existing visual question answering approaches
mainly tackle the problem of static image question answering, which may be
ineffective for video question answering because they do not sufficiently model
the temporal dynamics of video contents. In this paper, we study the problem of video
question answering by modeling its temporal dynamics with frame-level attention
mechanism. We propose the attribute-augmented attention network learning
framework that enables the joint frame-level attribute detection and unified
video representation learning for video question answering. We then incorporate
the multi-step reasoning process for our proposed attention network to further
improve the performance. We construct a large-scale video question answering
dataset. We conduct the experiments on both multiple-choice and open-ended
video question answering tasks to show the effectiveness of the proposed
method.
Comment: Accepted for SIGIR 201
Survey of the status quo of TB infection of medical personnel in infectious diseases hospital
Strengthening climate prevention through economic globalization, clean energy, and financial development in N11 countries: evidence from advance panel estimations
This study evaluates the relevance of economic globalization,
financial development, and clean energy in strengthening the
environmental sustainability of the Next Eleven (N11) economies
over the period 1995–2018. In order to achieve the
objective of this study, the advanced panel estimation techniques
of unit root testing, and the cointegration analysis have been
applied due to the presence of the cross-sectional-dependence
and heterogeneity of the slope parameters in the panel data. The
long-run output coefficients have been estimated through the
Cross-Sectional Autoregressive Distributive Lag Model (CS-ARDL).
Moreover, the causality test for a heterogeneous panel has also
been employed in order to determine the causal relationships
among the variables that are under study. Our empirical findings
indicate that financial development and economic globalization
tend to contribute to the deterioration of environmental quality,
whereas clean energy improves it. A bidirectional causal
relationship is found between CO2 emissions and all the variables.
Based on these findings, the study recommends adopting economic
growth policies that are aligned with the defined environmental
regulations, promoting the use of cleaner energy resources such as
renewable energy, and incorporating environmental welfare goals
into the financial development plans of N11 economies.
Rethinking Multi-Modal Alignment in Video Question Answering from Feature and Sample Perspectives
Reasoning about causal and temporal event relations in videos is a new
goal of Video Question Answering (VideoQA). The major stumbling block to
achieving this goal is the semantic gap between language and video, since they
are at different levels of abstraction. Existing efforts mainly focus on
designing sophisticated architectures while utilizing frame- or object-level
visual representations. In this paper, we reconsider the multi-modal alignment
problem in VideoQA from feature and sample perspectives to achieve better
performance. From the feature perspective, we break down the video into trajectories
and first leverage trajectory features in VideoQA to enhance the alignment
between two modalities. Moreover, we adopt a heterogeneous graph architecture
and design a hierarchical framework to align both trajectory-level and
frame-level visual features with language features. In addition, we found that
VideoQA models are largely dependent on language priors and always neglect
visual-language interactions. Thus, two effective yet portable training
augmentation strategies are designed to strengthen the cross-modal
correspondence ability of our model from the sample perspective. Extensive results
show that our method outperforms all the state-of-the-art models on the
challenging NExT-QA benchmark, which demonstrates the effectiveness of the
proposed method.
Multistaged discharge constructing heterostructure with enhanced solid-solution behavior for long-life lithium-oxygen batteries.
Inferior charge transport in insulating and bulk discharge products is one of the main factors resulting in poor cycling stability of lithium-oxygen batteries with high overpotential and large capacity decay. Here we report a two-step oxygen reduction approach by pre-depositing a potassium carbonate layer on the cathode surface in a potassium-oxygen battery to direct the growth of defective film-like discharge products in the successive cycling of lithium-oxygen batteries. The formation of defective film with improved charge transport and large contact area with a catalyst plays a critical role in the facile decomposition of discharge products and the sustained stability of the battery. Multistaged discharge constructing lithium peroxide-based heterostructure with band discontinuities and a relatively low lithium diffusion barrier may be responsible for the growth of defective film-like discharge products. This strategy offers a promising route for future development of cathode catalysts that can be used to extend the cycling life of lithium-oxygen batteries