Bidirectional Correlation-Driven Inter-Frame Interaction Transformer for Referring Video Object Segmentation
Referring video object segmentation (RVOS) aims to segment the target object
in a video sequence described by a language expression. Typical multimodal
Transformer-based RVOS approaches process video sequences in a frame-independent
manner to reduce the high computational cost. This, however, restricts
performance, because it leaves out the inter-frame interaction needed for temporal
coherence modeling and spatio-temporal representation learning of the referred object.
Besides, the absence of sufficient cross-modal interactions results in weak
correlation between the visual and linguistic features, which increases the
difficulty of decoding the target information and limits the performance of the
model. In this paper, we propose a bidirectional correlation-driven inter-frame
interaction Transformer, dubbed BIFIT, to address these issues in RVOS.
Specifically, we design a lightweight and plug-and-play inter-frame interaction
module in the Transformer decoder to efficiently learn the spatio-temporal
features of the referred object, so as to decode the object information in the
video sequence more precisely and generate more accurate segmentation results.
Moreover, a bidirectional vision-language interaction module is implemented
before the multimodal Transformer to enhance the correlation between the visual
and linguistic features, thus helping the language queries decode more
precise object information from visual features and ultimately improving the
segmentation performance. Extensive experimental results on four benchmarks
validate the superiority of our BIFIT over state-of-the-art methods and the
effectiveness of our proposed modules.
2-Amino-4,6-dimethylpyrimidine–benzoic acid (1/1)
The crystal of the title compound, C6H9N3·C7H6O2, contains tetrameric hydrogen-bonded units comprising a central pair of 2-aminopyrimidine molecules linked across a centre of inversion by N—H⋯N hydrogen bonds and two pendant benzoic acid molecules attached through N—H⋯O and O—H⋯N hydrogen bonds. These hydrogen-bonded units are arranged into layers in (002).
4-Methyl-6-phenylpyrimidin-2-amine
The title compound, C11H11N3, was synthesized as part of our research into functionalized pyrimidines. It crystallizes with two independent molecules in the asymmetric unit that differ only in the twist between the two aromatic rings; the torsion angles between the rings are 29.9 (2) and 45.1 (2)°. The crystal packing is dominated by intermolecular N—H⋯N hydrogen bonds between independent and equivalent molecules, forming an infinite three-dimensional network.
Strategies for Searching Video Content with Text Queries or Video Examples
The large number of user-generated videos uploaded to the Internet
every day has led to many commercial video search engines, which mainly rely on
text metadata for search. However, metadata is often lacking for user-generated
videos, so these videos are unsearchable by current search engines.
Content-based video retrieval (CBVR) tackles this metadata-scarcity
problem by directly analyzing the visual and audio streams of each video. CBVR
encompasses multiple research topics, including low-level feature design,
feature fusion, semantic detector training and video search/reranking. We
present novel strategies in these topics to enhance CBVR in both accuracy and
speed under different query inputs, including pure textual queries and query by
video examples. Our proposed strategies have been incorporated into our
submission for the TRECVID 2014 Multimedia Event Detection evaluation, where
our system outperformed other submissions in both text queries and video
example queries, thus demonstrating the effectiveness of our proposed
approaches.
Defense against Adversarial Cloud Attack on Remote Sensing Salient Object Detection
Detecting the salient objects in a remote sensing image has wide applications
in interdisciplinary research. Many deep learning methods have
been proposed for Salient Object Detection (SOD) in remote sensing images and
have achieved remarkable results. However, recent adversarial attack examples,
generated by changing a few pixel values in the original remote sensing image,
can cause a well-trained deep learning based SOD model to collapse.
Unlike existing methods that add perturbations to the original images, we
propose to jointly tune adversarial exposure and additive perturbation for the
attack and to constrain the attacked image to be close to a cloudy image; we call
this attack Adversarial Cloud. Clouds are natural and common in remote sensing
images; however, cloud-camouflaged adversarial attacks and defenses for remote
sensing images have not been well studied.
Furthermore, we design DefenseNet as a learnable pre-processing step for the
adversarial cloudy images so as to preserve the performance of the deep
learning based remote sensing SOD model, without tuning the already deployed
deep SOD model. By considering both regular and generalized adversarial
examples, the proposed DefenseNet can defend against the proposed Adversarial Cloud
in the white-box setting and against other attack methods in the black-box setting.
Experimental results on a synthesized benchmark from the public remote sensing SOD
dataset (EORSSD) show promising defense against adversarial cloud attacks.
Rank Optimization for MIMO systems with RIS: Simulation and Measurement
Reconfigurable intelligent surface (RIS) is a promising technology that can
reshape the electromagnetic environment in wireless networks, offering various
possibilities for enhancing wireless channels. Motivated by this, we
investigate the channel optimization for multiple-input multiple-output (MIMO)
systems assisted by RIS. In this paper, an efficient RIS optimization method is
proposed to enhance the effective rank of the MIMO channel for achievable rate
improvement. Numerical results are presented to verify the effectiveness of RIS
in improving MIMO channels. Additionally, we construct a 2×2
RIS-assisted MIMO prototype to perform experimental measurements and validate
the performance of our proposed algorithm. The results reveal a significant
increase in effective rank and achievable rate for the RIS-assisted MIMO
channel compared to the MIMO channel without RIS.
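
The abstract does not define its metrics, so the following is only a rough sketch of how effective rank and achievable rate might be computed for a small channel. It assumes the entropy-based definition of effective rank (the exponential of the Shannon entropy of the normalized singular values) and equal power allocation across transmit antennas; neither is confirmed by the abstract. The function names and the 2×2 example matrices are hypothetical, chosen for illustration.

```python
import math


def singular_values_2x2(h):
    """Singular values of a 2x2 (real or complex) matrix, largest first.

    Uses the identities s1^2 + s2^2 = ||H||_F^2 and s1 * s2 = |det(H)|.
    """
    (a, b), (c, d) = h
    frob = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det = abs(a * d - b * c)
    disc = math.sqrt(max(frob ** 2 - 4 * det ** 2, 0.0))
    s1 = math.sqrt((frob + disc) / 2)
    s2 = math.sqrt(max((frob - disc) / 2, 0.0))
    return [s1, s2]


def effective_rank(singular_values):
    """Assumed definition: exp of the entropy of the normalized singular values."""
    total = sum(singular_values)
    if total == 0:
        return 0.0
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


def achievable_rate(singular_values, snr):
    """Rate in bit/s/Hz, assuming equal power split over the eigenmodes."""
    n = len(singular_values)
    return sum(math.log2(1 + snr / n * s ** 2) for s in singular_values)


# Hypothetical channels: a rank-deficient one and a well-conditioned one.
h_poor = [[1.0, 1.0], [1.0, 1.0]]
h_good = [[1.0, 0.0], [0.0, 1.0]]
for name, h in (("poor", h_poor), ("good", h_good)):
    sv = singular_values_2x2(h)
    print(name, effective_rank(sv), achievable_rate(sv, snr=10.0))
```

Under these assumptions, a channel whose singular values are closer to equal has a higher effective rank, which is the quantity the RIS configuration is said to increase.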