
    Towards Optimal Discrete Online Hashing with Balanced Similarity

    When facing large-scale image datasets, online hashing serves as a promising solution for online retrieval and prediction tasks. It encodes the online streaming data into compact binary codes and simultaneously updates the hash functions to renew the codes of the existing dataset. However, existing methods update the hash functions solely based on the new data batch, without investigating the correlation between such new data and the existing dataset. In addition, existing works update the hash functions via a relaxation process in the corresponding approximated continuous space, and it remains an open problem to apply discrete optimization directly in online hashing. In this paper, we propose a novel supervised online hashing method, termed Balanced Similarity for Online Discrete Hashing (BSODH), to solve the above problems in a unified framework. BSODH employs a well-designed hashing algorithm to preserve the similarity between the streaming data and the existing dataset via an asymmetric graph regularization. We further identify the "data-imbalance" problem brought by the constructed asymmetric graph, which restricts the application of discrete optimization in our setting. Therefore, we propose a novel balanced similarity, which uses two equilibrium factors to balance the similar and dissimilar weights and thereby enables the use of discrete optimization. Extensive experiments conducted on three widely-used benchmarks demonstrate the advantages of the proposed method over state-of-the-art methods. Comment: 8 pages, 11 figures, conference.
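    A minimal sketch of the balanced-similarity idea is given below, assuming a label-based +1/-1 similarity between the streaming batch and the stored data; the equilibrium factors eta_s/eta_d, their default values, and the label-based construction are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def balanced_similarity(labels_new, labels_old, eta_s=1.2, eta_d=0.2):
    """Hedged sketch of a balanced asymmetric similarity matrix.

    Builds the similarity S between a streaming batch and the existing
    dataset from label agreement (+1 similar / -1 dissimilar), then
    scales the two cases with equilibrium factors to counter the
    dominance of dissimilar pairs (the "data-imbalance" problem).
    """
    # S[i, j] = +1 if the i-th new sample shares a label with the
    # j-th stored sample, else -1 (standard supervised similarity).
    S = np.where(labels_new[:, None] == labels_old[None, :], 1.0, -1.0)
    # Up-weight the scarce similar pairs, down-weight the abundant
    # dissimilar ones (eta_s and eta_d are assumed hyperparameters).
    return np.where(S > 0, eta_s * S, eta_d * S)

# Toy usage: 4 streaming samples against 6 stored samples.
S_balanced = balanced_similarity(np.array([0, 1, 2, 0]),
                                 np.array([0, 0, 1, 2, 3, 3]))
print(S_balanced.shape)  # (4, 6)
```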

    What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study

    Most existing work on one-stage referring expression comprehension (REC) focuses on multi-modal fusion and reasoning, while the influence of other factors in this task has not been explored in depth. To fill this gap, we conduct an empirical study in this paper. Concretely, we first build a very simple REC network called SimREC and ablate 42 candidate designs/settings, covering the entire process of one-stage REC from network design to model training. Afterwards, we conduct over 100 experimental trials on three REC benchmark datasets. The extensive experimental results not only reveal the key factors that affect REC performance beyond multi-modal fusion, e.g., multi-scale features and data augmentation, but also yield some findings that run counter to conventional understanding. For example, despite being a vision-and-language (V&L) task, REC is only weakly affected by the language prior. In addition, with a proper combination of these findings, we can improve the performance of SimREC by a large margin, e.g., +27.12% on RefCOCO+, outperforming all existing REC methods. Most encouragingly, with much less training overhead and far fewer parameters, SimREC still achieves better performance than a set of large-scale pre-trained models, e.g., UNITER and VILLA, highlighting the special role of REC in existing V&L research.
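    As a rough sketch of the study's protocol (not the paper's actual code), the loop below enumerates a small grid of candidate designs/settings and would train and evaluate one SimREC variant per configuration; the option names and run_trial are hypothetical placeholders, not the paper's actual 42 settings.

```python
from itertools import product

# Hypothetical ablation grid in the spirit of the SimREC study.
GRID = {
    "multi_scale_features": [False, True],
    "data_augmentation": ["none", "color_jitter", "random_affine"],
    "language_encoder": ["lstm", "bert"],
}

def run_trial(cfg):
    """Placeholder: train one SimREC variant with `cfg` and return its
    validation accuracy on a REC benchmark (e.g., RefCOCO+)."""
    raise NotImplementedError

for values in product(*GRID.values()):
    cfg = dict(zip(GRID, values))
    print(cfg)  # replace with: score = run_trial(cfg)
```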

    Towards Language-guided Visual Recognition via Dynamic Convolutions

    In this paper, we are committed to establishing a unified and end-to-end multi-modal network by exploring language-guided visual recognition. To approach this target, we first propose a novel multi-modal convolution module called Language-dependent Convolution (LaConv). Its convolution kernels are dynamically generated from natural language information, which helps extract differentiated visual features for different multi-modal examples. Based on the LaConv module, we further build the first fully language-driven convolution network, termed LaConvNet, which unifies visual recognition and multi-modal reasoning in one forward structure. To validate LaConv and LaConvNet, we conduct extensive experiments on four benchmark datasets of two vision-and-language tasks, i.e., visual question answering (VQA) and referring expression comprehension (REC). The experimental results not only show the performance gains of LaConv over existing multi-modal modules, but also demonstrate the merits of LaConvNet as a unified network, including its compact architecture, strong generalization ability, and excellent performance, e.g., +4.7% on RefCOCO+.
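    A minimal PyTorch-style sketch of a language-dependent convolution is given below: a pooled language vector is mapped to per-example convolution kernels, which are then applied through a grouped convolution. The single-linear kernel generator and all layer sizes are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LaConv(nn.Module):
    """Sketch of a language-dependent convolution: per-example kernels
    are predicted from a pooled language feature (illustrative only)."""

    def __init__(self, lang_dim, in_ch, out_ch, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        # Maps the language vector to a full set of conv weights.
        self.kernel_gen = nn.Linear(lang_dim, out_ch * in_ch * k * k)

    def forward(self, feat, lang):
        # feat: (B, in_ch, H, W); lang: (B, lang_dim)
        B = feat.size(0)
        w = self.kernel_gen(lang).view(B * self.out_ch, self.in_ch,
                                       self.k, self.k)
        # Grouped conv applies each example's own kernels to its
        # own feature map in a single batched call.
        feat = feat.reshape(1, B * self.in_ch, *feat.shape[2:])
        out = F.conv2d(feat, w, padding=self.k // 2, groups=B)
        return out.view(B, self.out_ch, *out.shape[2:])

# Toy usage: 2 images, 64-channel features, 256-dim sentence embeddings.
layer = LaConv(lang_dim=256, in_ch=64, out_ch=64)
out = layer(torch.randn(2, 64, 32, 32), torch.randn(2, 256))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```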