
124 research outputs found

Ranking-based Deep Cross-modal Hashing

Author: Domeniconi Carlotta
Guo Maozu
Liu Xuanwu
Ren Yazhou
Wang Jun
Yu Guoxian
Publication date: 11/05/2019

Cross-modal hashing has been receiving increasing interest for its low storage cost and fast query speed in multi-modal data retrieval. However, most existing hashing methods are based on hand-crafted or raw-level features of objects, which may not be optimally compatible with the coding process. Besides, these hashing methods are mainly designed to handle simple pairwise similarity. The complex multilevel ranking semantic structure of instances associated with multiple labels has not been well explored yet. In this paper, we propose a ranking-based deep cross-modal hashing approach (RDCMH). RDCMH first uses the feature and label information of data to derive a semi-supervised semantic ranking list. Next, to expand the semantic representation power of hand-crafted features, RDCMH integrates the semantic ranking information into deep cross-modal hashing and jointly optimizes the compatible parameters of deep feature representations and of hashing functions. Experiments on real multi-modal datasets show that RDCMH outperforms other competitive baselines and achieves state-of-the-art performance in cross-modal retrieval applications.
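
The storage and speed argument in this abstract can be illustrated with a minimal sketch of retrieval in a shared Hamming space. This is not the RDCMH method: the embeddings are hand-written toy values, sign thresholding stands in for a learned hashing function, and the names `to_binary_codes`, `hamming_distances`, and `retrieve` are hypothetical.

```python
import numpy as np

def to_binary_codes(embeddings: np.ndarray) -> np.ndarray:
    """Binarize real-valued embeddings by sign and pack 8 bits into each byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)

def hamming_distances(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Count differing bits between one packed query code and every database code."""
    xor = np.bitwise_xor(db_codes, query_code)  # set bits mark disagreements
    return np.unpackbits(xor, axis=1).sum(axis=1)

def retrieve(query_code: np.ndarray, db_codes: np.ndarray, top_k: int = 3) -> np.ndarray:
    """Return indices of the top_k database items closest in Hamming distance."""
    dists = hamming_distances(query_code, db_codes)
    return np.argsort(dists, kind="stable")[:top_k]

# Toy "image" embeddings (assumed to already live in a shared 8-dimensional space).
image_embeddings = np.array([
    [0.9, -0.2, 0.4, -0.7, 0.1, -0.3, 0.8, -0.5],
    [-0.6, 0.3, -0.1, 0.5, -0.9, 0.2, -0.4, 0.7],
    [0.2, 0.6, -0.8, -0.1, 0.3, 0.9, -0.5, -0.4],
])
# Toy "text" query embedding in the same space.
text_embedding = np.array([[0.3, 0.5, -0.2, -0.6, 0.4, 0.1, -0.7, 0.2]])

db_codes = to_binary_codes(image_embeddings)    # one byte per image
query_code = to_binary_codes(text_embedding)    # one byte for the query
ranking = retrieve(query_code, db_codes, top_k=3)
print(ranking)
```

Each item is stored as a single byte here instead of eight floats, and ranking needs only XOR and bit counting, which is the source of the low storage cost and fast query speed the abstract refers to.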

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Coarse embeddings at infinity and generalized expanders at infinity

Author: Deng Jintao
Guo Liang
Wang Qin
Zhang Yazhou
Publication date: 15/07/2022

We introduce a notion of coarse embedding at infinity into Hilbert space for metric spaces, which is a weakening of the notion of fibred coarse embedding and a far-reaching generalization of Gromov's concept of coarse embedding. It turns out that a residually finite group admits a coarse embedding into Hilbert space if and only if one (or equivalently, every) box space of the group admits a coarse embedding at infinity into Hilbert space. Moreover, we introduce a concept of generalized expander at infinity and show that it is an obstruction to coarse embeddability at infinity. Comment: 20 pages
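
For orientation, the standard definition of Gromov's coarse embedding, which this abstract generalizes, can be written out as follows. The symbols $f$, $\rho_-$, and $\rho_+$ are notation chosen here and do not come from the listing; the paper's own definition of coarse embedding at infinity is not reproduced.

```latex
% A map f from a metric space (X, d) to a Hilbert space H is a coarse embedding
% if there exist non-decreasing functions \rho_-, \rho_+ : [0,\infty) \to [0,\infty)
% with \lim_{t \to \infty} \rho_{\pm}(t) = \infty such that, for all x, y \in X,
\[
  \rho_-\bigl(d(x,y)\bigr) \;\le\; \lVert f(x) - f(y) \rVert \;\le\; \rho_+\bigl(d(x,y)\bigr).
\]
```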

arXiv.org e-Print Archive

A Bott periodicity theorem for $\ell^p$-spaces and the coarse Novikov conjecture at infinity

Author: Guo Liang
Luo Zheng
Wang Qin
Zhang Yazhou
Publication date: 18/07/2022

We formulate and prove a Bott periodicity theorem for an $\ell^p$-space ($1\leq p<\infty$). For a proper metric space $X$ with bounded geometry, we introduce a version of $K$-homology at infinity, denoted by $K_*^{\infty}(X)$, and the Roe algebra at infinity, denoted by $C^*_{\infty}(X)$. Then the coarse assembly map descends to a map from $\lim_{d\to\infty}K_*^{\infty}(P_d(X))$ to $K_*(C^*_{\infty}(X))$, called the coarse assembly map at infinity. We show that to prove the coarse Novikov conjecture, it suffices to prove the coarse assembly map at infinity is an injection. As a result, we show that the coarse Novikov conjecture holds for any metric space with bounded geometry which admits a fibred coarse embedding into an $\ell^p$-space. These include all box spaces of a residually finite hyperbolic group and a large class of warped cones of a compact space with an action by a hyperbolic group. Comment: 55 pages

arXiv.org e-Print Archive

Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation

Author: Pei Gensheng
Shen Fumin
Tang Jinhui
Tang Zhenmin
Xie Guo-Sen
Yao Yazhou
Publication date: 19/07/2022

Optical flow is an easily conceived and precious cue for advancing unsupervised video object segmentation (UVOS). Most of the previous methods directly extract and fuse the motion and appearance features for segmenting target objects in the UVOS setting. However, optical flow is intrinsically an instantaneous velocity of all pixels among consecutive frames, thus making the motion features not aligned well with the primary objects among the corresponding frames. To solve the above challenge, we propose a concise, practical, and efficient architecture for appearance and motion feature alignment, dubbed hierarchical feature alignment network (HFAN). Specifically, the key merits in HFAN are the sequential Feature AlignMent (FAM) module and the Feature AdaptaTion (FAT) module, which are leveraged for processing the appearance and motion features hierarchically. FAM is capable of aligning both appearance and motion features with the primary object semantic representations, respectively. Further, FAT is explicitly designed for the adaptive fusion of appearance and motion features to achieve a desirable trade-off between cross-modal features. Extensive experiments demonstrate the effectiveness of the proposed HFAN, which reaches a new state-of-the-art performance on DAVIS-16, achieving 88.7 $\mathcal{J}\&\mathcal{F}$ Mean, i.e., a relative improvement of 3.5% over the best published result. Comment: Accepted by ECCV-2022
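
The reported metric combines region similarity $\mathcal{J}$ (the Jaccard index, or intersection over union, of predicted and ground-truth masks) with boundary accuracy $\mathcal{F}$. The sketch below computes only $\mathcal{J}$ for one frame, on toy masks invented for illustration. It is not the HFAN code or the official DAVIS evaluation toolkit, the boundary measure $\mathcal{F}$ is omitted, and the function name `region_similarity` is hypothetical.

```python
import numpy as np

def region_similarity(pred_mask, gt_mask) -> float:
    """Jaccard index (intersection over union) of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks empty: treated as a perfect match (a convention assumed here).
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)

# Toy 3x4 masks: the prediction shares 3 of the 4 ground-truth pixels
# and adds 1 false-positive pixel, so J = 3 / 5.
gt = np.array([[0, 1, 1, 0],
               [0, 1, 1, 0],
               [0, 0, 0, 0]])
pred = np.array([[0, 1, 1, 1],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0]])

j_score = region_similarity(pred, gt)
print(round(j_score, 3))
```

A benchmark score would average such per-frame values over all frames and sequences, then average the result with the corresponding $\mathcal{F}$ score.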

arXiv.org e-Print Archive