
    Ranking-based Deep Cross-modal Hashing

    Cross-modal hashing has been receiving increasing interest for its low storage cost and fast query speed in multi-modal data retrieval. However, most existing hashing methods are based on hand-crafted or raw-level features of objects, which may not be optimally compatible with the coding process. Moreover, these hashing methods are mainly designed to handle simple pairwise similarity; the complex multilevel ranking semantic structure of instances associated with multiple labels has not been well explored yet. In this paper, we propose a ranking-based deep cross-modal hashing approach (RDCMH). RDCMH first uses the feature and label information of the data to derive a semi-supervised semantic ranking list. Next, to expand the semantic representation power of hand-crafted features, RDCMH integrates the semantic ranking information into deep cross-modal hashing and jointly optimizes the compatible parameters of the deep feature representations and of the hashing functions. Experiments on real multi-modal datasets show that RDCMH outperforms competitive baselines and achieves state-of-the-art performance in cross-modal retrieval applications.
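    To illustrate the first step, here is a minimal sketch of how a semantic ranking list could be derived from multi-label annotations. The Jaccard-style overlap and all names below are illustrative assumptions, not RDCMH's actual semi-supervised formulation, which also incorporates feature information:

    ```python
    # Hypothetical sketch: ranking instances for a query by multi-label
    # overlap, in the spirit of deriving a semantic ranking list.
    import numpy as np

    def semantic_ranking(labels: np.ndarray, query_idx: int) -> np.ndarray:
        """Rank all instances for one query by multi-label similarity.

        labels: (n, c) binary multi-hot label matrix.
        Returns instance indices sorted from most to least similar.
        """
        q = labels[query_idx]
        # Jaccard-style overlap between the query's labels and every instance.
        inter = labels @ q
        union = labels.sum(axis=1) + q.sum() - inter
        sim = inter / np.maximum(union, 1)
        # Stable sort keeps a deterministic order among ties.
        return np.argsort(-sim, kind="stable")

    labels = np.array([[1, 1, 0],
                       [1, 0, 0],
                       [0, 0, 1],
                       [1, 1, 0]])
    order = semantic_ranking(labels, query_idx=0)  # most similar first
    ```

    Such a ranking goes beyond the pairwise similar/dissimilar supervision used by many hashing methods: instances sharing more labels with the query are ranked strictly higher than those sharing fewer.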

    Coarse embeddings at infinity and generalized expanders at infinity

    We introduce a notion of coarse embedding at infinity into Hilbert space for metric spaces, which weakens the notion of fibred coarse embedding and far generalizes Gromov's concept of coarse embedding. It turns out that a residually finite group admits a coarse embedding into Hilbert space if and only if one (or equivalently, every) box space of the group admits a coarse embedding at infinity into Hilbert space. Moreover, we introduce a concept of generalized expander at infinity and show that it is an obstruction to coarse embeddability at infinity. Comment: 20 pages.
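    For reference, the classical notion of Gromov that this work generalizes can be stated as follows (standard definition, not the paper's new "at infinity" variant):

    ```latex
    % Gromov's coarse embedding into Hilbert space H.
    % A map f : X -> H is a coarse embedding if there exist
    % non-decreasing functions rho_-, rho_+ : [0,\infty) -> [0,\infty)
    % with rho_-(t) -> \infty as t -> \infty, such that for all x, y in X:
    \rho_-\bigl(d_X(x,y)\bigr) \;\le\; \|f(x) - f(y)\|_H \;\le\; \rho_+\bigl(d_X(x,y)\bigr).
    ```

    The "at infinity" version relaxes this so that the embedding condition is only required asymptotically, away from any fixed bounded region.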

    A Bott periodicity theorem for $\ell^p$-spaces and the coarse Novikov conjecture at infinity

    We formulate and prove a Bott periodicity theorem for an $\ell^p$-space ($1\leq p<\infty$). For a proper metric space $X$ with bounded geometry, we introduce a version of $K$-homology at infinity, denoted $K_*^{\infty}(X)$, and the Roe algebra at infinity, denoted $C^*_{\infty}(X)$. The coarse assembly map then descends to a map from $\lim_{d\to\infty}K_*^{\infty}(P_d(X))$ to $K_*(C^*_{\infty}(X))$, called the coarse assembly map at infinity. We show that to prove the coarse Novikov conjecture, it suffices to prove that the coarse assembly map at infinity is injective. As a result, we show that the coarse Novikov conjecture holds for any metric space with bounded geometry that admits a fibred coarse embedding into an $\ell^p$-space. These include all box spaces of a residually finite hyperbolic group and a large class of warped cones of a compact space with an action by a hyperbolic group. Comment: 55 pages.

    Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation

    Optical flow is a natural and valuable cue for advancing unsupervised video object segmentation (UVOS). Most previous methods directly extract and fuse motion and appearance features for segmenting target objects in the UVOS setting. However, optical flow is intrinsically the instantaneous velocity of all pixels between consecutive frames, so the motion features are often not well aligned with the primary objects across the corresponding frames. To address this challenge, we propose a concise, practical, and efficient architecture for appearance and motion feature alignment, dubbed the hierarchical feature alignment network (HFAN). Specifically, the key components of HFAN are the sequential Feature AlignMent (FAM) module and the Feature AdaptaTion (FAT) module, which process the appearance and motion features hierarchically. FAM aligns both the appearance and motion features with the primary-object semantic representation, while FAT is explicitly designed for the adaptive fusion of appearance and motion features to achieve a desirable trade-off between the cross-modal features. Extensive experiments demonstrate the effectiveness of the proposed HFAN, which reaches a new state-of-the-art performance on DAVIS-16, achieving an $88.7$ $\mathcal{J}\&\mathcal{F}$ mean, i.e., a relative improvement of 3.5% over the best published result. Comment: Accepted by ECCV-202
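    The adaptive-fusion idea behind a module like FAT can be illustrated with a toy gated fusion of two feature maps. This is a hedged sketch under simplifying assumptions (a single hand-set gate vector, no learning, no hierarchy), not the authors' implementation:

    ```python
    # Toy illustration of gated cross-modal fusion: a per-position gate
    # decides how much appearance vs. motion information to keep.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gated_fuse(appearance, motion, w):
        """Fuse two (c, h, w) feature maps with a per-position gate.

        w: (2c,) gate weights (learned in a real model; fixed here).
        """
        stacked = np.concatenate([appearance, motion], axis=0)     # (2c, h, w)
        gate = sigmoid(np.tensordot(w, stacked, axes=([0], [0])))  # (h, w)
        # gate -> 1 keeps appearance; gate -> 0 keeps motion.
        return gate * appearance + (1.0 - gate) * motion
    ```

    With zero gate weights the gate is 0.5 everywhere and the fusion reduces to a plain average; training the weights lets the model favor whichever modality is more reliable at each spatial position, which is the trade-off the abstract describes.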