Combination of Multiple Bipartite Ranking for Web Content Quality Evaluation
Web content quality estimation is crucial to various web content processing
applications. Our previous work applied Bagging + C4.5, a combination of many
point-wise ranking models, to achieve the best results in the ECML/PKDD
Discovery Challenge 2010. In this paper, we combine multiple pair-wise
bipartite ranking learners to solve the multi-partite ranking problem of web
quality estimation. In the encoding stage, we present the ternary encoding and
the binary coding, which decompose each rank value into a set of bipartite
labels (L is the number of distinct rank values). For the decoding, we discuss
combining the results of multiple bipartite ranking models with predefined
weighting and with adaptive weighting. Experiments on the ECML/PKDD 2010
Discovery Challenge datasets show that \textit{binary coding} +
\textit{predefined weighting} yields the highest performance among all four
combinations and, moreover, surpasses the best results reported in the
ECML/PKDD 2010 Discovery Challenge competition.
Comment: 17 pages, 8 figures, 2 tables
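
A minimal sketch of one plausible reading of the binary coding + predefined
weighting scheme: an L-level ranking problem is decomposed into L-1 bipartite
"rank > t" sub-problems, and the per-model scores are recombined with fixed
weights. The class name, the uniform weights, and the use of scikit-learn's
GradientBoostingClassifier as the bipartite learner are illustrative
assumptions, not the authors' implementation.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

class BinaryCodedRanker:
    def __init__(self, n_levels, weights=None):
        self.n_levels = n_levels                      # L distinct rank values
        # Predefined (here: uniform) weights over the L-1 bipartite models.
        self.weights = weights or [1.0 / (n_levels - 1)] * (n_levels - 1)
        self.models = [GradientBoostingClassifier() for _ in range(n_levels - 1)]

    def fit(self, X, y):
        # Bipartite sub-problem t: does the example's rank exceed level t?
        for t, model in enumerate(self.models, start=1):
            model.fit(X, (y > t).astype(int))
        return self

    def score(self, X):
        # Decoding: weighted sum of the bipartite "rank > t" probabilities.
        probs = [m.predict_proba(X)[:, 1] for m in self.models]
        return sum(w * p for w, p in zip(self.weights, probs))

# Toy usage with ranks in {1, 2, 3}.
X = np.random.rand(200, 5)
y = np.random.randint(1, 4, size=200)
ranker = BinaryCodedRanker(n_levels=3).fit(X, y)
print(ranker.score(X[:5]))

Adaptive weighting would replace the fixed weights vector with values estimated
from the validation performance of each bipartite model.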
Context Does Matter: End-to-end Panoptic Narrative Grounding with Deformable Attention Refined Matching Network
Panoptic Narrative Grounding (PNG) is an emerging visual grounding task that
aims to segment visual objects in images based on dense narrative captions.
Current state-of-the-art methods first refine the phrase representations by
aggregating the most similar image pixels, and then match the refined text
representations against the pixels of the image feature map to generate
segmentation results. However, simply aggregating sampled image features
ignores contextual information, which can lead to phrase-to-pixel mismatch. In
this paper, we propose a novel learning framework called Deformable Attention
Refined Matching Network (DRMN), whose main idea is to bring deformable
attention into the iterative process of feature learning so as to incorporate
essential contextual information across different pixel scales. DRMN
iteratively re-encodes pixels with the deformable attention network after
updating the feature representations of the top-$k$ most similar pixels. As
such, DRMN produces accurate yet discriminative pixel representations,
purifies the top-$k$ most similar pixels, and consequently alleviates the
phrase-to-pixel mismatch substantially. Experimental results show that our
novel design significantly improves the matching between text phrases and
image pixels. Concretely, DRMN achieves new state-of-the-art performance on
the PNG benchmark with an average recall improvement of 3.5%. The code is
available at: https://github.com/JaMesLiMers/DRMN.
Comment: Accepted by ICDM 202
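
A minimal sketch of the iterative "aggregate the top-$k$ pixels, then
re-encode" loop described above, in PyTorch. torch.nn.MultiheadAttention is
used here only as a stand-in for the deformable attention module (which is not
part of core PyTorch), and the feature dimension, k, and the number of
iterations are illustrative assumptions rather than the published DRMN
configuration.

import torch
import torch.nn as nn

class IterativeTopKMatcher(nn.Module):
    def __init__(self, dim=256, k=32, n_iters=3):
        super().__init__()
        self.k, self.n_iters = k, n_iters
        # Stand-in for the deformable attention used to re-encode pixel features.
        self.pixel_refiner = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, phrase, pixels):
        # phrase: (B, P, D) phrase embeddings; pixels: (B, N, D) pixel features.
        for _ in range(self.n_iters):
            sim = phrase @ pixels.transpose(1, 2)            # (B, P, N) similarities
            topk = sim.topk(self.k, dim=-1).indices          # top-k pixels per phrase
            gathered = torch.gather(
                pixels.unsqueeze(1).expand(-1, phrase.size(1), -1, -1),
                2, topk.unsqueeze(-1).expand(-1, -1, -1, pixels.size(-1)))
            phrase = phrase + gathered.mean(dim=2)           # refine phrases with top-k pixels
            pixels, _ = self.pixel_refiner(pixels, pixels, pixels)  # re-encode pixels
        return phrase @ pixels.transpose(1, 2)               # final phrase-to-pixel scores

scores = IterativeTopKMatcher()(torch.randn(2, 4, 256), torch.randn(2, 1024, 256))
print(scores.shape)  # torch.Size([2, 4, 1024])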
AMatFormer: Efficient Feature Matching via Anchor Matching Transformer
Learning-based feature matching methods have been commonly studied in recent
years. The core issue in learning feature matching is how to learn (1)
discriminative representations for feature points (or regions) within each
image and (2) consensus representations for feature points across images.
Recently, self- and cross-attention models have been exploited to address this
issue. However, in many scenes, features are large-scale, redundant, and
contaminated with outliers. Previous self-/cross-attention models generally
conduct message passing over all primal features, which leads to redundant
learning and high computational cost. To mitigate these limitations, and
inspired by recent seed matching methods, in this paper we propose a novel,
efficient Anchor Matching Transformer (AMatFormer) for the feature matching
problem. AMatFormer has two main aspects. First, it mainly conducts
self-/cross-attention on a set of anchor features and leverages these anchor
features as a message bottleneck to learn the representations of all primal
features. Thus, it can be implemented efficiently and compactly. Second,
AMatFormer adopts a shared FFN module to further embed the features of the two
images into a common domain and thus learn consensus feature representations
for the matching problem. Experiments on several benchmarks demonstrate the
effectiveness and efficiency of the proposed AMatFormer matching approach.
Comment: Accepted by IEEE Transactions on Multimedia (TMM) 202
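
A minimal sketch of the anchor-as-message-bottleneck idea in PyTorch: attention
is confined to a small anchor set gathered from each image, the two anchor
sets exchange information by cross-attention, and a shared FFN embeds both
images' features into a common domain. The uniform anchor subsampling, module
layout, and dimensions are illustrative assumptions, not the published
AMatFormer architecture.

import torch
import torch.nn as nn

class AnchorBottleneckBlock(nn.Module):
    def __init__(self, dim=256, n_anchors=64, n_heads=8):
        super().__init__()
        self.n_anchors = n_anchors
        self.gather_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.scatter_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        # Shared FFN: the same weights embed features from both images.
        self.shared_ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))

    def _gather_anchors(self, feats):
        # Uniform subsampling is a stand-in for a learned anchor selector.
        step = max(1, feats.size(1) // self.n_anchors)
        anchors = feats[:, ::step]
        # Gather: anchors attend to all primal features of their own image.
        anchors, _ = self.gather_attn(anchors, feats, feats)
        return anchors

    def forward(self, feats_a, feats_b):
        anc_a, anc_b = self._gather_anchors(feats_a), self._gather_anchors(feats_b)
        # Cross-attention is confined to the compact anchor sets (the bottleneck).
        new_a, _ = self.cross_attn(anc_a, anc_b, anc_b)
        new_b, _ = self.cross_attn(anc_b, anc_a, anc_a)
        out = []
        for feats, anchors in ((feats_a, new_a), (feats_b, new_b)):
            # Scatter: every primal feature attends back to its anchor set, then
            # the shared FFN maps both images' features into the common domain.
            feats, _ = self.scatter_attn(feats, anchors, anchors)
            out.append(feats + self.shared_ffn(feats))
        return out

f_a, f_b = AnchorBottleneckBlock()(torch.randn(1, 2048, 256), torch.randn(1, 1500, 256))
print(f_a.shape, f_b.shape)  # torch.Size([1, 2048, 256]) torch.Size([1, 1500, 256])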