Interpretation on Multi-modal Visual Fusion
In this paper, we present an analytical framework and a novel metric to shed
light on the interpretation of the multimodal vision community. Our approach
involves measuring the proposed semantic variance and feature similarity across
modalities and levels, and conducting semantic and quantitative analyses
through comprehensive experiments. Specifically, we investigate the consistency
and speciality of representations across modalities, evolution rules within
each modality, and the collaboration logic used when optimizing a
multi-modality model. Our studies reveal several important findings, such as
the discrepancy in cross-modal features and the hybrid multi-modal cooperation
rule, which highlights consistency and speciality simultaneously for
complementary inference. Through our dissection and findings on multi-modal
fusion, we facilitate a rethinking of the reasonability and necessity of
popular multi-modal vision fusion strategies. Furthermore, our work lays the
foundation for designing a trustworthy and universal multi-modal fusion model
for a variety of tasks in the future.
Comment: This version was under review since 2023/3/
A Scalable Neural Network for DSIC Affine Maximizer Auction Design
Automated auction design aims to find empirically high-revenue mechanisms
through machine learning. Existing work on multi-item auctions can be
roughly divided into RegretNet-like approaches and affine maximizer auction (AMA)
approaches. However, the former cannot strictly ensure dominant strategy
incentive compatibility (DSIC), while the latter faces scalability issues due to
the large number of allocation candidates. To address these limitations, we
propose AMenuNet, a scalable neural network that constructs the AMA parameters
(even including the allocation menu) from bidder and item representations.
AMenuNet is always DSIC and individually rational (IR) due to the properties of
AMAs, and it enhances scalability by generating candidate allocations through a
neural network. Additionally, AMenuNet is permutation equivariant, and its
number of parameters is independent of auction scale. We conduct extensive
experiments to demonstrate that AMenuNet outperforms strong baselines in both
contextual and non-contextual multi-item auctions, scales well to larger
auctions, generalizes well to different settings, and identifies useful
deterministic allocations. Overall, our proposed approach offers an effective
solution to automated DSIC auction design, with improved scalability and strong
revenue performance in various settings.
Comment: NeurIPS 2023 (spotlight)
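To make the AMA mechanism mentioned above concrete, here is a minimal sketch of the standard affine maximizer allocation and payment rule over a fixed menu. It is not AMenuNet itself: the function name `run_ama`, the argument layout, and the toy menu are assumptions for illustration, and the menu, weights, and boosts are given by hand rather than produced by a neural network.

```python
from typing import List, Optional, Sequence, Tuple


def run_ama(
    values: Sequence[Sequence[float]],
    weights: Sequence[float],
    boosts: Sequence[float],
) -> Tuple[int, List[float]]:
    """Run an affine maximizer auction over a fixed allocation menu.

    values[i][k] is bidder i's reported value for menu allocation k,
    weights[i] > 0 is bidder i's weight, boosts[k] is allocation k's boost.
    Returns the index of the chosen allocation and each bidder's payment.
    (Hypothetical helper, written only for this illustration.)
    """
    n, m = len(values), len(boosts)

    def affine_welfare(k: int, skip: Optional[int] = None) -> float:
        # Weighted welfare of allocation k plus its boost, optionally
        # leaving out one bidder's contribution.
        total = sum(weights[i] * values[i][k] for i in range(n) if i != skip)
        return total + boosts[k]

    # Allocation rule: maximize the boosted, weighted welfare.
    winner = max(range(m), key=affine_welfare)

    # Payment rule: weighted externality that bidder i imposes on the others.
    payments = []
    for i in range(n):
        best_without_i = max(affine_welfare(k, skip=i) for k in range(m))
        payments.append((best_without_i - affine_welfare(winner, skip=i)) / weights[i])
    return winner, payments


# Toy menu for one item and two bidders:
# allocation 0 gives the item to bidder 0, 1 gives it to bidder 1, 2 to nobody.
values = [[5, 0, 0], [0, 3, 0]]
print(run_ama(values, weights=[1, 1], boosts=[0, 0, 0]))  # (0, [3.0, 0.0])
print(run_ama(values, weights=[1, 1], boosts=[0, 0, 4]))  # (0, [4.0, 0.0])
```

With unit weights and zero boosts the rule reduces to a second-price auction; a boost on the "nobody wins" allocation acts like a reserve price, which is why the winner's payment rises from 3 to 4 in the second call.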
KALMANBOT: KalmanNet-Aided Bollinger Bands for Pairs Trading
Pairs trading is a family of trading policies based on monitoring the
relationships between pairs of assets. A common pairs trading approach relies
on state space (SS) modeling, from which financial indicators can be obtained
with low complexity and latency using a Kalman filter (KF), and processed using
classic policies such as Bollinger bands (BB). However, such SS models are
inherently approximated and mismatched, often degrading the revenue. In this
work we propose KalmanBOT, a data-aided policy that preserves the advantages of
KF-aided BB policies while leveraging data to overcome the approximated nature
of the SS model. We adopt the recent KalmanNet architecture, and approximate
the BB policy with a differentiable mapping, converting the policy into a
trainable model. We empirically demonstrate that KalmanBOT yields improved
rewards compared with model-based and data-driven benchmarks.
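The model-based baseline described above (a Kalman filter feeding a Bollinger bands policy) can be sketched as follows. This is a generic illustration, not the KalmanBOT architecture: the scalar random-walk model for the hedge ratio, the noise parameters `q` and `r`, the exit-at-the-mean rule, and both function names are assumptions chosen for brevity.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def kalman_spread(
    x: Sequence[float],
    y: Sequence[float],
    q: float = 1e-4,
    r: float = 1e-2,
    beta0: float = 0.0,
    p0: float = 1.0,
) -> Tuple[List[float], float]:
    """Track a hedge ratio beta in y_t = beta_t * x_t + noise with a scalar
    Kalman filter (beta follows a random walk). The innovation is used as
    the spread. Returns the spread series and the final beta estimate."""
    beta, p = beta0, p0
    spreads = []
    for xt, yt in zip(x, y):
        p = p + q                    # predict step: random-walk state
        e = yt - beta * xt           # innovation, used here as the spread
        s = xt * p * xt + r          # innovation variance
        k = p * xt / s               # Kalman gain
        beta = beta + k * e          # update state estimate
        p = (1.0 - k * xt) * p       # update state variance
        spreads.append(e)
    return spreads, beta


def bollinger_positions(
    spread: Sequence[float], window: int = 20, width: float = 2.0
) -> List[int]:
    """Return a position per time step: -1 (short the spread), +1 (long),
    0 (flat). Bands are mean +/- width * std over the previous `window`
    values; an open position is closed when the spread reverts to the mean."""
    positions, pos = [], 0
    for t, s in enumerate(spread):
        if t < window:
            positions.append(0)      # not enough history yet
            continue
        hist = spread[t - window:t]
        mu, sd = mean(hist), pstdev(hist)
        if s > mu + width * sd:
            pos = -1                 # spread unusually high: short it
        elif s < mu - width * sd:
            pos = 1                  # spread unusually low: go long
        elif (pos == -1 and s <= mu) or (pos == 1 and s >= mu):
            pos = 0                  # reverted to the mean: close
        positions.append(pos)
    return positions


spread = [0.0, 1.0] * 5 + [5.0, 0.5, -5.0]
print(bollinger_positions(spread, window=10))
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1]
```

The thresholding in `bollinger_positions` is a hard, non-differentiable decision, which is the step the abstract says is replaced by a differentiable mapping so the policy can be trained end to end.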
Have media texts become more humorous?
As a research topic, humour has drawn much attention from multiple disciplines, including linguistics. Based on Engelthaler & Hills’ (2018) humour scale, this study developed a measure named the Humour Index (HMI) to quantify the degree of humour of texts. This measure was applied to examine diachronic changes in the degree of humour of American newspapers and magazines over a span of 118 years (1900-2017), using texts from the Corpus of Historical American English (COHA). The study also discussed the contributions of different types of words to the degree of humour in the two genres. The results show significant upward trends in the degree of humour of both newspapers and magazines in the examined period. Moreover, derogatory and offensive words are found to be used less frequently than other categories of words in both genres. This study provides both theoretical and methodological implications for humour studies and for claims or hypotheses of previous research, such as infotainment and linguistic positivity bias.
Relational Learning between Multiple Pulmonary Nodules via Deep Set Attention Transformers
Diagnosis and treatment of multiple pulmonary nodules are clinically
important but challenging. Prior studies on nodule characterization apply
solitary-nodule approaches to patients with multiple nodules, which ignores the
relations between nodules. In this study, we propose a multiple instance
learning (MIL) approach and empirically demonstrate the benefit of learning the
relations between multiple nodules. By treating the multiple nodules from the
same patient as a whole, critical relational information between
solitary-nodule voxels is extracted. To our knowledge, this is the first study to
learn the relations between multiple pulmonary nodules. Inspired by recent
advances in the natural language processing (NLP) domain, we introduce a
self-attention transformer equipped with a 3D CNN, named NoduleSAT, to replace
typical pooling-based aggregation in multiple instance learning. Extensive
experiments on lung nodule false positive reduction on the LUNA16 database, and
malignancy classification on the LIDC-IDRI database, validate the effectiveness of
the proposed method.
Comment: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI
2020)
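As a rough illustration of replacing pooling with self-attention over a set of instances, the sketch below applies single-head scaled dot-product self-attention to per-nodule embedding vectors. It is not the NoduleSAT model: it omits the 3D CNN and all learned projections (queries, keys, and values are the raw embeddings), and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def self_attention(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Single-head scaled dot-product self-attention over a set of vectors,
    with identity projections (an assumption made to keep the sketch short).
    Each output row is a softmax-weighted mix of all input rows."""
    d = len(embeddings[0])
    out = []
    for query in embeddings:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
            for key in embeddings
        ]
        top = max(scores)                          # subtract max for stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]
        out.append(
            [sum(w * v[j] for w, v in zip(attn, embeddings)) for j in range(d)]
        )
    return out


def patient_embedding(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Bag-level summary: average the attended nodule embeddings."""
    attended = self_attention(embeddings)
    n, d = len(attended), len(attended[0])
    return [sum(row[j] for row in attended) / n for j in range(d)]


# Three hypothetical nodule embeddings from one patient.
nodules = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(self_attention(nodules))
print(patient_embedding(nodules))
```

Unlike plain max or mean pooling, each nodule's output here depends on every other nodule in the same patient, while the result is still independent of the order in which the nodules are listed.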
Graph-Skeleton: ~1% Nodes are Sufficient to Represent Billion-Scale Graph
Due to the ubiquity of graph data on the web, web graph mining has become a
research hotspot. Nonetheless, the prevalence of large-scale web graphs in
real applications poses significant challenges to storage, computational
capacity and graph model design. Despite numerous studies to enhance the
scalability of graph models, a noticeable gap remains between academic research
and practical web graph mining applications. One major cause is that in most
industrial scenarios, only a small portion of the nodes in a web graph actually
need to be analyzed; we term these nodes target nodes and the
others background nodes. In this paper, we argue that properly fetching and
condensing the background nodes from massive web graph data might be a more
economical shortcut to tackle the obstacles fundamentally. To this end, we make
the first attempt to study the problem of compressing massive background nodes
for target node classification. Through extensive experiments, we reveal two
critical roles played by the background nodes in target node classification:
enhancing structural connectivity between target nodes, and feature correlation
with target nodes. Following this, we propose a novel Graph-Skeleton model,
which properly fetches the background nodes, and further condenses the semantic
and topological information of background nodes within similar
target-background local structures. Extensive experiments on various web graph
datasets demonstrate the effectiveness and efficiency of the proposed method.
In particular, for the MAG240M dataset with 0.24 billion nodes, our generated
skeleton graph achieves highly comparable performance while containing only
1.8% of the nodes of the original graph.
Comment: 21 pages, 11 figures, In Proceedings of the ACM Web Conference 2024
(WWW'24)