67 research outputs found
Probing Spurious Correlations in Popular Event-Based Rumor Detection Benchmarks
As social media becomes a hotbed for the spread of misinformation, the
crucial task of rumor detection has witnessed promising advances fostered by
open-source benchmark datasets. Despite being widely used, we find that these
datasets suffer from spurious correlations, which are ignored by existing
studies and lead to severe overestimation of existing rumor detection
performance. The spurious correlations stem from three causes: (1) event-based
data collection and labeling schemes assign the same veracity label to multiple
highly similar posts from the same underlying event; (2) merging multiple data
sources spuriously relates source identities to veracity labels; and (3)
labeling bias. In this paper, we closely investigate three of the most popular
rumor detection benchmark datasets (i.e., Twitter15, Twitter16 and PHEME), and
propose event-separated rumor detection as a solution to eliminate spurious
cues. Under the event-separated setting, we observe that the accuracy of
existing state-of-the-art models drops significantly by over 40%, becoming only
comparable to a simple neural classifier. To better address this task, we
propose Publisher Style Aggregation (PSA), a generalizable approach that
aggregates publisher posting records to learn writing style and veracity
stance. Extensive experiments demonstrate that our method outperforms existing
baselines in terms of effectiveness, efficiency and generalizability.
Comment: Accepted to ECML-PKDD 202
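The event-separated setting proposed above can be illustrated with a minimal split routine (function and variable names are illustrative, not from the paper's code): posts sharing an event id never appear on both sides of the split, so a model cannot exploit near-duplicate posts from the same event as spurious cues.

```python
def event_separated_split(posts, test_events):
    """Split posts so that no event contributes to both train and test.

    posts: list of (post_text, event_id, label) triples.
    test_events: set of event ids reserved entirely for the test split.
    """
    train, test = [], []
    for post, event, label in posts:
        (test if event in test_events else train).append((post, label))
    return train, test

# Toy example: two near-duplicate posts from event ev1, one post from ev2.
posts = [
    ("claim A v1", "ev1", 1),
    ("claim A v2", "ev1", 1),  # near-duplicate of the same underlying event
    ("claim B", "ev2", 0),
]
train, test = event_separated_split(posts, {"ev2"})
```

Under a random post-level split, the two ev1 near-duplicates could land on opposite sides and inflate accuracy; the event-level split rules that out by construction.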
HiPA: Enabling One-Step Text-to-Image Diffusion Models via High-Frequency-Promoting Adaptation
Diffusion models have revolutionized text-to-image generation, but their
real-world applications are hampered by the extensive time needed for hundreds
of diffusion steps. Although progressive distillation has been proposed to
speed up diffusion sampling to 2-8 steps, it still falls short in one-step
generation, and necessitates training multiple student models, which is highly
parameter-extensive and time-consuming. To overcome these limitations, we
introduce High-frequency-Promoting Adaptation (HiPA), a parameter-efficient
approach to enable one-step text-to-image diffusion. Grounded in the insight
that high-frequency information is essential but highly lacking in one-step
diffusion, HiPA focuses on training one-step, low-rank adaptors to specifically
enhance the under-represented high-frequency abilities of advanced diffusion
models. The learned adaptors empower these diffusion models to generate
high-quality images in just a single step. Compared with progressive
distillation, HiPA achieves much better performance in one-step text-to-image
generation (37.3 → 23.8 in FID-5k on MS-COCO 2017) and a 28.6x
training speed-up (108.8 → 3.8 A100 GPU days), requiring only 0.04% of the
training parameters (7,740 million → 3.3 million). We also
demonstrate HiPA's effectiveness in text-guided image editing, inpainting and
super-resolution tasks, where our adapted models consistently deliver
high-quality outputs in just one diffusion step. The source code will be
released.
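The low-rank adaptors HiPA trains are in the spirit of LoRA-style updates to frozen weights; a minimal stdlib sketch (shapes, names, and the zero-initialization are illustrative assumptions, not the paper's implementation):

```python
import random

def make_low_rank_adaptor(d_in, d_out, rank, seed=0):
    """Rank-r trainable update stored as factors A (d_in x r) and B (r x d_out)."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 0.01) for _ in range(rank)] for _ in range(d_in)]
    B = [[0.0] * d_out for _ in range(rank)]  # zero-init: update starts as a no-op
    return A, B

def adapted_forward(x, W, A, B):
    """Compute y = x @ (W + A @ B): frozen base weight plus low-rank update."""
    d_in, d_out, rank = len(W), len(W[0]), len(B)
    y = [sum(x[i] * W[i][j] for i in range(d_in)) for j in range(d_out)]
    for j in range(d_out):
        for r in range(rank):
            y[j] += sum(x[i] * A[i][r] for i in range(d_in)) * B[r][j]
    return y

# With B zero-initialised, the adapted model reproduces the frozen base exactly.
W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 identity stands in for a base weight
A, B = make_low_rank_adaptor(2, 2, rank=1)
y = adapted_forward([3.0, 4.0], W, A, B)
```

The trainable parameter count is rank·(d_in + d_out) instead of d_in·d_out, which is the general mechanism behind parameter-efficient adaptation of this kind.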
Graph Neural Network-Based Anomaly Detection in Multivariate Time Series
Given high-dimensional time series data (e.g., sensor data), how can we
detect anomalous events, such as system faults and attacks? More challengingly,
how can we do this in a way that captures complex inter-sensor relationships,
and detects and explains anomalies which deviate from these relationships?
Recently, deep learning approaches have enabled improvements in anomaly
detection in high-dimensional datasets; however, existing methods do not
explicitly learn the structure of existing relationships between variables, or
use them to predict the expected behavior of time series. Our approach combines
a structure learning approach with graph neural networks, additionally using
attention weights to provide explainability for the detected anomalies.
Experiments on two real-world sensor datasets with ground truth anomalies show
that our method detects anomalies more accurately than baseline approaches,
accurately captures correlations between sensors, and allows users to deduce
the root cause of a detected anomaly.
Comment: Accepted at AAAI Conference on Artificial Intelligence (AAAI), 202
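Forecasting-based detectors of this kind typically flag a timestep when some sensor's prediction error falls far outside its usual range; a simplified scoring sketch (the robust median/IQR normalisation is a common choice for such methods, not necessarily the paper's exact formulation):

```python
def anomaly_score(observed_t, predicted_t, medians, iqrs, eps=1e-6):
    """Anomaly score for one timestep: the maximum over sensors of the
    forecast error, normalised by each sensor's typical error statistics
    (median and inter-quartile range estimated on validation data)."""
    devs = [
        (abs(obs - pred) - med) / (iqr + eps)
        for obs, pred, med, iqr in zip(observed_t, predicted_t, medians, iqrs)
    ]
    return max(devs)

# Sensor 1 behaves as predicted; sensor 2 deviates strongly and dominates.
score = anomaly_score(observed_t=[1.0, 5.0], predicted_t=[1.0, 1.0],
                      medians=[0.0, 0.0], iqrs=[1.0, 2.0])
```

Taking the max over sensors also points at the sensor most responsible for the alarm, which is the starting point for the root-cause deduction the abstract mentions.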
Prompt-Based Zero- and Few-Shot Node Classification: A Multimodal Approach
Multimodal data empowers machine learning models to better understand the
world from various perspectives. In this work, we study the combination of
text and graph modalities, a challenging but understudied combination
which is prevalent across multiple settings including citation networks, social
media, and the web. We focus on the popular task of node classification using
limited labels; in particular, under the zero- and few-shot scenarios. In
contrast to the standard pipeline, which feeds precomputed (e.g.,
bag-of-words) text features into a graph neural network, we propose
Text-And-Graph (TAG) learning, a more deeply
multimodal approach that integrates the raw texts and graph topology into the
model design, and can effectively learn from limited supervised signals without
any meta-learning procedure. TAG is a two-stage model with (1) a prompt- and
graph-based module which generates prior logits that can be directly used for
zero-shot node classification, and (2) a trainable module that further
calibrates these prior logits in a few-shot manner. Experiments on two node
classification datasets show that TAG outperforms all the baselines by a large
margin in both zero- and few-shot settings.
Comment: Work in progress
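The two-stage idea can be sketched as prompt-derived prior logits from text similarity, followed by graph smoothing over neighbors (all names, the cosine similarity, and the propagation rule are illustrative assumptions, not TAG's actual modules):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(b * b for b in v)) or 1.0
    return dot / (nu * nv)

def prior_logits(node_emb, prompt_embs):
    """Zero-shot prior logits: similarity of a node's text embedding to one
    prompt embedding per class (e.g. an embedding of 'a paper about <class>')."""
    return [cosine(node_emb, p) for p in prompt_embs]

def propagate(logits, adj, alpha=0.5):
    """One smoothing step: mix each node's logits with its neighbors' mean."""
    out = []
    for i, l in enumerate(logits):
        nbrs = adj[i]
        mean = ([sum(logits[j][k] for j in nbrs) / len(nbrs) for k in range(len(l))]
                if nbrs else l)
        out.append([(1 - alpha) * a + alpha * b for a, b in zip(l, mean)])
    return out

# Two nodes, two classes, connected to each other.
logits = prior_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
smoothed = propagate([[1.0, 0.0], [0.0, 1.0]], adj=[[1], [0]])
```

The prior logits alone already give a zero-shot prediction; a small trainable calibration module on top of them is what the few-shot stage would then learn.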
GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks
Label errors have been found to be prevalent in popular text, vision, and
audio datasets, which heavily influence the safe development and evaluation of
machine learning algorithms. Despite increasing efforts towards improving the
quality of generic data types, such as images and texts, the problem of
mislabel detection in graph data remains underexplored. To bridge the gap, we
explore mislabelling issues in popular real-world graph datasets and propose
GraphCleaner, a post-hoc method to detect and correct these mislabelled nodes
in graph datasets. GraphCleaner combines the novel ideas of 1) Synthetic
Mislabel Dataset Generation, which seeks to generate realistic mislabels; and
2) Neighborhood-Aware Mislabel Detection, where neighborhood dependency is
exploited in both labels and base classifier predictions. Empirical evaluations
on 6 datasets and 6 experimental settings demonstrate that GraphCleaner
outperforms the closest baseline, with an average improvement of 0.14 in F1
score, and 0.16 in MCC. On real-data case studies, GraphCleaner detects real
and previously unknown mislabels in popular graph benchmarks: PubMed, Cora,
CiteSeer and OGB-arxiv; we find that at least 6.91% of PubMed data is
mislabelled or ambiguous, and simply removing these mislabelled data can boost
evaluation performance from 86.71% to 89.11%.
Comment: ICML 202
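A stripped-down version of neighborhood-aware detection: score a node by how much its neighborhood's labels and the base classifier's predictions disagree with its given label (a heuristic sketch, not GraphCleaner's trained detector):

```python
def mislabel_score(node, labels, preds, adj):
    """Fraction of local evidence disagreeing with a node's given label.

    Evidence pooled here: neighbors' given labels, the base classifier's
    predictions on those neighbors, and its prediction on the node itself.
    """
    y = labels[node]
    evidence = ([labels[j] for j in adj[node]]
                + [preds[j] for j in adj[node]]
                + [preds[node]])
    if not evidence:
        return 0.0
    return sum(1 for e in evidence if e != y) / len(evidence)

# Node 0 is labelled 1, but every neighbor label and prediction says 0.
labels = {0: 1, 1: 0, 2: 0}
preds = {0: 0, 1: 0, 2: 0}
adj = {0: [1, 2], 1: [0], 2: [0]}
s0 = mislabel_score(0, labels, preds, adj)
s1 = mislabel_score(1, labels, preds, adj)
```

A high score flags a candidate mislabel; ranking nodes by such a score is one way to surface cases like the PubMed mislabels the abstract reports.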
TAP: A Comprehensive Data Repository for Traffic Accident Prediction in Road Networks
Road safety is a major global public health concern. Effective traffic crash
prediction can play a critical role in reducing road traffic accidents.
However, existing machine learning approaches tend to focus on predicting
traffic accidents in isolation, without considering the potential relationships
between different accident locations within road networks. To incorporate graph
structure information, graph-based approaches such as Graph Neural Networks
(GNNs) can be naturally applied. However, applying GNNs to the accident
prediction problem faces challenges due to the lack of suitable
graph-structured traffic accident datasets. To bridge this gap, we have
constructed a real-world graph-based Traffic Accident Prediction (TAP) data
repository, along with two representative tasks: accident occurrence prediction
and accident severity prediction. With nationwide coverage, real-world network
topology, and rich geospatial features, this data repository can be used for a
variety of traffic-related tasks. We further comprehensively evaluate eleven
state-of-the-art GNN variants and two non-graph-based machine learning methods
using the created datasets. Significantly facilitated by the proposed data, we
develop a novel Traffic Accident Vulnerability Estimation via Linkage (TRAVEL)
model, which is designed to capture angular and directional information from
road networks. We demonstrate that the proposed model consistently outperforms
the baselines. The data and code are available on GitHub
(https://github.com/baixianghuang/travel).
Comment: 10 pages, 5 figures
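The directional information such a model can exploit is derivable from road-segment endpoints with the standard compass-bearing formula; a small sketch of that feature construction (an assumed preprocessing step, not code from the TAP repository):

```python
import math

def bearing(p, q):
    """Compass bearing in degrees (0 = north, 90 = east) from point p to
    point q, each given as (latitude, longitude) in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlon = lon2 - lon1
    x = math.sin(dlon) * math.cos(lat2)
    y = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return (math.degrees(math.atan2(x, y)) + 360.0) % 360.0

north = bearing((0.0, 0.0), (1.0, 0.0))   # segment pointing due north
east = bearing((0.0, 0.0), (0.0, 1.0))    # segment pointing due east
```

Angles between consecutive segments (sharp turns, junction geometry) then become edge features that a GNN over the road network can consume.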
Efficient Heterogeneous Graph Learning via Random Projection
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep
learning on heterogeneous graphs. Typical HGNNs require repetitive message
passing during training, limiting efficiency for large-scale real-world graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a
heterogeneous graph into regular-shaped tensors, enabling efficient mini-batch
training. Existing pre-computation-based HGNNs fall into two main styles,
which differ in how they trade information loss against efficiency. We
propose a hybrid pre-computation-based HGNN, named Random
Projection Heterogeneous Graph Neural Network (RpHGNN), which combines the
benefits of one style's efficiency with the low information loss of the other
style. To achieve efficiency, the main framework of RpHGNN consists of
propagate-then-update iterations, where we introduce a Random Projection
Squashing step to ensure that complexity increases only linearly. To achieve
low information loss, we introduce a Relation-wise Neighbor Collection
component with an Even-odd Propagation Scheme, which aims to collect
information from neighbors in a finer-grained way. Experimental results
indicate that our approach achieves state-of-the-art results on seven small and
large benchmark datasets while also being 230% faster compared to the most
effective baseline. Surprisingly, our approach not only surpasses
pre-processing-based baselines but also outperforms end-to-end methods.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
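The squashing idea can be sketched as follows, assuming the goal is simply to keep the feature width constant across propagate-then-update rounds (a hypothetical minimal version using a fixed Gaussian projection, not RpHGNN's implementation):

```python
import math
import random

def random_projection(X, d_out, seed=0):
    """Project an n x d_in feature matrix down to n x d_out with a fixed
    random Gaussian matrix. Because d_out stays constant, stacking more
    propagation rounds grows cost only linearly rather than multiplicatively."""
    rng = random.Random(seed)
    d_in = len(X[0])
    scale = 1.0 / math.sqrt(d_out)
    R = [[rng.gauss(0.0, scale) for _ in range(d_out)] for _ in range(d_in)]
    return [[sum(row[i] * R[i][j] for i in range(d_in)) for j in range(d_out)]
            for row in X]

# Concatenated neighbor features of width 6 squashed back to width 2.
X = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
     [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]]
Z = random_projection(X, d_out=2)
```

Random projections approximately preserve pairwise distances (the Johnson-Lindenstrauss property), which is why squashing can be cheap without discarding most of the collected neighbor information.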
Great Models Think Alike: Improving Model Reliability via Inter-Model Latent Agreement
Reliable application of machine learning is of primary importance to the
practical deployment of deep learning methods. A fundamental challenge is that
models are often unreliable due to overconfidence. In this paper, we estimate a
model's reliability by measuring the agreement between its latent space
and the latent space of a foundation model. However, it is challenging to
measure the agreement between two different latent spaces due to their
incoherence, e.g., arbitrary rotations and different dimensionality. To overcome
this incoherence issue, we design a neighborhood agreement measure
between latent spaces and find that this agreement is surprisingly
well-correlated with the reliability of a model's predictions. Further, we show
that fusing neighborhood agreement into a model's predictive confidence in a
post-hoc way significantly improves its reliability. Theoretical analysis and
extensive experiments on failure detection across various datasets verify the
effectiveness of our method on both in-distribution and out-of-distribution
settings.
Comment: ICML 202
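A neighborhood agreement measure of this kind can be sketched as k-NN set overlap computed independently in each latent space, which sidesteps rotations and dimensionality mismatch because only within-space distances are compared (an illustrative construction; the paper's exact measure may differ):

```python
def knn(points, i, k):
    """Indices of the k nearest points to points[i] (squared Euclidean)."""
    dists = [(sum((a - b) ** 2 for a, b in zip(points[i], p)), j)
             for j, p in enumerate(points) if j != i]
    return {j for _, j in sorted(dists)[:k]}

def neighborhood_agreement(space_a, space_b, k):
    """Mean Jaccard overlap of k-NN sets computed separately in each space.

    Each space only contributes its own neighbor rankings, so the measure is
    unaffected by rotations of either space or by differing dimensionality.
    """
    n = len(space_a)
    total = 0.0
    for i in range(n):
        na, nb = knn(space_a, i, k), knn(space_b, i, k)
        total += len(na & nb) / len(na | nb)
    return total / n

# A space compared with itself agrees perfectly.
a = [[0.0], [1.0], [2.0], [10.0]]
perfect = neighborhood_agreement(a, a, k=2)
```

Low agreement between a model's latent space and a foundation model's would then flag inputs where the model's confidence deserves extra scrutiny.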