126 research outputs found
Explainable Spatio-Temporal Graph Neural Networks
Spatio-temporal graph neural networks (STGNNs) have gained popularity as a
powerful tool for effectively modeling spatio-temporal dependencies in diverse
real-world urban applications, including intelligent transportation and public
safety. However, the black-box nature of STGNNs limits their interpretability,
hindering their application in scenarios related to urban resource allocation
and policy formulation. To bridge this gap, we propose an Explainable
Spatio-Temporal Graph Neural Networks (STExplainer) framework that enhances
STGNNs with inherent explainability, enabling them to provide accurate
predictions and faithful explanations simultaneously. Our framework integrates
a unified spatio-temporal graph attention network with a positional information
fusion layer as the STG encoder and decoder, respectively. Furthermore, we
propose a structure distillation approach based on the Graph Information
Bottleneck (GIB) principle with an explainable objective, which is instantiated
by the STG encoder and decoder. Through extensive experiments, we demonstrate
that our STExplainer outperforms state-of-the-art baselines in terms of
predictive accuracy and explainability metrics (i.e., sparsity and fidelity) on
traffic and crime prediction tasks. Furthermore, our model exhibits superior
representation ability in alleviating missing-data and data-sparsity issues. The
implementation code is available at: https://github.com/HKUDS/STExplainer.
Comment: 32nd ACM International Conference on Information and Knowledge
Management (CIKM' 23)
Spatio-Temporal Meta Contrastive Learning
Spatio-temporal prediction is crucial in numerous real-world applications,
including traffic forecasting and crime prediction, which aim to improve public
transportation and safety management. Many state-of-the-art models demonstrate
the strong capability of spatio-temporal graph neural networks (STGNN) to
capture complex spatio-temporal correlations. However, despite their
effectiveness, existing approaches do not adequately address several key
challenges. Data quality issues, such as data scarcity and sparsity, lead to
data noise and a lack of supervised signals, which significantly limit the
performance of STGNN. Although recent STGNN models with contrastive learning
aim to address these challenges, most of them use pre-defined augmentation
strategies that heavily depend on manual design and cannot be customized for
different Spatio-Temporal Graph (STG) scenarios. To tackle these challenges, we
propose a new spatio-temporal contrastive learning (CL4ST) framework to encode
robust and generalizable STG representations via the STG augmentation paradigm.
Specifically, we design the meta view generator to automatically construct node
and edge augmentation views for each disentangled spatial and temporal graph in
a data-driven manner. The meta view generator employs meta networks with
a parameterized generative model to customize the augmentations for each input.
This personalizes the augmentation strategies for every STG and endows the
learning framework with spatio-temporal-aware information. Additionally, we
integrate a unified spatio-temporal graph attention network with the proposed
meta view generator and two-branch graph contrastive learning paradigms.
Extensive experiments demonstrate that our CL4ST significantly improves
performance over various state-of-the-art baselines in traffic and crime
prediction.
Comment: 32nd ACM International Conference on Information and Knowledge
Management (CIKM' 23)
PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning
Multimedia online platforms (e.g., Amazon, TikTok) have greatly benefited
from the incorporation of multimedia (e.g., visual, textual, and acoustic)
content into their personal recommender systems. These modalities provide
intuitive semantics that facilitate modality-aware user preference modeling.
However, two key challenges in multi-modal recommenders remain unresolved: i)
The introduction of multi-modal encoders with a large number of additional
parameters causes overfitting, given high-dimensional multi-modal features
provided by extractors (e.g., ViT, BERT). ii) Side information inevitably
introduces inaccuracies and redundancies, which skew the modality-interaction
dependency from reflecting true user preference. To tackle these problems, we
propose to simplify and empower recommenders through Multi-modal Knowledge
Distillation (PromptMM) with the prompt-tuning that enables adaptive quality
distillation. Specifically, PromptMM conducts model compression through
distilling u-i edge relationship and multi-modal node content from cumbersome
teachers to relieve students from the additional feature reduction parameters.
To bridge the semantic gap between multi-modal context and collaborative
signals for empowering the overfitting teacher, soft prompt-tuning is
introduced to make the teacher adaptive to the student task. Additionally, to adjust the impact
of inaccuracies in multimedia data, a disentangled multi-modal list-wise
distillation is developed with a modality-aware re-weighting mechanism.
Experiments on real-world data demonstrate PromptMM's superiority over existing
techniques. Ablation tests confirm the effectiveness of key components.
Additional tests show the efficiency and effectiveness.
Comment: WWW 202
Void Fraction Measurement of Gas-Liquid Two-Phase Flow from Differential Pressure
Void fraction is an important process variable for the volume and mass computation required for transportation of a gas–liquid mixture in pipelines, storage in tanks, metering and custody transfer. Inaccurate measurement would introduce errors in product measurement, with the potential for loss of revenue. Accurate measurement is often constrained by invasive and expensive online measurement techniques. This work focuses on the use of cost-effective and non-invasive pressure sensors to calculate the gas void fraction of gas–liquid flow. The differential pressure readings from vertical upward bubbly and slug air–water flow are substituted into classical mathematical models based on energy conservation to derive the void fraction. Electrical Resistance Tomography (ERT) and a Wire-mesh Sensor (WMS) are used as benchmarks to validate the void fraction obtained from the differential pressure. The model produces reasonable agreement with ERT and WMS on the void fraction measurement. The effect of the friction loss on the mathematical models is also investigated and discussed. It is concluded that the friction loss cannot be neglected, particularly when the gas void fraction is less than 0.2.
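The idea of deriving void fraction from a differential pressure reading can be illustrated with a minimal sketch. It assumes the textbook steady-state balance for vertical upward flow, in which the pressure drop between two taps is the hydrostatic head of the mixture plus a frictional term, and acceleration effects are ignored. This is not necessarily the exact model used in the paper. The function name, default densities (water and air at roughly ambient conditions), and the way the friction term is supplied are all illustrative assumptions.

```python
G = 9.81  # gravitational acceleration, m/s^2


def void_fraction(dp_total, height, rho_liquid=998.0, rho_gas=1.2, dp_friction=0.0):
    """Estimate gas void fraction from the pressure drop between two taps.

    dp_total    -- pressure drop from the lower to the upper tap, Pa (assumed
                   to be the true drop across the flowing mixture)
    height      -- vertical distance between the taps, m
    dp_friction -- estimated frictional pressure drop over the same length, Pa
                   (hypothetical input; 0.0 means friction is neglected)
    """
    if height <= 0:
        raise ValueError("height must be positive")
    if rho_liquid <= rho_gas:
        raise ValueError("liquid density must exceed gas density")
    # Hydrostatic part of the drop gives the mixture density.
    rho_mix = (dp_total - dp_friction) / (G * height)
    # rho_mix = alpha * rho_gas + (1 - alpha) * rho_liquid, solved for alpha.
    return (rho_liquid - rho_mix) / (rho_liquid - rho_gas)


# Synthetic example: a mixture with void fraction 0.2 plus 100 Pa of friction.
true_alpha = 0.2
rho_mix_true = true_alpha * 1.2 + (1 - true_alpha) * 998.0
dp_measured = rho_mix_true * G * 1.0 + 100.0

alpha_with_friction = void_fraction(dp_measured, 1.0, dp_friction=100.0)
alpha_without_friction = void_fraction(dp_measured, 1.0)
```

In this synthetic case, neglecting the friction term makes the mixture look denser than it is, so the estimated void fraction comes out too low. That is one way to see why the friction loss matters most when the void fraction itself is small.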
LLMRec: Large Language Models with Graph Augmentation for Recommendation
The problem of data sparsity has long been a challenge in recommendation
systems, and previous studies have attempted to address this issue by
incorporating side information. However, this approach often introduces side
effects such as noise, availability issues, and low data quality, which in turn
hinder the accurate modeling of user preferences and adversely impact
recommendation performance. In light of the recent advancements in large
language models (LLMs), which possess extensive knowledge bases and strong
reasoning capabilities, we propose a novel framework called LLMRec that
enhances recommender systems by employing three simple yet effective LLM-based
graph augmentation strategies. Our approach leverages the rich content
available within online platforms (e.g., Netflix, MovieLens) to augment the
interaction graph in three ways: (i) reinforcing user-item interaction edges,
(ii) enhancing the understanding of item node attributes, and (iii) conducting
user node profiling, intuitively from the natural language perspective. By
employing these strategies, we address the challenges posed by sparse implicit
feedback and low-quality side information in recommenders. Besides, to ensure
the quality of the augmentation, we develop a denoised data robustification
mechanism that includes techniques of noisy implicit feedback pruning and
MAE-based feature enhancement that help refine the augmented data and improve
its reliability. Furthermore, we provide theoretical analysis to support the
effectiveness of LLMRec and clarify the benefits of our method in facilitating
model optimization. Experimental results on benchmark datasets demonstrate the
superiority of our LLM-based augmentation approach over state-of-the-art
techniques. To ensure reproducibility, we have made our code and augmented data
publicly available at: https://github.com/HKUDS/LLMRec.git
Comment: WSDM 2024 Oral Presentation
Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation
Transducer is one of the mainstream frameworks for streaming speech
recognition. There is a performance gap between the streaming and non-streaming
transducer models due to limited context. To reduce this gap, an effective way
is to ensure that their hidden and output distributions are consistent, which
can be achieved by hierarchical knowledge distillation. However, it is
difficult to ensure the distribution consistency simultaneously because the
learning of the output distribution depends on the hidden one. In this paper,
we propose an adaptive two-stage knowledge distillation method consisting of
hidden layer learning and output layer learning. In the former stage, we learn
hidden representations with full context by applying a mean squared error loss
function. In the latter stage, we design a power-transformation-based adaptive
smoothness method to learn a stable output distribution. It achieved a 19%
relative reduction in word error rate, and a faster response for the first
token, compared with the original streaming model on the LibriSpeech corpus.
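The two ingredients named in this abstract can be sketched in a few lines. The first is a mean squared error between student and teacher hidden vectors. The second is a power transformation that flattens a probability distribution before it is used as a distillation target. The abstract does not say how the exponent is chosen adaptively, so the fixed `gamma` parameter below is a hypothetical stand-in, and the function names are illustrative only.

```python
def mse(student_hidden, teacher_hidden):
    """Mean squared error between two equal-length hidden vectors."""
    if len(student_hidden) != len(teacher_hidden):
        raise ValueError("vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)]
    return sum(squared) / len(squared)


def power_smooth(probs, gamma):
    """Raise each probability to the power gamma and renormalize.

    With 0 < gamma < 1 the distribution becomes flatter (smoother);
    gamma = 1 leaves it unchanged. gamma is an assumed, fixed parameter.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    powered = [p ** gamma for p in probs]
    total = sum(powered)
    return [x / total for x in powered]


# A peaked teacher output distribution, smoothed with gamma = 0.5.
teacher_probs = [0.8, 0.2]
smoothed = power_smooth(teacher_probs, 0.5)

# Stage-one style hidden-layer loss on toy vectors.
hidden_loss = mse([1.0, 2.0], [1.0, 4.0])
```

With `gamma = 0.5` the 0.8/0.2 distribution moves to roughly two thirds and one third, which gives the student a less extreme target than the raw teacher output.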
Identifying degradation patterns of lithium ion batteries from impedance spectroscopy using machine learning
Abstract: Forecasting the state of health and remaining useful life of Li-ion batteries is an unsolved challenge that limits technologies such as consumer electronics and electric vehicles. Here, we build an accurate battery forecasting system by combining electrochemical impedance spectroscopy (EIS)—a real-time, non-invasive and information-rich measurement that is hitherto underused in battery diagnosis—with Gaussian process machine learning. Over 20,000 EIS spectra of commercial Li-ion batteries are collected at different states of health, states of charge and temperatures—the largest dataset to our knowledge of its kind. Our Gaussian process model takes the entire spectrum as input, without further feature engineering, and automatically determines which spectral features predict degradation. Our model accurately predicts the remaining useful life, even without complete knowledge of past operating conditions of the battery. Our results demonstrate the value of EIS signals in battery management systems
Tactile angle discriminability improvement: contributions of working memory training and continuous attended sensory input
Perceptual learning is commonly assumed to enhance perception through continuous attended sensory input. However, learning is generalizable to performance in untrained stimuli and tasks. Although previous studies have observed a possible generalization effect across tasks as a result of working memory (WM) training, comparisons of the contributions of WM training and continuous attended sensory input to perceptual learning generalization are still rare. Therefore, we compared which factors contributed most to perceptual generalization and investigated which skills acquired during WM training led to tactile generalization across tasks. Here, a Braille-like dot pattern matching n-back WM task was used as the WM training task, with four workload levels (0, 1, 2, and 3-back levels). A tactile angle discrimination (TAD) task was used as a pre- and posttest to assess improvements in tactile perception. Between tests, four subject groups were randomly assigned to four different workload n-back tasks to consecutively complete three sessions of training. The results showed that tactile n-back WM training could enhance TAD performance, with the 3-back training group having the highest TAD threshold improvement rate. Furthermore, the rate of WM capacity improvement on the 3-back level across training sessions was correlated with the rate of TAD threshold improvement. These findings suggest that continuous attended sensory input and enhanced WM capacity can lead to improvements in TAD ability, and that greater improvements in WM capacity can predict greater improvements in TAD performance.
NEW & NOTEWORTHY Perceptual learning is not always specific to the trained task and stimuli. We demonstrate that both continuous attended sensory input and improved WM capacity can be used to enhance tactile angle discrimination (TAD) ability. Moreover, WM capacity improvement is important in generalizing the training effect to the TAD ability. These findings contribute to understanding the mechanism of perceptual learning generalization across tasks
Association of Intraoperative Hypotension with Acute Kidney Injury after Noncardiac Surgery in Patients Younger than 60 Years Old
Background/Aims: Intraoperative hypotension (IOH) may be associated with surgery-related acute kidney injury (AKI). However, the duration of hypotension that triggers AKI is poorly understood. The incidence of AKI with various durations of IOH and mean arterial pressures (MAPs) was investigated. Materials: A retrospective cohort study of 4,952 patients undergoing noncardiac surgery (2011 to 2016) with MAP monitoring and a length of stay of one or more days was performed. The exclusion criteria were a preoperative estimated glomerular filtration rate (eGFR) ≤60 mL/min/1.73 m², a preoperative MAP less than 65 mm Hg, dialysis dependence, urologic surgery, age older than 60 years, and a surgical duration of less than 60 min. The primary exposure was IOH, and the primary outcome was AKI (50% or 0.3 mg/dL increase in creatinine) during the first 7 postoperative days. Multivariable logistic regression was used to model the exposure-outcome relationship. Results: AKI occurred in 186 (3.76%) noncardiac surgery patients. The adjusted odds ratio for surgery-related AKI for a MAP of less than 55 mm Hg was 14.11 (95% confidence interval: 5.02–39.69) for an exposure of more than 20 min. Age was not an interaction factor between AKI and IOH. Conclusion: There was a considerably increased risk of postoperative AKI when intraoperative MAP was less than 55 mm Hg for more than 10 min. Strict blood pressure management is recommended even for patients younger than 60 years old.
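The outcome definition quoted in this abstract (a 50% or 0.3 mg/dL increase in creatinine) can be written as a small check. This is a sketch under stated assumptions: the abstract does not say whether the thresholds are inclusive, so "greater than or equal to" is assumed here, and the function name and the use of a single peak value within the 7-day window are illustrative choices rather than the study's procedure.

```python
def meets_aki_criterion(baseline_mg_dl, peak_mg_dl):
    """Return True if the creatinine rise meets the quoted AKI definition.

    baseline_mg_dl -- preoperative serum creatinine, mg/dL
    peak_mg_dl     -- highest postoperative value in the window, mg/dL
    Thresholds are treated as inclusive (an assumption).
    """
    if baseline_mg_dl <= 0:
        raise ValueError("baseline creatinine must be positive")
    absolute_rise = peak_mg_dl - baseline_mg_dl
    relative_rise = absolute_rise / baseline_mg_dl
    return absolute_rise >= 0.3 or relative_rise >= 0.5


# Toy values chosen away from the thresholds to avoid rounding ambiguity.
case_absolute = meets_aki_criterion(1.0, 1.35)   # rise of 0.35 mg/dL
case_relative = meets_aki_criterion(0.4, 0.65)   # rise of about 62%
case_neither = meets_aki_criterion(2.0, 2.2)     # rise of 0.2 mg/dL, 10%
```

The relative criterion matters for patients with a low baseline, where a clinically meaningful rise can be smaller than 0.3 mg/dL in absolute terms.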