TopoSZ: Preserving Topology in Error-Bounded Lossy Compression
Existing error-bounded lossy compression techniques control the pointwise
error during compression to guarantee the integrity of the decompressed data.
However, they typically do not explicitly preserve the topological features in
the data. When performing post hoc analysis of decompressed data using
topological methods, it is desirable to preserve topology during compression
so as to obtain topologically consistent and correct scientific insights. In
this paper, we introduce TopoSZ, an error-bounded lossy compression method
that preserves the topological features in 2D and 3D scalar fields.
Specifically, we aim to preserve the types and locations of local extrema as
well as the level set relations among critical points captured by contour
trees in the decompressed data. The main idea is to derive topological
constraints from the contour-tree-induced segmentation of the data domain,
and to incorporate such constraints into a customized error-controlled
quantization strategy based on the classic SZ compressor. Our method allows
users to control both the pointwise error and the loss of topological features
during compression via a global error bound and a persistence threshold.
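The error-controlled quantization at the heart of SZ-style compressors can be sketched as follows. This is a minimal illustration of linear-scaling quantization under a global error bound, not the TopoSZ implementation; the predictor (previous reconstructed value) and all names are illustrative assumptions:

```python
import numpy as np

def quantize(value, predicted, error_bound):
    """Map a prediction residual to an integer quantization code.

    Each code corresponds to a multiple of 2 * error_bound, so the
    reconstruction error never exceeds error_bound (SZ-style
    linear-scaling quantization).
    """
    return int(np.round((value - predicted) / (2.0 * error_bound)))

def dequantize(predicted, code, error_bound):
    """Reconstruct a value from its quantization code."""
    return predicted + code * (2.0 * error_bound)

# Toy example: predict each value with the previous reconstructed value.
data = np.array([0.0, 0.4, 1.1, 0.9, 2.3])
eb = 0.1
recon = np.empty_like(data)
prev = 0.0
for i, x in enumerate(data):
    code = quantize(x, prev, eb)
    recon[i] = dequantize(prev, code, eb)
    prev = recon[i]

# Every reconstructed value stays within the global error bound.
assert np.all(np.abs(recon - data) <= eb + 1e-12)
```

TopoSZ additionally tightens such quantization locally wherever the derived topological constraints would otherwise be violated.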
Explainable Recommender with Geometric Information Bottleneck
Explainable recommender systems can explain their recommendation decisions, enhancing user trust in the systems. Most explainable recommender systems either rely on human-annotated rationales to train models for explanation generation or leverage the attention mechanism to extract important text spans from reviews as explanations. The extracted rationales are often confined to an individual review and may fail to identify the implicit features beyond the review text. To avoid the expensive human annotation process and to generate explanations beyond individual reviews, we propose to incorporate a geometric prior learnt from user-item interactions into a variational network which infers latent factors from user-item reviews. The latent factors from an individual user-item pair can be used for both recommendation and explanation generation, which naturally inherit the global characteristics encoded in the prior knowledge. Experimental results on three e-commerce datasets show that our model significantly improves the interpretability of a variational recommender using the Wasserstein distance while achieving performance comparable to existing content-based recommender systems in terms of recommendation behaviours.
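As a small illustration of the Wasserstein distance used to compare variational distributions, the squared 2-Wasserstein distance between two diagonal Gaussians has a closed form. This is a generic sketch of that formula, not the paper's code; the function name and setup are assumptions:

```python
import numpy as np

def w2_sq_diag_gaussians(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between diagonal Gaussians
    N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)):

        W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2
    """
    return np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)

mu = np.array([0.5, -1.0])
sigma = np.array([1.0, 2.0])

# Identical distributions are at distance zero.
assert w2_sq_diag_gaussians(mu, sigma, mu, sigma) == 0.0
# Shifting one mean coordinate by 1 gives W2^2 = 1.
assert w2_sq_diag_gaussians(mu + np.array([1.0, 0.0]), sigma, mu, sigma) == 1.0
```

A closed-form distance like this makes it cheap to compare a learned posterior against a geometric prior inside a variational objective.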
TROPHY: A Topologically Robust Physics-Informed Tracking Framework for Tropical Cyclones
Tropical cyclones (TCs) are among the most destructive weather systems.
Detecting and tracking TCs realistically and efficiently is critical for
assessing their impacts and risks. Recently, a multilevel robustness framework
has been introduced to study the critical points of time-varying vector fields.
The framework quantifies the robustness of critical points across varying
neighborhoods. By relating multilevel robustness to critical point tracking,
the framework has demonstrated its potential in cyclone tracking. An advantage
is that it identifies cyclonic features using only 2D wind vector fields,
which is encouraging because most tracking algorithms require multiple dynamic
and thermodynamic variables at different altitudes. A disadvantage is that the
framework does not scale well computationally for datasets containing a large
number of cyclones. This paper introduces a topologically robust
physics-informed tracking framework (TROPHY) for TC tracking. The main idea is
to integrate physical knowledge of TCs to drastically improve the
computational efficiency of the multilevel robustness framework on large-scale
climate datasets. First, during preprocessing, we propose a physics-informed
feature selection strategy to filter out the 90% of critical points that are
short-lived and have low stability, thus preserving good candidates for TC
tracking. Second, during in-processing, we impose constraints during the
multilevel robustness computation to focus only on physics-informed
neighborhoods of TCs. We apply TROPHY to 30 years of 2D wind fields from ERA5
reanalysis data and generate a number of TC tracks. In comparison with the
observed tracks, we demonstrate that TROPHY can capture TC characteristics
that are comparable to, and sometimes even better than, those from a
well-validated TC tracking algorithm that requires multiple dynamic and
thermodynamic scalar fields.
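The physics-informed feature selection step can be sketched as a simple threshold filter over critical-point candidates. The field names and threshold values below are illustrative assumptions, not the criteria used by TROPHY:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    lifetime_steps: int   # consecutive time steps the critical point persists
    robustness: float     # stability score from the multilevel robustness framework

def filter_candidates(candidates, min_lifetime=8, min_robustness=0.5):
    """Keep only long-lived, stable critical points as TC track candidates,
    discarding the short-lived, low-stability majority up front."""
    return [c for c in candidates
            if c.lifetime_steps >= min_lifetime and c.robustness >= min_robustness]

pool = [Candidate(2, 0.1), Candidate(12, 0.9), Candidate(20, 0.2), Candidate(15, 0.7)]
kept = filter_candidates(pool)
assert len(kept) == 2  # only the long-lived AND stable candidates survive
```

Pruning the candidate pool before the expensive multilevel robustness computation is what makes the framework tractable on decades of reanalysis data.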
The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis
Understanding the in-context learning (ICL) capability that enables large
language models (LLMs) to perform tasks proficiently from demonstration
examples alone is of utmost importance. This importance stems not only from
better utilization of this capability across various tasks, but also from the
proactive identification and mitigation of potential risks that may arise
alongside it, including concerns regarding truthfulness, bias, and toxicity.
In this paper, we present a thorough survey on the interpretation and analysis
of in-context learning. First, we provide a concise introduction to the
background and definition of in-context learning. Then, we give an overview of
advancements from two perspectives: 1) a theoretical perspective, emphasizing
studies on mechanistic interpretability and delving into the mathematical
foundations behind ICL; and 2) an empirical perspective, concerning studies
that empirically analyze factors associated with ICL. We conclude by
highlighting the challenges encountered and suggesting potential avenues for
future research. We believe that our work establishes a basis for further
exploration into the interpretation of in-context learning. Additionally, we
have created a repository containing the resources referenced in our survey.
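A minimal sketch of the demonstration-example setup that the survey studies: labeled examples are concatenated into a prompt and the model completes the final, unlabeled query. The sentiment task and formatting here are illustrative assumptions:

```python
def build_icl_prompt(demonstrations, query):
    """Assemble a few-shot prompt: each demonstration is an (input, label)
    pair, followed by the unlabeled query the model must complete."""
    lines = []
    for text, label in demonstrations:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

demos = [("The plot was thrilling.", "positive"),
         ("The food was bland.", "negative")]
prompt = build_icl_prompt(demos, "Absolutely tasty!")

assert prompt.endswith("Sentiment:")   # the model fills in the final label
assert prompt.count("Review:") == 3    # two demonstrations plus one query
```

No parameters are updated: everything the model "learns" about the task is carried by this prompt, which is precisely what makes the mechanism worth interpreting.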
Counterfactual Generation with Identifiability Guarantees
Counterfactual generation lies at the core of various machine learning tasks,
including image translation and controllable text generation. This generation
process usually requires the identification of the disentangled latent
representations, such as content and style, that underlie the observed data.
However, it becomes more challenging when faced with a scarcity of paired data
and labeling information. Existing disentangled methods crucially rely on
oversimplified assumptions, such as assuming independent content and style
variables, to identify the latent variables, even though such assumptions may
not hold for complex data distributions. For instance, food reviews tend to
involve words like "tasty", whereas movie reviews commonly contain words such
as "thrilling" for the same positive sentiment. This problem is exacerbated when
data are sampled from multiple domains since the dependence between content and
style may vary significantly over domains. In this work, we tackle the
domain-varying dependence between the content and the style variables inherent
in the counterfactual generation task. We provide identification guarantees for
such latent-variable models by leveraging the relative sparsity of the
influences from different latent variables. Our theoretical insights enable the
development of a doMain AdapTive counTerfactual gEneration model called
MATTE. Our theoretically grounded framework achieves state-of-the-art
performance in unsupervised style transfer tasks, where neither paired data
nor style labels are utilized, across four large-scale datasets. Code is
available at https://github.com/hanqi-qi/Matte.git. Comment: NeurIPS 2023.
Controllable generation from a causal perspective with a case study of
ChatGPT; sheds light on theory-guaranteed alignment in language models.
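The content-style disentanglement underlying counterfactual generation can be sketched with a toy style-swap. The fixed-split "encoder" and "decoder" below are purely illustrative and bear no relation to MATTE's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, content_dim=3):
    """Toy 'encoder': split a representation into content and style parts."""
    return x[:content_dim], x[content_dim:]

def decode(content, style):
    """Toy 'decoder': recombine content and style latents."""
    return np.concatenate([content, style])

# Counterfactual generation by swapping style latents between two inputs
# while keeping content fixed (the disentanglement assumption).
x_a = rng.normal(size=5)
x_b = rng.normal(size=5)
c_a, _s_a = encode(x_a)
_c_b, s_b = encode(x_b)
counterfactual = decode(c_a, s_b)  # content of A rendered in the style of B

assert np.allclose(counterfactual[:3], x_a[:3])
assert np.allclose(counterfactual[3:], x_b[3:])
```

The paper's contribution is showing when such a split remains identifiable even though the dependence between content and style varies across domains, which this fixed-split toy deliberately ignores.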
Position bias mitigation : a knowledge-aware graph model for emotion cause extraction
The Emotion Cause Extraction (ECE) task aims to identify clauses which
contain emotion-evoking information for a particular emotion expressed in
text. We observe that a widely used ECE dataset exhibits a bias: the majority
of annotated cause clauses are either directly before their associated emotion
clauses or are the emotion clauses themselves. Existing models for ECE tend to
exploit such relative position information and suffer from the dataset bias.
To investigate the degree of reliance of existing ECE models on clause
relative positions, we propose a novel strategy to generate adversarial
examples in which relative position information is no longer an indicative
feature of cause clauses. We test the performance of existing models on such
adversarial examples and observe a significant performance drop. To address
the dataset bias, we propose a novel graph-based method to explicitly model
emotion triggering paths by leveraging commonsense knowledge to enhance the
semantic dependencies between a candidate clause and an emotion clause.
Experimental results show that our proposed approach performs on par with
existing state-of-the-art methods on the original ECE dataset, and is more
robust against adversarial attacks compared to existing models. Comment:
ACL 2021 Main Conference Long Paper.
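One simple way to break the positional cue, sketched below, is to shuffle clause order while tracking where the cause clause lands. This is an illustrative strategy, not necessarily the paper's exact adversarial construction:

```python
import random

def make_adversarial(clauses, cause_idx, seed=0):
    """Shuffle clause order so relative position no longer signals the
    cause clause; return the shuffled clauses and the new cause index."""
    rng = random.Random(seed)
    order = list(range(len(clauses)))
    rng.shuffle(order)
    shuffled = [clauses[i] for i in order]
    return shuffled, order.index(cause_idx)

clauses = ["He lost his job", "so he felt sad", "and stayed home"]
shuffled, new_idx = make_adversarial(clauses, cause_idx=0)

# The cause clause is still present, just no longer in a predictable position.
assert shuffled[new_idx] == "He lost his job"
assert sorted(shuffled) == sorted(clauses)
```

A model that truly understands emotion-cause semantics should still locate the cause clause after such a shuffle; one that merely memorized "the cause is the previous clause" will not.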
Addressing token uniformity in transformers via singular value transformation
Token uniformity is commonly observed in transformer-based models, in which different tokens share a large proportion of similar information after passing through multiple stacked self-attention layers in a transformer. In this paper, we propose to use the distribution of singular values of the outputs of each transformer layer to characterise the phenomenon of token uniformity, and we empirically illustrate that a less skewed singular value distribution can alleviate the token uniformity problem. Based on our observations, we define several desirable properties of singular value distributions and propose a novel transformation function for updating the singular values. We show that, apart from alleviating token uniformity, the transformation function should preserve the local neighbourhood structure of the original embedding space. Our proposed singular value transformation function is applied to a range of transformer-based language models such as BERT, ALBERT, RoBERTa and DistilBERT, and improved performance is observed in semantic textual similarity evaluation and on a range of GLUE tasks.
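One way to make a skewed singular value spectrum less skewed, sketched below, is a simple power transform on the singular values of a layer's output matrix. This is an illustrative transformation under that assumption, not the paper's proposed function:

```python
import numpy as np

def transform_singular_values(X, alpha=0.5):
    """Soften a skewed singular value spectrum by raising singular values
    to a power alpha in (0, 1), then rescaling so the largest singular
    value is unchanged. Reconstructs the matrix from the new spectrum."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_new = s ** alpha
    s_new *= s[0] / s_new[0]          # keep the top singular value fixed
    return U @ np.diag(s_new) @ Vt

# Toy "layer output": rows are token embeddings.
X = np.random.default_rng(0).normal(size=(8, 4))
Y = transform_singular_values(X)

s_x = np.linalg.svd(X, compute_uv=False)
s_y = np.linalg.svd(Y, compute_uv=False)
# The transformed spectrum is less skewed: a smaller top-to-bottom ratio.
assert s_y[0] / s_y[-1] <= s_x[0] / s_x[-1] + 1e-9
```

Flattening the spectrum this way spreads token representations across more directions of the embedding space, which is exactly the effect a token-uniformity remedy is after; the paper's actual function is additionally constrained to preserve local neighbourhood structure.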