Crowdsourced PAC Learning under Classification Noise
In this paper, we analyze PAC learnability from labels produced by
crowdsourcing. In our setting, unlabeled examples are drawn from a distribution
and labels are crowdsourced from workers who operate under classification
noise, each with their own noise parameter. We develop an end-to-end
crowdsourced PAC learning algorithm that takes unlabeled data points as input
and outputs a trained classifier. Our three-step algorithm incorporates
majority voting, pure-exploration bandits, and noisy-PAC learning. We prove
several guarantees on the number of tasks labeled by workers for PAC learning
in this setting and show that our algorithm improves upon the baseline by
reducing the total number of tasks given to workers. We demonstrate the
robustness of our algorithm by exploring its application to additional
realistic crowdsourcing settings.
Comment: 14 pages
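The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes binary labels and, as a simplification, a single shared noise rate `eta` for all workers, whereas the abstract describes a separate noise parameter per worker. The function names are illustrative and not taken from the paper.

```python
from math import comb


def majority_vote(labels):
    """Return the majority label among binary worker labels (0 or 1).

    Ties are broken in favor of 1; an odd number of workers avoids ties.
    """
    ones = sum(labels)
    return 1 if 2 * ones >= len(labels) else 0


def majority_error(k, eta):
    """Probability that a majority vote over k workers is wrong.

    Assumes k is odd and that each worker flips the true label
    independently with the same probability eta (a simplification).
    """
    if k % 2 == 0:
        raise ValueError("k must be odd")
    # The vote is wrong when more than half of the workers flip the label.
    return sum(comb(k, j) * eta**j * (1 - eta) ** (k - j)
               for j in range(k // 2 + 1, k + 1))
```

Under these assumptions, adding workers with noise rate below 1/2 lowers the error of the aggregated label, at the cost of more labeling tasks per example.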
Deconfounded Causal Collaborative Filtering
Recommender systems may be affected by various confounding factors
(also called confounders) that lead to inaccurate recommendations and
degraded recommendation performance. Current approaches to the problem
usually design a dedicated model for each specific confounder.
However, real-world systems may include a huge number of confounders, so
designing a dedicated model for each one is unrealistic. More
importantly, besides the "explicit confounders" that researchers can
manually identify and process, such as an item's position in the ranking list,
there are also many "latent confounders" that researchers cannot anticipate.
For example, a user's rating of a song may depend on their current
mood or the current weather, and a user's preference for ice cream may depend
on the air temperature. Such latent confounders may be unobservable in the
recorded training data. To solve the problem, we propose a deconfounded causal
collaborative filtering model. We first frame user behaviors with unobserved
confounders into a causal graph, and then we design a front-door adjustment
model carefully fused with machine learning to deconfound the influence of
unobserved confounders. The proposed model is able to handle both global
confounders and personalized confounders. Experiments on real-world e-commerce
datasets show that our method is able to deconfound unobserved confounders to
achieve better recommendation performance.
Comment: 9 pages, 5 figures; comments and suggestions are highly appreciated
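For readers unfamiliar with front-door adjustment, the standard formula is P(y | do(x)) = sum over m of P(m | x) times the sum over x' of P(y | x', m) P(x'). The sketch below evaluates it on a toy discrete distribution. The mediator `M`, the dictionary layout, and all numbers are assumptions made for illustration; the abstract does not specify how the paper's model chooses or estimates these quantities.

```python
def front_door(p_x, p_m_given_x, p_y_given_xm, x):
    """P(Y=1 | do(X=x)) via the front-door adjustment formula.

    p_x[x']               : P(X = x')
    p_m_given_x[x][m]     : P(M = m | X = x)
    p_y_given_xm[(x', m)] : P(Y = 1 | X = x', M = m)
    """
    total = 0.0
    for m, p_m in p_m_given_x[x].items():
        # Average the outcome over the observational distribution of X.
        inner = sum(p_y_given_xm[(xp, m)] * p_xp for xp, p_xp in p_x.items())
        total += p_m * inner
    return total


# Toy numbers, chosen only for illustration.
p_x = {0: 0.5, 1: 0.5}
p_m_given_x = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
p_y_given_xm = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.9}

effect_x1 = front_door(p_x, p_m_given_x, p_y_given_xm, x=1)
effect_x0 = front_door(p_x, p_m_given_x, p_y_given_xm, x=0)
```

The formula never conditions on the unobserved confounder itself, which is why it suits settings where the confounder cannot be measured, provided a suitable mediator exists.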
Towards More Robust and Accurate Sequential Recommendation with Cascade-guided Adversarial Training
Sequential recommendation models, which learn from chronological
user-item interactions, outperform traditional recommendation models in many
settings. Despite the success of sequential recommendation models, their
robustness has recently come into question. Two properties unique to
sequential recommendation models may impair their robustness: the cascade
effects induced during training and the model's tendency to rely too heavily on
temporal information. To address these vulnerabilities, we propose
Cascade-guided Adversarial training, a new adversarial training procedure that
is specifically designed for sequential recommendation models. Our approach
harnesses the intrinsic cascade effects present in sequential modeling to
produce strategic adversarial perturbations to item embeddings during training.
Experiments with state-of-the-art sequential models trained on four public
datasets from different domains show that our training approach produces
superior model ranking accuracy and superior model robustness to real item
replacement perturbations when compared to both standard model training and
generic adversarial training.
Comment: Accepted to present at the SIAM International Conference on Data Mining
(SDM24)
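As background for the comparison above, the sketch below shows a generic gradient-based adversarial perturbation of an item embedding, that is, the kind of baseline the abstract calls generic adversarial training. It is not the cascade-guided procedure, whose details the abstract does not give. The dot-product score, the loss, and all names are assumptions for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def adversarial_perturbation(grad, epsilon):
    """Rescale the loss gradient to L2 norm epsilon.

    To first order, this is the direction that increases the loss most
    among all perturbations with norm at most epsilon.
    """
    norm = math.sqrt(dot(grad, grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]


# Toy model: score = <user, item>, loss = -score,
# so the gradient of the loss with respect to the item embedding is -user.
user = [3.0, 4.0]
item = [1.0, 2.0]
grad_wrt_item = [-u for u in user]

delta = adversarial_perturbation(grad_wrt_item, epsilon=0.5)
perturbed_item = [e + d for e, d in zip(item, delta)]
```

In adversarial training, the model would then be updated on the loss computed with `perturbed_item` in addition to (or instead of) the clean embedding.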
On the Unlikelihood of D-Separation
Causal discovery aims to recover a causal graph from data generated by it;
constraint-based methods do so by searching for a d-separating conditioning set
of nodes in the graph via an oracle. In this paper, we provide analytic
evidence that on large graphs, d-separation is a rare phenomenon, even when
guaranteed to exist, unless the graph is extremely sparse. We then provide an
analytic average case analysis of the PC Algorithm for causal discovery, as
well as a variant of the SGS Algorithm we call UniformSGS. We consider a set
$V=\{v_1,\ldots,v_n\}$ of nodes, and generate a random DAG $G=(V,E)$ where
$(v_a,v_b) \in E$ with i.i.d. probability $p_1$ if $a<b$.
We provide upper bounds on the probability that a subset of $V-\{x,y\}$
d-separates $x$ and $y$, conditional on $x$ and $y$ being d-separable; our
upper bounds decay exponentially fast to $0$ as $|V| \rightarrow \infty$. For
the PC Algorithm, while it is known that its worst-case guarantees fail on
non-sparse graphs, we show that the same is true for the average case, and that
the sparsity requirement is quite demanding: for good performance, the density
must go to $0$ as $|V| \rightarrow \infty$ even in the average case. For
UniformSGS, while it is known that the running time is exponential for existing
edges, we show that in the average case, that is the expected running time for
most non-existing edges as well.
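The random-DAG model and the notion of d-separation can be made concrete with a small sketch. It assumes nodes are the integers 0 to n-1, that an edge from a to b (with a < b) is included independently with a probability here called `p`, and it tests d-separation with the standard ancestral moral graph criterion. This illustrates the definitions only, not the paper's analysis or the PC and UniformSGS algorithms.

```python
import random


def random_dag(n, p, seed=0):
    """Include each edge a -> b (a < b) independently with probability p."""
    rng = random.Random(seed)
    return {(a, b) for a in range(n) for b in range(a + 1, n)
            if rng.random() < p}


def ancestors(edges, nodes):
    """Return `nodes` together with all of their ancestors."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    seen, stack = set(nodes), list(nodes)
    while stack:
        v = stack.pop()
        for u in parents.get(v, ()):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen


def d_separated(edges, x, y, z):
    """True if the set z (excluding x and y) d-separates x and y."""
    z = set(z)
    keep = ancestors(edges, {x, y} | z)
    sub = [(a, b) for a, b in edges if a in keep and b in keep]
    # Moralize the ancestral subgraph: drop directions, marry co-parents.
    adj = {v: set() for v in keep}
    parents = {}
    for a, b in sub:
        adj[a].add(b)
        adj[b].add(a)
        parents.setdefault(b, set()).add(a)
    for ps in parents.values():
        for u in ps:
            adj[u] |= ps - {u}
    # x and y are d-separated iff every moral-graph path passes through z.
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u == y:
                return False
            if u not in seen and u not in z:
                seen.add(u)
                stack.append(u)
    return True
```

For example, in the chain 0 -> 1 -> 2 the set {1} d-separates 0 and 2, while in the collider 0 -> 2 <- 1 the empty set d-separates 0 and 1 but {2} does not.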
Causal Layering via Conditional Entropy
Causal discovery aims to recover information about an unobserved causal graph
from the observable data it generates. Layerings are orderings of the variables
which place causes before effects. In this paper, we provide ways to recover
layerings of a graph by accessing the data via a conditional entropy oracle,
when distributions are discrete. Our algorithms work by repeatedly removing
sources or sinks from the graph. Under appropriate assumptions and
conditioning, we can separate the sources or sinks from the remainder of the
nodes by comparing their conditional entropy to the unconditional entropy of
their noise. Our algorithms are provably correct and run in worst-case
quadratic time. The main assumptions are faithfulness and injective noise, and
either known noise entropies or weakly monotonically increasing noise entropies
along directed paths. In addition, we require one of either a very mild
extension of faithfulness, or strictly monotonically increasing noise
entropies, or expanding noise injectivity to include an additional single
argument in the structural functions.
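To illustrate what a layering is and how repeated source removal produces one, the sketch below peels sources off a DAG that is given explicitly. This is an assumption made for illustration: the algorithms in the paper never see the graph and instead identify sources or sinks through a conditional entropy oracle.

```python
def layering_by_sources(nodes, edges):
    """Peel off sources repeatedly; each round of removed nodes is a layer.

    `edges` is a collection of (cause, effect) pairs over `nodes`.
    """
    remaining = set(nodes)
    layers = []
    while remaining:
        # A node is a source if no remaining node points to it.
        has_parent = {b for a, b in edges
                      if a in remaining and b in remaining}
        sources = sorted(remaining - has_parent)
        if not sources:
            raise ValueError("graph contains a cycle")
        layers.append(sources)
        remaining -= set(sources)
    return layers


diamond = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
layers = layering_by_sources("abcd", diamond)
```

Every edge goes from an earlier layer to a later one, so causes are always placed before their effects.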
Enhancing Performance on Seen and Unseen Dialogue Scenarios using Retrieval-Augmented End-to-End Task-Oriented System
End-to-end task-oriented dialogue (TOD) systems have achieved promising
performance by leveraging sophisticated natural language understanding and
natural language generation capabilities of pre-trained models. This work
gives TOD systems more flexibility through a simple cache. The cache
provides the flexibility to dynamically update the TOD systems and handle both
existing and unseen dialogue scenarios. To this end, we first fine-tune a
retrieval module to effectively retrieve the most relevant information entries
from the cache. We then train end-to-end TOD models that can refer to and
ground on both dialogue history and retrieved information during TOD
generation. The cache is straightforward to construct, and the backbone models
of TOD systems are compatible with existing pre-trained generative models.
Extensive experiments demonstrate the superior performance of our framework,
with a notable improvement in non-empty joint goal accuracy by 6.7% compared to
strong baselines.
Comment: Accepted by SIGDIAL 2023 as a long paper
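The retrieve-then-generate flow described above can be sketched in a few lines. This toy version ranks cache entries by token overlap with the query, which stands in for the fine-tuned retrieval module of the paper. The cache contents, the `[SEP]` and `[CTX]` markers, and the function names are all hypothetical.

```python
def retrieve(cache, query, k=2):
    """Return the k cache entries sharing the most tokens with the query.

    Ties are broken by the order of entries in the cache.
    """
    q = set(query.lower().split())
    scored = [(len(q & set(entry.lower().split())), i)
              for i, entry in enumerate(cache)]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [cache[i] for _, i in scored[:k]]


def build_model_input(history, retrieved):
    """Concatenate retrieved entries and dialogue history (hypothetical format)."""
    return " [SEP] ".join(retrieved) + " [CTX] " + " ".join(history)


cache = ["book a hotel room", "find a restaurant table", "order a taxi ride"]
top = retrieve(cache, "I want to book a hotel", k=1)
model_input = build_model_input(["I want to book a hotel"], top)
```

Because the cache is plain data, new entries can be appended at any time without retraining the generator, which is the flexibility the abstract points to.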
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
Despite advancements in conversational AI, language models have
difficulty handling diverse conversational tasks, and existing dialogue
dataset collections often lack diversity and comprehensiveness. To tackle these
issues, we introduce DialogStudio: the largest and most diverse collection of
dialogue datasets, unified under a consistent format while preserving their
original information. Our collection encompasses data from open-domain
dialogues, task-oriented dialogues, natural language understanding,
conversational recommendation, dialogue summarization, and knowledge-grounded
dialogues, making it an incredibly rich and diverse resource for dialogue
research and model training. To further enhance the utility of DialogStudio, we
identify the licenses for each dataset and design domain-aware prompts for
selected dialogues to facilitate instruction-aware fine-tuning. Furthermore, we
develop conversational AI models using the dataset collection, and our
experiments in both zero-shot and few-shot learning scenarios demonstrate the
superiority of DialogStudio. To improve transparency and support dataset and
task-based research, as well as language model pre-training, all datasets,
licenses, codes, and models associated with DialogStudio are made publicly
accessible at https://github.com/salesforce/DialogStudi
Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data
We introduce the Salesforce CausalAI Library, an open-source library for
causal analysis using observational data. It supports causal discovery and
causal inference for tabular and time series data, of both discrete and
continuous types. This library includes algorithms that handle linear and
non-linear causal relationships between variables, and uses multi-processing
for speed-up. We also include a data generator that produces synthetic
data from a specified structural equation model for both of the aforementioned
data formats and types, which helps users control the ground-truth causal
process while investigating various algorithms. Finally, we provide a user interface
(UI) that allows users to perform causal analysis on data without coding. The
goal of this library is to provide a fast and flexible solution for a variety
of problems in the domain of causality. This technical report describes the
Salesforce CausalAI API along with its capabilities, the implementations of the
supported algorithms, and experiments demonstrating their performance and
speed. Our library is available at
https://github.com/salesforce/causalai
REX: Rapid Exploration and eXploitation for AI Agents
In this paper, we propose an enhanced approach for Rapid Exploration and
eXploitation for AI Agents called REX. Existing AutoGPT-style techniques have
inherent limitations, such as a heavy reliance on precise descriptions for
decision-making, and the lack of a systematic approach to leverage try-and-fail
procedures akin to traditional Reinforcement Learning (RL). REX introduces an
additional layer of rewards and integrates concepts similar to Upper Confidence
Bound (UCB) scores, leading to more robust and efficient AI agent performance.
This approach can make use of offline behaviors from logs and integrates
seamlessly with existing foundation models without requiring any model
fine-tuning. Through comparative
analysis with existing methods such as Chain-of-Thought (CoT) and Reasoning viA
Planning (RAP), REX-based methods demonstrate comparable performance and, in
certain cases, even surpass the results achieved by these existing techniques.
Notably, REX-based methods exhibit remarkable reductions in execution time,
enhancing their practical applicability across a diverse set of scenarios.
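The Upper Confidence Bound idea referenced in the abstract can be illustrated with the classic UCB1 score, which adds an exploration bonus to the empirical mean reward of each action. This is a sketch of standard UCB1 only. How REX defines its rewards and adapts the score for AI agents is not stated in the abstract, and the names below are illustrative.

```python
import math


def ucb_score(mean_reward, n_action, n_total, c=math.sqrt(2)):
    """UCB1-style score: empirical mean plus an exploration bonus.

    Untried actions get an infinite score so that each is tried once.
    """
    if n_action == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(n_total) / n_action)


def select_action(stats):
    """Pick the action with the highest score.

    `stats` maps action -> (sum_of_rewards, times_tried).
    """
    n_total = sum(n for _, n in stats.values())

    def score(action):
        total, n = stats[action]
        mean = total / n if n else 0.0
        return ucb_score(mean, n, n_total)

    return max(stats, key=score)
```

The bonus shrinks as an action is tried more often, so the rule shifts from exploration toward exploitation as evidence accumulates.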