17 research outputs found
HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences
In this paper, we tackle the important yet under-investigated problem of
making long-horizon prediction of event sequences. Existing state-of-the-art
models do not perform well at this task due to their autoregressive structure.
We propose HYPRO, a hybridly normalized probabilistic model that naturally fits
this task: its first part is an autoregressive base model that learns to
propose predictions; its second part is an energy function that learns to
reweight the proposals such that more realistic predictions end up with higher
probabilities. We also propose efficient training and inference algorithms for
this model. Experiments on multiple real-world datasets demonstrate that our
proposed HYPRO model can significantly outperform previous models at making
long-horizon predictions of future events. We also conduct a range of ablation
studies to investigate the effectiveness of each component of our proposed
methods.
Comment: NeurIPS 2022 camera-ready
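The propose-then-reweight idea can be illustrated with a minimal sketch: a stand-in base model draws candidate continuations, a toy energy function scores them, and exp(-energy) weights renormalize the proposal distribution. All names and the energy function here are illustrative, not the paper's actual model.

```python
import math
import random

def propose_sequences(num_proposals, horizon, event_types, rng):
    """Stand-in for the autoregressive base model: draws candidate
    continuations (lists of event types) at random."""
    return [[rng.choice(event_types) for _ in range(horizon)]
            for _ in range(num_proposals)]

def energy(seq):
    """Toy energy: penalize repeated consecutive events, a hypothetical
    stand-in for a learned realism score (lower energy = more realistic)."""
    return sum(1.0 for a, b in zip(seq, seq[1:]) if a == b)

def reweight(proposals):
    """Hybrid normalization sketch: combine the (here uniform) proposal
    distribution with exp(-energy) and renormalize to probabilities."""
    scores = [math.exp(-energy(s)) for s in proposals]
    z = sum(scores)
    return [w / z for w in scores]

rng = random.Random(0)
proposals = propose_sequences(8, 5, ["A", "B", "C"], rng)
weights = reweight(proposals)
# The highest-weight proposal is the model's long-horizon prediction.
best = proposals[max(range(len(proposals)), key=lambda i: weights[i])]
```

In the actual model both parts are learned jointly; the sketch only shows how energy-based reweighting reshapes a proposal distribution.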
A Graph Regularized Point Process Model For Event Propagation Sequence
Point processes are the dominant paradigm for modeling event sequences occurring
at irregular intervals. In this paper we aim at modeling the latent dynamics of
event propagation on a graph, where the event sequence propagates in a directed
weighted graph whose nodes represent event marks (e.g., event types). Most
existing works have only considered encoding the sequential event history into
an event representation, ignoring the information from the latent graph
structure. Besides, they also suffer from poor model explainability, i.e.,
failing to uncover causal influence across a wide variety of nodes. To address
these problems, we propose a Graph Regularized Point Process (GRPP) that can be
decomposed into: 1) a graph propagation model that characterizes the event
interactions across nodes with neighbors and inductively learns node
representations; 2) a temporal attentive intensity model, whose excitation and
time decay factors of past events on the current event are constructed via the
contextualization of the node embedding. Moreover, by applying a graph
regularization method, GRPP provides model interpretability by uncovering
influence strengths between nodes. Numerical experiments on various datasets
show that GRPP outperforms existing models on both the propagation time and
node prediction by notable margins.
Comment: IJCNN 202
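The excitation-plus-decay intensity structure can be sketched with a Hawkes-style toy model: each node's intensity is its base rate plus exponentially decaying excitation from past events, scaled by pairwise influence strengths of the kind GRPP's graph regularization would uncover. The influence values and rates below are hand-set and purely illustrative.

```python
import math

# Hypothetical influence strengths between nodes (event types) and base
# rates; in GRPP these would be learned, here they are fixed by hand.
influence = {("A", "B"): 0.8, ("B", "C"): 0.5, ("A", "C"): 0.1}
base_rate = {"A": 0.2, "B": 0.1, "C": 0.1}
decay = 1.0

def intensity(node, t, history):
    """Hawkes-style intensity: base rate plus excitation from past
    events with exponential time decay -- a simplified stand-in for
    GRPP's temporal attentive intensity model."""
    lam = base_rate[node]
    for (u, t_i) in history:
        if t_i < t:
            lam += influence.get((u, node), 0.0) * math.exp(-decay * (t - t_i))
    return lam

history = [("A", 0.0), ("B", 1.0)]
lam_c = intensity("C", 2.0, history)  # excited above C's base rate
```

The paper's model replaces the fixed influence table with node embeddings learned inductively by graph propagation; the sketch only shows the intensity's shape.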
Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning
Large language models have shown astonishing performance on a wide range of
reasoning tasks. In this paper, we investigate whether they could reason about
real-world events and help improve the prediction performance of event sequence
models. We design LAMP, a framework that integrates a large language model in
event prediction. Particularly, the language model performs abductive reasoning
to assist an event sequence model: the event model proposes predictions on
future events given the past; instructed by a few expert-annotated
demonstrations, the language model learns to suggest possible causes for each
proposal; a search module finds out the previous events that match the causes;
a scoring function learns to examine whether the retrieved events could
actually cause the proposal. Through extensive experiments on several
challenging real-world datasets, we demonstrate that our framework -- thanks to
the reasoning capabilities of large language models -- could significantly
outperform the state-of-the-art event sequence models.
Comment: NeurIPS 2023 camera-ready
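The four-stage pipeline (propose, suggest causes, search, score) can be sketched end to end with mocks. Every function body below is a hand-written stand-in: the real framework uses a learned event model, an LLM for abductive reasoning, and a learned scoring function.

```python
def propose(history):
    """Stand-in for the event sequence model's top-k proposals."""
    return ["market_rally", "market_crash"]

def suggest_causes(proposal):
    """Stand-in for the LLM's few-shot abductive reasoning: maps a
    proposed event to plausible causes (a hand-written table here)."""
    table = {"market_rally": ["rate_cut", "strong_earnings"],
             "market_crash": ["rate_hike", "bank_failure"]}
    return table.get(proposal, [])

def search(causes, history):
    """Search module: retrieve past events that match suggested causes."""
    return [e for e in history if e in causes]

def score(proposal, evidence):
    """Toy compatibility score: count of retrieved supporting events
    (the real framework learns this function)."""
    return len(evidence)

history = ["rate_cut", "ipo", "strong_earnings"]
ranked = sorted(propose(history),
                key=lambda p: score(p, search(suggest_causes(p), history)),
                reverse=True)
# "market_rally" ranks first: two of its suggested causes appear in history.
```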
Automatic Deduction Path Learning via Reinforcement Learning with Environmental Correction
Automatic bill payment is an important part of business operations in fintech
companies. In practice, deduction has mainly been based on the total amount, or
on heuristic search that divides the bill into smaller parts to deduct as much
as possible. This article proposes an end-to-end approach that automatically
learns the optimal deduction paths (deduction amounts in order), which reduces
the cost of manual path design and maximizes the amount of successful
deduction. Specifically, in view of the large search space of the paths and the
extreme sparsity of historical successful deduction records, we propose a deep
hierarchical reinforcement learning approach which abstracts the action into a
two-level hierarchical space: an upper agent that determines the number of
steps of deductions each day and a lower agent that decides the amount of
deduction at each step. In such a way, the action space is structured via prior
knowledge and the exploration space is reduced. Moreover, the inherent
information incompleteness of the business makes the environment only partially
observable. To be precise, the deducted amounts indicate merely lower
bounds of the available account balance. To this end, we formulate the problem
as a partially observable Markov decision process (POMDP) and employ an
environment correction algorithm based on the characteristics of the business.
In the world's largest electronic payment business, we have verified the
effectiveness of this scheme offline and deployed it online to serve millions
of users.
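The two-level action structure can be sketched as follows: an upper agent picks how many deduction attempts to make, a lower agent picks each attempt's amount, and the environment reveals only whether the deduction succeeded (a lower bound on the hidden balance). All policies here are toy stochastic stand-ins, not the paper's learned agents.

```python
import random

def upper_policy(state, rng):
    """Upper agent: choose how many deduction attempts to make today
    (a toy stochastic policy standing in for the learned one)."""
    return rng.randint(1, 3)

def lower_policy(state, remaining_bill, rng):
    """Lower agent: choose the amount for one deduction attempt."""
    return round(remaining_bill * rng.uniform(0.3, 1.0), 2)

def environment_step(amount, hidden_remaining):
    """Partially observable environment: a success reveals only that
    the account balance was at least the deducted amount."""
    return amount if amount <= hidden_remaining else 0.0

rng = random.Random(1)
hidden_balance, bill = 60.0, 100.0  # the agent never observes 60.0
recovered = 0.0
for _ in range(upper_policy(None, rng)):
    amt = lower_policy(None, bill - recovered, rng)
    recovered += environment_step(amt, hidden_balance - recovered)
```

Structuring the action space this way (steps per day at the upper level, amounts at the lower level) is what shrinks the otherwise huge search space of deduction paths.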
Continual Learning in Predictive Autoscaling
Predictive Autoscaling is used to forecast the workloads of servers and
prepare the resources in advance to ensure service level objectives (SLOs) in
dynamic cloud environments. However, in practice, its prediction task often
suffers from performance degradation under abnormal traffic caused by external
events (such as sales promotions and application
re-configurations), for which a common solution is to re-train the model with
data of a long historical period, but at the expense of high computational and
storage costs. To better address this problem, we propose a replay-based
continual learning method, i.e., Density-based Memory Selection and Hint-based
Network Learning Model (DMSHM), using only a small part of the historical log
to achieve accurate predictions. First, we discover the phenomenon of sample
overlap when applying replay-based continual learning in prediction tasks. In
order to surmount this challenge and effectively integrate new sample
distribution, we propose a density-based sample selection strategy that
utilizes kernel density estimation to calculate sample density as a reference
to compute sample weight, and employs weight sampling to construct a new memory
set. Then we implement hint-based network learning based on hint representation
to optimize the parameters. Finally, we conduct experiments on public and
industrial datasets to demonstrate that our proposed method outperforms
state-of-the-art continual learning methods in terms of memory capacity and
prediction accuracy. Furthermore, we demonstrate the remarkable practicability
of DMSHM in real industrial applications.
Prompt-augmented Temporal Point Process for Streaming Event Sequence
Neural Temporal Point Processes (TPPs) are the prevalent paradigm for
modeling continuous-time event sequences, such as user activities on the web
and financial transactions. In real-world applications, event data is typically
received in a \emph{streaming} manner, where the distribution of patterns may
shift over time. Additionally, \emph{privacy and memory constraints} are
commonly observed in practical scenarios, further compounding the challenges.
Therefore, the continuous monitoring of a TPP to learn the streaming event
sequence is an important yet under-explored problem. Our paper addresses
this challenge by adopting Continual Learning (CL), which makes the model
capable of continuously learning a sequence of tasks without catastrophic
forgetting under realistic constraints. Correspondingly, we propose a simple
yet effective framework, PromptTPP\footnote{Our code is available at {\small
\url{ https://github.com/yanyanSann/PromptTPP}}}, by integrating the base TPP
with a continuous-time retrieval prompt pool. The prompts, small learnable
parameters, are stored in a memory space and jointly optimized with the base
TPP, ensuring that the model learns event streams sequentially without
buffering past examples or task-specific attributes. We present a novel and
realistic experimental setup for modeling event streams, where PromptTPP
consistently achieves state-of-the-art performance across three real user
behavior datasets.
Comment: NeurIPS 2023 camera-ready version
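The retrieval-prompt-pool mechanism can be sketched as nearest-key lookup: a query embedding of the current event-stream window is matched against prompt keys by cosine similarity, and the top-k prompts are handed to the base model. The pool contents and vectors below are hypothetical stand-ins for the learned tensors.

```python
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    da = math.sqrt(sum(x * x for x in a))
    db = math.sqrt(sum(x * x for x in b))
    return num / (da * db)

# Hypothetical prompt pool: key vectors paired with small learnable
# prompt parameters (plain lists standing in for tensors).
pool = {
    "burst": {"key": [1.0, 0.0], "prompt": [0.3, 0.3]},
    "quiet": {"key": [0.0, 1.0], "prompt": [-0.2, 0.1]},
    "mixed": {"key": [0.7, 0.7], "prompt": [0.05, 0.05]},
}

def retrieve_prompts(query, top_k=2):
    """Retrieve the top-k prompts whose keys best match the query
    embedding of the current event-stream window."""
    ranked = sorted(pool.items(),
                    key=lambda kv: cosine(query, kv[1]["key"]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

selected = retrieve_prompts([0.9, 0.1])  # a bursty-looking window
```

Because only the small prompt parameters are stored and updated, the model can adapt to shifting streams without buffering past examples, which is the constraint the paper targets.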
WeaverBird: Empowering Financial Decision-Making with Large Language Model, Knowledge Base, and Search Engine
We present WeaverBird, an intelligent dialogue system designed specifically
for the finance domain. Our system harnesses a large language model of GPT
architecture that has been tuned using extensive corpora of finance-related
text. As a result, our system possesses the capability to understand complex
financial queries, such as "How should I manage my investments during
inflation?", and provide informed responses. Furthermore, our system
incorporates a local knowledge base and a search engine to retrieve relevant
information. The final responses are conditioned on the search results and
include proper citations to the sources, thus enjoying enhanced credibility.
Through a range of finance-related questions, we have demonstrated the superior
performance of our system compared to other models. To experience our system
firsthand, users can interact with our live demo at
https://weaverbird.ttic.edu, as well as watch our 2-min video illustration at
https://www.youtube.com/watch?v=fyV2qQkX6Tc
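The retrieve-then-condition flow can be sketched with mocks: a toy keyword retriever stands in for the knowledge base and search engine, and the response is conditioned on the retrieved passages with source citations. The documents and retrieval logic are invented for illustration; the real system uses a tuned GPT-architecture model.

```python
# Hypothetical document store standing in for the local knowledge base
# and search-engine results.
documents = {
    "doc1": "TIPS and short-duration bonds can hedge inflation risk.",
    "doc2": "Commodities historically perform well during inflation.",
}

def retrieve(query):
    """Toy keyword retrieval over the combined sources."""
    terms = set(query.lower().split())
    return [doc_id for doc_id, text in documents.items()
            if terms & set(text.lower().split())]

def answer(query):
    """Condition the (mocked) response on retrieved passages and cite
    the sources, mirroring the system's citation behavior."""
    hits = retrieve(query)
    citations = ", ".join(f"[{h}]" for h in hits)
    return f"Based on retrieved evidence {citations}: see cited sources."

reply = answer("how to manage investments during inflation")
```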
EasyTPP: Towards Open Benchmarking Temporal Point Processes
Continuous-time event sequences play a vital role in real-world domains such
as healthcare, finance, online shopping, social networks, and so on. To model
such data, temporal point processes (TPPs) have emerged as the most natural and
competitive models, making a significant impact in both academic and
application communities. Despite the emergence of many powerful models in
recent years, there hasn't been a central benchmark for these models and future
research endeavors. This lack of standardization impedes researchers and
practitioners from comparing methods and reproducing results, potentially
slowing down progress in this field. In this paper, we present EasyTPP, the
first central repository of research assets (e.g., data, models, evaluation
programs, documentation) in the area of event sequence modeling. EasyTPP
makes several unique contributions to this area: a unified interface for using
existing datasets and adding new ones; a wide range of evaluation programs
that are easy to use and extend and that facilitate reproducible research; and
implementations of popular neural TPPs, together with a rich library of modules
that can be composed to quickly build complex models. All the data and
implementation can be found at
https://github.com/ant-research/EasyTemporalPointProcess. We will actively
maintain this benchmark and welcome contributions from other researchers and
practitioners. Our benchmark will help promote reproducible research in this
field, thus accelerating research progress as well as making more significant
real-world impacts.
Comment: ICLR 2024 camera-ready
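The value of a unified interface can be illustrated with a skeletal registry that decouples datasets, models, and evaluation programs. Every class and function name below is hypothetical, invented for illustration; it is not EasyTPP's actual API, which lives at the repository linked above.

```python
# Illustrative sketch of a unified benchmarking interface; all names
# here are hypothetical, not EasyTPP's actual API.
class EventDataset:
    def __init__(self, name, sequences):
        self.name = name
        self.sequences = sequences  # list of [(time, event_type), ...]

class BenchmarkRegistry:
    """Datasets, models, and metrics register independently, so any
    model can be evaluated on any dataset with any metric."""
    def __init__(self):
        self._datasets, self._models = {}, {}

    def register_dataset(self, dataset):
        self._datasets[dataset.name] = dataset

    def register_model(self, name, factory):
        self._models[name] = factory

    def evaluate(self, model_name, dataset_name, metric):
        model = self._models[model_name]()
        return metric(model, self._datasets[dataset_name])

def mae_next_time(model, data):
    """Metric: mean absolute error of predicted next inter-event time."""
    errors = []
    for seq in data.sequences:
        for (t0, _), (t1, _) in zip(seq, seq[1:]):
            errors.append(abs(model(None) - (t1 - t0)))
    return sum(errors) / len(errors)

registry = BenchmarkRegistry()
registry.register_dataset(EventDataset("toy", [[(0.1, "A"), (0.5, "B")]]))
registry.register_model("constant", lambda: (lambda seq: 0.5))
score = registry.evaluate("constant", "toy", mae_next_time)
```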
Leveraging Large Language Models for Pre-trained Recommender Systems
Recent advancements in recommendation systems have shifted towards more
comprehensive and personalized recommendations by utilizing large language
models (LLMs). However, effectively integrating LLMs' commonsense knowledge and
reasoning abilities into recommendation systems remains a challenging problem.
In this paper, we propose RecSysLLM, a novel pre-trained recommendation model
based on LLMs. RecSysLLM retains LLM reasoning and knowledge while integrating
recommendation domain knowledge through unique designs of data, training, and
inference. This allows RecSysLLM to leverage LLMs' capabilities for
recommendation tasks in an efficient, unified framework. We demonstrate the
effectiveness of RecSysLLM on benchmarks and real-world scenarios. RecSysLLM
provides a promising approach to developing unified recommendation systems by
fully exploiting the power of pre-trained language models.
Comment: 13 pages, 4 figures
Enhancing Recommender Systems with Large Language Model Reasoning Graphs
Recommendation systems aim to provide users with relevant suggestions, but
often lack interpretability and fail to capture higher-level semantic
relationships between user behaviors and profiles. In this paper, we propose a
novel approach that leverages large language models (LLMs) to construct
personalized reasoning graphs. These graphs link a user's profile and
behavioral sequences through causal and logical inferences, representing the
user's interests in an interpretable way. Our approach, LLM reasoning graphs
(LLMRG), has four components: chained graph reasoning, divergent extension,
self-verification and scoring, and knowledge base self-improvement. The
resulting reasoning graph is encoded using graph neural networks, and this
encoding serves as additional input to improve conventional recommender systems, without
requiring extra user or item information. Our approach demonstrates how LLMs
can enable more logical and interpretable recommender systems through
personalized reasoning graphs. LLMRG allows recommendations to benefit from
both engineered recommendation systems and LLM-derived reasoning graphs. We
demonstrate the effectiveness of LLMRG on benchmarks and real-world scenarios
in enhancing base recommendation models.
Comment: 12 pages, 6 figures
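The graph-as-extra-signal idea can be sketched minimally: node embeddings of a small personalized reasoning graph are pooled (a crude stand-in for a GNN readout) and combined with a base recommender's score. All embeddings and the inferred node are hand-set and hypothetical.

```python
# Toy personalized reasoning graph: observed behaviors plus a node the
# LLM might hypothetically infer by causal/logical reasoning.
graph_nodes = {
    "likes_running":   [0.9, 0.1],
    "bought_shoes":    [0.8, 0.2],
    "trains_for_race": [0.7, 0.3],  # hypothetically LLM-inferred
}

def encode_graph(nodes):
    """Mean-pool node embeddings -- a crude stand-in for a GNN readout
    over the reasoning graph."""
    dim = len(next(iter(nodes.values())))
    return [sum(v[i] for v in nodes.values()) / len(nodes)
            for i in range(dim)]

def base_score(item_embedding):
    """Stand-in for the conventional recommender's score."""
    return item_embedding[0] * 0.5

def combined_score(item_embedding, graph_embedding):
    """Base score plus affinity with the encoded reasoning graph; the
    graph signal needs no extra user or item information."""
    dot = sum(a * b for a, b in zip(item_embedding, graph_embedding))
    return base_score(item_embedding) + dot

g = encode_graph(graph_nodes)
running_item, cooking_item = [1.0, 0.0], [0.0, 1.0]
# The running-related item benefits from the reasoning-graph signal.
```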