Spending Money Wisely: Online Electronic Coupon Allocation based on Real-Time User Intent Detection
The online electronic coupon (e-coupon) is becoming a primary tool for e-commerce
platforms to attract users to place orders. E-coupons are the digital
equivalent of traditional paper coupons, which provide customers with discounts
or gifts. A fundamental related problem is how to deliver e-coupons at minimal
cost while maximizing users' willingness to place an order. We
call this problem the coupon allocation problem. This is a non-trivial problem
since the number of regular users on a mature e-platform often reaches hundreds
of millions and the types of e-coupons to be allocated are often multiple. The
policy space is extremely large and the online allocation has to satisfy a
budget constraint. Moreover, one can never observe the same user's responses
under different policies, which increases the uncertainty of the policy-making
process. Previous work fails to deal with these challenges. In this paper, we
decompose the coupon allocation task into two subtasks: the user intent
detection task and the allocation task. Accordingly, we propose a two-stage
solution: at the first stage (detection stage), we put forward a novel
Instantaneous Intent Detection Network (IIDN) which takes the user-coupon
features as input and predicts user real-time intents; at the second stage
(allocation stage), we model the allocation problem as a Multiple-Choice
Knapsack Problem (MCKP) and provide a computationally efficient allocation method
using the intents predicted at the detection stage. We conduct extensive online
and offline experiments and the results show the superiority of our proposed
framework, which has brought great profits to the platform and continues to
function online.
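To make the Multiple-Choice Knapsack formulation concrete, here is a minimal sketch of the allocation stage. It is not the allocation method from the paper: it swaps in a textbook exact dynamic program over an integer budget, which suits only small instances. The function name `allocate_coupons`, the `(cost, predicted_value)` layout, and the example numbers are all assumptions made for illustration; the predicted values stand in for the intents produced at the detection stage.

```python
from typing import List, Tuple

# Hypothetical data layout: options[u] lists (cost, predicted_value) pairs
# for user u. Exactly one option is chosen per user; a (0, v) entry models
# "send no coupon".
Option = Tuple[int, float]


def allocate_coupons(options: List[List[Option]], budget: int) -> Tuple[float, List[int]]:
    """Exact MCKP solver by dynamic programming over integer budgets."""
    neg = float("-inf")
    # dp[b] = best total value with total cost exactly b over users seen so far
    dp = [neg] * (budget + 1)
    dp[0] = 0.0
    choice: List[List[int]] = []  # choice[u][b] = option picked for user u at cost b

    for user_opts in options:
        new_dp = [neg] * (budget + 1)
        picked = [-1] * (budget + 1)
        for b in range(budget + 1):
            for k, (cost, value) in enumerate(user_opts):
                if cost <= b and dp[b - cost] != neg:
                    cand = dp[b - cost] + value
                    if cand > new_dp[b]:
                        new_dp[b] = cand
                        picked[b] = k
        dp = new_dp
        choice.append(picked)

    best_b = max(range(budget + 1), key=lambda b: dp[b])
    if dp[best_b] == neg:
        raise ValueError("no feasible allocation within the budget")

    # Walk backwards through the stored choices to recover the allocation.
    alloc: List[int] = []
    b = best_b
    for u in range(len(options) - 1, -1, -1):
        k = choice[u][b]
        alloc.append(k)
        b -= options[u][k][0]
    alloc.reverse()
    return dp[best_b], alloc


# Illustrative numbers only: three users, three coupon options each.
example_options = [
    [(0, 0.10), (2, 0.30), (5, 0.50)],
    [(0, 0.05), (2, 0.40), (5, 0.45)],
    [(0, 0.20), (2, 0.25), (5, 0.70)],
]
total_value, allocation = allocate_coupons(example_options, budget=7)
```

The table has one row per user and one column per budget unit, so time and memory grow with users times budget. At the scale the abstract describes (hundreds of millions of users) an exact table like this would not be practical, which is presumably why the authors emphasize a computationally efficient method.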
Learning to Collaborate in Multi-Module Recommendation via Multi-Agent Reinforcement Learning without Communication
With the rise of online e-commerce platforms, more and more customers prefer
to shop online. To sell more products, online platforms introduce various
modules to recommend items with different properties such as huge discounts. A
web page often consists of different independent modules. The ranking policies
of these modules are decided by different teams and optimized individually
without cooperation, which might result in competition between modules. Thus,
the global policy of the whole page could be sub-optimal. In this paper, we
propose a novel multi-agent cooperative reinforcement learning approach with
the restriction that different modules cannot communicate. Our contributions
are three-fold. Firstly, inspired by a solution concept in game theory named
correlated equilibrium, we design a signal network to promote cooperation of
all modules by generating signals (vectors) for different modules. Secondly, an
entropy-regularized version of the signal network is proposed to coordinate
agents' exploration of the optimal global policy. Thirdly, experiments
based on real-world e-commerce data demonstrate that our algorithm obtains
superior performance over baselines.
Enabling Resilience in Cyber-Physical-Human Water Infrastructures
Rapid urbanization and growth in urban populations have forced community-scale infrastructures (e.g., water, power and natural gas distribution systems, and transportation networks) to operate at their limits. Aging (and failing) infrastructures around the world are becoming increasingly vulnerable to operational degradation, extreme weather, natural disasters and cyber attacks/failures. These trends have wide-ranging socioeconomic consequences and raise public safety concerns. In this thesis, we introduce the notion of cyber-physical-human infrastructures (CPHIs) - smart community-scale infrastructures that bridge technologies with physical infrastructures and people. CPHIs are highly dynamic stochastic systems characterized by complex physical models that exhibit regionwide variability and uncertainty under disruptions. Failures in these distributed settings tend to be difficult to predict and estimate, and expensive to repair. Real-time fault identification is crucial to ensure continuity of lifeline services to customers at adequate levels of quality. Emerging smart community technologies have the potential to transform our failing infrastructures into robust and resilient future CPHIs.
In this thesis, we explore one such CPHI - community water infrastructures. Current urban water infrastructures, which are decades (sometimes over 100 years) old, encompass diverse geophysical regimes. Water stress concerns include the scarcity of supply and an increase in demand due to urbanization. Deterioration and damage to the infrastructure can disrupt water service; contamination events can result in economic and public health consequences. Unfortunately, little investment has gone into modernizing this key lifeline.
To enhance the resilience of water systems, we propose an integrated middleware framework for quick and accurate identification of failures in complex water networks that exhibit uncertain behavior. Our proposed approach integrates IoT-based sensing, domain-specific models and simulations with machine learning methods to identify failures (pipe breaks, contamination events). The composition of techniques results in cost-accuracy-latency tradeoffs in fault identification, inherent in CPHIs due to the constraints imposed by cyber components, physical mechanics and human operators. Three key resilience problems are addressed in this thesis: isolation of multiple faults under a small number of failures, state estimation of the water systems under extreme events such as earthquakes, and contaminant source identification in water networks using human-in-the-loop based sensing. By working with real-world water agencies (WSSC, DC and LADWP, LA), we first develop an understanding of operations of water CPHI systems. We design and implement a sensor-simulation-data integration framework, AquaSCALE, and apply it to localize multiple concurrent pipe failures. We use a mixture of infrastructure measurements (i.e., historical and live water pressure/flow), environmental data (e.g., weather) and human inputs (e.g., Twitter feeds), combined and enhanced with the domain model and supervised learning techniques, to locate multiple failures at fine levels of granularity (individual pipeline level) with detection time reduced by orders of magnitude (from hours/days to minutes). We next consider the resilience of water infrastructures under extreme events (e.g., earthquakes) - the challenge here is the lack of a priori knowledge and the increased number and severity of damages to infrastructures. We present a graphical model based approach for efficient online state estimation, where the offline graph factorization partitions a given network into disjoint subgraphs, and the belief propagation based inference is executed on-the-fly in a distributed manner on those subgraphs. Our proposed approach can isolate 80% of broken pipes and 99% of loss-of-service to end-users during an earthquake.
Finally, we address issues of water quality - today this is a human-in-the-loop process where operators need to gather water samples for lab tests. We incorporate the necessary abstractions with event processing methods into a workflow, which iteratively selects and refines the set of potential failure points via human-driven grab sampling. Our approach utilizes Hidden Markov Model based representations for event inference, along with reinforcement learning methods for further refining event locations and reducing the cost of human efforts.
The proposed techniques are integrated into a middleware architecture, which enables components to communicate/collaborate with one another. We validate our approaches through a prototype implementation with multiple real-world water networks, supply-demand patterns from water utilities and policies set by the U.S. EPA. While our focus here is on water infrastructures in a community, the developed end-to-end solution is applicable to other infrastructures and community services which operate in disruptive and resource-constrained environments.