Truthful Facility Assignment with Resource Augmentation: An Exact Analysis of Serial Dictatorship
We study the truthful facility assignment problem, where a set of agents with
private most-preferred points on a metric space are assigned to facilities that
lie on the metric space, under capacity constraints on the facilities. The goal
is to produce an assignment that minimizes the social cost, i.e., the
total distance between the most-preferred points of the agents and their
corresponding facilities in the assignment, under the constraint of
truthfulness, which ensures that agents do not misreport their most-preferred
points.
We propose a resource augmentation framework, where a truthful mechanism is
evaluated by its worst-case performance on an instance with enhanced facility
capacities against the optimal mechanism on the same instance with the original
capacities. We study a very well-known mechanism, Serial Dictatorship, and
provide an exact analysis of its performance. Although Serial Dictatorship is a
purely combinatorial mechanism, our analysis uses linear programming; a linear
program expresses its greedy nature as well as the structure of the input, and
finds the input instance that forces the mechanism to its worst-case
performance. Bounding the objective of the linear program using duality
arguments allows us to compute tight bounds on the approximation ratio. Among
other results, we prove a tight bound on the approximation ratio of Serial
Dictatorship when the capacities are multiplied by any integer augmentation factor. Our
results suggest that even a limited augmentation of the resources can have
wondrous effects on the performance of the mechanism and in particular, the
approximation ratio goes to 1 as the augmentation factor becomes large. We
complement our results with bounds on the approximation ratio of Random Serial
Dictatorship, the randomized version of Serial Dictatorship, when there is no
resource augmentation.
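As a toy illustration (not the paper's own implementation), the greedy behavior of Serial Dictatorship can be sketched as follows: agents, taken in a fixed priority order, each pick the nearest facility that still has spare capacity. The function names, the line metric, and the assumption that total capacity covers all agents are illustrative choices, not details from the abstract.

```python
def serial_dictatorship(agents, facilities, capacities, dist):
    """Assign agents, in the given priority order, greedily to the
    nearest facility with remaining capacity. Assumes total capacity
    is at least the number of agents."""
    remaining = list(capacities)
    assignment = {}
    for a in agents:
        # among facilities with spare capacity, pick the closest one
        best = min(
            (j for j in range(len(facilities)) if remaining[j] > 0),
            key=lambda j: dist(a, facilities[j]),
        )
        assignment[a] = best
        remaining[best] -= 1
    return assignment

# Example on the real line: agent 0 takes the closer facility at 1,
# exhausting its capacity, so agent 10 is sent to the facility at 9.
line_dist = lambda a, f: abs(a - f)
print(serial_dictatorship([0, 10], [1, 9], [1, 1], line_dist))
# → {0: 0, 10: 1}
```

Note how the fixed priority order is exactly what makes the mechanism truthful: an agent's report only narrows its own choice, never the options of earlier agents.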
Preference-aware task assignment in on-demand taxi dispatching: An online stable matching approach
A central issue in on-demand taxi dispatching platforms is task assignment, which designs matching policies among dynamically arriving drivers (workers) and passengers (tasks). Previous matching policies maximize the profit of the platform without considering the preferences of workers and tasks (e.g., workers may prefer high-rewarding tasks while tasks may prefer nearby workers). Such ignorance of preferences impairs user experience and will decrease the profit of the platform in the long run. To address this problem, we propose preference-aware task assignment using online stable matching. Specifically, we define a new model, Online Stable Matching under Known Identical Independent Distributions (OSM-KIID). It not only maximizes the expected total profits (OBJ-1), but also tries to satisfy the preferences among workers and tasks by minimizing the expected total number of blocking pairs (OBJ-2). The model also features a practical arrival assumption validated on a real-world dataset. Furthermore, we present a linear-program-based online algorithm, LP-ALG, which achieves an online ratio of at least 1−1/e on OBJ-1 and has at most 0.6·|E| blocking pairs in expectation, where |E| is the total number of edges in the compatible graph. We also show that a natural Greedy algorithm can have an arbitrarily bad performance on OBJ-1 while maintaining around 0.5·|E| blocking pairs. Evaluations on both synthetic and real datasets confirm our theoretical analysis and demonstrate that LP-ALG strictly dominates all the baselines on both objectives when tasks notably outnumber workers.
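To make OBJ-2 concrete, a minimal sketch of counting blocking pairs in a worker-task matching: an edge (w, t) outside the matching is blocking if both w and t strictly prefer each other to their current partners (or are unmatched). The data layout (preference scores, dict-based lookups) is an illustrative assumption, not the paper's representation.

```python
def count_blocking_pairs(edges, matching, w_pref, t_pref):
    """Count edges (w, t) not in `matching` where both endpoints
    strictly prefer each other to their current partners (an
    unmatched endpoint always prefers any partner). Preferences
    are given as scores: higher means more preferred."""
    match_w = {w: t for w, t in matching}  # worker -> assigned task
    match_t = {t: w for w, t in matching}  # task -> assigned worker
    blocking = 0
    for w, t in edges:
        if match_w.get(w) == t:
            continue  # already matched to each other
        cur_t = match_w.get(w)
        w_gains = cur_t is None or w_pref[w][t] > w_pref[w][cur_t]
        cur_w = match_t.get(t)
        t_gains = cur_w is None or t_pref[t][w] > t_pref[t][cur_w]
        blocking += int(w_gains and t_gains)
    return blocking
```

For example, if w1 ranks t1 over t2 and t1 ranks w1 over w2, then the matching {(w1, t2), (w2, t1)} has exactly one blocking pair, (w1, t1): both would rather switch to each other.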
Autonomous Decision-Making Schemes for Real-World Applications in Supply Chains and Online Systems
Designing hand-engineered solutions for decision-making in complex environments is a challenging task. This dissertation investigates the possibility of having autonomous decision-makers in several real-world problems, e.g., in dynamic matching, marketing, and transportation. Achieving high-quality performance in these systems is strongly tied to the actions that a controller performs in different situations. This problem is further complicated by the fact that every single action might have long-term consequences, so ignoring them might cause unpredicted outcomes. My primary focus is to approach these problems with long-term objectives in mind, instead of focusing only on myopic ones. By borrowing techniques from optimal control and reinforcement learning, I design modeling infrastructures for each specific problem. Currently, the mainstream of reinforcement learning research uses games and robotics simulators to verify the performance of an algorithm. In contrast, my main endeavor in this dissertation is to bridge the gap between the developed methods and their real-world applications, which are studied less often. For instance, for dynamic matching, I propose a simple matching rule with optimality guarantees; for the customer journey, I use reinforcement learning to design an online algorithm based on temporal-difference learning; and, for transportation, I show that it is possible to train a solver capable of solving a wide variety of vehicle routing problems using reinforcement learning. Finally, I conclude this dissertation by introducing a new paradigm, which I call corrective reinforcement learning. This paradigm addresses one major challenge in applying policies found by RL: they might differ significantly from the status quo of real systems. I propose a mechanism that resolves this issue by finding improved controllers that remain close to the status quo.
I believe that the models proposed in this dissertation will contribute to the discovery of methods that can outperform current systems, which are primarily controlled by humans.