Capacitated Dynamic Programming: Faster Knapsack and Graph Algorithms
One of the most fundamental problems in Computer Science is the Knapsack
problem. Given a set of n items with different weights and values, it asks to
pick the most valuable subset whose total weight is below a capacity threshold
T. Despite its wide applicability in various areas of Computer Science,
Operations Research, and Finance, the best known running time for the problem
is O(Tn). The main result of our work is an improved algorithm running in time
O(TD), where D is the number of distinct weights. Previously, faster runtimes
for Knapsack were possible only when both weights and values were bounded by M
and V respectively, running in time O(nMV) [Pisinger'99]. In comparison, our
algorithm implies a bound of O(nM^2) without any dependence on V, or O(nV^2)
without any dependence on M. Additionally, for the unbounded Knapsack problem,
we provide an algorithm running in time O(M^2) or O(V^2). Both our algorithms
match recent conditional lower bounds shown for the Knapsack problem [Cygan et
al'17, K\"unnemann et al'17].
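For context, the baseline O(Tn) dynamic program that the abstract improves upon can be sketched as follows. This is the textbook algorithm, not the paper's O(TD) method, and the item data are made up for illustration:

```python
def knapsack_max_value(items, T):
    """Classic 0/1 knapsack DP: dp[c] = best achievable value with capacity c.
    Runs in O(T * n) time, the baseline running time mentioned above."""
    dp = [0] * (T + 1)
    for weight, value in items:
        # Iterate capacities downward so each item is used at most once.
        for c in range(T, weight - 1, -1):
            dp[c] = max(dp[c], dp[c - weight] + value)
    return dp[T]

# Hypothetical items as (weight, value) pairs, capacity threshold T = 7.
print(knapsack_max_value([(3, 4), (4, 5), (2, 3)], 7))  # -> 9
```

The O(TD) algorithm of the paper exploits the fact that items sharing a weight can be processed jointly; the sketch above treats every item separately.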
We also initiate a systematic study of general capacitated dynamic
programming, of which Knapsack is a core problem. This problem asks to compute
the maximum weight path of length k in an edge- or node-weighted directed
acyclic graph. In a graph with m edges, these problems are solvable by dynamic
programming in time O(km), and we explore under which conditions the dependence
on k can be eliminated. We identify large classes of graphs where this is
possible and apply our results to obtain linear time algorithms for the problem
of k-sparse Delta-separated sequences. The main technical innovation behind our
results is identifying and exploiting concavity that appears in relaxations and
subproblems of the tasks we consider.
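The O(km) dynamic program mentioned above is simple to state. Here is a minimal sketch for the edge-weighted case, with a toy graph chosen for illustration (the paper's contribution is removing the dependence on k, which this baseline does not do):

```python
def max_weight_k_path(n, edges, k):
    """Max-weight path with exactly k edges in a DAG on n vertices.
    best[v] holds the best weight of a j-edge path ending at v; each of the
    k rounds relaxes all m edges once, giving the O(k * m) baseline."""
    NEG = float("-inf")
    best = [0.0] * n  # j = 0: the empty path at each vertex has weight 0
    for _ in range(k):
        nxt = [NEG] * n
        for u, v, w in edges:
            if best[u] != NEG:
                nxt[v] = max(nxt[v], best[u] + w)
        best = nxt
    return max(best)

# Path graph 0 -> 1 -> 2 -> 3 with weights 1, 5, 2; best 2-edge path is 5 + 2.
edges = [(0, 1, 1.0), (1, 2, 5.0), (2, 3, 2.0)]
print(max_weight_k_path(4, edges, 2))  # -> 7.0
```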
Generalizations of Length Limited Huffman Coding for Hierarchical Memory Settings
In this paper, we study the problem of designing prefix-free encoding schemes having minimum average code length that can be decoded efficiently under a decode-cost model that captures memory-hierarchy-induced cost functions. We also study a special case of this problem that is closely related to the length-limited Huffman coding (LLHC) problem; we call it the soft-length-limited Huffman coding problem. In this version, there is a penalty associated with each of the n characters of the alphabet whose encodings exceed a specified bound D (≤ n), where the penalty increases linearly with the length of the encoding beyond D. The goal is to find a prefix-free encoding having minimum average code length and total penalty within a pre-specified bound P. This generalizes the LLHC problem. We present an algorithm that solves this problem in time O(nD). We then study a further generalization in which the penalty function and the objective function can both be arbitrary monotonically non-decreasing functions of the codeword length, and provide dynamic-programming-based exact and PTAS algorithms for this setting.
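To make the objective concrete, the sketch below builds an ordinary (unconstrained) Huffman code and then evaluates one simple reading of the soft-length-limit objective: average code length plus a linear penalty for codewords longer than D. This is only an illustration of the cost being traded off, not the paper's O(nD) algorithm, and the penalty accounting is an assumption:

```python
import heapq
import itertools

def huffman_lengths(freqs):
    """Codeword lengths of a standard Huffman code (the unconstrained baseline).
    Each heap entry carries the symbols under it; merging pushes them deeper."""
    counter = itertools.count()  # tie-breaker so the heap never compares lists
    heap = [[f, next(counter), [i]] for i, f in enumerate(freqs)]
    depth = [0] * len(freqs)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        for i in a[2] + b[2]:
            depth[i] += 1  # every merge adds one level to these symbols
        heapq.heappush(heap, [a[0] + b[0], next(counter), a[2] + b[2]])
    return depth

def soft_llhc_cost(freqs, lengths, D, penalty_rate=1.0):
    """Average code length plus a linear penalty for lengths beyond D --
    one plausible instantiation of the soft-LLHC objective described above."""
    total = sum(freqs)
    avg = sum(f * l for f, l in zip(freqs, lengths)) / total
    penalty = sum(penalty_rate * max(0, l - D) for l in lengths)
    return avg, penalty

lengths = huffman_lengths([45, 13, 12, 16, 9, 5])
print(sorted(lengths))  # -> [1, 3, 3, 3, 4, 4]
```

With D = 3 this code incurs a penalty of 2.0 (two codewords of length 4); a soft-LLHC solution would trade average length against that penalty.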
Heuristic Solution Approaches to the Solid Assignment Problem
The 3-dimensional assignment problem, also known as the Solid Assignment Problem (SAP), is a challenging problem in combinatorial optimisation. While the ordinary, 2-dimensional assignment problem is in P, SAP, which extends it, is NP-hard. SAP is the problem of allocating n jobs to n machines in n factories such that exactly one job is allocated to one machine in one factory, with the objective of minimising the total cost of getting these n jobs done. The problem is commonly solved using exact integer programming methods such as Branch-and-Bound (B&B); as it is intractable, only approximate solutions can be found in reasonable time for large instances. Here, we suggest a number of approximate solution approaches, one of which, the Diagonals Method (DM), relies on the Kuhn–Munkres algorithm, also known as the Hungarian Assignment Method. The approach is discussed, hybridised, and compared with other heuristic approaches such as the Average Method, the Addition Method, the Multiplication Method and the Genetic Algorithm. Moreover, a special case of SAP involving Monge-type matrices is also considered; we show that in this case DM finds the exact solution efficiently.
We sought to provide illustrations of the models and approaches presented whenever appropriate. Extensive experimental results are included and discussed. The thesis ends with conclusions and some suggestions for further work on the same and related topics.
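The SAP definition above can be pinned down with a brute-force reference solver: choose one permutation for machines and one for factories so that job i goes to machine p[i] in factory q[i]. This O((n!)^2) enumeration is only a correctness reference for tiny instances (the cost tensor below is invented), not one of the thesis's heuristics:

```python
from itertools import permutations

def solve_sap_exact(cost):
    """Exact brute force for the Solid Assignment Problem: minimise
    sum_i cost[i][p[i]][q[i]] over all machine/factory permutations p, q."""
    n = len(cost)
    best = float("inf")
    for p in permutations(range(n)):
        for q in permutations(range(n)):
            total = sum(cost[i][p[i]][q[i]] for i in range(n))
            best = min(best, total)
    return best

# cost[job][machine][factory] for a hypothetical n = 2 instance.
cost = [[[4, 2], [1, 3]],
        [[2, 5], [3, 1]]]
print(solve_sap_exact(cost))  # -> 5
```

Heuristics such as DM aim to approach this optimum without the factorial enumeration.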
Optimal Sparse Regression Trees
Regression trees are one of the oldest forms of AI models, and their
predictions can be made without a calculator, which makes them broadly useful,
particularly for high-stakes applications. Within the large literature on
regression trees, there has been little effort towards full provable
optimization, mainly due to the computational hardness of the problem. This
work proposes a dynamic-programming-with-bounds approach to the construction of
provably-optimal sparse regression trees. We leverage a novel lower bound based
on an optimal solution to the one-dimensional k-Means clustering problem over
the set of labels. We are often able to find optimal sparse trees in seconds,
even for challenging datasets that involve large numbers of samples and
highly-correlated features.
Comment: AAAI 2023, final archival version
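The one-dimensional k-Means bound mentioned above rests on a standard fact: in one dimension, optimal clusters are contiguous in sorted order, so the optimal objective can be computed by dynamic programming. The sketch below illustrates that computation only (it is not the paper's tree-search code, and the DP here is the simple O(kn^2) version):

```python
def kmeans_1d_cost(values, k):
    """Optimal k-means objective (sum of squared errors) for 1-D data,
    via DP over contiguous segments of the sorted values."""
    xs = sorted(values)
    n = len(xs)
    # Prefix sums give O(1) SSE for any contiguous segment xs[i:j].
    pre = [0.0] * (n + 1)
    pre2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        pre[i + 1] = pre[i] + x
        pre2[i + 1] = pre2[i] + x * x

    def sse(i, j):  # cost of putting xs[i:j] into a single cluster
        s, s2, m = pre[j] - pre[i], pre2[j] - pre2[i], j - i
        return s2 - s * s / m

    INF = float("inf")
    dp = [[INF] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            dp[c][j] = min(dp[c - 1][i] + sse(i, j) for i in range(c - 1, j))
    return dp[k][n]

print(kmeans_1d_cost([1.0, 1.0, 10.0, 10.0], 2))  # -> 0.0
```

Any regression-tree leaf assignment with k leaves can do no better than this clustering cost on the labels, which is what makes it usable as a lower bound.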
The Fine-Grained Complexity of Problems Expressible by First-Order Logic and Its Extensions
This dissertation studies the fine-grained complexity of model checking problems for fixed logical formulas on sparse input structures. The Orthogonal Vectors problem is an important and well-studied problem in fine-grained complexity: its hardness is implied by the Strong Exponential Time Hypothesis, and its hardness implies the hardness of many other interesting problems. We show that the Orthogonal Vectors problem is complete in the class of first-order model checking on sparse structures, under fine-grained reductions. In other words, the hardness of Orthogonal Vectors and the hardness of first-order model checking imply each other. This also gives us an improved algorithm for first-order model checking problems. Among all first-order logic formulas in prenex normal form, we have reasons to believe that certain quantifier structures may be the hardest in computational complexity: if the Nondeterministic version of the Strong Exponential Time Hypothesis is true, formulas of these forms are the only hard ones under the Strong Exponential Time Hypothesis. We can add extensions to first-order logic to strengthen its expressive power. This work also studies the fine-grained complexity of first-order formulas with comparison on structures with total order, first-order formulas with transitive closure operations, first-order formulas of fixed quantifier rank, and first-order formulas of fixed variable complexity. We also introduce a technique that can be used to reduce from sequential problems on graphs to parallel problems on sets, which can be applied to extending the Least Weight Subsequence problems from linear structures to some special classes of graphs.
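The Orthogonal Vectors problem at the center of this abstract is easy to state and to solve naively; the fine-grained question is whether the quadratic scan below can be beaten. A minimal sketch with made-up vectors:

```python
from itertools import product

def has_orthogonal_pair(A, B):
    """Orthogonal Vectors: do sets A, B of d-dimensional 0/1 vectors contain
    a pair a, b with inner product zero? Naive O(n^2 * d) scan; SETH implies
    no strongly subquadratic algorithm, which is the hardness referred to."""
    return any(all(x * y == 0 for x, y in zip(a, b))
               for a, b in product(A, B))

A = [(1, 0, 1), (0, 1, 1)]
B = [(1, 1, 0), (0, 1, 0)]
print(has_orthogonal_pair(A, B))  # -> True, since (1,0,1) . (0,1,0) = 0
```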
When Stuck, Flip a Coin: New Algorithms for Large-Scale Tasks
Many modern services need to routinely perform tasks on a large scale. This prompts us to consider the following question:
How can we design efficient algorithms for large-scale computation?
In this thesis, we focus on devising a general strategy to address the above question. Our approaches use tools from graph theory and convex optimization, and prove to be very effective on a number of problems that exhibit locality. A recurring theme in our work is to use randomization to obtain simple and practical algorithms.
The techniques we developed enabled us to make progress on the following questions:
- Parallel Computation of Approximately Maximum Matchings. We put forth a new approach to computing approximate maximum matchings in the Massively Parallel Computation (MPC) model. In the regime in which the memory per machine is linear in the size of the vertex-set, our algorithm requires only a small number of rounds of computation, an almost exponential improvement over the round barrier that all previous results were subject to in this regime.
- Parallel Computation of Maximal Independent Sets. We propose a simple randomized algorithm that constructs maximal independent sets in the MPC model. Under a natural bound on the memory per machine, our algorithm runs in few MPC rounds; in the same regime, all previously known algorithms required substantially more rounds of computation.
- Network Routing under Link Failures. We design a new protocol for stateless message-routing in graphs of a given connectivity. Our routing scheme has two important features: (1) each router makes its routing decisions based only on the local information available to it; and (2) a message is delivered successfully even if a bounded number of arbitrary links have failed. This significantly improves upon previous work, whose routing schemes tolerate fewer failed links at the same connectivity.
- Streaming Submodular Maximization under Element Removals. We study the problem of maximizing a submodular function subject to a cardinality constraint, in the context of streaming algorithms. In a regime in which a bounded number of elements can be removed from the stream, we design an algorithm that provides a constant-factor approximation for this problem while storing only a small number of elements. Our algorithm improves quadratically upon prior work in the number of elements that must be stored to solve the same problem.
- Fast Recovery for the Separated Sparsity Model. In the context of compressed sensing, we put forth two nearly-linear-time recovery algorithms for separated sparsity signals (which naturally model neural spikes). This improves upon the previous algorithm, which had quadratic running time. We also derive a refined version of the natural dynamic programming (DP) approach to the recovery of separated sparsity signals; this DP approach leads to a recovery algorithm that runs in linear time for an important class of separated sparsity signals. Finally, we consider a generalization of these signals to two dimensions, and we show that computing an exact projection for the two-dimensional model is NP-hard.
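The separated sparsity projection in the last item has a natural DP formulation: pick k signal entries whose indices are pairwise at least delta apart so as to maximize the captured amplitude. The sketch below is the straightforward O(nk) version of that DP, not the thesis's nearly-linear-time algorithms, and the toy signal is invented:

```python
def best_separated_support(x, k, delta):
    """Max total amplitude of k entries of x whose indices are pairwise at
    least `delta` apart. dp[i][j] = best sum picking j entries among x[:i];
    picking x[i-1] forces the previous pick into x[:i-delta]."""
    n = len(x)
    NEG = float("-inf")
    dp = [[NEG] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            skip = dp[i - 1][j]                  # don't pick x[i-1]
            prev = dp[max(i - delta, 0)][j - 1]  # pick x[i-1]
            take = prev + x[i - 1] if prev != NEG else NEG
            dp[i][j] = max(skip, take)
    return dp[n][k]

# Toy spike train: best 3 entries with separation >= 2 are indices 0, 2, 4.
print(best_separated_support([5.0, 1.0, 4.0, 1.0, 6.0], 3, 2))  # -> 15.0
```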
Solving Assignment and Routing Problems in Mixed Traffic Systems
This doctoral thesis presents not only a new traffic assignment model for mixed traffic systems but also new heuristics for multi-path routing problems, a case study in Hanoi, Vietnam, and new software, named TranOpt Plus, supporting three major features: map editing, dynamic routing, and traffic assignment modeling.
We investigate three routing problems: k shortest loop-less paths (KSLP), dissimilar shortest loop-less paths (DSLP), and multi-objective shortest paths (MOSP). By developing loop filters and a similarity filter, we create two new heuristics based on Eppstein's algorithm: one using loop filters for the KSLP problem (HELF), the other using loop-and-similarity filters for the DSLP problem (HELSF). The computational results on real street maps indicate that the new heuristics dominate the other algorithms considered in terms of either running time or the average length of the found paths.
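For the KSLP problem, a useful correctness reference is the exhaustive baseline: enumerate every loop-less s-t path by depth-first search and keep the k shortest. This is exponential in general, which is exactly why heuristics like the HELF approach above are needed; the graph below is a made-up example:

```python
def k_shortest_loopless_paths(adj, s, t, k):
    """All-simple-paths baseline for KSLP: DFS enumerates every loop-less
    s-t path, then the k shortest are returned as (length, path) pairs."""
    results = []

    def dfs(u, visited, length, path):
        if u == t:
            results.append((length, list(path)))
            return
        for v, w in adj.get(u, []):
            if v not in visited:           # "loop-less": never revisit a node
                visited.add(v)
                path.append(v)
                dfs(v, visited, length + w, path)
                path.pop()
                visited.remove(v)

    dfs(s, {s}, 0, [s])
    return sorted(results)[:k]

adj = {"a": [("b", 1), ("c", 4)], "b": [("c", 1), ("d", 5)], "c": [("d", 1)]}
for length, path in k_shortest_loopless_paths(adj, "a", "d", 2):
    print(length, path)  # shortest first: length 3 via b, c; then length 5
```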
In traffic assignment modeling, we propose a new User Equilibrium (UE) model, named GUEM, for mixed traffic systems where 2- and 4-wheel vehicles travel together without separate lanes for each kind of vehicle. At the optimal solution to the model, a user equilibrium for each kind of vehicle is obtained. The model is applied to the traffic system in Hanoi, Vietnam, a mixed system dominated by motorcycles. The assignment predicted by the GUEM model from real data collected in Hanoi agrees closely with the observed traffic situation there.
Finally, we present the TranOpt Plus software, containing implementations of all the routing algorithms mentioned in the thesis, as well as the GUEM model and a number of popular traffic assignment models for both standard and mixed traffic systems. With its intuitive graphical user interface (GUI) and strong visualization tools, TranOpt Plus enables even users without any mathematical or computer-science background to use it conveniently. Moreover, TranOpt Plus can easily be extended with further map-related problems, e.g., transportation network design, facility location, and the traveling salesman problem.
Keywords: mixed traffic assignment modeling, routing algorithms, shortest paths, dissimilar paths, Hanoi, TranOpt Plus, map visualization