208 research outputs found
Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size
The development of a satisfying and rigorous mathematical understanding of
the performance of neural networks is a major challenge in artificial
intelligence. Against this background, we study the expressive power of neural
networks through the example of the classical NP-hard Knapsack Problem. Our
main contribution is a class of recurrent neural networks (RNNs) with rectified
linear units that are iteratively applied to each item of a Knapsack instance
and thereby compute optimal or provably good solution values. We show that an
RNN of depth four and width depending quadratically on the profit of an optimum
Knapsack solution is sufficient to find optimum Knapsack solutions. We also
prove the following tradeoff between the size of an RNN and the quality of the
computed Knapsack solution: for Knapsack instances consisting of $n$ items, an
RNN of depth five and width $w$ computes a solution of value at least
$1 - \mathcal{O}(n^2/\sqrt{w})$ times the optimum solution value. Our results
build upon a classical dynamic programming formulation of the Knapsack Problem
as well as a careful rounding of profit values that are also at the core of the
well-known fully polynomial-time approximation scheme for the Knapsack Problem.
A carefully conducted computational study qualitatively supports our
theoretical size bounds. Finally, we point out that our results can be
generalized to many other combinatorial optimization problems that admit
dynamic programming solution methods, such as various Shortest Path Problems,
the Longest Common Subsequence Problem, and the Traveling Salesperson Problem. Comment: A short version of this paper appears in the proceedings of AAAI 202
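The abstract above refers to a classical dynamic program over profit values and to the profit rounding behind the fully polynomial-time approximation scheme. The following is a minimal sketch of those two textbook ingredients only, not of the paper's RNN construction. The function names `knapsack_by_profit` and `knapsack_fptas` are hypothetical, and the sketch assumes non-negative integer profits and a non-negative capacity.

```python
from typing import List


def knapsack_by_profit(profits: List[int], weights: List[float], capacity: float) -> int:
    """Exact DP indexed by profit: f[p] = minimum weight reaching profit exactly p."""
    total = sum(profits)
    inf = float("inf")
    f = [0.0] + [inf] * total
    for p_i, w_i in zip(profits, weights):
        # Iterate profits downwards so that each item is used at most once.
        for p in range(total, p_i - 1, -1):
            f[p] = min(f[p], f[p - p_i] + w_i)
    # The best value is the largest profit whose minimum weight fits.
    return max(p for p in range(total + 1) if f[p] <= capacity)


def knapsack_fptas(profits: List[int], weights: List[float], capacity: float, eps: float) -> float:
    """Round profits down to multiples of a scale, then run the exact DP.

    Returns a lower bound on the value of the selected item set; this bound
    is at least (1 - eps) times the optimum.
    """
    # Items that do not fit on their own can never be part of a solution.
    items = [(p, w) for p, w in zip(profits, weights) if w <= capacity]
    if not items:
        return 0.0
    scale = eps * max(p for p, _ in items) / len(items)
    rounded = [int(p // scale) for p, _ in items]
    best_rounded = knapsack_by_profit(rounded, [w for _, w in items], capacity)
    return scale * best_rounded


exact = knapsack_by_profit([60, 100, 120], [10, 20, 30], 50)
approx = knapsack_fptas([60, 100, 120], [10, 20, 30], 50, eps=0.5)
```

The table size in the exact routine grows with the total profit, which is why rounding profits to a coarser grid trades solution quality for size, the same kind of tradeoff the abstract describes for network width.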
Neur2RO: Neural Two-Stage Robust Optimization
Robust optimization provides a mathematical framework for modeling and
solving decision-making problems under worst-case uncertainty. This work
addresses two-stage robust optimization (2RO) problems (also called adjustable
robust optimization), wherein first-stage and second-stage decisions are made
before and after uncertainty is realized, respectively. This results in a
nested min-max-min optimization problem which is extremely challenging
computationally, especially when the decisions are discrete. We propose
Neur2RO, an efficient machine learning-driven instantiation of
column-and-constraint generation (CCG), a classical iterative algorithm for
2RO. Specifically, we learn to estimate the value function of the second-stage
problem via a novel neural network architecture that is easy to optimize over
by design. Embedding our neural network into CCG yields high-quality solutions
quickly as evidenced by experiments on two 2RO benchmarks, knapsack and capital
budgeting. For knapsack, Neur2RO finds solutions that are within roughly $2\%$
of the best-known values in a few seconds compared to the three hours of the
state-of-the-art exact branch-and-price algorithm; for larger and more complex
instances, Neur2RO finds even better solutions. For capital budgeting, Neur2RO
outperforms three variants of the $k$-adaptability algorithm, particularly on
the largest instances, with a 5 to 10-fold reduction in solution time. Our code
and data are available at https://github.com/khalil-research/Neur2RO
Optimistic No-regret Algorithms for Discrete Caching
We take a systematic look at the problem of storing whole files in a cache
with limited capacity in the context of optimistic learning, where the caching
policy has access to a prediction oracle (provided by, e.g., a Neural Network).
The successive file requests are assumed to be generated by an adversary, and
no assumption is made on the accuracy of the oracle. In this setting, we
provide a universal lower bound for prediction-assisted online caching and
proceed to design a suite of policies with a range of performance-complexity
trade-offs. All proposed policies offer sublinear regret bounds commensurate
with the accuracy of the oracle. Our results substantially improve upon all
recently-proposed online caching policies, which, being unable to exploit the
oracle predictions, offer only $O(\sqrt{T})$ regret. In this pursuit, we
design, to the best of our knowledge, the first comprehensive optimistic
Follow-the-Perturbed leader policy, which generalizes beyond the caching
problem. We also study the problem of caching files with different sizes and
the bipartite network caching problem. Finally, we evaluate the efficacy of the
proposed policies through extensive numerical experiments using real-world
traces. Comment: Accepted to ACM SIGMETRICS 202
Fast Adaptive Non-Monotone Submodular Maximization Subject to a Knapsack Constraint
Constrained submodular maximization problems encompass a wide variety of applications, including personalized recommendation, team formation, and revenue maximization via viral marketing. The massive instances occurring in modern-day applications can render existing algorithms prohibitively slow. Moreover, frequently those instances are also inherently stochastic. Focusing on these challenges, we revisit the classic problem of maximizing a (possibly non-monotone) submodular function subject to a knapsack constraint. We present a simple randomized greedy algorithm that achieves a 5.83-approximation and runs in O(n log n) time, i.e., at least a factor of n faster than other state-of-the-art algorithms. The robustness of our approach allows us to further transfer it to a stochastic version of the problem. There, we obtain a 9-approximation to the best adaptive policy, which is the first constant approximation for non-monotone objectives. Experimental evaluation of our algorithms showcases their improved performance on real and synthetic data.
AdvCat: Domain-Agnostic Robustness Assessment for Cybersecurity-Critical Applications with Categorical Inputs
Machine Learning-as-a-Service systems (MLaaS) have been largely developed for
cybersecurity-critical applications, such as detecting network intrusions and
fake news campaigns. Despite their effectiveness, their robustness against
adversarial attacks is one of the key trust concerns for MLaaS deployment. We
are thus motivated to assess the adversarial robustness of the Machine Learning
models residing at the core of these security-critical applications with
categorical inputs. Previous research efforts on assessing model robustness
against manipulation of categorical inputs are specific to use cases and
heavily depend on domain knowledge, or require white-box access to the target
ML model. Such limitations prevent the robustness assessment from being offered as a
domain-agnostic service provided to various real-world applications. We propose
a provably optimal yet computationally highly efficient adversarial robustness
assessment protocol for a wide range of ML-driven cybersecurity-critical
applications. We demonstrate the use of the domain-agnostic robustness
assessment method with a substantial experimental study on fake news detection
and intrusion detection problems. Comment: IEEE BigData 202
The multilevel critical node problem : theoretical intractability and a curriculum learning approach
Evaluating the vulnerability of networks is a problem that has gained momentum in recent decades. In this work, we focus on a multilevel programming approach to study the defense of critical infrastructures against malicious attacks. We analyze a three-stage sequential game played on a graph, called the Multilevel Critical Node problem (MCN). This game sees two players competing with each other: a defender and an attacker. The defender starts by preventively interdicting nodes from being attacked during what is called a vaccination phase. Then, the attacker infects a subset of non-vaccinated nodes and, finally, the defender reacts with a protection strategy. We provide the first computational complexity results associated with MCN and its subgames. Moreover, by considering unitary, weighted, undirected and directed graphs, we clarify how the theoretical tractability or intractability of those problems varies. Our findings contribute new NP-complete, $\Sigma_2^p$-complete and $\Sigma_3^p$-complete problems.
Motivated by the intrinsic intractability of the MCN, we then design efficient heuristics for the game by building upon recent approaches that seek to learn heuristics for combinatorial optimization problems through graph neural networks and reinforcement learning. Contrary to previous work, however, we tackle situations with multiple players taking decisions sequentially. By framing them in a multi-agent reinforcement learning setting, we devise a value-based method to learn to solve multilevel budgeted combinatorial problems involving two players in a zero-sum game over a graph. Our framework is based on a simple curriculum: if an agent knows how to estimate the value of instances with budgets up to B, then solving instances with budget B+1 can be done in polynomial time, regardless of the direction of the optimization, by checking the value of every possible afterstate. Thus, in a bottom-up approach, we generate datasets of heuristically solved instances with increasingly larger budgets to train our agent. We report results close to optimality on graphs of up to 100 nodes and a 185x speedup on average compared to the quickest exact solver known for the MCN.
Scalable Influence Maximization for Multiple Products in Continuous-Time Diffusion Networks
A typical viral marketing model identifies influential users in a social network to maximize a single product adoption assuming unlimited user attention, campaign budgets, and time. In reality, multiple products need campaigns, users have limited attention, convincing users incurs costs, and advertisers have limited budgets and expect the adoptions to be maximized soon. Facing these user, monetary, and timing constraints, we formulate the problem as a submodular maximization task in a continuous-time diffusion model under the intersection of a matroid and multiple knapsack constraints. We propose a randomized algorithm estimating the user influence in a network ($|\mathcal{V}|$ nodes, $|\mathcal{E}|$ edges) to an accuracy of $\epsilon$ with $n = \mathcal{O}(1/\epsilon^2)$ randomizations and $\tilde{\mathcal{O}}(n|\mathcal{E}| + n|\mathcal{V}|)$ computations. By exploiting the influence estimation algorithm as a subroutine, we develop an adaptive threshold greedy algorithm achieving an approximation factor of $k_a/(2+2k)$ of the optimal when $k_a$ out of the $k$ knapsack constraints are active. Extensive experiments on networks of millions of nodes demonstrate that the proposed algorithms achieve the state-of-the-art in terms of effectiveness and scalability.
The Convex Relaxation Barrier, Revisited: Tightened Single-Neuron Relaxations for Neural Network Verification
We improve the effectiveness of propagation- and linear-optimization-based
neural network verification algorithms with a new tightened convex relaxation
for ReLU neurons. Unlike previous single-neuron relaxations which focus only on
the univariate input space of the ReLU, our method considers the multivariate
input space of the affine pre-activation function preceding the ReLU. Using
results from submodularity and convex geometry, we derive an explicit
description of the tightest possible convex relaxation when this multivariate
input is over a box domain. We show that our convex relaxation is significantly
stronger than the commonly used univariate-input relaxation which has been
proposed as a natural convex relaxation barrier for verification. While our
description of the relaxation may require an exponential number of
inequalities, we show that they can be separated in linear time and hence can
be efficiently incorporated into optimization algorithms on an as-needed basis.
Based on this novel relaxation, we design two polynomial-time algorithms for
neural network verification: a linear-programming-based algorithm that
leverages the full power of our relaxation, and a fast propagation algorithm
that generalizes existing approaches. In both cases, we show that for a modest
increase in computational effort, our strengthened relaxation enables us to
verify a significantly larger number of instances compared to similar
algorithms.
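For orientation, the univariate-input relaxation that the abstract above takes as its baseline is commonly written as the "triangle" relaxation of a single ReLU. The sketch below illustrates only that baseline over a box domain, under the assumption that interval bounds on the inputs are given; it is not the tightened multivariate relaxation the paper proposes. The function names `preactivation_bounds` and `triangle_relaxation` are hypothetical.

```python
from typing import Sequence, Tuple


def preactivation_bounds(
    w: Sequence[float], b: float, lo: Sequence[float], hi: Sequence[float]
) -> Tuple[float, float]:
    """Bounds of z = w.x + b when each x_i ranges over the box [lo_i, hi_i]."""
    # A positive weight attains its minimum at lo_i, a negative one at hi_i.
    lower = b + sum(wi * (l if wi >= 0 else h) for wi, l, h in zip(w, lo, hi))
    upper = b + sum(wi * (h if wi >= 0 else l) for wi, l, h in zip(w, lo, hi))
    return lower, upper


def triangle_relaxation(l: float, u: float, z: float) -> Tuple[float, float]:
    """Range of outputs y permitted at pre-activation z, given l <= z <= u.

    Constraints: y >= 0, y >= z, and y <= u * (z - l) / (u - l) when l < 0 < u.
    """
    if u <= 0:
        return 0.0, 0.0  # neuron is always inactive
    if l >= 0:
        return z, z      # neuron is always active, so ReLU is the identity
    return max(0.0, z), u * (z - l) / (u - l)


l, u = preactivation_bounds([1.0, -2.0], 0.5, [-1.0, -1.0], [1.0, 1.0])
y_min, y_max = triangle_relaxation(l, u, 0.0)
```

The gap between `y_min` and `y_max` at a given `z` is the slack that a tighter relaxation would try to reduce.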