1,571 research outputs found
Optimal quantum control of Bose Einstein condensates in magnetic microtraps
Transport of Bose-Einstein condensates in magnetic microtraps, controllable
by external parameters such as wire currents or radio-frequency fields, is
studied within the framework of optimal control theory (OCT). We derive from
the Gross-Pitaevskii equation the optimality system for the OCT fields that
allow the condensate to be channeled efficiently between given initial and desired
states. For a variety of magnetic confinement potentials we study transport and
wavefunction splitting of the condensate, and demonstrate that OCT can
drastically outperform simpler schemes for the time variation of the
microtrap control parameters. Comment: 11 pages, 7 figures
Learning-aided Stochastic Network Optimization with Imperfect State Prediction
We investigate the problem of stochastic network optimization in the presence
of imperfect state prediction and non-stationarity. Based on a novel
distribution-accuracy curve prediction model, we develop the predictive
learning-aided control (PLC) algorithm, which jointly utilizes historic and
predicted network state information for decision making. PLC is an online
algorithm that requires zero a priori system statistical information, and
consists of three key components, namely sequential distribution estimation and
change detection, dual learning, and online queue-based control.
Specifically, we show that PLC simultaneously achieves good long-term
performance, short-term queue size reduction, accurate change detection, and
fast algorithm convergence. In particular, for stationary networks, PLC
achieves a near-optimal , utility-delay
tradeoff. For non-stationary networks, PLC obtains an
utility-backlog tradeoff for distributions that last
time, where
is the prediction accuracy and is a constant (the
Backpressure algorithm \cite{neelynowbook} requires an length
for the same utility performance with a larger backlog). Moreover, PLC detects
distribution change slots faster with high probability ( is the
prediction size) and achieves an convergence time. Our results demonstrate
that state prediction (even imperfect) can help (i) achieve faster detection
and convergence, and (ii) obtain better utility-delay tradeoffs.
Ancilla-assisted sequential approximation of nonlocal unitary operations
We consider the recently proposed "no-go" theorem of Lamata et al. [Phys. Rev.
Lett. 101, 180506 (2008)] on the impossibility of sequential implementation of
global unitary operations with the aid of an itinerant ancillary system and
view the claim within the language of Kraus representation. By virtue of an
extremely useful tool for analyzing entanglement properties of quantum
operations, namely the operator-Schmidt decomposition, we provide an alternative
proof of the "no-go" theorem and also study the role of initial correlations
between the qubits and ancilla in sequential preparation of unitary entanglers.
Despite the negative response from the "no-go" theorem, we demonstrate
explicitly how the matrix-product operator (MPO) formalism provides a flexible
structure to develop protocols for sequential implementation of such entanglers
with optimal fidelity. The proposed numerical technique, which we call
variational matrix-product operator (VMPO), offers a computationally efficient
tool for characterizing the "globalness" and entangling capabilities of
nonlocal unitary operations. Comment: Slightly improved version as published in Phys. Rev.
Approximate policy iteration: A survey and some new methods
We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality. We survey a number of issues: convergence and rate of convergence of approximate policy evaluation methods, singularity and susceptibility to simulation noise of policy evaluation, exploration issues, constrained and enhanced policy iteration, policy oscillation and chattering, and optimistic and distributed policy iteration. Our discussion of policy evaluation is couched in general terms and aims to unify the available methods in the light of recent research developments and to compare the two main policy evaluation approaches: projected equations and temporal differences (TD), and aggregation. In the context of these approaches, we survey two different types of simulation-based algorithms: matrix inversion methods, such as least-squares temporal difference (LSTD), and iterative methods, such as least-squares policy evaluation (LSPE) and TD(λ), and their scaled variants. We discuss a recent method, based on regression and regularization, which rectifies the unreliability of LSTD for nearly singular projected Bellman equations. An iterative version of this method belongs to the LSPE class of methods and provides the connecting link between LSTD and LSPE. Our discussion of policy improvement focuses on the role of policy oscillation and its effect on performance guarantees. We illustrate that policy evaluation when done by the projected equation/TD approach may lead to policy oscillation, but when done by aggregation it does not. This implies better error bounds and more regular performance for aggregation, at the expense of some loss of generality in cost function representation capability. Hard aggregation provides the connecting link between projected equation/TD-based and aggregation-based policy evaluation, and is characterized by favorable error bounds.
National Science Foundation (U.S.) (No. ECCS-0801549); Los Alamos National Laboratory, Information Science and Technology Institute; United States Air Force (No. FA9550-10-1-0412)
Generalizing movements with information-theoretic stochastic optimal control
Stochastic optimal control is typically used to plan a movement for a specific situation. Although most stochastic optimal control methods fail to generalize this movement plan to a new situation without replanning, a stochastic optimal control method is presented that allows reuse of the obtained policy in a new situation, as the policy is more robust to slight deviations from the initial movement plan. To improve the robustness of the policy, we employ information-theoretic policy updates that explicitly operate on trajectory distributions instead of single trajectories. To ensure a stable and smooth policy update, the "distance" is limited between the trajectory distributions of the old and the new control policies. The introduced bound offers a closed-form solution for the resulting policy and extends results from recent developments in stochastic optimal control. In contrast to many standard stochastic optimal control algorithms, the current approach can directly infer the system dynamics from data points, and hence can also be used for model-based reinforcement learning. This paper represents an extension of the paper by Lioutikov et al. ("Sample-Based Information-Theoretic Stochastic Optimal Control," Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, 2014, pp. 3896–3902). In addition to revisiting the content, an extensive theoretical comparison is presented of the approach with related work, additional aspects of the implementation are discussed, and further evaluations are introduced.
Towards a Universal Theory of Artificial Intelligence based on Algorithmic Probability and Sequential Decision Theory
Decision theory formally solves the problem of rational agents in uncertain
worlds if the true environmental probability distribution is known.
Solomonoff's theory of universal induction formally solves the problem of
sequence prediction for an unknown distribution. We unify both theories and give
strong arguments that the resulting universal AIXI model behaves optimally in any
computable environment. The major drawback of the AIXI model is that it is
uncomputable. To overcome this problem, we construct a modified algorithm
AIXI^tl, which is still superior to any other time t and space l bounded agent.
The computation time of AIXI^tl is of the order t x 2^l. Comment: 8 two-column pages, latex2e, 1 figure, submitted to ijca
Reducing Electricity Demand Charge for Data Centers with Partial Execution
Data centers consume a large amount of energy and incur substantial
electricity cost. In this paper, we study the familiar problem of reducing data
center energy cost with two new perspectives. First, we find, through an
empirical study of contracts from electric utilities powering Google data
centers, that demand charge per kW for the maximum power used is a major
component of the total cost. Second, many services such as Web search tolerate
partial execution of the requests because the response quality is a concave
function of processing time. Data from Microsoft Bing search engine confirms
this observation.
We propose a simple idea of using partial execution to reduce the peak power
demand and energy cost of data centers. We systematically study the problem of
scheduling partial execution with stringent SLAs on response quality. For a
single data center, we derive an optimal algorithm to solve the workload
scheduling problem. In the case of multiple geo-distributed data centers, the
demand of each data center is controlled by the request routing algorithm,
which makes the problem much more involved. We decouple the two aspects, and
develop a distributed optimization algorithm to solve the large-scale request
routing problem. Trace-driven simulations show that partial execution reduces
cost by for one data center, and by for geo-distributed
data centers together with request routing. Comment: 12 pages
Fault Tolerant Filtering and Fault Detection for Quantum Systems Driven By Fields in Single Photon States
The purpose of this paper is to solve a fault tolerant filtering and fault
detection problem for a class of open quantum systems driven by a
continuous-mode bosonic input field in single photon states when the systems
are subject to stochastic faults. Optimal estimates of both the system
observables and the fault process are simultaneously calculated and
characterized by a set of coupled recursive quantum stochastic differential
equations. Comment: arXiv admin note: text overlap with arXiv:1504.0678
Linearly Parameterized Bandits
We consider bandit problems involving a large (possibly infinite) collection
of arms, in which the expected reward of each arm is a linear function of an
-dimensional random vector , where .
The objective is to minimize the cumulative regret and Bayes risk. When the set
of arms corresponds to the unit sphere, we prove that the regret and Bayes risk
is of order , by establishing a lower bound for an
arbitrary policy, and showing that a matching upper bound is obtained through a
policy that alternates between exploration and exploitation phases. The
phase-based policy is also shown to be effective if the set of arms satisfies a
strong convexity condition. For the case of a general set of arms, we describe
a near-optimal policy whose regret and Bayes risk admit upper bounds of the
form . Comment: 40 pages; updated results and references