205 research outputs found

    AdCraft: An Advanced Reinforcement Learning Benchmark Environment for Search Engine Marketing Optimization

    We introduce AdCraft, a novel benchmark environment for the Reinforcement Learning (RL) community distinguished by its stochastic and non-stationary properties. The environment simulates bidding and budgeting dynamics within Search Engine Marketing (SEM), a digital marketing technique that uses paid advertising to enhance the visibility of websites on search engine results pages (SERPs). The performance of SEM advertisement campaigns depends on several factors, including keyword selection, ad design, bid management, budget adjustments, and performance monitoring. Deep RL recently emerged as a potential strategy to optimize campaign profitability within the complex and dynamic landscape of SEM, but it requires substantial data, which may be costly or infeasible to acquire in practice. Our customizable environment enables practitioners to assess and enhance the robustness of RL algorithms pertinent to SEM bid and budget management without such costs. Through a series of experiments within the environment, we demonstrate the challenges imposed by sparsity and non-stationarity on agent convergence and performance. We hope these challenges further encourage discourse and development around effective strategies for managing real-world uncertainties.

    Finite-Time Thermodynamics

    The theory around the concept of finite time describes how processes of any nature can be optimized in situations when their rate is required to be non-negligible, i.e., they must come to completion in a finite time. What the theory makes explicit is “the cost of haste”. Intuitively, it is quite obvious that you drive your car differently when you want to reach your destination as quickly as possible than when you are running out of gas. Finite-time thermodynamics quantifies such opposing requirements and may provide the optimal control to achieve the best compromise. The theory was initially developed for heat engines (steam, Otto, Stirling, among others) and for refrigerators, but it has by now spread into essentially all areas of dynamic systems, from the most abstract to the most practical. The present collection shows some fascinating current examples.

    Thermodynamics of quantum systems under dynamical control

    In this review the debated rapport between thermodynamics and quantum mechanics is addressed in the framework of the theory of periodically-driven/controlled quantum-thermodynamic machines. The basic model studied here is that of a two-level system (TLS), whose energy is periodically modulated while the system is coupled to thermal baths. When the modulation interval is short compared to the bath memory time, the system-bath correlations are affected, thereby causing cooling or heating of the TLS, depending on the interval. In steady state, a periodically-modulated TLS coupled to two distinct baths constitutes the simplest quantum heat machine (QHM) that may operate as either an engine or a refrigerator, depending on the modulation rate. We find their efficiency and power-output bounds and the conditions for attaining these bounds. An extension of this model to multilevel systems shows that the QHM power output can be boosted by the multilevel degeneracy. These results are used to scrutinize basic thermodynamic principles: (i) Externally-driven/modulated QHMs may attain the Carnot efficiency bound, but when the driving is done by a quantum device ("piston"), the efficiency strongly depends on its initial quantum state. Such dependence has been unknown thus far. (ii) The refrigeration rate effected by QHMs does not vanish as the temperature approaches absolute zero for certain quantized baths, e.g., magnons, thus challenging Nernst's unattainability principle. (iii) System-bath correlations allow more work extraction under periodic control than that expected from the Szilard-Landauer principle, provided the period is in the non-Markovian domain. Thus, dynamically-controlled QHMs may benefit from hitherto unexploited thermodynamic resources.

    Radio observations of active galactic nuclei with mm-VLBI

    Over the past few decades, our knowledge of jets produced by active galactic nuclei (AGN) has greatly progressed thanks to the development of very-long-baseline interferometry (VLBI). Nevertheless, the crucial mechanisms involved in the formation of the plasma flow, as well as those driving its exceptional radiative output up to TeV energies, remain to be clarified. Most likely, these physical processes take place at short separations from the supermassive black hole, on scales which are inaccessible to VLBI observations at centimeter wavelengths. Due to their high synchrotron opacity, the dense and highly magnetized regions in the vicinity of the central engine can only be penetrated when observing at shorter wavelengths, in the millimeter and sub-millimeter regimes. While this was recognized already in the early days of VLBI, it is only in very recent years that sensitive VLBI imaging at high frequencies has become possible. Ongoing technical development and wide-band observing now provide adequate imaging fidelity to carry out more detailed analyses. In this article we give an overview of some open questions concerning the physics of AGN jets, and we discuss the impact of mm-VLBI studies. Among the rich set of results produced so far in this frequency regime, we particularly focus on studies performed at 43 GHz (7 mm) and at 86 GHz (3 mm). Some of the first findings at 230 GHz (1 mm) obtained with the Event Horizon Telescope are also presented. Comment: Published in The Astronomy & Astrophysics Review. Open access: https://link.springer.com/article/10.1007/s00159-017-0105-

    Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning

    Despite the recent success of reinforcement learning in various domains, these approaches remain, for the most part, deterringly sensitive to hyper-parameters and are often riddled with essential engineering feats allowing their success. We consider the case of off-policy generative adversarial imitation learning, and perform an in-depth review, qualitative and quantitative, of the method. We show that forcing the learned reward function to be locally Lipschitz-continuous is a sine qua non condition for the method to perform well. We then study the effects of this necessary condition and provide several theoretical results involving the local Lipschitzness of the state-value function. We complement these guarantees with empirical evidence attesting to the strong positive effect that the consistent satisfaction of the Lipschitzness constraint on the reward has on imitation performance. Finally, we tackle a generic pessimistic reward preconditioning add-on spawning a large class of reward shaping methods, which makes the base method it is plugged into provably more robust, as shown in several additional theoretical guarantees. We then discuss these through a fine-grained lens and share our insights. Crucially, the guarantees derived and reported in this work are valid for any reward satisfying the Lipschitzness condition; nothing is specific to imitation. As such, these may be of independent interest.