1,414 research outputs found

    Coherent spin-networks

    In this paper we discuss a proposal of coherent states for Loop Quantum Gravity. These states are labeled by a point in the phase space of General Relativity as captured by a spin-network graph. They are defined as the gauge invariant projection of a product over links of Hall's heat-kernels for the cotangent bundle of SU(2). The labels of the state are written in terms of two unit vectors, a spin and an angle for each link of the graph. The heat-kernel time is chosen to be a function of the spin. These labels are the ones used in the Spin Foam setting and admit a clear geometric interpretation. Moreover, the set of labels per link can be written as an element of SL(2,C). Therefore, these states coincide with Thiemann's coherent states with the area operator as complexifier. We study the properties of semiclassicality of these states and show that, for large spins, they reproduce a superposition over spins of spin-networks with nodes labeled by Livine-Speziale coherent intertwiners. Moreover, the weight associated to spins on links turns out to be given by a Gaussian times a phase as originally proposed by Rovelli. Comment: 15 page

    Learning Contextual Bandits in a Non-stationary Environment

    Multi-armed bandit algorithms have become a reference solution for handling the explore/exploit dilemma in recommender systems and in many other important real-world problems, such as display advertisement. However, such algorithms usually assume a stationary reward distribution, which hardly holds in practice as users' preferences are dynamic. This inevitably leads to consistently suboptimal performance in a recommender system. In this paper, we consider the situation where the underlying distribution of reward remains unchanged over (possibly short) epochs and shifts at unknown time instants. Accordingly, we propose a contextual bandit algorithm that detects possible changes of environment based on its reward estimation confidence and updates its arm selection strategy in response. Rigorous upper regret bound analysis of the proposed algorithm demonstrates its learning effectiveness in such a non-trivial environment. Extensive empirical evaluations on both synthetic and real-world datasets for recommendation confirm its practical utility in a changing environment. Comment: 10 pages, 13 figures, To appear on ACM Special Interest Group on Information Retrieval (SIGIR) 201

    Logical analysis of data as a tool for the analysis of probabilistic discrete choice behavior

    Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers that face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work we discuss how LAD can be used as the first phase of the specification of PDCM. Since in this task the number of patterns generated may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow the discovery of significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of the analysis of the utilization of the Internet in Italy, using real data gathered by the Italian National Institute of Statistics.

    Delay and Cooperation in Nonstochastic Bandits

    We study networks of communicating learning agents that cooperate to solve a common nonstochastic bandit problem. Agents use an underlying communication network to get messages about actions selected by other agents, and drop messages that took more than $d$ hops to arrive, where $d$ is a delay parameter. We introduce Exp3-Coop, a cooperative version of the Exp3 algorithm, and prove that with $K$ actions and $N$ agents the average per-agent regret after $T$ rounds is at most of order $\sqrt{\bigl(d+1 + \tfrac{K}{N}\alpha_{\le d}\bigr)(T\ln K)}$, where $\alpha_{\le d}$ is the independence number of the $d$-th power of the connected communication graph $G$. We then show that for any connected graph, for $d=\sqrt{K}$ the regret bound is $K^{1/4}\sqrt{T}$, strictly better than the minimax regret $\sqrt{KT}$ for noncooperating agents. More informed choices of $d$ lead to bounds which are arbitrarily close to the full information minimax regret $\sqrt{T\ln K}$ when $G$ is dense. When $G$ has sparse components, we show that a variant of Exp3-Coop, allowing agents to choose their parameters according to their centrality in $G$, strictly improves the regret. Finally, as a by-product of our analysis, we provide the first characterization of the minimax regret for bandit learning with delay. Comment: 30 page
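
    As a rough illustration of how the three regret orders quoted in this abstract compare, the sketch below evaluates them numerically. Constants and lower-order terms are dropped, so the numbers are orders of magnitude only. The function names and the example values of K, N, T and d are assumptions made for illustration; the only graph fact used is that a clique has independence number 1.

```python
import math


def coop_regret_order(K, N, T, d, alpha_le_d):
    """Order of the cooperative bound sqrt((d + 1 + (K/N) * alpha) * T * ln K).

    alpha_le_d is the independence number of the d-th power of the
    communication graph; the caller must supply it (hypothetical helper).
    """
    return math.sqrt((d + 1 + (K / N) * alpha_le_d) * T * math.log(K))


def noncoop_minimax_order(K, T):
    """Order sqrt(K * T) of the minimax regret for noncooperating agents."""
    return math.sqrt(K * T)


def full_info_order(K, T):
    """Order sqrt(T * ln K) of the full-information minimax regret."""
    return math.sqrt(T * math.log(K))


# Illustrative values (assumptions, not taken from the paper):
# a clique of N agents, so the independence number is 1 for any d >= 1.
K, N, T, d = 100, 100, 10_000, 1
print("cooperative   :", round(coop_regret_order(K, N, T, d, alpha_le_d=1), 1))
print("noncooperating:", round(noncoop_minimax_order(K, T), 1))
print("full info     :", round(full_info_order(K, T), 1))
```

    On a dense graph with many agents the cooperative order lands between the full-information order and the noncooperating one, which is the qualitative behaviour the abstract describes.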

    From Bandits to Experts: A Tale of Domination and Independence

    We consider the partial observability model for multi-armed bandits, introduced by Mannor and Shamir. Our main result is a characterization of regret in the directed observability model in terms of the dominating and independence numbers of the observability graph. We also show that in the undirected case, the learner can achieve optimal regret without even accessing the observability graph before selecting an action. Both results are shown using variants of the Exp3 algorithm operating on the observability graph in a time-efficient manner.

    Boltzmann Exploration Done Right

    Boltzmann exploration is a classic strategy for sequential decision-making under uncertainty, and is one of the most standard tools in Reinforcement Learning (RL). Despite its widespread use, there is virtually no theoretical understanding about the limitations or the actual benefits of this exploration scheme. Does it drive exploration in a meaningful way? Is it prone to misidentifying the optimal actions or spending too much time exploring the suboptimal ones? What is the right tuning for the learning rate? In this paper, we address several of these questions in the classic setup of stochastic multi-armed bandits. One of our main results is showing that the Boltzmann exploration strategy with any monotone learning-rate sequence will induce suboptimal behavior. As a remedy, we offer a simple non-monotone schedule that guarantees near-optimal performance, albeit only when given prior access to key problem parameters that are typically not available in practical situations (like the time horizon $T$ and the suboptimality gap $\Delta$). More importantly, we propose a novel variant that uses different learning rates for different arms, and achieves a distribution-dependent regret bound of order $\frac{K\log^2 T}{\Delta}$ and a distribution-independent bound of order $\sqrt{KT}\log K$ without requiring such prior knowledge. To demonstrate the flexibility of our technique, we also propose a variant that guarantees the same performance bounds even if the rewards are heavy-tailed.
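
    For readers unfamiliar with the baseline being analyzed, the following is a minimal sketch of classic Boltzmann (softmax) exploration on a Bernoulli bandit: each arm is drawn with probability proportional to the exponential of the learning rate times its empirical mean. It is not the paper's per-arm variant. The function names, the Bernoulli reward model and the example schedule are assumptions chosen for illustration.

```python
import math
import random


def boltzmann_probs(means, eta):
    """Softmax over empirical means with learning rate eta (numerically stable)."""
    top = max(means)
    weights = [math.exp(eta * (m - top)) for m in means]
    total = sum(weights)
    return [w / total for w in weights]


def run_boltzmann(arm_means, horizon, eta_schedule, seed=0):
    """Play a Bernoulli bandit with Boltzmann exploration; return pull counts.

    arm_means    : true success probabilities (unknown to the learner)
    eta_schedule : function t -> learning rate used at round t (assumed form)
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once so the empirical means exist
        else:
            estimates = [sums[i] / counts[i] for i in range(n_arms)]
            probs = boltzmann_probs(estimates, eta_schedule(t))
            arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Example with a monotone schedule eta_t = log(t), the kind of schedule
# the abstract says can induce suboptimal behavior.
print(run_boltzmann([0.3, 0.5, 0.7], horizon=2000, eta_schedule=math.log))
```

    A single shared learning rate cannot account for how often each arm has been sampled, which is the limitation the per-arm variant in the abstract is designed to address.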

    On the Troll-Trust Model for Edge Sign Prediction in Social Networks

    In the problem of edge sign prediction, we are given a directed graph (representing a social network), and our task is to predict the binary labels of the edges (i.e., the positive or negative nature of the social relationships). Many successful heuristics for this problem are based on the troll-trust features, estimating at each node the fraction of outgoing and incoming positive/negative edges. We show that these heuristics can be understood, and rigorously analyzed, as approximators to the Bayes optimal classifier for a simple probabilistic model of the edge labels. We then show that the maximum likelihood estimator for this model approximately corresponds to the predictions of a Label Propagation algorithm run on a transformed version of the original social graph. Extensive experiments on a number of real-world datasets show that this algorithm is competitive against state-of-the-art classifiers in terms of both accuracy and scalability. Finally, we show that troll-trust features can also be used to derive online learning algorithms which have theoretical guarantees even when edges are adversarially labeled. Comment: v5: accepted to AISTATS 201
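
    The per-node features mentioned in this abstract can be sketched as follows: for each node, the fraction of its outgoing edges that are negative and the fraction of its incoming edges that are negative. This is only a minimal illustration of the feature computation; the function name, the edge-tuple format and the toy graph are assumptions, and no prediction rule from the paper is reproduced here.

```python
from collections import defaultdict


def troll_trust_features(signed_edges):
    """Compute per-node negative-edge fractions from a signed directed graph.

    signed_edges : iterable of (source, target, sign) with sign in {+1, -1}
                   (hypothetical input format chosen for this sketch)
    Returns two dicts:
      out_neg_frac[u] = fraction of u's outgoing edges labeled -1
      in_neg_frac[v]  = fraction of v's incoming edges labeled -1
    Nodes with no outgoing (or incoming) edges are absent from that dict.
    """
    out_total, out_neg = defaultdict(int), defaultdict(int)
    in_total, in_neg = defaultdict(int), defaultdict(int)
    for source, target, sign in signed_edges:
        out_total[source] += 1
        in_total[target] += 1
        if sign < 0:
            out_neg[source] += 1
            in_neg[target] += 1
    out_neg_frac = {u: out_neg[u] / n for u, n in out_total.items()}
    in_neg_frac = {v: in_neg[v] / n for v, n in in_total.items()}
    return out_neg_frac, in_neg_frac


# Toy graph (made up for illustration): "a" only sends negative edges.
edges = [("a", "b", -1), ("a", "c", -1), ("b", "c", +1), ("c", "a", +1)]
out_frac, in_frac = troll_trust_features(edges)
print(out_frac)
print(in_frac)
```

    Both dictionaries are built in a single pass over the edge list, so the cost is linear in the number of labeled edges.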

    An optical reaction micro-turbine

    To any energy flow there is an associated flow of momentum, so that recoil forces arise every time an object absorbs or deflects incoming energy. This same principle governs the operation of macroscopic turbines as well as that of microscopic turbines that use light as the working fluid. However, a controlled and precise redistribution of optical energy is not easy to achieve at the micron scale, resulting in a low efficiency of power to torque conversion. Here we use direct laser writing to fabricate 3D light guiding structures, shaped as a garden sprinkler, that can precisely reroute input optical power into multiple output channels. The shape parameters are derived from a detailed theoretical analysis of losses in curved microfibers. These optical reaction micro-turbines can maximally exploit light’s momentum to generate a strong, uniform and controllable torque.
