12 research outputs found

    The on-line shortest path problem under partial monitoring

    The on-line shortest path problem is considered under various models of partial monitoring. Given a weighted directed acyclic graph whose edge weights can change in an arbitrary (adversarial) way, a decision maker has to choose, in each round of a game, a path between two distinguished vertices such that the loss of the chosen path (defined as the sum of the weights of its edges) is as small as possible. In a setting generalizing the multi-armed bandit problem, after choosing a path, the decision maker learns only the weights of those edges that belong to the chosen path. For this problem, an algorithm is given whose average cumulative loss in n rounds exceeds that of the best path, matched off-line to the entire sequence of the edge weights, by a quantity that is proportional to 1/\sqrt{n} and depends only polynomially on the number of edges of the graph. The algorithm can be implemented with complexity linear in the number of rounds n and in the number of edges. An extension to the so-called label efficient setting is also given, in which the decision maker is informed about the weights of the edges of the chosen path at a total of only m << n time instances. Another extension is shown where the decision maker competes against a time-varying path, a generalization of the problem of tracking the best expert. A version of the multi-armed bandit setting for shortest path is also discussed, where the decision maker learns only the total weight of the chosen path but not the weights of the individual edges on the path. Applications to routing in packet switched networks, along with simulation results, are also presented. Comment: 35 pages
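    The benchmark in the abstract above, the best path matched off-line to the whole weight sequence, can be illustrated with a minimal Python sketch. This is not the paper's bandit algorithm: it only computes the best fixed path in hindsight and the resulting average regret of an arbitrary sequence of chosen paths. The function names, the adjacency-list format `edges_out`, and the per-round dictionaries of edge weights are all assumptions made for illustration.

```python
import math


def path_loss(path, weights):
    """Loss of a path in one round: the sum of its edge weights."""
    return sum(weights[edge] for edge in path)


def best_fixed_path(edges_out, source, sink, rounds):
    """Best single path in hindsight on a DAG.

    edges_out: dict mapping a vertex to the list of its successors.
    rounds: list of dicts mapping an edge (u, v) to its weight in that round.
    Returns (cumulative loss, list of edges) of the best fixed path.
    """
    # Summing each edge's weight over all rounds reduces the problem
    # to one shortest-path computation on the cumulative weights.
    total = {}
    for weights in rounds:
        for edge, w in weights.items():
            total[edge] = total.get(edge, 0.0) + w

    memo = {}

    def solve(u):
        # Memoized recursion; it terminates because the graph is acyclic.
        if u == sink:
            return 0.0, []
        if u in memo:
            return memo[u]
        best = (math.inf, None)
        for v in edges_out.get(u, []):
            cost, tail = solve(v)
            if tail is None:  # v cannot reach the sink
                continue
            candidate = total[(u, v)] + cost
            if candidate < best[0]:
                best = (candidate, [(u, v)] + tail)
        memo[u] = best
        return best

    return solve(source)


def average_regret(chosen_paths, rounds, best_loss):
    """Average per-round excess loss over the best fixed path."""
    incurred = sum(path_loss(p, w) for p, w in zip(chosen_paths, rounds))
    return (incurred - best_loss) / len(rounds)
```

    The abstract's guarantee is that this average regret shrinks at a rate proportional to 1/\sqrt{n} even when only the weights on the chosen path are observed; the sketch assumes all weights are available, since it evaluates the comparison after the fact.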

    Új módszerek az adattömörítésben = New methods in data compression

    We designed limited-delay data compression methods that perform asymptotically as well as the best time-varying code from a reference family (matched to the source sequence in hindsight), where the time-varying code may change the employed base code several times. We provided efficient, low-complexity solutions for the cases when the base reference class is the set of traditional or certain network scalar quantizers. We developed routing algorithms for communication networks that provide QoS parameters (such as packet loss ratio or delay) asymptotically as good as those of the best fixed path in the network, matched to the varying conditions in hindsight. The performance and complexity of the developed methods scale with the size of the network (instead of with the number of paths), even when the rate of convergence (in time) is optimal. Experiments indicate that data for which bytes are not the natural choice of symbols compress poorly with standard byte-based implementations of lossless data compression algorithms, while algorithms working at the bit level perform reasonably on byte-based data (in addition to having computational advantages resulting from operating on a small alphabet). We explained this phenomenon by analyzing how block Markov sources can be approximated by symbol-based higher-order Markov sources. We provided a solution to a sequential on-line version of the bin packing problem, which can be applied to schedule transmissions for certain sensors with limited resources.

    Mirror Descent Meets Fixed Share (and feels no regret)

    Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension. This is done either by a carefully designed projection or by a weight-sharing technique. Via a novel unified analysis, we show that these two approaches deliver essentially equivalent bounds on a notion of regret generalizing shifting, adaptive, discounted, and other related regrets. Our analysis also captures and extends the generalized weight-sharing technique of Bousquet and Warmuth, and can be refined in several ways, including improvements for small losses and adaptive tuning of parameters.
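    The weight-sharing approach mentioned in the abstract above can be sketched in a few lines of Python. This is a minimal illustration of the classical Fixed Share update in its uniform-mixing form (an exponential-weights step followed by mixing with the uniform distribution), not the paper's generalized analysis. The function name `fixed_share`, the learning rate `eta`, and the share rate `alpha` are assumptions chosen for illustration.

```python
import math


def fixed_share(losses, eta=1.0, alpha=0.1):
    """Run a uniform-mixing Fixed Share forecaster.

    losses: list of per-round loss vectors, one entry per expert.
    Returns the weight vector used for prediction in each round.
    """
    n = len(losses[0])
    w = [1.0 / n] * n  # start from the uniform distribution
    history = []
    for loss in losses:
        history.append(list(w))
        # Exponential-weights step (mirror descent with entropic regularizer).
        v = [wi * math.exp(-eta * li) for wi, li in zip(w, loss)]
        total = sum(v)
        v = [vi / total for vi in v]
        # Share step: mix with the uniform distribution so that no weight
        # drops below alpha / n, which lets the forecaster recover quickly
        # when the best expert changes.
        w = [(1.0 - alpha) * vi + alpha / n for vi in v]
    return history
```

    With `alpha = 0` the update reduces to the plain exponentially weighted forecaster, which competes only with the single best expert; a positive `alpha` is what makes bounds against a shifting comparator possible.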

    Improved Regret Bounds for Tracking Experts with Memory

    We address the problem of sequential prediction with expert advice in a non-stationary environment with long-term memory guarantees in the sense of Bousquet and Warmuth [4]. We give a linear-time algorithm that improves on the best known regret bounds [26]. This algorithm incorporates a relative entropy projection step. This projection is advantageous over previous weight-sharing approaches in that weight updates may come with implicit costs, as in, for example, portfolio optimization. We give an algorithm to compute this projection step in linear time, which may be of independent interest.

    Online Multitask Learning with Long-Term Memory

    We introduce a novel online multitask setting. In this setting each task is partitioned into a sequence of segments that is unknown to the learner. Associated with each segment is a hypothesis from some hypothesis class. We give algorithms that are designed to exploit the scenario where there are many such segments but significantly fewer associated hypotheses. We prove regret bounds that hold for any segmentation of the tasks and any association of hypotheses to the segments. In the single-task setting this is equivalent to switching with long-term memory in the sense of [Bousquet and Warmuth; 2003]. We provide an algorithm that predicts on each trial in time linear in the number of hypotheses when the hypothesis class is finite. We also consider infinite hypothesis classes from reproducing kernel Hilbert spaces, for which we give an algorithm whose per-trial time complexity is cubic in the number of cumulative trials. In the single-task special case this is the first example of an efficient regret-bounded switching algorithm with long-term memory for a non-parametric hypothesis class.

    Adaptive Routing Using Expert Advice


    Online Matrix Completion with Side Information

    This thesis considers the problem of binary matrix completion with side information in the online setting and the applications thereof. The side information provides additional information on the rows and columns and can yield improved results compared to when such information is not available. We present efficient and general algorithms in transductive and inductive models. The performance guarantees that we prove are with respect to the matrix complexity measures of the max-norm and the margin complexity. We apply our bounds to the hypothesis class of biclustered matrices. The rows and columns of such matrices can be permuted to form homogeneous latent blocks. This class is a natural choice for our problem since the margin complexity and max-norm of these matrices have an upper bound that is easy to interpret in terms of the latent dimensions. We also apply our algorithms to a novel online multitask setting with RKHS hypothesis classes. In this setting, each task is partitioned into a sequence of segments, where a hypothesis is associated with each segment. Our algorithms are designed to exploit the scenario where the number of associated hypotheses is much smaller than the number of segments. We prove performance guarantees that hold for any segmentation of the tasks and any association of hypotheses to the segments. In the single-task setting, this is analogous to switching with long-term memory in the sense of [Bousquet and Warmuth; 2003].