The on-line shortest path problem under partial monitoring
The on-line shortest path problem is considered under various models of
partial monitoring. Given a weighted directed acyclic graph whose edge weights
can change in an arbitrary (adversarial) way, a decision maker has to choose in
each round of a game a path between two distinguished vertices such that the
loss of the chosen path (defined as the sum of the weights of its composing
edges) be as small as possible. In a setting generalizing the multi-armed
bandit problem, after choosing a path, the decision maker learns only the
weights of those edges that belong to the chosen path. For this problem, an
algorithm is given whose average cumulative loss in n rounds exceeds that of
the best path, matched off-line to the entire sequence of the edge weights, by
a quantity that is proportional to 1/\sqrt{n} and depends only polynomially on
the number of edges of the graph. The algorithm can be implemented with linear
complexity in the number of rounds n and in the number of edges. An extension
to the so-called label efficient setting is also given, in which the decision
maker is informed about the weights of the edges corresponding to the chosen
path at a total of m << n time instances. Another extension is shown where the
decision maker competes against a time-varying path, a generalization of the
problem of tracking the best expert. A version of the multi-armed bandit
setting for shortest path is also discussed where the decision maker learns
only the total weight of the chosen path but not the weights of the individual
edges on the path. Applications to routing in packet switched networks along
with simulation results are also presented. Comment: 35 pages
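The benchmark in this abstract (the loss of a path as the sum of its edge weights, and the best fixed path chosen off-line for the whole weight sequence) can be illustrated with a small sketch. This is not the paper's algorithm: it enumerates all paths by brute force, which is exactly what an efficient method avoids, and every name below (`enumerate_paths`, `path_loss`, `best_path_in_hindsight`, `average_regret`) is hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
Path = List[Edge]


def enumerate_paths(graph: Dict[str, List[str]], source: str, target: str) -> List[Path]:
    """List every source-to-target path of a DAG as a list of edges."""
    if source == target:
        return [[]]
    paths: List[Path] = []
    for nxt in graph.get(source, []):
        for rest in enumerate_paths(graph, nxt, target):
            paths.append([(source, nxt)] + rest)
    return paths


def path_loss(path: Path, weights: Dict[Edge, float]) -> float:
    """Loss of a path in one round: the sum of the weights of its edges."""
    return sum(weights[edge] for edge in path)


def best_path_in_hindsight(paths: List[Path], rounds: List[Dict[Edge, float]]) -> Tuple[Path, float]:
    """Fixed path with the smallest cumulative loss over all rounds."""
    totals = [sum(path_loss(p, w) for w in rounds) for p in paths]
    best = min(range(len(paths)), key=totals.__getitem__)
    return paths[best], totals[best]


def average_regret(chosen: List[Path], paths: List[Path], rounds: List[Dict[Edge, float]]) -> float:
    """Average per-round excess loss of the chosen paths over the best fixed path."""
    learner_loss = sum(path_loss(p, w) for p, w in zip(chosen, rounds))
    _, best_loss = best_path_in_hindsight(paths, rounds)
    return (learner_loss - best_loss) / len(rounds)
```

The quantity returned by `average_regret` is the one the abstract bounds by a term proportional to 1/\sqrt{n}. In the bandit setting described above, the decision maker would observe only the weights of the edges on the chosen path rather than the full `weights` dictionary used here.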
Új módszerek az adattömörítésben = New methods in data compression
We designed limited-delay data compression methods that perform asymptotically as well as the best time-varying code from a reference family (matched to the source sequence in hindsight) that can change the employed base code several times. We provided efficient, low-complexity solutions for the cases when the base reference class is the set of traditional or certain network scalar quantizers.
We developed routing algorithms for communication networks that can provide asymptotically as good QoS parameters (such as packet loss ratio or delay) as the best fixed path in the network matched to the varying conditions in hindsight. The performance and complexity of the developed methods scale with the size of the network (instead of with the number of paths) even when the rate of convergence (in time) is optimal. Experiments indicate that data for which bytes are not the natural choice of symbols compress poorly using standard byte-based implementations of lossless data compression algorithms, while algorithms working on a bit level perform reasonably on byte-based data (in addition to having computational advantages resulting from operating on a small alphabet). We explained this phenomenon by analyzing how block Markov sources can be approximated with symbol-based higher-order Markov sources. We provided a solution to a sequential on-line version of the bin packing problem, which can be applied to schedule transmissions for certain sensors with limited resources.
Mirror Descent Meets Fixed Share (and feels no regret)
Mirror descent with an entropic regularizer is known to achieve shifting
regret bounds that are logarithmic in the dimension. This is done using either
a carefully designed projection or by a weight sharing technique. Via a novel
unified analysis, we show that these two approaches deliver essentially
equivalent bounds on a notion of regret generalizing shifting, adaptive,
discounted, and other related regrets. Our analysis also captures and extends
the generalized weight sharing technique of Bousquet and Warmuth, and can be
refined in several ways, including improvements for small losses and adaptive
tuning of parameters.
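As background for the weight sharing technique mentioned above, the following is a minimal sketch of the classical fixed-share update: an exponential-weights step (mirror descent with an entropic regularizer on the probability simplex) followed by mixing a small amount of mass back to the uniform distribution. It is not the generalized scheme or the unified analysis of the paper, and the names `fixed_share_update`, `eta` (learning rate), and `alpha` (sharing rate) are assumptions made for illustration.

```python
import math
from typing import List


def fixed_share_update(weights: List[float], losses: List[float],
                       eta: float, alpha: float) -> List[float]:
    """One round of fixed share over a finite set of experts.

    weights: current probability vector over experts
    losses:  loss of each expert in this round
    eta:     learning rate of the exponential-weights step
    alpha:   fraction of mass redistributed uniformly (0 <= alpha <= 1)
    """
    # Exponential-weights step: down-weight each expert by exp(-eta * loss).
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    posterior = [s / total for s in scaled]

    # Sharing step: mix with the uniform distribution so that no weight
    # falls below alpha / n, which lets the learner recover quickly when
    # the best expert changes.
    n = len(weights)
    return [(1.0 - alpha) * p + alpha / n for p in posterior]
```

With `alpha = 0` the update reduces to plain exponential weights. A positive `alpha` keeps every weight at least `alpha / n`, which is the property that shifting-regret arguments rely on.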
Improved Regret Bounds for Tracking Experts with Memory
We address the problem of sequential prediction with expert advice in a
non-stationary environment with long-term memory guarantees in the sense of
Bousquet and Warmuth [4]. We give a linear-time algorithm that improves on the
best known regret bounds [26]. This algorithm incorporates a relative entropy
projection step. This projection is advantageous over previous weight-sharing
approaches in that weight updates may come with implicit costs, as in, for
example, portfolio optimization. We give an algorithm to compute this projection
step in linear time, which may be of independent interest.
Online Multitask Learning with Long-Term Memory
We introduce a novel online multitask setting. In this setting each task is
partitioned into a sequence of segments that is unknown to the learner.
Associated with each segment is a hypothesis from some hypothesis class. We
give algorithms that are designed to exploit the scenario where there are many
such segments but significantly fewer associated hypotheses. We prove regret
bounds that hold for any segmentation of the tasks and any association of
hypotheses to the segments. In the single-task setting this is equivalent to
switching with long-term memory in the sense of [Bousquet and Warmuth; 2003].
We provide an algorithm that predicts on each trial in time linear in the
number of hypotheses when the hypothesis class is finite. We also consider
infinite hypothesis classes from reproducing kernel Hilbert spaces for which we
give an algorithm whose per trial time complexity is cubic in the number of
cumulative trials. In the single-task special case this is the first example of
an efficient regret-bounded switching algorithm with long-term memory for a
non-parametric hypothesis class.
Online Matrix Completion with Side Information
This thesis considers the problem of binary matrix completion with side information in the online setting and the applications thereof. The side information provides additional information on the rows and columns and can yield improved results compared to when such information is not available. We present efficient and general algorithms in transductive and inductive models. The performance guarantees that we prove are with respect to the matrix complexity measures of the max-norm and the margin complexity. We apply our bounds to the hypothesis class of biclustered matrices. Such matrices can be permuted through the rows and columns into homogeneous latent blocks. This class is a natural choice for our problem since the margin complexity and max-norm of these matrices have an upper bound that is easy to interpret in terms of the latent dimensions. We also apply our algorithms to a novel online multitask setting with RKHS hypothesis classes. In this setting, each task is partitioned into a sequence of segments, where a hypothesis is associated with each segment. Our algorithms are designed to exploit the scenario where the number of associated hypotheses is much smaller than the number of segments. We prove performance guarantees that hold for any segmentation of the tasks and any association of hypotheses to the segments. In the single-task setting, this is analogous to switching with long-term memory in the sense of [Bousquet and Warmuth; 2003].