502 research outputs found

    Robustness - a challenge also for the 21st century: A review of robustness phenomena in technical, biological and social systems as well as robust approaches in engineering, computer science, operations research and decision aiding

    Get PDF
    Notions on robustness exist in many facets. They come from different disciplines and reflect different worldviews. Consequently, they contradict each other very often, which makes the term less applicable in a general context. Robustness approaches are often limited to specific problems for which they have been developed. This means, notions and definitions might reveal to be wrong if put into another domain of validity, i.e. context. A definition might be correct in a specific context but need not hold in another. Therefore, in order to be able to speak of robustness we need to specify the domain of validity, i.e. system, property and uncertainty of interest. As proofed by Ho et al. in an optimization context with finite and discrete domains, without prior knowledge about the problem there exists no solution what so ever which is more robust than any other. Similar to the results of the No Free Lunch Theorems of Optimization (NLFTs) we have to exploit the problem structure in order to make a solution more robust. This optimization problem is directly linked to a robustness/fragility tradeoff which has been observed in many contexts, e.g. 'robust, yet fragile' property of HOT (Highly Optimized Tolerance) systems. Another issue is that robustness is tightly bounded to other phenomena like complexity for which themselves exist no clear definition or theoretical framework. Consequently, this review rather tries to find common aspects within many different approaches and phenomena than to build a general theorem for robustness, which anyhow might not exist because complex phenomena often need to be described from a pluralistic view to address as many aspects of a phenomenon as possible. First, many different robustness problems have been reviewed from many different disciplines. Second, different common aspects will be discussed, in particular the relationship of functional and structural properties. This paper argues that robustness phenomena are also a challenge for the 21st century. It is a useful quality of a model or system in terms of the 'maintenance of some desired system characteristics despite fluctuations in the behaviour of its component parts or its environment' (s. [Carlson and Doyle, 2002], p. 2). We define robustness phenomena as solution with balanced tradeoffs and robust design principles and robustness measures as means to balance tradeoffs. --

    Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback

    Full text link
    In this work, we study the low-rank MDPs with adversarially changed losses in the full-information feedback setting. In particular, the unknown transition probability kernel admits a low-rank matrix decomposition \citep{REPUCB22}, and the loss functions may change adversarially but are revealed to the learner at the end of each episode. We propose a policy optimization-based algorithm POLO, and we prove that it attains the O~(K56A12dln⁥(1+M)/(1−γ)2)\widetilde{O}(K^{\frac{5}{6}}A^{\frac{1}{2}}d\ln(1+M)/(1-\gamma)^2) regret guarantee, where dd is rank of the transition kernel (and hence the dimension of the unknown representations), AA is the cardinality of the action space, MM is the cardinality of the model class, and Îł\gamma is the discounted factor. Notably, our algorithm is oracle-efficient and has a regret guarantee with no dependence on the size of potentially arbitrarily large state space. Furthermore, we also prove an Ω(Îł21−γdAK)\Omega(\frac{\gamma^2}{1-\gamma} \sqrt{d A K}) regret lower bound for this problem, showing that low-rank MDPs are statistically more difficult to learn than linear MDPs in the regret minimization setting. To the best of our knowledge, we present the first algorithm that interleaves representation learning, exploration, and exploitation to achieve the sublinear regret guarantee for RL with nonlinear function approximation and adversarial losses

    Universal Algorithmic Intelligence: A mathematical top->down approach

    Full text link
    Sequential decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible. We outline how the AIXI model can formally solve a number of problem classes, including sequence prediction, strategic games, function minimization, reinforcement and supervised learning. The major drawback of the AIXI model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIXItl that is still effectively more intelligent than any other time t and length l bounded agent. The computation time of AIXItl is of the order t x 2^l. The discussion includes formal definitions of intelligence order relations, the horizon problem and relations of the AIXI theory to other AI approaches.Comment: 70 page

    A survey on online active learning

    Full text link
    Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data is only available in an unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed in the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in the context of online active learning. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research. Our review aims to provide a comprehensive and up-to-date overview of the field and to highlight directions for future work

    Gamification of Pure Exploration for Linear Bandits

    Get PDF
    We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-arm bandits, the existence of such algorithms for the best-arm identification in linear bandits has been elusive despite several attempts to address it. First, we provide a thorough comparison and new insight over different notions of optimality in the linear case, including G-optimality, transductive optimality from optimal experimental design and asymptotic optimality. Second, we design the first asymptotically optimal algorithm for fixed-confidence pure exploration in linear bandits. As a consequence, our algorithm naturally bypasses the pitfall caused by a simple but difficult instance, that most prior algorithms had to be engineered to deal with explicitly. Finally, we avoid the need to fully solve an optimal design problem by providing an approach that entails an efficient implementation.Comment: 11+25 pages. To be published in the proceedings of ICML 202

    What Happens in the Field Stays in the Field: Exploring Whether Professionals Play Minimax in Laboratory Experiments

    Get PDF
    The minimax argument represents game theory in its most elegant form: simple but with stark predictions. Although some of these predictions have been met with reasonable success in the field, experimental data have generally not provided results close to the theoretical predictions. In a striking study, Palacios-Huerta and Volij (2007) present evidence that potentially resolves this puzzle: both amateur and professional soccer players play nearly exact minimax strategies in laboratory experiments. In this paper, we establish important bounds on these results by examining the behavior of four distinct subject pools: college students, bridge professionals, world-class poker players, who have vast experience with high-stakes randomization in card games, and American professional soccer players. In contrast to Palacios-Huerta and Volij’s results, we find little evidence that real-world experience transfers to the lab in these games—indeed, similar to previous experimental results, all four subject pools provide choices that are generally not close to minimax predictions. We use two additional pieces of evidence to explore why professionals do not perform well in the lab: (1) complementary experimental treatments that pit professionals against preprogrammed computers, and (2) post-experiment questionnaires. The most likely explanation is that these professionals are unable to transfer their skills at randomization from the familiar context of the field to the unfamiliar context of the lab.
    • 

    corecore