306 research outputs found

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference

    Contributions to Monte Carlo Search

    This research is motivated by improving decision making under uncertainty, in particular for games and symbolic regression. The present dissertation gathers research contributions in the field of Monte Carlo Search (MCS). These contributions focus on the selection, simulation and recommendation policies. Moreover, we develop a methodology to automatically generate an MCS algorithm for a given problem. For the selection policy, most of the bandit literature assumes that there is no structure or similarity between arms, so that each arm is independent of the others. In several instances, however, arms can be closely related. We show, both theoretically and empirically, that a significant improvement over the state-of-the-art selection policies is possible. For the contribution on simulation policy, we focus on the symbolic regression problem and consider how to consistently generate different expressions by changing the probability of drawing each symbol. We formalize the situation as an optimization problem and try different approaches. We show a clear improvement in the sampling process for any length. We further test the best approach by embedding it into an MCS algorithm, and it still shows an improvement. For the contribution on recommendation policy, we study the most common ones in combination with selection policies. A good recommendation policy is a policy that works well with a given selection policy. We show that there is a trend that seems to favor a robust recommendation policy over a riskier one. We also present a contribution where we automatically generate several MCS algorithms from a list of core components upon which most MCS algorithms are built, and compare them to generic algorithms. The results show that this often enables discovering new variants of MCS that significantly outperform generic MCS algorithms.
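The distinction between a selection policy and a recommendation policy can be illustrated with a small bandit sketch. This is not the dissertation's method: UCB1 is used here only as a familiar stand-in selection policy, and the function names (`ucb1_select`, `run_bandit`, `recommend_most_played`, `recommend_empirical_best`) and the Bernoulli reward model are assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Selection policy: pick an arm with UCB1 at round t (1-based)."""
    # Play every arm once before using the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a]
        + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run_bandit(arm_means, budget, seed=0):
    """Simulate Bernoulli arms (an assumed reward model) for `budget` rounds."""
    rng = random.Random(seed)
    counts = [0] * len(arm_means)
    sums = [0.0] * len(arm_means)
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums


def recommend_most_played(counts, sums):
    """A 'robust' recommendation: the arm that was sampled most often."""
    return max(range(len(counts)), key=lambda a: counts[a])


def recommend_empirical_best(counts, sums):
    """A 'riskier' recommendation: the arm with the highest empirical mean."""
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a] if counts[a] else float("-inf"),
    )
```

The two recommendation functions can disagree when an arm has a high empirical mean from only a few samples, which is one way to read the robust-versus-risky trade-off mentioned in the abstract.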

    Ensembles for sequence learning

    This thesis explores the application of ensemble methods to sequential learning tasks. The focus is on the development and the critical examination of new methods or novel applications of existing methods, with emphasis on supervised and reinforcement learning problems. In both types of problems, even after having observed a certain amount of data, we are often faced with uncertainty as to which hypothesis is correct among all the possible ones. However, in many methods for both supervised and for reinforcement learning problems this uncertainty is ignored, in the sense that there is a single solution selected out of the whole of the hypothesis space. Apart from the classical solution of analytical Bayesian formulations, ensemble methods offer an alternative approach to representing this uncertainty. This is done simply through maintaining a set of alternative hypotheses. The sequential supervised problem considered is that of automatic speech recognition using hidden Markov models. The application of ensemble methods to the problem represents a challenge in itself, since most such methods can not be readily adapted to sequential learning tasks. This thesis proposes a number of different approaches for applying ensemble methods to speech recognition and develops methods for effective training of phonetic mixtures with or without access to phonetic alignment data. Furthermore, the notion of expected loss is introduced for integrating probabilistic models with the boosting approach. In some cases substantial improvements over the baseline system are obtained. In reinforcement learning problems the goal is to act in such a way as to maximise future reward in a given environment. In such problems uncertainty becomes important since neither the environment nor the distribution of rewards that result from each action are known. This thesis presents novel algorithms for acting nearly optimally under uncertainty based on theoretical considerations. 
Some ensemble-based representations of uncertainty (including a fully Bayesian model) are developed and tested on a few simple tasks, resulting in performance comparable with the state of the art. The thesis also draws some parallels between a proposed representation of uncertainty based on gradient estimates and "prioritised sweeping", and between the application of reinforcement learning to controlling an ensemble of classifiers and classical supervised ensemble learning methods.

    Meta-Stability of Interacting Adaptive Agents

    The adaptive process can be considered as being driven by two fundamental forces: exploitation and exploration. While the explorative process may be deterministic, the resultant effect may be stochastic. Stochastic effects may also exist in the exploitative process. This thesis considers the effects of stochastic fluctuations inherent in the adaptive process on the behavioural dynamics of a population of interacting agents. It is hypothesised that in such systems one or more attractors in the population space exist, and that transitions between these attractors can occur, either as a result of internal shocks (sampling fluctuations) or external shocks (environmental changes). It is further postulated that such transitions in the (microscopic) population space may be observable as phase transitions in the behaviour of macroscopic observables. A simple model of a stock market, driven by asexual reproduction (selection plus mutation), is put forward as a testbed. A statistical dynamics analysis of the behaviour of this market is then developed. Fixed points in the space of agent behaviours are located, and market dynamics are compared to the analytic predictions. Additionally, an analysis of the relative importance of internal shocks (sampling fluctuations) and external shocks (the stock dividend sequence) across varying population size is presented.

    On Experimentation in Software-Intensive Systems

    Context: Delivering software that has value to customers is a primary concern of every software company. Prevalent in web-facing companies, controlled experiments are used to validate and deliver value in incremental deployments. At the same time that web-facing companies are aiming to automate and reduce the cost of each experiment iteration, embedded systems companies are starting to adopt experimentation practices and to build on the automation developments made in the online domain. Objective: This thesis has two main objectives. The first objective is to analyze how software companies can run and optimize their systems through automated experiments. This objective is investigated from the perspectives of the software architecture, the algorithms for the experiment execution and the experimentation process. The second objective is to analyze how non-web-facing companies can adopt experimentation as part of their development process to validate and deliver value to their customers continuously. This objective is investigated from the perspectives of the software development process and focuses on the experimentation aspects that are distinct from web-facing companies. Method: To achieve these objectives, we conducted research in close collaboration with industry and used a combination of different empirical research methods: case studies, literature reviews, simulations, and empirical evaluations. Results: This thesis provides six main results. First, it proposes an architecture framework for automated experimentation that can be used with different types of experimental designs in both embedded systems and web-facing systems. Second, it proposes a new experimentation process to capture the details of a trustworthy experimentation process that can be used as the basis for an automated experimentation process. Third, it identifies the restrictions and pitfalls of different multi-armed bandit algorithms for automating experiments in industry.
This thesis also proposes a set of guidelines to help practitioners select a technique that minimizes the occurrence of these pitfalls. Fourth, it proposes statistical models to analyze optimization algorithms that can be used in automated experimentation. Fifth, it identifies the key challenges faced by embedded systems companies when adopting controlled experimentation, and proposes a set of strategies to address these challenges. Sixth, it identifies experimentation techniques and proposes a new continuous experimentation model for mission-critical and business-to-business. Conclusion: The results presented in this thesis indicate that the trustworthiness in the experimentation process and the selection of algorithms still need to be addressed before automated experimentation can be used at scale in industry. The embedded systems industry faces challenges in adopting experimentation as part of its development process. In part, this is due to the low number of users and devices that can be used in experiments and the diversity of the required experimental designs for each new situation. This limitation increases both the complexity of the experimentation process and the number of techniques used to address this constraint.

    Learning domain abstractions for long lived robots

    Recent trends in robotics have seen more general purpose robots being deployed in unstructured environments for prolonged periods of time. Such robots are expected to adapt to different environmental conditions, and ultimately take on a broader range of responsibilities, the specifications of which may change online after the robot has been deployed. We propose that in order for a robot to be generally capable in an online sense when it encounters a range of unknown tasks, it must have the ability to continually learn from a lifetime of experience. Key to this is the ability to generalise from experiences and form representations which facilitate faster learning of new tasks, as well as the transfer of knowledge between different situations. However, experience cannot be managed naïvely: one does not want constantly expanding tables of data, but instead continually refined abstractions of the data – much like humans seem to abstract and organise knowledge. If this agent is active in the same, or similar, classes of environments for a prolonged period of time, it is provided with the opportunity to build abstract representations in order to simplify the learning of future tasks. The domain is a common structure underlying large families of tasks, and exploiting this affords the agent the potential to not only minimise relearning from scratch, but over time to build better models of the environment. We propose to learn such regularities from the environment, and extract the commonalities between tasks. This thesis aims to address the major question: what are the domain invariances which should be learnt by a long lived agent which encounters a range of different tasks? This question can be decomposed into three dimensions for learning invariances, based on perception, action and interaction. We present novel algorithms for dealing with each of these three factors. Firstly, how does the agent learn to represent the structure of the world?
We focus here on learning inter-object relationships from depth information as a concise representation of the structure of the domain. To this end we introduce contact point networks as a topological abstraction of a scene, and present an algorithm based on support vector machine decision boundaries for extracting these from three dimensional point clouds obtained from the agent’s experience of a domain. By reducing the specific geometry of an environment into general skeletons based on contact between different objects, we can autonomously learn predicates describing spatial relationships. Secondly, how does the agent learn to acquire general domain knowledge? While the agent attempts new tasks, it requires a mechanism to control exploration, particularly when it has many courses of action available to it. To this end we draw on the fact that many local behaviours are common to different tasks. Identifying these amounts to learning “common sense” behavioural invariances across multiple tasks. This principle leads to our concept of action priors, which are defined as Dirichlet distributions over the action set of the agent. These are learnt from previous behaviours, and expressed as the prior probability of selecting each action in a state, and are used to guide the learning of novel tasks as an exploration policy within a reinforcement learning framework. Finally, how can the agent react online with sparse information? There are times when an agent is required to respond fast to some interactive setting, when it may have encountered similar tasks previously. To address this problem, we introduce the notion of types, being a latent class variable describing related problem instances. The agent is required to learn, identify and respond to these different types in online interactive scenarios. 
We then introduce Bayesian policy reuse as an algorithm that involves maintaining beliefs over the current task instance, updating these from sparse signals, and selecting and instantiating an optimal response from a behaviour library. This thesis therefore makes the following contributions. We provide the first algorithm for autonomously learning spatial relationships between objects from point cloud data. We then provide an algorithm for extracting action priors from a set of policies, and show that considerable gains in speed can be achieved in learning subsequent tasks over learning from scratch, particularly in reducing the initial losses associated with unguided exploration. Additionally, we demonstrate how these action priors allow for safe exploration, feature selection, and a method for analysing and advising other agents’ movement through a domain. Finally, we introduce Bayesian policy reuse which allows an agent to quickly draw on a library of policies and instantiate the correct one, enabling rapid online responses to adversarial conditions
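The abstract describes action priors as Dirichlet distributions over the action set, learnt from previous behaviours and used as an exploration policy. A minimal sketch of that idea follows, under stated assumptions: earlier policies are represented as plain state-to-action dictionaries, the Dirichlet parameters are simple action counts plus a uniform `pseudo_count`, and exploration samples from the Dirichlet mean. All names here are hypothetical, and the thesis's actual algorithm may differ.

```python
import random
from collections import defaultdict


def learn_action_priors(policies, actions, pseudo_count=1.0):
    """Accumulate per-state Dirichlet parameters from earlier policies.

    Each policy is assumed to be a dict mapping state -> chosen action.
    """
    alpha = defaultdict(lambda: {a: pseudo_count for a in actions})
    for policy in policies:
        for state, action in policy.items():
            alpha[state][action] += 1.0
    return alpha


def prior_probabilities(alpha, state):
    """Mean of the Dirichlet for `state`: prior probability of each action."""
    params = alpha[state]
    total = sum(params.values())
    return {a: v / total for a, v in params.items()}


def explore(alpha, state, rng):
    """Sample an exploratory action in proportion to the action prior."""
    probs = prior_probabilities(alpha, state)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]
```

For example, if two earlier policies chose `"left"` in a state and one chose `"right"`, exploration in that state is biased towards `"left"` while still leaving every action some probability; states never seen before fall back to a uniform prior.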

    Brasil encima de todo, Dios encima de todos : una etnografía del Sentido Colonial Metafórico Bolsonarista en las elecciones brasileñas de 2018

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Ciencias Políticas y Sociología, defended on 26-04-2022. This thesis seeks to explain the success of Brazil’s far-right-wing Bolsonarists in the 2018 election campaign. It analyses how the different communicative forms present in the campaign were structured around foundational metaphors that generated the entire cognitive universe of Bolsonarismo, allowing it to link into the type of society Brazil actually was: a post-colonial society, historically racist, hierarchical and founded on a regime of the normalisation of violence, which at the same time built its management of these dilemmas on the cordiality and flexibility of its relationships through festive expressions, such as carnivals and football. Influenced by Caio Prado Jr’s (2011/1942) concept of colonial sense, I have called the relationship between Bolsonarist metaphorical thinking and Brazil’s colonial origin a metaphorical colonial sense. This concept of metaphorical colonial sense has been crucial to the testing of the hypotheses that were the starting points for this thesis. These were that the Bolsonarist phenomenon could be explained as a product of the existence of a global systemic crisis, as a product of the existence of specific elements in Brazilian culture closely linked to the colonial past that included an acceptance of hierarchies and a specific historical social order that was perceived to be at risk, and through the category of “far-right” rather than fascism, as, although Bolsonarismo shared certain features of fascism, it was a complex, particular and peculiarly Brazilian phenomenon better explained by the concept of “far-right.”

    The weight of experience: an investigation of probability weighting under decisions from experience

    In decisions-from-experience tasks, objective information regarding payoffs and probabilities must be inferred from samples of possible outcomes. A series of recent experiments has revealed that people show deviating choice behaviour in such tasks, indicating underweighting of small probabilities instead of the overweighting of small probabilities found in decisions from description. In a range of experiments, the research presented in this thesis provides a new direction by showing that such reversals from overweighting to underweighting in decisions from experience are very robust and can be replicated even if all the existing explanations (sampling error, recency weighting and judgement error) are experimentally controlled for. Furthermore, reversals were replicated within common decision making biases like the common ratio effect. An important, but unexpected, new finding has been the observation of a reversed reflection effect under decisions from experience. This suggests that the difference between choice behaviour may not be restricted to underlying transformations of probabilities, as suggested in the literature. Drawing from an extensive range of model tests and parameter estimations, it is also demonstrated that the differences are reflected in the best fitting parameter values for prospect theory under decisions from experience. However, it is also shown that simple reinforcement models, which provide a more intuitive rationale for experiential choice behaviour, can account for the data just as well, without any assumptions regarding the weighting of probabilities.
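Overweighting and underweighting of small probabilities can be made concrete with a probability weighting function. The sketch below uses the one-parameter form commonly associated with cumulative prospect theory, w(p) = p^γ / (p^γ + (1 − p)^γ)^(1/γ). It is an assumption that this form matches the models fitted in the thesis, and the γ values are illustrative only, not estimates from the thesis.

```python
def weight(p, gamma):
    """One-parameter probability weighting function.

    gamma < 1 overweights small probabilities; gamma > 1 underweights them;
    gamma == 1 leaves probabilities unchanged.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    numerator = p ** gamma
    denominator = (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)
    return numerator / denominator


# Illustrative parameter values (assumed, not fitted).
small_p = 0.05
overweighted = weight(small_p, 0.61)   # larger than 0.05
underweighted = weight(small_p, 1.5)   # smaller than 0.05
```

Under this parameterisation, a fitted γ above 1 corresponds to the underweighting pattern reported for decisions from experience, and a γ below 1 to the overweighting pattern reported for decisions from description.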