4,824 research outputs found

    Algebraization Levels in the Study of Probability

    This research was funded by Project PID2019-105601GB-I00/AEI/10.13039/501100011033 and Research Group FQM-126 (Junta de Andalucía). The paper aims to analyze how different degrees of mathematical formalization can be worked on in the study of probability at non-university educational levels. The model of algebraization levels for mathematical practices, based on the onto-semiotic approach, is applied to identify the different objects and processes involved in solving a selection of probabilistic problems. As a result, we describe a possible progression from arithmetic and proto-algebraic levels of mathematical activity to higher levels of algebraization and formalization in the study of probability. The method of analysis developed can help establish connections between intuitive/informal and progressively more formal approaches in the study of mathematics.

    Causal Reinforcement Learning: A Survey

    Reinforcement learning is an essential paradigm for solving sequential decision problems under uncertainty. Despite many remarkable achievements in recent decades, applying reinforcement learning methods in the real world remains challenging. One of the main obstacles is that reinforcement learning agents lack a fundamental understanding of the world and must therefore learn from scratch through numerous trial-and-error interactions. They may also face challenges in providing explanations for their decisions and in generalizing the acquired knowledge. Causality, however, offers a notable advantage: it can formalize knowledge in a systematic manner and leverage invariance for effective knowledge transfer. This has led to the emergence of causal reinforcement learning, a subfield of reinforcement learning that seeks to enhance existing algorithms by incorporating causal relationships into the learning process. In this survey, we comprehensively review the literature on causal reinforcement learning. We first introduce the basic concepts of causality and reinforcement learning, and then explain how causality can address core challenges in non-causal reinforcement learning. We categorize and systematically review existing causal reinforcement learning approaches based on their target problems and methodologies. Finally, we outline open issues and future directions in this emerging field. Comment: 48 pages, 10 figures
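The abstract's core claim, that agents without causal knowledge can be misled by the data they observe, can be illustrated with a toy confounded bandit. This is a hedged sketch: the structural model, probabilities, and variable names below are illustrative assumptions, not taken from the survey.

```python
import random

random.seed(0)

def sample(do_a=None):
    """One draw from a toy structural causal model.

    A hidden confounder U pushes the logging policy toward action 1
    and also raises the reward, so observational data overrate action 1.
    """
    u = random.random() < 0.5                      # hidden confounder U
    if do_a is None:                               # observational regime
        a = random.random() < (0.9 if u else 0.1)  # U drives the action
    else:                                          # interventional regime: do(A=a)
        a = do_a
    p_r = 0.2 + (0.3 if a else 0.0) + (0.4 if u else 0.0)
    r = random.random() < p_r                      # Bernoulli reward
    return a, r

N = 50_000
obs = [sample() for _ in range(N)]
obs_value = sum(r for a, r in obs if a) / sum(1 for a, r in obs if a)
int_value = sum(r for _, r in (sample(do_a=True) for _ in range(N))) / N

print(f"E[R | A=1]     ~ {obs_value:.2f}")  # inflated by confounding (true value 0.86)
print(f"E[R | do(A=1)] ~ {int_value:.2f}")  # true causal effect (true value 0.70)
```

An agent estimating action values naively from the observational logs would overvalue action 1 by roughly 0.16; one that models the intervention E[R | do(A=1)] recovers the true effect, which is the kind of gap causal reinforcement learning methods aim to close.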

    SoK: Memorization in General-Purpose Large Language Models

    Large Language Models (LLMs) are advancing at a remarkable pace, with myriad applications under development. Unlike most earlier machine learning models, they are no longer built for one specific application but are designed to excel in a wide range of tasks. A major part of this success is due to their huge training datasets and the unprecedented number of model parameters, which allow them to memorize large amounts of information contained in the training data. This memorization goes beyond mere language and encompasses information present in only a few documents. It is often desirable, since it is necessary for performing tasks such as question answering and is therefore an important part of learning, but it also brings a whole array of issues, from privacy and security to copyright and beyond. LLMs can memorize short secrets in the training data, but can also memorize concepts like facts or writing styles that can be expressed in text in many different ways. We propose a taxonomy for memorization in LLMs that covers verbatim text, facts, ideas and algorithms, writing styles, distributional properties, and alignment goals. We describe the implications of each type of memorization - both positive and negative - for model performance, privacy, security and confidentiality, copyright, and auditing, as well as ways to detect and prevent memorization. We further highlight the challenges that arise from the predominant way of defining memorization with respect to model behavior instead of model weights, due to LLM-specific phenomena such as reasoning capabilities or differences between decoding algorithms. Throughout the paper, we describe potential risks and opportunities arising from memorization in LLMs that we hope will motivate new research directions.
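A common behavioral test for the verbatim-text category of memorization is prompted extraction: feed the model a prefix of a training document and measure how much of the true continuation it reproduces. Below is a minimal sketch of the scoring side; the `generate` stub and the example string are placeholders standing in for a real model and corpus, not an API from the paper.

```python
def extraction_score(generated: str, reference: str) -> float:
    """Fraction of the reference continuation reproduced verbatim
    (longest common prefix, a crude proxy for extraction metrics)."""
    n = 0
    for g, r in zip(generated, reference):
        if g != r:
            break
        n += 1
    return n / len(reference) if reference else 0.0

def generate(prompt: str) -> str:
    """Placeholder for a real LLM call: this toy 'model' has fully
    memorized one training document and echoes it verbatim."""
    training_doc = "The quick brown fox jumps over the lazy dog."
    return training_doc[len(prompt):] if training_doc.startswith(prompt) else ""

doc = "The quick brown fox jumps over the lazy dog."
prefix, continuation = doc[:20], doc[20:]
score = extraction_score(generate(prefix), continuation)
print(f"verbatim extraction score: {score:.2f}")  # 1.00 for this toy model
```

As the abstract notes, such behavioral probes only capture verbatim memorization; facts or writing styles that a model can rephrase in many surface forms need different detection strategies.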