    Learning to Search in Reinforcement Learning

    In this thesis, we investigate the use of search-based algorithms combined with deep neural networks to tackle a wide range of problems, from board games to video games and beyond. Drawing inspiration from AlphaGo, the first computer program to achieve superhuman performance in the game of Go, we developed a new algorithm, AlphaZero. AlphaZero is a general reinforcement learning algorithm that combines deep neural networks with Monte Carlo tree search for planning and learning. Starting completely from scratch, without any prior human knowledge beyond the basic rules of the game, AlphaZero achieved superhuman performance in Go, chess and shogi. Building upon the success of AlphaZero, we then investigated ways to extend our methods to problems whose rules are not known or cannot be hand-coded. This line of work led to the development of MuZero, a model-based reinforcement learning agent that builds a deterministic internal model of the world and uses it to construct plans in its imagination. We applied our method to Go, chess, shogi and the classic Atari suite of video games, achieving superhuman performance. MuZero is the first RL algorithm to master both canonical high-performance planning challenges and visually complex problems using the same principles. Finally, we describe Stochastic MuZero, a general agent that extends the applicability of MuZero to highly stochastic environments. We show that our method achieves superhuman performance in stochastic domains such as backgammon and the classic game of 2048, while matching the performance of MuZero in deterministic ones like Go.
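
    As a concrete illustration of how these agents couple a learned network with search, here is a minimal sketch of the PUCT selection rule at the heart of AlphaZero-style planning. It is an illustration only, not the published implementation; the `Node` class, the `puct_select` helper and the constant `c_puct` are assumptions made for the example.

```python
import math

class Node:
    """Per-move search statistics (illustrative sketch, not DeepMind's code)."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a): the policy network's prior for the move
        self.visit_count = 0      # N(s, a)
        self.value_sum = 0.0      # sum of values backed up through this move

    def q_value(self):
        # Mean action value Q(s, a); unvisited moves default to 0.
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def puct_select(children, c_puct=1.25):
    """Pick the move maximizing Q(s, a) + U(s, a), AlphaZero's PUCT rule."""
    sqrt_total = math.sqrt(max(1, sum(c.visit_count for c in children.values())))

    def score(item):
        _, child = item
        exploration = c_puct * child.prior * sqrt_total / (1 + child.visit_count)
        return child.q_value() + exploration

    return max(children.items(), key=score)

# With equal priors, the unvisited move wins on its exploration bonus alone.
children = {"a": Node(0.5), "b": Node(0.5)}
children["a"].visit_count, children["a"].value_sum = 10, 6.0
move, _ = puct_select(children)
print(move)  # prints "b"
```

    In the full algorithm this rule is applied at every step of the tree descent, and the visit counts accumulated at the root become the improved policy used as a training target for the network.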

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
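
    To make the core algorithm concrete, here is a compact sketch of the canonical UCT variant with its four phases: selection via UCB1, expansion, random simulation, and backpropagation. The toy single-pile Nim domain and all names in it are assumptions made so the example runs; they do not come from the survey.

```python
import math
import random

# Toy domain so the sketch runs: single-pile Nim. A state is
# (stones_left, player_to_move); whoever takes the last stone wins.
def legal_moves(state):
    return [m for m in (1, 2, 3) if m <= state[0]]

def apply_move(state, move):
    return (state[0] - move, 1 - state[1])

def random_playout(state):
    # Random self-play; +1 if the player to move at `state` wins, else -1.
    stones, player = state
    mover = player
    if stones == 0:
        return -1  # the opponent already took the last stone
    while stones > 0:
        winner = player  # this player may be the one to empty the pile
        stones -= random.choice([m for m in (1, 2, 3) if m <= stones])
        player = 1 - player
    return 1 if winner == mover else -1

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.untried = [], legal_moves(state)
        self.visits, self.value = 0, 0.0

def ucb1(node, c=1.4):
    # Exploitation (mean value) plus an exploration bonus for rarely tried moves.
    return (node.value / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def mcts(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes via UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=ucb1)
        # 2. Expansion: add one previously untried move.
        if node.untried:
            move = node.untried.pop()
            child = Node(apply_move(node.state, move), node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout, scored for the player who just moved.
        reward = -random_playout(node.state)
        # 4. Backpropagation: alternate the sign at each level (zero-sum game).
        while node:
            node.visits += 1
            node.value += reward
            reward, node = -reward, node.parent
    return max(root.children, key=lambda n: n.visits).move

print(mcts((5, 0)))  # optimal play takes 1, leaving a losing pile of 4
```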

    Blended Learning Researches in Iran: Several Fundamental Criticisms

    The present study critically reviews the state of blended learning research in the Iranian context. For this critique, 47 papers on blended learning were found in a number of indexing databases and their contents were analyzed. The contents mainly revolved around the use of relevant terminology, features of blended learning, methodology, levels of blended learning, study variables, and the educational programs analyzed. Major criticisms that can be leveled at these studies include a limited range of terminology, inappropriate use of key concepts, overemphasis on quantitative methods, overuse of pseudo-empirical methods, a lack of case studies, mistaking blended learning for the application of computers in education, excessive concentration on the level of educational programs, superficial treatment of the distinction between learning and retention, lack of attention to some of the variables of blended learning, and the use of blended learning for primary and secondary education.

    Fair Use and Machine Learning

    There would be a beaten path to the maker of software that could reliably state whether a use of a copyrighted work was protected as fair use. But applying machine learning to fair use faces considerable hurdles. Fair use has generated hundreds of reported cases, but machine learning works best with examples in far greater numbers. More examples may be available: from mining the decision making of web sites, from having humans judge fair use examples just as they label images to teach self-driving cars, and from using machine learning itself to generate examples. Beyond the number of examples, the form of the data is more abstract than in the concrete domains where machine learning has succeeded, such as computer vision and viewing recommendations, and even in comparison to machine translation, where the operative unit was the sentence, not a concept distributed across a document. But techniques presently in use do find patterns in data to build abstract features, and then repeat the same process on those features to build still more abstract ones. It may be that such automated processes can provide the necessary conceptual blocks. In addition, tools drawn from knowledge engineering (ironically, the branch of artificial intelligence that of late has been eclipsed by machine learning) may extract concepts from such data as judicial opinions. Such tools would include new methods of knowledge representation and automated tagging. If the data questions are overcome, machine learning provides intriguing possibilities, but it also faces challenges from the nature of fair use law. Artificial neural networks have shown formidable performance in classification, but classifying fair use examples raises a number of questions. Fair use law is often considered contradictory, vague, and unpredictable; in computer science terminology, the data is “noisy.” That inconsistency could flummox artificial neural networks, or the networks could disclose consistencies that have eluded commentators. Other algorithms such as nearest neighbors and support vector machines could likewise both use and test legal reasoning by analogy. Another approach, decision trees, may be simpler than other approaches in some respects, but it could work on smaller data sets (addressing one of the data issues above) and provide something that machine learning often lacks: transparency. Decision trees disclose their decision-making process, whereas neural networks, especially deep learning, are opaque black boxes. Finally, unsupervised machine learning could be used to explore fair use case law for patterns, whether consistent structures in its jurisprudence or biases that have played an undisclosed role. Any patterns found, however, should be treated as possibilities, pending testing by other means.
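
    The transparency argument for decision trees lends itself to a small sketch: encode the statutory fair use factors as numeric features, fit a shallow tree, and print the rules it learned. Everything below, the feature encoding, the rows, and the labels, is an invented placeholder for illustration, not real case data or legal advice.

```python
# A minimal sketch using scikit-learn; the toy data is fabricated for
# illustration only and encodes no actual fair use holdings.
from sklearn.tree import DecisionTreeClassifier, export_text

FEATURES = ["transformativeness", "factual_work", "fraction_used", "market_harm"]

# One invented row per hypothetical example, loosely tracking the four factors.
X = [
    [0.9, 1, 0.1, 0.1],
    [0.8, 0, 0.2, 0.2],
    [0.7, 1, 0.3, 0.4],
    [0.3, 0, 0.8, 0.6],
    [0.2, 0, 0.9, 0.8],
    [0.1, 1, 0.7, 0.9],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = fair use, 0 = not fair use (placeholder labels)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The point of a tree here is auditability: the learned rules can be
# printed and debated, unlike the weights of a neural network.
print(export_text(tree, feature_names=FEATURES))
```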

    Automatic Emotion Recognition from Mandarin Speech


    Procedural content generation: better benchmarks for transfer reinforcement learning

    The idea of transfer in reinforcement learning (TRL) is intriguing: being able to transfer knowledge from one problem to another without learning everything from scratch. This promises quicker learning and the ability to learn more complex tasks. To gain insight into the field and to detect emerging trends, we performed a database search. We note a surprisingly late adoption of deep learning, starting in 2018. The introduction of deep learning has not yet solved the greatest challenge of TRL: generalization. Transfer between different domains works well when the domains have strong similarities (e.g., MountainCar to CartPole), and most TRL publications focus on different tasks within the same domain that have few differences. Most TRL applications we encountered compare their improvements against self-defined baselines, and the field still lacks unified benchmarks. We consider this a disappointing situation. For the future, we note that: (1) a clear measure of task similarity is needed; (2) generalization needs to improve, and promising approaches merge deep learning with planning via MCTS or introduce memory through LSTMs; (3) the lack of benchmarking tools needs to be remedied to enable meaningful comparison and to measure progress, and Alchemy and Meta-World are already emerging as interesting benchmark suites. We note that another development, the increase in procedural content generation (PCG), can improve both benchmarking and generalization in TRL.
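
    The benchmarking point can be made concrete with a sketch of how PCG supports a clean generalization test: train on one block of procedurally generated levels and evaluate on a disjoint block. This assumes the Procgen benchmark and the pre-0.26 `gym` step API; the environment name is real, but the level ranges and the loop are arbitrary stand-ins for an actual training procedure.

```python
import gym  # assumes `pip install gym procgen` with the pre-0.26 gym API

# Training levels 0-199; the seed range defines the training distribution.
train_env = gym.make("procgen:procgen-coinrun-v0",
                     num_levels=200, start_level=0)

# Held-out evaluation on a disjoint block of levels the agent never saw,
# which is what makes the generalization measurement meaningful.
test_env = gym.make("procgen:procgen-coinrun-v0",
                    num_levels=200, start_level=10_000)

obs = train_env.reset()
for _ in range(10):  # placeholder for an actual training loop
    obs, reward, done, info = train_env.step(train_env.action_space.sample())
    if done:
        obs = train_env.reset()
```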

    Deep Lake: a Lakehouse for Deep Learning

    Traditional data lakes provide critical data infrastructure for analytical workloads by enabling time travel, running SQL queries, ingesting data with ACID transactions, and visualizing petabyte-scale datasets on cloud storage. They allow organizations to break down data silos, unlock data-driven decision-making, improve operational efficiency, and reduce costs. However, as deep learning takes over common analytical workflows, traditional data lakes become less useful for applications such as natural language processing (NLP), audio processing, computer vision, and other applications involving non-tabular datasets. This paper presents Deep Lake, an open-source lakehouse for deep learning applications developed at Activeloop. Deep Lake retains the benefits of a vanilla data lake with one key difference: it stores complex data, such as images, videos and annotations, as well as tabular data, in the form of tensors, and rapidly streams the data over the network to (a) a Tensor Query Language, (b) an in-browser visualization engine, or (c) deep learning frameworks without sacrificing GPU utilization. Datasets stored in Deep Lake can be accessed from PyTorch, TensorFlow and JAX, and integrate with numerous MLOps tools.
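
    The access pattern described above can be sketched with the open-source `deeplake` package; the calls follow its documented 3.x usage, but the dataset path and tensor names are examples and may differ between versions.

```python
import deeplake  # pip install deeplake (3.x API assumed)

# Load a dataset stored in Deep Lake's tensor format from cloud storage.
ds = deeplake.load("hub://activeloop/mnist-train")

# Stream batches to PyTorch over the network instead of downloading
# the whole dataset first; batches are dicts keyed by tensor name.
loader = ds.pytorch(batch_size=32, shuffle=True)

for batch in loader:
    images, labels = batch["images"], batch["labels"]
    break  # one batch, just to show the interface
```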