Learning to Search in Reinforcement Learning
In this thesis, we investigate the use of search-based algorithms with deep neural
networks to tackle a wide range of problems, from board games to video
games and beyond. Drawing inspiration from AlphaGo, the first computer program
to achieve superhuman performance in the game of Go, we developed a new
algorithm, AlphaZero. AlphaZero is a general reinforcement learning algorithm that
combines deep neural networks with Monte Carlo tree search for planning and
learning. Starting completely from scratch, without any prior human knowledge
beyond the basic rules of the game, AlphaZero managed to achieve superhuman
performance in Go, chess and shogi. Subsequently, building upon the success of
AlphaZero, we investigated ways to extend our methods to problems in which the rules
are not known or cannot be hand-coded. This line of work led to the development
of MuZero, a model-based reinforcement learning agent that builds a deterministic
internal model of the world and uses it to construct plans in its imagination. We
applied our method to Go, chess, shogi and the classic Atari suite of video games,
achieving superhuman performance. MuZero is the first RL algorithm to master
a variety of both canonical challenges for high-performance planning and visually
complex problems using the same principles. Finally, we describe Stochastic
MuZero, a general agent that extends the applicability of MuZero to highly
stochastic environments. We show that our method achieves superhuman performance in
stochastic domains such as backgammon and the classic game of 2048 while
matching the performance of MuZero in deterministic ones like Go.
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
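As a rough illustration of how tree search and random sampling combine, the following is a minimal sketch of MCTS with UCB1 selection (often called UCT). It is not taken from the survey. The toy game (a single-pile Nim in which players remove 1 to 3 stones and whoever takes the last stone wins), the names `Node`, `ucb1`, `rollout` and `mcts`, the exploration constant and the iteration count are all assumptions chosen for illustration.

```python
import math
import random


class Node:
    """One game state in the search tree (hypothetical toy Nim game)."""

    def __init__(self, pile, player, parent=None, move=None):
        self.pile = pile            # stones remaining
        self.player = player        # player to move at this node (0 or 1)
        self.parent = parent
        self.move = move            # move that led here from the parent
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= pile]
        self.visits = 0
        self.wins = 0.0             # wins for the player who moved INTO this node


def ucb1(child, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def rollout(pile, player, rng):
    """Play uniformly random moves to the end; return the winning player."""
    while True:
        pile -= rng.choice([m for m in (1, 2, 3) if m <= pile])
        if pile == 0:
            return player           # this player took the last stone
        player = 1 - player


def mcts(pile, iterations=2000, seed=0):
    """Return the most-visited root move. Assumes pile >= 1."""
    rng = random.Random(seed)
    root = Node(pile, player=0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes using UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one untried child, if any remain.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.pile - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node (or read off a terminal).
        if node.pile == 0:
            winner = 1 - node.player    # the previous mover took the last stone
        else:
            winner = rollout(node.pile, node.player, rng)
        # 4. Backpropagation: credit each node from its mover's perspective.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

Calling `mcts(3)` searches a position where taking all three stones wins immediately. The four commented phases in the loop correspond to the selection, expansion, simulation and backpropagation steps usually used to describe the basic algorithm.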
Blended Learning Researches in Iran: Several Fundamental Criticisms
The present study seeks to critically review the state of blended learning research in the Iranian context. For this critique, 47 papers about blended learning were found in a number of indexing databases and their contents were analyzed. The contents mainly revolved around use of relevant terminology, features of blended learning, methodology, levels of blended learning, variables of the study, and the analyzed educational programs. Some major criticisms that can be leveled at these studies include a limited range of terminology, inappropriate use of key concepts, overemphasis on quantitative methods, overuse of pseudo-empirical methods, lack of case studies, mistaking blended learning for the application of computers in education, excessive concentration on the level of educational programs, superficial treatment of the distinction between learning and retaining, lack of attention to some of the variables of blended learning, and use of blended learning for primary and secondary education.
Fair Use and Machine Learning
There would be a beaten path to the maker of software that could reliably state whether a use of a copyrighted work was protected as fair use. But applying machine learning to fair use faces considerable hurdles. Fair use has generated hundreds of reported cases, but machine learning works best with examples in greater numbers. More examples may be available from mining the decision making of web sites, from having humans judge fair use examples just as they label images to teach self-driving cars, and from using machine learning itself to generate examples. Beyond the number of examples, the form of the data is more abstract than the concrete examples on which machine learning has succeeded, such as computer vision, viewing recommendations, and even machine translation, where the operative unit was the sentence, not a concept that could be distributed across a document. But techniques presently in use do find patterns in data to build more abstract features, and then use the same process to build more abstract features. It may be that such automated processes can provide the conceptual blocks necessary. In addition, tools drawn from knowledge engineering (ironically, the branch of artificial intelligence that of late has been eclipsed by machine learning) may extract concepts from such data as judicial opinions. Such tools would include new methods of knowledge representation and automated tagging. If the data questions are overcome, machine learning provides intriguing possibilities, but also faces challenges from the nature of fair use law. Artificial neural networks have shown formidable performance in classification. Classifying fair use examples raises a number of questions. Fair use law is often considered contradictory, vague, and unpredictable. In computer science terminology, the data is “noisy.” That inconsistency could flummox artificial neural networks, or the networks could disclose consistencies that have eluded commentators.
Other algorithms such as nearest neighbor and support vectors could likewise both use and test legal reasoning by analogy. Another approach to machine learning, decision trees, may be simpler than other approaches in some respects, but could work on smaller data sets (addressing one of the data issues above) and provide something that machine learning often lacks: transparency. Decision trees disclose their decision-making process, whereas neural networks, especially deep learning, are opaque black boxes. Finally, unsupervised machine learning could be used to explore fair use case law for patterns, whether they be consistent structures in its jurisprudence, or biases that have played an undisclosed role. Any possible patterns found, however, should be treated as possibilities, pending testing by other means.
Procedural content generation: better benchmarks for transfer reinforcement learning
The idea of transfer in reinforcement learning (TRL) is intriguing: being able to transfer knowledge from one problem to another problem without learning everything from scratch. This promises quicker learning and learning more complex methods. To gain an insight into the field and to detect emerging trends, we performed a database search. We note a surprisingly late adoption of deep learning that starts in 2018. The introduction of deep learning has not yet solved the greatest challenge of TRL: generalization. Transfer between different domains works well when domains have strong similarities (e.g. MountainCar to Cartpole), and most TRL publications focus on different tasks within the same domain that have few differences. Most TRL applications we encountered compare their improvements against self-defined baselines, and the field is still missing unified benchmarks. We consider this to be a disappointing situation. For the future, we note that: (1) A clear measure of task similarity is needed. (2) Generalization needs to improve. Promising approaches merge deep learning with planning via MCTS or introduce memory through LSTMs. (3) The lack of benchmarking tools will be remedied to enable meaningful comparison and measure progress. Already Alchemy and Meta-World are emerging as interesting benchmark suites. We note that another development, the increase in procedural content generation (PCG), can improve both benchmarking and generalization in TRL.
Deep Lake: a Lakehouse for Deep Learning
Traditional data lakes provide critical data infrastructure for analytical
workloads by enabling time travel, running SQL queries, ingesting data with
ACID transactions, and visualizing petabyte-scale datasets on cloud storage.
They allow organizations to break down data silos, unlock data-driven
decision-making, improve operational efficiency, and reduce costs. However, as
deep learning takes over common analytical workflows, traditional data lakes
become less useful for applications such as natural language processing (NLP),
audio processing, computer vision, and applications involving non-tabular
datasets. This paper presents Deep Lake, an open-source lakehouse for deep
learning applications developed at Activeloop. Deep Lake maintains the benefits
of a vanilla data lake with one key difference: it stores complex data, such as
images, videos, annotations, as well as tabular data, in the form of tensors
and rapidly streams the data over the network to (a) the Tensor Query Language,
(b) an in-browser visualization engine, or (c) deep learning frameworks without
sacrificing GPU utilization. Datasets stored in Deep Lake can be accessed from
PyTorch, TensorFlow, and JAX, and integrate with numerous MLOps tools.