509,625 research outputs found

Online Meta-learning by Parallel Algorithm Competition

Author: Baker James E.
Bertsekas D. P.
Downey Carlton
Gabillon V.
Goodfellow Ian
Mnih Volodymyr
Snoek Jasper
Springenberg Jost T.
Sutton S.
Szita I.
Unemi T.
Wu Jian
Publication date: 24/02/2017

The efficiency of reinforcement learning algorithms depends critically on a few meta-parameters that modulate the learning updates and the trade-off between exploration and exploitation. The adaptation of the meta-parameters is an open question in reinforcement learning, which arguably has become more of an issue recently with the success of deep reinforcement learning in high-dimensional state spaces. The long learning times in domains such as Atari 2600 video games make it infeasible to perform comprehensive searches of appropriate meta-parameter values. We propose the Online Meta-learning by Parallel Algorithm Competition (OMPAC) method. In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters. After a fixed number of episodes, the instances are selected based on their performance in the task at hand. Before continuing the learning, Gaussian noise is added to the meta-parameters with a predefined probability. We validate the OMPAC method by improving the state-of-the-art results in stochastic SZ-Tetris and in standard Tetris with a smaller, 10×10, board, by 31% and 84%, respectively, and by improving the results for deep Sarsa(λ) agents in three Atari 2600 games by 62% or more. The experiments also show the ability of the OMPAC method to adapt the meta-parameters according to the learning progress in different tasks.

Comment: 15 pages, 10 figures. arXiv admin note: text overlap with arXiv:1702.0311
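The select-then-perturb loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `ompac_sketch`, the `evaluate` callback (standing in for "run an instance for a fixed number of episodes"), the truncation-style selection, and all default values are assumptions made for the example. The abstract does not say which selection scheme is used.

```python
import random

def ompac_sketch(evaluate, init_params, n_instances=8, generations=20,
                 noise_prob=0.5, noise_scale=0.1, seed=0):
    """Toy select-and-perturb loop over meta-parameters (hypothetical helper)."""
    rng = random.Random(seed)
    # Start the instances with small differences in the initial meta-parameters.
    population = [
        {k: v * (1.0 + rng.gauss(0.0, 0.05)) for k, v in init_params.items()}
        for _ in range(n_instances)
    ]
    best = None
    for _ in range(generations):
        # Stand-in for running each instance for a fixed number of episodes.
        scores = [evaluate(p, rng) for p in population]
        ranked = sorted(zip(scores, range(n_instances)), reverse=True)
        best = (ranked[0][0], dict(population[ranked[0][1]]))
        # Assumed selection rule: keep the top half and copy it to refill.
        survivors = [population[i] for _, i in ranked[: max(1, n_instances // 2)]]
        population = [dict(survivors[i % len(survivors)])
                      for i in range(n_instances)]
        # With a predefined probability, add Gaussian noise to each meta-parameter.
        for p in population:
            for k in p:
                if rng.random() < noise_prob:
                    p[k] += rng.gauss(0.0, noise_scale * abs(p[k]) + 1e-12)
    # Best score and meta-parameters seen in the final evaluation round.
    return best
```

In a full system each instance would also carry its learned weights across selection rounds; the sketch tracks only the meta-parameters to keep the control flow visible.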

arXiv.org e-Print Archive

Crossref

Slow Learners are Fast

Author: Langford John
Smola Alexander
Zinkevich Martin
Publication date: 01/01/2009

Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design, which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online learning.

Comment: Extended version of conference paper - NIPS 200
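To make "delayed updates" concrete, here is a minimal scalar sketch, assuming plain gradient descent in which the gradient computed at step t is applied only `delay` steps later. The function name `delayed_sgd` and its parameters are illustrative assumptions, not taken from the paper.

```python
from collections import deque

def delayed_sgd(grad, w0, steps, delay, lr):
    """Gradient descent where each gradient is applied `delay` steps late."""
    w = w0
    pending = deque()  # gradients computed but not yet applied
    for _ in range(steps):
        # Gradient is evaluated at the current (possibly stale) parameters.
        pending.append(grad(w))
        if len(pending) > delay:
            # Apply the gradient that was computed `delay` steps ago.
            w -= lr * pending.popleft()
    return w
```

With `delay=0` this reduces to ordinary sequential gradient descent. A larger delay models workers whose gradients arrive late, and in this toy setting it generally calls for a smaller step size to remain stable.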

arXiv.org e-Print Archive

CiteSeerX

Pedagogy First, Technology Second: teaching & learning information literacy online

Author: Bradbury Stephanie J.
Fell Peter
Peacock Judith A.
Vollmerhause Kurt
Publication date: 01/01/2003

This paper explores the pedagogical and technical issues, challenges and outcomes of creating an online information literacy course. Currently under development, this course will be offered as a parallel study option to Advanced Information Retrieval Skills (AIRS:IFN001) for QUT postgraduate students, a compulsory face-to-face course for all QUT research students. The aim of this project is to optimise students’ access to AIRS:IFN001 and meet the University’s objectives regarding flexible delivery and online teaching. Still in its developmental stages, AIRS::Online extends beyond the current notion of static online information literacy tutorials by providing a facilitated, student-focussed learning environment comprising content and learning experiences enhanced by appropriate multimedia technology and resources, which engage students in planned, facilitated and/or self-paced learning events. Course assessment is formative and summative, and comprises a research log and reflective journal to provide a means for reviewing the content and key process of advanced information searching and retrieval.

Queensland University of Technology ePrints Archive

Dynamic Metric Learning from Pairwise Comparisons

Author: Greenewald Kristjan
Hero III Alfred
Kelley Stephen
Publication date: 10/10/2016

Recent work in distance metric learning has focused on learning transformations of data that best align with specified pairwise similarity and dissimilarity constraints, often supplied by a human observer. The learned transformations lead to improved retrieval, classification, and clustering algorithms due to the better adapted distance or similarity measures. Here, we address the problem of learning these transformations when the underlying constraint generation process is nonstationary. This nonstationarity can be due to changes in either the ground-truth clustering used to generate constraints or changes in the feature subspaces in which the class structure is apparent. We propose Online Convex Ensemble StrongLy Adaptive Dynamic Learning (OCELAD), a general adaptive, online approach for learning and tracking optimal metrics as they change over time that is highly robust to a variety of nonstationary behaviors in the changing metric. We apply the OCELAD framework to an ensemble of online learners. Specifically, we create a retro-initialized composite objective mirror descent (COMID) ensemble (RICE) consisting of a set of parallel COMID learners with different learning rates, demonstrate RICE-OCELAD on both real and synthetic data sets, and show significant performance improvements relative to previously proposed batch and online distance metric learning algorithms.

Comment: to appear Allerton 2016. arXiv admin note: substantial text overlap with arXiv:1603.0367

arXiv.org e-Print Archive

Crossref

The Practice of Telecommuting: A Fresh Perspective

Author: Franco GANDOLFI
Gary OSTER

Telecommuting has been a popular practice for an increasing number of firms and governmental bodies over the past decade or more. This research paper reviews antecedents, implementation considerations, known consequences, barriers, and recommendations that need to be determined prior to the adoption of telecommuting practices. The paper demonstrates that the phenomenon of telecommuting is the result of historical, sociological, and technological shifts and advancements. While firms have successfully implemented various elements of telecommuting practices, challenges along the way have yielded insights and lessons that merit further examination and discussion. This paper asserts that with selected individuals, proper structure, and sufficient feedback mechanisms in place, the adoption of telecommuting has the capacity to strengthen a firm’s bottom line and provide tangible benefit for its employees. As a case in point, online learning, developed in parallel with the growth of telecommuting, yields substantial benefits for employees and the companies in which they serve. For employees, online learning is convenient, accommodates multiple learning styles, and is an engaging learning mechanism. For corporations, online learning encourages cost-effectiveness, uniformity in quality and flexibility, and enhanced cross-cultural and cross-disciplinary communications, all necessary to meet the challenges of the ever-changing global marketplace.

Keywords: telecommuting; technology; online learning; social media; innovation; institutional learning; cross-cultural communications.

Research Papers in Economics