Online Meta-learning by Parallel Algorithm Competition
The efficiency of reinforcement learning algorithms depends critically on a
few meta-parameters that modulate the learning updates and the trade-off
between exploration and exploitation. The adaptation of the meta-parameters is
an open question in reinforcement learning, which arguably has become more of
an issue recently with the success of deep reinforcement learning in
high-dimensional state spaces. The long learning times in domains such as Atari
2600 video games make it infeasible to perform comprehensive searches of
appropriate meta-parameter values. We propose the Online Meta-learning by
Parallel Algorithm Competition (OMPAC) method. In the OMPAC method, several
instances of a reinforcement learning algorithm are run in parallel with small
differences in the initial values of the meta-parameters. After a fixed number
of episodes, the instances are selected based on their performance in the task
at hand. Before continuing the learning, Gaussian noise is added to the
meta-parameters with a predefined probability. We validate the OMPAC method by
improving the state-of-the-art results in stochastic SZ-Tetris and in standard
Tetris with a smaller, 10×10, board, by 31% and 84%, respectively, and
by improving the results for deep Sarsa(λ) agents in three Atari 2600
games by 62% or more. The experiments also show the ability of the OMPAC method
to adapt the meta-parameters according to the learning progress in different
tasks. Comment: 15 pages, 10 figures. arXiv admin note: text overlap with
arXiv:1702.0311
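The competition loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `train_one_round` is an assumed placeholder for running one RL instance for a fixed number of episodes, and the population size, selection rule, and noise scale are illustrative choices.

```python
import random

def ompac(train_one_round, init_meta, n_instances=8, n_rounds=10,
          mutate_prob=0.2, noise_std=0.1):
    """Sketch of OMPAC-style parallel algorithm competition.

    train_one_round(meta) -> fitness stands in for training an RL agent
    with meta-parameters `meta` for a fixed number of episodes.
    """
    # Start several instances with small perturbations of the meta-parameters.
    population = [{k: v + random.gauss(0, noise_std) for k, v in init_meta.items()}
                  for _ in range(n_instances)]
    for _ in range(n_rounds):
        # Evaluate each instance on the task and rank by performance.
        ranked = sorted(population, key=train_one_round, reverse=True)
        # Selection: keep the better half and duplicate it to refill.
        survivors = ranked[:n_instances // 2]
        population = survivors + [dict(m) for m in survivors]
        # With a predefined probability, add Gaussian noise to each meta-parameter.
        for meta in population:
            for k in meta:
                if random.random() < mutate_prob:
                    meta[k] += random.gauss(0, noise_std)
    return max(population, key=train_one_round)
```

With a toy fitness that peaks at a target meta-parameter value, the surviving instances drift toward that target over rounds, which is the adaptation effect the abstract reports.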
Phase II study of S-1, a novel oral fluorouracil, in advanced non-small-cell lung cancer
The purpose of this study was to evaluate the efficacy and safety of a novel oral anticancer fluoropyrimidine derivative, S-1, in patients receiving initial chemotherapy for unresectable, advanced non-small-cell lung cancer (NSCLC). Between June 1996 and July 1998, 62 patients with NSCLC who had not received previous chemotherapy for advanced disease were enrolled in this study. Of these, 59 patients (22 stage IIIB and 37 stage IV) were eligible for the evaluation of efficacy and safety. S-1 was administered orally, twice daily, after meals. Three dosages of S-1 were prescribed according to body surface area (BSA) so that they would be approximately equivalent to 80 mg m−2 day−1: BSA < 1.25 m2, 40 mg b.i.d.; BSA ≥ 1.25 but < 1.5 m2, 50 mg b.i.d.; and BSA ≥ 1.5 m2, 60 mg b.i.d. One cycle consisted of consecutive administration of S-1 for 28 days followed by a 2-week rest period, and cycles were repeated up to 4 times. The partial response (PR) rate of the eligible patients was 22.0% (13/59) (95% confidence interval: 12.3–34.7%). A PR was observed in 22.7% (5/22) of the stage IIIB patients and 21.6% (8/37) of the stage IV patients. The median response duration was 3.4 months (1.1–13.7 months or longer). Grade 4 neutropenia was observed in one of the 59 patients (1.7%). The grade 3 or 4 toxicities consisted of decreased haemoglobin level in 1.7% of patients (1/59), neutropenia in 6.8% (4/59), thrombocytopenia in 1.7% (1/59), anorexia in 10.2% (6/59), diarrhoea in 8.5% (5/59), stomatitis in 1.7% (1/59), and malaise in 6.8% (4/59), and their incidences were relatively low. There were no irreversible, severe or unexpected toxicities. The median survival time (MST) of all patients was 10.2 months (95% confidence interval: 7.7–14.5 months), and the one-year survival rate was 41.1%. The MST of the stage IIIB patients was 7.9 months, and that of the stage IV patients was 11.1 months. The one-year survival rates of the stage IIIB and IV patients were 30.7% and 47.4%, respectively.
S-1 was considered to be an active single agent against NSCLC. Further study of S-1 with other active agents is warranted. © 2001 Cancer Research Campaign
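The three BSA-based dose tiers above reduce to a simple threshold lookup. The sketch below only restates the tiers from the abstract; the function name is a hypothetical helper, and nothing here is dosing guidance.

```python
def s1_dose_bid_mg(bsa_m2: float) -> int:
    """Per-dose (b.i.d.) S-1 amount in mg by body surface area,
    per the trial's three tiers (approx. 80 mg/m2/day total)."""
    if bsa_m2 < 1.25:
        return 40
    elif bsa_m2 < 1.5:
        return 50
    return 60
```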
Replicator dynamics in public goods games with reward funds
Which punishments or rewards are most effective at maintaining cooperation in public goods interactions and deterring defectors who are willing to freeload on others' contributions? The sanction system is itself a public good and can give rise to problematic "second-order free riders" who do not contribute to the provision of the sanctions and thus may subvert the cooperation supported by sanctioning. Recent studies have shown that public goods games with punishment can lead to a coercion-based regime if participation in the game is optional. Here, we reveal that even with compulsory participation, rewards can maintain cooperation within an infinitely large population. We consider three strategies for players in a standard public goods game: a cooperator, a defector, or a rewarder who contributes both to the public good and to a fund that rewards players who contribute during the game. Cooperators do not contribute to the reward fund and are therefore classified as second-order free riders. The replicator dynamics for the three strategies exhibit a rock-scissors-paper cycle and can be analyzed fully, despite the fact that the expected payoffs are nonlinear. The model does not require repeated interaction, spatial structure, group selection, or reputation. We also discuss a simple method for second-order sanctions, which can lead to a globally stable state in which 100% of the population are rewarders.
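The replicator dynamics underlying the cycle can be illustrated with a generic Euler step of x_i' = x_i (f_i − f_avg). The payoff function used in the usage example is a placeholder rock-scissors-paper payoff, not the paper's nonlinear public-goods payoffs.

```python
def replicator_step(x, payoffs, dt=0.01):
    """One Euler step of the replicator dynamics x_i' = x_i * (f_i - f_avg).

    x: strategy frequencies, e.g. (cooperator, defector, rewarder).
    payoffs(x): returns the expected payoff of each strategy.
    """
    f = payoffs(x)
    # Population-average payoff.
    favg = sum(xi * fi for xi, fi in zip(x, f))
    # Each frequency grows in proportion to its payoff advantage.
    x_new = [xi + dt * xi * (fi - favg) for xi, fi in zip(x, f)]
    s = sum(x_new)  # renormalize against numerical drift
    return [xi / s for xi in x_new]
```

Iterating this step with a cyclic payoff structure traces out the rock-scissors-paper orbits the abstract describes; frequencies remain non-negative and sum to one at every step.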
Picbreeder: A Case Study in Collaborative Evolutionary Exploration of Design Space
For domains in which fitness is subjective or difficult to express formally, Interactive Evolutionary Computation (IEC) is a natural choice. It is possible that a collaborative process combining feedback from multiple users can improve the quality and quantity of generated artifacts. Picbreeder, a large-scale online experiment in collaborative interactive evolution (CIE), explores this potential. Picbreeder is an online community in which users can evolve and share images and, most importantly, continue evolving others' images. Through this process of branching from other images, and through continually increasing image complexity made possible by the underlying NeuroEvolution of Augmenting Topologies (NEAT) algorithm, evolved images proliferate unlike in any other current IEC system. This paper discusses not only the strengths of the Picbreeder approach, but its challenges and shortcomings as well, in the hope that lessons learned will inform the design of future CIE systems.