We propose a novel response-adaptive randomisation procedure for multi-armed trials with normally
distributed outcomes which is non-myopic, and thus near-optimal in terms of patient benefit, yet maintains computational
feasibility. We derive our response-adaptive algorithm based on the Gittins index for the multi-armed bandit
problem, as an extension of the method first introduced in Villar et al. (2015). We illustrate the proposed procedure
by simulations in the context of Phase II cancer trials. Our results show efficiency and patient benefit
gains from using a response-adaptive allocation procedure with a continuous endpoint instead of a binary one. These
gains persist even if an anticipated low rate of missing data due to deaths, drop-outs or complete responses is imputed
online through a procedure introduced in this paper. Additionally, we discuss how there are response-adaptive designs
that outperform the traditional equal randomised design both in terms of efficiency and patient benefit measures in
the multi-armed trial context.