Biped walking remains a difficult problem, and robot models can
greatly facilitate our understanding of the underlying
biomechanical principles as well as their neuronal control. The
goal of this study is to demonstrate that stable
biped walking can be achieved by combining the physical properties
of the walking robot with a small, reflex-based neuronal network,
which is governed mainly by local sensor signals. This study shows
that human-like gaits emerge without specific position or
trajectory control and that the walker is able to compensate for small
disturbances through its own dynamical properties. The reflexive
controller used here has the following characteristics, which are
different from earlier approaches: (1) Control is mainly local.
Only two signals (AEA = Anterior Extreme Angle and GC = Ground
Contact) operate at the inter-joint level; all other signals
operate only at single joints. (2) Neither position
control nor trajectory tracking control is used. Instead, the
approximate nature of the local reflexes on each joint allows the
robot mechanics itself (e.g., its passive dynamics) to contribute
substantially to the overall gait trajectory computation. (3) The
motor control scheme used in the local reflexes of our robot is
more straightforward and more biologically plausible than
that of other robots, because the outputs of the motor neurons in
our reflexive controller directly drive the motors of the
joints, rather than serving as references for position or velocity
control. As a consequence, the neural controller and the robot
mechanics are closely coupled as a neuro-mechanical system, and
this study emphasises that dynamically stable biped walking gaits
emerge from the coupling between neural computation and physical
computation. This is demonstrated by different walking
experiments using two real robots as well as by a Poincar\'{e} map
analysis applied to a model of the robot in order to assess its
stability. In addition, this neuronal control structure allows the
use of a policy gradient reinforcement learning algorithm to tune
the parameters of the neurons in real time during walking. In this
way, the robot can reach a record-breaking walking speed of 3.5
leg-lengths per second after only a few minutes of online
learning, which is comparable to the fastest relative speed
of human walking.
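
A Poincar\'{e} map analysis of this kind typically rests on the
following standard criterion, sketched here with assumed notation
(the symbols $x_k$, $P$, $x^{*}$ and $J$ are illustrative and are not
taken from this study). Let $x_k$ denote the robot state sampled once
per step at a fixed gait event, for example ground contact, and let
$P$ map one sample to the next:
\begin{equation}
x_{k+1} = P(x_k), \qquad x^{*} = P(x^{*}).
\end{equation}
A periodic gait corresponds to the fixed point $x^{*}$. Linearising
about it gives
\begin{equation}
x_{k+1} - x^{*} \approx J \, (x_k - x^{*}), \qquad
J = \left. \frac{\partial P}{\partial x} \right|_{x = x^{*}},
\end{equation}
and the gait is locally asymptotically stable if every eigenvalue
$\lambda_i$ of $J$ satisfies $|\lambda_i| < 1$, so that small
disturbances decay from step to step.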