Foundation models exhibit significant capabilities in decision-making and
logical deduction. Nonetheless, debate continues over whether they genuinely
understand the world or merely perform stochastic mimicry.
This paper examines a simple transformer trained on Othello,
extending prior research to better understand the emergent world model
of Othello-GPT. The investigation reveals that Othello-GPT encodes a
linear representation of opposing pieces, and that this representation
causally steers its decision-making. This paper further clarifies the interplay between
the linear world representation and causal decision-making, and their
dependence on layer depth and model complexity. We have made the code public