An induction head is a specific 2-layer attention circuit that implements:
when you see token A followed by B earlier in the sequence, and A appears again, predict B.
Head 1 (a previous-token head) writes “token A sits at the previous position” into position i’s residual stream, where i is the position of B. Head 2 reads this: when A reappears at a later position j, it attends back to position i and promotes whatever’s there, i.e. B.
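Stripped of the attention machinery, the rule the two heads jointly approximate is a lookup. A minimal non-neural sketch (names and structure are mine, not from any library):

```python
# Toy sketch of the induction-head rule: if token A was followed by B
# earlier in the sequence, and A appears again now, predict B.
# The real circuit implements this softly with two attention heads;
# this hard-codes the same lookup to make the algorithm explicit.

def induction_predict(tokens):
    """Find the most recent earlier occurrence of the current token
    and copy the token that followed it."""
    current = tokens[-1]
    # Scan backwards over earlier positions (the "attend" step).
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:   # earlier occurrence of A at position i
            return tokens[i + 1]   # promote B, the token right after it
    return None                    # no prior match: the rule stays silent

print(induction_predict(list("abca")))  # → 'b'
```

The real heads do this with soft attention over all positions rather than an exact backwards scan, so the match degrades gracefully with noisy or fuzzy repeats.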
It is the simplest known mechanistic account of in-context learning — not the whole story, but a load-bearing piece. See kolmogorov-complexity for why this might be the “natural” algorithm the network discovers: low-complexity circuits form first, because they are easier to stumble onto.
I reproduce its emergence in a small transformer in the essay on in-context-learning.