I removed a learned behavior from a tiny AI model. Adding it to another model was harder.
A small mechanistic-interpretability result: subtracting one internal direction mostly removed a learned behavior, but adding matching directions to a separately trained model did not install it.