Instruct Vectors - Base models can be instructed with activation vectors — LessWrong

By training per-layer steering vectors via gradient descent on a frozen base model, I found that it is possible to induce consistent assistant behavior, including proper use of EOS tokens at the end of assistant turns and consistent self-reference as an AI assistant. With these steering vectors applied, Qwen3-4B-Base was able to imitate the behavior of an instruction- or chat-tuned model.
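To make the setup concrete, here is a minimal sketch of the general idea in PyTorch. It is not the actual training code: `ToyBase` and `ToyBlock` are hypothetical stand-ins for a real base model, the data is random tokens rather than assistant transcripts, and adding each vector to the layer output via a forward hook with a next-token cross-entropy loss is an assumption about how such vectors could be trained.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB, D_MODEL, N_LAYERS = 32, 16, 4  # toy sizes, chosen arbitrarily

class ToyBlock(nn.Module):
    """Hypothetical stand-in for one transformer layer."""
    def __init__(self, d_model):
        super().__init__()
        self.ff = nn.Linear(d_model, d_model)

    def forward(self, h):
        return h + torch.tanh(self.ff(h))  # residual-style update

class ToyBase(nn.Module):
    """Hypothetical stand-in for a frozen base language model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.layers = nn.ModuleList([ToyBlock(D_MODEL) for _ in range(N_LAYERS)])
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, tokens):
        h = self.embed(tokens)
        for layer in self.layers:
            h = layer(h)
        return self.head(h)

base = ToyBase()
for p in base.parameters():
    p.requires_grad_(False)  # the base model stays frozen
frozen_copy = [p.clone() for p in base.parameters()]

# One trainable vector per layer, initialised to zero (no steering at start).
steer = nn.ParameterList([nn.Parameter(torch.zeros(D_MODEL)) for _ in base.layers])

def make_hook(vec):
    # Returning a value from a forward hook replaces the layer's output.
    def hook(module, inputs, output):
        return output + vec  # broadcasts over batch and sequence positions
    return hook

handles = [layer.register_forward_hook(make_hook(v))
           for layer, v in zip(base.layers, steer)]

# Placeholder data: random tokens standing in for assistant-style transcripts.
tokens = torch.randint(0, VOCAB, (8, 12))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

opt = torch.optim.Adam(steer.parameters(), lr=1e-2)  # only the vectors are optimised
loss_fn = nn.CrossEntropyLoss()
losses = []
for step in range(200):
    logits = base(inputs)
    loss = loss_fn(logits.reshape(-1, VOCAB), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

The only trainable parameters are the per-layer vectors, so anything the steered model learns has to be expressed as a constant offset added to each layer's activations. In a real decoder stack the layer output may be a tuple rather than a single tensor, in which case the hook would need to add the vector to the hidden-state element only.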