Published on March 25, 2026 | 7 min read

Stop using torch.cat for your KV cache implementations

Tags: llm, kv-cache, pytorch, inference, optimization, transformers

tl;dr: `torch.cat` is not in-place: every call allocates a new tensor and copies the whole cache into it. Use pre-allocated buffers instead.
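To make the tl;dr concrete, here is a minimal sketch of the two approaches. The class name `PreallocatedKVCache`, the helper `append_with_cat`, and the `(batch, heads, seq_len, head_dim)` layout are assumptions chosen for illustration, not taken from any particular library.

```python
import torch


class PreallocatedKVCache:
    """Hypothetical fixed-capacity KV cache for a single attention layer."""

    def __init__(self, batch, heads, max_len, head_dim,
                 dtype=torch.float32, device="cpu"):
        # Allocate the full buffers once, up front.
        shape = (batch, heads, max_len, head_dim)
        self.k = torch.empty(shape, dtype=dtype, device=device)
        self.v = torch.empty(shape, dtype=dtype, device=device)
        self.length = 0  # number of sequence positions filled so far

    def append(self, k_new, v_new):
        # k_new / v_new: (batch, heads, t, head_dim) for t new tokens.
        end = self.length + k_new.shape[2]
        if end > self.k.shape[2]:
            raise ValueError("KV cache capacity exceeded")
        # Slice assignment writes into the existing storage in place.
        self.k[:, :, self.length:end] = k_new
        self.v[:, :, self.length:end] = v_new
        self.length = end
        # Return views over the filled prefix; no data is copied here.
        return self.k[:, :, :end], self.v[:, :, :end]


def append_with_cat(cache, new):
    # Out-of-place: allocates a fresh tensor and copies both inputs into it.
    return torch.cat([cache, new], dim=2)


if __name__ == "__main__":
    cache = PreallocatedKVCache(batch=1, heads=2, max_len=8, head_dim=4)
    base_ptr = cache.k.data_ptr()
    for step in range(3):
        k_new = torch.full((1, 2, 1, 4), float(step))
        k_view, v_view = cache.append(k_new, k_new.clone())
    # The returned view still points at the original buffer.
    print(k_view.shape, k_view.data_ptr() == base_ptr)

    old = torch.zeros(1, 2, 1, 4)
    grown = append_with_cat(old, torch.ones(1, 2, 1, 4))
    # torch.cat returned a different allocation from its input.
    print(grown.shape, grown.data_ptr() != old.data_ptr())
```

The trade-off in this sketch is that `max_len` must be known (or over-estimated) ahead of time, so memory is reserved for the full sequence even when generation stops early.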