Time to update llama.cpp to get some MTP improvements!

· r/LocalLLaMA ·

llama.cpp PR #23269 lands MTP (Multi-Token Prediction) improvements, giving local LLM runners better speculative decoding performance.

Categories: OSS & Tools

Excerpt

[https://github.com/ggml-org/llama.cpp/pull/23269](https://github.com/ggml-org/llama.cpp/pull/23269)
