MTP support merged into llama.cpp

r/LocalLLaMA

Multi-Token Prediction (MTP) support has been merged into the llama.cpp master branch, adding a new inference technique to the popular open-source LLM inference framework.

Categories: OSS & Tools

Excerpt

PR [22673](https://github.com/ggml-org/llama.cpp/pull/22673) has been merged into master! 🎉
