Time to update llama.cpp to get some MTP improvements!
llama.cpp PR #23269 lands MTP (Multi-Token Prediction) improvements, giving local LLM runners better speculative decoding performance.
Excerpt
[https://github.com/ggml-org/llama.cpp/pull/23269](https://github.com/ggml-org/llama.cpp/pull/23269)
Read at source: https://www.reddit.com/r/LocalLLaMA/comments/1thlmsx/time_to_update_llamacpp_to_get_som_mtp/
Discussions
- reddit · 100 points · 72 comments
- reddit · 101 points · 73 comments
- reddit · 101 points · 75 comments
- reddit · 100 points · 75 comments
- reddit · 106 points · 75 comments
- reddit · 105 points · 76 comments
- reddit · 107 points · 76 comments
- reddit · 108 points · 76 comments
- reddit · 110 points · 76 comments
- reddit · 112 points · 76 comments
- reddit · 115 points · 76 comments
- reddit · 118 points · 76 comments
- reddit · 119 points · 77 comments
- reddit · 121 points · 78 comments
- reddit · 121 points · 79 comments
- reddit · 121 points · 80 comments
- reddit · 123 points · 80 comments
- reddit · 120 points · 80 comments