The exact KV cache usage of DeepSeek V4

· r/LocalLLaMA ·

Technical analysis of DeepSeek V4's KV cache efficiency: V4 Pro uses ~10GiB at 1m context versus V3.2's ~84GiB, roughly an 8x improvement per calculations from the V4 paper and the vLLM implementation.

Categories: Model Releases

Excerpt

Figure 1 of the DSV4 paper seems to imply that DSV3.2 uses ~50GB at 1m context and DSV4 uses ~5GB: [https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf](https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf)

***Numbers updated with the KV cache breakdown from vLLM:*** [https://vllm.ai/blog/deepseek-v4](https://vllm.ai/blog/deepseek-v4)

From my own calculations (reproduced in the sketch below), the correct FP16 KV cache usage should be:

|Model|Params|128k|160k|1m|KV%|
|:-|:-|:-|:-|:-|:-|
|V3/3.1|671B|8.58GiB|10.72GiB|68.63GiB|5.11%|
|V3.2|671B|10.48GiB|13.11GiB|83.88GiB|6.25%|
|V4 Flash|284B|0.84GiB|1.05GiB|6.72GiB|1.18%|
|V4 Pro|1600B|1.20GiB|1.50GiB|9.62GiB|0.3%|

(KV% is the 1m KV cache as a share of the model's FP16 weight size.)

So the KV cache saving is not 9.5x but 7.879x, which is still very impressive. If you look at the KV% metric, we are seeing close to a 20x gain. This basically obliterates all current transformer-SSM hybrid models' KV cache usage. But the transformer-SSM crowd can just use DSV4's CSA and HCA on their transformer layers to catch up.

At this KV cache usage, once DSV4 is supported in llama.cpp, we can easily run 1m context for DSV4 Flash on 256GB RAM and a 3090, or for DSV4 Pro on 1.5TB RAM and an RTX 6000 Blackwell (see the budget check at the end). I suppose the various speed gains mentioned in the paper can make this viable. While DSV4 Pro doesn't do well on Artificial Analysis, we can expect Kimi and Zhipu to make derivatives of it, giving us a beast that uses very little KV cache. All in al…
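A minimal sketch of the MLA KV-cache arithmetic behind the table, assuming each token caches one compressed latent (kv_lora_rank = 512) plus a decoupled RoPE key (64 dims) per layer across V3's 61 layers (values from the public DeepSeek-V3 config); the 128-dim DSA indexer keys for V3.2 are counted at FP16 to match the all-FP16 accounting above, and "1m" is treated as 2^20 tokens. The V4 rows can't be derived here because V4's layer count and per-layer cached dims aren't in the post.

```python
GIB = 2 ** 30

def mla_kv_cache_gib(num_layers: int, latent_dim: int, rope_dim: int,
                     tokens: int, indexer_dim: int = 0,
                     bytes_per_elem: int = 2) -> float:
    """FP16 KV cache in GiB: per token, each layer stores the compressed
    latent plus the decoupled RoPE key (plus optional indexer keys)."""
    per_token_bytes = num_layers * (latent_dim + rope_dim + indexer_dim) * bytes_per_elem
    return per_token_bytes * tokens / GIB

for label, tokens in [("128k", 128 * 1024), ("160k", 160 * 1024), ("1m", 1024 * 1024)]:
    v3 = mla_kv_cache_gib(61, 512, 64, tokens)                    # V3/V3.1
    v32 = mla_kv_cache_gib(61, 512, 64, tokens, indexer_dim=128)  # V3.2 with DSA indexer keys
    print(f"{label}: V3/3.1 = {v3:.3f} GiB, V3.2 = {v32:.3f} GiB")

# 128k: V3/3.1 = 8.578 GiB,  V3.2 = 10.484 GiB   (table: 8.58 / 10.48)
# 160k: V3/3.1 = 10.723 GiB, V3.2 = 13.105 GiB   (table: 10.72 / 13.11)
# 1m:   V3/3.1 = 68.625 GiB, V3.2 = 83.875 GiB   (table: 68.63 / 83.88)
```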

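As a rough sanity check on the local-inference claim, here is a back-of-envelope memory budget, assuming a Q4_K_M-style GGUF quant at roughly 4.8 bits/weight (an assumption; actual GGUF sizes vary with the quant mix) plus the 1m-context KV figures from the table:

```python
GIB = 2 ** 30

def budget_gib(params: float, bits_per_weight: float, kv_gib: float) -> float:
    """Quantized weight footprint plus KV cache, in GiB."""
    return params * bits_per_weight / 8 / GIB + kv_gib

# (params from the post, 1m-context KV cache from the table above)
flash = budget_gib(284e9, 4.8, 6.72)    # -> ~165 GiB vs 256GB RAM + 24GB VRAM (3090)
pro   = budget_gib(1600e9, 4.8, 9.62)   # -> ~904 GiB vs 1.5TB RAM + RTX 6000 Blackwell
print(f"V4 Flash @1m: ~{flash:.0f} GiB, V4 Pro @1m: ~{pro:.0f} GiB")
```

Both totals leave headroom on the quoted hardware, which is why the 1m-context claim looks plausible on paper.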