AMD Hipfire - a new inference engine optimized for AMD GPUs

· r/LocalLLaMA ·

Hipfire is a new open-source inference engine targeting AMD GPUs that uses a custom mq4 quantization method; its creator is releasing pre-quantized models on Hugging Face, and community benchmarks report significant speedups.

Categories: OSS & Tools

Excerpt

Came across hipfire the other day. It's a brand new inference engine focused on all AMD GPUs (not just the latest). [GitHub.](https://github.com/Kaden-Schutt/hipfire) It uses a special mq4 quantization method. The hipfire creator is pumping out [models on Hugging Face.](https://huggingface.co/schuttdev) I don't know enough about quantization to know how good these quants are in terms of quality, but as an RDNA3 aficionado I'm happy AMD is getting some attention. [Localmaxxing](https://www.localmaxxing.com/) is a new LLM benchmarking site, and it shows some pretty dramatic speedups for hipfire inference. Edit: I should have just said hipfire - I don't think this is connected to AMD officially.
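The post doesn't describe how mq4 actually works, so nothing below should be read as hipfire's real algorithm. For readers unfamiliar with what a 4-bit quant scheme generally does, here is a minimal sketch of *generic* symmetric 4-bit group quantization (the broad family that formats like this usually belong to): weights are split into fixed-size groups, each group stores one float scale, and each weight is stored as a 4-bit integer in [-8, 7]. All names here (`quantize_q4`, the group size of 32) are illustrative assumptions, not mq4 internals.

```python
import numpy as np

def quantize_q4(weights: np.ndarray, group_size: int = 32):
    """Generic symmetric 4-bit group quantization sketch (NOT mq4):
    each group of `group_size` weights shares one float scale, and each
    weight becomes a signed 4-bit integer in [-8, 7]."""
    w = weights.reshape(-1, group_size)
    # Scale so the largest-magnitude weight in the group maps to +/-7.
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid divide-by-zero in all-zero groups
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_q4(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover approximate float weights from 4-bit codes and scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
weights = rng.standard_normal(256).astype(np.float32)
q, s = quantize_q4(weights)
restored = dequantize_q4(q, s)
# Round-to-nearest bounds per-weight error by half a quantization step.
print(float(np.abs(weights - restored).max()))
```

Real formats add tricks on top of this (asymmetric zero-points, per-group minimums, importance-weighted rounding), and quality differences between quant methods come largely from those details, which is exactly what's unknown about mq4 here.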

Discussions