Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon

Hypura is a storage-tier-aware LLM inference scheduler for Apple Silicon that optimizes memory management across unified memory tiers.

Categories: OSS & Tools

Excerpt

HN · 221 points · 86 comments

Discussions