Skymizer Taiwan Inc. Unveils Breakthrough Architecture Enabling Ultra-Large LLM Inference on a Single Card
Skymizer's HTX301 PCIe card with 6 chips and 384GB memory enables 700B-parameter LLM inference at ~240W by offloading the decode phase from GPUs, splitting prefill/decode across hardware.
[Source](https://en.prnasia.com/releases/apac/skymizer-taiwan-inc-unveils-breakthrough-architecture-enabling-ultra-large-llm-inference-on-a-single-card-530405.shtml)
Article excerpt:
>With a single PCIe card — powered by six HTX301 chips and 384 GB of memory — enterprises can now run 700B-parameter model inference locally at just \~240W per card.
The card targets decode, the memory-bandwidth-intensive token generation that dominates real-world inference latency. Existing GPUs handle compute-dense prefill; HTX301 cards handle decode, so each piece of silicon is matched to its phase.
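
To make the "compute-dense prefill" versus "memory-bandwidth-intensive decode" distinction concrete, here is a rough back-of-the-envelope sketch. It assumes a dense model in which every weight is read once per forward pass and contributes one multiply-add (2 FLOPs) per token. The 4-bit weight size and the 1 TB/s bandwidth are hypothetical placeholders, not published HTX301 specifications.

```python
def flops_per_weight_byte(tokens_in_pass: int, bytes_per_weight: float) -> float:
    """Arithmetic intensity of one forward pass over a dense model.

    Each weight is read once and used for one multiply-add (2 FLOPs)
    per token processed in that pass.
    """
    return 2.0 * tokens_in_pass / bytes_per_weight


def decode_tokens_per_second(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode speed when weight reads dominate.

    Ignores KV-cache traffic, batching, and any sparsity, so real numbers
    will differ.
    """
    return bandwidth_bytes_per_s / weight_bytes


BYTES_PER_WEIGHT = 0.5  # assumption: 4-bit quantized weights

# Prefill pushes the whole prompt through per weight read; decode pushes one token.
prefill_intensity = flops_per_weight_byte(tokens_in_pass=2048, bytes_per_weight=BYTES_PER_WEIGHT)
decode_intensity = flops_per_weight_byte(tokens_in_pass=1, bytes_per_weight=BYTES_PER_WEIGHT)
print(f"prefill: {prefill_intensity:.0f} FLOPs/byte, decode: {decode_intensity:.0f} FLOPs/byte")

# Hypothetical bandwidth figure, chosen only to illustrate the calculation.
ceiling = decode_tokens_per_second(weight_bytes=700e9 * BYTES_PER_WEIGHT,
                                   bandwidth_bytes_per_s=1.0e12)
print(f"decode ceiling at 1 TB/s: {ceiling:.2f} tokens/s")
```

Under these assumptions a 2048-token prefill does thousands of times more arithmetic per byte of weights than a single decode step, which is why decode speed tracks memory bandwidth rather than raw compute. Mixture-of-experts models read only a fraction of their weights per token, so their ceiling would be higher than this dense estimate.
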
This is an interesting approach.
As described, the GPU handles only the prefill stage, while everything else, including the model weights and decoding, lives on this card. That way, you can run models with hundreds of billions of parameters without chasing graphics cards with massive VRAM.
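
The capacity claim can be sanity-checked with simple arithmetic. The sketch below assumes the 384 GB figure is decimal gigabytes and that the weights are the only thing stored on the card; the `reserved_fraction` parameter is a hypothetical knob for KV cache and activations, whose real size the announcement does not state.

```python
def max_bits_per_weight(memory_bytes: float, n_params: float,
                        reserved_fraction: float = 0.0) -> float:
    """Largest average weight width (in bits) that fits in the given memory.

    reserved_fraction is the share of memory set aside for anything other
    than weights (hypothetical; not a published figure).
    """
    usable_bits = memory_bytes * (1.0 - reserved_fraction) * 8
    return usable_bits / n_params


def weights_fit(n_params: float, bits_per_weight: float, memory_bytes: float) -> bool:
    """True if the weights alone fit in memory at the given precision."""
    return n_params * bits_per_weight / 8 <= memory_bytes


CARD_MEMORY = 384e9  # 384 GB, assumed decimal
N_PARAMS = 700e9     # 700B parameters

print(f"max average width: {max_bits_per_weight(CARD_MEMORY, N_PARAMS):.2f} bits/weight")
print(f"with 10% reserved: {max_bits_per_weight(CARD_MEMORY, N_PARAMS, 0.10):.2f} bits/weight")
print(f"fits at 16-bit: {weights_fit(N_PARAMS, 16, CARD_MEMORY)}")
print(f"fits at 4-bit:  {weights_fit(N_PARAMS, 4, CARD_MEMORY)}")
```

On these assumptions the weights must average under about 4.4 bits each, so the 700B figure implies roughly 4-bit quantization (or a comparably compact format) rather than 16-bit weights, which would need about 1.4 TB.
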
As for how the actual product will perform in real life, we'll have to wait until early June at Computex to find out.
Read at source: https://www.reddit.com/r/LocalLLaMA/comments/1sx2vxp/skymizer_taiwan_inc_unveils_breakthrough/
Discussions
- reddit · 100 points · 30 comments