Luce DFlash: Qwen3.6-27B at up to 2x throughput on a single RTX 3090
Luce DFlash is an inference acceleration tool that achieves up to 2x throughput when running Qwen3.6-27B on a single RTX 3090. It is aimed at local LLM enthusiasts on consumer hardware.
Excerpt
r/LocalLLaMA · 107 points · 23 comments · i.redd.it
Read at source: https://i.redd.it/ppdt7ixx9rxg1.png
Discussions
- reddit · 107 points · 23 comments
- reddit · 166 points · 42 comments
- reddit · 213 points · 57 comments
- reddit · 262 points · 78 comments
- reddit · 306 points · 89 comments
- reddit · 333 points · 99 comments
- reddit · 361 points · 105 comments
- reddit · 379 points · 108 comments
- reddit · 391 points · 118 comments
- reddit · 425 points · 125 comments
- reddit · 441 points · 129 comments
- reddit · 449 points · 130 comments
- reddit · 463 points · 130 comments
- reddit · 513 points · 140 comments
- reddit · 527 points · 143 comments
- reddit · 532 points · 143 comments
- reddit · 540 points · 148 comments
- reddit · 544 points · 150 comments
- reddit · 552 points · 152 comments
- reddit · 558 points · 156 comments
- reddit · 569 points · 158 comments
- reddit · 566 points · 158 comments
- reddit · 578 points · 158 comments
- reddit · 587 points · 158 comments
- reddit · 588 points · 158 comments
- reddit · 596 points · 160 comments
- reddit · 607 points · 166 comments
- reddit · 603 points · 168 comments
- reddit · 615 points · 168 comments
- reddit · 615 points · 170 comments
- reddit · 615 points · 171 comments
- reddit · 617 points · 172 comments
- reddit · 624 points · 173 comments
- reddit · 629 points · 173 comments