Xiaomi claims MiMo-V2.5-Pro-UltraSpeed tops 1K tokens/second, a first at the 1T-parameter scale, using a standard 8-GPU commodity node; API trial starts June 9 (Jose Antonio Lanz/Decrypt)
Xiaomi will open an API trial on June 9 for MiMo-V2.5-Pro-UltraSpeed, a 1T-parameter model that it claims exceeds 1,000 tokens/second on a standard 8-GPU commodity node.
Excerpt
<p>Jose Antonio Lanz / <a href="https://decrypt.co/">Decrypt</a>:<br />
<span style="font-size: 1.3em;"><b><a href="https://decrypt.co/370449/xiaomi-mimo-ultraspeed-ai-model-faster-chatgpt-claude">Xiaomi claims MiMo-V2.5-Pro-UltraSpeed tops 1K tokens/second, a first at the 1T-parameter scale, using a standard 8-GPU commodity node; API trial starts June 9</a></b></span> — Most people know Xiaomi as the Chinese phone brand. The one that makes cheap electric scooters and air purifiers.</p>
Read at source: https://www.techmeme.com/260608/p63#a260608p63