Distributed Algorithm Examples

Google's DiffusionGemma generates 256 tokens in parallel and self-corrects as it goes

Google's open-source diffusion language model generates 256 tokens in parallel and self-corrects, hitting 4x speed on one GPU ...

Latency, bandwidth, and fidelity all matter when you’re chasing milliseconds.

21h

Google says that DiffusionGemma can generate more than 1,000 tokens per second when running on a single H100, a server-grade ...

Some results have been hidden because they may be inaccessible to you