Benchmark
Compare models on real hardware
Run one prompt across multiple open-source models and hardware profiles to compare quality, latency, cost, and memory side by side.
Compare before you commit
Run the same prompt across several open-source models and hardware profiles to compare quality, latency, cost, and tokens/sec side by side — then deploy the winning setup as an API.
How it works →
Prompt
Models
Hardware profiles
Leave all hardware profiles unselected to benchmark each model on its default profile.
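As a rough illustration of the selection rule above, the sketch below builds the list of (model, hardware profile) runs a benchmark might cover. It is not the product's implementation: the model and profile names, the DEFAULT_PROFILE mapping, and both function names are hypothetical, and pairing every selected model with every selected profile is an assumption about how multiple selections combine.

```python
from itertools import product

# Hypothetical default profile per model; these names are illustrative only.
DEFAULT_PROFILE = {"model-a": "gpu-small", "model-b": "gpu-large"}


def build_run_matrix(models, profiles, defaults=DEFAULT_PROFILE):
    """Return the (model, profile) pairs a benchmark would cover."""
    if not profiles:
        # No profile selected: each model runs on its own default profile.
        return [(model, defaults[model]) for model in models]
    # Assumption: every selected model is paired with every selected profile.
    return list(product(models, profiles))


def tokens_per_second(output_tokens, latency_seconds):
    """Throughput for a single run, derived from token count and latency."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return output_tokens / latency_seconds
```

Under these assumptions, two models with no profile selected produce two runs, while one model with two selected profiles also produces two runs, one per profile.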
Latest run
Run a benchmark to compare results side by side.
No benchmark runs yet
Pick a prompt, choose a few models and hardware profiles, then run a benchmark to see a full comparison.
History
Your previous benchmark runs.