Model Comparison
One request. Multiple models. A clear decision -- in seconds instead of weeks.
Four comparison dimensions
Answer quality
See side by side which model delivers the best tone and depth with the fewest errors.
Latency
Time-to-first-token and total response time -- per model, per request.
Cost
Per-request cost in euros -- transparent and auditable for your budget.
Token usage
Input and output tokens per model, so you can find the right trade-off.
How model comparison works
Five steps from activation to default adoption -- all models in parallel, transparently documented.
- 1
Activate 'Compare' mode in the chat
Switch to 'Compare' in the chat window. Up to four model slots open, each pre-set with a different model.
- 2
Pick models or accept defaults
Suggested defaults are typically Claude Opus, GPT-5.x, Gemini 2.5 and Mistral Large -- you can override any slot with any of the 100+ EU-hosted models.
- 3
Enter the request once
One prompt -- all models receive the same input. Attachments (image, PDF) are sent to each model that supports the modality.
- 4
Four answers streaming in parallel
Answers appear token by token, side by side. Latency, cost and token usage are measured live and shown at the end (see the sketch after this list).
- 5
Adopt the winner as team default
One click sets the winning model as the team default. Other users automatically benefit.
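Conceptually, comparison mode is a simple fan-out: the same prompt goes to every selected slot at once, and each slot records its own metrics while streaming. The sketch below illustrates this idea in TypeScript; the streaming function, field names and signatures are assumptions for illustration, not the platform's actual SDK.

```typescript
// A minimal sketch of the fan-out behind comparison mode, assuming a
// hypothetical streaming chat function (StreamChat); names are illustrative.
type Chunk = { token: string; usage?: { input: number; output: number } };
type StreamChat = (model: string, prompt: string) => AsyncIterable<Chunk>;

type SlotResult = {
  model: string;
  answer: string;
  firstTokenMs: number | null; // time-to-first-token
  totalMs: number;             // total response time
  inputTokens: number;
  outputTokens: number;
};

async function runSlot(streamChat: StreamChat, model: string, prompt: string): Promise<SlotResult> {
  const start = Date.now();
  let firstTokenMs: number | null = null;
  let answer = "";
  let inputTokens = 0;
  let outputTokens = 0;

  for await (const chunk of streamChat(model, prompt)) {
    if (firstTokenMs === null) firstTokenMs = Date.now() - start; // first token arrived
    answer += chunk.token;
    if (chunk.usage) {
      inputTokens = chunk.usage.input;
      outputTokens = chunk.usage.output;
    }
  }
  return { model, answer, firstTokenMs, totalMs: Date.now() - start, inputTokens, outputTokens };
}

// One prompt, up to four models, all running in parallel:
// the comparison finishes when the slowest model finishes.
async function compare(streamChat: StreamChat, prompt: string, models: string[]): Promise<SlotResult[]> {
  return Promise.all(models.slice(0, 4).map((m) => runSlot(streamChat, m, prompt)));
}
```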
When is a model comparison worthwhile?
Four concrete use cases where side-by-side comparison saves hours of research.
Picking a model for a team
Before committing to 'we use model X for marketing copy', actually test four candidates instead of guessing.
Quality assessment of new models
When a new model launches (e.g. Claude Opus 5), pit it directly against the predecessor and rivals.
Cost optimisation
Check whether a cheaper model (e.g. Claude Haiku) is sufficient for your tasks -- often 50 percent cost savings at comparable quality.
Troubleshooting bad answers
Is the default model having a bad day? Comparison instantly shows whether the issue is the model or the prompt.
Frequently asked questions on model comparison
Answers on slots, cost, models and export options.
How many AI models can I compare in parallel?
Up to four models simultaneously. In comparison mode, answers are streamed in parallel -- each answer with its own metrics (tokens, latency, cost).
Which models are available for comparison?
All 100+ EU-hosted models of the platform: Claude (Sonnet, Opus, Haiku), GPT family, Google Gemini, Mistral, Meta Llama, Cohere, DeepSeek, xAI Grok. Freely selectable per slot.
Does a comparison cost more than a normal request?
Yes -- each model bills its own tokens. With 4 models a comparison costs roughly 4x as much as a single request, depending on the model. Costs are shown transparently per model.
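As a rough illustration of how the per-slot costs add up, here is a small TypeScript sketch. The price table and model names are made-up examples, not the platform's actual rates.

```typescript
// Illustrative cost arithmetic for one comparison; prices are hypothetical.
type Usage = { model: string; inputTokens: number; outputTokens: number };

// Made-up price table in euros per million tokens.
const pricePerMillion: Record<string, { input: number; output: number }> = {
  "model-a": { input: 3.0, output: 15.0 },
  "model-b": { input: 1.0, output: 4.0 },
};

function slotCostEuro(u: Usage): number {
  const p = pricePerMillion[u.model];
  return (u.inputTokens / 1_000_000) * p.input + (u.outputTokens / 1_000_000) * p.output;
}

// The comparison's total cost is the sum of the per-slot costs, which is why
// four models cost roughly four times a single request.
function comparisonCostEuro(usages: Usage[]): number {
  return usages.reduce((sum, u) => sum + slotCostEuro(u), 0);
}
```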
When is a model comparison worthwhile?
For important decisions such as 'Which model do we set as the default?' or 'Is the more expensive model worth it?', for quality audits, or when a new model launches.
How long does a comparison of four models take?
About as long as the slowest model. Since all models run in parallel, there is no sequential bottleneck -- typical response times are 2 to 15 seconds depending on task complexity.
Can I export comparison results?
Yes. Every comparison lands in the audit trail with the prompt, all four answers, metrics and the models used. CSV or PDF export is available.
Are the requests truly sent identically to all models?
Yes -- same prompt, same system message, same attachments, same temperature (if set). Model-specific defaults are mapped to a common baseline for fairness.
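One way to picture this is a shared baseline request that every slot copies unchanged, with only the model differing. The following sketch is an assumption about how such a mapping could look; the field names are illustrative, not the platform's actual schema.

```typescript
// Sketch of mapping one shared baseline request onto each comparison slot.
type BaselineRequest = {
  prompt: string;
  system?: string;
  attachments?: string[]; // e.g. image or PDF references
  temperature?: number;   // only forwarded if the user set it
};

type SlotRequest = BaselineRequest & { model: string };

function buildSlotRequests(baseline: BaselineRequest, models: string[]): SlotRequest[] {
  // Every slot receives an identical copy of the baseline; only the model differs.
  return models.map((model) => ({ ...baseline, model }));
}
```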
Can I adopt the winning model as a team default?
Yes, with one click. The company admin can set the default model per use case for each team -- directly from the comparison result.
Start Model Comparison
100+ EU models, all directly comparable -- no extra account needed.
Start Free Trial