Rehab Bench Flash 25
Can the model handle the messy rehab case?
Some models write beautifully and still miss the part that matters. Rehab Bench Flash checks how they handle red flags, thin evidence, patient communication, outcome measures, and practical treatment planning.
1 published run
25 short cases across safety, reasoning, evidence, planning, measures, and patient language
Latest review: DeepSeek V4 Pro (no-tools)
Current table
Published reviews
The score is useful, but the details matter more. Open a model review to read the actual answers, per-prompt scores, notes, and safety concerns.
| Rank | Model | Profile | Score | Cost | Badge | Review note |
|---|---|---|---|---|---|---|
| 1 | DeepSeek V4 Pro (OpenRouter) | no-tools | 61.4/100 | $0.0298 | Good language, weak clinical reliability | 1 safety concern to review |
How to read this
A fluent answer can still be unsafe.
Flash 25 is deliberately small. It is for quick comparison and public notes, not clinical clearance. The review is meant to show where a model needs supervision.
Tools versus no tools
Same model, different setup
Tool-profile comparisons
Run the same model under multiple profiles to visualize no-tools versus tool-enabled behavior.
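One way to read a profile comparison is as a score difference against the no-tools baseline for the same model. The sketch below is illustrative only: the record fields (`model`, `profile`, `score`), the `tools` profile name, the `profile_deltas` helper, and the scores themselves are placeholders invented for the example, not the benchmark's schema or published results.

```python
from collections import defaultdict

# Hypothetical run records. Field names and scores are placeholders,
# not real Rehab Bench results.
runs = [
    {"model": "example-model", "profile": "no-tools", "score": 60.0},
    {"model": "example-model", "profile": "tools", "score": 70.0},
]


def profile_deltas(runs, baseline="no-tools"):
    """Return {model: {profile: score minus baseline score}}.

    Models without a baseline run are skipped, since there is
    nothing to compare them against.
    """
    by_model = defaultdict(dict)
    for run in runs:
        by_model[run["model"]][run["profile"]] = run["score"]

    deltas = {}
    for model, scores in by_model.items():
        if baseline not in scores:
            continue
        base = scores[baseline]
        deltas[model] = {
            profile: round(score - base, 1)
            for profile, score in scores.items()
            if profile != baseline
        }
    return deltas


print(profile_deltas(runs))
```

A positive delta only says the tool-enabled run scored higher overall. It does not say the answers were safer, so the per-prompt notes still need reading.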