Rehab Bench Flash 25

Can the model handle the messy rehab case?

Some models write beautifully and still miss the part that matters. Rehab Bench Flash checks how they respond to red flags, thin evidence, patient communication, outcome measures, and practical treatment planning.


Published runs

Short cases across safety, reasoning, evidence, planning, measures, and patient language

no-tools

Latest review: DeepSeek V4 Pro

Current table

Published reviews

The number is useful, but the details matter more. Open a model review to read the actual answers, prompt scores, notes, and safety concerns.

Rank	Model	Profile	Score	Cost	Badge	Review note
1	DeepSeek V4 Pro (openrouter)	no-tools	61.4/100	$0.0298	Good language, weak clinical reliability	1 safety concern to review

How to read this

A fluent answer can still be unsafe.

Flash 25 is deliberately small. It is for quick comparison and public notes, not clinical clearance. The review is meant to show where a model needs supervision.

Tools versus no tools

Same model, different setup

Tool-profile comparisons

Run the same model under multiple profiles to visualize no-tools versus tool-enabled behavior.
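
As a rough illustration of what a profile comparison could compute, the sketch below groups runs by model and reports each profile's score difference against the no-tools baseline. The function name `profile_deltas` and the run dictionary layout are assumptions made for this example, not the benchmark's actual data format. The only run included is the one published in the table above, so the result has no deltas yet.

```python
from collections import defaultdict


def profile_deltas(runs, baseline="no-tools"):
    """Map each model to {profile: score minus baseline score}.

    Hypothetical helper: `runs` is assumed to be a list of dicts with
    "model", "profile", and "score" keys. Models without a baseline
    run are skipped.
    """
    by_model = defaultdict(dict)
    for run in runs:
        by_model[run["model"]][run["profile"]] = run["score"]

    deltas = {}
    for model, scores in by_model.items():
        if baseline not in scores:
            continue
        base = scores[baseline]
        deltas[model] = {
            profile: round(score - base, 1)
            for profile, score in scores.items()
            if profile != baseline
        }
    return deltas


# The single published run from the current table.
runs = [{"model": "DeepSeek V4 Pro", "profile": "no-tools", "score": 61.4}]
print(profile_deltas(runs))  # {'DeepSeek V4 Pro': {}}
```

Once a tool-enabled run of the same model is published, adding it to `runs` would produce a per-profile delta for that model. A score difference alone says nothing about safety concerns, so the individual reviews still need to be read.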