A deterministic recovery flow for "Gateway tool invoke failed: 500" and other model-switch failures.
If model switching throws a 500 error, this is usually runtime/state drift, not a prompt problem. The goal is to isolate whether the issue is model-specific, session-specific, or runtime-wide.
Users keep retrying random model names in the same stale session. That usually burns time and tokens without proving anything.
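One way to make that isolation explicit is to run two controlled tests in a fresh session and read the result off a small decision table. The sketch below is only an illustration of that reasoning; the function name and the two boolean inputs are assumptions, not part of any gateway API.

```python
def classify_failure(same_model_fresh_session_ok: bool,
                     other_model_fresh_session_ok: bool) -> str:
    """Classify a model-switch failure from two fresh-session tests.

    same_model_fresh_session_ok:  the failing model works in a brand-new session
    other_model_fresh_session_ok: a different, known-good model works in a new session
    """
    if same_model_fresh_session_ok:
        # The model itself is fine, so the old session was carrying stale state.
        return "session-specific"
    if other_model_fresh_session_ok:
        # The runtime can serve other models, so the problem follows this model.
        return "model-specific"
    # Nothing works even in a clean session: suspect the runtime as a whole.
    return "runtime-wide"


print(classify_failure(False, True))
```

Two deliberate tests like these tell you more than a dozen retries in the session that already failed.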
Paste this after switching models:
Before answering, print:
1) active model id/name
2) fallback models list (or "none")
3) current session surface (dashboard / DM / server)
Then write exactly 3 bullets explaining one risk of changing models mid-project.
If the model switch says “success” but this output still shows the old model, treat it as a state-application failure.
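If you run the canary often, the comparison can be scripted. The sketch below assumes the reply echoes each numbered item as "N) label: value" on its own line, which real replies may not do; the function names and the sample model IDs are hypothetical.

```python
import re


def parse_canary(output: str) -> dict:
    """Extract the three numbered canary fields from a reply.

    Assumes each item appears as 'N) label: value' on its own line.
    """
    fields = {}
    for line in output.splitlines():
        match = re.match(r"\s*([123])\)\s*[^:]*:\s*(.+)", line)
        if match:
            fields[int(match.group(1))] = match.group(2).strip()
    return {
        "active_model": fields.get(1),
        "fallbacks": fields.get(2),
        "surface": fields.get(3),
    }


def switch_applied(output: str, expected_model: str) -> bool:
    """True only if the reported active model equals the one requested."""
    return parse_canary(output)["active_model"] == expected_model


# Hypothetical reply: the switch reported success but the old model is still active.
reply = """1) active model id/name: provider-a/model-old
2) fallback models list: none
3) current session surface: dashboard"""

if not switch_applied(reply, "provider-a/model-new"):
    print("state-application failure: old model still active")
```

Keep in mind that a model's self-reported ID is itself only evidence, so cross-check it against whatever status view your runtime provides.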
Model IDs must match the provider’s format exactly. A single wrong prefix can produce a failure that surfaces as a generic 500.
Your primary model may fail while a fallback still answers, which can hide the real error. Test with one explicit model and no fallback first.
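A quick pre-flight check can catch a mistyped ID before it reaches the gateway. This is a minimal sketch that compares the ID against a list you maintain yourself; the IDs shown are placeholders, not real provider model names, and the "provider/model" shape is an assumption.

```python
import difflib

# Placeholder IDs for illustration; replace with the exact strings your provider documents.
KNOWN_MODEL_IDS = [
    "provider-a/model-large",
    "provider-a/model-small",
    "provider-b/chat-v2",
]


def check_model_id(model_id: str, known_ids=KNOWN_MODEL_IDS):
    """Return (ok, suggestions). ok is True only on an exact, case-sensitive match."""
    if model_id in known_ids:
        return True, []
    # Offer near misses so a wrong prefix or dropped hyphen is easy to spot.
    return False, difflib.get_close_matches(model_id, known_ids, n=3, cutoff=0.6)


ok, suggestions = check_model_id("providera/model-large")
print(ok, suggestions)
```

The check is deliberately strict: stray whitespace or a different letter case counts as a mismatch, because that is how a provider is likely to treat it.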
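To take fallbacks out of the picture, build a throwaway test config pinned to a single model. The sketch below uses a hypothetical config shape (a "model" entry with "primary" and "fallbacks" keys); map it onto whatever your gateway's config actually uses.

```python
import copy


def isolate_primary(config: dict, model_id: str) -> dict:
    """Return a copy of config pinned to one model with fallbacks disabled.

    The 'model', 'primary', and 'fallbacks' keys are hypothetical.
    The original config is left untouched so it can be restored afterwards.
    """
    test_config = copy.deepcopy(config)
    test_config["model"] = {"primary": model_id, "fallbacks": []}
    return test_config


current = {
    "model": {"primary": "provider-a/model-large",
              "fallbacks": ["provider-b/chat-v2"]},
    "workspace": "main",
}
test_config = isolate_primary(current, "provider-a/model-large")
print(test_config["model"])
```

With no fallback available, a failure in the primary model shows up as an error instead of a quiet answer from a different model.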
If config changed but runtime wasn’t restarted, model-switch behavior can be inconsistent across sessions.
Dashboard and Discord can run different session contexts. Always verify the model on the same surface where the failure happens.
If a session has a pinned model override, config edits may not apply as expected. Reset the override to default and rerun the canary test in a fresh session.
Model switching can fail when runtime health is already unstable. If you also see slow tool calls, blank responses, or runtime health errors, treat this as runtime instability first and a model-switch issue second.
Users sometimes switch model settings in one account/workspace and test in another. Confirm you are editing and testing in the same workspace before escalating.
If two clean attempts fail with the same 500 error, stop retrying and escalate with evidence. More retries add noise.
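The stop rule is easy to encode so nobody is tempted to try "just once more". In this sketch, try_switch stands in for whatever performs one clean attempt in a fresh session; it and the Attempt and Outcome types are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Attempt:
    ok: bool
    error: str = ""


@dataclass
class Outcome:
    status: str  # "ok", "escalate", or "inconclusive"
    evidence: List[str] = field(default_factory=list)


def run_clean_attempts(try_switch: Callable[[], Attempt],
                       max_attempts: int = 2) -> Outcome:
    """Run at most max_attempts clean attempts, then stop and report."""
    errors: List[str] = []
    for _ in range(max_attempts):
        result = try_switch()
        if result.ok:
            return Outcome("ok", errors)
        errors.append(result.error)
    # Identical errors across clean attempts mean retrying proves nothing new.
    if len(set(errors)) == 1:
        return Outcome("escalate", errors)
    return Outcome("inconclusive", errors)


def always_500() -> Attempt:
    return Attempt(ok=False, error="Gateway tool invoke failed: 500")


outcome = run_clean_attempts(always_500)
print(outcome.status, outcome.evidence)
```

The collected error strings double as the evidence to attach when you escalate.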
Don’t paste API keys or full secrets in support tickets while debugging model issues.
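If you need to attach logs or config snippets, run them through a redaction pass first. The sketch below masks a few common shapes (Bearer tokens and key=value or JSON-style pairs named api_key, token, secret, or password). The patterns are illustrative assumptions and will not catch every secret format, so still read the result before sending it.

```python
import re

# Illustrative patterns only; extend them for the secret formats you actually use.
BEARER = re.compile(r"(?i)(bearer\s+)\S+")
KEY_VALUE = re.compile(
    r'(?i)((?:api[_-]?key|token|secret|password)"?\s*[:=]\s*"?)[^\s",]+'
)


def redact(text: str) -> str:
    """Mask values that look like credentials, keeping the field names readable."""
    text = BEARER.sub(r"\1[REDACTED]", text)
    text = KEY_VALUE.sub(r"\1[REDACTED]", text)
    return text


sample = 'api_key=sk-12345 model=provider-a/model-large\nAuthorization: Bearer abc123def456'
print(redact(sample))
```

Field names are kept on purpose: support can see which setting was present without seeing its value.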