fix(model-metadata): guard against models.dev underreports at step 5

The resolution order in get_model_context_length() trusts the models.dev
lookup at step 5 over the curated DEFAULT_CONTEXT_LENGTHS table at step 8.
When models.dev reports a stale or incorrectly low value, the correct
curated default is never reached.

Concrete case: models.dev reports `minimax-m3-free` on the opencode
provider as 200K context, but `DEFAULT_CONTEXT_LENGTHS['minimax-m3']` is
1M. Without this guard, the agent's effective context window is 5x too
small and Hermes auto-compresses at 72% of 200K (~144K) when the model
actually accepts ~720K of input.

Complements upstream's PR #36726, which drops stale <=204,800 cache
entries for the M3 family at step 1. That PR catches the symptom (stale
cache from pre-catalog builds). This patch catches the root cause: any
future fresh-lookup underreport, not just M3.

Mirrors the existing Kimi guard in the OpenRouter path (step 6 below):
same pattern, generalised to any curated-vs-live drift.

Adds the `_curated_context_length` helper, which mirrors the
longest-key-first substring match used by step 8 so the guard can compare
apples to apples.

Tests:
- 3 helper tests (M3 family -> 1M, M2.5 -> 204,800, unknown -> None)
- 3 step-5 guard tests (underreport rejected, larger value accepted,
  equal value accepted)
- 1 end-to-end live-resolution test for the original bug

Did NOT add a generic step-1 cache invalidation: cached values are
persistent user data, and a `curated > cached` heuristic cannot reliably
distinguish a known underreport from a legitimate provider-specific cap
(Codex gpt-5.5 is 272K vs curated 1.05M; Nous qwen3.6-plus is cached at
1M vs curated 1,048,576). The fresh-lookup guard at step 5 covers the
in-the-wild underreport case without false-positive risk.