[v1.x] fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2405
[v1.x] fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2405felixweinberger wants to merge 1 commit intov1.xfrom
Conversation
There was a problem hiding this comment.
LGTM — clean, focused fix with comprehensive test coverage.
Extended reasoning...
Overview
This v1.x backport adds a field_validator to OAuthClientMetadata that coerces empty-string values to None for five optional URL fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri). The diff touches exactly two files: the model definition and its test suite.
Security risks
None. The change only widens accepted input at the deserialization boundary — empty strings that previously raised ValidationError now parse as None. Valid URLs, explicit None, and omitted keys are entirely unaffected. No auth logic, token handling, or permission checks are modified.
Level of scrutiny
Low. This is a mechanical, non-breaking fix that follows an established pattern already present in the same file (normalize_token_type on OAuthToken). The validator is a simple equality check with a two-branch return. The fix is scoped entirely to DCR response parsing via OAuthClientMetadata and its subclass OAuthClientInformationFull.
Other factors
Test coverage is thorough: parametrized per-field coercion, all-fields-together, valid-URL passthrough, inheritance through OAuthClientInformationFull, and a negative case confirming non-empty invalid URLs still raise. The pre-existing gap in OAuthMetadata and ProtectedResourceMetadata flagged in the inline comment is explicitly not introduced by this PR and is worth a follow-up, but does not block merging this correct fix.
| @field_validator( | ||
| "client_uri", | ||
| "logo_uri", | ||
| "tos_uri", | ||
| "policy_uri", | ||
| "jwks_uri", | ||
| mode="before", | ||
| ) | ||
| @classmethod | ||
| def _empty_string_optional_url_to_none(cls, v: object) -> object: | ||
| # RFC 7591 §2 marks these URL fields OPTIONAL. Some authorization servers | ||
| # echo omitted metadata back as "" instead of dropping the keys, which | ||
| # AnyHttpUrl would otherwise reject — throwing away an otherwise valid | ||
| # registration response. Treat "" as absent. | ||
| if v == "": | ||
| return None | ||
| return v | ||
|
|
||
| def validate_scope(self, requested_scope: str | None) -> list[str] | None: | ||
| if requested_scope is None: | ||
| return None |
There was a problem hiding this comment.
🟣 This PR correctly adds empty-string coercion for OAuthClientMetadata, but OAuthMetadata and ProtectedResourceMetadata have the same defect for their optional AnyHttpUrl fields and are left unprotected. This is a pre-existing gap — the PR does not introduce or modify these classes — but the same Postel's law rationale applies equally to them.
Extended reasoning...
What the bug is
The PR adds a field_validator to OAuthClientMetadata that coerces empty strings to None for five optional AnyHttpUrl fields. The same class of defect exists in two sibling models that were not touched by this PR:
OAuthMetadata(RFC 8414 Authorization Server Metadata):registration_endpoint,service_documentation,op_policy_uri,op_tos_uri,revocation_endpoint,introspection_endpoint— all declaredAnyHttpUrl | Noneat lines 135, 142–149, with no empty-string validator.ProtectedResourceMetadata(RFC 9728 Protected Resource Metadata):jwks_uri,resource_documentation,resource_policy_uri,resource_tos_uri— all declaredAnyHttpUrl | Noneat lines 164, 169–171, with no empty-string validator.
How it manifests
If a non-compliant authorization server or resource server echoes any of these optional URL fields as "" (rather than omitting the key), Pydantic's AnyHttpUrl validator rejects the empty string and raises ValidationError. The entire discovery response is discarded — the same failure mode this PR was authored to fix.
Step-by-step proof
- A non-compliant AS returns its metadata with
"op_tos_uri": "". - The client calls
OAuthMetadata.model_validate(data). - Pydantic attempts to coerce
""throughAnyHttpUrl— nofield_validatorintercepts it. AnyHttpUrlraisesValidationErrorbecause an empty string is not a valid URL.- The caller discards the response and OAuth discovery fails entirely.
The same sequence applies to any of the unprotected fields in ProtectedResourceMetadata.
Why existing code does not prevent it
The new validator is defined only on OAuthClientMetadata and is not inherited by OAuthMetadata or ProtectedResourceMetadata (they are not subclasses). Neither of those models defines any field_validator for their URL fields.
Impact
Any MCP client connecting to a non-compliant server that echoes empty-string URL fields in its authorization server metadata or protected resource metadata response will fail OAuth discovery entirely. The behavior is identical to what the PR describes for DCR responses — a valid discovery document is thrown away because of a cosmetic serialization quirk in a single optional field.
How to fix
Add the same _empty_string_optional_url_to_none field_validator (or a shared mixin) to OAuthMetadata and ProtectedResourceMetadata, covering all their optional AnyHttpUrl fields.
Pre-existing status
This is a pre-existing issue that predates this PR. The PR neither introduces these models nor adds new call sites for them — it is scoped entirely to DCR response parsing via OAuthClientMetadata. The gap in coverage is worth addressing as a follow-up given the PR's stated motivation.
Coerce
""toNonefor the five OPTIONAL URL fields onOAuthClientMetadataso that DCR responses from servers that echo omitted metadata as empty strings parse successfully.v1.x backport of the same change targeting
main(#2404).Motivation and Context
RFC 7591 §2 marks
client_uri,logo_uri,tos_uri,policy_uri, andjwks_urias OPTIONAL. Some authorization servers in the wild (observed: Udemy, deepsense.ai, Apify, windsor.ai, firecrawl, and others) echo the client's omitted metadata back as""instead of dropping the keys:{"client_id": "abc123", "client_uri": "", "logo_uri": "", "tos_uri": "", "policy_uri": "", ...}AnyHttpUrlrejects"", sohandle_registration_responseraisesValidationErrorand callers discard an otherwise valid registration — the server returned 201 with a realclient_id.The servers are technically non-compliant (an empty string is not a valid URL), but the only reasonable interpretation of
""for an OPTIONAL field is "absent". This applies Postel's law at the deserialization boundary, matching the existingnormalize_token_typevalidator onOAuthTokenin the same file. BecauseOAuthClientInformationFullinherits fromOAuthClientMetadata, the coercion applies to parsed DCR responses as well.How Has This Been Tested?
New
TestOAuthClientMetadataEmptyUrlCoercioncovers per-field coercion (parametrized over all five fields), all-fields-together, valid-URL passthrough, inheritance throughOAuthClientInformationFull(the DCR-response model), and a negative case confirming non-empty invalid URLs still raise.Breaking Changes
None. This only widens accepted input — previously-rejected
""values now succeed asNone. Valid URLs,None, and omitted keys are unchanged.Types of changes
Checklist
Additional context
RFC 7591 §2: https://datatracker.ietf.org/doc/html/rfc7591#section-2