Skip to content

federation: split controlplane hook into pre- and post-deploy#3847

Open
vakwetu wants to merge 1 commit intoopenstack-k8s-operators:mainfrom
vakwetu:fix_fed_hook
Open

federation: split controlplane hook into pre- and post-deploy#3847
vakwetu wants to merge 1 commit intoopenstack-k8s-operators:mainfrom
vakwetu:fix_fed_hook

Conversation

@vakwetu
Copy link
Copy Markdown
Contributor

@vakwetu vakwetu commented Apr 13, 2026

Recent changes were made to hook_controlplane_config.yml to accomodate adding federation to the SKMO deployment. This deployment differs though, in that the control plane already exists in the SKMO case. The changes broke the existing component job.

The cases are sufficiently different that its just cleaner to just use two different playbooks. That way, we do not have to worry about getting different conditionals right. This PR just does this - reverting the existing playbook to its original state (for the component job) and adding a new one for the SKMO case.

split into two task files:

  • hook_controlplane_config.yml (pre-deploy / component pipeline, unchanged)
  • hook_controlplane_config_postdeploy.yml (post-deploy / SKMO pipeline)

Add a new playbook federation-controlplane-config-postdeploy.yml that wraps the new post-deploy task file.

The architecture/automation/vars/multi-namespace-skmo.yaml is updated separately to call the new post-deploy playbook.

Fix variable names in both task files to use the required cifmw_ prefix so they pass the var-naming[pattern] rule enforced by ansible-lint.

Related-To: OSPCIX-1321

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign evallesp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

vakwetu added a commit to vakwetu/architecture that referenced this pull request Apr 13, 2026
Switch the Stage 5 post_stage_run from federation-controlplane-config.yml
to federation-controlplane-config-postdeploy.yml.

The post-deploy hook is designed for pipelines where the
OpenStackControlPlane CR already exists.  It reads the live OSCP to
determine the CA bundle secret name, preserves any existing secret data,
and merges in the Keycloak CA cert before patching caBundleSecretName
when not already set.

The original federation-controlplane-config.yml (pre-deploy hook) is
intended for the component pipeline where the OSCP does not yet exist
at hook execution time.

Relates-to: OSPCIX-1321
Depends-On: openstack-k8s-operators/ci-framework#3847
Signed-off-by: Ade Lee <alee@redhat.com>
Co-authored-by: Claude Sonnet 4.6 <no-reply@anthropic.com>
@vakwetu vakwetu requested review from abays, evallesp and xek April 14, 2026 01:50
Copy link
Copy Markdown
Contributor

@xek xek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice split — separating the pre-deploy and post-deploy cases makes this much cleaner.

One concern: hook_controlplane_config_postdeploy.yml Step 5 still uses kubernetes.core.k8s state: patched for the OpenStackControlPlane CR. This is the same pattern that was hitting the "Failed to find exact match for core.openstack.org/v1beta1.OpenStackControlPlane" bug in the component pipeline. The SKMO/architecture pipeline may run a different kubernetes Python client version so it might work there, but it's the same underlying resource-discovery issue (kubernetes-client/python#1536, kubernetes.core#553 — both wontfix upstream).

If the SKMO pipeline is hitting the same error, the fix would be to replace Step 5 with oc patch using the plural resource name:

- name: Patch OpenStackControlPlane to set caBundleSecretName (when unset)
  when: not cifmw_federation_oscp_has_ca_bundle | bool
  ansible.builtin.command:
    cmd: >-
      oc patch openstackcontrolplanes.core.openstack.org controlplane
      -n {{ cifmw_federation_run_osp_cmd_namespace }}
      --type merge
      -p '{"spec":{"tls":{"caBundleSecretName":"{{ cifmw_federation_ca_bundle_secret_name }}"}}}'
  environment:
    KUBECONFIG: "{{ cifmw_openshift_kubeconfig }}"
    PATH: "{{ cifmw_path }}"
  changed_when: true

Key detail: oc patch needs the plural name openstackcontrolplanes (not singular openstackcontrolplane). The CRD defines plural: openstackcontrolplanes, singular: openstackcontrolplane, shortNames: [osctlplane, oscp].

Also — the postdeploy file is missing a Step 7 that patches the OSCP with the Keystone federation config (httpdCustomization, customServiceConfig). The original had this as Step 8. Without it, the SKMO pipeline won't actually configure Keystone for OIDC.

For the pre-deploy file: this looks correct — the component pipeline doesn't need to patch the OSCP directly since kustomize_deploy handles it.

The pre-deploy fix should unblock OSPCIX-1321 for the component pipeline. Our PR #3845 has the same fix for the pre-deploy path (plus the IPA kube-rbac-proxy image fix for the LDAP job) — happy to close ours in favor of this one if you want, or we can keep the IPA fix separate.

Copy link
Copy Markdown
Contributor

@xek xek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction on my earlier comment — you're right that the root cause is the execution flow, not the kubernetes.core.k8s resource discovery or plural/singular naming.

In the component pipeline the hook runs at pre_deploy stage. Before 398d55b, this only wrote kustomization files and secrets — kustomize_deploy would apply them later when creating the controlplane. After 398d55b, Steps 5 and 8 tried to directly patch the OSCP, but the CRD isn't even registered with the API server at pre_deploy time. The k8s_info returns ok with an empty resource list, _federation_oscp_has_ca_bundle resolves to False, and the patch task fires against a non-existent resource type.

The split into pre-deploy (kustomization-only) and post-deploy (direct patching for SKMO) is the right fix. state: patched in the postdeploy file should work fine since in the SKMO pipeline the OSCP will actually be running by then.

LGTM for the pre-deploy revert. Our PR #3845 can be narrowed down to just the IPA kube-rbac-proxy image fix (separate issue from the federation breakage).

@vakwetu vakwetu force-pushed the fix_fed_hook branch 2 times, most recently from c163ce6 to d2313eb Compare April 15, 2026 00:22
@vakwetu
Copy link
Copy Markdown
Contributor Author

vakwetu commented Apr 15, 2026

tested - this woks both for SKMO and the keystone component jobs.

Comment thread roles/federation/tasks/hook_controlplane_config.yml
Copy link
Copy Markdown
Contributor

@evallesp evallesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some problems reviewing this.

I see that we have mismatch cert name inside the secret.
ALso the way the secret name in the post is calculated, is make me hard to find if it boths run in the same pipeline the post adapts to the name created in the pre (by reading OSCP).
But if both are running in different pipelines, then it has different naming but we don't mind.

I think the certificate name inside the secret should be the same to avoid duplication, but I might be wrong.

type: Opaque
metadata:
name: keystone-httpd-override
name: keycloakca
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(non-blocking) question: Should this match keycloak-ca.crt ?
Here it creates: data: {KeyCloakCA: }
And in the post it merges at:
data: >-
{{
cifmw_federation_ca_bundle_existing_data |
combine({'keycloak-ca.crt': cifmw_federation_sso_ca.content})
}}

so: data: {KeyCloakCA: , keycloak-ca.crt: } ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't have to match. The way this works is you define a secret that contains all your custom ca certs. In this case, that secret would be keycloakca.

libcommon will read that secret and iterate over all the fields and extract the contents and add those to a chain of ca certs. The field names do not matter at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evallesp - the two playbooks run in different pipelines. One runs in the keystone federation component job and the other runs in the SKMO uni-job. There would be no reason to run this twice.

The pre-config job just reverts to what was there before so that we can be sure that the pre-existing component job just runs as before.

The post-config job is the one with newer, more robust code - as it needs to be because the control plane already exists for this case.

@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4f67b41cf27e4774a52daef7851faa70

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 14m 12s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 28m 15s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 38m 57s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 01m 50s
cifmw-pod-zuul-files TIMED_OUT in 30m 59s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 43s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 24s
✔️ cifmw-molecule-federation SUCCESS in 1m 34s

The existing hook_controlplane_config.yml assumed the
OpenStackControlPlane (OSCP) was not yet present, which is correct for
the component / CRC pipeline.  The SKMO integration pipeline runs after
kustomize_deploy has already brought up the OSCP, so it needs a
different hook that reads the live OSCP, preserves existing CA bundle
data, and patches the resource in place.

Split into two task files:
  - hook_controlplane_config.yml  (pre-deploy / component pipeline, unchanged)
  - hook_controlplane_config_postdeploy.yml  (post-deploy / SKMO pipeline)

Add a new playbook federation-controlplane-config-postdeploy.yml that
wraps the new post-deploy task file.

The architecture/automation/vars/multi-namespace-skmo.yaml is updated
separately to call the new post-deploy playbook.

Fix variable names in both task files to use the required cifmw_ prefix
so they pass the var-naming[pattern] rule enforced by ansible-lint.

Relates-To: OSPCIX-1321
Signed-off-by: Ade Lee <alee@redhat.com>
Co-authored-by: Claude Sonnet 4.6 <no-reply@anthropic.com>
Made-with: Cursor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants