server: optimize account creation by pre-loading the role permissions#11137
server: optimize account creation by pre-loading the role permissions#11137weizhouapache wants to merge 4 commits intoapache:4.20from
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## 4.20 #11137 +/- ##
============================================
+ Coverage 16.15% 16.25% +0.09%
- Complexity 13276 13425 +149
============================================
Files 5657 5663 +6
Lines 497969 500212 +2243
Branches 60389 60744 +355
============================================
+ Hits 80463 81311 +848
- Misses 408542 409816 +1274
- Partials 8964 9085 +121
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
3d62538 to
e5848ac
Compare
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14033 |
|
@blueorangutan test |
|
@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-13682)
|
|
@weizhouapache , what do we need still to get this merged? |
|
@weizhouapache is this relevant for 4.20.3, and what tests to be covered for this? |
|
@DaanHoogland @sureshanaparti if you want to test, maybe check if there is a regression (for example, #5781 should not happen) |
RosiKyu
left a comment
There was a problem hiding this comment.
Test Results
| Test Case | Result |
|---|---|
| TC1: Role Escalation Prevention | FAILED |
| TC2: Performance Optimization | PASSED |
Issue Found
Role escalation prevention (#5879) is broken. A user with a restricted Domain Admin role (with createServiceOffering denied) was able to successfully create an account with the unrestricted default Domain Admin role.
Root Cause: The checkRoleEscalation method appears to be using the wrong caller account. Server logs show:
12:57:40 CIDRs from which account 'Account [{"accountName":"restrictedadmin"...
12:57:46 Checking if user of account admin [75685252-fa97-11f0-877d-1e009d0003da] with role-id [1] can create an account...
@weizhouapache The request was authenticated as restrictedadmin (role-id 11), but the escalation check used admin (role-id 1 - Root Admin), which has all permissions and thus the check passes incorrectly.
TC1: Role Escalation Prevention (Regression Test for #5781)
Objective
Verify that a user with a restricted Domain Admin role (with createServiceOffering denied) cannot create an account with the unrestricted default Domain Admin role.
Test Steps
- Created role "Domain Admin Restricted" by cloning default Domain Admin role (ID:
81466765-10a5-4aba-bf55-ed25f6c994c5) - Updated
createServiceOfferingpermission fromallowtodenyfor the restricted role - Created domain "TestDomain" (ID:
2bfa0be4-7d8e-4859-b40c-ad205f64b89b) - Created account "restrictedadmin" with the restricted role (ID:
cece4a85-fba7-4780-8269-69cb1771bc1e) - Registered API keys for restrictedadmin user
- Switched to restrictedadmin user profile (verified via
list accountsshowing only own account with 408 APIs available) - Attempted to create account "escalateduser" with unrestricted Domain Admin role (ID:
51d853ed-fa97-11f0-877d-1e009d0003da)
Expected Result:
The createAccount command should fail with a permission denied error, preventing role escalation.
Actual Result:
The account "escalateduser" was successfully created with the unrestricted Domain Admin role. Role escalation prevention is NOT working.
Root Cause Analysis:
Server log at 12:57:46 shows:
Checking if user of account admin [75685252-fa97-11f0-877d-1e009d0003da] with role-id [1] can create an account of type escalateduser
The escalation check is incorrectly using the admin account (role-id 1) instead of the actual caller restrictedadmin (role-id 11).
Test Evidence:
- Restricted role created and permission denied:
{
"role": {
"description": "Restricted Domain Admin for testing",
"id": "81466765-10a5-4aba-bf55-ed25f6c994c5",
"name": "Domain Admin Restricted",
"type": "DomainAdmin"
}
}- createServiceOffering permission set to deny:
{
"id": "69374f83-ddfc-4a7a-8658-054e5b3d389f",
"permission": "deny",
"rule": "createServiceOffering"
}- Verified restrictedadmin context (408 APIs, only own account visible):
{
"account": [
{
"name": "restrictedadmin",
"roleid": "81466765-10a5-4aba-bf55-ed25f6c994c5",
"rolename": "Domain Admin Restricted"
}
],
"count": 1
}- Escalation succeeded (BUG):
{
"account": {
"name": "escalateduser",
"roleid": "51d853ed-fa97-11f0-877d-1e009d0003da",
"rolename": "Domain Admin",
"created": "2026-01-26T12:57:46+0000"
}
}- Server log showing wrong caller used in escalation check:
2026-01-26 12:57:40,709 DEBUG CIDRs from which account 'Account [{"accountName":"restrictedadmin"...
2026-01-26 12:57:46,013 DEBUG Checking if user of account admin [75685252-fa97-11f0-877d-1e009d0003da] with role-id [1] can create an account of type escalateduser
Status: FAILED - Role escalation prevention regression
TC2: Performance Optimization for createAccount API
Objective
Verify that the createAccount API completes in less than 1 second after the optimization (previously took ~6.5 seconds).
Test Steps
- Switched to admin profile
- Executed
createAccountcommand to create user "perftest" with User role - Checked management server logs for API timing (START to END timestamps)
Expected Result:
The createAccount API should complete in less than 1 second.
Actual Result:
The API completed in approximately 659 milliseconds.
Test Evidence:
- Account creation command output:
{
"account": {
"name": "perftest",
"roleid": "51d8639c-fa97-11f0-877d-1e009d0003da",
"rolename": "User",
"created": "2026-01-26T12:59:37+0000"
}
}- Server log timing:
2026-01-26 12:59:37,015 DEBUG ===START=== 10.0.35.71 -- POST
2026-01-26 12:59:37,674 DEBUG ===END=== 10.0.35.71 -- POST
Duration: 659ms (12:59:37,674 - 12:59:37,015)
Status: PASSED - Performance optimization working as expected
|
thanks @RosiKyu for the testing |
|
@blueorangutan package |
|
@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16893 |
|
@blueorangutan test |
|
@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian test result (tid-15505)
|
|
@RosiKyu |
There was a problem hiding this comment.
Pull request overview
This PR optimizes the role-escalation prevention path during account creation (introduced in/related to #5879) by pre-loading role permissions once per role, avoiding repeated permission lookups while iterating over API names.
Changes:
- Refactors
AccountManagerImpl.checkRoleEscalationto prefetch role/permission data per checker and reuse it while evaluating API access. - Extends the
APICheckerinterface with optional hooks to fetch role permissions and to check access using preloaded permissions. - Updates
DynamicRoleBasedAPIAccessCheckerto expose role permissions and to support the new “preloaded permissions” access-check overload.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
server/src/main/java/com/cloud/user/AccountManagerImpl.java |
Uses per-role preloaded permissions in role escalation checks, calling into new APIChecker hooks. |
plugins/acl/dynamic-role-based/src/main/java/org/apache/cloudstack/acl/DynamicRoleBasedAPIAccessChecker.java |
Implements the new hooks to return role+permissions and perform access checks without re-querying permissions. |
api/src/main/java/org/apache/cloudstack/acl/APIChecker.java |
Adds default methods for role-permission preloading and an overload for permission checks using preloaded data. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| boolean isEnabled(); | ||
|
|
||
| default Pair<Role, List<RolePermission>> getRolePermissions(long roleId) { return null; } | ||
| default boolean checkAccess(Account account, String commandName, Role accountRole, List<RolePermission> allPermissions) { return false; } |
There was a problem hiding this comment.
APIChecker.checkAccess(Account, String, Role, List<RolePermission>) has a default implementation that returns false. In AccountManagerImpl.checkApiAccess(...) the return value is ignored and only exceptions deny access, so a false default will effectively behave like “allowed” if any implementation ever returns a non-null value from getRolePermissions(...) but forgets to override this new checkAccess(...) overload. Consider changing the default implementation to delegate to the existing checkAccess(Account, String) (or throw an UnsupportedOperationException) so the safe/legacy behavior is preserved by default.
| default boolean checkAccess(Account account, String commandName, Role accountRole, List<RolePermission> allPermissions) { return false; } | |
| default boolean checkAccess(Account account, String commandName, Role accountRole, List<RolePermission> allPermissions) { | |
| return checkAccess(account, commandName); | |
| } |
| default Pair<Role, List<RolePermission>> getRolePermissions(long roleId) { return null; } | ||
| default boolean checkAccess(Account account, String commandName, Role accountRole, List<RolePermission> allPermissions) { return false; } |
There was a problem hiding this comment.
The new checkAccess(Account, String, Role, List<RolePermission>) overload and getRolePermissions(long) are role/ACL-specific, but they were added to the base APIChecker interface (which is also used by non-ACL checkers like rate limiting). To avoid coupling unrelated checkers to role permission concepts, consider moving these new methods to APIAclChecker (or another ACL-specific interface) instead of APIChecker.
| default Pair<Role, List<RolePermission>> getRolePermissions(long roleId) { return null; } | |
| default boolean checkAccess(Account account, String commandName, Role accountRole, List<RolePermission> allPermissions) { return false; } | |
| /** | |
| * ACL-specific extension for API checkers that work with roles and role permissions. | |
| */ | |
| interface APIAclChecker extends APIChecker { | |
| default Pair<Role, List<RolePermission>> getRolePermissions(long roleId) { return null; } | |
| default boolean checkAccess(Account account, String commandName, Role accountRole, List<RolePermission> allPermissions) { return false; } | |
| } |
| @Override | ||
| public boolean checkAccess(Account account, String commandName, Role accountRole, List<RolePermission> allPermissions) { | ||
| if (accountRole == null) { | ||
| throw new PermissionDeniedException(String.format("The account [%s] has role null or unknown.", account)); | ||
| } | ||
|
|
||
| if (accountRole.getRoleType() == RoleType.Admin && accountRole.getId() == RoleType.Admin.getId()) { | ||
| if (logger.isTraceEnabled()) { | ||
| logger.trace(String.format("Account [%s] is Root Admin, all APIs are allowed.", account)); | ||
| } | ||
| return true; | ||
| } | ||
|
|
||
| if (checkApiPermissionByRole(accountRole, commandName, allPermissions)) { | ||
| return true; | ||
| } | ||
| throw new UnavailableCommandException(String.format("The API [%s] does not exist or is not available for the account %s.", commandName, account)); | ||
| } |
There was a problem hiding this comment.
DynamicRoleBasedAPIAccessChecker adds a new checkAccess(Account, String, Role, List<RolePermission>) path, but the existing unit tests in this module only cover checkAccess(User, String) / checkAccess(Account, String). Adding tests for the new overload (allow/deny and root-admin cases) would help prevent regressions in the new preloaded-permissions flow used during account creation.
| List<APIChecker> apiCheckers = getEnabledApiCheckers(); | ||
| for (APIChecker apiChecker : apiCheckers) { | ||
| checkApiAccess(apiChecker, caller, requested); | ||
| } |
There was a problem hiding this comment.
checkRoleEscalation iterates over every enabled APIChecker. This includes non-ACL checkers like ApiRateLimitServiceImpl which increments per-account counters in checkAccess(Account, String) (and may throw RequestLimitException). That means creating an account can unexpectedly consume/trigger API rate limits and also interfere with the “requested can access” probing (exceptions are treated as “irrelevant”). Consider filtering the checkers used here to APIAclChecker only (or otherwise excluding rate limiting and other non-ACL checkers) so role escalation validation only evaluates permissions.
|
Hi @weizhouapache @RosiKyu please check #12973 for similar optimizations |
Description
This PR optimizes the process of #5879
prior to this PR
with this PR, it is reduced to less than 1 second
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?