SREP-3738: fix misleading error messages in dynatrace gather-logs#877
SREP-3738: fix misleading error messages in dynatrace gather-logs#877rolandmkunkel wants to merge 2 commits intoopenshift:masterfrom
Conversation
|
@rolandmkunkel: This pull request references SREP-3738 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughSingle file modified with error message clarity improvements and logging format consistency fixes. Error messages for token provider failures now include Dynatrace/vault CLI installation context, and request token failure logs updated to use formatted Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error)
✅ Passed checks (9 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rolandmkunkel The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@rolandmkunkel: This pull request references SREP-3738 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/dynatrace/hcpGatherLogsCmd.go (1)
286-292:⚠️ Potential issue | 🟠 MajorReturn early when restarted-pod request token retrieval fails.
After Line 288, execution continues into
getLogseven though request token creation failed. This hides the real failure and can produce misleading follow-up errors.🔧 Proposed fix
podLogsRequestToken, err := getDTQueryExecution(DTURL, accessToken, restartedPodLogsQuery.finalQuery) if err != nil { log.Printf("failed to get request token: %v", err) - + _ = f.Close() + return fmt.Errorf("failed to get request token for restarted pod logs: %w", err) } err = getLogs(DTURL, accessToken, podLogsRequestToken, f) f.Close()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/dynatrace/hcpGatherLogsCmd.go` around lines 286 - 292, The code calls getLogs even when getDTQueryExecution failed: check the error returned from getDTQueryExecution(DTURL, accessToken, restartedPodLogsQuery.finalQuery) and return early (after closing f) if err != nil so you don't proceed with an invalid podLogsRequestToken; specifically, inside the error branch for getDTQueryExecution log the error, ensure f.Close() is invoked, and return from the enclosing function instead of falling through to getLogs(DTURL, accessToken, podLogsRequestToken, f).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/dynatrace/hcpGatherLogsCmd.go`:
- Around line 180-184: The loop opens a file handle 'f' then calls
getDTQueryExecution (assigning eventsRequestToken) and on error uses continue
without closing 'f', leaking file descriptors; fix by ensuring 'f.Close()' is
called before each continue on the error paths around the getDTQueryExecution
call (and the similar path near lines 240-244), or even better, arrange to close
the file immediately after opening (e.g., defer f.Close() inside a scoped
function or ensure explicit f.Close() before every continue) so
getDTQueryExecution and the eventsRequestToken error path do not leak the file
handle.
---
Outside diff comments:
In `@cmd/dynatrace/hcpGatherLogsCmd.go`:
- Around line 286-292: The code calls getLogs even when getDTQueryExecution
failed: check the error returned from getDTQueryExecution(DTURL, accessToken,
restartedPodLogsQuery.finalQuery) and return early (after closing f) if err !=
nil so you don't proceed with an invalid podLogsRequestToken; specifically,
inside the error branch for getDTQueryExecution log the error, ensure f.Close()
is invoked, and return from the enclosing function instead of falling through to
getLogs(DTURL, accessToken, podLogsRequestToken, f).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: f77a2560-b9a7-4743-b184-78353fa3e848
📒 Files selected for processing (1)
cmd/dynatrace/hcpGatherLogsCmd.go
| eventsRequestToken, err := getDTQueryExecution(DTURL, accessToken, eventQuery.finalQuery) | ||
| if err != nil { | ||
| log.Print("failed to get request token", err) | ||
| log.Printf("failed to get request token: %v", err) | ||
| continue | ||
| } |
There was a problem hiding this comment.
Close f before continue when request token creation fails.
On Line 183 and Line 243 paths, the loop continues without closing the opened file handle. Repeated failures can exhaust file descriptors.
🔧 Proposed fix
@@
eventsRequestToken, err := getDTQueryExecution(DTURL, accessToken, eventQuery.finalQuery)
if err != nil {
+ _ = f.Close()
log.Printf("failed to get request token: %v", err)
continue
}
err = getEvents(DTURL, accessToken, eventsRequestToken, f)
_ = f.Close()
@@
podLogsRequestToken, err := getDTQueryExecution(DTURL, accessToken, podLogsQuery.finalQuery)
if err != nil {
+ _ = f.Close()
log.Printf("failed to get request token: %v", err)
continue
}
err = getLogs(DTURL, accessToken, podLogsRequestToken, f)
_ = f.Close()Also applies to: 240-244
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cmd/dynatrace/hcpGatherLogsCmd.go` around lines 180 - 184, The loop opens a
file handle 'f' then calls getDTQueryExecution (assigning eventsRequestToken)
and on error uses continue without closing 'f', leaking file descriptors; fix by
ensuring 'f.Close()' is called before each continue on the error paths around
the getDTQueryExecution call (and the similar path near lines 240-244), or even
better, arrange to close the file immediately after opening (e.g., defer
f.Close() inside a scoped function or ensure explicit f.Close() before every
continue) so getDTQueryExecution and the eventsRequestToken error path do not
leak the file handle.
|
@rolandmkunkel: This pull request references SREP-3738 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest |
|
@rolandmkunkel: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
|
||
| accessToken, err := tokenProvider.Token() | ||
| if err != nil { | ||
| f.Close() |
There was a problem hiding this comment.
It seems this is gathering a bunch of data after opening the file without really needing the file - might be easier to just gather all data we need first, once we know we have everything open the file and write.
Might also make the codeflow easier to just push all the file-writing into it's own function so we can just use 'defer f.Close()' instead of sprinkling f.Close() everywhere.
Summary
log.Print("failed to get request token", err)calls that concatenatethe message and error without a separator, producing garbled output like
"failed to get request tokenrequest failed: 401 Unauthorized". Changed tolog.Printfwith proper formatting.vault CLI
Ref: SREP-3738
Test plan
osdctl dt gather-logswithoutvaultinstalled and verify theerror message mentions the vault CLI
osdctl hcp must-gatherand confirm DT query errors are loggedwith proper formatting (space/colon between message and error)
Summary by CodeRabbit