backport: Merge bitcoin# 28035, 26957#7258
backport: Merge bitcoin# 28035, 26957#7258vijaydasmp wants to merge 2 commits intodashpay:developfrom
Conversation
✅ No Merge Conflicts DetectedThis PR currently has no conflicts with other open PRs. |
8c47d59 doc: improve -debuglogfile help to be a bit clearer (jonatack) 20d89d6 bench: document expected results in logging benchmarks (jonatack) d8deba8 bench: add LogPrintfCategory and LogPrintLevel benchmarks (Jon Atack) 102b203 bench: order the logging benchmark code by output (Jon Atack) 4b3fdbf bench: update logging benchmark naming for clarity (Jon Atack) 4684aa8 bench: allow logging benchmarks to be order-independent (Larry Ruane) Pull request description: Update our logging benchmarks for evaluating ongoing work like bitcoin#25203 and refactoring proposals like bitcoin#26619 and bitcoin#26697. - make the logging benchmarks order-independent (Larry Ruane) - add missing benchmarks for the `LogPrintLevel` and `LogPrintfCategory` macros that our logging is migrating to; at some later point it should be feasible to drop some of the previous logging benchmarks - update the logging benchmark naming to be clear which benchmark corresponds to which log macro, and update the ordering to be the same as the output - add clarifying documentation to the logging benchmarks - improve the `-debuglogfile` config option help to be clearer; can be tested by running `./src/bitcoind -help | grep -A4 '\-debuglogfile'` Reviewers can run the logging benchmarks with: ```bash ./src/bench/bench_bitcoin -filter='LogP*.*' ``` ACKs for top commit: LarryRuane: ACK 8c47d59 martinus: code review & tested ACK 8c47d59, here are my benchmark results: achow101: ACK 8c47d59 Tree-SHA512: 705f8720c9ceaf14a1945039c7578a0c17a12215cbc44908099af4ac444561c3f95d833c5a91b325cdd4470737d8a01e2da64db2d542dd7c9a3747fbfdbf213e
00e0f20 to
8a0b97b
Compare
fa3d729 lint: Ignore check_fileopens failure on **kwargs (MarcoFalke) fa6bb85 test: Ignore UTF-8 errors in assert_debug_log (MarcoFalke) fa63326 test: Fix debug_log_size helper (MarcoFalke) Pull request description: Fix two bugs, see commit messages. ACKs for top commit: theStack: utACK fa3d729 Tree-SHA512: 4a29bdf954bf62bb7676c2a41b03ad017bc86d535b2bd912c96bd41d1621beb06d840b53c211480ad51974e8b293bbae620060d2528d269159f32c0b44e47712
|
✅ Review complete (commit 0f8aca9) |
WalkthroughThis PR updates logging benchmarks in Estimated code review effort🎯 3 (Moderate) | ⏱️ ~18 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/functional/feature_governance_cl.py`:
- Around line 153-156: The code opens the debug log in text mode but computes a
byte offset via node.debug_log_size(mode="rb"), causing invalid seeks; change to
open the file in binary mode (use 'rb') when getting dl and seeking by the byte
offset, read the bytes into a variable (e.g., debug_log_bytes) and then decode
to UTF-8 (handling errors='replace' or similar) into debug_log_part; keep the
same seek logic (seek_pos if > 0 else 0) and use node.debug_log_path and
node.debug_log_size as before.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 61e7a32f-85bd-4d94-8ba6-2a2f942a20a9
📒 Files selected for processing (5)
src/bench/logging.cppsrc/init/common.cpptest/functional/feature_governance_cl.pytest/functional/test_framework/test_node.pytest/lint/lint-python-utf8-encoding.py
| with open(node.debug_log_path, encoding='utf-8') as dl: | ||
| seek_pos = node.debug_log_bytes() - 100 * 1024 # read the last 100 KiB only | ||
| seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 # read the last 100 KiB only | ||
| dl.seek(seek_pos if seek_pos > 0 else 0) | ||
| debug_log_part = dl.read() |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the byte-offset/text-seek mismatch in the changed segment.
rg -n -C2 'open\(node\.debug_log_path, encoding=.utf-8.|debug_log_size\(mode="rb"\)|dl\.seek\(' test/functional/feature_governance_cl.pyRepository: dashpay/dash
Length of output: 1772
🌐 Web query:
Dash dash feature_governance_cl.py debug_log_path UTF-8 text mode seek
💡 Result:
In Dash Core’s functional test framework, debug_log_path is the per-node logfile at:
chain_path / "debug.log"(i.e.,<datadir>/<chain>/debug.log). In code it’s exposed as thedebug_log_pathproperty onTestNode. [1]
The problem with UTF-8 + text mode + seek is in how the test framework tracks “where it last read up to” in debug.log:
debug_log_bytes()opens the log in text mode (open(..., encoding="utf-8")), doesseek(0, 2)and returnstell(). [1]assert_debug_log()later re-opens the file in text mode and doesdl.seek(prev_size)whereprev_sizecame from that earliertell(). [1]
On some platforms (notably Windows), text-mode seeks are not byte-offset seeks (they use an opaque cookie tied to newline/decoder state), so using these offsets as if they were stable byte positions can lead to “text mode seek” errors or incorrect positioning.
Fix (robust approach): do offset tracking in binary mode, decode after reading, e.g.:
- compute size with
Path.stat().st_sizeoropen(..., "rb")/tell() - read new data with
open(..., "rb"),seek(prev_size),read(), then.decode("utf-8", errors="replace")
This avoids text-mode seek() semantics entirely while still letting you search log text for expected messages. [1]
Source(s)
[1] Dash Core test_node.py implementation of debug_log_path, debug_log_bytes(), and assert_debug_log(). (github.com)
Citations:
Avoid mixing binary byte offsets with text-mode seek().
Line 154 computes a byte offset using debug_log_size(mode="rb"), but lines 153–155 open and seek the file in text mode with encoding='utf-8'. This mismatch causes invalid seek positions for UTF-8 and can produce flaky reads, especially on Windows where text-mode seeks use opaque decoder state.
Open the file in binary mode and decode explicitly after reading:
Proposed fix
- def governance_tip_updated(node):
- with open(node.debug_log_path, encoding='utf-8') as dl:
- seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 # read the last 100 KiB only
- dl.seek(seek_pos if seek_pos > 0 else 0)
- debug_log_part = dl.read()
- return expected_msg in debug_log_part
+ def governance_tip_updated(node):
+ with open(node.debug_log_path, "rb") as dl:
+ seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 # read the last 100 KiB only
+ dl.seek(seek_pos if seek_pos > 0 else 0)
+ debug_log_part = dl.read().decode("utf-8", errors="replace")
+ return expected_msg in debug_log_part📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| with open(node.debug_log_path, encoding='utf-8') as dl: | |
| seek_pos = node.debug_log_bytes() - 100 * 1024 # read the last 100 KiB only | |
| seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 # read the last 100 KiB only | |
| dl.seek(seek_pos if seek_pos > 0 else 0) | |
| debug_log_part = dl.read() | |
| with open(node.debug_log_path, "rb") as dl: | |
| seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 # read the last 100 KiB only | |
| dl.seek(seek_pos if seek_pos > 0 else 0) | |
| debug_log_part = dl.read().decode("utf-8", errors="replace") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/functional/feature_governance_cl.py` around lines 153 - 156, The code
opens the debug log in text mode but computes a byte offset via
node.debug_log_size(mode="rb"), causing invalid seeks; change to open the file
in binary mode (use 'rb') when getting dl and seeking by the byte offset, read
the bytes into a variable (e.g., debug_log_bytes) and then decode to UTF-8
(handling errors='replace' or similar) into debug_log_part; keep the same seek
logic (seek_pos if > 0 else 0) and use node.debug_log_path and
node.debug_log_size as before.
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
I validated the successful agent findings against the checked-out PR head and the patch scope. One blocking regression remains in the Dash governance functional test: it computes a byte offset and then seeks on a UTF-8 text stream, which is not a safe offset domain and can fail on multibyte log contents. The backport-prerequisite review section produced no actionable finding.
Reviewed commit: 0f8aca9
🔴 1 blocking
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `test/functional/feature_governance_cl.py`:
- [BLOCKING] lines 153-156: Governance log polling seeks to a byte offset on a UTF-8 text stream
`governance_tip_updated()` opens `debug.log` in text mode with UTF-8 decoding, but computes `seek_pos` with `node.debug_log_size(mode="rb")`, which returns a byte offset. In Python text I/O, arbitrary byte offsets are not valid seek positions unless they come from `tell()` on the same text stream. This can misalign the read or raise `UnicodeDecodeError` when the offset lands inside a multibyte UTF-8 sequence. The surrounding test framework already documents this constraint in `TestNode.assert_debug_log()` and `TestNode.wait_for_debug_log()` by using matching open modes for `debug_log_size()` and the later `open(...)/seek()`.
| with open(node.debug_log_path, encoding='utf-8') as dl: | ||
| seek_pos = node.debug_log_bytes() - 100 * 1024 # read the last 100 KiB only | ||
| seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 # read the last 100 KiB only | ||
| dl.seek(seek_pos if seek_pos > 0 else 0) | ||
| debug_log_part = dl.read() |
There was a problem hiding this comment.
🔴 Blocking: Governance log polling seeks to a byte offset on a UTF-8 text stream
governance_tip_updated() opens debug.log in text mode with UTF-8 decoding, but computes seek_pos with node.debug_log_size(mode="rb"), which returns a byte offset. In Python text I/O, arbitrary byte offsets are not valid seek positions unless they come from tell() on the same text stream. This can misalign the read or raise UnicodeDecodeError when the offset lands inside a multibyte UTF-8 sequence. The surrounding test framework already documents this constraint in TestNode.assert_debug_log() and TestNode.wait_for_debug_log() by using matching open modes for debug_log_size() and the later open(...)/seek().
💡 Suggested change
| with open(node.debug_log_path, encoding='utf-8') as dl: | |
| seek_pos = node.debug_log_bytes() - 100 * 1024 # read the last 100 KiB only | |
| seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 # read the last 100 KiB only | |
| dl.seek(seek_pos if seek_pos > 0 else 0) | |
| debug_log_part = dl.read() | |
| with open(node.debug_log_path, "rb") as dl: | |
| seek_pos = node.debug_log_size(mode="rb") - 100 * 1024 | |
| dl.seek(seek_pos if seek_pos > 0 else 0) | |
| debug_log_part = dl.read().decode("utf-8", errors="replace") |
source: ['codex-general']
🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `test/functional/feature_governance_cl.py`:
- [BLOCKING] lines 153-156: Governance log polling seeks to a byte offset on a UTF-8 text stream
`governance_tip_updated()` opens `debug.log` in text mode with UTF-8 decoding, but computes `seek_pos` with `node.debug_log_size(mode="rb")`, which returns a byte offset. In Python text I/O, arbitrary byte offsets are not valid seek positions unless they come from `tell()` on the same text stream. This can misalign the read or raise `UnicodeDecodeError` when the offset lands inside a multibyte UTF-8 sequence. The surrounding test framework already documents this constraint in `TestNode.assert_debug_log()` and `TestNode.wait_for_debug_log()` by using matching open modes for `debug_log_size()` and the later `open(...)/seek()`.
bitcoin backports