Skip to content

fix nil pointer panic in check_consensus_sync_status#157

Merged
qu0b merged 2 commits intoethpandaops:masterfrom
yperbasis:fix/nil-sync-status
Apr 12, 2026
Merged

fix nil pointer panic in check_consensus_sync_status#157
qu0b merged 2 commits intoethpandaops:masterfrom
yperbasis:fix/nil-sync-status

Conversation

@yperbasis
Copy link
Copy Markdown
Contributor

Summary

  • Fix nil pointer dereference in check_consensus_sync_status when GetNodeSyncStatus returns an error
  • When the RPC call fails (e.g. during network bootstrap before genesis), syncStatus is nil, but getClientInfo dereferences it unconditionally — causing a panic
  • The analogous check_execution_sync_status task already has this nil guard; this applies the same pattern

Panic trace from CI

panic: runtime error: invalid memory address or nil pointer dereference

goroutine 172 [running]:
github.com/ethpandaops/assertoor/pkg/tasks/check_consensus_sync_status.(*Task).getClientInfo(...)
    pkg/tasks/check_consensus_sync_status/task.go:219
github.com/ethpandaops/assertoor/pkg/tasks/check_consensus_sync_status.(*Task).processCheck(...)
    pkg/tasks/check_consensus_sync_status/task.go:142

Observed in Erigon CI glamsterdam Kurtosis tests: https://github.com/erigontech/erigon/actions/runs/24228480466/job/70734695754

Test plan

  • Verify check_consensus_sync_status no longer panics when a CL client is unreachable during startup
  • Verify ClientInfo for failed clients has Name populated and sync fields zeroed

When GetNodeSyncStatus returns an error, syncStatus is nil, but
getClientInfo dereferences it unconditionally. This causes a panic
when a consensus client is temporarily unreachable (e.g. during
initial network bootstrap before genesis).

The execution sync status task already handles this correctly with
a nil guard; apply the same pattern here.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@qu0b qu0b force-pushed the master branch 2 times, most recently from 1d5b002 to 4321d7d Compare April 11, 2026 11:01
Address review: use an early-return guard with Synchronizing: true
(assume unhealthy) instead of relying on Go zero-value defaults.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yperbasis
Copy link
Copy Markdown
Contributor Author

Root cause analysis

The panic was hit in Erigon's glamsterdam Kurtosis CI, which spins up a 3-node devnet (Erigon EL + Lighthouse CL) with a short genesis_delay: 20 and seconds_per_slot: 6.

Timeline

Time Event
05:56:44 Erigon nodes start, process genesis block
05:56:47 Lighthouse nodes start, waiting for genesis
05:57:02 Lighthouse peers connect, sync state → "Synced"
05:57:10 Assertoor starts stability-check
05:57:10 check_clients_are_healthy marks 1-erigon-lighthouse as failed (EL/CL polling hadn't received blocks yet for this client, unlike client 3)
05:57:10 check_consensus_sync_status calls GetNodeSyncStatus for client 1 → returns error "get node syncing not supported"
05:57:10 PANIC in getClientInfo: nil syncStatus dereference
05:58:08 Glamsterdam fork would have activated (never reached)

Key observations

  1. The HTTP layer was fine. The snooper-beacon logs for all 3 clients show identical request counts (120 each) and 8 successful 200 OK responses to /eth/v1/node/syncing. The Lighthouse response was valid:

    {"data": {"el_offline": false, "head_slot": "0", "is_optimistic": false, "is_syncing": false, "sync_distance": "0"}}
  2. The "get node syncing not supported" error is internal to assertoor's client abstraction, not from the HTTP response. Client 1's consensus client object likely wasn't fully initialised during the client pool bootstrap (client 1 never received its initial EL/CL block poll, unlike client 3 — a race condition).

  3. The execution sync status task already handles this. check_execution_sync_status/task.go:217 has if syncStatus != nil before dereferencing. The consensus counterpart was just missing the same guard.

  4. The test never got to exercise any fork features — it failed during the initial stability check, ~58 seconds before the Glamsterdam fork activation.

@qu0b qu0b merged commit ea117e9 into ethpandaops:master Apr 12, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants