Fix thread safety issues in Communicator, Session, and AsyncPromiseFulfillerDecorator by neinhart · Pull Request #433 · ChargeTimeEU/Java-OCA-OCPP

neinhart · 2026-04-05T06:32:42Z

Problem

Several thread safety issues exist in ocpp-common that can cause data corruption
or resource exhaustion under concurrent load:

Communicator: transactionQueue uses non-thread-safe ArrayDeque, but is
accessed concurrently by sendCall() and the RetryRunner thread without
synchronization. failedFlag is a plain boolean with no cross-thread
visibility guarantee.
AsyncPromiseFulfillerDecorator: Default executor is Executors.newCachedThreadPool()
which creates threads without bound. When many chargers send requests simultaneously,
this can exhaust memory with OutOfMemoryError: unable to create new native thread.
Session.onCall(): The synchronized keyword serializes all incoming CALL messages
per session, but all internal state is already thread-safe (ConcurrentHashMap,
stateless Gson, immutable FeatureRepository). Additionally, pendingPromises
entries are never removed after completion, causing unbounded map growth.

Changes

Replace ArrayDeque with ConcurrentLinkedDeque and boolean with AtomicBoolean
in Communicator
Replace unbounded CachedThreadPool with bounded ThreadPoolExecutor in
AsyncPromiseFulfillerDecorator (core=CPUs, max=CPUs×2, queue=1000,
CallerRunsPolicy for natural backpressure). Existing setExecutor() API
is preserved for custom configuration.
Remove unnecessary synchronized from Session.onCall()
Add whenComplete callback to auto-remove pendingPromises entries

Tests

Added 3 new test classes (10 tests total):

AsyncPromiseFulfillerDecoratorTest — async delegation, non-blocking, executor resilience
CommunicatorConcurrencyTest — concurrent sendCall, concurrent sendCall + reconnect
SessionConcurrencyTest — concurrent onCall, pendingPromises cleanup, concurrent completion

…lfillerDecorator - Replace non-thread-safe ArrayDeque with ConcurrentLinkedDeque in Communicator transaction queue to fix race condition where RetryRunner thread and sendCall() concurrently access the queue without synchronization - Replace boolean failedFlag with AtomicBoolean for cross-thread visibility between RetryRunner and EventHandler - Replace unbounded CachedThreadPool with bounded ThreadPoolExecutor in AsyncPromiseFulfillerDecorator (core=CPUs, max=CPUs*2, queue=1000, CallerRunsPolicy) to prevent thread explosion when many chargers send requests simultaneously - Remove unnecessary synchronized from Session.onCall() — all internal state is already thread-safe (ConcurrentHashMap, stateless Gson, immutable FeatureRepository). The lock serialized all incoming CALL messages per session despite OCPP supporting concurrent requests via unique message IDs - Add automatic cleanup of pendingPromises via whenComplete callback to prevent unbounded map growth

robert-s-ubi · 2026-04-14T14:29:38Z

Hi @neinhart, I haven't fully looked into all issues, but this one:

| Additionally, pendingPromises entries are never removed after completion, causing unbounded map growth.

does not seem correct to me. See Session#completePendingPromise() which removes pendingPromises when completing.

neinhart · 2026-04-15T02:25:27Z

Hi @robert-s-ubi, thanks for the careful review and the correction — you're right that my wording in the PR description was inaccurate.

Let me trace through the code more precisely:

Normal success path — you're correct:

SimplePromiseFulfiller.fulfill() calls eventHandler.handleRequest(request) and gets a Confirmation
It then calls eventHandler.asyncCompleteRequest(messageId, conf) (SimplePromiseFulfiller.java:47)
asyncCompleteRequest delegates to session.completePendingPromise() (Server.java:143)
completePendingPromise removes the entry from the map. ✅

So in the successful path, entries are removed correctly as you pointed out.

Exception path — where I think there is still a leak:

In SimplePromiseFulfiller.fulfill() (lines 49-54), if eventHandler.handleRequest(request) throws an exception, the catch block calls promise.completeExceptionally(ex) directly on the promise, bypassing asyncCompleteRequest/completePendingPromise.

} catch (Exception ex) {
    logger.warn(\"fulfillPromise() failed\", ex);
    if (promise != null) {
        promise.completeExceptionally(ex);  // <-- bypasses completePendingPromise
    }
}

In this case, whenComplete(ConfirmationHandler) fires and sends back the error response, but the entry in pendingPromises is never cleaned up. Each exception from a handler leaks one entry.

The promise.whenComplete((result, error) -> pendingPromises.remove(id)) callback I added is a no-op in the success path (the entry is already gone) but handles the exception path as well.

That said, I agree the PR description overstates the issue — it's an edge case in the exception path, not a general leak. Would you prefer:

I keep the whenComplete cleanup as defensive and update the PR description to accurately describe the exception-path scenario, or
I remove this specific change from the PR (since it's a minor edge case) and the PR focuses only on the clear thread-safety fixes (ArrayDeque, AtomicBoolean, ThreadPoolExecutor, synchronized removal)?

Happy to go either way. Thanks again for taking the time to review!

robert-s-ubi · 2026-04-15T08:36:15Z

That said, I agree the PR description overstates the issue — it's an edge case in the exception path, not a general leak. Would you prefer:

I keep the whenComplete cleanup as defensive and update the PR description to accurately describe the exception-path scenario, or

I remove this specific change from the PR (since it's a minor edge case) and the PR focuses only on the clear thread-safety fixes (ArrayDeque, AtomicBoolean, ThreadPoolExecutor, synchronized removal)?

@neinhart I'd prefer option 2, and put the promise leak in a separate PR. Something similar was already fixed with the promises for outgoing requests in this commit: 32903d4 - and that change replaced the existing promise removals with a single whenComplete() removal. I'd like to solve this issue the same way, if possible.

robert-s-ubi

I haven't looked at the tests yet. I suppose not all will apply when the completion leak fix is removed?

robert-s-ubi · 2026-04-15T19:35:02Z

+        "An error occurred. Sending this information: uniqueId {}: action: {}, errorCode: {},"
+            + " errorDescription: {}",


Please revert this unnecessary formatting change.

robert-s-ubi · 2026-04-15T19:35:29Z

+        "Payload for Action is syntactically correct but at least one of the fields violates"
+            + " occurrence constraints";


Please revert this unnecessary formatting change.

robert-s-ubi · 2026-04-15T19:35:35Z

+        "An internal error occurred and the receiver was not able to process the requested Action"
+            + " successfully";


Please revert this unnecessary formatting change.

robert-s-ubi · 2026-04-15T19:36:13Z

          if (request.validate()) {
            CompletableFuture<Confirmation> promise = new CompletableFuture<>();
            promise.whenComplete(new ConfirmationHandler(id, action, communicator));
+            promise.whenComplete((result, error) -> pendingPromises.remove(id));


Please revert and make this change in a different PR.

robert-s-ubi · 2026-04-15T19:45:54Z

+    int coreSize = Runtime.getRuntime().availableProcessors();
+    int maxSize = coreSize * 2;


I'm not sure about the effects of this. Note that this also affects the request handling on the charging station side, and I've seen CSMS "bomb" charging stations with parallel ChangeConfigurationRequests for all read-write configuration settings. Would this still work the same as before?

Also, I gather that once the threads are exhausted (which may be only 2 on a single-core charging station), the executor switches to executing the request handler within the onCall() caller's thread. But if then an exception is thrown in the request handler, it would not result in the same response as when it was thrown in a separate thread, would it?

robert-s-ubi requested changes Apr 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix thread safety issues in Communicator, Session, and AsyncPromiseFulfillerDecorator#433

Fix thread safety issues in Communicator, Session, and AsyncPromiseFulfillerDecorator#433
neinhart wants to merge 1 commit intoChargeTimeEU:masterfrom
neinhart:fix/thread-safety

neinhart commented Apr 5, 2026

Uh oh!

robert-s-ubi commented Apr 14, 2026

Uh oh!

neinhart commented Apr 15, 2026

Uh oh!

robert-s-ubi commented Apr 15, 2026

Uh oh!

robert-s-ubi left a comment

Uh oh!

robert-s-ubi Apr 15, 2026

Uh oh!

robert-s-ubi Apr 15, 2026

Uh oh!

robert-s-ubi Apr 15, 2026

Uh oh!

robert-s-ubi Apr 15, 2026

Uh oh!

robert-s-ubi Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		"An error occurred. Sending this information: uniqueId {}: action: {}, errorCode: {},"
		+ " errorDescription: {}",

		"Payload for Action is syntactically correct but at least one of the fields violates"
		+ " occurrence constraints";

		"An internal error occurred and the receiver was not able to process the requested Action"
		+ " successfully";

		int coreSize = Runtime.getRuntime().availableProcessors();
		int maxSize = coreSize * 2;

Conversation

neinhart commented Apr 5, 2026

Problem

Changes

Tests

Uh oh!

robert-s-ubi commented Apr 14, 2026

Uh oh!

neinhart commented Apr 15, 2026

Uh oh!

robert-s-ubi commented Apr 15, 2026

Uh oh!

robert-s-ubi left a comment

Choose a reason for hiding this comment

Uh oh!

robert-s-ubi Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

robert-s-ubi Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

robert-s-ubi Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

robert-s-ubi Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

robert-s-ubi Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants