Use hardware channel activity detection for checking interference #1727

Open
weebl2000 wants to merge 4 commits into meshcore-dev:dev from weebl2000:use-hardware-channel-activity-detection

@weebl2000
Contributor

@weebl2000 weebl2000 commented Feb 18, 2026

Using RSSI isn't very reliable (it's disabled in the code for a reason). We might try hardware channel activity detection (CAD) instead. Since we are going to TX anyway, it shouldn't hurt much to listen for a few milliseconds first. Worst case, we see that a transmission is in progress and wait a little while. The alternative is to TX anyway and interfere with others, causing transmission collisions.
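
A minimal host-side sketch of the listen-before-transmit idea, written in Python as a model rather than firmware code. `channel_is_active`, `transmit`, `wait`, and the retry limits are hypothetical stand-ins, not names from this PR, and the random wait is an assumption about what "wait a little while" could look like.

```python
import random
from typing import Callable


def send_with_cad(
    channel_is_active: Callable[[], bool],
    transmit: Callable[[], None],
    wait: Callable[[float], None],
    max_checks: int = 5,
    max_wait_ms: float = 200.0,
) -> bool:
    """Listen before transmitting; return True if the packet was sent.

    channel_is_active models one hardware CAD scan (hypothetical callback).
    wait receives a delay in milliseconds; it is injected so the sketch
    can run without real sleeping.
    """
    for _ in range(max_checks):
        if not channel_is_active():
            transmit()  # channel looked free: send now
            return True
        # Channel busy: wait a random time, then listen again.
        wait(random.uniform(0.0, max_wait_ms))
    return False  # gave up; the caller decides what to do next
```

Whether to transmit anyway after the last failed check, or to drop or requeue the packet, is a policy choice this sketch leaves to the caller.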

Would be great if people could test this on their repeaters & companions and see if TX becomes more reliable.

You can build firmware for your device using https://mcimages.weebl.me?commitId=use-hardware-channel-activity-detection

My first impression, after flashing this on one of my repeaters that is high up and exposed to a lot of other repeaters, is that it responds a little more consistently. Sometimes it's snappier and often it's slightly slower, but it fails far less, i.e. fewer failed status requests and unanswered commands.

@towerviewcams

OK, now this is very interesting. My house has LOS to 4 high-level repeaters. I'll test this when I get back from work and see what happens. It reminds me of busy channel lockout in shared two-way radio repeaters back in the day, before LTR / trunking. Could be onto something here for sure.

@syssi
Contributor

syssi commented Mar 17, 2026

I like your implementation. :-) Using hardware CAD instead of the broken RSSI approach is the right direction.

One concern: repurposing interference_threshold as the CAD enable flag is a code smell. The field originally carried a numeric RSSI threshold, and now it's overloaded as a boolean for a different feature. Thresholds of 50 and 1 now mean the same thing, which is surprising.

Given that RSSI-based detection is effectively dead code at this point, why not just remove interference_threshold entirely and enable CAD unconditionally? That would simplify the config, eliminate the semantic confusion, and make the intent clear.
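
To make the concern concrete, here is a small Python model of the two shapes the setting could take. Only `interference_threshold` comes from this thread. The "non-zero means enabled" rule, the class names, and the `ExplicitPrefs` fields are assumptions for illustration, not the firmware's actual structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverloadedPrefs:
    # One numeric field doubles as an on/off switch (the pattern being criticised).
    interference_threshold: int = 0

    def cad_enabled(self) -> bool:
        # Assumed rule: any non-zero value turns CAD on, so 1 and 50 behave the same.
        return self.interference_threshold != 0


@dataclass
class ExplicitPrefs:
    # Hypothetical alternative: one field per meaning.
    cad_enabled: bool = True
    rssi_threshold_db: Optional[int] = None  # None = RSSI check not used
```

Removing the field and enabling CAD unconditionally, as suggested above, would be simpler still. The explicit variant only matters if the RSSI check is kept as an option.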

@AnonymousWho

Isn't this very much like CSMA/CD, as on classic Ethernet? Or CSMA/CA, if a pre-calculated timer is used per node?

@weebl2000 weebl2000 force-pushed the use-hardware-channel-activity-detection branch from 40ab8ea to acbc02f Compare March 20, 2026 08:19
@JvM-nl

JvM-nl commented Mar 20, 2026

It's running here. My first impression in a busy mesh is that accessing my repeater is smooth, and sending a message in Public also goes smoothly.

I will test it for longer and see if it stays the same.

Day 2, and I'm still happy with this patch added to 1.14.1.

@weebl2000 weebl2000 force-pushed the use-hardware-channel-activity-detection branch from acbc02f to d7087bc Compare March 23, 2026 13:28
@jirogit

jirogit commented Mar 23, 2026

Thank you very much for implementing CAD!

I tested simultaneous 2-node DMs. Results and observations are below.

Test environment:
Devices: WisMesh Tag + Wio Tracker L1 Pro
Distance: ~4 feet (indoor, direct LOS)
Firmware: commit d7087bc (shown as d7087b in the chat client settings)
Region: Japan LoRa settings: 920.800 MHz, BW 125.0 kHz, SF12, CR8, 13 dBm
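
For a sense of scale, the airtime implied by these settings can be estimated with the time-on-air formula from the Semtech SX127x datasheet. This is a rough sketch, not MeshCore code: the 8-symbol preamble, explicit header, and CRC are assumptions, and `payload_bytes` is the raw LoRa payload, so any mesh packet overhead on top of the message text is not included.

```python
import math


def lora_airtime_ms(payload_bytes, sf, bw_hz, cr_denominator,
                    preamble_symbols=8, explicit_header=True, crc=True):
    """Estimate LoRa time on air in milliseconds (Semtech SX127x formula)."""
    t_sym_ms = (2 ** sf) / bw_hz * 1000.0
    # Low data rate optimisation is normally enabled when a symbol exceeds 16 ms.
    de = 1 if t_sym_ms > 16.0 else 0
    ih = 0 if explicit_header else 1
    numerator = 8 * payload_bytes - 4 * sf + 28 + (16 if crc else 0) - 20 * ih
    denominator = 4 * (sf - 2 * de)
    cr = cr_denominator - 4  # coding rate 4/5 -> 1 ... 4/8 -> 4
    payload_symbols = 8 + max(math.ceil(numerator / denominator) * (cr + 4), 0)
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym_ms


print(round(lora_airtime_ms(150, 12, 125_000, 8)))  # about 8528 ms
print(round(lora_airtime_ms(1, 12, 125_000, 8)))    # about 926 ms
```

Under these assumptions a 150-byte payload at SF12 / BW125 / CR 4/8 occupies the channel for roughly 8.5 seconds, which is why two overlapping long messages leave very little room for a clean retry.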

Baseline (CAD disabled, default):

| Test | Result |
| --- | --- |
| 1-char DM, one-way | ~1s ✅ |
| 150-char DM, one-way | ~8-9s ✅ |
| 1-char channel msg, one-way | ~1.5s ✅ |
| 1-char channel msg, simultaneous | Never delivered ❌ |
| 1-char DM, simultaneous | ~25-29s, eventually delivered ⚠️ |
| 150-char DM, simultaneous | ~6min sometimes delivered, ~10min failed ❌ |

With CAD enabled (this PR):

| Test | Result |
| --- | --- |
| 1-char DM, simultaneous (run 1) | 5s + 52s, both delivered ✅ |
| 1-char DM, simultaneous (run 2) | 5s + 27s, both delivered ✅ |
| 150-char DM, simultaneous (run 1) | 7min 7s, both failed ❌ |
| 150-char DM, simultaneous (run 2) | 7min 2s + 7min 5s, both failed ❌ |

Observations:

CAD clearly helps for short messages. 1-char simultaneous DM improved dramatically — one side gets through quickly (~5s) while the other waits and eventually delivers (~27-52s).

150-char simultaneous DM fails consistently. The retry pattern is: 4 retries at ~1min 5sec intervals via direct path, then the 5th retry switches to flood path. The flood path switch appears to make things worse, not better: both nodes fail at ~7min.

No random backoff. Both nodes appear to retry at nearly identical intervals, causing synchronized collisions. Adding random backoff between retries would likely resolve the 150-char case.

Flood path collision. When direct path fails and falls back to flood, both nodes flood simultaneously — same collision pattern observed in channel message simultaneous test (never delivered).

Suggestion:

Adding a small random backoff (e.g. 0-500ms) before each retry attempt would break the synchronized collision loop. This is especially important for multi-packet messages where collision opportunities multiply with each packet fragment.
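
The synchronized-retry effect can be illustrated with a toy model in Python. It is deliberately blind: two nodes start at the same instant, retry on the same fixed interval, and do no carrier sensing at all. The function name and parameters are hypothetical, and a collision is simplified to "any overlap of two equal-length transmissions".

```python
import random


def first_clear_attempt(airtime_ms, retry_interval_ms, max_jitter_ms,
                        attempts=5, seed=0):
    """Return the first attempt (1-based) whose two transmissions do not overlap.

    Both nodes start at t=0 and retry every retry_interval_ms. Before each
    attempt, each node adds an independent random delay in [0, max_jitter_ms].
    Returns None if every attempt collides.
    """
    rng = random.Random(seed)
    for attempt in range(1, attempts + 1):
        base = (attempt - 1) * retry_interval_ms
        start_a = base + rng.uniform(0, max_jitter_ms)
        start_b = base + rng.uniform(0, max_jitter_ms)
        # Equal-length transmissions overlap when their starts are closer than one airtime.
        if abs(start_a - start_b) >= airtime_ms:
            return attempt
    return None
```

With `max_jitter_ms=0` the model never finds a clear attempt, which matches the synchronized collisions observed above. It also shows a limit of small jitter: without carrier sensing, a 0-500 ms offset cannot separate transmissions that last several seconds, so jitter this small helps mainly when the later node can then detect the earlier one via CAD.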

Thank you for reading!

@jirogit

jirogit commented Mar 24, 2026

> Suggestion:
>
> Adding a small random backoff (e.g. 0-500ms) before each retry attempt would break the synchronized collision loop. This is especially important for multi-packet messages where collision opportunities multiply with each packet fragment.

Following up on the suggestion above about random backoff: I implemented it and tested with the Japan LoRa settings (SF12/BW125/CR4/8).
These settings result in very long airtime (~8-10s per 150-char message), so the backoff values need to be much larger than the suggested 0-500ms.

Changes I made on top of this PR:

  • Long random backoff (8000-22000ms) when the channel is busy
  • Small jitter (0-500ms) when the channel is free, to prevent simultaneous TX from two nodes that both detect a free channel at the same time
  • Used vTaskDelay(1) instead of delay() or yield() to keep BLE responsive during backoff
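
The three changes listed above can be modelled in a few lines of Python. This is a host-side sketch, not the code from the linked commit: the function names are hypothetical, `sleep_one_ms` stands in for `vTaskDelay(1)` (assuming a 1 ms tick), and `poll` is a made-up hook for other work such as BLE servicing.

```python
import random

BUSY_BACKOFF_MS = (8000, 22000)  # range reported above for SF12/BW125
FREE_JITTER_MS = (0, 500)


def pre_tx_delay_ms(channel_busy, rng=random):
    """Pick how long to wait before the next transmit attempt."""
    low, high = BUSY_BACKOFF_MS if channel_busy else FREE_JITTER_MS
    return rng.randint(low, high)


def wait_in_slices(total_ms, sleep_one_ms, poll=lambda: None):
    """Wait in 1 ms slices so other work can run between slices."""
    for _ in range(total_ms):
        sleep_one_ms()  # stand-in for vTaskDelay(1)
        poll()          # hypothetical hook, e.g. keep BLE serviced
```

Slicing the wait is what keeps the device responsive: a single blocking delay of up to 22 seconds would starve everything else running on the same task.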

These values are likely too large for US/EU regions with faster LoRa settings. But for Japan SF12, the backoff must exceed the airtime of a single message to break the collision loop.
Test result: 150-char simultaneous DM, 10/11 success (91%). Before this change: consistent failure with the Japan settings.
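
A short worked example of why the backoff has to exceed the airtime. Assume the worst case, where a node detects the channel as busy right at the start of the other node's transmission and then waits for a delay drawn uniformly from the backoff range. The function below (hypothetical name, simplified model) gives the chance that the wait outlasts that transmission.

```python
def p_backoff_outlasts(airtime_ms, low_ms, high_ms):
    """P(D >= airtime) for a backoff D drawn uniformly from [low_ms, high_ms]."""
    if high_ms <= low_ms:
        return 1.0 if low_ms >= airtime_ms else 0.0
    fraction = (high_ms - airtime_ms) / (high_ms - low_ms)
    return min(1.0, max(0.0, fraction))


print(p_backoff_outlasts(10_000, 0, 500))         # 0.0
print(p_backoff_outlasts(10_000, 8_000, 22_000))  # about 0.857
```

Taking the upper end of the reported airtime (10 s), a 0-500 ms backoff always expires while the other node is still transmitting, whereas the 8000-22000 ms range outlasts it about 86% of the time in this model.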

jirogit@2ac7f4f

@terminalvelocity23

terminalvelocity23 commented Mar 31, 2026

Yeah, to me it seems to perform worse on long messages with this PR. We have a dense mesh here at SF7, 62.5 kHz, with lots of in-band noise from telemetry and whatnot.

@YSOFF

YSOFF commented Apr 2, 2026

Every 6 to 12 hours over a three-day period, I divided the number of received packets by the number of corrupted packets. The relationship is linear and evens out over time. The standard RSSI-based "int.thresh 6" showed a received/damaged ratio of 5.8–6. With the new "hardware" measurement, the ratio was 3.7–4. I performed the measurements on the same node and antenna. I don't know whether the damaged-packet ratio is a good indicator of how well the int.thresh function works, but my observations suggest that the RSSI measurement works better.
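
For readers who find a damaged share easier to compare than a ratio, the conversion is simple arithmetic. This assumes "received" counts all packets, including the damaged ones, which the comment does not state explicitly.

```python
def damaged_share(received_to_damaged_ratio):
    """Convert a received/damaged ratio into the damaged fraction of received packets."""
    return 1.0 / received_to_damaged_ratio


for label, ratio in [("RSSI", 5.8), ("RSSI", 6.0), ("CAD", 3.7), ("CAD", 4.0)]:
    print(f"{label} ratio {ratio}: {damaged_share(ratio):.1%} damaged")
```

Under that assumption the RSSI figures correspond to roughly 17% damaged packets and the CAD figures to roughly 25-27%.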

@jirogit

jirogit commented Apr 6, 2026

Updating my earlier comments in light of subsequent work.

After testing CAD + random backoff (reported above), I went further and
implemented a full RSSI-based LBT approach in PR #2218 (Japan regulatory
compliance for ARIB STD-T108).

Key findings comparing the two approaches:

CAD (this PR)

  • Detects LoRa chirp signals (preamble + payload on SX126x/LR11xx;
    preamble only on SX127x) — non-LoRa interference (WiFi, Zigbee, etc.)
    is invisible to it
  • Fast detection (roughly a few symbol durations, so it scales with SF and BW)
  • Helps significantly for short messages
  • Not sufficient alone for Japanese regulatory requirements — ARIB
    STD-T108 certification testing starts with an unmodulated (CW) carrier
    sense test, which CAD cannot pass (it only responds to LoRa chirp
    patterns); RSSI-based detection passes this test directly

RSSI-based LBT (PR #2218 approach)

  • 5ms energy measurement — detects all RF energy, not just LoRa
  • -80 dBm absolute threshold (ARIB STD-T108 compliant)
  • Exponential backoff: 2000–16000ms base, +0–500ms jitter on free channel
  • ARIB 4-second airtime limit enforced via dynamic MAX_TEXT_LEN per coding rate:
    CR4/5 → 48 bytes (~16 JP chars), CR4/8 → 16 bytes (~5 JP chars)
  • Simultaneous DM test results at SF12/BW125 (Japan settings):
    • 16-byte / CR4/8: 3/3 success ✅
    • 48-byte / CR4/5: 3/3 success ✅
  • ARIB-compliant without CAD
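
A rough Python model of the RSSI-based approach summarized in the list above. It is not the code from PR #2218: the function names are hypothetical, treating the channel as busy when any sample reaches the threshold is an assumption, and so is doubling the base delay on each failed attempt within the stated 2000-16000 ms range.

```python
import random

RSSI_THRESHOLD_DBM = -80.0
BACKOFF_MIN_MS = 2000
BACKOFF_MAX_MS = 16000
FREE_JITTER_MAX_MS = 500


def channel_busy(rssi_samples_dbm, threshold_dbm=RSSI_THRESHOLD_DBM):
    """Busy if any sample taken during the sensing window reaches the threshold."""
    return max(rssi_samples_dbm) >= threshold_dbm


def next_wait_ms(busy, failed_attempts, rng=random):
    """Delay before the next attempt.

    Busy: base delay doubles per failed attempt, capped at BACKOFF_MAX_MS.
    Free: only a small random jitter is applied.
    """
    if not busy:
        return rng.randint(0, FREE_JITTER_MAX_MS)
    return min(BACKOFF_MIN_MS * (2 ** failed_attempts), BACKOFF_MAX_MS)
```

With these constants the busy-channel delay runs 2000, 4000, 8000, 16000 ms and then stays at the cap. Because the check is on received energy, an unmodulated carrier above -80 dBm registers as busy, which is the case CAD cannot cover.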

Note: the 150-char simultaneous DM result (91% success) reported in my
earlier comment was from a separate experiment — CAD (this PR) with
manually added RSSI+backoff on top, before the 4-second airtime limit
was introduced. Under Japan regulatory mode (PR #2218), 150-char messages
are not transmittable by design due to the airtime constraint.

@YSOFF's observation (RSSI ratio 5.8–6 vs CAD 3.7–4 favoring RSSI) is
interesting, though the picture is complex. RSSI-based detection sees all
RF energy regardless of modulation, while CAD only responds to LoRa chirp
patterns — so in environments with significant non-LoRa interference, RSSI
would have an advantage. However, CAD sensitivity can potentially reach
below typical ambient noise floors, meaning CAD may detect weak LoRa
signals that RSSI cannot see at all. This cuts both ways: CAD may catch
weak interference that RSSI misses, but may also flag channels as busy
when RSSI would consider them free. Which effect dominates likely depends
on the specific RF environment.

For Japan regulatory mode, I ended up removing CAD entirely — RSSI alone
with proper backoff is both simpler and legally correct. For general
high-traffic use (non-regulatory), CAD still has value as a lightweight
first-pass check, especially for short messages.

Jitter (0–500ms on free channel detection) prevents two nodes that
simultaneously detect a clear channel from transmitting at the same
instant. Exponential backoff is primarily designed to relieve heavily
congested channels — by increasing wait time with each failed attempt,
it spreads out retries across many nodes competing for the same channel,
reducing the overall collision rate under high traffic load. These
mechanisms are most clearly beneficial for group/channel message traffic
(fire-and-forget).

For DMs, the picture is more nuanced. Direct DMs with a known path make
4 attempts via that path before falling back to flood on the 5th. The
ACK timeout duration is dynamically calculated based on estimated airtime
and path length (number of hops), which varies significantly with SF and
BW, and to a meaningful extent with CR as well, making the retry cycle
length environment- and topology-dependent. For flood DMs (path unknown
from the start), both backoff and flood delivery probability interact,
making behavior harder to predict.

Finally, the ACK timeout retry interval itself is not randomized — nodes
retry at fixed intervals determined by ACK timeout. This means two nodes
that start transmitting simultaneously tend to collide again on each
retry, compounding the problem. Synchronized retry was clearly observed
under intentionally simultaneous test conditions; in real-world usage,
transmission timing is typically less correlated, so the effect may be
less pronounced — though in dense meshes like those reported in Germany,
near-simultaneous transmissions from multiple nodes are plausible enough
that retry synchronization remains a practical concern. In regulated
regions, duty cycle constraints may add another layer of complexity.

8 participants