Bounds-check pzTail in prepare_v2/v3 span overloads (fix AOOR variant of #430) #663

Open
jamescater2 wants to merge 1 commit into ericsink:main from jamescater2:fix/prepare-v2-pztail-bounds-check

@jamescater2

Summary

Adds a bounds check on p_tail before the tail-span arithmetic in the ReadOnlySpan<byte> overloads of sqlite3_prepare_v2 and sqlite3_prepare_v3. Without the check, native error paths that leave *pzTail unwritten cause the wrapper to throw ArgumentOutOfRangeException from sql.Slice instead of letting the error rc propagate to the caller.

Scope is deliberately narrow — this is the smallest defensive fix for the AOOR symptom. I haven't touched the AV side of the same bug family (#108, #321, #430, #479, #588) because that needs a larger, more invasive change; the AOOR variant closes cleanly here.

Root cause

In src/providers/provider.tt (and the four regenerated provider_e_sqlite3_*.cs files):

unsafe int ISQLite3Provider.sqlite3_prepare_v2(sqlite3 db, ReadOnlySpan<byte> sql, out IntPtr stm, out ReadOnlySpan<byte> tail)
{
    fixed (byte* p_sql = sql)
    {
        var rc = NativeMethods.sqlite3_prepare_v2(db, p_sql, sql.Length, out stm, out var p_tail);
        var len_consumed = (int) (p_tail - p_sql);     // no validation of p_tail
        int len_remain = sql.Length - len_consumed;
        if (len_remain > 0)
        {
            tail = sql.Slice(len_consumed, len_remain);   // throws AOOR on bad p_tail
        }
        else
        {
            tail = ReadOnlySpan<byte>.Empty;
        }
        return rc;
    }
}

Some native paths return without writing *pzTail — notably sqlite3LockAndPrepare returning SQLITE_MISUSE_BKPT when sqlite3SafetyCheckOk(db) fails or zSql is null. On those paths the caller's out byte* p_tail stays at its .locals init default of 0. Then:

  • (p_tail - p_sql) is a 64-bit signed difference. Cast to int, its sign depends on bit 31 of the low 32 bits of p_sql.
  • For managed-heap addresses on x64 where that bit is clear (common, and easy to reproduce by growing the GC heap past 2 GB), the cast produces a large negative len_consumed.
  • sql.Length - len_consumed then subtracts a negative number, so len_remain typically comes out as a large positive value; len_remain > 0 is true and sql.Slice(negative, positive) throws ArgumentOutOfRangeException through the stack instead of the caller seeing rc == SQLITE_MISUSE.
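
The truncation described in the bullets above can be reproduced without any native code. The following is a minimal sketch, not code from this PR: it assumes plain 64-bit integers standing in for the byte* values, and the class and method names (TailArithmeticDemo, Unguarded) are hypothetical.

```csharp
using System;

static class TailArithmeticDemo
{
    // Models the unguarded wrapper arithmetic. pSql and pTail are ordinary
    // 64-bit addresses used as stand-ins for byte* (an assumption made so
    // the sketch runs without unsafe code or a native library).
    public static (int lenConsumed, int lenRemain) Unguarded(long pSql, long pTail, int sqlLength)
    {
        // Same shape as `(int) (p_tail - p_sql)`: the 64-bit difference is
        // truncated to its low 32 bits.
        int lenConsumed = unchecked((int)(pTail - pSql));

        // Same shape as `sql.Length - len_consumed`.
        int lenRemain = unchecked(sqlLength - lenConsumed);

        return (lenConsumed, lenRemain);
    }
}
```

With pSql = 0x000001A240001000 (bit 31 of the low word clear), pTail = 0 and sqlLength = 16, this yields lenConsumed = -1073745920 and lenRemain = 1073745936, the combination that makes Slice throw. With pSql = 0x000001A2C0001000 (bit 31 set) it yields lenConsumed = 1073737728 and lenRemain = -1073737712, which falls into the empty-tail branch and hides the problem.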

Relationship to existing issues

src/SQLitePCLRaw.core/raw.cs:815 already carries the comment // #430 happens here — the call site is known. #108 / #321 / #430 / #479 / #588 are all the same root cause but surface as AccessViolationException when the stale/bad pointer lands in unmapped memory rather than valid memory at a bad offset. The ArgumentOutOfRangeException variant of the same bug seems to be under-reported; every public report I could find is the AV flavour. This PR closes the AOOR variant specifically.

Fix

Six added lines per overload (for prepare_v2 and prepare_v3):

if (p_tail != null && p_tail >= p_sql && p_tail <= p_sql + sql.Length)
{
    var len_consumed = (int) (p_tail - p_sql);
    int len_remain = sql.Length - len_consumed;
    tail = len_remain > 0 ? sql.Slice(len_consumed, len_remain) : ReadOnlySpan<byte>.Empty;
}
else
{
    tail = ReadOnlySpan<byte>.Empty;
}

On a valid p_tail within the SQL buffer, behaviour is identical. On a p_tail outside [p_sql, p_sql + sql.Length], the wrapper returns an empty tail and lets rc propagate — the caller now sees the real SQLITE_MISUSE (or whatever native returned) instead of an AOOR from inside the wrapper.
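
The behaviour on each side of the bounds check can be exercised with simulated addresses. This is a hedged sketch rather than the provider code: addresses are modelled as long values (real pointers compare as unsigned, which is assumed not to matter for user-mode addresses), and GuardedTailDemo and TailOrEmpty are hypothetical names.

```csharp
using System;

static class GuardedTailDemo
{
    // Returns the unparsed tail of `sql`, or an empty span when pTail does
    // not point into [pSql, pSql + sql.Length]. pSql and pTail are simulated
    // addresses, not real pointers (assumption for the sketch).
    public static ReadOnlySpan<byte> TailOrEmpty(ReadOnlySpan<byte> sql, long pSql, long pTail)
    {
        if (pTail != 0 && pTail >= pSql && pTail <= pSql + sql.Length)
        {
            // Inside the guard the difference is known to be in [0, sql.Length],
            // so the cast cannot truncate and Slice cannot throw.
            int lenConsumed = (int)(pTail - pSql);
            return sql.Slice(lenConsumed);
        }

        // Unwritten or out-of-range pTail: report no tail and let rc speak.
        return ReadOnlySpan<byte>.Empty;
    }
}
```

For a 19-byte buffer holding "SELECT 1; SELECT 2;" at a simulated base address, a pTail of base + 9 gives a 10-byte tail, while a pTail of 0, base + 19, or base + 20 all give an empty tail.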

Applied to:

  • src/providers/provider.tt (generator source of truth)
  • All four regenerated variants under src/SQLitePCLRaw.provider.e_sqlite3/Generated/ (funcptrs × win/notwin, prenet5 × win/notwin)

Other providers (sqlite3, sqlcipher, winsqlite3, dynamic_cdecl, dynamic_stdcall) share the same template — regenerating them will pick up the same fix on the next gen_providers run.

Regression tests

Added in src/common/tests_xunit.cs:

  • test_prepare_v2_span_tolerates_uninitialised_pzTail
  • test_prepare_v3_span_tolerates_uninitialised_pzTail

Each:

  1. Opens an in-memory sqlite3, then calls manual_close_v2() to zero the handle.
  2. Loops 512 iterations, each calling prepare_v2 / prepare_v3 with non-empty SQL.
  3. Uses a rolling allocator (random 1 KB–128 KB fillers) to force managed-heap addresses into the bit-31-clear range where the bug fires deterministically.
  4. Asserts rc == SQLITE_MISUSE and tail.IsEmpty on every iteration.
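
For readers unfamiliar with the term, a "rolling allocator" of the kind step 3 mentions might look roughly like the sketch below. This is an assumption about its shape, not the code added in tests_xunit.cs: the class name, the window capacity, and the seeded Random are all hypothetical, and the sketch makes no claim about which address range the runtime ends up using.

```csharp
using System;
using System.Collections.Generic;

// Keeps a sliding window of randomly sized filler arrays alive so that
// later allocations are pushed to different managed-heap addresses.
sealed class RollingAllocator
{
    private readonly Queue<byte[]> _window = new Queue<byte[]>();
    private readonly Random _rng;
    private readonly int _capacity;

    public RollingAllocator(int capacity, int seed)
    {
        _capacity = capacity;
        _rng = new Random(seed);
    }

    // Number of filler arrays currently kept alive.
    public int LiveCount => _window.Count;

    // Allocates one filler of 1 KB to 128 KB (inclusive), drops the oldest
    // filler once the window is full, and returns the size just allocated.
    public int Churn()
    {
        int size = _rng.Next(1024, 128 * 1024 + 1);
        _window.Enqueue(new byte[size]);
        if (_window.Count > _capacity)
        {
            _window.Dequeue();
        }
        return size;
    }
}
```

A test loop would call Churn() once per iteration before building the SQL buffer, so each prepare call sees a buffer at a different address.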

Verified locally against both the patched and unpatched provider packages via a scratch project:

| Build     | Iterations | AOOR count |
|-----------|------------|------------|
| Unpatched | 2000       | 2000       |
| Patched   | 2000       | 0          |
Same test, same address range, opposite outcomes.

Test project modernisation

While I was here, I found that src/tests/tests.csproj on current main doesn't build on modern .NET SDKs: it targeted netcoreapp3.1, project-referenced a deleted SQLitePCLRaw.nativelibrary, and called a 3-arg NativeLibrary.Load overload that no longer exists. I bumped it to the minimum needed to run:

  • TargetFramework set to $(tfm_net) (net8.0)
  • Dropped the deleted SQLitePCLRaw.nativelibrary and unused SQLitePCLRaw.provider.dynamic_cdecl project references
  • Added ProjectReference to SQLitePCLRaw.provider.e_sqlite3 and PackageReference to SourceGear.sqlite3 for the native library
  • Rewrote src/tests/my_batteries_v2.cs to init against SQLite3Provider_e_sqlite3 directly instead of the deleted dynamic-load path
  • Dropped the stale DotNetCliToolReference for dotnet-reportgenerator-cli (deprecated syntax)

Full suite now runs green: dotnet test src/tests/tests.csproj -c Release — 124 / 124 pass in ~300 ms on my box. Happy to split this into a separate prep-PR if you'd prefer it that way.

Verification on a downstream consumer

A WPF / .NET 10 Windows application that was reliably tripping this AOOR at ~40 % per full-suite run (6 / 15 runs of a 32-way concurrent upsert test, hitting ExecuteScalar("PRAGMA journal_mode=WAL;") through Microsoft.Data.Sqlite 10.0.7) ran cleanly for 30 / 30 full-suite runs against a locally-built SQLitePCLRaw.bundle_e_sqlite3 with this patch applied — zero AOOR, zero AV. All environmental mitigations were peeled off for the A/B (test-collection serialization removed, contention tune-down reverted to 32 × 100). A fresh stack trace from one of the failing unpatched runs is available if useful.


On native error paths where sqlite3_prepare_v2/v3 returns without
writing *pzTail (e.g. sqlite3LockAndPrepare returning
SQLITE_MISUSE_BKPT when the db handle fails sqlite3SafetyCheckOk or
zSql is null), the wrapper's `out byte* p_tail` stays at zero.
The subsequent `(int)(p_tail - p_sql)` can truncate to a negative
int depending on the managed-heap address, which makes
sql.Slice(len_consumed, len_remain) throw
ArgumentOutOfRangeException instead of letting the error rc
propagate.

Guard the tail-span construction behind an explicit bounds check:
only compute the slice when p_tail falls within [p_sql, p_sql +
sql.Length]; otherwise return an empty tail. Normal happy paths
are unchanged.

This is the ArgumentOutOfRangeException variant of the bug family
documented in ericsink#108, ericsink#321, ericsink#430, ericsink#479, ericsink#588; those reports surface
as AccessViolationException when the stale pointer lands in
unmapped memory. Both variants share the same call site
(raw.cs:815, already annotated "// ericsink#430 happens here").

Also adds regression tests in src/common/tests_xunit.cs that
deterministically fire the AOOR on unpatched builds by closing the
db with manual_close_v2 and growing the heap into the bit-31-clear
address range.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ericsink
Owner

Interesting PR. Thanks. I'll take a closer look ASAP.
