Add file-based transfer APIs to build cache provider interface and all cache plugins#5746
Add file-based transfer APIs to build cache provider interface and all cache plugins#5746
Conversation
|
@copilot - add APIs to |
Added |
|
@copilot - are there async variants of those |
No, |
|
@copilot - take a look at the other |
Added |
|
@copilot - can we add stream support to the other cache plugins? |
Added streaming support to both remaining cache plugins: Amazon S3 plugin (
Azure Storage plugin (
All three cache plugins (HTTP, S3, Azure) now implement both streaming methods. All existing tests pass (45 S3, 10 Azure). |
|
@copilot - Can you put usage of this functionality behind a Rush experiment? |
Added a |
...rushstack/node-core-library/copilot-stream-cache-entry-for-http-plugin_2026-04-05-04-27.json
Outdated
Show resolved
Hide resolved
common/changes/@microsoft/rush/copilot-stream-cache-entry-for-http-plugin_2026-04-05-03-56.json
Outdated
Show resolved
Hide resolved
libraries/rush-lib/src/logic/buildCache/FileSystemBuildCacheProvider.ts
Outdated
Show resolved
Hide resolved
rush-plugins/rush-amazon-s3-build-cache-plugin/src/AmazonS3Client.ts
Outdated
Show resolved
Hide resolved
rush-plugins/rush-amazon-s3-build-cache-plugin/src/AmazonS3Client.ts
Outdated
Show resolved
Hide resolved
rush-plugins/rush-amazon-s3-build-cache-plugin/src/AmazonS3Client.ts
Outdated
Show resolved
Hide resolved
…ix WebClient private member type mismatch
Agent-Logs-Url: https://github.com/microsoft/rushstack/sessions/13978d72-ea9a-4463-a6a2-51ecfc72f3ff Co-authored-by: iclanton <5010588+iclanton@users.noreply.github.com>
Agent-Logs-Url: https://github.com/microsoft/rushstack/sessions/13978d72-ea9a-4463-a6a2-51ecfc72f3ff Co-authored-by: iclanton <5010588+iclanton@users.noreply.github.com>
…ype compatibility Agent-Logs-Url: https://github.com/microsoft/rushstack/sessions/d8f5e137-3cb0-4ed9-8391-50d7a33d804e Co-authored-by: iclanton <5010588+iclanton@users.noreply.github.com>
…error logging, stream cleanup, and nits Agent-Logs-Url: https://github.com/microsoft/rushstack/sessions/d0190bf7-a346-4a71-93bf-d5375d98b552 Co-authored-by: iclanton <5010588+iclanton@users.noreply.github.com>
Agent-Logs-Url: https://github.com/microsoft/rushstack/sessions/86050f65-dd6c-45f4-ac41-95fdb860c053 Co-authored-by: iclanton <5010588+iclanton@users.noreply.github.com>
…alue The streaming path was fixed in 8becd08 but the buffer path still printed the entire encodings array instead of the individual value. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deduplicate the S3 prefix logic (repeated 4 times) into a helper, and extract the write-permission guard (repeated in buffer and stream set methods) to match the HTTP provider's pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both the buffer and streaming response paths duplicated the same Content-Encoding header parsing logic. Extract into a shared helper that returns a parsed string array or undefined. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the outer-scoped cacheEntryBuffer (which was only used as a boolean flag at the cache-miss check) with an explicit cloudCacheHit boolean. This scopes the buffer into the else branch and makes the streaming path set the flag consistently too. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A new WebClient was being constructed on every call to _makeHttpCoreRequestAsync. Since the provider never configures any WebClient instance state, a single class member suffices. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace verbose `terminal: terminal`, `headers: headers`, `body: body`, and `credential: credential` with ES6 shorthand property syntax. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix credential fallback test: mock CredentialCache so cached credentials are available, making the test actually validate the stream-body guard (previously, the test passed trivially because _tryGetCredentialsAsync would throw before making a second request) - Fix 504 statusText from 'BadGateway' to 'Gateway Timeout' - Replace fragile S3 upload snapshot that captured Readable internals (breaks on Node.js upgrades) with targeted assertions on URL, verb, headers, and body identity - Replace fail() + try/catch with expect().rejects.toThrow() - Move Readable import to top of S3 test file instead of per-test dynamic import Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ea777a0 to
f9ca5b7
Compare
Replace stream-based ICloudBuildCacheProvider methods with file-path-based alternatives that give providers full control over I/O: - tryGetCacheEntryStreamByIdAsync → tryGetCacheEntryToFileAsync - trySetCacheEntryStreamAsync → trySetCacheEntryFromFileAsync Key improvements: - S3 uploads now hash the tarball on disk before streaming, restoring AWS Signature V4 payload signing (removes UNSIGNED_PAYLOAD) - Azure provider uses SDK-native uploadFile/downloadToFile - HTTP provider uses FileSystem.createReadStream/createWriteStreamAsync - Providers that don't need pre-upload computation (HTTP, Azure) don't pay the cost of hashing - Experiment renamed to useDirectFileTransfersForBuildCache Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- tryGetCacheEntryToFileAsync → tryDownloadCacheEntryToFileAsync - trySetCacheEntryFromFileAsync → tryUploadCacheEntryFromFileAsync Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mock FileSystem.createReadStream, createWriteStreamAsync, and ensureFolderAsync via jest.spyOn instead of module-level fs mocks. This is more targeted, less brittle, and consistent with the production code's use of the FileSystem abstraction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
If tryDownloadCacheEntryToFileAsync throws mid-download, a corrupt partial file could be left at the target path. On the next build, tryGetCacheEntryPathByIdAsync would find it and try to untar it. Delete the file in the catch block to prevent this. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The HTTP and S3 providers both ensure the parent directory exists before writing the cache entry file. The Azure provider was missing this, which would cause failures on a fresh machine where the build cache folder hasn't been created yet. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rush change file: replace "streaming APIs" with file-based transfer API names and mention the experiment flag - node-core-library change file: fix broken backtick formatting - Azure provider: update JSDoc from "stream" to "file-based" - HTTP test names: replace "stream consumed" / "stream bodies" with file-based language Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The delay variable is in milliseconds but the log message said "s". This produced misleading output like "Will retry request in 4000s...". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… signing Remove unused onBlobAlreadyExists parameter from _trySetBlobDataAsync This callback was previously used to drain incoming streams when the blob already existed. With the switch to file-based APIs, no callers pass this parameter anymore.
…ore-library and update consumers to use FileSystem instead of fs directly Wrap FileSystem stream methods in _wrapException for consistent error handling
- HTTP: add 404 cache miss test for tryDownloadCacheEntryToFileAsync - HTTP: add pipeline assertion in download success test - S3: add retry test for downloadObjectToFileAsync on transient 5xx - S3: add pipeline assertions in download success/miss tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For stream-body requests, the log read "unknown bytes" which is awkward. Change to "unknown length" for clarity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| type FileSystemCopyFilesAsyncFilter, | ||
| type FileSystemCopyFilesFilter, | ||
| type FileSystemReadStream, | ||
| type FileSystemWriteStream, |
There was a problem hiding this comment.
Adding this should be it's own PR.
| "Authorization": "AWS4-HMAC-SHA256 Credential=accessKeyId/20200418/us-east-1/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=194608e9e7ba6d8aa4a019b3b6fd237e6b09ef1f45ff7fa60cbb81c1875538be", | ||
| "x-amz-content-sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855", | ||
| "x-amz-date": "20200418T123242Z", |
There was a problem hiding this comment.
This looks rather sketchy to be here, even if I realize it only talks to localhost
| }); | ||
| }); | ||
|
|
||
| describe('File-based requests', () => { |
There was a problem hiding this comment.
An approach I've found helpful for this sort of code is to use a localhost server instead of mocking everything.
| case WebClientProxy.Fiddler: | ||
| // For debugging, disable cert validation | ||
| // eslint-disable-next-line | ||
| process.env['NODE_TLS_REJECT_UNAUTHORIZED'] = '0'; |
There was a problem hiding this comment.
You should just be able to inject the CA cert? There are a few different ways to have NodeJS pick up the system CAs; environment variable, CLI parameter, or you can manually update the set of accepted CAs when initializing the connection.
| body: entryStream, | ||
| warningText: 'Could not write cache entry', | ||
| // Streaming uploads cannot be retried because the stream is consumed | ||
| maxAttempts: 1 |
There was a problem hiding this comment.
Then why is this even an option to the function?
Summary
Adds optional file-based transfer APIs to the build cache provider interface (
ICloudBuildCacheProvider) and implements them across all cache plugins (HTTP, Amazon S3, Azure Blob Storage). When enabled via theuseDirectFileTransfersForBuildCacheexperiment flag, cache entries are transferred directly between local files and cloud storage without buffering entire contents in memory, preventing out-of-memory errors for large build outputs.Details
Core changes:
ICloudBuildCacheProvidergains two optional methods:tryDownloadCacheEntryToFileAsyncandtryUploadCacheEntryFromFileAsync. Providers that don't implement them gracefully fall back to the existing buffer-based APIs.OperationBuildCacheconditionally uses the file-based path whenuseDirectFileTransfersForBuildCacheis enabled and the provider supports it. Includes cleanup of partial files on failed downloads.WebClientis refactored to extract a shared_makeRawRequestAsynccore used by both buffer and streaming request paths, with a newfetchStreamAsyncmethod and Content-Encoding decompression support for streaming responses.FileSystemin node-core-library gainscreateReadStream,createWriteStream, andcreateWriteStreamAsyncmethods (wrapped in_wrapExceptionfor consistent error handling).FileSystemBuildCacheProvideris simplified — the stream method is removed since cloud providers now handle file I/O directly.Plugin implementations:
fetchStreamAsync→pipeline()to file. Uploads viacreateReadStream→fetchStreamAsync. UsesmaxAttempts: 1for uploads (stream consumed after first attempt), with credential fallback skipped for stream bodies._hashFileAsync, then stream with the SHA-256 hash included in the AWS Signature V4 request — restoring full payload signing (no moreUNSIGNED_PAYLOAD). No retry on uploads.blobClient.downloadToFile(). Uploads viablockBlobClient.uploadFile(). Parent directory creation ensured before download.Gating:
useDirectFileTransfersForBuildCacheinexperiments.json. Defaults to off. Falls back to buffer-based APIs if the cloud provider plugin doesn't implement the file-based methods.How it was tested
HttpBuildCacheProvider(14 tests): buffer and file-based GET/SET, 404 cache miss, credential fallback skip for file uploads, write-not-allowed checks, retry behavior, pipeline assertionsAmazonS3Client(38 tests): buffer and file-based GET/SET, signed payload hash verification (not UNSIGNED-PAYLOAD), download retry on transient 5xx, no-retry on upload, credential validation, pipeline assertionsImpacted documentation
experiments.schema.jsonupdated withuseDirectFileTransfersForBuildCachedescriptioncommon/reviews/api/rush-lib.api.mdupdated with new API surfacecommon/reviews/api/node-core-library.api.mdupdated with newFileSystemstream methods