Fix prefetch-dataset hang on GPU #207

Draft
ethanglaser wants to merge 1 commit into IntelPython:main from
ethanglaser:dev/eglaser-sklbench-gpu-fix

Conversation

@ethanglaser
Contributor

@ethanglaser ethanglaser commented Apr 7, 2026

Description

Dataset prefetching with --prefetch-datasets was hanging all subsequent GPU benchmarks. multiprocessing.Pool defaults to the fork start method on Linux, and forking corrupts the SYCL/Level Zero GPU runtime when the child processes exit. This PR switches the pool to the spawn start method to avoid that.
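For illustration only, the change described above could look roughly like the sketch below. It is not the code from this PR: `load_dataset`, `make_prefetch_context`, and `prefetch_datasets` are hypothetical names, and the loader body is a placeholder. The only real API relied on is `multiprocessing.get_context("spawn")`, which returns a context whose `Pool` starts workers in fresh interpreters instead of forking the parent.

```python
import multiprocessing as mp


def load_dataset(name):
    # Hypothetical stand-in for the real dataset loader. It must be a
    # module-level function so that spawned workers can import it.
    return name, len(name)


def make_prefetch_context():
    # Request "spawn" explicitly instead of relying on the platform default.
    return mp.get_context("spawn")


def prefetch_datasets(names, n_jobs=2):
    # Workers created from this context do not inherit the parent's
    # in-memory state, so no already-initialized runtime is copied into them.
    ctx = make_prefetch_context()
    with ctx.Pool(processes=n_jobs) as pool:
        return dict(pool.map(load_dataset, names))
```

With spawn, the calling script has to invoke `prefetch_datasets(...)` under an `if __name__ == "__main__":` guard, because each worker re-imports the main module on startup.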

More work is still needed on this.

Via Claude


Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

@ethanglaser
Contributor Author

ethanglaser commented Apr 7, 2026

GPU run shows some progress but still has issues: http://intel-ci.intel.com/f132cfc2-18cc-f185-949b-a4bf010d0e2d
CPU run to verify there are no issues on the CPU side: http://intel-ci.intel.com/f132cfa6-f997-f1ab-a141-a4bf010d0e2d

@david-cortes-intel
Contributor

@ethanglaser Are you sure this is about the SYCL runtime and not OpenMP? If it is about OpenMP, it could be fixed by switching whichever OpenMP runtime the Python process is picking up to Intel's or LLVM's, both of which can work in forked processes. That should be relatively easy if using conda.
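One way to check the OpenMP hypothesis is to see which OpenMP runtime the benchmark process has actually loaded. The sketch below is an assumption-laden helper, not part of this PR: it is Linux-only, reads `/proc/self/maps`, and matches the usual shared-library names `libgomp` (GNU), `libiomp5` (Intel), and `libomp` (LLVM). The function names are hypothetical.

```python
import re

# Library name prefixes of common OpenMP runtimes: GNU, Intel, LLVM.
_OPENMP_PATTERN = re.compile(r"(libgomp|libiomp5|libomp)[^/\s]*\.so")


def find_openmp_runtimes(maps_text):
    """Return the sorted OpenMP runtime names found in /proc/<pid>/maps text."""
    return sorted({m.group(1) for m in _OPENMP_PATTERN.finditer(maps_text)})


def loaded_openmp_runtimes():
    # Linux-only: /proc/self/maps lists the shared objects mapped into
    # the current process.
    with open("/proc/self/maps") as f:
        return find_openmp_runtimes(f.read())
```

Calling `loaded_openmp_runtimes()` after the benchmark's imports have run, but before the pool is created, would show whether `libgomp` is present in the parent process.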

