fix: restore original recursion limit after dendrogram computation#1306
fix: restore original recursion limit after dendrogram computation#1306kunal-10-cloud wants to merge 3 commits intomalariagen:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR prevents plot_haplotype_clustering() and plot_diplotype_clustering() from permanently modifying the process-wide Python recursion limit by restoring the prior limit after dendrogram computation (even on exceptions), addressing issue #1305.
Changes:
- Save the current recursion limit before temporarily increasing it for dendrogram computation.
- Wrap clustering/plotting logic in
try/finallyto guarantee restoring the original recursion limit.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
malariagen_data/anoph/hapclust.py |
Wrapes recursion-limit elevation in try/finally so the original limit is restored after haplotype dendrogram plotting. |
malariagen_data/anoph/dipclust.py |
Applies the same try/finally recursion-limit restoration pattern for diplotype dendrogram plotting. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # This is needed to avoid RecursionError on some clustering analyses | ||
| # with larger numbers of nodes. | ||
| sys.setrecursionlimit(10_000) | ||
|
|
||
| # Load sample metadata. | ||
| df_samples = self.sample_metadata( | ||
| sample_sets=sample_sets, | ||
| sample_query=sample_query, | ||
| sample_query_options=sample_query_options, | ||
| ) | ||
| # with larger numbers of nodes. Save and restore the original limit to | ||
| # avoid permanently modifying global interpreter state. | ||
| _original_limit = sys.getrecursionlimit() | ||
| try: | ||
| sys.setrecursionlimit(10_000) |
There was a problem hiding this comment.
Consider adding a regression test that asserts sys.getrecursionlimit() is unchanged after calling plot_diplotype_clustering() (and ideally also when an exception is raised during clustering/plotting). This ensures the try/finally restoration behavior remains covered and prevents future regressions.
| # This is needed to avoid RecursionError on some haplotype clustering analyses | ||
| # with larger numbers of haplotypes. | ||
| sys.setrecursionlimit(10_000) | ||
| # with larger numbers of haplotypes. Save and restore the original limit to | ||
| # avoid permanently modifying global interpreter state. | ||
| _original_limit = sys.getrecursionlimit() | ||
| try: | ||
| sys.setrecursionlimit(10_000) |
There was a problem hiding this comment.
Consider adding a regression test that asserts sys.getrecursionlimit() is unchanged after calling plot_haplotype_clustering() (and ideally also when an exception is raised inside the dendrogram computation). This is the core behavior change, and without a test it’s easy to regress back to permanently altering the global recursion limit.
Both hapclust.py and dipclust.py called sys.setrecursionlimit(10_000) without restoring the original limit, permanently modifying global Python state for the entire process. This wraps the elevated limit in a try/finally block to ensure the original limit is always restored, preventing side-effect pollution in long-running sessions. Fixes malariagen#1305
048af0c to
74c5e31
Compare
|
Hi @jonbrenas Just a friendly nudge on this PR whenever you get a chance! Quick summary: plot_haplotype_clustering() and plot_diplotype_clustering() currently call sys.setrecursionlimit(10_000) but never restore the original limit. This means a single plotting call permanently elevates the recursion limit from 1,000 to 10,000 for the entire Python session — silently masking RecursionError bugs in user code and increasing memory risk in multi-threaded environments (e.g., dask workers). The fix simply wraps the existing code in try/finally to restore the original limit after computation. No new dependencies, no API changes, no behaviour change for the dendrogram itself. It's a small, self-contained fix — just two files changed with the same pattern applied to each. Full test suite passes (1041 passed). Should be a quick review! Happy to address any feedback. Thanks! |
Summary
sys.setrecursionlimit(10_000)calls inhapclust.pyanddipclust.pywithtry/finallyblocks to restore the original recursion limit after dendrogram computationplot_haplotype_clustering()orplot_diplotype_clustering()permanently elevated the global Python recursion limit from 1,000 to 10,000 for the entire processfinallyblock, ensuring cleanup even if an exception occursProblem
Both
hapclust.py(line 89) anddipclust.py(line 92) calledsys.setrecursionlimit(10_000)as a side effect of plotting dendrograms but never restored the original limit. This:RecursionErrorbugs in user code or other libraries are silently suppressed because the limit is 10x higherChanges
malariagen_data/anoph/hapclust.pysys.getrecursionlimit()before elevationtry/finallyto guarantee restorationmalariagen_data/anoph/dipclust.pyTest plan
test_hapclust.py,test_dipclust.py)ruff checkpasses with no issuesruff formatapplied — no formatting violationsFixes #1305