Fix atom-map crash for constitutional-isomer cuts (H-abs with identical donors)#877
Fix atom-map crash for constitutional-isomer cuts (H-abs with identical donors)#877
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes an atom-mapping crash for H-abstraction reactions where scissored fragments can be constitutional isomers with identical formulas, and adds a reaction-family-based fallback for reactive bond detection when mapping fails.
Changes:
- Add a strict-first, two-pass pairing strategy for scissored reactant/product fragments to avoid formula-only “isomorphism” matches.
- Add defensive handling for per-pair mapping failures (
None) to avoid crashes and optionally retry alternate product dictionaries. - Allow reactive-bond queries without an atom map by deriving formed/broken/changed bonds from the RMG family recipe +
r_label_map.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| arc/mapping/engine.py | Adds strict isomorphism mode and strict-first fragment pairing; adds None guard in glue_maps. |
| arc/mapping/driver.py | Detects failed fragment maps and retries next product dict instead of crashing. |
| arc/mapping/engine_test.py | Adds tests covering strict mode and strict-first pairing behavior for formula isomers. |
| arc/reaction/reaction.py | Moves atom-map requirement behind r_bonds_only short-circuit; adds family-recipe fallback for reactive bonds. |
| arc/reaction/reaction_test.py | Adds regression test ensuring reactive-bond APIs work when atom_map is unavailable. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # 1) Build base map | ||
| am_dict: Dict[int,int] = {} | ||
| for map_list, (r_cut, p_cut) in zip(maps, pairs): | ||
| if map_list is None: | ||
| logger.warning(f'glue_maps: received a None per-pair map for ' | ||
| f'{r_cut.mol.smiles} -> {p_cut.mol.smiles}; cannot build atom map.') | ||
| return None |
There was a problem hiding this comment.
glue_maps() now returns None when a per-pair map is None, but its signature/docstring still indicate it always returns List[int]. Please update the return annotation (and docs) to Optional[List[int]] so callers and type checkers match the actual behavior.
| @@ -969,12 +969,80 @@ def get_bonds(self, | |||
| mapped_p_bonds.append(tuple([self.atom_map.index(p_bond[0]), self.atom_map.index(p_bond[1])])) | |||
| return r_bonds, p_bonds | |||
There was a problem hiding this comment.
mapped_p_bonds is computed but never used (and the remapping logic appears inconsistent with how p_bonds is already mapped). This dead code is confusing and should be removed, or the function should return mapped_p_bonds if that’s the intended output.
r_cut_p_cut_isomorphic matched purely on molecular-formula fingerprint OR graph isomorphism, so constitutional isomers sharing a formula (e.g. the alpha vs beta pentyl-ether radicals in H-abstraction between identical donors) got paired as "isomorphic" and handed to map_two_species, which then failed to superimpose them and returned None. Add a strict flag that requires full graph isomorphism, and run pairing_reactants_and_products_for_mapping in two passes: strict first, then the loose fingerprint-or-isomorphic fallback for any unmatched r-cuts so rearrangements whose cuts aren't strictly isomorphic (e.g. Intra_Disproportionation, ring-openings) still work. Also guard glue_maps against a None per-pair map so the caller sees a clean None instead of a TypeError when a future mapping regression slips through.
map_rxn blindly handed fragment_maps to glue_maps. Any None entry (from map_two_species giving up on a pair) would crash inside glue_maps trying to subscript the None. Detect None entries up front, log which pairs failed, and either advance to the next product_dict index (same recovery path used by the existing template-order error branch) or return None so upstream consumers can fall back cleanly.
NMD (check_normal_mode_displacement) only needs the reactive bond set in reactant index space - which is canonically given by the RMG family recipe (BREAK_BOND / FORM_BOND / CHANGE_BOND actions) plus r_label_map. Previously NMD routed through reaction.get_bonds, which hard-required self.atom_map and raised otherwise, so any atom-mapping failure killed the whole TS validation even though the reactive bonds were directly computable. Three changes: - get_bonds: move the atom_map check below the r_bonds_only=True short- circuit. Reactant bonds never touch atom_map, so r_bonds_only should not require one. - _get_reactive_bonds_from_family: new helper that reads the family's actions (via ReactionFamily(...).actions) and resolves labels through r_label_map into reactant-indexed bond tuples. - get_formed_and_broken_bonds / get_changed_bonds: when self.atom_map is None but family + product_dicts are present, use the helper. Otherwise unchanged. Emits a warning so the fallback path is diagnosable.
5f3b022 to
1324fda
Compare
Summary
glue_mapsfor H-abstraction reactions where the two radicals on the radical side are positional isomers of the same skeleton (e.g.CC[CH]OCC + CCCOCC <=> C[CH]OCCC + CCCOCC), becauser_cut_p_cut_isomorphictreated molecular-formula equality (fingerprint) as sufficient evidence of isomorphism and paired genuinely-different cuts.pairing_reactants_and_products_for_mappingto a two-pass strategy: strict graph isomorphism first, then the original loose fingerprint-or-isomorphic fallback for any unmatched r-cuts (preservesIntra_Disproportionation, ring-opening C10H10 etc.).glue_mapsandmap_rxn, plus a family-recipe-based fallback inreaction.get_formed_and_broken_bonds/get_changed_bondsso NMD can still validate a TS even when atom-mapping truly fails.Commits
glue_mapsguard.atom_mapcheck inget_bondsbelow ther_bonds_only=Trueshort-circuit so reactant-only queries no longer require a map.Test plan
arc.mapping.engine_test(67 tests incl. newtest_r_cut_p_cut_isomorphic_strictandtest_pairing_prefers_strict_match_for_formula_isomers)arc.mapping.driver_testarc.reaction.reaction_test(incl. newtest_get_bonds_and_reactive_bonds_without_atom_map)arc.checks.nmd_test