Skip to content

Fix atom-map crash for constitutional-isomer cuts (H-abs with identical donors)#877

Open
calvinp0 wants to merge 3 commits intomainfrom
fix-atom-map-pairing-formula-isomers
Open

Fix atom-map crash for constitutional-isomer cuts (H-abs with identical donors)#877
calvinp0 wants to merge 3 commits intomainfrom
fix-atom-map-pairing-formula-isomers

Conversation

@calvinp0
Copy link
Copy Markdown
Member

Summary

  • Atom-mapping crashed in glue_maps for H-abstraction reactions where the two radicals on the radical side are positional isomers of the same skeleton (e.g. CC[CH]OCC + CCCOCC <=> C[CH]OCCC + CCCOCC), because r_cut_p_cut_isomorphic treated molecular-formula equality (fingerprint) as sufficient evidence of isomorphism and paired genuinely-different cuts.
  • Switch pairing_reactants_and_products_for_mapping to a two-pass strategy: strict graph isomorphism first, then the original loose fingerprint-or-isomorphic fallback for any unmatched r-cuts (preserves Intra_Disproportionation, ring-opening C10H10 etc.).
  • Add defensive None-handling in glue_maps and map_rxn, plus a family-recipe-based fallback in reaction.get_formed_and_broken_bonds / get_changed_bonds so NMD can still validate a TS even when atom-mapping truly fails.

Commits

  1. mapping/engine: two-pass pairing to reject formula-only matches — root cause fix + glue_maps guard.
  2. mapping/driver: surface failed pair mappings and retry next product dict — clean error path replacing the subscript-None crash.
  3. reaction: derive reactive bonds from RMG family when atom_map is absent — NMD safety net; moves atom_map check in get_bonds below the r_bonds_only=True short-circuit so reactant-only queries no longer require a map.

Test plan

  • arc.mapping.engine_test (67 tests incl. new test_r_cut_p_cut_isomorphic_strict and test_pairing_prefers_strict_match_for_formula_isomers)
  • arc.mapping.driver_test
  • arc.reaction.reaction_test (incl. new test_get_bonds_and_reactive_bonds_without_atom_map)
  • arc.checks.nmd_test
  • 122/122 passing locally

Copilot AI review requested due to automatic review settings April 20, 2026 14:34
Comment thread arc/reaction/reaction_test.py Fixed
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes an atom-mapping crash for H-abstraction reactions where scissored fragments can be constitutional isomers with identical formulas, and adds a reaction-family-based fallback for reactive bond detection when mapping fails.

Changes:

  • Add a strict-first, two-pass pairing strategy for scissored reactant/product fragments to avoid formula-only “isomorphism” matches.
  • Add defensive handling for per-pair mapping failures (None) to avoid crashes and optionally retry alternate product dictionaries.
  • Allow reactive-bond queries without an atom map by deriving formed/broken/changed bonds from the RMG family recipe + r_label_map.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
arc/mapping/engine.py Adds strict isomorphism mode and strict-first fragment pairing; adds None guard in glue_maps.
arc/mapping/driver.py Detects failed fragment maps and retries next product dict instead of crashing.
arc/mapping/engine_test.py Adds tests covering strict mode and strict-first pairing behavior for formula isomers.
arc/reaction/reaction.py Moves atom-map requirement behind r_bonds_only short-circuit; adds family-recipe fallback for reactive bonds.
arc/reaction/reaction_test.py Adds regression test ensuring reactive-bond APIs work when atom_map is unavailable.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread arc/mapping/engine.py
Comment on lines 1283 to +1289
# 1) Build base map
am_dict: Dict[int,int] = {}
for map_list, (r_cut, p_cut) in zip(maps, pairs):
if map_list is None:
logger.warning(f'glue_maps: received a None per-pair map for '
f'{r_cut.mol.smiles} -> {p_cut.mol.smiles}; cannot build atom map.')
return None
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glue_maps() now returns None when a per-pair map is None, but its signature/docstring still indicate it always returns List[int]. Please update the return annotation (and docs) to Optional[List[int]] so callers and type checkers match the actual behavior.

Copilot uses AI. Check for mistakes.
Comment thread arc/reaction/reaction.py Outdated
Comment on lines 967 to 970
@@ -969,12 +969,80 @@ def get_bonds(self,
mapped_p_bonds.append(tuple([self.atom_map.index(p_bond[0]), self.atom_map.index(p_bond[1])]))
return r_bonds, p_bonds
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mapped_p_bonds is computed but never used (and the remapping logic appears inconsistent with how p_bonds is already mapped). This dead code is confusing and should be removed, or the function should return mapped_p_bonds if that’s the intended output.

Copilot uses AI. Check for mistakes.
r_cut_p_cut_isomorphic matched purely on molecular-formula fingerprint OR
graph isomorphism, so constitutional isomers sharing a formula (e.g. the
alpha vs beta pentyl-ether radicals in H-abstraction between identical
donors) got paired as "isomorphic" and handed to map_two_species, which
then failed to superimpose them and returned None.

Add a strict flag that requires full graph isomorphism, and run
pairing_reactants_and_products_for_mapping in two passes: strict first,
then the loose fingerprint-or-isomorphic fallback for any unmatched
r-cuts so rearrangements whose cuts aren't strictly isomorphic (e.g.
Intra_Disproportionation, ring-openings) still work.

Also guard glue_maps against a None per-pair map so the caller sees a
clean None instead of a TypeError when a future mapping regression slips
through.
map_rxn blindly handed fragment_maps to glue_maps. Any None entry (from
map_two_species giving up on a pair) would crash inside glue_maps trying
to subscript the None. Detect None entries up front, log which pairs
failed, and either advance to the next product_dict index (same recovery
path used by the existing template-order error branch) or return None so
upstream consumers can fall back cleanly.
NMD (check_normal_mode_displacement) only needs the reactive bond set in
reactant index space - which is canonically given by the RMG family
recipe (BREAK_BOND / FORM_BOND / CHANGE_BOND actions) plus r_label_map.
Previously NMD routed through reaction.get_bonds, which hard-required
self.atom_map and raised otherwise, so any atom-mapping failure killed
the whole TS validation even though the reactive bonds were directly
computable.

Three changes:

- get_bonds: move the atom_map check below the r_bonds_only=True short-
  circuit. Reactant bonds never touch atom_map, so r_bonds_only should
  not require one.

- _get_reactive_bonds_from_family: new helper that reads the family's
  actions (via ReactionFamily(...).actions) and resolves labels through
  r_label_map into reactant-indexed bond tuples.

- get_formed_and_broken_bonds / get_changed_bonds: when self.atom_map is
  None but family + product_dicts are present, use the helper. Otherwise
  unchanged. Emits a warning so the fallback path is diagnosable.
@calvinp0 calvinp0 force-pushed the fix-atom-map-pairing-formula-isomers branch from 5f3b022 to 1324fda Compare April 20, 2026 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants