Integrate Atomic Simulation Environment (ASE) as a Job Adapter #836
kfir4444 wants to merge 12 commits into
Pull request overview
Adds support for running calculations via ASE (Atomic Simulation Environment) as a new ARC job adapter, along with YAML parsing adjustments to consume ASE-produced YAML outputs.
Changes:
- Register a new ase job adapter and a supporting standalone execution script (ase_script.py) that produces YAML outputs.
- Extend settings to recognize ASE as a supported ESS and define ASE-specific filenames, options, and environment mappings.
- Update YAML parsing and the XYZ-from-file fallback to better handle .yml/.yaml inputs.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| arc/settings/settings.py | Adds ASE to supported ESS + filenames/options; updates cluster delete command and ASE env mapping. |
| arc/parser/parser.py | Adds .yml/.yaml path in parse_xyz_from_file() to parse geometry via YAML adapter. |
| arc/parser/adapters/yaml.py | Expands supported YAML keys; modifies energy parsing logic. |
| arc/level.py | Changes deduce_software() control flow to return early when software is already set. |
| arc/job/adapters/scripts/ase_script.py | New standalone runner for ASE jobs (SP/opt/freq) producing output.yml. |
| arc/job/adapters/ase.py | New ASEAdapter integrating ASE script into ARC job execution (incore/queue). |
| `arc/job/adapters/__init__.py` | Imports/registers the new ASE adapter module. |
| arc/job/adapter.py | Adds ase to JobEnum. |
| .gitignore | Ignores spec.md. |
    ASE_CALCULATORS_ENV = {'torchani': 'TANI_PYTHON',
                           'xtb': 'SELLA_PYTHON',
Not true. xtb-python is installed with the sella env (see devtools/sella_environment.yml)
    job_type = input_dict.get('job_type')
    xyz = input_dict.get('xyz')
    settings = input_dict.get('settings', {})

    atoms = Atoms(symbols=xyz['symbols'], positions=xyz['coords'])
    calc = get_calculator(settings)
    atoms.calc = calc

    apply_constraints(atoms, input_dict.get('constraints'))

    # Default mapping if not yet fully defined in settings.py
    DEFAULT_ASE_ENV = {
        'torchani': 'TANI_PYTHON',
        'xtb': 'XTB_PYTHON',

    class ASEAdapter(JobAdapter):
        """
        A generic adapter for ASE (Atomic Simulation Environment) jobs.
        Supports multiple calculators and environments.
        """
        def __init__(self,

Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #836 +/- ##
==========================================
- Coverage 60.68% 60.67% -0.02%
==========================================
Files 103 104 +1
Lines 31186 31285 +99
Branches 8128 8140 +12
==========================================
+ Hits 18926 18982 +56
- Misses 9910 9944 +34
- Partials 2350 2359 +9
Build UMA (Meta FAIR fairchem-core, uma-s-1p1, task omol) on top of the generic ASE adapter (PR #836) instead of a standalone adapter:
- ase_script.py: add a uma/fairchem branch to get_calculator; set total charge and spin (=multiplicity) on atoms.info (omol conditioning); use Sella order=1 for TS saddle-point searches when is_ts; add an irc job type via Sella IRC.
- ase.py: derive the calculator from the level method (so method='uma' works with no args), resolve UMA defaults (latest model, omol, cpu) via determine_settings, pass is_ts and irc_direction to the script, and warn on a UMA single point for an isolated atom or triplet O2 (unreliable absolute energy).
- settings.py: UMA_PYTHON=find_executable('uma_env'), ASE_CALCULATORS_ENV['uma'], and UMA_LATEST_MODEL.
- level.py: route method 'uma'/'uma-s-1'/'uma-s-1p1' to the 'ase' software.
- yaml.py: implement parse_irc_traj and parse_1d_scan_coords so UMA IRC/scan outputs round-trip.

Rotor scans run through ARC's directed_scan (constrained opt), already supported by the ASE adapter. fairchem/Sella-IRC API points only confirmable inside uma_env are marked with # VERIFY. Adds env-independent unit tests (routing, calculator/settings resolution, input writing, sp warning, output round-trip) plus skip-guarded model tests.
    # Threshold for considering a mode as a translation/rotation (cm^-1)
    rot_trans_threshold = 10.0

    num_to_filter = 6 if len(masses) > 2 else 5 if len(masses) == 2 else 0

This filters 6 modes for any molecule with more than 2 atoms. A linear polyatomic molecule (such as CO2 or HCN) has 3N−5 vibrations, so only 5 modes should be filtered.
I think ARC has an is_linear(); could you try using it here?
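A minimal sketch of the suggested check, under stated assumptions: `is_linear_geometry` and `num_modes_to_filter` are hypothetical names used for illustration (this is not ARC's own `is_linear()`), and coordinates are assumed to be Cartesian triples in any consistent length unit.

```python
import math


def is_linear_geometry(coords, tol=1e-3):
    """Return True if all atoms lie on a single line (hypothetical helper).

    ``tol`` bounds the sine of the angle between directions from the first atom.
    """
    if len(coords) < 2:
        return False
    x0, y0, z0 = coords[0]
    ref = None  # unit vector from the first atom to the first distinct atom
    for x, y, z in coords[1:]:
        v = (x - x0, y - y0, z - z0)
        norm = math.sqrt(sum(c * c for c in v))
        if norm < 1e-8:
            continue  # skip an atom that coincides with the first one
        u = tuple(c / norm for c in v)
        if ref is None:
            ref = u
            continue
        cross = (ref[1] * u[2] - ref[2] * u[1],
                 ref[2] * u[0] - ref[0] * u[2],
                 ref[0] * u[1] - ref[1] * u[0])
        if math.sqrt(sum(c * c for c in cross)) > tol:
            return False
    return ref is not None


def num_modes_to_filter(coords):
    """Number of translation/rotation modes to drop before reporting frequencies."""
    if len(coords) < 2:
        return 0  # kept from the snippet under review
    return 5 if is_linear_geometry(coords) else 6
```

With CO2 placed along the z axis this gives 5, and with a bent water geometry it gives 6. The tolerance is an arbitrary choice here and would need to match whatever ARC already uses for linearity.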
    process = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    if process.returncode != 0:
        logger.error(f"ASE job failed incore:\n{process.stderr}")
    self.parse_results()

On failure this logs the error but still calls parse_results(), and job_status is never set to "error".
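One possible shape for the fix, as a sketch only: the function names and the 'done'/'error' status strings are placeholders for illustration, not ARC's actual job_status values, and `parse_results` is passed in as a plain callable instead of being an adapter method.

```python
import subprocess


def run_incore(cmd):
    """Run ``cmd`` and return (status, stderr); status is 'done' or 'error'."""
    process = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    status = 'done' if process.returncode == 0 else 'error'
    return status, process.stderr


def execute_and_parse(cmd, parse_results):
    """Call ``parse_results`` only when the command exits cleanly."""
    status, stderr = run_incore(cmd)
    if status == 'error':
        # In the adapter, this is where stderr would be logged and job_status set.
        return status
    parse_results()
    return status
```

The point of the early return is that a failed run never reaches the parser, so a stale or missing output file cannot be mistaken for a result.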
    logger = get_logger()

    # Default mapping if not yet fully defined in settings.py
    DEFAULT_ASE_ENV = {

But settings already defines ASE_CALCULATORS_ENV. Either drop DEFAULT_ASE_ENV or align it with the settings dict.
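A sketch of the "drop it" option, assuming a single mapping is kept in settings and the adapter only looks values up. `get_ase_env_key` is a hypothetical helper; the two entries mirror the ones shown in the settings diff earlier in this conversation.

```python
# Single source of truth; in ARC this would live in settings.py.
ASE_CALCULATORS_ENV = {
    'torchani': 'TANI_PYTHON',
    'xtb': 'SELLA_PYTHON',
}


def get_ase_env_key(calculator, mapping=None):
    """Return the settings key naming the Python executable for ``calculator``."""
    mapping = ASE_CALCULATORS_ENV if mapping is None else mapping
    key = calculator.lower()
    if key not in mapping:
        raise KeyError(f"No Python environment is configured for ASE calculator '{calculator}'")
    return mapping[key]
```

Raising on an unknown calculator, instead of silently falling back to a second dict, keeps the two mappings from drifting apart the way 'xtb' already has.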
    from ase.optimize.sciopt import SciPyFminBFGS, SciPyFminCG
    from ase.vibrations import Vibrations

    # Constants matched to ASE internal units (3.23.0+) for exact numerical matching

Constants should go here: https://github.com/ReactionMechanismGenerator/ARC/blob/main/arc/constants.py
Please consolidate.
    if job_type in ['sp', 'opt', 'conf_opt', 'freq', 'optfreq', 'directed_scan']:
        output['sp'] = to_hartree(atoms.get_potential_energy())

    if job_type in ['opt', 'conf_opt', 'optfreq', 'directed_scan']:
        fmax = float(settings.get('fmax', 0.001))
        steps = int(settings.get('steps', 1000))
        engine_name = settings.get('optimizer', 'BFGS').lower()

        engine_dict = {
            'bfgs': BFGS, 'lbfgs': LBFGS, 'gpmin': GPMin,
            'scipyfminbfgs': SciPyFminBFGS, 'scipyfmincg': SciPyFminCG,
            'sella': None,
        }
        if engine_name == 'sella':
            from sella import Sella
            opt_class = Sella
        else:
            opt_class = engine_dict.get(engine_name, BFGS)
        opt = opt_class(atoms, logfile=os.path.join(os.path.dirname(input_path), 'opt.log'))

        try:
            opt.run(fmax=fmax, steps=steps)
            save_current_geometry(output, atoms, xyz)
            output['sp'] = to_hartree(atoms.get_potential_energy())
        except Exception as exc:

    try:
        opt.run(fmax=fmax, steps=steps)
        save_current_geometry(output, atoms, xyz)
        output['sp'] = to_kJmol(atoms.get_potential_energy())

We should not store sp when the opt fails.
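A sketch of one way to do that, with assumptions spelled out: `optimize_and_store` is a hypothetical helper, `opt` is any object with an ASE-style `run(fmax=..., steps=...)`, a returned `False` is treated as "not converged" (a return of `None` is treated as success), and unit conversion and geometry saving are left out.

```python
def optimize_and_store(opt, atoms, output, fmax=0.001, steps=1000):
    """Run ``opt`` and record the energy only if the optimization succeeded."""
    try:
        converged = opt.run(fmax=fmax, steps=steps)
    except Exception as exc:
        output['error'] = f'Optimization failed: {exc}'
        return False
    if converged is False:
        output['error'] = f'Optimization did not converge in {steps} steps'
        return False
    # Only reached on success; the energy is left in the calculator's units.
    output['sp'] = atoms.get_potential_energy()
    return True
```

Keeping the energy assignment outside the try block also means an exception raised by the energy call itself is not silently reported as an optimizer failure.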
        'isotopes': input_xyz.get('isotopes') or tuple([None] * len(input_xyz['symbols']))
    }

    if job_type in ['sp', 'opt', 'conf_opt', 'freq', 'optfreq', 'directed_scan']:

For an opt job, this triggers an SCF/energy calculation; then the opt runs and overwrites output['sp'].
This is minor, but it can be multiplied over a huge number of calls in advanced work.
So for any opt/optfreq/conf_opt/directed_scan job, that first energy evaluation is thrown away. I think the first if should really be "sp" only.
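The control flow being suggested could look roughly like this. It is a sketch: `run_energy_stages` is a hypothetical function, the energy and optimization steps are passed in as callables so the number of evaluations is easy to count, and how a plain 'freq' job should report an energy is left open.

```python
OPT_JOB_TYPES = ('opt', 'conf_opt', 'optfreq', 'directed_scan')


def run_energy_stages(job_type, get_energy, optimize):
    """Evaluate the energy once per job: directly for 'sp', after the run for opt-like jobs."""
    output = {}
    if job_type == 'sp':
        output['sp'] = get_energy()
    elif job_type in OPT_JOB_TYPES:
        optimize()
        output['sp'] = get_energy()
    return output
```

For an opt-like job this makes exactly one explicit energy call, on the final geometry, instead of one before and one after the optimization.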
    if name == 'torchani':
        import torch
        import torchani

Unless there's a reason, please relocate all imports to the top of the file, as is the convention in ARC.
    name = calc_config.get('calculator', '').lower()
    kwargs = calc_config.get('calculator_kwargs', {})

    if name == 'torchani':

The function accepts charge and multiplicity. xTB uses them, but TorchANI and MOPAC receive those same parameters and silently ignore them.
For TorchANI, ignoring them is correct, just undocumented (it should be logged).
But MOPAC should support charge/multiplicity (via keywords like CHARGE= / DOUBLET etc.).
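A sketch of both halves of this comment, with hypothetical helper names. It only builds the MOPAC keyword fragment (CHARGE=n and the spin-state keywords) and logs the TorchANI case; how the fragment is handed to the ASE MOPAC calculator is deliberately left out, since that depends on the calculator's own arguments.

```python
import logging

logger = logging.getLogger(__name__)

# MOPAC spin-state keywords, indexed by multiplicity.
MOPAC_MULTIPLICITY_KEYWORDS = {
    1: 'SINGLET', 2: 'DOUBLET', 3: 'TRIPLET',
    4: 'QUARTET', 5: 'QUINTET', 6: 'SEXTET',
}


def mopac_charge_spin_keywords(charge=0, multiplicity=1):
    """Build a MOPAC keyword fragment such as 'CHARGE=-1 DOUBLET'."""
    if multiplicity not in MOPAC_MULTIPLICITY_KEYWORDS:
        raise ValueError(f'Unsupported multiplicity: {multiplicity}')
    keywords = []
    if charge != 0:
        keywords.append(f'CHARGE={charge}')
    if multiplicity != 1:
        keywords.append(MOPAC_MULTIPLICITY_KEYWORDS[multiplicity])
    return ' '.join(keywords)


def warn_if_charge_spin_ignored(name, charge=0, multiplicity=1):
    """Log a warning when TorchANI is given a non-default charge or multiplicity."""
    ignored = name.lower() == 'torchani' and (charge != 0 or multiplicity != 1)
    if ignored:
        logger.warning('%s ignores charge=%s and multiplicity=%s', name, charge, multiplicity)
    return ignored
```

A neutral singlet yields an empty string, so the default MOPAC input is unchanged for closed-shell neutral species.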
        self.assertAlmostEqual(to_kJmol(1.0), 96.48533, places=5)
        self.assertAlmostEqual(to_kJmol(27.21138), 2625.499015202655, places=5)

    def test_numpy_vibrational_analysis(self):

Also test a linear molecule with more than 2 atoms.
Summary
This PR introduces a new job adapter that allows ARC to utilize the Atomic Simulation Environment (ASE) as a backend engine. By integrating ASE, ARC can now run single-point energy calculations, geometry optimizations, and vibrational (frequency) analyses using any ASE-compatible calculator.
Key Changes
environment.
generic ASE calculators.
coordinates).
masses, and force constants directly from the Hessian, matching physical constants and standards used by TorchANI/ASE.
Impact
This integration broadens ARC's computational capabilities by tapping directly into the ASE ecosystem, making it straightforward to incorporate machine learning potentials (such as TorchANI) and any other method that has an ASE interface.