Skip to content

Add a summary SpectrumConfig object to generator outputs#1845

Open
michaelmackenzie wants to merge 11 commits into
Mu2e:mainfrom
michaelmackenzie:SpectrumConfig
Open

Add a summary SpectrumConfig object to generator outputs#1845
michaelmackenzie wants to merge 11 commits into
Mu2e:mainfrom
michaelmackenzie:SpectrumConfig

Conversation

@michaelmackenzie
Copy link
Copy Markdown
Contributor

The goal of this data product is to simplify downstream workflows that need to know information about a dataset's primary generator configuration. This includes samples with restricted energy and cos(theta_z) regions. I've added default output of this product to all generators, even where the information is trivial, so downstream workflows will always find it and it can be updated to add more information if restrictions change.

@FNALbuild
Copy link
Copy Markdown
Collaborator

Hi @michaelmackenzie,
You have proposed changes to files in these packages:

  • MCDataProducts
  • EventGenerator

which require these tests: build.

@Mu2e/write, @Mu2e/fnalbuild-users have access to CI actions on main.

⌛ The following tests have been triggered for 4a17888: build (Build queue - API unavailable)

About FNALbuild. Code review on Mu2e/Offline.

@FNALbuild
Copy link
Copy Markdown
Collaborator

☔ The build tests failed for 4a17888.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 4a17888 at 3d6f2fe
build (prof) Log file. Build time: 08 min 33 sec
ceSimReco Log file.
g4test_03MT Log file. Return Code 1.
transportOnly Log file. Return Code 1.
POT Log file.
g4study Log file. Return Code 1.
cosmicSimReco Log file. Return Code 1.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file. Return Code 9.
ceMix Log file. Return Code 1.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO ➡️ TODO (0) FIXME (1) in 48 files
clang-tidy ➡️ 92 errors 709 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 4a17888 after being merged into the base branch at 3d6f2fe.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@michaelmackenzie
Copy link
Copy Markdown
Contributor Author

@FNALbuild run build test

@FNALbuild
Copy link
Copy Markdown
Collaborator

⌛ The following tests have been triggered for 8e2fd2d: build (Build queue - API unavailable)

@FNALbuild
Copy link
Copy Markdown
Collaborator

☔ The build tests failed for 8e2fd2d.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 8e2fd2d at 3d6f2fe
build (prof) Log file. Build time: 04 min 39 sec
ceSimReco Log file.
g4test_03MT Log file. Return Code 1.
transportOnly Log file. Return Code 1.
POT Log file.
g4study Log file. Return Code 1.
cosmicSimReco Log file. Return Code 1.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file. Return Code 9.
ceMix Log file. Return Code 1.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO ➡️ TODO (0) FIXME (1) in 48 files
clang-tidy ➡️ 92 errors 709 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 8e2fd2d after being merged into the base branch at 3d6f2fe.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@michaelmackenzie
Copy link
Copy Markdown
Contributor Author

@FNALbuild run build test

@FNALbuild
Copy link
Copy Markdown
Collaborator

⌛ The following tests have been triggered for 1ba6baf: build (Build queue - API unavailable)

@FNALbuild
Copy link
Copy Markdown
Collaborator

☔ The build tests failed for 1ba6baf.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 1ba6baf at 83b5e2f
build (prof) Log file. Build time: 08 min 33 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file. Return Code 9.
ceMix Log file. Return Code 1.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO ➡️ TODO (0) FIXME (1) in 48 files
clang-tidy ➡️ 92 errors 709 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 1ba6baf after being merged into the base branch at 83b5e2f.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@michaelmackenzie
Copy link
Copy Markdown
Contributor Author

@FNALbuild run build test

@FNALbuild
Copy link
Copy Markdown
Collaborator

⌛ The following tests have been triggered for 00a3573: build (Build queue - API unavailable)

@FNALbuild
Copy link
Copy Markdown
Collaborator

☀️ The build tests passed at 00a3573.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 00a3573 at 83b5e2f
build (prof) Log file. Build time: 04 min 26 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO ➡️ TODO (0) FIXME (1) in 48 files
clang-tidy ➡️ 92 errors 709 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 00a3573 after being merged into the base branch at 83b5e2f.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

Copy link
Copy Markdown
Collaborator

@brownd1978 brownd1978 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for introducing this Michael, it's an important addition to our simulation bookkeeping.

As defined, SpectrumConfig is a passive record of what the generators actually do. A stronger design is to also use the object to make selections in the generator, tying together implementation and recording. I believe this could be done reasonably efficiently using a map to associate values with variables, perhaps even inside ParticleGeneratorTool, which would simplify and standardize the individual generator implementation.

If the PDF associated with each restricted variable is known you could actually compute the fraction in SpectrumConfig. Please consider adding a constructor taking the PDF as a function as input for that case, to again standardize and simplify downstream code, either here or as a future upgrade.

The current design depends on the the input variables being independent, that should be clearly stated. I can think of counter examples. It might be possible to generalize SpectrumConfig to handle that, but it's not needed for this first implementation.

'type' makes more sense as an attribute of the variable, not the spectrum config. Then you can get rid of the 'other' type value, which is currently used to describe mixed flat and non-flat variable. If you provide the PDF function you could also have a value to describe that.

Comment thread MCDataProducts/inc/SpectrumConfig.hh Outdated
xmax_(xmax) {}

std::string name_ ;
double fraction_ ;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding comments to describe these variables

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I added comments

if(verbose) printf(" Total fraction: %.4e\n", fraction);
return fraction;
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding a function to test a collection of values against the limits and return true/false if the values are in range, to be used downstream in a filter or other mechanism. To be proof against ordering errors I suggest to provide these as a map, with values keyed to the variable name, with the function requiring the names all match.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this function to test a map of <var name, value> against the restricted variables to decide if it's accepted

class SpectrumConfig {
public:
// Indicate the broad class of primary simulations this falls under
enum Type {kPhysical = 0, kFlat, kOther};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this enum associated with the composite object and not the individual variable? I could imagine a config restricting the range of flat and non-flat variables (say phi and theta) in a single object.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally I was just using this to indicate general spectrum configurations, e.g. flat electrons vs. DIO spectrum. But following your suggestion I've moved this to the variable struct and added a 'is_physical()' check to the total object that returns true of all variables are configured as physical (where by "flat" here I mean non-physical and flat, and "other" is any other non-physical choice).

Comment thread MCDataProducts/inc/SpectrumConfig.hh Outdated

public: // allow direct access/manipulation of the fields
std::vector<RestrictedVar> vars_;
int type_;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this data member strongly typed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed this

@michaelmackenzie
Copy link
Copy Markdown
Contributor Author

Hi Dave,

Thank you for the helpful review! I think it's worth considering expanding this object to actively configure tools instead of passively storing configuration information, but I would see this as a future update. Similarly with accepting PDFs as input rather than just the summary variable information. I also added to the comments that this assumes variable fractions are independent (I could add an overall fraction_ variable to SpectrumConfig to override the variable fractions in non-independent cases if this makes sense?).

I think it would be useful to get a first version of this added, and then attempt to propagate some of its data to dataset metadata, in case we want to tweak the implementation before an expansion. But this is just my impulse, I'm also happy to work on further developments if you would like!

@oksuzian
Copy link
Copy Markdown
Collaborator

oksuzian commented Jun 3, 2026

PR Review: #1845 — Add a summary SpectrumConfig object to generator outputs

Summary

This PR introduces a new mu2e::SpectrumConfig MC data product that records how each event generator was configured (restricted variable ranges, sampled fractions, and whether a spectrum is physical/flat). It wires the object into ~45 generator modules/tools so that each one emits a per-SubRun summary of its spectrum configuration.

  • Scope: 48 files changed (+~700, −0 net new logic), almost entirely additive.
  • Core new file: MCDataProducts/inc/SpectrumConfig.hh (108 lines) + ROOT dictionary registration (classes.h, classes_def.xml).
  • Reviews: 1 reviewer (brownd1978, MEMBER) left a design review + 4 inline comments; author has responded/addressed most.
  • Risk: Low–Medium. Mechanically repetitive and additive, but there is at least one compile-correctness concern and several persistence/consistency concerns worth verifying before merge.

Core change: SpectrumConfig.hh

The class holds a std::map<std::string, RestrictedVar> of named variables, each with a fraction_, [xmin_, xmax_] range, and a per-variable Type (kPhysical/kFlat/kOther). It provides add_var, accepted(map), is_physical(), total_fraction(), and var_fraction(). The author already incorporated reviewer feedback by moving Type from the whole config onto each variable and adding the accepted() selection method.


Issues found

1. Missing #include <map> (likely real bug) — MCDataProducts/inc/SpectrumConfig.hh
The header uses std::map extensively (the vars_ member, get_variables(), accepted()), but only includes <limits>, <vector>, and <string>:

#include <limits>
#include <vector>
#include <string>

<vector> is no longer needed (no std::vector remains after the refactor to a map), and <map> should be added. It currently only compiles by accident via transitive includes — fragile and should be fixed explicitly.

2. accepted() is not const

bool accepted(const std::map<std::string, double>& vars) {  // non-const
  ...
  if(!vars_[name].accepted(value)) return false;  // operator[] forces non-const

It uses vars_[name] (operator[]), which mutates the map (inserts a default RestrictedVar if a stale/typo key sneaks past the contains check) and prevents calling it on a const SpectrumConfig. Since this object will typically be read back as a const product from the SubRun, this should be const and use vars_.at(name) after the contains() guard.

3. printf-based error reporting in a data product
accepted() and var_fraction() use printf(...) to report undefined variables. For an MCDataProduct that may be consumed in filters, this is inconsistent with the rest of Offline's messaging (mf::LogWarning/cet::exception). Returning false/0. silently-with-printf can mask configuration errors downstream.

4. ROOT persistence of RestrictedVar default — verify schema
classes_def.xml registers SpectrumConfig, RestrictedVar, and the map. Worth confirming the RestrictedVar members all persist (they're public PODs, so OK) and that storing std::numeric_limits<double>::lowest()/max() as range sentinels (e.g. in GammaConvFlat) round-trips cleanly through ROOT I/O without precision/inf surprises.

5. Semantic inconsistency in fraction reporting across generators
The fraction_ field means different things in different modules:

  • DIOGenerator/Mu2eXGenerator compute a real _energy_fraction (restricted/full integral).
  • Most tools (MuCap*, MuplusMichel, RMC, RPC, etc.) hardcode fraction = 1. even when the spectrum is range-restricted (xmin/xmax from getXMin()/getXMax()).

So total_fraction() will report 1.0 for generators that are in fact sampling a restricted spectrum, which undercuts the stated purpose ("normalization correction factor"). This is the same concern brownd1978 raised about computing the fraction from the PDF. At minimum this asymmetry should be documented.

6. Many "empty" configs add a persisted product with no information
A large number of modules (CryEventGenerator, EventGenerator, PhotonGun, RanTest, PrimaryProtonGun, all the resamplers, etc.) emit a default-constructed SpectrumConfig with zero variables. That's a valid design choice (presence of the product is itself a marker), but it does add a product to every SubRun. Confirm this is intended for all generators rather than only those with meaningful spectra.

7. Minor: typo propagated across files — KINETIC_ENERY
The switch statements in MuCapDeuteron/Neutron/Proton and StoppedParticleReactionGun use case KINETIC_ENERY: (misspelled "ENERGY"). It's pre-existing enum naming, not introduced here, but it's now copy-pasted into the new code in several places.

8. Minor: indentation/brace inconsistency in PrimaryProtonGun_module.cc

        produces<GenParticleCollection>();
      produces<SpectrumConfig, art::InSubRun>();   // misaligned

Cosmetic, but worth tidying.


Merge readiness

Not quite ready — mostly green, but I'd resolve before merge:

  • Blocking-ish: Add #include <map> (Update README.md #1) and make accepted() const/at() (Add diagnostics #2) — these are correctness/robustness issues on the central new class.
  • Should clarify: the fraction_ = 1. semantics for restricted spectra (Added --as-needed flag to the link. #5), since it directly affects the headline total_fraction() use case.
  • Reviewer brownd1978's broader design suggestions (compute fraction from a PDF function; document the independence assumption of variables) are acknowledged by the author as reasonable future upgrades — not blockers.

No CI failures were surfaced in the review data. The change is additive and unlikely to break existing reconstruction, but it does change the output schema of essentially every generator (new SubRun product), so downstream consumers/datasets should be aware.


Suggestions to strengthen the PR

  1. Replace printf diagnostics with mf::LogWarning or throw cet::exception for undefined-variable lookups.
  2. Add a short doc comment clarifying that total_fraction() assumes independent variables (per the reviewer's note) and that fraction_ = 1. is a placeholder for generators not computing a real acceptance.
  3. Consider a unit test that constructs a SpectrumConfig, round-trips it through ROOT, and checks accepted()/is_physical()/total_fraction().

Copy link
Copy Markdown
Collaborator

@brownd1978 brownd1978 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Michael for looking at my suggestions. I agree integrating SpectrumConfig deeper into the generators is a future project, this PR forms a good base for that work and provides immediate benefit in documenting current production.

1.0 as fraction for physical spectra with limits seems wrong, Is this just a placeholder value? Given the number of cases I can see why it might be better to defer calculating all those. Should there be an issue for addressing those following this PR? Maybe for now you could have an enum value 'uknown' or 'incomplete' for those variables, to flag that the fractions aren't correct in that case?

default: spectrumVarName = "energy"; break;
}

config->add_var(SpectrumConfig::RestrictedVar(spectrumVarName, 1., _spectrumXMin, _spectrumXMax,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the 1.0 fraction just a placeholder pending future work?


std::unique_ptr<SpectrumConfig> MuCapPhotonGenerator::spectrumConfig() {
auto config = std::make_unique<SpectrumConfig>();
config->add_var(SpectrumConfig::RestrictedVar("energy", 1., _spectrumXMin, _spectrumXMax,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about 1.0 fraction as above.

default: spectrumVarName = "energy"; break;
}

config->add_var(SpectrumConfig::RestrictedVar(spectrumVarName, 1., _spectrumXMin, _spectrumXMax,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.


std::unique_ptr<SpectrumConfig> MuplusMichelGenerator::spectrumConfig() {
auto config = std::make_unique<SpectrumConfig>();
config->add_var(SpectrumConfig::RestrictedVar("energy", 1., _spectrumXMin, _spectrumXMax,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too 1.0 faction for a limited physical spectrum seems incomplete.


std::unique_ptr<SpectrumConfig> RMCGenerator::spectrumConfig() {
auto config = std::make_unique<SpectrumConfig>();
config->add_var(SpectrumConfig::RestrictedVar("energy", 1., _spectrumXMin, _spectrumXMax,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here

//================================================================
void RMCGun::endSubRun(art::SubRun& sr) {
auto config = std::make_unique<SpectrumConfig>();
config->add_var(SpectrumConfig::RestrictedVar("energy", 1., spectrum_.getXMin(), spectrum_.getXMax(),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants