test(evals): skill docs audit #16725
Draft
denolfe wants to merge 33 commits into
Adds MIGRATIONS.md reference file covering migration CLI commands, per-adapter transaction patterns, configuration (migrationDir/ prodMigrations), production workflow, data migration patterns, and gotchas. Pairs with 4 eval cases (3 positive config modifications + 1 correction case) and deferred-case entries for adapter-config and standalone-migration-file patterns that don't fit the codegen pipeline.
…rations, dedupe fixtures
…ssertions, dot notation
…sk/jobsWorkflow assertion kinds
Adds 5 new AST assertion kinds to the eval harness so eval cases can validate top-level buildConfig options, collection-level options beyond fields/hooks/access, db adapter args, and jobs tasks/workflows. Includes a walkPath helper for dotted-path resolution and boolean shorthand (object literal satisfies value: true). 37 unit tests in evaluate.spec.ts.
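The dotted-path resolution and boolean shorthand mentioned in this commit could look roughly like the sketch below. It walks plain objects rather than AST nodes, and `walkPathSketch`, `satisfiesExpected`, and the sample `collectionOptions` data are hypothetical names chosen for illustration; the real helper in the harness may differ.

```typescript
// Hypothetical stand-in for a parsed config; the real harness walks AST object literals.
type ConfigValue =
  | string
  | number
  | boolean
  | null
  | ConfigValue[]
  | { [key: string]: ConfigValue }

// Resolve a dotted path such as "versions.drafts.autosave" one segment at a time.
// Returns undefined as soon as a segment cannot be descended into.
function walkPathSketch(root: ConfigValue, path: string): ConfigValue | undefined {
  let current: ConfigValue | undefined = root
  for (const segment of path.split('.')) {
    if (current === null || typeof current !== 'object' || Array.isArray(current)) {
      return undefined
    }
    current = current[segment]
  }
  return current
}

// Boolean shorthand: an expected value of `true` is also satisfied by an object literal.
// Everything else falls back to a structural comparison.
function satisfiesExpected(actual: ConfigValue | undefined, expected: ConfigValue): boolean {
  const isObjectLiteral =
    actual !== null && typeof actual === 'object' && !Array.isArray(actual)
  if (expected === true && isObjectLiteral) {
    return true
  }
  return JSON.stringify(actual) === JSON.stringify(expected)
}

// Illustrative sample data (an assumption, not taken from the PR fixtures).
const collectionOptions: ConfigValue = {
  versions: { drafts: { autosave: true }, maxPerDoc: 10 },
}
```

With this shape, an assertion on `versions.drafts` with `value: true` passes whether the config says `drafts: true` or `drafts: { autosave: true }`.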
…sets to new assertion kinds
Uses configOption, collectionOption, jobsTask, and jobsWorkflow assertions to replace the empty assertions: [] cases that previously relied solely on the OpenAI scorer. Skips trivially-satisfied cases (fix-relative-handler-path generate-pdf task) and cases where the LLM slug is unpredictable (add-scheduled-task).
Remove CONFIG_RESERVED_KEYS filter from collectConfigOptions so that jobs, db, and collections are captured in parsed.configOptions alongside their dedicated structures — allowing walkPath to address jobs.autoRun and other reserved-key paths uniformly via configOption assertions.
Adds PRODUCTION.md (448 lines, 14 sections) covering build-without-db, CORS/CSRF config, GraphQL complexity limits, upload restrictions, maxLoginAttempts, prodMigrations, Docker, DocumentDB/CosmosDB caveats, and troubleshooting pointers. Adds 6 positive eval cases + 1 correction case (7 total) with heavy use of configOption, collectionOption, and dbAdapterOption assertions. Registers 'production' in EvalCategory and adds 4 package.json scripts.
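As a rough illustration of the kind of options these production eval cases assert on, the fragment below groups a few of them as plain objects. The option names follow the commit message and the Payload config as I understand it; every concrete value (origins, limits, lock time) is an assumption made up for the example, and the objects are not wired into `buildConfig` here.

```typescript
// Illustrative top-level config options. All values are assumptions.
const productionOptions = {
  // Origins allowed to make cross-origin requests and to pass CSRF checks.
  cors: ['https://example.com'],
  csrf: ['https://example.com'],
  // Reject GraphQL queries whose computed complexity exceeds this number.
  graphQL: { maxComplexity: 1000 },
  // Cap upload size in bytes.
  upload: { limits: { fileSize: 5_000_000 } },
}

// Illustrative auth options for an auth-enabled collection.
const usersAuthOptions = {
  // Lock the account after this many failed logins...
  maxLoginAttempts: 5,
  // ...for this many milliseconds (10 minutes here).
  lockTime: 10 * 60 * 1000,
}
```

A configOption assertion would then target a path such as `graphQL.maxComplexity`, while a collectionOption assertion would target `auth.maxLoginAttempts` on the relevant collection.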
…tchas)
- Extend HOOKS.md: Root Hooks (afterError), Global Hooks (full family), Auth-Enabled Collection Hooks (beforeLogin, afterLogin, …), Validation Order, Blocking vs Non-Blocking semantics, Server-Only Execution, throw APIError pattern, Context Module Augmentation, originalDoc vs delta data gotcha, beforeDuplicate field hook, Generic-typed hook helpers
- Update Collection/Field Hooks tables to include beforeOperation, afterOperation, afterDelete, afterError, beforeDuplicate
- Add eval.hooks.spec.ts + 7 eval cases (6 positive + 1 correction)
- Add datasets/hooks/codegen.ts with AST assertions where catalog covers
- Add 7 fixture dirs under test/evals/fixtures/hooks/codegen/
- Add 'hooks' to EvalCategory (alphabetical between graphql and jobs)
- Add 4 test:eval:hooks* scripts to package.json (alphabetical)
- Append HOOKS deferred cases to 4-DEFERRED-EVAL-CASES.md

No split: HOOKS.md is 423 lines (well under 800 limit).
…ectionHookName
Extends CollectionHookName union to include all auth-enabled collection hooks: beforeLogin, afterLogin, afterLogout, afterMe, afterRefresh, afterForgotPassword, me, refresh, and afterOperation. Adds regression tests for beforeLogin/afterLogin.
…nding, versions API)
Extends QUERIES.md with ~520 lines of new material covering Local API options table, findDistinct, bulk update/delete, globals, versions API, server functions, REST endpoint inventory + method override + SDK, custom GraphQL queries/mutations, GraphQL config options, per-field complexity, schema generation, collection-level graphQL naming/disable, pagination response shape, defaultDepth/maxDepth, select/exclude/defaultPopulate/populate, and multi-field sort. File stays at 796 lines (no QUERIES-ADVANCED.md split needed). Adds 7 positive + 1 correction eval cases with heavy collectionOption/fieldOption/configOption assertions.
…drafts, trash, field side effects)
Extends ACCESS-CONTROL.md with 7 new sections covering gaps from §5 of the gap report: Access Operation context (undefined id/data/doc guard), default access behavior (Boolean(user)), unlock access, readVersions access, drafts publish constraint via _status, trash discrimination via data.deletedAt, and field access side effects (read omits key, update silently discards). Extends AccessOperation union with 'readVersions' and 'unlock', adds 4 regression tests. Adds 8 codegen eval cases (6 positive + 2 correction) with collectionAccess and fieldOption assertions.
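Several of the access patterns listed in this commit can be combined in one small function. The sketch below uses simplified, hypothetical types (`User`, `AccessArgs`) and made-up role names (`admin`, `editor`); Payload's real access types are richer, and the policy shown is only one possible choice.

```typescript
// Hypothetical, simplified types; not Payload's real Access types.
type User = { id: string; roles: string[] }
type AccessArgs = {
  req: { user: User | null }
  id?: string
  data?: { _status?: 'draft' | 'published'; deletedAt?: string | null }
}

// Default behavior described in the commit: access resolves to Boolean(user).
const defaultAccess = ({ req }: AccessArgs): boolean => Boolean(req.user)

// Update access that guards against undefined data, then discriminates
// trash requests (data.deletedAt) and publish requests (_status).
const updateAccess = ({ req, data }: AccessArgs): boolean => {
  const user = req.user
  if (!user) return false
  const isAdmin = user.roles.includes('admin')

  // Guard: data may be undefined, so never dereference it unchecked.
  // Assumption: with no data to inspect, any logged-in user is allowed.
  if (!data) return true

  // Trash discrimination: only admins may soft-delete.
  if (data.deletedAt) return isAdmin

  // Draft-publish constraint: only admins and editors may publish.
  if (data._status === 'published') return isAdmin || user.roles.includes('editor')

  return true
}
```

An author can therefore save drafts but cannot publish or trash a document, while an admin can do all three.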
…stead of unsafe access)
- Fix 4 broken anchors in SKILL.md (FIELDS#validation→#custom-validation, QUERIES#field-selection→#select--exclude-fields, HOOKS#context→#hook-context ×2)
- Add FIELDS.md#custom-validation section so anchor resolves
- Restore projectMemberAccess async cross-collection ownership pattern to ACCESS-CONTROL.md Cross-Collection Validation (was removed in Task 9)
- Compact ACCESS-CONTROL.md Access Control Function Arguments section into a table to recover line budget (801→741)
- Remove duplicate CSRF code block from PRODUCTION.md; replace with one-liner link to canonical AUTHENTICATION.md#csrf-allow-list
- ACCESS-CONTROL.md: payload.count returns { totalDocs } not a number;
the old code compared an object to 0 (always false), silently allowing
all deletes — fix both the inline example and the Cross-Collection
Validation helper
- SKILL.md: rename Quick Reference row from "Validate API key externally"
to "Validate Payload JWT externally" — the SHA-256/slice(0,32) derivation
is for JWT verification, not API key validation
- JOBS-QUEUE.md: fix comment "Retry this task up to 3 times" to
"Retry up to 2 times on failure (3 attempts total)" — retries:2 means
2 extra attempts, not 3
- access-control/codegen.ts: drop trivially-satisfied collectionAccess
assertions on the two correction cases; the starter fixtures already
define access.read on posts, so the old assertions provided false signal
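The payload.count fix in the first bullet can be illustrated with a stub. `countProjectsOwnedBy` below is a hypothetical in-memory stand-in that only mirrors the `{ totalDocs }` result shape described above; it is not the real Local API call, and the `projects` data is invented for the example.

```typescript
type Project = { id: string; owner: string }

// Invented sample data for the sketch.
const projects: Project[] = [
  { id: 'p1', owner: 'u1' },
  { id: 'p2', owner: 'u1' },
]

// Stub mirroring the result shape described above: an object, not a bare number.
async function countProjectsOwnedBy(ownerId: string): Promise<{ totalDocs: number }> {
  return { totalDocs: projects.filter((p) => p.owner === ownerId).length }
}

// Allow deleting a user only when no project still references them.
async function canDeleteUser(userId: string): Promise<boolean> {
  // Buggy version: comparing the whole result object to 0 is never true,
  // so a guard written as "result === 0 means safe to block" never fires.
  // Fixed version: destructure totalDocs and compare the number.
  const { totalDocs } = await countProjectsOwnedBy(userId)
  return totalDocs === 0
}
```

Here `canDeleteUser('u1')` resolves to false because two projects still reference that user, and `canDeleteUser('u2')` resolves to true.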
- parseConfig.ts: export resolveToObjectLiteral and AdapterName type; change walkPath to return a discriminated WalkPathResult (ok/failedAt/reason) so callers can emit segment-specific error messages
- evaluate.ts: remove resolveToObjectLiteralLocal duplicate (was missing the cycle guard from parseConfig.ts); import resolveToObjectLiteral from parseConfig.ts (I1); update all three walkPath call sites (configOption, collectionOption, dbAdapterOption) to use structured results with segment-level error messages (I5); add exhaustive default case to the evaluateOne switch to catch future assertion kinds (I2); surface '<unknown>' adapter in dbAdapterOption error messages (I3); distinguish non-literal expressions from missing values in fieldOption checks (I4)
- types.ts: export AdapterName string-literal union and tighten dbAdapterOption.adapter from string to AdapterName (I6)
- evaluate.spec.ts: add 2 regression tests covering missing-segment and not-object-literal error wording from walkPath (45 → 47 tests)
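A discriminated result with ok/failedAt/reason, paired with an exhaustive switch, might be sketched as follows. The field names come from the commit message; the function names, the exact reason strings, and the error wording are assumptions, and the sketch walks plain objects instead of AST nodes.

```typescript
type PlainObject = { [key: string]: unknown }

// Discriminated union: callers must check `ok` before reading the other fields.
type WalkPathResult =
  | { ok: true; value: unknown }
  | { ok: false; failedAt: string; reason: 'missing-segment' | 'not-object-literal' }

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Walk a dotted path and report which segment failed, and why.
function walkPathResultSketch(root: PlainObject, path: string): WalkPathResult {
  let current: unknown = root
  for (const segment of path.split('.')) {
    if (!isPlainObject(current)) {
      return { ok: false, failedAt: segment, reason: 'not-object-literal' }
    }
    if (!(segment in current)) {
      return { ok: false, failedAt: segment, reason: 'missing-segment' }
    }
    current = current[segment]
  }
  return { ok: true, value: current }
}

// Turn a result into a segment-specific message. The default branch is the
// exhaustiveness check: adding a new reason makes the `never` assignment fail to compile.
function explainResult(path: string, result: WalkPathResult): string {
  if (result.ok) return `${path}: found`
  const reason = result.reason
  switch (reason) {
    case 'missing-segment':
      return `${path}: segment "${result.failedAt}" is missing`
    case 'not-object-literal':
      return `${path}: cannot read "${result.failedAt}" because its parent is not an object literal`
    default: {
      const unreachable: never = reason
      throw new Error(`Unhandled reason: ${String(unreachable)}`)
    }
  }
}
```

The same `never` trick is what an exhaustive default case in a switch over assertion kinds relies on: a new kind that nobody handles becomes a compile-time error rather than a silent pass.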
- RICHTEXT.md: setup example claimed FixedToolbarFeature was added but the features array only spread defaults; add FixedToolbarFeature() to the spread and include its import
- MIGRATIONS.md: backfill-_status example used limit:0 (loads entire collection into memory); refactor to paginated while-loop pattern, consistent with the rename example and Large dataset batching section
- JOBS-QUEUE.md: Execution Methods intro referred to "step 2" and "step 4" which don't match the Scheduling lifecycle list (those steps are beforeSchedule/afterSchedule); drop the misleading parenthetical step numbers
- ACCESS-CONTROL.md: self-salary example used user?.id === doc?.id which is only correct on the Users collection; add a comment anchoring the assumption and noting the alternative for other collections
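The paginated while-loop pattern from the MIGRATIONS.md bullet can be sketched against an in-memory store. `findPage` is a hypothetical stub that returns only the `docs`, `hasNextPage`, and `nextPage` fields of a paginated result; backfilling with 'published' is an assumption made for the example, not something the commit specifies.

```typescript
type Doc = { id: number; _status?: 'draft' | 'published' }
type Page = { docs: Doc[]; hasNextPage: boolean; nextPage: number | null }

// In-memory stand-in for a collection of five documents without _status.
const store: Doc[] = Array.from({ length: 5 }, (_, i) => ({ id: i + 1 }))

// Hypothetical stub for a paginated find call.
async function findPage(page: number, limit: number): Promise<Page> {
  const start = (page - 1) * limit
  const docs = store.slice(start, start + limit)
  const hasNextPage = start + limit < store.length
  return { docs, hasNextPage, nextPage: hasNextPage ? page + 1 : null }
}

// Backfill _status one batch at a time instead of loading everything at once.
async function backfillStatus(batchSize: number): Promise<number> {
  let page = 1
  let updated = 0
  while (true) {
    const result = await findPage(page, batchSize)
    for (const doc of result.docs) {
      if (doc._status === undefined) {
        doc._status = 'published' // assumed backfill value
        updated += 1
      }
    }
    if (!result.hasNextPage || result.nextPage === null) break
    page = result.nextPage
  }
  return updated
}
```

The sketch pages over every document and filters inside the loop. If the query itself filtered on a missing `_status`, updated documents would drop out of the result set and shift later pages, so that variant would need to keep re-reading the first page instead of advancing.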
…up AdapterName, drop unused import, reroute SKILL.md rows)
📦 esbuild Bundle Analysis for payload
This analysis was generated by esbuild-bundle-analyzer. 🤖
Largest paths
These visualizations show the top 20 largest paths in each bundle.
- Meta file: packages/next/meta_index.json, Out file: esbuild/index.js
- Meta file: packages/payload/meta_index.json, Out file: esbuild/index.js
- Meta file: packages/payload/meta_shared.json, Out file: esbuild/exports/shared.js
- Meta file: packages/richtext-lexical/meta_client.json, Out file: esbuild/exports/client_optimized/index.js
- Meta file: packages/ui/meta_client.json, Out file: esbuild/exports/client_optimized/index.js
- Meta file: packages/ui/meta_shared.json, Out file: esbuild/exports/shared_optimized/index.js
Details
Next to the size is how much the size has increased or decreased compared with the base branch of this PR.
Overview
Expands the Payload AI skill to address gaps found in an audit of the docs against the existing skill.
Key Changes
- 6 new reference files under tools/claude-plugin/skills/payload/reference/
  - MIGRATIONS.md (CLI, per-adapter transactions, prod workflow)
  - AUTHENTICATION.md (config, JWT, cookies, CSRF, API keys, strategies, operations)
  - JOBS-QUEUE.md (tasks, workflows, schedules, four execution methods)
  - CUSTOM-COMPONENTS.md (admin React hooks, custom views, slot inventories)
  - RICHTEXT.md (Lexical features, converters, custom features, views)
  - PRODUCTION.md (build-without-DB, deployment, security headers, FS warnings, Docker)
- 3 existing reference files extended
  - HOOKS.md: root hooks, global hooks, auth-enabled hooks, validation order, blocking semantics, context augmentation
  - QUERIES.md: full Local API options, findDistinct, bulk ops, GraphQL extending, versions API, pagination / select / depth details
  - ACCESS-CONTROL.md: Access Operation guard, readVersions, draft-publish constraint, trash discrimination, field access side effects
- Added 12 Quick Reference rows to SKILL.md
- Eval harness expansion (test/evals/assertions/)
- 9 new codegen eval suites
Design Decisions
- Pilot-first sequencing. The MIGRATIONS reference file and its eval suite were written first, then locked as a template (Quick Reference table, section structure, fixture preamble, dataset shape, AST hygiene rule). The remaining 5 new files and 3 extensions followed the locked pattern.
- Eval scope: payload.config.ts modifications only. The codegen pipeline evaluates LLM-generated config files. Patterns that live outside the config (Dockerfiles, external JWT validation, server components, runtime API calls) are catalogued for a follow-up that extends the harness.
- AST assertion hygiene rule. Cases only assert what the LLM must actively produce. Assertions trivially satisfied by the starter fixture are removed and replaced with assertions: [] plus an inline reason. The OpenAI scorer carries cases where no AST kind applies.
- Grounding in test/. 52 of 61 eval cases (85%) mirror a real working pattern from a test/<suite>/config. The 6 cases that use features documented in docs/ but not exercised in test/ carry an inline // docs-grounded: comment naming the docs path.
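The AST assertion hygiene rule can be pictured as a filter over a case's assertions. The sketch below is only an illustration of the rule as stated: `PathAssertion`, `EvalCaseSketch`, `applyHygieneRule`, and the flat `starterValues` lookup are all hypothetical, and the real dataset shape in the harness is not shown in this description.

```typescript
// Hypothetical, simplified shapes for illustration only.
type PathAssertion = { kind: 'configOption'; path: string; value: unknown }
type EvalCaseSketch = { name: string; assertions: PathAssertion[]; reason?: string }

// Drop assertions that the starter fixture already satisfies. If nothing is left,
// keep `assertions: []` and record why, so the scorer-only case is explained inline.
function applyHygieneRule(
  evalCase: EvalCaseSketch,
  starterValues: Record<string, unknown>,
): EvalCaseSketch {
  const assertions = evalCase.assertions.filter(
    (assertion) =>
      JSON.stringify(starterValues[assertion.path]) !== JSON.stringify(assertion.value),
  )
  if (assertions.length > 0) {
    return { ...evalCase, assertions }
  }
  return {
    ...evalCase,
    assertions: [],
    reason: 'starter fixture already satisfies every assertion; scorer only',
  }
}
```

An assertion survives only if the model has to change the starter config to satisfy it, which is what keeps a passing result meaningful.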