arrdem/source

Fork 0

feat(rulebook): Paperfile-style DSL for Arstotzka policy #4

Open

arrdem wants to merge 7 commits from arrdem/rulebook-dsl into trunk

arrdem commented

2026-04-23 18:04:02 +00:00

Owner

Introduces projects/rulebook/ — a Papers-Please-themed DSL for declaring Arstotzka audiences, resources, actions, and roles. Services will carry a literal Rulebook file; a future arstotzka_catalog aggregator collates them into the JSON catalog the auth server loads at startup.

This PR ships the DSL, loader, validator, renderer, and CLI only — no consumers yet. The evaluator is built on starlark-pyo3 (same as Tiller), so load() has native Starlark semantics with implicit symbol binding.

Scope

Statements carry an explicit effect of "allow" or "deny" so the broad-allow + point-deny pattern is expressible from day one — e.g. "console-reader has read on everything except secrets.*". The arstotzka evaluator gains matching deny semantics in PR #8 alongside the catalog loader, and that PR also updates projects/arstotzka/SPEC.md. Nothing reads the catalog before PR #8 lands; any future arstotzka build that reads the catalog without deny support would default effect-less statements to allow.

Closure expansion is asymmetric on `effect`

allow — the renderer expands each listed resource to the transitive descendant closure under extends, so listing queue actually emits ["queue", "queue.reservation"] and the runtime fnmatch evaluator matches both.
deny — the renderer emits the listed resources verbatim. Surgical exclusion ("allow secrets.* except secrets.private") is the load-bearing use case for deny; a deny that silently expanded into the whole subtree would also clobber secrets.private.metadata and break that contract. Subtree-deny remains expressible via a glob pattern (secrets.private.*) — fnmatch handles it.

Validation

Each action in a statement must appear in r.all_actions of every listed resource (extends-up inheritance only). Descendants do not launder their actions back up into ancestors at validation time — statement(resources=[parent], actions=["release"]) fails if only a child of parent publishes release, because the renderer would otherwise emit a catalog entry that the runtime fnmatch evaluator allows on parent directly.

Dimensions and conditional matchers are deferred to a later PR that pairs each addition with the matching evaluator change. The attr.* namespace stays exposed for cross-Starlark consistency with Tiller / Bazel-style rule definitions; its carriers are inert today and re-enter the model alongside the evaluator change.

Stack

PR #4 (this) — DSL, loader, validator, renderer, CLI.
PR #7 — rulebook_schema Bazel rule + first real Rulebook in projects/doorman/.
PR #8 — arstotzka_catalog aggregator + server-side JSON loader + GET /schema + evaluator deny semantics + SPEC update.
PR #9 — authctl schema / authctl validate / tab-completion off the live catalog.

The boundary at PR #8 is load-bearing: starlark-pyo3 is build-time-only and never enters the auth server's runtime closure. The server consumes the JSON action output from PR #7's rule.

Review response

First-round review fixes:

extends now does what the README claimed. The renderer expands every allow-statement's resources list to the transitive descendant closure of each listed resource, so listing the parent actually matches descendant literals under fnmatch.
Deny stays, paired with evaluator support. Rather than drop deny, this PR keeps effect="allow" | "deny" and PR #8 takes ownership of the evaluator + SPEC update. The broad-allow + point-deny pattern is too useful to deprecate by omission.
Dimensions dropped (for now). They had no wire representation and no clear path through the runtime; reintroducing them is paired with their own evaluator change in a later PR.
Loader hardening. Inner StarlarkError wrapped at every eval site (prelude, parse, body, load); _resolve_label rejects //../../etc:passwd-style traversal via is_relative_to; load cycles caught by an in-flight set; raise on missing workspace marker; .star-only suffixes.
Implicit-concat check dropped. Starlark's parser already rejects implicit string concatenation; the Python tokenize-based check did not handle 3.12+ f-strings and was redundant.
attr.string(pattern=...) eagerly compiles the regex; previously the pattern sat unvalidated in the schema.

Second-round review fixes (866aef8):

Validator no longer launders descendant actions. Closure-union over all_actions was rewritten to per-listed-resource all_actions (extends-up only). test_action_must_be_on_listed_resource_not_just_descendants is the regression test sketched in the review.
Deny doesn't expand. _statement_to_json keeps deny resources literal; allow expands. test_deny_statement_does_not_expand_descendants pins the four-deep-extends-chain case.
Forward-compat phrasing tightened. Pre-PR-#8 arstotzka does not read the catalog; phrasing now describes the staged-plan window correctly.

Not addressed in this PR:

Audience / JWT aud naming collision — flagged but not a blocker.
Implicit registration via load() — loaded modules can call role() directly. Authority shape worth revisiting; deferred so the staged plan can land.
attr.* left callable-but-inert. The reviewer suggested either dropping it or making it raise; the explicit project direction is to keep the namespace stable for cross-Starlark consistency with Tiller, and raising would break that goal the moment a Rulebook author copies a snippet across.

Changes are visible to end-users: no

Test plan

New test cases added — see projects/rulebook/tests/ (7 suites, 67 tests)
bazel test //projects/rulebook/... is currently blocked by an unrelated infra issue: a downloader rewrite redirects hermetic_launcher's GitHub upstream runfiles_stub_x86_64_linux artifact to cairn.tirefireind.us, which 404s on the expected blob key. --repository_cache= --disk_cache= does not bypass the rewrite. Tracked separately. 67/67 tests pass under uv run pytest -p no:pudb in the meantime.
End-to-end smoke via uv run rulebook render and uv run rulebook validate on sample files

Introduces `projects/rulebook/` — a Papers-Please-themed DSL for declaring Arstotzka audiences, resources, actions, and roles. Services will carry a literal `Rulebook` file; a future `arstotzka_catalog` aggregator collates them into the JSON catalog the auth server loads at startup. This PR ships the DSL, loader, validator, renderer, and CLI only — no consumers yet. The evaluator is built on `starlark-pyo3` (same as Tiller), so `load()` has native Starlark semantics with implicit symbol binding. ### Scope Statements carry an explicit `effect` of `"allow"` or `"deny"` so the broad-allow + point-deny pattern is expressible from day one — e.g. "console-reader has `read` on everything except `secrets.*`". The arstotzka evaluator gains matching deny semantics in [PR #8](https://git.tirefireind.us/arrdem/source/pulls/8) alongside the catalog loader, and that PR also updates `projects/arstotzka/SPEC.md`. Nothing reads the catalog before PR #8 lands; any future arstotzka build that reads the catalog without deny support would default effect-less statements to allow. #### Closure expansion is asymmetric on `effect` - **allow** — the renderer expands each listed resource to the transitive descendant closure under `extends`, so listing `queue` actually emits `["queue", "queue.reservation"]` and the runtime fnmatch evaluator matches both. - **deny** — the renderer emits the listed resources verbatim. Surgical exclusion ("allow `secrets.*` except `secrets.private`") is the load-bearing use case for deny; a deny that silently expanded into the whole subtree would also clobber `secrets.private.metadata` and break that contract. Subtree-deny remains expressible via a glob pattern (`secrets.private.*`) — fnmatch handles it. #### Validation Each action in a statement must appear in `r.all_actions` of **every** listed resource (extends-up inheritance only). Descendants do not launder their actions back up into ancestors at validation time — `statement(resources=[parent], actions=["release"])` fails if only a child of `parent` publishes `release`, because the renderer would otherwise emit a catalog entry that the runtime fnmatch evaluator allows on `parent` directly. Dimensions and conditional matchers are deferred to a later PR that pairs each addition with the matching evaluator change. The `attr.*` namespace stays exposed for cross-Starlark consistency with Tiller / Bazel-style rule definitions; its carriers are inert today and re-enter the model alongside the evaluator change. ### Stack - **PR #4 (this)** — DSL, loader, validator, renderer, CLI. - [PR #7](https://git.tirefireind.us/arrdem/source/pulls/7) — `rulebook_schema` Bazel rule + first real Rulebook in `projects/doorman/`. - [PR #8](https://git.tirefireind.us/arrdem/source/pulls/8) — `arstotzka_catalog` aggregator + server-side JSON loader + `GET /schema` + **evaluator deny semantics** + SPEC update. - [PR #9](https://git.tirefireind.us/arrdem/source/pulls/9) — `authctl schema` / `authctl validate` / tab-completion off the live catalog. The boundary at PR #8 is load-bearing: `starlark-pyo3` is build-time-only and never enters the auth server's runtime closure. The server consumes the JSON action output from PR #7's rule. ### Review response First-round review fixes: - **`extends` now does what the README claimed.** The renderer expands every allow-statement's `resources` list to the transitive descendant closure of each listed resource, so listing the parent actually matches descendant literals under fnmatch. - **Deny stays, paired with evaluator support.** Rather than drop deny, this PR keeps `effect="allow" | "deny"` and PR #8 takes ownership of the evaluator + SPEC update. The broad-allow + point-deny pattern is too useful to deprecate by omission. - **Dimensions dropped (for now).** They had no wire representation and no clear path through the runtime; reintroducing them is paired with their own evaluator change in a later PR. - **Loader hardening.** Inner `StarlarkError` wrapped at every eval site (prelude, parse, body, load); `_resolve_label` rejects `//../../etc:passwd`-style traversal via `is_relative_to`; load cycles caught by an in-flight set; raise on missing workspace marker; `.star`-only suffixes. - **Implicit-concat check dropped.** Starlark's parser already rejects implicit string concatenation; the Python `tokenize`-based check did not handle 3.12+ f-strings and was redundant. - **`attr.string(pattern=...)`** eagerly compiles the regex; previously the pattern sat unvalidated in the schema. Second-round review fixes (`866aef8`): - **Validator no longer launders descendant actions.** Closure-union over `all_actions` was rewritten to per-listed-resource `all_actions` (extends-up only). `test_action_must_be_on_listed_resource_not_just_descendants` is the regression test sketched in the review. - **Deny doesn't expand.** `_statement_to_json` keeps deny resources literal; allow expands. `test_deny_statement_does_not_expand_descendants` pins the four-deep-extends-chain case. - **Forward-compat phrasing tightened.** Pre-PR-#8 arstotzka does not read the catalog; phrasing now describes the staged-plan window correctly. Not addressed in this PR: - Audience / JWT `aud` naming collision — flagged but not a blocker. - Implicit registration via `load()` — loaded modules can call `role()` directly. Authority shape worth revisiting; deferred so the staged plan can land. - `attr.*` left callable-but-inert. The reviewer suggested either dropping it or making it raise; the explicit project direction is to keep the namespace stable for cross-Starlark consistency with Tiller, and raising would break that goal the moment a Rulebook author copies a snippet across. ### Changes are visible to end-users: no ### Test plan - New test cases added — see `projects/rulebook/tests/` (7 suites, 67 tests) - `bazel test //projects/rulebook/...` is currently blocked by an unrelated infra issue: a downloader rewrite redirects `hermetic_launcher`'s GitHub upstream `runfiles_stub_x86_64_linux` artifact to `cairn.tirefireind.us`, which 404s on the expected blob key. `--repository_cache= --disk_cache=` does not bypass the rewrite. Tracked separately. 67/67 tests pass under `uv run pytest -p no:pudb` in the meantime. - End-to-end smoke via `uv run rulebook render` and `uv run rulebook validate` on sample files

arrdem added 1 commit

2026-04-23 18:04:02 +00:00

feat(rulebook): Introduce Paperfile-style DSL for Arstotzka policy 4a2e8b0723

New projects/rulebook/ package. Each service will declare its audience,
resources (with typed dimensions and actions), and roles (statements
over those resources) in a literal `Rulebook` file -- evaluated in a
restricted Python sandbox mirroring Tiller's loader design.

This PR only lands the DSL, loader, validator, render/CLI, and unit
tests. Bazel rules (rulebook_library, rulebook_schema), the arstotzka
catalog aggregator, and the first real Rulebook in projects/doorman/
come in follow-up PRs.

arrdem added 1 commit

2026-04-23 18:18:20 +00:00

refactor(rulebook): Swap exec-sandbox loader for starlark-pyo3 92f9a9891a

Drop the hand-rolled restricted-Python `exec` evaluator in favour of
starlark-pyo3, matching Tiller's architecture. Builders remain Python
callables wrapped via `Module.add_callable`; return values travel as
`OpaquePythonObject` so they round-trip through Starlark data
structures without losing identity.

`attr.enum` / `attr.string` are exposed via a Starlark `struct()` in a
tiny prelude. `load("//path:name", "sym_a", "sym_b")` now uses native
Starlark semantics -- implicit symbol binding in the caller's scope --
so the "load returns a value you assign" wart is gone.

arrdem commented

2026-04-26 19:12:34 +00:00

Author

Owner

Reviewed against trunk merge-base 6cbd038. The real diff is 19 files / +1455, all new under projects/rulebook/ plus three workspace-registration lines — Forgejo's "1455 / 1378 / 19" stat is misleading because the unrelated monoversion churn is on trunk, not this branch. Worth a git fetch origin trunk && git rebase origin/trunk before merge so the file list isn't confusing in the merge commit.

Hostile conceptual review

I think the model is doing too much, claims things it doesn't deliver, and is meaningfully out of step with the system it's allegedly typing. Going from worst to least bad:

1. Rulebook's data model does not match Arstotzka's. Per projects/arstotzka/SPEC.md and projects/arstotzka/statements/src/arstotzka/statements/evaluator.py:

An arstotzka statement is {"actions": [...], "resources": [...]}. Both fields are flat fnmatch globs against opaque strings.
"There is no deny." — explicit text in the spec.
There is no concept of "dimensions."
Resources are flat strings; the <type>:<name> grammar is cosmetic.

Rulebook adds (a) deny statements, (b) typed dimensions (enum/string), and (c) extends-based resource hierarchy. None of these have a wire representation in the current arstotzka model. The README says PR #3 will make arstotzka "validate its config against the catalog" — but the config statements have no slot for dimensions or denies, so either Rulebook is silently lying about the runtime semantics, or Arstotzka's whole model is changing and that change isn't in this PR's body. Pick one and say so.

2. The extends "implicit subresource cover" is a runtime fiction. README:

Listing queue in a statement implicitly covers queue.reservation (because reservation extends queue); to exclude the subresource, add a deny-statement listing queue.reservation explicitly.

Read render.py:_statement_to_json — resources is rendered verbatim as [r.name for r in s.resources]. There is no expansion. The arstotzka evaluator does fnmatchcase(resource, pattern) against literals, so a statement listing queue will not match the runtime resource string queue.reservation. The behavior the README describes does not exist anywhere in the code; this is a correctness bug masquerading as a feature.

3. Dimensions have no story to ground them. evaluator.allows(statements, action, resource) takes two strings. The wire format has no place to put priority=max. Either dimensions get encoded into the resource string (e.g. queue:priority=max) — which the renderer does not do — or evaluated by relying parties out-of-band by some unwritten convention. As shipped, dimensions are decorative typing without a runtime contract.

4. effect="deny" is accepted, validated, and rendered, but arstotzka has no deny. Same root cause as (1). Pick a story.

5. The DSL is much bigger than the actual problem. Today's Arstotzka has on the order of two distinct statement shapes ({*, *} for admin, {*:read, *} for reader). For that you do not need: a Starlark interpreter, an OpaquePythonObject round-trip layer, a custom workspace-root walker, a bespoke //pkg:name label resolver, a Python-tokenize-based implicit-string-concat sniffer, registration-by-side-effect builders, and a four-stage roadmap. A YAML schema and 50 lines of jsonschema-style validation gets you the "compile-time vocabulary check" benefit without committing to a programming model. The Tiller analogy is doing a lot of work in this design ("we already use Starlark for one config, ship it again") — but Tiller has actual programmatic logic (digest pinning, transforms). Auth roles do not.

6. starlark-pyo3 will end up as a runtime dep on the auth server. PR #3 is described as "arstotzka server loads the catalog at startup." If "loads" means evaluates Rulebooks, that pulls a native Starlark interpreter into the boot path of the thing that gates every ingress. The right shape is: render the catalog to JSON at build time (the implied rulebook_schema Bazel rule in PR #2), check the JSON in to the arstotzka server's config, and never let Starlark anywhere near the server process. Worth being explicit about that boundary now so PR #3 doesn't drift.

7. Implicit registration via load() is an unfortunate authority shape for a security DSL. Loaded modules see the same builders as the root file and can call role() to register directly into the audience. That's action-at-a-distance: a load("//tools/arstotzka:policy", "standard_user_admin_roles") can in principle register additional roles you didn't ask for, invisibly. Pure constructor + explicit attach (audience.add(make_role(...))) is the kind of thing security policy specifically wants — the cost is verbose Rulebooks, the benefit is that what's in your file is what gets registered.

8. "Audience" naming collides with JWT aud. Arstotzka's aud claim means the forwarded host. Rulebook's audience(name="...") uses the same string for the same concept — fine — but reusing the JWT term for the service publishing the schema invites confusion when someone reads "audience" in code and is unsure whether it's the producer or the JWT consumer. Not a blocker, but the Papers-Please cosplay loses to clarity here.

9. PR-numbering in the body is internal-aspirational (this is forgejo PR #4, not the body's "PR #1"). Cosmetic, easy fix; flagged because anyone clicking the "PR #2/#3/#4" references will be confused.

The path I'd push for: shrink Rulebook to whatever model arstotzka's evaluator actually understands today, ship the JSON schema, and only grow back deny / dimensions / extends together with the matching evaluator changes. The current PR sets up an expressivity overhang — a DSL whose semantics have no runtime — that's hard to walk back once Rulebooks are written.

Code review

Bugs and correctness

loader.py — StarlarkError from a load()'d module isn't wrapped. evaluate_rulebook only catches StarlarkError at the top-level starlark.eval. Inside _make_file_loader.load_func, the inner starlark.eval(mod, ast, globals_, file_loader=inner_loader) call has no catch, so an error in a loaded policy module propagates as a raw StarlarkError past the CLI's except RulebookError handlers and prints a Python traceback to the user. Wrap the inner eval too. There's no test for this; test_loader.py:test_open_is_not_exposed only exercises top-level errors.
loader.py:_resolve_label does not constrain pkg to the workspace. A label like load("//../../../etc:passwd", "_") resolves via workspace_root / pkg / f"{name}{suffix}" and Path will happily traverse out. Today this is low-risk because Rulebooks are trusted, but a label resolver should resolve() the candidate and verify is_relative_to(workspace_root). Cheap to fix, prevents a footgun later.
loader.py:_make_file_loader is recursion-cycle-naive. The cache only stores FrozenModule after eval completes; during eval, a self-referential load() re-enters and recurses until the stack blows. Maintain an "in-flight" set of resolved paths; raise on re-entry.
loader.py:_find_workspace_root silently falls back to start.resolve().parent when no marker is found between the file and /. This is a hidden mode that makes label resolution work in tests (where .git is created in tmp_path) but produces silently-wrong resolution if a Rulebook is evaluated outside any workspace. Raise instead — the caller can catch.
_resolve_label falls through to a no-extension match (""). (.star, .py, "") means load("//foo:Rulebook") resolves to a file literally named Rulebook. Probably unintended for load(); that bare-name path is for the root Rulebook file, not for module loads. Consider dropping the "" suffix from the load path.
.py files load as Starlark. README says load resolves .star "or .py". A .py file rarely parses cleanly as Starlark; this resolution mostly invites confusion when someone names a shared module policy.py and gets a StarlarkError they don't understand. I'd drop .py from the suffix list.
_check_implicit_string_concat won't see f-string adjacency on Python 3.12+. Python 3.12 tokenizes f"a" as FSTRING_START/FSTRING_MIDDLE/FSTRING_END, not STRING. So f"a" "b" is not flagged. Belt-and-suspenders check is fine, but worth a note that Starlark's parser is the real defense.
validate.py:_is_glob only matches *, while the runtime evaluator uses fnmatchcase which also supports ?, [abc], [!abc]. So f?o is treated as a literal by validation and rejected for an enum that lacks the literal f?o, but accepted at runtime as a glob. Validator drift from the runtime matcher.
attr.string(pattern=...) never validates the pattern. It's documented as a regex but never compiled, never checked. An invalid regex sits silently in the schema until something downstream tries to use it.
render.py does not encode effect="deny" in any distinguishing way besides the field — fine on its own, but combined with conceptual-review point (4), the JSON schema is committing to a shape that arstotzka can't honor.

Style / nits

_install_builders has an over-engineered closure capture. def _wrap(inner=fn): ... ; module.add_callable(name, _wrap()) — the default-arg trick is right but could just be a top-level _wrap_builder(fn) helper for readability.
builders.py:_Context.audience is registered-once-or-error, but _require_audience("resource") ducks under that with a runtime error string. Consider asserting the caller name with a typed enum/Literal — small thing.
validate.py re-checks duplicate names that builders.py already enforces. Docstring calls it "defense in depth"; it's also dead in normal flows. Fine but flag-worthy.
validate.py:_validate_statement joins resource names with ", ". Resource names with embedded commas/spaces would make the message ambiguous. Cosmetic.
Statement is the only model type without a description field. Consistency.

Test gaps

No CLI test (click.testing.CliRunner) — --skip-validate, error exit codes, JSON output shape are uncovered.
No effect="deny" rendering test — given how shaky deny is, this should be exercised.
No load() error-path tests: missing colon, missing // prefix, non-existent target, syntax error inside loaded module (this is the bug above).
No path-traversal test for load("//../...").
No load-cycle test.
No .py extension test for load().

Build / packaging

pyproject.toml is fine. BUILD.bazel follows the project pattern. No __init__.py under src/rulebook/ — adheres to repo rule.
Workspace registrations in MODULE.bazel, root pyproject.toml, and uv.lock look complete.

PR description

"PR #2/#3/#4" in the body refers to logical follow-up PRs, but Forgejo numbers will not match. Either renumber as "Phase 2/3/4" or call them "follow-up A/B/C" to avoid confusion with the actual PR number (this is #4).

TL;DR

The code is clean and the build/packaging is correct, but conceptually the DSL is committing to model features (deny, dimensions, extends-as-cover) that the runtime doesn't implement, and the README claims a runtime behavior (queue covers queue.reservation) that the renderer doesn't produce. The biggest concrete bugs are: load-error wrapping, path-traversal in label resolution, and load-cycle handling. The biggest design question is: what does Rulebook commit Arstotzka to changing, and is that change in scope or out of scope?

Reviewed against trunk merge-base `6cbd038`. The real diff is 19 files / +1455, all new under `projects/rulebook/` plus three workspace-registration lines — Forgejo's "1455 / 1378 / 19" stat is misleading because the unrelated monoversion churn is on trunk, not this branch. Worth a `git fetch origin trunk && git rebase origin/trunk` before merge so the file list isn't confusing in the merge commit. ## Hostile conceptual review I think the model is doing too much, claims things it doesn't deliver, and is meaningfully out of step with the system it's allegedly typing. Going from worst to least bad: **1. Rulebook's data model does not match Arstotzka's.** Per `projects/arstotzka/SPEC.md` and `projects/arstotzka/statements/src/arstotzka/statements/evaluator.py`: - An arstotzka statement is `{"actions": [...], "resources": [...]}`. Both fields are flat fnmatch globs against opaque strings. - *"There is no deny."* — explicit text in the spec. - There is no concept of "dimensions." - Resources are flat strings; the `<type>:<name>` grammar is *cosmetic*. Rulebook adds (a) deny statements, (b) typed dimensions (enum/string), and (c) `extends`-based resource hierarchy. None of these have a wire representation in the current arstotzka model. The README says PR #3 will make arstotzka "validate its config against the catalog" — but the config statements have no slot for dimensions or denies, so either Rulebook is silently lying about the runtime semantics, or Arstotzka's whole model is changing and that change isn't in this PR's body. Pick one and say so. **2. The `extends` "implicit subresource cover" is a runtime fiction.** README: > Listing `queue` in a statement implicitly covers `queue.reservation` (because reservation `extends` queue); to exclude the subresource, add a deny-statement listing `queue.reservation` explicitly. Read `render.py:_statement_to_json` — `resources` is rendered verbatim as `[r.name for r in s.resources]`. There is no expansion. The arstotzka evaluator does `fnmatchcase(resource, pattern)` against literals, so a statement listing `queue` will *not* match the runtime resource string `queue.reservation`. The behavior the README describes does not exist anywhere in the code; this is a correctness bug masquerading as a feature. **3. Dimensions have no story to ground them.** `evaluator.allows(statements, action, resource)` takes two strings. The wire format has no place to put `priority=max`. Either dimensions get encoded into the resource string (e.g. `queue:priority=max`) — which the renderer does not do — or evaluated by relying parties out-of-band by some unwritten convention. As shipped, dimensions are decorative typing without a runtime contract. **4. `effect="deny"` is accepted, validated, and rendered, but arstotzka has no deny.** Same root cause as (1). Pick a story. **5. The DSL is much bigger than the actual problem.** Today's Arstotzka has on the order of two distinct statement shapes (`{*, *}` for admin, `{*:read, *}` for reader). For that you do not need: a Starlark interpreter, an `OpaquePythonObject` round-trip layer, a custom workspace-root walker, a bespoke `//pkg:name` label resolver, a Python-tokenize-based implicit-string-concat sniffer, registration-by-side-effect builders, and a four-stage roadmap. A YAML schema and 50 lines of jsonschema-style validation gets you the "compile-time vocabulary check" benefit without committing to a programming model. The Tiller analogy is doing a lot of work in this design ("we already use Starlark for one config, ship it again") — but Tiller has actual programmatic logic (digest pinning, transforms). Auth roles do not. **6. starlark-pyo3 will end up as a runtime dep on the auth server.** PR #3 is described as "arstotzka server loads the catalog at startup." If "loads" means evaluates Rulebooks, that pulls a native Starlark interpreter into the boot path of the thing that gates every ingress. The right shape is: render the catalog to JSON at build time (the implied `rulebook_schema` Bazel rule in PR #2), check the JSON in to the arstotzka server's config, and never let Starlark anywhere near the server process. Worth being explicit about that boundary now so PR #3 doesn't drift. **7. Implicit registration via `load()` is an unfortunate authority shape for a security DSL.** Loaded modules see the same builders as the root file and can call `role()` to register directly into the audience. That's action-at-a-distance: a `load("//tools/arstotzka:policy", "standard_user_admin_roles")` can in principle register *additional* roles you didn't ask for, invisibly. Pure constructor + explicit attach (`audience.add(make_role(...))`) is the kind of thing security policy specifically wants — the cost is verbose Rulebooks, the benefit is that what's in your file is what gets registered. **8. "Audience" naming collides with JWT `aud`.** Arstotzka's `aud` claim means the forwarded host. Rulebook's `audience(name="...")` uses the same string for the same concept — fine — but reusing the JWT term for the *service publishing the schema* invites confusion when someone reads "audience" in code and is unsure whether it's the producer or the JWT consumer. Not a blocker, but the Papers-Please cosplay loses to clarity here. **9. PR-numbering in the body is internal-aspirational** (this is forgejo PR #4, not the body's "PR #1"). Cosmetic, easy fix; flagged because anyone clicking the "PR #2/#3/#4" references will be confused. The path I'd push for: shrink Rulebook to whatever model arstotzka's evaluator actually understands today, ship the JSON schema, and only grow back deny / dimensions / extends *together with* the matching evaluator changes. The current PR sets up an expressivity overhang — a DSL whose semantics have no runtime — that's hard to walk back once Rulebooks are written. --- ## Code review ### Bugs and correctness - **`loader.py` — `StarlarkError` from a `load()`'d module isn't wrapped.** `evaluate_rulebook` only catches `StarlarkError` at the top-level `starlark.eval`. Inside `_make_file_loader.load_func`, the inner `starlark.eval(mod, ast, globals_, file_loader=inner_loader)` call has no catch, so an error in a loaded policy module propagates as a raw `StarlarkError` past the CLI's `except RulebookError` handlers and prints a Python traceback to the user. Wrap the inner eval too. There's no test for this; `test_loader.py:test_open_is_not_exposed` only exercises top-level errors. - **`loader.py:_resolve_label` does not constrain `pkg` to the workspace.** A label like `load("//../../../etc:passwd", "_")` resolves via `workspace_root / pkg / f"{name}{suffix}"` and `Path` will happily traverse out. Today this is low-risk because Rulebooks are trusted, but a label resolver should `resolve()` the candidate and verify `is_relative_to(workspace_root)`. Cheap to fix, prevents a footgun later. - **`loader.py:_make_file_loader` is recursion-cycle-naive.** The `cache` only stores `FrozenModule` *after* eval completes; during eval, a self-referential `load()` re-enters and recurses until the stack blows. Maintain an "in-flight" set of resolved paths; raise on re-entry. - **`loader.py:_find_workspace_root` silently falls back to `start.resolve().parent`** when no marker is found between the file and `/`. This is a hidden mode that makes label resolution work in tests (where `.git` is created in tmp_path) but produces silently-wrong resolution if a Rulebook is evaluated outside any workspace. Raise instead — the caller can catch. - **`_resolve_label` falls through to a no-extension match (`""`).** `(.star, .py, "")` means `load("//foo:Rulebook")` resolves to a file literally named `Rulebook`. Probably unintended for `load()`; that bare-name path is for the *root* `Rulebook` file, not for module loads. Consider dropping the `""` suffix from the load path. - **`.py` files load as Starlark.** README says load resolves `.star` "or `.py`". A `.py` file rarely parses cleanly as Starlark; this resolution mostly invites confusion when someone names a shared module `policy.py` and gets a `StarlarkError` they don't understand. I'd drop `.py` from the suffix list. - **`_check_implicit_string_concat` won't see f-string adjacency on Python 3.12+.** Python 3.12 tokenizes `f"a"` as `FSTRING_START`/`FSTRING_MIDDLE`/`FSTRING_END`, not `STRING`. So `f"a" "b"` is not flagged. Belt-and-suspenders check is fine, but worth a note that Starlark's parser is the real defense. - **`validate.py:_is_glob` only matches `*`,** while the runtime evaluator uses `fnmatchcase` which also supports `?`, `[abc]`, `[!abc]`. So `f?o` is treated as a literal by validation and rejected for an enum that lacks the literal `f?o`, but accepted at runtime as a glob. Validator drift from the runtime matcher. - **`attr.string(pattern=...)` never validates the pattern.** It's documented as a regex but never compiled, never checked. An invalid regex sits silently in the schema until something downstream tries to use it. - **`render.py` does not encode `effect="deny"` in any distinguishing way besides the field** — fine on its own, but combined with conceptual-review point (4), the JSON schema is committing to a shape that arstotzka can't honor. ### Style / nits - **`_install_builders` has an over-engineered closure capture.** `def _wrap(inner=fn): ... ; module.add_callable(name, _wrap())` — the default-arg trick is right but could just be a top-level `_wrap_builder(fn)` helper for readability. - **`builders.py:_Context.audience` is registered-once-or-error,** but `_require_audience("resource")` ducks under that with a runtime error string. Consider asserting the caller name with a typed enum/Literal — small thing. - **`validate.py` re-checks duplicate names** that `builders.py` already enforces. Docstring calls it "defense in depth"; it's also dead in normal flows. Fine but flag-worthy. - **`validate.py:_validate_statement` joins resource names with `", "`.** Resource names with embedded commas/spaces would make the message ambiguous. Cosmetic. - **`Statement` is the only model type without a `description` field.** Consistency. ### Test gaps - No CLI test (`click.testing.CliRunner`) — `--skip-validate`, error exit codes, JSON output shape are uncovered. - No `effect="deny"` rendering test — given how shaky deny is, this should be exercised. - No `load()` error-path tests: missing colon, missing `//` prefix, non-existent target, syntax error inside loaded module (this is the bug above). - No path-traversal test for `load("//../...")`. - No load-cycle test. - No `.py` extension test for `load()`. ### Build / packaging - `pyproject.toml` is fine. `BUILD.bazel` follows the project pattern. No `__init__.py` under `src/rulebook/` — adheres to repo rule. - Workspace registrations in `MODULE.bazel`, root `pyproject.toml`, and `uv.lock` look complete. ### PR description - "PR #2/#3/#4" in the body refers to logical follow-up PRs, but Forgejo numbers will not match. Either renumber as "Phase 2/3/4" or call them "follow-up A/B/C" to avoid confusion with the actual PR number (this is #4). --- ## TL;DR The code is clean and the build/packaging is correct, but conceptually the DSL is committing to model features (deny, dimensions, extends-as-cover) that the runtime doesn't implement, and the README claims a runtime behavior (`queue` covers `queue.reservation`) that the renderer doesn't produce. The biggest concrete bugs are: load-error wrapping, path-traversal in label resolution, and load-cycle handling. The biggest design question is: what does Rulebook commit Arstotzka to changing, and is that change in scope or out of scope?

arrdem force-pushed arrdem/rulebook-dsl from 92f9a9891a to 25233ac3b4

2026-04-26 19:29:34 +00:00

Compare

arrdem referenced this pull request from a commit

2026-04-26 19:31:34 +00:00

chore(authctl): Stub PR #7 scope — authctl catalog integration

arrdem referenced this pull request

2026-04-26 19:31:59 +00:00

WIP: feat(rulebook): bazel rules + first Rulebook #7

arrdem referenced this pull request

2026-04-26 19:32:29 +00:00

WIP: feat(arstotzka): catalog loader + GET /schema #8

arrdem referenced this pull request

2026-04-26 19:32:37 +00:00

WIP: feat(authctl): catalog browse + validate + completion #9

arrdem referenced this pull request from a commit

2026-04-26 19:35:43 +00:00

chore(rulebook): Stub PR #7 scope — bazel rules + first Rulebook

arrdem referenced this pull request from a commit

2026-04-26 19:35:43 +00:00

chore(arstotzka): Stub PR #8 scope — catalog loader

arrdem referenced this pull request from a commit

2026-04-26 19:35:43 +00:00

chore(authctl): Stub PR #9 scope — authctl catalog integration

arrdem added 1 commit

2026-04-26 19:35:43 +00:00

docs(rulebook): Anchor roadmap to real Forgejo PR numbers (#7/#8/#9) aeb35ad946

arrdem referenced this pull request from a commit

2026-04-26 19:41:40 +00:00

chore(rulebook): Stub PR #7 scope — bazel rules + first Rulebook

arrdem referenced this pull request from a commit

2026-04-26 19:41:40 +00:00

chore(arstotzka): Stub PR #8 scope — catalog loader

arrdem referenced this pull request from a commit

2026-04-26 19:41:40 +00:00

chore(authctl): Stub PR #9 scope — authctl catalog integration

arrdem added 1 commit

2026-04-26 19:41:41 +00:00

feat(rulebook): Restore deny semantics to the model c4a207a0aa

Statement gains an explicit effect field of "allow" | "deny" so the
broad-allow + point-deny pattern is expressible from day one — e.g.
"console-reader has read on everything except secrets.*". The pattern
is the load-bearing case for keeping deny in the DSL: without it, a
narrow exclusion forces the author to enumerate every leaf they want
to keep.

The arstotzka evaluator gains matching deny semantics in PR #8
alongside the catalog loader; nothing consumes Rulebook output until
that PR lands, so deny statements written today take effect in
lockstep with their runtime support. Older arstotzka builds that read
the rendered catalog default effect-less statements to allow, so the
wire shape is forward-compatible.

Updates the model + builders + renderer + validator + tests + README.
Restores test_statement_rejects_unknown_effect /
test_statement_accepts_deny_effect / a deny-render test / a
deny-validates-same-as-allow test.

arrdem commented

2026-04-26 19:49:13 +00:00

Author

Owner

Re-reviewed at c4a207a. Most of the original review is addressed; flagging what remains, plus one new finding.

Resolved

Extends does closure expansion. render.py:_expand_resources walks descendants and dedupes; test_statement_expands_to_extends_closure and test_listing_two_siblings_dedupes_resources cover it. The README/PR-body framing of expansion as the load-bearing transform reads honestly.
Loader hardening. All four bugs fixed: inner StarlarkError is wrapped per-eval site (prelude, parse, body), _resolve_label does is_relative_to(workspace_resolved), an in_flight set catches cycles, missing-marker now raises. New tests for each (test_load_inner_error_is_wrapped, test_load_path_traversal_is_rejected, test_load_cycle_is_detected, test_no_workspace_marker_raises).
Suffix list trimmed to .star only with test_load_only_resolves_star_extension pinning it.
Implicit-concat check dropped with a one-liner in the loader docstring explaining why — Starlark's parser is the source of truth. test_starlark_rejects_implicit_string_concat still verifies the behavior end-to-end.
attr.string(pattern=...) compiles the regex eagerly.
CLI tests added (test_cli.py, 7 cases) covering exit codes, --skip-validate, JSON shape, and load-error surfacing.
_install_builders closure capture cleanup. _wrap_builder is now top-level.
Stack/PR-numbering uses real Forgejo numbers (#7/#8/#9) with explicit links.

Still flagged

Closure expansion grants actions on resources that don't publish them

This is new — surfaces as a consequence of the closure-expansion design. Construct:

audience(name = "x")
parent = resource(name = "P", actions = ["read"])              # P publishes only read
child  = resource(name = "C", actions = ["read", "write"], extends = parent)
role(
    name = "r",
    statements = [statement(resources = [parent], actions = ["write"])],
)

validate.py:_validate_statement builds the closure [P, C], unions r.all_actions over the closure ({"read", "write"} because C has write), and accepts write on the statement. The renderer emits {"actions": ["write"], "resources": ["P", "C"]}. At runtime, (action="write", resource="P") matches via fnmatch — but P doesn't publish write.

So the catalog admits role grants that name actions on resources that don't expose them, by laundering through extends-down. Two coherent fixes:

Tighten validation: each action in a statement must be present on r.all_actions (extends-up only) for every resource in st.resources (the original list, not the closure). Descendants are render-time noise and shouldn't supply actions for the listed parents.
Tighten rendering: render one statement per (resource, action-subset-supported-by-that-resource) tuple, fanning out closure expansion only over actions each leaf actually publishes.

(1) is the smaller change and matches what authors mean when they write statement(resources=[P], actions=["write"]) — they're claiming P has write. Today the validator silently disagrees with the runtime.

A regression test along the lines of:

def test_action_must_be_on_listed_resource_not_just_descendants():
    ctx, b = _mk()
    parent = b["resource"](name="P", actions=["read"])
    b["resource"](name="C", actions=["read", "write"], extends=parent)
    b["role"](name="r", statements=[b["statement"](resources=[parent], actions=["write"])])
    assert any("write" in e for e in validate(ctx.audience))

This currently passes empty-errors; it should fail.

`attr.*` is dead surface

The model dropped dimensions, and EnumType / StringType no longer attach to anything. test_attr_namespace_available_even_if_unused literally writes _ = attr.enum("a", "b") and discards. The PR-body and README justify this as "Bazel-style cross-Starlark consistency" so the namespace stays stable across the future re-introduction.

The cost is that authors can write attr.enum("max", "high") calls today and have them silently no-op. That's a footgun: someone reads the README's attr.* paragraph, infers "this is how I type a dimension", writes dimensions={"priority": attr.enum(...)} (the previous shape), and gets TypeError: resource() got an unexpected keyword argument 'dimensions'. Better to either:

Drop attr from the prelude entirely and re-add when there's a consumer, or
Keep it but make attr.enum(...) and attr.string(...) raise a friendly RulebookError ("attr.* type carriers have no consumer in the current rulebook model; they will be re-introduced with dimensions in a later PR. Remove the call for now.").

The current "callable but inert" choice is the most surprising of the three.

Deny + descendants is asymmetric

README:

To allow the parent but exclude one descendant, pair the broad allow with a narrow deny that lists the descendant explicitly.

Closure expansion runs on deny statements too (correct, given how _statement_to_json is uniform). So a deny listing a descendant secrets.private also denies any descendant of secrets.private — secrets.private.metadata and beyond. That's not "one descendant" — it's a subtree. Either the README phrasing should acknowledge the subtree, or effect="deny" statements should opt out of expansion (renderer emits the literal listed resources only). The latter is more useful for surgical exclusion; the former is what the renderer does today. Pick one and document.

A test pinning the deny+expansion behavior would help — test_deny_statement_renders_with_effect_field doesn't exercise the case where the denied resource has its own descendants.

`bazel test` claim is unverified

The PR body acknowledges the cairn mirror 404 on runfiles_stub_x86_64_linux and falls back to uv run pytest. That's an honest disclosure, but it leaves the green-bazel claim unverified, which is the project's contract for "ready to merge" per CLAUDE.md. Worth either (a) opening a separate ticket on the mirror, (b) confirming locally with a non-cluster bazel cache, or (c) waiting until the mirror's healthy. The diff is well-tested via pytest in either case, but the "all testing through bazel" rule is being relaxed here.

Forward-compat phrasing slightly overstates the case

older arstotzka builds reading the catalog default effect-less statements to allow, so the wire shape is forward-compatible.

Pre-PR-#8 arstotzka doesn't read the catalog at all, so "older arstotzka builds reading the catalog" describes a state that doesn't exist in the field. The forward-compat story really only matters once the catalog loader exists without deny support — a window the staged plan keeps closed. Worth tightening to "any future arstotzka build that reads the catalog without deny support" — small but the current wording pretends there's a back-compat surface to maintain when there isn't yet.

Consciously deferred (acknowledged)

Audience / JWT aud naming collision.
Implicit registration via load() (loaded modules can call role() directly).
arstotzka SPEC.md "There is no deny." — moves to PR #8.

Tests look great

Coverage now includes load-error wrapping, traversal rejection, cycle detection, suffix list, missing-target, missing-//, missing-colon, missing-workspace, deny rendering, and CLI exit codes. The one regression test missing is the closure/validator mismatch above.

TL;DR

The architectural moves (drop dimensions, scope deny to PR #8, closure-expand at render time) are the right shape. The loader is now hardened. The remaining issue worth blocking on is the validator/render closure asymmetry — actions can be granted on parents that don't publish them, which defeats the "compile-time check against a real vocabulary" pitch. Once that's fixed (or explicitly justified), this is good to merge ahead of PR #7.

Re-reviewed at `c4a207a`. Most of the original review is addressed; flagging what remains, plus one new finding. ## Resolved - **Extends does closure expansion.** `render.py:_expand_resources` walks descendants and dedupes; `test_statement_expands_to_extends_closure` and `test_listing_two_siblings_dedupes_resources` cover it. The README/PR-body framing of expansion as the load-bearing transform reads honestly. - **Loader hardening.** All four bugs fixed: inner `StarlarkError` is wrapped per-eval site (prelude, parse, body), `_resolve_label` does `is_relative_to(workspace_resolved)`, an `in_flight` set catches cycles, missing-marker now raises. New tests for each (`test_load_inner_error_is_wrapped`, `test_load_path_traversal_is_rejected`, `test_load_cycle_is_detected`, `test_no_workspace_marker_raises`). - **Suffix list trimmed to `.star` only** with `test_load_only_resolves_star_extension` pinning it. - **Implicit-concat check dropped** with a one-liner in the loader docstring explaining why — Starlark's parser is the source of truth. `test_starlark_rejects_implicit_string_concat` still verifies the behavior end-to-end. - **`attr.string(pattern=...)` compiles the regex eagerly.** - **CLI tests added** (`test_cli.py`, 7 cases) covering exit codes, `--skip-validate`, JSON shape, and load-error surfacing. - **`_install_builders` closure capture cleanup.** `_wrap_builder` is now top-level. - **Stack/PR-numbering** uses real Forgejo numbers (#7/#8/#9) with explicit links. ## Still flagged ### Closure expansion grants actions on resources that don't publish them This is new — surfaces as a consequence of the closure-expansion design. Construct: ```python audience(name = "x") parent = resource(name = "P", actions = ["read"]) # P publishes only read child = resource(name = "C", actions = ["read", "write"], extends = parent) role( name = "r", statements = [statement(resources = [parent], actions = ["write"])], ) ``` `validate.py:_validate_statement` builds the closure `[P, C]`, unions `r.all_actions` over the closure (`{"read", "write"}` because `C` has `write`), and accepts `write` on the statement. The renderer emits `{"actions": ["write"], "resources": ["P", "C"]}`. At runtime, `(action="write", resource="P")` matches via fnmatch — but `P` doesn't publish `write`. So the catalog admits role grants that name actions on resources that don't expose them, by laundering through extends-down. Two coherent fixes: 1. **Tighten validation:** each action in a statement must be present on `r.all_actions` (extends-up only) for *every* resource in `st.resources` (the *original* list, not the closure). Descendants are render-time noise and shouldn't supply actions for the listed parents. 2. **Tighten rendering:** render one statement per `(resource, action-subset-supported-by-that-resource)` tuple, fanning out closure expansion only over actions each leaf actually publishes. (1) is the smaller change and matches what authors mean when they write `statement(resources=[P], actions=["write"])` — they're claiming `P` has `write`. Today the validator silently disagrees with the runtime. A regression test along the lines of: ```python def test_action_must_be_on_listed_resource_not_just_descendants(): ctx, b = _mk() parent = b["resource"](name="P", actions=["read"]) b["resource"](name="C", actions=["read", "write"], extends=parent) b["role"](name="r", statements=[b["statement"](resources=[parent], actions=["write"])]) assert any("write" in e for e in validate(ctx.audience)) ``` This currently passes empty-errors; it should fail. ### `attr.*` is dead surface The model dropped dimensions, and `EnumType` / `StringType` no longer attach to anything. `test_attr_namespace_available_even_if_unused` literally writes `_ = attr.enum("a", "b")` and discards. The PR-body and README justify this as "Bazel-style cross-Starlark consistency" so the namespace stays stable across the future re-introduction. The cost is that authors can write `attr.enum("max", "high")` calls today and have them silently no-op. That's a footgun: someone reads the README's `attr.*` paragraph, infers "this is how I type a dimension", writes `dimensions={"priority": attr.enum(...)}` (the previous shape), and gets `TypeError: resource() got an unexpected keyword argument 'dimensions'`. Better to either: - Drop `attr` from the prelude entirely and re-add when there's a consumer, or - Keep it but make `attr.enum(...)` and `attr.string(...)` raise a friendly `RulebookError` ("attr.* type carriers have no consumer in the current rulebook model; they will be re-introduced with dimensions in a later PR. Remove the call for now."). The current "callable but inert" choice is the most surprising of the three. ### Deny + descendants is asymmetric README: > To allow the parent but exclude one descendant, pair the broad allow with a narrow deny that lists the descendant explicitly. Closure expansion runs on deny statements too (correct, given how `_statement_to_json` is uniform). So a deny listing a descendant `secrets.private` *also* denies any descendant of `secrets.private` — `secrets.private.metadata` and beyond. That's not "one descendant" — it's a subtree. Either the README phrasing should acknowledge the subtree, or `effect="deny"` statements should opt out of expansion (renderer emits the literal listed resources only). The latter is more useful for surgical exclusion; the former is what the renderer does today. Pick one and document. A test pinning the deny+expansion behavior would help — `test_deny_statement_renders_with_effect_field` doesn't exercise the case where the denied resource has its own descendants. ### `bazel test` claim is unverified The PR body acknowledges the cairn mirror 404 on `runfiles_stub_x86_64_linux` and falls back to `uv run pytest`. That's an honest disclosure, but it leaves the green-bazel claim unverified, which is the project's contract for "ready to merge" per `CLAUDE.md`. Worth either (a) opening a separate ticket on the mirror, (b) confirming locally with a non-cluster bazel cache, or (c) waiting until the mirror's healthy. The diff is well-tested via `pytest` in either case, but the "all testing through bazel" rule is being relaxed here. ### Forward-compat phrasing slightly overstates the case > older arstotzka builds reading the catalog default effect-less statements to allow, so the wire shape is forward-compatible. Pre-PR-#8 arstotzka doesn't read the catalog at all, so "older arstotzka builds reading the catalog" describes a state that doesn't exist in the field. The forward-compat story really only matters once the catalog loader exists *without* deny support — a window the staged plan keeps closed. Worth tightening to "any future arstotzka build that reads the catalog without deny support" — small but the current wording pretends there's a back-compat surface to maintain when there isn't yet. ## Consciously deferred (acknowledged) - Audience / JWT `aud` naming collision. - Implicit registration via `load()` (loaded modules can call `role()` directly). - arstotzka SPEC.md "There is no deny." — moves to PR #8. ## Tests look great Coverage now includes load-error wrapping, traversal rejection, cycle detection, suffix list, missing-target, missing-`//`, missing-colon, missing-workspace, deny rendering, and CLI exit codes. The one regression test missing is the closure/validator mismatch above. ## TL;DR The architectural moves (drop dimensions, scope deny to PR #8, closure-expand at render time) are the right shape. The loader is now hardened. The remaining issue worth blocking on is the validator/render closure asymmetry — actions can be granted on parents that don't publish them, which defeats the "compile-time check against a real vocabulary" pitch. Once that's fixed (or explicitly justified), this is good to merge ahead of PR #7.

arrdem referenced this pull request from a commit

2026-04-26 20:01:38 +00:00

chore(rulebook): Stub PR #7 scope — bazel rules + first Rulebook

arrdem referenced this pull request from a commit

2026-04-26 20:01:38 +00:00

chore(arstotzka): Stub PR #8 scope — catalog loader

arrdem referenced this pull request from a commit

2026-04-26 20:01:38 +00:00

chore(authctl): Stub PR #9 scope — authctl catalog integration

arrdem added 1 commit

2026-04-26 20:01:38 +00:00

fix(rulebook): Tighten action validation; deny statements skip closure 866aef8c5e

Two correctness fixes from the second-round review:

1. Validator no longer launders descendant actions up. Previously the
   validator unioned all_actions across the closure (parent +
   descendants), so statement(resources=[parent], actions=["release"])
   passed because a child published "release" — but the renderer would
   then emit {"actions": ["release"], "resources": ["parent", ...]} and
   arstotzka's fnmatch evaluator would allow (release, parent) even
   though parent never publishes release. The catalog laundered actions
   the listed resources don't expose. Tighten validation: each action
   must be in r.all_actions of every listed resource (extends-up only).
   The new test_action_must_be_on_listed_resource_not_just_descendants
   pins this.

2. Closure expansion is asymmetric on effect. Allow expands; deny does
   not. The surgical-exclusion use case ("allow secrets.* except
   secrets.private") needs deny to mean "just this one literal", not
   "this subtree". A deny that silently expanded would also clobber
   secrets.private.metadata and break the contract. Subtree deny is
   still expressible via fnmatch glob (secrets.private.*) — let the
   evaluator do the work. test_deny_statement_does_not_expand_descendants
   pins the new behavior; README documents the asymmetry explicitly.

Validator simplification falls out: with action checks against listed
resources only, the descendants index is no longer needed. Drops the
duplicate-name re-check that was already enforced by builders.py.

arrdem commented

2026-04-26 20:02:09 +00:00

Author

Owner

Pushed 866aef8 addressing the second-round review.

Fixed

Validator no longer launders descendant actions. Each action in a statement must now appear in r.all_actions of every listed resource (extends-up only). The closure-union over descendants is gone. New regression test test_action_must_be_on_listed_resource_not_just_descendants is the one you sketched in your review and it now fails-then-passes against the fix. Added test_listing_descendant_with_inherited_action_is_valid and test_action_on_one_of_many_listed_resources_is_reported to pin the rest of the surface.
Deny + closure asymmetry resolved on the surgical-exclusion side. _statement_to_json emits the literal listed resources for effect="deny" and the descendant closure only for effect="allow". Subtree-deny stays expressible via fnmatch glob (secrets.private.*). New test_deny_statement_does_not_expand_descendants builds a four-deep extends chain and asserts the deny stops at its literal while the allow expands to the full subtree. README documents the asymmetry explicitly.
Forward-compat phrasing tightened. The earlier wording implied existing arstotzka builds read the catalog; they don't. Updated the PR body and render.py docstring to talk about "any future arstotzka build that reads the catalog without deny support" — matching the staged plan.

Validator drops out

With the new check operating on the listed resources only, validate.py no longer needs the descendants index, and the dead defense in depth duplicate-name re-check is gone — builders.py already enforces uniqueness at construction time.

On `attr.*`

Keeping it callable-but-inert. Reid's direction explicitly was to keep the attr namespace stable for cross-Starlark consistency with Tiller and Bazel-style rule definitions; raising on call would break that goal the moment a Rulebook author copies a snippet from Tiller. The README and attr.py docstring both flag it as a reserved-but-inert surface today; if the footgun bites in practice we can revisit.

Bazel test

Still blocked by the cairn mirror's 404 on runfiles_stub_x86_64_linux — looks like a downloader rewrite redirects the GitHub upstream to cairn, but the artifact at the expected blob key isn't there. Tried --repository_cache= --disk_cache= to force a fresh fetch; same failure. This is environmental and outside this branch's scope. 67/67 tests pass under uv run pytest -p no:pudb. Worth opening a separate ticket on cairn.

Deferred (still consciously)

Audience / JWT aud naming.
Implicit registration via load().
attr.* left callable (above).

Pushed `866aef8` addressing the second-round review. ## Fixed - **Validator no longer launders descendant actions.** Each action in a statement must now appear in `r.all_actions` of every *listed* resource (extends-up only). The closure-union over descendants is gone. New regression test `test_action_must_be_on_listed_resource_not_just_descendants` is the one you sketched in your review and it now fails-then-passes against the fix. Added `test_listing_descendant_with_inherited_action_is_valid` and `test_action_on_one_of_many_listed_resources_is_reported` to pin the rest of the surface. - **Deny + closure asymmetry resolved on the surgical-exclusion side.** `_statement_to_json` emits the literal listed resources for `effect="deny"` and the descendant closure only for `effect="allow"`. Subtree-deny stays expressible via fnmatch glob (`secrets.private.*`). New `test_deny_statement_does_not_expand_descendants` builds a four-deep extends chain and asserts the deny stops at its literal while the allow expands to the full subtree. README documents the asymmetry explicitly. - **Forward-compat phrasing tightened.** The earlier wording implied existing arstotzka builds read the catalog; they don't. Updated the PR body and `render.py` docstring to talk about "any future arstotzka build that reads the catalog without deny support" — matching the staged plan. ## Validator drops out With the new check operating on the listed resources only, `validate.py` no longer needs the descendants index, and the dead `defense in depth` duplicate-name re-check is gone — `builders.py` already enforces uniqueness at construction time. ## On `attr.*` Keeping it callable-but-inert. Reid's direction explicitly was to keep the `attr` namespace stable for cross-Starlark consistency with Tiller and Bazel-style rule definitions; raising on call would break that goal the moment a Rulebook author copies a snippet from Tiller. The README and `attr.py` docstring both flag it as a reserved-but-inert surface today; if the footgun bites in practice we can revisit. ## Bazel test Still blocked by the cairn mirror's 404 on `runfiles_stub_x86_64_linux` — looks like a downloader rewrite redirects the GitHub upstream to cairn, but the artifact at the expected blob key isn't there. Tried `--repository_cache= --disk_cache=` to force a fresh fetch; same failure. This is environmental and outside this branch's scope. 67/67 tests pass under `uv run pytest -p no:pudb`. Worth opening a separate ticket on cairn. ## Deferred (still consciously) - Audience / JWT `aud` naming. - Implicit registration via `load()`. - `attr.*` left callable (above).

arrdem commented

2026-04-26 20:17:55 +00:00

Author

Owner

Re-reviewed at 866aef8. The two blockers from round two are landed cleanly; only one small inconsistency left.

Verified fixes

Validator no longer launders descendant actions. _validate_statement iterates (action, listed_resource) pairs against r.all_actions (extends-up only). The closure/descendants index is gone from validate.py entirely. The regression test I sketched (test_action_must_be_on_listed_resource_not_just_descendants) now reproduces the bug and asserts the fix; test_action_on_one_of_many_listed_resources_is_reported pins the "must be on every listed resource" half. Inheritance up still works as expected (test_listing_descendant_with_inherited_action_is_valid).
Allow/deny render asymmetry. _statement_to_json branches on effect: allow expands, deny emits verbatim. The README documents both halves and points authors at glob (secrets.private.*) for subtree-deny. test_deny_statement_does_not_expand_descendants builds the four-deep chain (secrets → secrets.private → secrets.private.metadata) and verifies the deny stops at its literal while the allow expands the full subtree.
Forward-compat phrasing tightened in both the PR body and render.py docstring — now talks about "any future arstotzka build that reads the catalog" rather than implying back-compat with current builds.

Held with rationale (acceptable)

attr.* stays callable-inert for cross-Starlark consistency with Tiller / Bazel-style rule definitions. The README and attr.py docstring both flag it as reserved-but-unused. Reasonable trade — if it bites in practice, revisit.
Audience / JWT aud collision — flagged, deferred. README now anchors audience.name to the JWT aud claim explicitly, which mostly de-fangs it.
Implicit registration via load() — deferred.
Bazel test green — blocked on the cairn mirror's 404 on runfiles_stub_x86_64_linux. Environmental and out of this branch's scope; 67/67 under pytest. Worth a separate ticket on cairn.

Small cleanup left

validate.py's docstring still lists check #2 ("Resource and role names are unique. … this is a final sanity check on hand-built models.") but the function no longer performs that check — the only iteration in validate() is over _validate_statement for the action presence rule. Either drop item #2 from the docstring (matching the code, since builders.py is the only enforcement now) or restore the check. The former is consistent with the "validator simplification" you described in the response.

Verdict

Approve once the docstring trim above is in. The architectural shape — closure-expand on allow, literal on deny, validate against listed resources only — now hangs together coherently, and arstotzka_catalog (PR #8) has a clean wire format to consume.

Re-reviewed at `866aef8`. The two blockers from round two are landed cleanly; only one small inconsistency left. ## Verified fixes - **Validator no longer launders descendant actions.** `_validate_statement` iterates `(action, listed_resource)` pairs against `r.all_actions` (extends-up only). The closure/descendants index is gone from `validate.py` entirely. The regression test I sketched (`test_action_must_be_on_listed_resource_not_just_descendants`) now reproduces the bug and asserts the fix; `test_action_on_one_of_many_listed_resources_is_reported` pins the "must be on every listed resource" half. Inheritance up still works as expected (`test_listing_descendant_with_inherited_action_is_valid`). - **Allow/deny render asymmetry.** `_statement_to_json` branches on `effect`: `allow` expands, `deny` emits verbatim. The README documents both halves and points authors at glob (`secrets.private.*`) for subtree-deny. `test_deny_statement_does_not_expand_descendants` builds the four-deep chain (`secrets` → `secrets.private` → `secrets.private.metadata`) and verifies the deny stops at its literal while the allow expands the full subtree. - **Forward-compat phrasing tightened** in both the PR body and `render.py` docstring — now talks about "any future arstotzka build that reads the catalog" rather than implying back-compat with current builds. ## Held with rationale (acceptable) - **`attr.*` stays callable-inert** for cross-Starlark consistency with Tiller / Bazel-style rule definitions. The README and `attr.py` docstring both flag it as reserved-but-unused. Reasonable trade — if it bites in practice, revisit. - **Audience / JWT `aud` collision** — flagged, deferred. README now anchors `audience.name` to the JWT `aud` claim explicitly, which mostly de-fangs it. - **Implicit registration via `load()`** — deferred. - **Bazel test green** — blocked on the cairn mirror's 404 on `runfiles_stub_x86_64_linux`. Environmental and out of this branch's scope; 67/67 under `pytest`. Worth a separate ticket on cairn. ## Small cleanup left `validate.py`'s docstring still lists check #2 ("Resource and role names are unique. … this is a final sanity check on hand-built models.") but the function no longer performs that check — the only iteration in `validate()` is over `_validate_statement` for the action presence rule. Either drop item #2 from the docstring (matching the code, since `builders.py` is the only enforcement now) or restore the check. The former is consistent with the "validator simplification" you described in the response. ## Verdict Approve once the docstring trim above is in. The architectural shape — closure-expand on allow, literal on deny, validate against listed resources only — now hangs together coherently, and `arstotzka_catalog` (PR #8) has a clean wire format to consume.

arrdem referenced this pull request from a commit

2026-04-26 20:19:27 +00:00

chore(rulebook): Stub PR #7 scope — bazel rules + first Rulebook

arrdem referenced this pull request from a commit

2026-04-26 20:19:27 +00:00

chore(arstotzka): Stub PR #8 scope — catalog loader

arrdem referenced this pull request from a commit

2026-04-26 20:19:28 +00:00

chore(authctl): Stub PR #9 scope — authctl catalog integration

arrdem added 1 commit

2026-04-26 20:19:28 +00:00

docs(rulebook): Trim validate.py docstring to match the simplified body 3fbc2386fb

Round-three review noted the docstring still listed a uniqueness check
that the function no longer performs — uniqueness is enforced by
builders.py at construction time and was dropped from validate() in
the previous commit. Bring the docstring in line with the code: one
rule, action-presence on every listed resource.

arrdem commented

2026-04-26 20:19:34 +00:00

Author

Owner

Pushed 3fbc238 — validate.py docstring now describes the single rule the function actually enforces (action presence on every listed resource); the stale uniqueness item is gone, since builders.py is the only enforcement now. Stack rebased.

Ready for approval whenever you are.

Pushed `3fbc238` — `validate.py` docstring now describes the single rule the function actually enforces (action presence on every listed resource); the stale uniqueness item is gone, since `builders.py` is the only enforcement now. Stack rebased. Ready for approval whenever you are.

This pull request can be merged automatically.

You are not authorized to merge this pull request.

View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.

git fetch -u origin arrdem/rulebook-dsl:arrdem/rulebook-dsl

git switch arrdem/rulebook-dsl

Merge

Merge the changes and update on Forgejo.

Warning: The "Autodetect manual merge" setting is not enabled for this repository, you will have to mark this pull request as manually merged afterwards.

git switch trunk

git merge --no-ff arrdem/rulebook-dsl

git switch arrdem/rulebook-dsl

git rebase trunk

git switch trunk

git merge --ff-only arrdem/rulebook-dsl

git switch arrdem/rulebook-dsl

git rebase trunk

git switch trunk

git merge --no-ff arrdem/rulebook-dsl

git switch trunk

git merge --squash arrdem/rulebook-dsl

git switch trunk

git merge --ff-only arrdem/rulebook-dsl

git switch trunk

git merge arrdem/rulebook-dsl

git push origin trunk

No reviewers

No labels

No milestone

No project

No assignees

1 participant

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference

arrdem/source!4

No description provided.

Rows
Columns

feat(rulebook): Paperfile-style DSL for Arstotzka policy #4

Scope

Closure expansion is asymmetric on effect

Validation

Stack

Review response

Changes are visible to end-users: no

Test plan

Hostile conceptual review

Code review

Bugs and correctness

Style / nits

Test gaps

Build / packaging

PR description

TL;DR

Resolved

Still flagged

Closure expansion grants actions on resources that don't publish them

attr.* is dead surface

Deny + descendants is asymmetric

bazel test claim is unverified

Forward-compat phrasing slightly overstates the case

Consciously deferred (acknowledged)

Tests look great

TL;DR

Fixed

Validator drops out

On attr.*

Bazel test

Deferred (still consciously)

Verified fixes

Held with rationale (acceptable)

Small cleanup left

Verdict

Checkout

Merge

Closure expansion is asymmetric on `effect`

`attr.*` is dead surface

`bazel test` claim is unverified

On `attr.*`