A single clobber on an abortOn:-bound fact is fine: the binding catches the race and the audit ledger records it. A loop is two or more resolvers whose when: predicates are all satisfied by the same state, so they keep rewriting the fact on every reconcile tick, indefinitely. Today the symptom surfaces as a value flapping between two states in a customer screenshot, even though the audit ledger holds 800 clobbers/sec of forensic evidence nobody is querying. clobberLoopPlugin closes the loop: one structured warning per detected loop, with a predicate-overlap proof that names the specific clauses fighting.

Quick Start

import { createSystem } from '@directive-run/core';
import { clobberLoopPlugin } from '@directive-run/core/plugins';

const detector = clobberLoopPlugin({
  threshold: 5,
  windowMs: 1000,
  onLoop: (event) => {
    pagerduty.trigger({
      severity: event.severity,
      summary: `Clobber loop on ${event.fact}: ${event.participants.join(' vs ')}`,
      details: event,
    });
  },
});

const system = createSystem({
  module: myModule,
  plugins: [detector.plugin],
});

// During incident response:
detector.disable();

The returned handle is { plugin, disable, enable, isEnabled }. Pass .plugin to createSystem; use the rest at runtime when an SRE needs to flip the detector off without redeploying.

When to use `clobberLoopPlugin` vs `clobberAlertPlugin`

| | `clobberAlertPlugin` | `clobberLoopPlugin` |
| --- | --- | --- |
| Fires on | every clobber on an irreversible-tagged fact | sustained loops only (≥ N distinct rejections from ≥ 2 resolvers in window) |
| Output | per-event alert | one event per detected loop, with predicate-overlap proof |
| Use when | a single clobber is operationally urgent (money, PII) | the noise of "many clobbers" is the problem itself |
| Mounts with | `clobber-alert` | `clobber-loop` |
| Pair? | yes, both can run together | yes, both can run together |

Most production systems want both: clobberAlertPlugin pages on irreversible-tagged clobbers immediately, while clobberLoopPlugin separately surfaces the rule-design bug behind a sustained churn.

How loop detection works

The detector subscribes to resolver.write.rejected events and aggregates them per fact:

┌────────────────────────────────────────────────────────┐
│  resolver.write.rejected stream                        │
│                                                        │
│  ringBuffer[fact] = [{ timestamp, requirementId,       │
│                       resolverId, seq }, ...]          │
│                                                        │
│  trim to windowMs                                      │
│                                                        │
│  if distinct(requirementId) ≥ threshold                │
│     AND distinct(resolverId) ≥ 2                       │
│     AND (fact, participantSet) not in cooldown         │
│  then:                                                 │
│    build PredicateOverlapProof from participants'      │
│      whenSpecs                                         │
│    PII-redact operands via system.meta.byTag("pii")    │
│    emit resolver.clobber.loop.detected                 │
│    enter cooldown for cooldownMs                       │
└────────────────────────────────────────────────────────┘

The distinct-by-requirement-id counting matters: a single resolver's retry storm shares one requirement ID, so it counts as one rejection. The detector only fires on true multi-participant contention — never on a single resolver retrying itself.
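The two distinct-count conditions can be sketched as a small pure function. This is an illustration under stated assumptions, not the plugin's implementation: `checkForLoop` and `LoopCheck` are hypothetical names, the entry shape is copied from the diagram above, and the cooldown check is left out for brevity.

```typescript
// Sketch only: mirrors the diagram's trim + distinct-count steps.
interface Rejection {
  timestamp: number; // ms epoch
  requirementId: string;
  resolverId: string;
  seq: number;
}

interface LoopCheck {
  isLoop: boolean;
  count: number; // distinct requirement IDs in the window
  participants: string[]; // sorted unique resolver IDs
}

// Hypothetical helper: trims one fact's buffer to the window, then applies
// both conditions (distinct requirements >= threshold, distinct resolvers >= 2).
function checkForLoop(
  buffer: Rejection[],
  now: number,
  windowMs = 1000,
  threshold = 5,
): LoopCheck {
  const inWindow = buffer.filter((r) => now - r.timestamp <= windowMs);
  const requirements = new Set(inWindow.map((r) => r.requirementId));
  const participants = Array.from(
    new Set(inWindow.map((r) => r.resolverId)),
  ).sort();
  return {
    isLoop: requirements.size >= threshold && participants.length >= 2,
    count: requirements.size,
    participants,
  };
}

// A retry storm: one resolver, one requirement ID, twenty rejections.
const retryStorm: Rejection[] = Array.from({ length: 20 }, (_, i) => ({
  timestamp: 1000 + i * 10,
  requirementId: 'req-1',
  resolverId: 'applyCoupon',
  seq: i,
}));

// True contention: two resolvers alternating, each with its own requirement ID.
const contention: Rejection[] = Array.from({ length: 6 }, (_, i) => ({
  timestamp: 1000 + i * 100,
  requirementId: `req-${i}`,
  resolverId: i % 2 === 0 ? 'applyCoupon' : 'applyLoyaltyDiscount',
  seq: i,
}));

const stormResult = checkForLoop(retryStorm, 1200);
const contentionResult = checkForLoop(contention, 1500);
```

In this sketch the retry storm collapses to a single distinct requirement and never trips the threshold, while the alternating pair does.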

When the loop quiets (default 30s without a new rejection on the (fact, participantSet)), a resolver.clobber.loop.resolved event closes the alarm so dashboards show active loops, not historical loops.
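The quiet-window close can be pictured as a periodic sweep over the currently active loops. The sketch below is a guess at the shape of that logic, not the plugin's internals: `ActiveLoop`, `closeQuietLoops`, and the way `durationMs` is computed (last rejection minus detection time) are all assumptions made for illustration.

```typescript
// Sketch only: hypothetical bookkeeping for loops that have already fired.
interface ActiveLoop {
  fact: string;
  participants: string[];
  detectedAt: number; // ms epoch when the loop was emitted
  lastRejectionAt: number; // ms epoch of the most recent rejection
}

interface ResolvedEvent {
  type: 'resolver.clobber.loop.resolved';
  fact: string;
  participants: string[];
  durationMs: number;
  resolution: 'no-recurrence-in-window';
}

// Closes every loop that has been quiet for at least resolvedAfterMs.
function closeQuietLoops(
  active: Map<string, ActiveLoop>,
  now: number,
  resolvedAfterMs = 30000,
): ResolvedEvent[] {
  const resolved: ResolvedEvent[] = [];
  active.forEach((loop, key) => {
    if (now - loop.lastRejectionAt < resolvedAfterMs) return; // still noisy
    active.delete(key);
    resolved.push({
      type: 'resolver.clobber.loop.resolved',
      fact: loop.fact,
      participants: loop.participants,
      // Assumption: duration measured from detection to the last rejection.
      durationMs: loop.lastRejectionAt - loop.detectedAt,
      resolution: 'no-recurrence-in-window',
    });
  });
  return resolved;
}

const activeLoops = new Map<string, ActiveLoop>();
activeLoops.set('cart.discount', {
  fact: 'cart.discount',
  participants: ['applyCoupon', 'applyLoyaltyDiscount'],
  detectedAt: 0,
  lastRejectionAt: 4000,
});
activeLoops.set('cart.total', {
  fact: 'cart.total',
  participants: ['applyTax', 'applyShipping'],
  detectedAt: 0,
  lastRejectionAt: 30000,
});

// At t = 35s, cart.discount has been quiet for 31s; cart.total for only 5s.
const closedLoops = closeQuietLoops(activeLoops, 35000);
```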

Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `windowMs` | `number` | `1000` | Window over which rejections aggregate. |
| `threshold` | `number` | `5` | Minimum distinct-requirement rejections in window to trigger. |
| `cooldownMs` | `number` | `5000` | Suppress same-`(fact, participantSet)` re-fire for this duration after emission. |
| `resolvedAfterMs` | `number` | `30000` | Quiet window before firing `resolver.clobber.loop.resolved`. |
| `maxTrackedFacts` | `number` | `256` | Global LRU cap on facts the detector tracks. |
| `maxParticipantsPerFact` | `number` | `16` | Per-fact cap on participant resolvers. |
| `maxEmissionsPerSec` | `number` | `10` | Global emission cap. Above-cap detections surface in the next event's `suppressedSinceLastEmit`. |
| `capturePII` | `boolean` | `false` | If `false` (default), `whenSpec` operands at PII-tagged fact paths are redacted to `"[redacted]"` BEFORE the event leaves the plugin. Set to `true` only when the deployment has a data-processing addendum. |
| `onLoop` | `(event) => void` | `console.warn` in dev, `console.error` to stderr in prod | Called for each detected loop. Not a no-op in production: it defaults to stderr so the signal lands in log pipelines without explicit routing. |
| `onResolved` | `(event) => void` | `undefined` | Called when a previously-detected loop closes. |

The defaults assume dev / staging operation. For production, wire onLoop: pagerduty.trigger or onLoop: slack.post explicitly. The stderr default is the floor under "no monitoring at all," not a recommended production sink.

The detected event

{
  type: 'resolver.clobber.loop.detected',
  systemId: string,                         // multi-tenant routing key
  fact: string,                             // the contended fact key
  participants: readonly string[],          // sorted unique resolver IDs
  participantModules: readonly string[],    // each resolver's owning module
  count: number,                            // distinct-requirement rejections in window
  windowMs: number,
  firstAt: number,                          // ms epoch of first event in window
  lastAt: number,                           // ms epoch of trigger event
  predicateOverlap?: PredicateOverlapProof, // see below
  severity: 'warn' | 'error',               // escalated to "error" when fact tagged "pii" or "money"
  factTags: readonly string[],              // surfaces tags without leaking values
  suppressedSinceLastEmit: number,          // global rate-limit overflow counter
  rejectionSeqs: readonly number[],         // audit cross-references
}

Emitted through system.observe() so audit-ledger, devtools, and OTel exporters all see it without depending on clobberLoopPlugin directly.

The resolved companion:

{
  type: 'resolver.clobber.loop.resolved',
  systemId: string,
  fact: string,
  participants: readonly string[],
  durationMs: number,
  resolution: 'no-recurrence-in-window'
            | 'participant-disabled'
            | 'predicate-narrowed',
}
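On the consuming side, the detected / resolved pair lends itself to a small "active loops" view. The following is a minimal sketch assuming only the fields shown in the two event shapes above; `loopKey` and `applyLoopEvent` are hypothetical helper names, and keying on fact plus participants is an assumption modeled on the `(fact, participantSet)` cooldown key.

```typescript
// Sketch only: trimmed-down local copies of the two event shapes.
type LoopEvent =
  | {
      type: 'resolver.clobber.loop.detected';
      fact: string;
      participants: readonly string[];
      severity: 'warn' | 'error';
      count: number;
    }
  | {
      type: 'resolver.clobber.loop.resolved';
      fact: string;
      participants: readonly string[];
      durationMs: number;
    };

// Hypothetical key: one entry per (fact, participant set).
function loopKey(e: { fact: string; participants: readonly string[] }): string {
  return `${e.fact}::${e.participants.join(',')}`;
}

// Adds a loop on "detected", removes it on "resolved".
function applyLoopEvent(
  active: Map<string, 'warn' | 'error'>,
  event: LoopEvent,
): void {
  if (event.type === 'resolver.clobber.loop.detected') {
    active.set(loopKey(event), event.severity);
  } else {
    active.delete(loopKey(event));
  }
}

const dashboard = new Map<string, 'warn' | 'error'>();
const participants = ['applyCoupon', 'applyLoyaltyDiscount'];

applyLoopEvent(dashboard, {
  type: 'resolver.clobber.loop.detected',
  fact: 'cart.discount',
  participants,
  severity: 'error',
  count: 5,
});
const sizeWhileActive = dashboard.size;

applyLoopEvent(dashboard, {
  type: 'resolver.clobber.loop.resolved',
  fact: 'cart.discount',
  participants,
  durationMs: 4000,
});
const sizeAfterResolved = dashboard.size;
```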

The `PredicateOverlapProof`

This is the killer feature — when both fighting resolvers' constraints use data-form when: predicates, the plugin tells you the exact clauses that co-fire so the suggested fix isn't a guess.

type PredicateOverlapProof =
  | {
      verdict: 'matched';
      coFireClauses: LeafClause[];
      conflictingClauses: never[];
    }
  | {
      verdict: 'overlap';
      coFireClauses: LeafClause[];
      conflictingClauses: LeafClause[];
    }
  | {
      verdict: 'indeterminate';
      reason: 'non-comparable-operator';
      coFireClauses: LeafClause[];
    }
  | {
      verdict: 'function-form-opaque';
      reason: 'one-or-both-when-is-a-function';
      whenSourceHashes?: string[];
    };

Verdict	Meaning	Warning text disclaims?
`matched`	Both predicates have identical structural clauses. The strongest verdict — the rules are syntactic duplicates.	No
`overlap`	Clauses share at least one path and at least one pairwise comparison says they co-fire, with no direct contradictions.	No
`indeterminate`	A non-comparable operator (`$regex`, `$elemMatch`, `$matches`) appeared. Cannot prove overlap structurally.	Yes ("cannot prove overlap — predicate uses non-comparable operator")
`function-form-opaque`	At least one constraint uses a function-form `when:`. Structural comparison is impossible. If audit-ledger is mounted, includes hashes of the function sources for cross-version diffing.	Yes ("cannot prove overlap — at least one constraint uses function-form `when:`")

The proof builder uses the same flattenPredicate + clause-comparison machinery directive doctor uses to find contradictions before runtime — so a loop the detector flags at runtime is the same shape doctor could have flagged at design time.

Default warning text from console.warn:

[directive] CLOBBER LOOP on `cart.discount` (5 clobbers in 482ms)
  Participants: applyCoupon, applyLoyaltyDiscount
  Predicate overlap: matched
  Both `when:` predicates fire when:
    - user.loyaltyTier >= 2
    - coupon.code exists
  Suggested fix: add `priority:` to one, narrow `when:` to disjoint conditions, or merge into a single resolver.
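A consumer that wants to branch on the verdict can switch over the union exhaustively. The sketch below copies the `PredicateOverlapProof` type from above; the `LeafClause` shape (`path`, `op`, optional `value`) is an assumption, since its definition is not shown here, and `summarizeProof` / `formatClause` are hypothetical helpers.

```typescript
// Assumed shape: the real LeafClause type is not shown in this document.
interface LeafClause {
  path: string;
  op: string;
  value?: unknown;
}

type PredicateOverlapProof =
  | { verdict: 'matched'; coFireClauses: LeafClause[]; conflictingClauses: never[] }
  | { verdict: 'overlap'; coFireClauses: LeafClause[]; conflictingClauses: LeafClause[] }
  | { verdict: 'indeterminate'; reason: 'non-comparable-operator'; coFireClauses: LeafClause[] }
  | {
      verdict: 'function-form-opaque';
      reason: 'one-or-both-when-is-a-function';
      whenSourceHashes?: string[];
    };

// Renders one clause as "path op value" (value omitted when absent).
function formatClause(c: LeafClause): string {
  return c.value === undefined
    ? `${c.path} ${c.op}`
    : `${c.path} ${c.op} ${JSON.stringify(c.value)}`;
}

// "proven" is true only for verdicts whose warning text carries no disclaimer.
function summarizeProof(
  proof: PredicateOverlapProof,
): { proven: boolean; clauses: string[] } {
  switch (proof.verdict) {
    case 'matched':
    case 'overlap':
      return { proven: true, clauses: proof.coFireClauses.map(formatClause) };
    case 'indeterminate':
      return { proven: false, clauses: proof.coFireClauses.map(formatClause) };
    case 'function-form-opaque':
      return { proven: false, clauses: [] };
    default: {
      // Compile-time exhaustiveness check: fails if a verdict is added.
      const unreachable: never = proof;
      return unreachable;
    }
  }
}

const matchedSummary = summarizeProof({
  verdict: 'matched',
  coFireClauses: [
    { path: 'user.loyaltyTier', op: '>=', value: 2 },
    { path: 'coupon.code', op: 'exists' },
  ],
  conflictingClauses: [],
});

const opaqueSummary = summarizeProof({
  verdict: 'function-form-opaque',
  reason: 'one-or-both-when-is-a-function',
});
```

With the assumed clause shape, the two formatted clauses come out as `user.loyaltyTier >= 2` and `coupon.code exists`, matching the bullets in the sample warning.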

Reason-aware `shouldRetry` integration

v1.23.0 also widened RetryPolicy.shouldRetry with an optional third argument — ShouldRetryContext — so a retry policy can decide based on WHY the attempt failed. The motivating case pairs with this detector: "retry on clobber, fail loud on real bugs."

resolvers: {
  applyDiscount: {
    requirement: 'APPLY_DISCOUNT',
    retry: {
      attempts: 5,
      backoff: 'exponential',
      shouldRetry: (err, attempt, ctx) => {
        if (ctx?.reason === 'clobbered') return attempt < 5;
        if (ctx?.reason === 'timeout')   return attempt < 2;
        return false;  // 'error' / 'cancelled' → no retry
      },
    },
    resolve: async (req, ctx) => { /* ... */ },
  },
}

The ShouldRetryContext carries:

reason: 'clobbered' | 'timeout' | 'cancelled' | 'error'
clobber?: { fact, expected, actual } — populated when reason === 'clobbered'

Two-argument shouldRetry(err, attempt) callers continue to work unchanged — the third argument is additive. Before this change, a clobber-induced abort never reached shouldRetry at all; the controller's aborted signal short-circuited the retry path silently. Now policies can opt into bounded retries on contention while still failing loud on real errors.
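The same policy can be factored into a small reusable builder. This is a sketch under assumptions: the `ShouldRetryContext` type is re-declared locally from the two fields listed above (with `fact` assumed to be a string and `expected` / `actual` typed as `unknown`), and `makeShouldRetry` is a hypothetical helper, not a library export.

```typescript
type RetryReason = 'clobbered' | 'timeout' | 'cancelled' | 'error';

// Local re-declaration for illustration; field types are assumptions.
interface ShouldRetryContext {
  reason: RetryReason;
  clobber?: { fact: string; expected: unknown; actual: unknown };
}

// Builds a shouldRetry callback from per-reason attempt limits.
// Reasons with no limit (and calls without a context) never retry.
function makeShouldRetry(limits: Partial<Record<RetryReason, number>>) {
  return (_err: unknown, attempt: number, ctx?: ShouldRetryContext): boolean => {
    if (!ctx) return false;
    const limit = limits[ctx.reason];
    return limit !== undefined && attempt < limit;
  };
}

// Same limits as the resolver example: 5 on contention, 2 on timeout.
const retryOnContention = makeShouldRetry({ clobbered: 5, timeout: 2 });

const err = new Error('write rejected');
const retryClobberEarly = retryOnContention(err, 1, { reason: 'clobbered' });
const retryClobberLate = retryOnContention(err, 5, { reason: 'clobbered' });
const retryTimeoutLate = retryOnContention(err, 2, { reason: 'timeout' });
const retryRealError = retryOnContention(err, 1, { reason: 'error' });
const retryWithoutContext = retryOnContention(err, 1);
```

Keeping the limits in one table makes it harder for a new reason to start retrying by accident: anything not listed fails loud.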

PII safety

Redaction happens at event-construction time, not at message-format time. The plugin walks the participants' whenSpec through the same redactWhenSpec utility the audit-ledger uses, against system.meta.byTag("pii"), BEFORE the PredicateOverlapProof is attached to the emitted event. Downstream sinks (audit-ledger, devtools, third-party onLoop handlers) all receive the redacted form by default — there is no "raw vs formatted" split where a sink could accidentally log the unredacted operands.

capturePII: true is the explicit opt-out from redaction. It mirrors the audit-ledger contract: set it only when the deployment has a data-processing addendum that permits unredacted operand capture.

The slim per-rejection buffer entries ({ timestamp, requirementId, resolverId, seq }) deliberately don't carry expected / actual payloads at all — the audit ledger already keeps the forensic payload, so PII doesn't spread through the detector's own buffers.
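Redacting at construction time amounts to rewriting operands before the proof object exists. The sketch below is not the real `redactWhenSpec`: `redactClauses` is a hypothetical stand-in, the flat `LeafClause` shape is assumed, and the PII-tagged paths are passed in as a plain set rather than read from `system.meta.byTag("pii")`, whose return type is not shown here.

```typescript
// Assumed clause shape, for illustration only.
interface LeafClause {
  path: string;
  op: string;
  value?: unknown;
}

// Returns copies of the clauses with operands at PII-tagged paths replaced.
// The input array and its clauses are never mutated.
function redactClauses(
  clauses: readonly LeafClause[],
  piiPaths: ReadonlySet<string>,
  capturePII = false,
): LeafClause[] {
  return clauses.map((clause) =>
    !capturePII && piiPaths.has(clause.path) && clause.value !== undefined
      ? { ...clause, value: '[redacted]' }
      : { ...clause },
  );
}

const rawClauses: LeafClause[] = [
  { path: 'user.email', op: '$eq', value: 'a@example.com' },
  { path: 'user.loyaltyTier', op: '$gte', value: 2 },
];
const piiTagged = new Set(['user.email']);

const redactedClauses = redactClauses(rawClauses, piiTagged);
const capturedClauses = redactClauses(rawClauses, piiTagged, true);
```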

Audit-ledger integration

When createAuditLedger is mounted alongside clobberLoopPlugin, both new event variants are captured as ledger entries:

import { createSystem } from '@directive-run/core';
import { createAuditLedger, clobberLoopPlugin, memorySink } from '@directive-run/core/plugins';

const ledger = createAuditLedger({ sink: memorySink() });
const detector = clobberLoopPlugin({ threshold: 5 });

const system = createSystem({
  module: myModule,
  plugins: [detector.plugin, ledger.plugin],
});

// Later — audit query for the loop:
const loopEntries = ledger
  .recent(1000)
  .filter((e) => e.kind === 'resolver.clobber.loop.detected');

Each resolver.clobber.loop.detected audit entry includes rejectionSeqs — the sequence numbers of the contributing resolver.write.rejected entries — so an auditor reading a loop entry can walk to every individual rejection. The proof's overlapVerdict is surfaced as a tag (the predicate clauses are PII-redacted upstream).
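Walking from a loop entry to its contributing rejections is a set lookup over sequence numbers. This sketch assumes ledger entries expose `kind` (as in the query above) plus a numeric `seq`; the `LedgerEntry` interface and `rejectionsForLoop` are hypothetical and stand in for whatever the real entry type looks like.

```typescript
// Assumed entry shape: only `kind` appears in the query above; `seq` is a guess.
interface LedgerEntry {
  kind: string;
  seq: number;
  rejectionSeqs?: readonly number[];
}

// Returns the rejection entries a loop entry points at via rejectionSeqs.
function rejectionsForLoop(
  entries: readonly LedgerEntry[],
  loop: LedgerEntry,
): LedgerEntry[] {
  const wanted = new Set(loop.rejectionSeqs ?? []);
  return entries.filter(
    (e) => e.kind === 'resolver.write.rejected' && wanted.has(e.seq),
  );
}

const sampleEntries: LedgerEntry[] = [
  { kind: 'resolver.write.rejected', seq: 1 },
  { kind: 'resolver.write.rejected', seq: 2 },
  { kind: 'resolver.write.rejected', seq: 3 }, // unrelated rejection
  { kind: 'resolver.clobber.loop.detected', seq: 4, rejectionSeqs: [1, 2] },
];

const loopEntry = sampleEntries[3];
const contributing = rejectionsForLoop(sampleEntries, loopEntry);
```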

Runtime kill-switch

The plugin's return handle exposes disable(), enable(), and isEnabled() so an SRE can flip the detector off during incident response without redeploying. The buffer state survives a toggle, so enable() resumes cleanly without a warm-up delay.

const detector = clobberLoopPlugin({ threshold: 5 });

// At incident time:
detector.disable();
// ...investigate / mitigate...
detector.enable();
detector.isEnabled(); // true

disable() stops emission. Inbound resolver.write.rejected events still update the ring buffer so a re-enable doesn't lose the picture of what happened during the freeze.
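The "buffer keeps filling while emission is off" behavior can be sketched with a closure. This is a deliberately simplified model, not the plugin: `createToggleableDetector` is a hypothetical name, the buffer holds bare sequence numbers, and there is no window trimming or cooldown.

```typescript
// Sketch only: a gate that always records but only emits while enabled.
function createToggleableDetector(
  threshold: number,
  onLoop: (count: number) => void,
) {
  let enabled = true;
  const buffer: number[] = []; // rejection sequence numbers

  return {
    record(seq: number): void {
      buffer.push(seq); // the buffer updates regardless of the switch
      if (enabled && buffer.length >= threshold) onLoop(buffer.length);
    },
    disable(): void {
      enabled = false;
    },
    enable(): void {
      enabled = true;
    },
    isEnabled(): boolean {
      return enabled;
    },
  };
}

const emittedCounts: number[] = [];
const gate = createToggleableDetector(3, (count) => {
  emittedCounts.push(count);
});

gate.disable();
gate.record(1);
gate.record(2);
gate.record(3); // threshold reached, but emission is off
const emittedWhileDisabled = emittedCounts.length;

gate.enable();
gate.record(4); // fires immediately: the buffer already holds 1..3
```

Because the buffer was never cleared, the first rejection after `enable()` fires with the full count instead of starting from zero.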

Cap behavior

| Cap | When hit | Behavior |
| --- | --- | --- |
| `maxTrackedFacts` | 257th distinct fact churned | LRU eviction (not FIFO): the legitimate hot fact stays resident even if a hot-then-cold attacker churns the map. |
| `maxParticipantsPerFact` | 17th distinct resolver on one fact | Detailed participant tracking pauses for that fact until the cooldown expires. One "N-way contention" event still fires. |
| `maxEmissionsPerSec` | 11th loop detected in same second across all facts | The 11th-through-Nth detections increment `suppressedSinceLastEmit`. The NEXT event's `suppressedSinceLastEmit` field reports how many were dropped. |
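The difference between LRU and FIFO eviction for `maxTrackedFacts` is easiest to see with a tiny cap. The sketch below relies on `Map` preserving insertion order, which is standard JavaScript behavior; `createLruTracker` is a hypothetical helper and says nothing about how the plugin stores its buffers.

```typescript
// Sketch only: an LRU-capped tracker built on Map insertion order.
function createLruTracker<V>(maxTracked: number) {
  const entries = new Map<string, V>();

  return {
    // Records activity on a key; returns the evicted key, if any.
    touch(key: string, value: V): string | undefined {
      let evicted: string | undefined;
      if (entries.has(key)) {
        entries.delete(key); // re-inserting below moves it to "most recent"
      } else if (entries.size >= maxTracked) {
        evicted = entries.keys().next().value; // oldest = least recently used
        if (evicted !== undefined) entries.delete(evicted);
      }
      entries.set(key, value);
      return evicted;
    },
    has(key: string): boolean {
      return entries.has(key);
    },
    size(): number {
      return entries.size;
    },
  };
}

const tracker = createLruTracker<number>(2);
tracker.touch('cart.discount', 1); // hot fact
tracker.touch('cart.total', 1);
tracker.touch('cart.discount', 2); // hot fact churns again, refreshing it
const evictedKey = tracker.touch('cart.tax', 1); // over the cap: evict one
```

Under FIFO the hot fact `cart.discount` would have been evicted as the oldest insertion; under LRU it stays because it was touched most recently.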

The buffer memory bound is small: 32 entries × 256 facts × ~80 bytes ≈ 655 KB, under 1 MB worst case.
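The `maxEmissionsPerSec` row describes a counter that is reported on the next successful emission and then reset. A minimal sketch of that accounting follows, assuming a fixed one-second window (the plugin may use a sliding window or token bucket instead) and a hypothetical `createEmissionLimiter` helper.

```typescript
// Sketch only: fixed-window limiter with a carry-over suppression counter.
function createEmissionLimiter(maxPerSec = 10) {
  let windowStart = -Infinity;
  let emittedInWindow = 0;
  let suppressed = 0;

  // Returns whether to emit, plus the suppressed count to attach to the event.
  return function tryEmit(
    now: number,
  ): { emit: boolean; suppressedSinceLastEmit: number } {
    if (now - windowStart >= 1000) {
      windowStart = now; // start a fresh one-second window
      emittedInWindow = 0;
    }
    if (emittedInWindow >= maxPerSec) {
      suppressed += 1; // dropped: remembered for the next real emission
      return { emit: false, suppressedSinceLastEmit: suppressed };
    }
    emittedInWindow += 1;
    const report = suppressed;
    suppressed = 0; // counter resets once it has been reported
    return { emit: true, suppressedSinceLastEmit: report };
  };
}

const tryEmit = createEmissionLimiter(10);

// Twelve detections inside the same second: ten pass, two are suppressed.
const burst = Array.from({ length: 12 }, (_, i) => tryEmit(i));
const emittedInBurst = burst.filter((r) => r.emit).length;

// The first detection in the next window carries the dropped count.
const nextEmission = tryEmit(1000);
```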

Mounting alongside `clobberAlertPlugin`

Both plugins subscribe to the same resolver.write.rejected event stream and operate independently. Order doesn't matter; both will fire on the same rejection if their filters match.

const system = createSystem({
  module: myModule,
  plugins: [
    clobberAlertPlugin({
      irreversibleTags: ['money', 'pii'],
      onAlert: pagerduty.trigger,
    }),
    clobberLoopPlugin({
      threshold: 5,
      onLoop: slack.postIncident,
    }).plugin,
  ],
});

Pairing them is the recommended production posture: instant pages on irreversible-tagged clobbers via clobberAlertPlugin, slower-fuse rule-design signal via clobberLoopPlugin.
