Guardrail Events

Guardrail plugins (today createFactPIIGuardrail; input-content, rate-limit, and policy guardrails are planned) emit a typed "guardrail.blocked" ObservationEvent every time they detect a violation. Backend wiring (OTel exporters, audit-ledger plugins, the timeline renderer) subscribes once via system.observe() instead of coordinating with per-plugin callbacks.


The event

{
  type: "guardrail.blocked",
  plugin: string,        // The guardrail's name, e.g. "fact-pii-guardrail"
  key: string,           // The fact key the violation was found in
  kind: "redact" | "alert" | "detect",
  count: number,         // Number of pattern matches in this batch
  category?: string,     // Coarse classifier (e.g. "email" | "ssn" | "credit_card")
}
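
Subscribers that receive events from untyped sources (a message bus, a log replay) may want to validate the shape before reading fields. The following is a minimal sketch, not the library's own type: the names GuardrailBlockedEvent and isGuardrailBlocked are assumptions introduced here, and the fields simply mirror the payload shown above.

```typescript
// Local mirror of the payload documented above. The library's exported type
// name is not stated on this page, so GuardrailBlockedEvent is an assumption.
type GuardrailKind = "redact" | "alert" | "detect";

interface GuardrailBlockedEvent {
  type: "guardrail.blocked";
  plugin: string;
  key: string;
  kind: GuardrailKind;
  count: number;
  category?: string;
}

// Hypothetical runtime guard: narrows an unknown value to the payload shape.
function isGuardrailBlocked(event: unknown): event is GuardrailBlockedEvent {
  if (typeof event !== "object" || event === null) return false;
  const e = event as Record<string, unknown>;
  return (
    e.type === "guardrail.blocked" &&
    typeof e.plugin === "string" &&
    typeof e.key === "string" &&
    (e.kind === "redact" || e.kind === "alert" || e.kind === "detect") &&
    typeof e.count === "number" &&
    (e.category === undefined || typeof e.category === "string")
  );
}
```

Inside system.observe() the check on event.type is normally enough; a guard like this is mainly useful once events have left the process.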

kind semantics

  • "redact" — the guardrail rewrote the value via a follow-up store write. Pair with the subsequent fact.change event to see the redacted result.
  • "alert" — the guardrail observed but did not mutate (operator configured mode: "alert"). The raw value remains in the store.
  • "detect" — the guardrail observed but could not mutate. Today this applies to read-only structured types like Error — the walker can match Error.message / Error.cause but cannot mint a redacted Error with guaranteed .stack parity. A subscriber counting PII redactions should treat kind === "redact" and kind === "detect" equivalently.
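
To make the three kinds concrete, here is a small sketch of a subscriber-side tally. The helper names tallyByKind and redactionTotal are hypothetical; the second one follows the guidance above that "redact" and "detect" are counted together.

```typescript
type GuardrailKind = "redact" | "alert" | "detect";

// Only the two fields this sketch reads from a guardrail.blocked event.
interface KindSample {
  kind: GuardrailKind;
  count: number;
}

// Sums match counts per kind across a batch of events.
function tallyByKind(events: KindSample[]): Record<GuardrailKind, number> {
  const totals: Record<GuardrailKind, number> = { redact: 0, alert: 0, detect: 0 };
  for (const event of events) {
    totals[event.kind] += event.count;
  }
  return totals;
}

// Counts "redact" and "detect" together, as suggested for PII redaction
// metrics; "alert" is excluded because the operator chose not to mutate.
function redactionTotal(totals: Record<GuardrailKind, number>): number {
  return totals.redact + totals.detect;
}
```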

Basic subscription

import { createSystem } from '@directive-run/core';
import { createFactPIIGuardrail } from '@directive-run/ai/guardrails';

const system = createSystem({
  module: customer,
  plugins: [createFactPIIGuardrail({ mode: 'redact' })],
});

system.observe((event) => {
  if (event.type === 'guardrail.blocked') {
    console.log(
      `[${event.plugin}] ${event.kind} on ${event.key}:`,
      `${event.count} ${event.category ?? 'unknown'} matches`,
    );
  }
});

system.start();

Every detection flows through the same channel. When several guardrails are wired into one system, a single observer subscription receives all of their events; use event.plugin to tell them apart.
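
One way to separate events by plugin inside a single subscription is a small dispatch table. This is a sketch under assumptions: routeByPlugin, BlockedLike, and BlockedHandler are names invented for illustration and are not part of the library.

```typescript
// Minimal local shape: only the fields this sketch reads.
interface BlockedLike {
  type: string;
  plugin?: string;
  count?: number;
}

type BlockedHandler = (event: BlockedLike) => void;

// Hypothetical helper: builds one observer callback that forwards each
// guardrail.blocked event to the handler registered for event.plugin,
// or to the optional fallback when no handler matches.
function routeByPlugin(
  handlers: Map<string, BlockedHandler>,
  fallback?: BlockedHandler,
): BlockedHandler {
  return (event) => {
    if (event.type !== "guardrail.blocked" || event.plugin === undefined) return;
    const handler = handlers.get(event.plugin) ?? fallback;
    handler?.(event);
  };
}
```

The returned callback could then be passed to system.observe(), assuming the observer signature shown in the basic subscription example.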


Wire to OpenTelemetry

import { trace } from '@opentelemetry/api';

const tracer = trace.getTracer('directive-guardrails');

system.observe((event) => {
  if (event.type !== 'guardrail.blocked') return;
  const span = tracer.startSpan('guardrail.blocked', {
    attributes: {
      'guardrail.plugin': event.plugin,
      'guardrail.kind': event.kind,
      'guardrail.category': event.category ?? 'uncategorized',
      'guardrail.count': event.count,
    },
  });
  span.end();
});

Span attributes deliberately avoid the actual matched text — count and category are coarse classifiers chosen so OTel exporters can label spans without exfiltrating PII into observability backends that may have weaker retention controls than the primary fact store.


Wire to an audit ledger

import { createAuditLedger } from '@directive-run/core/audit-ledger';

const ledger = createAuditLedger({ store: postgresStore });

const system = createSystem({
  module: customer,
  plugins: [
    createFactPIIGuardrail({ mode: 'redact' }),
    ledger.plugin,
  ],
});

system.observe((event) => {
  if (event.type === 'guardrail.blocked') {
    ledger.record({
      kind: 'pii_detected',
      plugin: event.plugin,
      key: event.key,
      action: event.kind,
      matches: event.count,
      category: event.category,
      timestamp: Date.now(),
    });
  }
});

The fact.change event for the redacted follow-up write fires independently. Pair the two events if you need to audit "what redacted value did the next subscriber see?" alongside "how many matches drove the redaction?"
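
The pairing can be sketched as a small correlator. The guardrail payload fields come from this page; the fact.change fields (key, next) are inferred from the prose and may differ in the library, and createRedactionPairer is a hypothetical name.

```typescript
// Assumed event shapes; the fact.change fields are an inference, not a spec.
type ObservedEvent =
  | { type: "guardrail.blocked"; plugin: string; key: string; kind: "redact" | "alert" | "detect"; count: number }
  | { type: "fact.change"; key: string; next: unknown };

interface RedactionPair {
  key: string;
  plugin: string;
  matches: number;
  redactedValue: unknown;
}

// Hypothetical correlator: remembers the latest "redact" event per key and
// pairs it with the next fact.change observed on that same key.
function createRedactionPairer(onPair: (pair: RedactionPair) => void) {
  const pending = new Map<string, { plugin: string; matches: number }>();
  return (event: ObservedEvent): void => {
    if (event.type === "guardrail.blocked") {
      // Only "redact" is followed by a rewrite; "alert" and "detect" are not.
      if (event.kind === "redact") {
        pending.set(event.key, { plugin: event.plugin, matches: event.count });
      }
      return;
    }
    const waiting = pending.get(event.key);
    if (waiting === undefined) return; // unrelated write, nothing to pair
    pending.delete(event.key);
    onPair({
      key: event.key,
      plugin: waiting.plugin,
      matches: waiting.matches,
      redactedValue: event.next,
    });
  };
}
```

The sketch relies on the ordering described on this page: the raw fact.change, then guardrail.blocked, then the fact.change for the redacted write.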


Plugin authoring — emit from your own guardrail

If you're writing a new guardrail (e.g. content moderation, rate limiting, schema enforcement), emit guardrail.blocked via the system.notify surface so OTel / timeline / audit-ledger consumers see your plugin's activity through the same channel as createFactPIIGuardrail.

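import type { Plugin } from '@directive-run/core';

// Placeholder detector, not part of the library: returns one entry per
// violation found in `value`. Swap in your own matching logic.
function inspect(value: unknown): string[] {
  return typeof value === 'string' ? (value.match(/\bforbidden\b/g) ?? []) : [];
}

export function createMyGuardrail(): Plugin {
  // Captured in onInit so onFactSet can reach the notify surface later.
  let systemRef: any = null;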
  return {
    name: 'my-guardrail',
    onInit(system) {
      systemRef = system;
    },
    onFactSet(key, value) {
      const matches = inspect(value);
      if (matches.length === 0) return;
      systemRef?.notify.guardrailBlocked(
        'my-guardrail',  // MUST match this plugin's `name`
        key,
        'alert',
        matches.length,
        'my-category',
      );
    },
  };
}

The plugin field is validated against the registered plugin set — calling notify.guardrailBlocked with an unknown plugin name drops the event with a dev-mode warning. This is a forgery defense: third-party plugins cannot mint events claiming plugin: "fact-pii-guardrail" and mislead compliance audit consumers.

The notify surface also caps reentry depth at 4 to prevent infinite recursion through plugins whose onGuardrailBlocked re-emits.
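
The two protections described above (name validation and the reentry cap) can be illustrated with a toy notify surface. This is a sketch of the idea only, not the engine's implementation: createNotifySketch and its internals are hypothetical, and the exact point at which the real engine drops a reentrant call is an assumption.

```typescript
type BlockedHook = (plugin: string, key: string) => void;

// Toy notify surface. `registered` maps plugin names to their optional
// onGuardrailBlocked-style hook. All names here are illustrative.
function createNotifySketch(
  registered: Map<string, BlockedHook | undefined>,
  maxDepth = 4,
) {
  let depth = 0;
  let dropped = 0;

  function guardrailBlocked(plugin: string, key: string): boolean {
    // Forgery defense: unknown plugin names never reach subscribers.
    // Reentry cap: a hook that re-emits is cut off once maxDepth is reached.
    if (!registered.has(plugin) || depth >= maxDepth) {
      dropped += 1;
      return false;
    }
    depth += 1;
    try {
      registered.forEach((hook) => hook?.(plugin, key));
    } finally {
      depth -= 1;
    }
    return true;
  }

  return { guardrailBlocked, droppedCount: () => dropped };
}
```

With a hook that re-emits on every call, this sketch delivers four nested emissions and drops the fifth; an emission under an unregistered name is dropped immediately.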


Plugin hook (onGuardrailBlocked)

Plugins that want to react to guardrail activity without going through system.observe() implement the onGuardrailBlocked hook directly:

const auditPlugin: Plugin = {
  name: 'audit-plugin',
  onGuardrailBlocked(plugin, key, kind, count, category) {
    // Same fields as the event payload, passed positionally by the engine.
  },
};

The synthetic plugin that backs system.observe() implements onGuardrailBlocked and maps it to the typed event, so subscribing via observe() and via the plugin hook see the same emissions.
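
The mapping from hook arguments to the typed event can be sketched as a small adapter. The function name toGuardrailEvent and the local interface are assumptions for illustration; the library's synthetic plugin may be written differently.

```typescript
type GuardrailKind = "redact" | "alert" | "detect";

// Local mirror of the payload documented on this page.
interface GuardrailBlockedEvent {
  type: "guardrail.blocked";
  plugin: string;
  key: string;
  kind: GuardrailKind;
  count: number;
  category?: string;
}

// Hypothetical adapter: turns the positional hook arguments into the
// object-shaped event that observe() subscribers receive.
function toGuardrailEvent(
  plugin: string,
  key: string,
  kind: GuardrailKind,
  count: number,
  category?: string,
): GuardrailBlockedEvent {
  const event: GuardrailBlockedEvent = { type: "guardrail.blocked", plugin, key, kind, count };
  // Leave the optional field absent instead of setting it to undefined.
  if (category !== undefined) event.category = category;
  return event;
}
```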


What does NOT emit

  • The user onBlocked callback still fires INDEPENDENTLY of the observation event. It exists for backwards compatibility and ad-hoc paging (Sentry, Honeycomb). Prefer system.observe() for new integrations — observers see every registered guardrail's activity through the same typed stream.
  • The follow-up redact write that mutates the fact in mode: "redact" emits its own fact.change event AFTER the guardrail.blocked event. Subscribers expecting to see "the redacted value" should listen for fact.change on the same key with event.next === redactionToken.
  • The pre-emit raw write is briefly visible to observers between the original fact.change and the redact follow-up. Tier 0 PII guards do NOT prevent the raw value from reaching audit-ledger / debug-timeline / devtools during that microtask. An RFC for a pre-emit transform hook is tracked separately.

Security considerations

  • The count and category fields are deliberately coarse. They carry no payload content and no sample of the matched text, which avoids leaking PII into observability backends with weaker retention controls than the fact store.
  • The plugin field is a guardrail-declared string. The engine validates it against the registered plugin set on every call. A future RFC (per-plugin scoped notify handles) will restrict this further so that one registered plugin cannot impersonate another.
  • The event channel is forward-only, with no replay buffer. Late subscribers do not see past events. Reconstruct history from the audit-ledger / OTel backend, not from system.observe().
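
The forward-only behaviour can be illustrated with a toy channel, together with a bounded recorder that has to be attached before events start flowing. Both createForwardOnlyChannel and createRecorder are hypothetical stand-ins written for this sketch; they are not the library's implementation of system.observe().

```typescript
type Listener<T> = (event: T) => void;

// Toy forward-only channel: events go to current listeners only and are
// never stored, so a late subscriber cannot see earlier emissions.
function createForwardOnlyChannel<T>() {
  const listeners: Listener<T>[] = [];
  return {
    observe(listener: Listener<T>): () => void {
      listeners.push(listener);
      // Returns an unsubscribe function.
      return () => {
        const index = listeners.indexOf(listener);
        if (index >= 0) listeners.splice(index, 1);
      };
    },
    emit(event: T): void {
      listeners.slice().forEach((listener) => listener(event));
    },
  };
}

// Hypothetical bounded recorder: keeps only the most recent `limit` events.
function createRecorder<T>(limit: number) {
  const buffer: T[] = [];
  return {
    listener: (event: T): void => {
      buffer.push(event);
      if (buffer.length > limit) buffer.shift();
    },
    snapshot: (): T[] => buffer.slice(),
  };
}
```

An in-memory recorder like this only helps within one process lifetime; for durable history the audit ledger or OTel backend remains the place to look.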

Related

  • Sources: the upstream of most guardrail activity. Sources publish into facts; guardrails inspect those writes.
  • Guardrails — the catalog of built-in guardrails (PII, prompt injection, content moderation).
  • PII Detection — built-in PII guardrail config + categories.
  • Audit Trail — using the audit ledger plugin with guardrail.blocked events.
  • DevTools — observe the event stream live in development.
  • RFC 0010 in the directive monorepo (docs/rfcs/0010-guardrail-blocked-event.md) for the full design history.