Skip to main content

Guides

2 min read

Control AI Costs

Prevent runaway agents from burning through your API budget.


The Problem

An agent enters a reasoning loop and burns through $50 of tokens in minutes. A batch job scales to 1,000 concurrent agents overnight. Without guardrails on spending, LLM costs spiral out of control.

The Solution

Use maxTokenBudget on the orchestrator for simple token caps, or withBudget middleware for dollar-based limits with time windows:

import { createAgentOrchestrator, withBudget } from '@directive-run/ai';

// Option 1: Simple token cap on orchestrator
const orchestrator = createAgentOrchestrator({
  runner, // See Running Agents (/ai/running-agents) for setup
  autoApproveToolCalls: true,
  maxTokenBudget: 50000,
  budgetWarningThreshold: 0.8,
  onBudgetWarning: ({ currentTokens, maxBudget, percentage }) => {
    console.warn(`Budget ${Math.round(percentage * 100)}% used`);
  },
});

// Option 2: Dollar-based limits with time windows
const budgetedRunner = withBudget(runner, {
  maxCostPerCall: 0.50,
  budgets: [
    { window: 'hour', maxCost: 10.00 },
    { window: 'day', maxCost: 100.00 },
  ],
  pricing: {
    inputPerToken: 0.000003,
    outputPerToken: 0.000015,
  },
  onBudgetExceeded: (details) => {
    console.error(`Budget exceeded: ${details.reason}`);
  },
});

How It Works

  • maxTokenBudget is a hard cap on total tokens (input + output) across all agent turns. The agent pauses when exceeded.
  • budgetWarningThreshold fires onBudgetWarning at the specified percentage (0.8 = 80%).
  • withBudget wraps a runner with dollar-based cost tracking. It estimates costs before each call and rejects calls that would exceed the budget.
  • budgets array supports multiple time windows. Each window tracks spending independently.
  • pricing maps token counts to dollar costs. Set these to match your LLM provider's pricing.

Full Example

An orchestrator with both token caps and dollar budgets, plus logging:

import {
  createAgentOrchestrator,
  withBudget,
} from '@directive-run/ai';

const budgetedRunner = withBudget(runner, { // See Running Agents (/ai/running-agents) for setup
  maxCostPerCall: 1.00,
  budgets: [
    { window: 'hour', maxCost: 25.00 },
    { window: 'day', maxCost: 200.00 },
  ],
  pricing: {
    inputPerToken: 0.000003,
    outputPerToken: 0.000015,
  },
  onBudgetExceeded: (details) => {
    alertOps(`AI budget exceeded: ${details.reason}`); // Your alerting function
  },
});

const orchestrator = createAgentOrchestrator({
  runner: budgetedRunner,
  autoApproveToolCalls: true,
  maxTokenBudget: 100000,
  budgetWarningThreshold: 0.9,
  onBudgetWarning: ({ percentage }) => {
    console.warn(`Token budget at ${Math.round(percentage * 100)}%`);
  },
});

// Check spending at any time
const hourlySpend = budgetedRunner.getSpent('hour');
const dailySpend = budgetedRunner.getSpent('day');
console.log(`Spent: $${hourlySpend.toFixed(2)}/hr, $${dailySpend.toFixed(2)}/day`);
Previous
Human Approval Workflows

We care about your data. We'll never share your email.

Powered by Directive. This signup uses a Directive module with facts, derivations, constraints, and resolvers – zero useState, zero useEffect. Read how it works

Directive - Constraint-Driven State Management for TypeScript