The cost estimation CLI shows you costs after a workflow finishes. Cost events expose similar data while a workflow runs:
  • cost:llm:request — emitted when an LLM call finishes (generateText / streamText), with model id, token usage, and estimated dollar cost.
  • cost:http:request — emitted when you attach a dollar cost to an HTTP response with addRequestCost from @outputai/http, with the request id, URL, and cost you supplied.
Use either or both to log spend to your observability stack, trigger alerts, or aggregate cost per workflow over time.

Setup

Cost events use the same hooks system as error hooks:
  1. Create a hook file and import on from @outputai/core/hooks.
  2. Register a handler for cost:llm:request, cost:http:request, or both.
  3. Add the file path to output.hookFiles in package.json.
See Error Hooks — Setup for the hook file registration pattern.
src/llm_cost_hooks.js
import { on } from '@outputai/core/hooks';

on( 'cost:llm:request', async ( { workflowId, activityId, modelId, usage, cost } ) => {
  console.log( 'LLM call', {
    workflowId,
    activityId,
    modelId,
    tokens: usage?.totalTokens,
    cost: cost?.total
  } );
} );
src/http_cost_hooks.js
import { on } from '@outputai/core/hooks';

on( 'cost:http:request', async ( { workflowId, activityId, requestId, url, cost } ) => {
  console.log( 'HTTP request', { workflowId, activityId, requestId, url, total: cost?.total } );
} );
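Step 3 of the setup can be sketched as a package.json fragment. This assumes output.hookFiles accepts an array of paths relative to the project root; adjust the paths and structure to your project.

```json
{
  "name": "my-app",
  "output": {
    "hookFiles": [
      "src/llm_cost_hooks.js",
      "src/http_cost_hooks.js"
    ]
  }
}
```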
Handler errors are caught and logged by the framework — they never affect the workflow or the request that triggered them.
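To aggregate cost per workflow, both handlers can feed a shared tally. A minimal sketch: the Map-based accounting below is illustrative (in production you would likely forward these numbers to your observability stack), and the commented wiring assumes the on registration shown above.

```javascript
// Illustrative in-memory tally of spend per workflow.
const spendByWorkflow = new Map();

function recordCost( workflowId, cost ) {
  // cost.total is null when pricing could not be computed; skip those events.
  if ( typeof cost?.total !== 'number' ) return;
  const key = workflowId ?? 'outside-workflow';
  spendByWorkflow.set( key, ( spendByWorkflow.get( key ) ?? 0 ) + cost.total );
}

// Hypothetical wiring, reusing the same function for both event types:
//   on( 'cost:llm:request', ( { workflowId, cost } ) => recordCost( workflowId, cost ) );
//   on( 'cost:http:request', ( { workflowId, cost } ) => recordCost( workflowId, cost ) );
```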

LLM request cost

When events fire

A cost:llm:request event is emitted after every generateText and streamText call completes. For streaming, the event fires when the stream finishes, not when it starts.

Payload

The handler receives a single object:
| Field | Type | Description |
| --- | --- | --- |
| workflowId | string \| undefined | Temporal workflow execution ID. Present when the call runs inside a workflow. |
| activityId | string \| undefined | Temporal activity ID of the step that made the LLM call. Present when the call runs inside a workflow activity. |
| modelId | string | Model identifier (e.g. claude-sonnet-4-20250514, gpt-4o). From the prompt file config. |
| usage | object | Token usage for this call. See Usage shape. |
| cost | object | Computed dollar cost for this call. See Cost structure. |

Usage shape

usage follows the AI SDK’s totalUsage shape, aggregated across all steps for that call:
{
  "inputTokens": 217,
  "inputTokenDetails": { "noCacheTokens": 217, "cacheReadTokens": 0, "cacheWriteTokens": 0 },
  "outputTokens": 9,
  "outputTokenDetails": { "textTokens": null, "reasoningTokens": null },
  "totalTokens": 226,
  "reasoningTokens": null,
  "cachedInputTokens": 0
}
When the model uses extended thinking, reasoningTokens and outputTokenDetails.reasoningTokens are set. Cached prompt tokens are reported in cachedInputTokens and inputTokenDetails.cacheReadTokens.

Cost structure

Cost is computed from the model’s per-million-token pricing, fetched from a built-in pricing source and cached for 24 hours. For each priced dimension, the dollar amount is (tokens / 1_000_000) * pricePerMillion. Non-cached input tokens use inputTokens - (cachedInputTokens ?? 0). The payload matches LLMCallCost from @outputai/llm: on success you get a numeric total plus a components array; on failure total is null and message explains why.
| Field | Type | Description |
| --- | --- | --- |
| total | number \| null | Total estimated cost in dollars. null if pricing could not be computed. |
| message | string | Present when total is null (e.g. unknown model, pricing fetch failed, or unexpected error). |
| components | array | Present when total is a number. Each entry is { name: string, value: number }. Only dimensions that have pricing for the model are included. |
components[].name values

| name | Description |
| --- | --- |
| input_tokens | Non-cached prompt tokens. |
| input_cached_tokens | Cached prompt read tokens (cachedInputTokens). |
| output_tokens | Completion tokens. |
| reasoning_tokens | Reasoning tokens, only when the model defines separate reasoning pricing; omitted otherwise. |
Successful calculation:
{
  "total": 0.002341,
  "components": [
    { "name": "input_tokens", "value": 0.001302 },
    { "name": "input_cached_tokens", "value": 0 },
    { "name": "output_tokens", "value": 0.001039 },
    { "name": "reasoning_tokens", "value": 0.00027 }
  ]
}
A given response may list only a subset of these — for example, no reasoning_tokens row if that price is not configured for the model.

Unknown model or pricing unavailable:
{
  "total": null,
  "message": "Missing cost reference for model"
}
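The per-dimension arithmetic described above can be sketched in a few lines. The pricing numbers here are made up for illustration; the real values come from the built-in pricing source.

```javascript
// Hypothetical per-million-token prices in dollars — illustrative only.
const pricing = { input: 3, cachedInput: 0.3, output: 15 };

function estimateCost( usage ) {
  const cached = usage.cachedInputTokens ?? 0;
  const components = [
    // Non-cached input tokens: inputTokens - (cachedInputTokens ?? 0)
    { name: 'input_tokens', value: ( ( usage.inputTokens - cached ) / 1_000_000 ) * pricing.input },
    { name: 'input_cached_tokens', value: ( cached / 1_000_000 ) * pricing.cachedInput },
    { name: 'output_tokens', value: ( usage.outputTokens / 1_000_000 ) * pricing.output }
  ];
  return { total: components.reduce( ( sum, c ) => sum + c.value, 0 ), components };
}
```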

HTTP request cost

Events fire only when your code calls addRequestCost( response, cost ) with a response object created by @outputai/http (or its exported fetch). The SDK attaches the cost to the existing HTTP trace event and emits cost:http:request. If the response did not originate from this package, addRequestCost no-ops (with a console warning) and no hook event is emitted.

Payload

The handler receives a single object. As with cost:llm:request, workflowId and activityId are filled when the hook runs inside a Temporal workflow activity.
| Field | Type | Description |
| --- | --- | --- |
| workflowId | string \| undefined | Temporal workflow execution ID. Present when the handler runs inside a workflow. |
| activityId | string \| undefined | Temporal activity ID. Present when the handler runs inside a workflow activity. |
| requestId | string | Internal id linking this payload to the HTTP trace event for that request. |
| url | string | Final response URL (same as response.url). |
| cost | object | Dollar cost you passed to addRequestCost. See HTTP cost shape. |

HTTP cost shape

Same top-level shape as LLM costs: a numeric total and an optional components array of { name, value }. For HTTP, name and value are entirely yours — they mirror what you pass to addRequestCost (for example labels derived from vendor billing headers).
{
  "total": 0.42,
  "components": [
    { "name": "input", "value": 0.12 },
    { "name": "output", "value": 0.3 }
  ]
}
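Tying this together, a minimal sketch of building a cost object from vendor billing headers and attaching it to a response. The header names are hypothetical; fetch and addRequestCost are the exports from @outputai/http described above.

```javascript
// Hypothetical helper: derive a { total, components } cost object from
// vendor billing headers (header names are made up for this example).
function costFromBillingHeaders( headers ) {
  const input = Number( headers.get( 'x-billing-input-cost' ) ?? 0 );
  const output = Number( headers.get( 'x-billing-output-cost' ) ?? 0 );
  return {
    total: input + output,
    components: [
      { name: 'input', value: input },
      { name: 'output', value: output }
    ]
  };
}

// Usage (assumes fetch and addRequestCost come from @outputai/http):
//   const response = await fetch( 'https://api.vendor.example/v1/generate', { method: 'POST' } );
//   addRequestCost( response, costFromBillingHeaders( response.headers ) );
```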