- cost:llm:request is emitted when an LLM call finishes (generateText/streamText), with an LLM usage attribute.
- cost:http:request is emitted when you attach a dollar cost to an HTTP response with addRequestCost from @outputai/http, with an HTTP request cost attribute.
Setup
Cost events use the same hooks system as error hooks:
- Create a hook file and import on from @outputai/core/hooks.
- Register a handler for cost:llm:request, cost:http:request, or both.
- Add the file path to outputai.hookFiles in package.json.
src/llm_cost_hooks.ts
src/http_cost_hooks.ts
LLM request cost
When events fire
A cost:llm:request event is emitted after every generateText and streamText call completes. For streaming, the event fires when the stream finishes, not when it starts.
Payload
The handler receives a single LLM usage attribute object:

| Field | Type | Description |
|---|---|---|
| eventId | string | UUID v4 stamped per emit. Use as an idempotency key. cost:llm:request and http:request for the same fetch get distinct eventIds. |
| eventDate | number | Millisecond epoch timestamp for when the event was emitted. |
| type | "llm:usage" | Attribute type. |
| activityInfo | object | Temporal activity.Info for the activity that made the call. |
| workflowDetails | object | Output’s serializable subset of Temporal workflow.WorkflowInfo. |
| outputActivityKind | string | Output activity kind. Possible values are step, evaluator, and internal_step. |
| modelId | string | Model identifier (e.g. claude-sonnet-4-20250514, gpt-4o). From the prompt file config. |
| usage | array | Priced usage entries for this call. See Usage entries. |
| total | number | Total estimated cost in dollars. |
| tokensUsed | number | Total number of tokens represented by usage. |
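
As an illustration of consuming this payload, the sketch below folds events into a running total per model. The LlmCostEvent type is a local stand-in that declares only the fields it reads, and recordLlmCost is a hypothetical helper, not part of the SDK. The registration line is shown as a comment because the exact on(eventName, handler) call shape is an assumption.

```typescript
// Local stand-in type: only the payload fields this sketch reads.
interface LlmCostEvent {
  eventId: string;
  modelId: string;
  total: number; // estimated cost in dollars
  tokensUsed: number;
}

interface ModelTotals {
  dollars: number;
  tokens: number;
  calls: number;
}

// Hypothetical helper: folds one event into a per-model running total.
function recordLlmCost(
  totals: Map<string, ModelTotals>,
  event: LlmCostEvent,
): void {
  const current = totals.get(event.modelId) ?? { dollars: 0, tokens: 0, calls: 0 };
  totals.set(event.modelId, {
    dollars: current.dollars + event.total,
    tokens: current.tokens + event.tokensUsed,
    calls: current.calls + 1,
  });
}

const totals = new Map<string, ModelTotals>();

// In a hook file this would be wired up roughly as (assumed call shape):
//   on("cost:llm:request", (event) => recordLlmCost(totals, event));

// Two sample events with made-up values, fed in by hand.
recordLlmCost(totals, { eventId: "a", modelId: "gpt-4o", total: 0.25, tokensUsed: 1000 });
recordLlmCost(totals, { eventId: "b", modelId: "gpt-4o", total: 0.5, tokensUsed: 2000 });
```

Keeping the aggregation in a plain function, separate from the registration call, makes it testable without the hooks runtime.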
Usage entries
Cost is computed from the model’s per-million-token pricing, fetched from a built-in pricing source and cached for 24 hours. For each priced dimension, the dollar amount is (tokens / 1_000_000) * pricePerMillion. Non-cached input tokens are computed as inputTokens - (cachedInputTokens ?? 0).
| Field | Type | Description |
|---|---|---|
| type | string | Usage dimension. |
| ppm | number | Price per million tokens for this dimension. |
| amount | number | Token count for this dimension. |
| total | number | Cost for this dimension. |
usage[].type values:
| type | Description |
|---|---|
| input | Non-cached prompt tokens. |
| input_cached | Cached prompt read tokens (cachedInputTokens). |
| output | Completion tokens. |
| reasoning | Reasoning tokens, only when the model defines separate reasoning pricing. |
reasoning is omitted when the model does not define separate reasoning pricing, and input_cached is omitted when there are no cached input tokens.
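
The pricing rules above can be sketched as a small pure function. This is not the SDK's implementation: TokenCounts, ModelPricing, and priceUsage are hypothetical names, the prices are made-up numbers, and the entry order is an assumption. The source does not say how output and reasoning tokens relate, so the sketch prices each count exactly as given.

```typescript
type UsageType = "input" | "input_cached" | "output" | "reasoning";

interface UsageEntry {
  type: UsageType;
  ppm: number; // price per million tokens
  amount: number; // token count
  total: number; // dollars
}

// Hypothetical input shapes, for illustration only.
interface TokenCounts {
  inputTokens: number;
  cachedInputTokens?: number;
  outputTokens: number;
  reasoningTokens?: number;
}

interface ModelPricing {
  input: number;
  inputCached: number;
  output: number;
  reasoning?: number; // defined only when reasoning is priced separately
}

function entry(type: UsageType, ppm: number, amount: number): UsageEntry {
  return { type, ppm, amount, total: (amount / 1_000_000) * ppm };
}

function priceUsage(tokens: TokenCounts, pricing: ModelPricing): UsageEntry[] {
  const cached = tokens.cachedInputTokens ?? 0;
  // Non-cached input is the input count minus the cached portion.
  const entries: UsageEntry[] = [
    entry("input", pricing.input, tokens.inputTokens - cached),
  ];
  // input_cached is omitted when there are no cached input tokens.
  if (cached > 0) {
    entries.push(entry("input_cached", pricing.inputCached, cached));
  }
  entries.push(entry("output", pricing.output, tokens.outputTokens));
  // reasoning is omitted when the model has no separate reasoning price.
  if (pricing.reasoning !== undefined) {
    entries.push(entry("reasoning", pricing.reasoning, tokens.reasoningTokens ?? 0));
  }
  return entries;
}

// Made-up counts and prices: 1,500 input tokens (500 of them cached), 2,000 output.
const entries = priceUsage(
  { inputTokens: 1500, cachedInputTokens: 500, outputTokens: 2000 },
  { input: 2, inputCached: 0.5, output: 8 },
);
const callTotal = entries.reduce((sum, e) => sum + e.total, 0);
```

With these sample numbers the call yields three entries (input, input_cached, output), and summing their total fields gives the per-call cost.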
HTTP request cost
Events fire only when your code calls addRequestCost(response, total) with a response object created by @outputai/http (or its exported fetch). The SDK attaches the cost to the existing HTTP trace event and emits cost:http:request. If the response did not originate from this package, addRequestCost is a no-op (it logs a console warning) and no hook event is emitted.
Payload
The handler receives a single HTTP request cost attribute object:

| Field | Type | Description |
|---|---|---|
| eventId | string | UUID v4 stamped per emit. Use as an idempotency key. cost:http:request and http:request for the same fetch get distinct eventIds, so consumers keying by eventId won’t collapse the two. |
| eventDate | number | Millisecond epoch timestamp for when the event was emitted. |
| type | "http:request:cost" | Attribute type. |
| activityInfo | object | Temporal activity.Info for the activity that made the request. |
| workflowDetails | object | Output’s serializable subset of Temporal workflow.WorkflowInfo. |
| outputActivityKind | string | Output activity kind. Possible values are step, evaluator, and internal_step. |
| requestId | string | Internal id linking this payload to the HTTP trace event for that request. |
| url | string | Final response URL (same as response.url). |
| total | number | Dollar cost passed to addRequestCost. |
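
Since eventId is meant to serve as an idempotency key, a handler can use it to ignore repeats. The sketch below is one possible shape, assuming the handler might see the same event more than once. HttpCostEvent is a local stand-in type and createHttpCostRecorder is a hypothetical helper, not part of the SDK.

```typescript
// Local stand-in type: only the payload fields this sketch reads.
interface HttpCostEvent {
  eventId: string;
  requestId: string;
  url: string;
  total: number; // dollars passed to addRequestCost
}

// Hypothetical helper: records each eventId at most once.
function createHttpCostRecorder() {
  const seen = new Set<string>();
  const dollarsByUrl = new Map<string, number>();

  // Returns true if the event was recorded, false if it was a repeat.
  const handle = (event: HttpCostEvent): boolean => {
    if (seen.has(event.eventId)) {
      return false;
    }
    seen.add(event.eventId);
    dollarsByUrl.set(event.url, (dollarsByUrl.get(event.url) ?? 0) + event.total);
    return true;
  };

  return { handle, dollarsByUrl };
}

const recorder = createHttpCostRecorder();

// In a hook file this would be wired up roughly as (assumed call shape):
//   on("cost:http:request", (event) => { recorder.handle(event); });

// Made-up sample event, delivered twice by hand to show the dedupe.
const sample: HttpCostEvent = {
  eventId: "11111111-1111-4111-8111-111111111111",
  requestId: "req-1",
  url: "https://api.example.com/search",
  total: 0.5,
};
const first = recorder.handle(sample);
const second = recorder.handle(sample);
```

An in-memory Set only deduplicates within one process; a durable store keyed by eventId would be needed to cover restarts.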