The @outputai/llm package is how you call LLMs from your steps and evaluators. It wraps the AI SDK and adds prompt files — version-controlled .prompt files that live alongside your code and define the provider, model, temperature, and message template in one place.
Generate Functions
generateText is the primary function for LLM calls. Use the output parameter with Output.* helpers to control the response shape. Use streamText for streaming responses:
| Output Shape | How | Use when you need |
|---|---|---|
| Unstructured text | generateText({ prompt }) | Summaries, emails, explanations |
| Streamed text | streamText({ prompt }) | Real-time output, long responses, UX responsiveness |
| Typed object | generateText({ prompt, output: Output.object({ schema }) }) | Structured data, evaluator judgments |
| Array of objects | generateText({ prompt, output: Output.array({ element }) }) | Lists, multiple items |
| One of N choices | generateText({ prompt, output: Output.choice({ options }) }) | Classification, routing |
Text Output
Generate unstructured text from a prompt file:
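A minimal sketch of such a call, assuming generateText accepts a prompt file name plus a variables bag for the template (the variables option is an assumption, mirroring the Agent options later in this doc):

```typescript
import { generateText } from '@outputai/llm';

const article = '...'; // content to summarize

const response = await generateText({
  prompt: 'generate_summary@v1', // resolves to prompts/generate_summary@v1.prompt
  variables: { article },        // template variables (assumed option)
});

console.log(response.result); // convenience alias for response.text
```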
result is a convenience alias for response.text.
Streaming
Stream text from a prompt file. Unlike generateText, streamText returns immediately with a stream result — properties like text, usage, and finishReason are promises that resolve when the stream completes.
streamText is not async — it returns a stream result synchronously. Iterate textStream to process chunks as they arrive. You can also await result.text to get the full text in one shot, but that collapses the stream and is functionally identical to generateText.
To process chunks with side effects (e.g., writing to stdout):
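A sketch, assuming the same prompt-file call shape as above:

```typescript
import { streamText } from '@outputai/llm';

// streamText is synchronous; no await on the call itself.
const result = streamText({ prompt: 'generate_summary@v1' }); // illustrative prompt file

for await (const chunk of result.textStream) {
  process.stdout.write(chunk); // side effect per chunk
}
```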
Use smoothStream (passed via experimental_transform) for more natural output pacing:
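For example (smoothStream comes from the AI SDK; the chunking and delay values are illustrative):

```typescript
import { streamText } from '@outputai/llm';
import { smoothStream } from 'ai';

const result = streamText({
  prompt: 'generate_summary@v1', // illustrative prompt file
  // Re-chunk the stream word-by-word with a small delay between chunks.
  experimental_transform: smoothStream({ delayInMs: 20, chunking: 'word' }),
});
```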
Object Output
Generate a structured object matching a Zod schema. This is what you’ll use most in evaluators:
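A sketch of an evaluator judgment, using a hypothetical judge_summary@v1 prompt file and assumed template variables:

```typescript
import { generateText, Output } from '@outputai/llm';
import { z } from 'zod';

const verdict = z.object({
  passed: z.boolean(),
  score: z.number().min(0).max(1),
  reasoning: z.string(),
});

const response = await generateText({
  prompt: 'judge_summary@v1', // hypothetical evaluator prompt file
  variables: { candidate: '...', reference: '...' }, // assumed template variables
  output: Output.object({ schema: verdict }),
});

console.log(response.output.score); // typed as z.infer<typeof verdict>
```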
output contains the typed object matching your schema.
Array Output
Generate an array of structured items:

Choice Output
Select one value from a set of options:

Agents
The Agent class wraps the AI SDK’s ToolLoopAgent with Output prompt files and the skills system. Use it when you need multi-step tool execution, conversation history, or a reusable agent instance with a fixed configuration. For single-shot LLM calls without tools, generateText is simpler.
Construction
The prompt file is loaded and rendered at construction time. Variables, skills, and tools are fixed at construction. The agent is ready to call generate() or stream() immediately.
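A sketch of constructing an agent, assuming the AI SDK v5 tool() helper for the tool set (the prompt file and tool are hypothetical):

```typescript
import { Agent } from '@outputai/llm';
import { tool } from 'ai';
import { z } from 'zod';

const agent = new Agent({
  prompt: 'writing_assistant@v1', // prompt file, rendered at construction
  variables: { audience: 'developers' },
  maxSteps: 5,
  tools: {
    searchDocs: tool({
      description: 'Search the product docs',
      inputSchema: z.object({ query: z.string() }),
      execute: async ({ query }) => ({ query, hits: [] }), // stub implementation
    }),
  },
});
```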
| Option | Type | Default | Description |
|---|---|---|---|
| prompt | string | (required) | Prompt file name (e.g. 'writing_assistant@v1') |
| variables | Record<string, unknown> | {} | Template variables rendered at construction |
| skills | Skill[] | [] | Skill packages for the LLM |
| tools | ToolSet | {} | AI SDK tools available during the loop |
| maxSteps | number | 10 | Maximum tool-loop iterations |
| stopWhen | StopCondition | - | Custom stop condition (overrides maxSteps) |
| output | Output | - | Structured output spec (e.g. Output.object({ schema })) |
| conversationStore | ConversationStore | - | Pluggable store for multi-turn history |
| temperature | number | - | Override prompt file temperature |
| onStepFinish | Function | - | Callback after each tool-loop step |
| prepareStep | Function | - | Customize each step before execution |
generate()
Run the agent and return when complete. The response has the same shape as generateText: text, result (alias for text), output, usage, finishReason, toolCalls, etc.
Pass additional messages to extend the conversation:
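A sketch, assuming an agent constructed as in the Construction section and that generate() accepts an AI SDK-style messages array (exact signature assumed):

```typescript
const response = await agent.generate();
console.log(response.text);

// Follow-up turn with additional messages:
const followUp = await agent.generate({
  messages: [{ role: 'user', content: 'Now make it more concise.' }],
});
console.log(followUp.text);
```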
stream()
Stream the agent’s response. As with streamText, the stream result provides textStream and fullStream iterables, plus promise-based properties (text, usage, finishReason) that resolve on completion.
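A sketch, assuming stream() returns its stream result synchronously like streamText:

```typescript
const result = agent.stream();

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

const usage = await result.usage; // resolves when the stream completes
```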
Structured Output
Use Output.object() with Agent to get typed responses:
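A sketch with a hypothetical classification prompt file:

```typescript
import { Agent, Output } from '@outputai/llm';
import { z } from 'zod';

const classifier = new Agent({
  prompt: 'classify_ticket@v1', // hypothetical prompt file
  output: Output.object({
    schema: z.object({
      category: z.string(),
      urgency: z.enum(['low', 'medium', 'high']),
    }),
  }),
});

const { output } = await classifier.generate();
console.log(output.category, output.urgency); // typed from the schema
```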
Conversation Store
By default, Agent is stateless. Each generate() call starts fresh with only the initial prompt messages. Pass a conversationStore to maintain history across calls:
The ConversationStore interface:
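The exact method names aren’t shown here, so this is a plausible shape with a minimal in-memory implementation in the spirit of createMemoryConversationStore():

```typescript
// Assumed interface — the real ConversationStore may differ in method names.
type StoredMessage = { role: 'user' | 'assistant' | 'system' | 'tool'; content: unknown };

interface ConversationStore {
  load(conversationId: string): Promise<StoredMessage[]>;
  append(conversationId: string, messages: StoredMessage[]): Promise<void>;
}

// Minimal in-memory implementation: one message array per conversation id.
function createSimpleMemoryStore(): ConversationStore {
  const conversations = new Map<string, StoredMessage[]>();
  return {
    async load(id) {
      return conversations.get(id) ?? [];
    },
    async append(id, messages) {
      conversations.set(id, [...(conversations.get(id) ?? []), ...messages]);
    },
  };
}
```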
createMemoryConversationStore() is the built-in in-memory implementation. For production, implement the interface with your database.
stream() does not automatically append messages to the conversation store. If you use streaming with a conversation store, persist messages manually in the onFinish callback.

When to Use Agent vs generateText
| | generateText | Agent |
|---|---|---|
| Best for | Single-shot LLM calls | Multi-step tool loops |
| Tools | Supported | Supported |
| Skills | Supported | Supported |
| Conversation history | Manual | Built-in with conversationStore |
| Reusable instance | No (function call) | Yes (construct once, call many) |
| Structured output | Output.object() | Output.object() |
Start with generateText. Move to Agent when you need conversation state or a reusable instance with a fixed configuration.
Response Object
generateText returns the full AI SDK response:
| Field | Description |
|---|---|
| result | Convenience alias for text |
| text | The raw generated text |
| output | The structured output when using Output.* helpers |
| usage | Token counts: inputTokens, outputTokens, totalTokens |
| finishReason | Why generation stopped ('stop', 'length', 'tool-calls', etc.) |
| response | Raw provider response metadata |
| warnings | Any warnings from the provider |
| toolCalls | Tool calls made by the model (when using tools) |
streamText returns a different result type. Stream iterables (textStream, fullStream) provide real-time chunks, while scalar properties (text, usage, finishReason, etc.) are promises that resolve when the stream completes:
| Field | Type | Description |
|---|---|---|
textStream | AsyncIterable<string> | Async iterable of text chunks |
fullStream | AsyncIterable<TextStreamPart> | Async iterable of all stream events (text deltas, tool calls, etc.) |
text | Promise<string> | Full text, resolved on completion |
usage | Promise<LanguageModelUsage> | Token counts, resolved on completion |
finishReason | Promise<FinishReason> | Why generation stopped, resolved on completion |
toolCalls | Promise | Tool calls made during streaming, resolved on completion |
response | Promise | Raw provider response metadata |
warnings | Promise | Any warnings from the provider |
Prompt Files
Instead of hardcoding model config and messages in your code, you write .prompt files that live in your workflow’s prompts/ folder. See the Prompts Guide for the full documentation.
prompts/generate_summary@v1.prompt
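A sketch of what this file might contain — the front-matter keys come from the configuration table below, while the {{article}} template syntax and message layout are assumptions:

```yaml
---
provider: anthropic
model: claude-sonnet-4-20250514
temperature: 0.3
maxTokens: 1024
---
Summarize the following article in three sentences:

{{article}}
```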
Configuration Options
| Option | Type | Description |
|---|---|---|
| provider | string | anthropic, openai, azure, vertex, or bedrock |
| model | string | Model identifier |
| temperature | number | Sampling temperature (0.0-2.0) |
| maxTokens | number | Maximum output tokens |
| tools | object | Provider-specific tools (web search, etc.) |
| providerOptions | object | Provider-specific options — see ProviderOptions Guide |
Providers
Anthropic
Authenticates via the ANTHROPIC_API_KEY environment variable.
OpenAI
Authenticates via the OPENAI_API_KEY environment variable.
Azure OpenAI
Requires AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and AZURE_OPENAI_API_VERSION.
Vertex AI
Amazon Bedrock
Authenticates with standard AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION) or IAM role-based authentication. Set AWS_SESSION_TOKEN when using temporary credentials (e.g., from aws sts assume-role).
For cross-region inference, use the regional inference profile format: us.anthropic.claude-sonnet-4-20250514-v1:0.
Always set maxTokens in your Bedrock prompt files. Unlike the direct Anthropic provider (which auto-detects per-model limits), the Bedrock SDK has no client-side defaults and relies on server-side defaults that may be lower than the model’s capacity.
When using providerOptions, use the bedrock namespace (not anthropic):
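For example (reasoningConfig is illustrative — check the ProviderOptions Guide for the keys your model actually supports):

```yaml
providerOptions:
  bedrock:
    reasoningConfig:
      type: enabled
      budgetTokens: 2048
```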
Provider Tools
Many providers offer built-in tools like web search. Configure them in YAML front matter:
prompts/research@v1.prompt
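A sketch of such a file — the YAML key names and model id are assumed; the search parameters mirror the vertex.tools.googleSearch call discussed just below:

```yaml
---
provider: vertex
model: gemini-2.0-flash
tools:
  googleSearch:
    mode: MODE_DYNAMIC
    dynamicThreshold: 0.8
---
Research the following topic and cite your sources:

{{topic}}
```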
This is equivalent to passing vertex.tools.googleSearch({ mode: 'MODE_DYNAMIC', dynamicThreshold: 0.8 }) at the code level, but keeps your prompt self-contained.
YAML tools are merged with code-level tools, so you can combine provider tools (from YAML) with custom tools (from code). Code-level tools take precedence if names conflict.
For provider-specific tool options, see:
Tool Calling
Use tools with generateText to enable function calling:
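A sketch, assuming the AI SDK v5 tool() helper and a hypothetical prompt file:

```typescript
import { generateText } from '@outputai/llm';
import { tool } from 'ai';
import { z } from 'zod';

const response = await generateText({
  prompt: 'support_agent@v1', // hypothetical prompt file
  tools: {
    getWeather: tool({
      description: 'Get the current weather for a city',
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => ({ city, tempC: 21 }), // stub implementation
    }),
  },
});

console.log(response.toolCalls);
```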
AI SDK Pass-Through Options
All generate functions accept additional AI SDK options passed through to the provider:
| Option | Type | Description |
|---|---|---|
| tools | ToolSet | Tools the model can call (generateText and streamText) |
| toolChoice | 'auto' \| 'none' \| 'required' | Tool selection strategy |
| maxRetries | number | Max retry attempts (default: 2) |
| seed | number | Seed for deterministic output |
| abortSignal | AbortSignal | Cancel the request |
| topP | number | Nucleus sampling (0-1) |
| topK | number | Top-K sampling |
| onChunk | Function | Callback for each stream chunk (streamText only) |
| onFinish | Function | Callback when stream completes (streamText only) |
| onError | Function | Callback on stream error (streamText only) |
| experimental_transform | Function | Stream transform, e.g. smoothStream() (streamText only) |
LLM call cost event
Each generateText and streamText call emits an llm:call_cost event after the LLM responds. You can observe it with the same hooks mechanism as error hooks: register a handler with on('llm:call_cost', handler) from @outputai/core/hooks in a hook file listed under output.hookFiles. The payload includes token usage, computed cost, workflow/activity context, and model id. For payload details, setup, and cost structure, see Cost Events.
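A sketch of a handler (the event field names are assumed from the payload description above — see Cost Events for the actual shape):

```typescript
// In a hook file listed under output.hookFiles.
import { on } from '@outputai/core/hooks';

on('llm:call_cost', (event) => {
  console.log('llm cost', event.model, event.usage, event.cost); // assumed fields
});
```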