The @outputai/llm package is how you call LLMs from your steps and evaluators. It wraps the AI SDK and adds prompt files — version-controlled .prompt files that live alongside your code and define the provider, model, temperature, and message template in one place.

Generate Functions

generateText is the primary function for LLM calls. Use the output parameter with Output.* helpers to control the response shape. Use streamText for streaming responses:
| Output Shape | How | Use when you need |
|---|---|---|
| Unstructured text | generateText({ prompt }) | Summaries, emails, explanations |
| Streamed text | streamText({ prompt }) | Real-time output, long responses, UX responsiveness |
| Typed object | generateText({ prompt, output: Output.object({ schema }) }) | Structured data, evaluator judgments |
| Array of objects | generateText({ prompt, output: Output.array({ element }) }) | Lists, multiple items |
| One of N choices | generateText({ prompt, output: Output.choice({ options }) }) | Classification, routing |

Text Output

Generate unstructured text from a prompt file:
steps.ts
import { step } from '@outputai/core';
import { generateText } from '@outputai/llm';
import { GenerateSummaryInput, GenerateSummaryOutput } from './types.js';

export const generateSummary = step({
  name: 'generateSummary',
  description: 'Generate a company summary from research data',
  inputSchema: GenerateSummaryInput,
  outputSchema: GenerateSummaryOutput,
  fn: async (input) => {
    const { result } = await generateText({
      prompt: 'generate_summary@v1',
      variables: {
        companyName: input.name,
        industry: input.industry,
        size: input.size
      }
    });

    return result;
  }
});

// types.ts
// import { z } from '@outputai/core';
//
// export const GenerateSummaryInput = z.object({
//   name: z.string(),
//   industry: z.string(),
//   size: z.number()
// });
//
// export const GenerateSummaryOutput = z.string();
result is a convenience alias for response.text.

Streaming

Stream text from a prompt file. Unlike generateText, streamText returns immediately with a stream result — properties like text, usage, and finishReason are promises that resolve when the stream completes.
steps.ts
import { step } from '@outputai/core';
import { streamText } from '@outputai/llm';
import { GenerateContentInput, GenerateContentOutput } from './types.js';

export const generateContent = step({
  name: 'generateContent',
  description: 'Streams text generation and collects chunks',
  inputSchema: GenerateContentInput,
  outputSchema: GenerateContentOutput,
  fn: async ({ topic }) => {
    const result = streamText({
      prompt: 'stream_content@v1',
      variables: { topic }
    });

    const chunks: string[] = [];
    for await (const chunk of result.textStream) {
      chunks.push(chunk);
    }

    const content = chunks.join('');
    return {
      content,
      chunkCount: chunks.length,
      avgChunkSize: Math.round(content.length / chunks.length)
    };
  }
});
Note that streamText is not async — it returns a stream result synchronously. Iterate textStream to process chunks as they arrive. You can also await result.text to get the full text in one shot, but that collapses the stream and is functionally identical to generateText. To process chunks with side effects (e.g., writing to stdout):
const result = streamText({
  prompt: 'generate@v1',
  variables: { topic: 'AI safety' }
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
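To make the promise-based properties concrete, here is a rough illustration (not the package's implementation) of how a stream result can expose both an async iterable of chunks and a text promise. In this sketch the promise resolves only once textStream is drained; the real streamText resolves text even if you await it directly, by consuming the stream internally.

```typescript
// Illustration only: a stream result with an async iterable of chunks and
// a scalar `text` property that resolves when the stream completes.
function makeStreamResult(chunks: string[]) {
  let resolveText!: (full: string) => void;
  const text = new Promise<string>((resolve) => { resolveText = resolve; });

  async function* generate(): AsyncGenerator<string> {
    const collected: string[] = [];
    for (const chunk of chunks) {
      collected.push(chunk);
      yield chunk;
    }
    // Resolve the scalar property only after the last chunk is emitted
    resolveText(collected.join(''));
  }

  return { textStream: generate(), text };
}
```
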
You can apply stream transforms like smoothStream for more natural output pacing:
import { streamText, smoothStream } from '@outputai/llm';

const result = streamText({
  prompt: 'generate@v1',
  variables: { topic },
  experimental_transform: smoothStream()
});
Use streaming callbacks for side effects without consuming the stream manually:
const result = streamText({
  prompt: 'generate@v1',
  variables: { topic },
  onChunk({ chunk }) {
    // Called for each chunk
  },
  onFinish({ text, usage }) {
    // Called when generation completes
  },
  onError({ error }) {
    // Called on stream error
  }
});

Object Output

Generate a structured object matching a Zod schema. This is what you’ll use most in evaluators:
evaluators.ts
import { evaluator, EvaluationBooleanResult } from '@outputai/core';
import { generateText, Output } from '@outputai/llm';
import { z } from '@outputai/core';
import { JudgeSummaryInput } from './types.js';

export const judgeSummaryQuality = evaluator({
  name: 'judgeSummaryQuality',
  description: 'Judge whether a company summary is accurate and useful',
  inputSchema: JudgeSummaryInput,
  fn: async (input) => {
    const { output } = await generateText({
      prompt: 'judge_summary@v1',
      variables: {
        summary: input.summary,
        companyName: input.companyName
      },
      output: Output.object({
        schema: z.object({
          reasoning: z.string(),
          passes: z.boolean(),
          confidence: z.number()
        })
      })
    });

    return new EvaluationBooleanResult({
      value: output.passes,
      confidence: output.confidence,
      reasoning: output.reasoning
    });
  }
});

// types.ts
// import { z } from '@outputai/core';
//
// export const JudgeSummaryInput = z.object({
//   summary: z.string(),
//   companyName: z.string()
// });
output contains the typed object matching your schema.

Array Output

Generate an array of structured items:
import { generateText, Output } from '@outputai/llm';
import { z } from '@outputai/core';

const { output } = await generateText({
  prompt: 'extract_contacts@v1',
  variables: { companyData: JSON.stringify(company) },
  output: Output.array({
    element: z.object({
      name: z.string(),
      role: z.string(),
      email: z.string().optional()
    })
  })
});

// output is an array of { name, role, email } objects

Choice Output

Select one value from a set of options:
import { generateText, Output } from '@outputai/llm';

const { output } = await generateText({
  prompt: 'classify_lead@v1',
  variables: { activity: leadActivity },
  output: Output.choice({ options: ['hot', 'warm', 'cold', 'unknown'] })
});

// output is one of 'hot', 'warm', 'cold', 'unknown'

Agents

The Agent class wraps AI SDK’s ToolLoopAgent with Output prompt files and the skills system. Use it when you need multi-step tool execution, conversation history, or a reusable agent instance with a fixed configuration. For single-shot LLM calls without tools, generateText is simpler.

Construction

The prompt file is loaded and rendered at construction time. Variables, skills, and tools are fixed at construction. The agent is ready to call generate() or stream() immediately.
steps.ts
import { step } from '@outputai/core';
import { Agent, Output, skill } from '@outputai/llm';
import { z } from '@outputai/core';
import { ReviewContentInput, ReviewContentOutput } from './types.js';

const reviewSchema = z.object({
  issues: z.array(z.string()).describe('List of issues found'),
  suggestions: z.array(z.string()).describe('Actionable suggestions'),
  score: z.number().describe('Quality score 0-100'),
  summary: z.string().describe('Brief overall assessment')
});

const audienceSkill = skill({
  name: 'audience_adaptation',
  description: 'Tailor feedback for the specified expertise level',
  instructions: '# Audience Adaptation\n...'
});

export const reviewContent = step({
  name: 'reviewContent',
  description: 'Review content with structured feedback',
  inputSchema: ReviewContentInput,
  outputSchema: ReviewContentOutput,
  fn: async (input) => {
    const agent = new Agent({
      prompt: 'writing_assistant@v1',
      variables: {
        content_type: input.contentType,
        focus: input.focus,
        content: input.content
      },
      skills: [audienceSkill],
      output: Output.object({ schema: reviewSchema }),
      maxSteps: 5
    });
    const { output } = await agent.generate();
    return output;
  }
});
Constructor options:
| Option | Type | Default | Description |
|---|---|---|---|
| prompt | string | (required) | Prompt file name (e.g. 'writing_assistant@v1') |
| variables | Record<string, unknown> | {} | Template variables rendered at construction |
| skills | Skill[] | [] | Skill packages for the LLM |
| tools | ToolSet | {} | AI SDK tools available during the loop |
| maxSteps | number | 10 | Maximum tool-loop iterations |
| stopWhen | StopCondition | - | Custom stop condition (overrides maxSteps) |
| output | Output | - | Structured output spec (e.g. Output.object({ schema })) |
| conversationStore | ConversationStore | - | Pluggable store for multi-turn history |
| temperature | number | - | Override prompt file temperature |
| onStepFinish | Function | - | Callback after each tool-loop step |
| prepareStep | Function | - | Customize each step before execution |

generate()

Run the agent and return when complete:
const result = await agent.generate();
console.log(result.text);   // Generated text
console.log(result.output); // Structured output (when using Output.object)
console.log(result.usage);  // Token counts
The result has the same shape as generateText: text, result (alias for text), output, usage, finishReason, toolCalls, etc. Pass additional messages to extend the conversation:
const result = await agent.generate({
  messages: [{ role: 'user', content: 'Now focus on the introduction section.' }]
});

stream()

Stream the agent’s response:
const stream = await agent.stream();

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
Like streamText, the stream result provides textStream and fullStream iterables, plus promise-based properties (text, usage, finishReason) that resolve on completion.

Structured Output

Use Output.object() with Agent to get typed responses:
steps.ts
import { Agent, Output } from '@outputai/llm';
import { z } from '@outputai/core';

const reviewSchema = z.object({
  issues: z.array(z.string()).describe('List of issues found'),
  suggestions: z.array(z.string()).describe('Actionable suggestions'),
  score: z.number().describe('Quality score 0-100'),
  summary: z.string().describe('Brief overall assessment')
});

const agent = new Agent({
  prompt: 'writing_assistant@v1',
  variables: { content_type: 'documentation', focus: 'clarity', content: markdownContent },
  output: Output.object({ schema: reviewSchema }),
  maxSteps: 5
});

const { output } = await agent.generate();
// output: { issues: string[], suggestions: string[], score: number, summary: string }

Conversation Store

By default, Agent is stateless. Each generate() call starts fresh with only the initial prompt messages. Pass a conversationStore to maintain history across calls:
import { Agent, createMemoryConversationStore } from '@outputai/llm';

const store = createMemoryConversationStore();
const chatbot = new Agent({
  prompt: 'chatbot@v1',
  conversationStore: store
});

const r1 = await chatbot.generate({
  messages: [{ role: 'user', content: 'Hello, tell me about Output.' }]
});
// r1.text: "Output is an AI framework for..."

const r2 = await chatbot.generate({
  messages: [{ role: 'user', content: 'How does it handle retries?' }]
});
// r2 sees the full conversation history from r1
For custom storage backends, implement the ConversationStore interface:
interface ConversationStore {
  getMessages(): ModelMessage[] | Promise<ModelMessage[]>;
  addMessages(messages: ModelMessage[]): void | Promise<void>;
}
createMemoryConversationStore() is the built-in in-memory implementation. For production, implement the interface with your database.
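As a minimal sketch of a custom store (with ModelMessage simplified here to a role/content pair; the real type comes from the AI SDK), an array-backed implementation of the interface might look like:

```typescript
// Sketch of a custom ConversationStore. ModelMessage is simplified to a
// role/content pair for illustration; in practice it is the AI SDK type.
type ModelMessage = { role: 'system' | 'user' | 'assistant'; content: string };

interface ConversationStore {
  getMessages(): ModelMessage[] | Promise<ModelMessage[]>;
  addMessages(messages: ModelMessage[]): void | Promise<void>;
}

class ArrayConversationStore implements ConversationStore {
  private history: ModelMessage[] = [];

  getMessages(): ModelMessage[] {
    return [...this.history]; // copy so callers cannot mutate stored history
  }

  addMessages(messages: ModelMessage[]): void {
    this.history.push(...messages);
  }
}
```

A database-backed store would implement the same two methods, returning promises instead of plain values.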
stream() does not automatically append messages to the conversation store. If you use streaming with a conversation store, persist messages manually in the onFinish callback.

When to Use Agent vs generateText

| | generateText | Agent |
|---|---|---|
| Best for | Single-shot LLM calls | Multi-step tool loops |
| Tools | Supported | Supported |
| Skills | Supported | Supported |
| Conversation history | Manual | Built-in with conversationStore |
| Reusable instance | No (function call) | Yes (construct once, call many) |
| Structured output | Output.object() | Output.object() |
Start with generateText. Move to Agent when you need conversation state or a reusable instance with a fixed configuration.

Response Object

generateText returns the full AI SDK response:
| Field | Description |
|---|---|
| result | Convenience alias for text |
| text | The raw generated text |
| output | The structured output when using Output.* helpers |
| usage | Token counts: inputTokens, outputTokens, totalTokens |
| finishReason | Why generation stopped ('stop', 'length', 'tool-calls', etc.) |
| response | Raw provider response metadata |
| warnings | Any warnings from the provider |
| toolCalls | Tool calls made by the model (when using tools) |
Streaming response shape. streamText returns a different result type. Stream iterables (textStream, fullStream) provide real-time chunks, while scalar properties (text, usage, finishReason, etc.) are promises that resolve when the stream completes:
| Field | Type | Description |
|---|---|---|
| textStream | AsyncIterable<string> | Async iterable of text chunks |
| fullStream | AsyncIterable<TextStreamPart> | Async iterable of all stream events (text deltas, tool calls, etc.) |
| text | Promise<string> | Full text, resolved on completion |
| usage | Promise<LanguageModelUsage> | Token counts, resolved on completion |
| finishReason | Promise<FinishReason> | Why generation stopped, resolved on completion |
| toolCalls | Promise | Tool calls made during streaming, resolved on completion |
| response | Promise | Raw provider response metadata |
| warnings | Promise | Any warnings from the provider |

Prompt Files

Instead of hardcoding model config and messages in your code, you write .prompt files that live in your workflow’s prompts/ folder. See the Prompts Guide for the full documentation.
prompts/generate_summary@v1.prompt
---
provider: anthropic
model: claude-sonnet-4-20250514
temperature: 0.7
---

<system>
You write concise company summaries for sales teams.
</system>

<user>
Write a 2-3 paragraph summary of {{ companyName }}.

Industry: {{ industry }}
Company size: {{ size }} employees
</user>
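The {{ … }} placeholders are filled from the variables you pass at call time. As a rough illustration of the substitution (not the package's actual template engine, which the Prompts Guide documents):

```typescript
// Rough illustration of {{ variable }} substitution. Unknown placeholders
// are left intact rather than replaced with an empty string.
function renderTemplate(template: string, variables: Record<string, unknown>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) =>
    name in variables ? String(variables[name]) : match
  );
}
```
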

Configuration Options

| Option | Type | Description |
|---|---|---|
| provider | string | anthropic, openai, azure, vertex, or bedrock |
| model | string | Model identifier |
| temperature | number | Sampling temperature (0.0-2.0) |
| maxTokens | number | Maximum output tokens |
| tools | object | Provider-specific tools (web search, etc.) |
| providerOptions | object | Provider-specific options — see ProviderOptions Guide |

Providers

Anthropic

---
provider: anthropic
model: claude-sonnet-4-20250514
---
Requires ANTHROPIC_API_KEY environment variable.

OpenAI

---
provider: openai
model: gpt-4o
---
Requires OPENAI_API_KEY environment variable.

Azure OpenAI

---
provider: azure
model: gpt-4o
---
Requires AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and AZURE_OPENAI_API_VERSION.

Vertex AI

---
provider: vertex
model: gemini-1.5-pro
---
Requires Google Cloud authentication and configuration.

Amazon Bedrock

---
provider: bedrock
model: anthropic.claude-sonnet-4-20250514-v1:0
---
Requires AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION) or IAM role-based authentication. Set AWS_SESSION_TOKEN when using temporary credentials (e.g., from aws sts assume-role).

For cross-region inference, use the regional inference profile format: us.anthropic.claude-sonnet-4-20250514-v1:0.

Always set maxTokens in your Bedrock prompt files. Unlike the direct Anthropic provider (which auto-detects per-model limits), the Bedrock SDK has no client-side defaults and relies on server-side defaults that may be lower than the model's capacity.

When using providerOptions, use the bedrock namespace (not anthropic):
providerOptions:
  bedrock:
    guardrailConfig:
      guardrailIdentifier: my-guardrail
      guardrailVersion: "1"

Provider Tools

Many providers offer built-in tools like web search. Configure them in YAML front matter:
prompts/research@v1.prompt
---
provider: vertex
model: gemini-2.0-flash
tools:
  googleSearch:
    mode: MODE_DYNAMIC
    dynamicThreshold: 0.8
---

<user>
Research {{ topic }} and provide sources
</user>
This is equivalent to calling vertex.tools.googleSearch({ mode: 'MODE_DYNAMIC', dynamicThreshold: 0.8 }) at the code level, but keeps your prompt self-contained. YAML tools are merged with code-level tools, so you can combine provider tools (from YAML) with custom tools (from code); code-level tools take precedence when names conflict. For provider-specific tool options, see each provider's documentation.

Tool Calling

Use tools with generateText to enable function calling:
import { generateText, tool } from '@outputai/llm';
import { z } from '@outputai/core';

const { result, toolCalls } = await generateText({
  prompt: 'agent@v1',
  variables: { task: 'Research competitor pricing' },
  tools: {
    searchWeb: tool({
      description: 'Search the web for information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => fetchSearchResults(query) // fetchSearchResults is your own implementation
    })
  },
  toolChoice: 'auto'
});

AI SDK Pass-Through Options

All generate functions accept additional AI SDK options passed through to the provider:
| Option | Type | Description |
|---|---|---|
| tools | ToolSet | Tools the model can call (generateText and streamText) |
| toolChoice | 'auto' \| 'none' \| 'required' | Tool selection strategy |
| maxRetries | number | Max retry attempts (default: 2) |
| seed | number | Seed for deterministic output |
| abortSignal | AbortSignal | Cancel the request |
| topP | number | Nucleus sampling (0-1) |
| topK | number | Top-K sampling |
| onChunk | Function | Callback for each stream chunk (streamText only) |
| onFinish | Function | Callback when stream completes (streamText only) |
| onError | Function | Callback on stream error (streamText only) |
| experimental_transform | Function | Stream transform, e.g. smoothStream() (streamText only) |
Options set in the prompt file (temperature, maxTokens) can be overridden at call time.

LLM Call Cost Event

Each generateText and streamText call emits an llm:call_cost event after the LLM responds. You can observe it with the same hooks mechanism as error hooks: register a handler with on('llm:call_cost', handler) from @outputai/core/hooks in a hook file listed under output.hookFiles. The payload includes token usage, computed cost, workflow/activity context, and model id. For payload details, setup, and cost structure, see Cost Events.
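As a sketch of what such a handler might do, here is a per-model cost accumulator. The payload field names (model, usage, costUsd) are illustrative assumptions, not the package's actual shape — see the Cost Events page for the real payload.

```typescript
// Hypothetical payload shape for illustration; check Cost Events for the
// actual fields emitted by llm:call_cost.
type CallCostPayload = {
  model: string;
  usage: { inputTokens: number; outputTokens: number; totalTokens: number };
  costUsd: number;
};

const costTotals = new Map<string, number>();

function onCallCost(payload: CallCostPayload): void {
  // Accumulate spend per model id
  costTotals.set(payload.model, (costTotals.get(payload.model) ?? 0) + payload.costUsd);
}

// In a hook file you would register it, e.g.:
// on('llm:call_cost', onCallCost);
```
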

loadPrompt

Load and render a prompt file without generating — useful for debugging:
import { loadPrompt } from '@outputai/llm';

const prompt = loadPrompt('generate_summary@v1', {
  companyName: 'Acme Corp',
  industry: 'SaaS',
  size: 250
});

console.log(prompt.config);   // { provider: 'anthropic', model: '...', temperature: 0.7 }
console.log(prompt.messages);  // Rendered message array

API Reference

For complete TypeScript API documentation, see the LLM Module API Reference.