LLM Integration¶

Strata Notebook has two ways to use LLMs: prompt cells (declarative, part of the DAG) and the AI assistant (conversational, in a sidebar panel). Both use the same provider configuration and support any OpenAI-compatible API.

Configuration¶

Set an API key in the Runtime panel under Environment Variables. The key determines which provider is used:

Environment Variable	Provider	Default Model
`ANTHROPIC_API_KEY`	Anthropic	claude-sonnet-4-6
`OPENAI_API_KEY`	OpenAI	gpt-5.4
`GEMINI_API_KEY`	Google	gemini-3-flash
`MISTRAL_API_KEY`	Mistral	mistral-large-latest
`STRATA_AI_API_KEY`	Custom (requires `[ai]` config)	—

Resolution order (highest priority wins):

notebook.toml [ai] section — per-notebook override
Runtime panel env vars — set in the UI
Server config (STRATA_AI_* env vars) — admin default

Process environment is not consulted

A shell-exported OPENAI_API_KEY does not leak into notebooks. This is intentional — each notebook must explicitly opt in to an LLM provider. See the Annotations page for how env vars flow.

Custom Provider Configuration¶

For self-hosted models (Ollama, vLLM) or custom endpoints, add an [ai] section to notebook.toml:

[ai]
base_url = "http://localhost:11434/v1"
model = "llama3"

The recommended place for secrets is still the Runtime panel env vars, but the [ai] section can also override advanced fields such as:

api_key
base_url
model
max_context_tokens
max_output_tokens
timeout_seconds

Use [ai].api_key sparingly, because it persists in notebook.toml.

Supported Providers¶

Any service that implements the OpenAI /v1/chat/completions endpoint works, including:

OpenAI (GPT-4o, GPT-4, GPT-3.5)
Anthropic (Claude, via their OpenAI-compatible endpoint)
Google (Gemini, via their OpenAI-compatible endpoint)
Mistral (Mistral Large, Codestral)
Ollama (local models)
vLLM, TGI, LiteLLM (self-hosted)

Prompt Cells¶

Prompt cells are notebook cells with language="prompt". They render a text template, call the LLM, and store the response as an artifact — just like Python cells, they participate in the DAG and cache by provenance.

Basic Syntax¶

A prompt cell is plain text with {{ variable }} template injection:

# @name summary
Summarize this dataset:

{{ df }}

Return 3 key findings as a numbered list.

The {{ df }} placeholder is replaced with a text representation of the upstream variable df before sending to the LLM. The response is stored as an artifact named summary (from the @name annotation).

Template Variables¶

Variables are injected using {{ expression }} syntax. The expression is resolved against upstream cell outputs:

Upstream Type	Text Representation
pandas DataFrame	Markdown table (first 20 rows)
pandas Series	String representation (first 20 values)
numpy ndarray	Shape + dtype + first 10 elements
dict / list	JSON (indented, truncated)
str / int / float	Direct string conversion

Each variable has a token budget (default: 2,000 tokens). If the text representation exceeds the budget, it's truncated with a ... (truncated) marker.

Attribute access is supported for safe, read-only operations:

{{ df.describe() }}     # OK — pandas describe() is allowed
{{ df.head() }}         # OK — pandas head() is allowed
{{ obj.value }}         # OK — attribute access (non-callable)
{{ obj.mutate() }}      # BLOCKED — unknown method, left as-is

Only a small set of known-safe methods are allowed (describe, head, tail on pandas DataFrames/Series). Arbitrary method calls are blocked to prevent side effects in template rendering.

Prompt Cell Annotations¶

Annotation	Description	Default
`@name`	Output variable name (must be a Python identifier)	`result`
`@model`	Override the LLM model	From provider config
`@temperature`	Sampling temperature (0.0 = deterministic)	`0.0`
`@max_tokens`	Maximum response tokens	`4096`
`@system`	System prompt prepended to the request	None

Example with all annotations:

# @name classification
# @model gpt-4o
# @temperature 0.0
# @max_tokens 1000
# @system You are a data scientist. Return only valid JSON.
Classify each paper by topic:

{{ sampled_papers }}

Return a JSON object mapping paper ID to topic.

Caching¶

Prompt cells cache like any other cell. The cache key includes:

The rendered template text (after variable injection)
The model name
The temperature
The system prompt
The output type

If you re-run a prompt cell with the same inputs and the same model config, the cached response is returned instantly — no LLM call is made.

AI Assistant¶

The AI assistant is a sidebar panel (toggle with the AI Assistant button) that provides conversational access to an LLM. It operates outside the DAG — it doesn't create artifacts or participate in caching.

Chat Mode (Enter)¶

Type a message and press Enter. The assistant streams a response with full conversation context:

Conversation memory: prior turns are sent back to the LLM so follow-up questions work ("give an example of that", "now do it for column X")
Notebook context: the current notebook state (cell sources, variable definitions, packages) is included in every request as a system prompt
Cell context: optionally select a cell from the dropdown to focus the conversation on that cell's code and errors
Code insertion: assistant responses with fenced code blocks show an "Insert Cell" button to add the code as a new notebook cell

The conversation resets when you click "Clear" or reload the page. History is session-only (not persisted to disk).

Agent Mode (Shift+Enter)¶

Type an instruction and press Shift+Enter. The agent autonomously takes actions on the notebook:

Available tools:

Tool	Description
`get_notebook_state`	Read all cells, variables, and execution status
`create_cell`	Add a new Python or prompt cell
`edit_cell`	Modify an existing cell's source
`delete_cell`	Remove a cell
`run_cell`	Execute a cell and observe the result
`add_package`	Install a Python package via uv

The agent runs as a background task with a 10-iteration limit. Progress events appear in the panel as they happen. You can cancel a running agent with the Cancel button.

Example agent instructions:

"Add a cell that loads the iris dataset and prints its shape"
"Install pandas and create a simple data analysis"
"Fix the error in cell c3 and run it again"

The agent works best for additive tasks (creating new cells, installing packages). For complex refactoring, use Chat mode to discuss the approach first.