Inference Providers

When you write infer in a Turn program, the VM does not hard-code any provider's HTTP API. Instead, it delegates request construction and response parsing to a Wasm inference driver — a sandboxed WebAssembly plugin that knows how to talk to a specific LLM provider. The VM itself still executes the actual HTTP call.

This page explains how the system works, how to configure it, and how to write your own.


The Architecture

Turn's inference pipeline is built on a strict security boundary: the Turn VM is the only component that can access the network. The Wasm driver is purely computational.

the dual-pass pipeline
Turn VM (Host)
│
│  (1) Passes Turn Inference Request JSON to Wasm module
▼
Wasm Driver (sandboxed)
│
│  (2) Returns HTTP Config JSON — URL, headers (with $env: templates), body
│      The driver CANNOT access the network or filesystem
▼
Turn VM (Host)
│
│  (3) Substitutes $env:OPENAI_API_KEY → real value from process environment
│  (4) Executes the HTTPS call via reqwest
│  (5) Passes raw HTTP response JSON back to Wasm module
▼
Wasm Driver (sandboxed)
│
│  (6) Parses HTTP response → structured Turn result JSON
▼
Turn VM (Host)
│
└──▶  result bound to the infer expression
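
For concreteness, here is a sketch of the Turn Inference Request the driver might receive in step (1). Only params.prompt and params.schema are consumed by the template driver shown later on this page; the envelope fields are assumed from the JSON-RPC result format the drivers emit, and all values are illustrative.

illustrative Turn Inference Request (pass 1 input)
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "infer",
  "params": {
    "prompt": "Summarize these meeting notes in one sentence.",
    "schema": {
      "type": "object",
      "properties": { "summary": { "type": "string" } },
      "required": ["summary"]
    }
  }
}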

Why this matters:

  • A Wasm driver cannot read your SSH keys, scan your disk, or exfiltrate your API keys. The WebAssembly sandbox prevents all system calls.
  • Credentials are never in driver code. The driver writes $env:OPENAI_API_KEY as a template string. The Host substitutes the real value before making the HTTP call.
  • A single .wasm file runs everywhere — macOS, Linux, Windows — wherever the Turn VM runs. No native binary distribution per platform.
  • Microsecond cold starts. Wasm modules initialize in under 100μs vs. 10–50ms for an OS subprocess.
INVARIANT: Driver Sandbox Invariant

A Wasm inference driver is a pure transformation function: JSON string → JSON string. It has no host imports. It cannot access the network, filesystem, environment variables, or system clock directly.
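
One practical consequence: you can audit this property directly. With the WABT toolkit installed, wasm-objdump prints the module's import section, which for a conforming driver lists no host functions (the path below is illustrative):

terminal
wasm-objdump -x ~/.turn/providers/turn_provider_openai.wasm | grep -A 3 'Import'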


Configuring a Provider

Set TURN_INFER_PROVIDER to the absolute path of a compiled .wasm driver:

terminal
export TURN_INFER_PROVIDER=~/.turn/providers/turn_provider_openai.wasm
export OPENAI_API_KEY=sk-...

turn run my_agent.tn

The provider path is resolved once at VM startup. All infer calls in the program use the same provider.


Official Providers

All official drivers are compiled to wasm32-unknown-unknown and available in the Turn repository under providers/:

Standard OpenAI

Connects to api.openai.com. Uses OpenAI's structured outputs (JSON Schema mode) for Cognitive Type Safety.

configuration
export TURN_INFER_PROVIDER=~/.turn/providers/turn_provider_openai.wasm
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=gpt-4o        # optional, default: gpt-4o
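
The "JSON Schema mode" mentioned above refers to OpenAI's response_format field. The body the driver emits wraps your expected output type roughly like this (shape per OpenAI's public structured-outputs API; names and values are illustrative):

response_format for structured outputs (illustrative)
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "turn_result",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": { "summary": { "type": "string" } },
        "required": ["summary"],
        "additionalProperties": false
      }
    }
  }
}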

Azure OpenAI

Connects to your Azure OpenAI deployment endpoint.

configuration
export TURN_INFER_PROVIDER=~/.turn/providers/turn_provider_azure_openai.wasm
export AZURE_OPENAI_ENDPOINT=https://my-resource.openai.azure.com
export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_DEPLOYMENT=gpt-4o

Azure AI Foundry Anthropic

Connects to Anthropic's Claude via Azure AI Foundry (not the direct Anthropic API).

configuration
export TURN_INFER_PROVIDER=~/.turn/providers/turn_provider_azure_anthropic.wasm
export AZURE_ANTHROPIC_ENDPOINT=https://my-foundry-resource.azure.com
export AZURE_ANTHROPIC_API_KEY=...

The $env: Template Syntax

Wasm drivers use $env:VARIABLE_NAME placeholders in their HTTP config output. The Turn Host resolves these before executing the request:

HTTP config returned by a Wasm driver
{
  "url": "https://api.openai.com/v1/chat/completions",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer $env:OPENAI_API_KEY",
    "Content-Type": "application/json"
  },
  "body": { "model": "$env:OPENAI_MODEL", "messages": [...] }
}

After substitution, the HTTP request the Host sends uses your real credentials — but the .wasm file itself never contains or reads them.
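
The substitution pass itself is conceptually simple: walk the config JSON and rewrite every string. Below is a minimal sketch of what the host might do, assuming plain string templates and serde_json (the real VM's implementation may differ):

host-side substitution (illustrative sketch)
use serde_json::Value;

// Replace each "$env:NAME" occurrence with the value of NAME from the
// process environment (empty string if the variable is unset).
fn substitute_env_str(s: &str) -> String {
  let mut out = String::new();
  let mut rest = s;
  while let Some(idx) = rest.find("$env:") {
    out.push_str(&rest[..idx]);
    let after = &rest[idx + 5..];
    let end = after
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(after.len());
    out.push_str(&std::env::var(&after[..end]).unwrap_or_default());
    rest = &after[end..];
  }
  out.push_str(rest);
  out
}

// Recursively rewrite every string value in the HTTP config.
fn substitute_env(value: &mut Value) {
  match value {
      Value::String(s) => *s = substitute_env_str(s),
      Value::Array(items) => items.iter_mut().for_each(substitute_env),
      Value::Object(map) => map.values_mut().for_each(substitute_env),
      _ => {}
  }
}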


Writing Your Own Provider

A Turn inference driver is a Rust cdylib compiled to wasm32-unknown-unknown. It must export exactly three C-ABI functions:

provider_template/src/lib.rs
use serde_json::{json, Value};

// Memory management — the Turn host calls this to allocate space for JSON strings
#[no_mangle]
pub extern "C" fn alloc(len: u32) -> u32 {
  let mut buf: Vec<u8> = Vec::with_capacity(len as usize);
  let ptr = buf.as_mut_ptr();
  std::mem::forget(buf);
  ptr as usize as u32
}

// Pass 1: Turn Request → HTTP Config
// Input:  JSON string (Turn Inference Request)
// Output: JSON string (HTTP Request Config with $env: templates)
// Returns: packed u64 = (ptr << 32) | len
#[no_mangle]
pub unsafe extern "C" fn transform_request(ptr: u32, len: u32) -> u64 {
  let input = read_string(ptr, len);
  let req: Value = serde_json::from_str(&input).unwrap();

  let prompt = req["params"]["prompt"].as_str().unwrap_or("");
  let schema = &req["params"]["schema"];

  let body = json!({
      "model": "$env:MY_MODEL",
      "messages": [{"role": "user", "content": prompt}],
      "response_format": { "type": "json_object", "schema": schema }
  });

  let config = json!({
      "url": "https://my-llm-provider.com/v1/completions",
      "method": "POST",
      "headers": { "Authorization": "Bearer $env:MY_API_KEY" },
      "body": body
  });

  pack_string(config.to_string())
}

// Pass 2: HTTP Response → Turn Result
// Input:  JSON string (HTTP response: { status, headers, body })
// Output: JSON string (JSON-RPC result: { jsonrpc, id, result } or { error })
#[no_mangle]
pub unsafe extern "C" fn transform_response(ptr: u32, len: u32) -> u64 {
  let input = read_string(ptr, len);
  let http_res: Value = serde_json::from_str(&input).unwrap();

  if http_res["status"].as_u64().unwrap_or(0) != 200 {
      return pack_string(json!({
          "jsonrpc": "2.0", "id": 1,
          "error": format!("HTTP {}: {}", http_res["status"], http_res["body"])
      }).to_string());
  }

  // Parse provider-specific response format
  let response: Value = serde_json::from_str(
      http_res["body"].as_str().unwrap_or("{}")
  ).unwrap_or(json!({}));

  let content = response["choices"][0]["message"]["content"].as_str().unwrap_or("{}");
  let result: Value = serde_json::from_str(content).unwrap_or(json!(content));

  pack_string(json!({ "jsonrpc": "2.0", "id": 1, "result": result }).to_string())
}

// ── Helpers ──────────────────────────────────────────────────────────────────

// Reconstruct (and take ownership of) the buffer the host filled after
// calling `alloc`; the memory is freed when `buf` drops at the end of scope.
unsafe fn read_string(ptr: u32, len: u32) -> String {
  let buf = Vec::from_raw_parts(ptr as *mut u8, len as usize, len as usize);
  String::from_utf8_lossy(&buf).into_owned()
}

// Pack an output string as (ptr << 32) | len. The buffer is deliberately
// leaked so the bytes stay valid in linear memory while the host copies them.
fn pack_string(s: String) -> u64 {
  let len = s.len() as u64;
  let mut buf = s.into_bytes();
  let ptr = buf.as_mut_ptr() as u64;
  std::mem::forget(buf);
  (ptr << 32) | len
}
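
For cargo to emit a .wasm library, the crate must be declared as a cdylib. A minimal manifest might look like this (crate name and dependency version are illustrative; the name determines the output file used below):

provider_template/Cargo.toml
[package]
name = "my_provider"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
serde_json = "1"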

Build it:

terminal
# Install the Wasm target if you haven't already
rustup target add wasm32-unknown-unknown

# Build
cargo build --target wasm32-unknown-unknown --release

# The driver is at:
ls target/wasm32-unknown-unknown/release/my_provider.wasm

# Use it
export TURN_INFER_PROVIDER=$(pwd)/target/wasm32-unknown-unknown/release/my_provider.wasm

TIP

Because providers are pure JSON transformers, they can target any HTTP API — a local Ollama server, llama.cpp, a private inference cluster, or a custom gateway. The Wasm model means the Turn community can build and distribute drivers for any provider without touching the core VM.
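
As a concrete example of that flexibility, the request half of a driver targeting a local Ollama server could be as small as the sketch below. The endpoint and body fields follow Ollama's /api/chat API; verify them against your Ollama version.

ollama driver sketch (request half only)
use serde_json::{json, Value};

// Build the HTTP config for a local Ollama server. No API key is needed
// locally, so the only $env: template left for the host is the model name.
fn ollama_http_config(req: &Value) -> Value {
  let prompt = req["params"]["prompt"].as_str().unwrap_or("");
  json!({
      "url": "http://localhost:11434/api/chat",
      "method": "POST",
      "headers": { "Content-Type": "application/json" },
      "body": {
          "model": "$env:OLLAMA_MODEL",
          "messages": [{ "role": "user", "content": prompt }],
          "stream": false,
          "format": "json"
      }
  })
}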


Next Steps

  • The infer Primitive — How infer uses providers and what Cognitive Type Safety means
  • Memory & Context — How Turn enriches inference calls with semantic memory automatically