
LangGraph / LangChain (Ollama)

Start a LangGraph or LangChain project with aiproxy in the loop. The starter below mirrors our internal LangGraph + Ollama + aiproxy demo: one node calls Ollama, every request carries X-Proxy-Chat-Session, and all traffic routes through the proxy so you can inspect it.

Prerequisites

  • aiproxy running locally (examples use http://localhost:8080)
  • The aiproxy certificate authority installed (see Quick Start → TLS certificate)
  • Python 3.10+ with pip or uv
  • Either a local ollama serve (http://127.0.0.1:11434/) or hosted api.ollama.com
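
Before wiring anything into LangGraph, it can help to confirm both endpoints answer. Below is a minimal, optional sketch using only the standard library; it assumes the default addresses listed above (aiproxy on localhost:8080, local Ollama on 127.0.0.1:11434) and is not part of the demo files.

# check_prereqs.py — optional reachability check for the prerequisites above.
import urllib.error
import urllib.request

def check(name: str, url: str) -> None:
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            print(f"{name}: HTTP {resp.status} from {url}")
    except urllib.error.HTTPError as exc:
        # An HTTP error status still proves something is listening there.
        print(f"{name}: responded with HTTP {exc.code} at {url}")
    except OSError as exc:
        print(f"{name}: unreachable ({exc})")

check("aiproxy", "http://localhost:8080/")          # any response means the proxy port is open
check("ollama", "http://127.0.0.1:11434/api/tags")  # Ollama's model-list endpoint

Skip the hosted api.ollama.com check here; Option B in Step 2 covers that path.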

Step 1 · Bootstrap the sample app

Create a folder and add three files modeled after our ollama-test example:

requirements.txt

langgraph
langchain-core
langchain-ollama
python-dotenv

aiproxy.toml (routes local Ollama through /ollama)

[[routes]]
path = "/ollama"
type = "ollama"
upstream = "http://127.0.0.1:11434/"

main.py — broken into small steps.

Imports and shared header constant

import argparse, os, uuid, operator
from typing import Annotated, Dict, List, TypedDict

from dotenv import load_dotenv
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage
from langchain_core.runnables import RunnableConfig
from langchain_ollama import ChatOllama
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph

State type used by LangGraph

class GraphState(TypedDict):
    messages: Annotated[List[BaseMessage], operator.add]

Trust the proxy CA when MITM is enabled

def set_proxy_ca():
    ca = os.getenv("PROXY_CA")
    if ca:
        ca = os.path.expanduser(os.path.expandvars(ca))
        os.environ.setdefault("SSL_CERT_FILE", ca)
        os.environ.setdefault("REQUESTS_CA_BUNDLE", ca)

Build a ChatOllama client that always sends the session header

def build_model(base_url: str, model: str, session_id: str) -> ChatOllama:
    # This header lets aiproxy thread every request into one conversation.
    # Without it, the proxy sees unrelated calls and can't group your logs.
    headers: Dict[str, str] = {"X-Proxy-Chat-Session": session_id}
    token = os.getenv("OLLAMA_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return ChatOllama(
        model=model,
        base_url=base_url.rstrip("/"),
        client_kwargs={"headers": headers},
    )

Why the header matters: aiproxy groups telemetry by X-Proxy-Chat-Session. Setting it here keeps all LangGraph turns for a run in one timeline—even if the graph forks or retries—so you can search a single UUID in the proxy UI/logs.
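
Headers like this are visible to the proxy on any request, not just the ones LangChain makes. As an optional illustration, the sketch below sends one raw Ollama /api/chat request through the /ollama route with the same header; the exact path mapping (/ollama/api/chat forwarding to the local server's /api/chat) is an assumption based on the aiproxy.toml above, so adjust it if your route behaves differently.

# raw_request.py — optional: one manual chat call that carries the same
# X-Proxy-Chat-Session header build_model() attaches via client_kwargs.
import json
import urllib.request
import uuid

session_id = uuid.uuid4().hex
body = json.dumps({
    "model": "llama3",
    "messages": [{"role": "user", "content": "hello through the proxy"}],
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:8080/ollama/api/chat",   # assumed mapping of the /ollama route
    data=body,
    headers={
        "Content-Type": "application/json",
        "X-Proxy-Chat-Session": session_id,    # the header aiproxy groups telemetry on
    },
)
with urllib.request.urlopen(req, timeout=120) as resp:
    print(json.load(resp)["message"]["content"])
print("filter the proxy UI/logs on:", session_id)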

Assemble a one-node LangGraph app

def build_app(base_url: str, model: str):
    memory = MemorySaver()
    workflow = StateGraph(GraphState)
    agents: Dict[str, ChatOllama] = {}

    def chat_node(state: GraphState, config: RunnableConfig):
        thread_id = str((config or {}).get("configurable", {}).get("thread_id", "default"))
        agent = agents.setdefault(thread_id, build_model(base_url, model, thread_id))
        reply: AIMessage = agent.invoke(state["messages"])
        return {"messages": [reply]}

    workflow.add_node("chat", chat_node)
    workflow.set_entry_point("chat")
    workflow.add_edge("chat", END)
    return workflow.compile(checkpointer=memory)

Each LangGraph run gets its own thread_id (we reuse it as the session header) so concurrent runs stay separate in aiproxy.
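
To see that isolation in action, here is a short, optional sketch that reuses build_app() to fire two runs with different thread_id values; it assumes the local /ollama route base URL (see Step 2, Option A), the llama3 model, and the definitions from main.py above.

# Two runs of the same compiled graph, kept apart by thread_id, which
# build_model() reuses as the X-Proxy-Chat-Session header.
app = build_app("http://localhost:8080/ollama/v1", "llama3")

for thread_id in (uuid.uuid4().hex, uuid.uuid4().hex):
    result = app.invoke(
        {"messages": [HumanMessage(content=f"hi from run {thread_id[:8]}")]},
        config={"configurable": {"thread_id": thread_id}},
    )
    # Each run shows up under its own session in the aiproxy UI/logs.
    print(thread_id, "->", result["messages"][-1].content)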

Parse CLI args

def parse_args():
    p = argparse.ArgumentParser(description="LangGraph + Ollama proxy demo")
    p.add_argument("message", nargs="?", default="hello from langgraph")
    p.add_argument("--ollama-base-url", default=os.getenv("OLLAMA_BASE_URL", "http://127.0.0.1:11434/v1"))
    p.add_argument("--ollama-model", default=os.getenv("OLLAMA_MODEL", "llama3"))
    return p.parse_args()

Main entrypoint

def main():
    load_dotenv()
    set_proxy_ca()
    args = parse_args()
    session_id = uuid.uuid4().hex  # also becomes X-Proxy-Chat-Session
    app = build_app(args.ollama_base_url, args.ollama_model)
    result = app.invoke(
        {"messages": [HumanMessage(content=args.message)]},
        config={"configurable": {"thread_id": session_id}},
    )
    last = result["messages"][-1]
    print(last.content if isinstance(last, AIMessage) else last)


if __name__ == "__main__":
    main()

Create a virtualenv and install dependencies:

python -m venv .venv && source .venv/bin/activate  # or `uv venv`
pip install -r requirements.txt

Step 2 · Choose your Ollama target

Option A · Local `ollama serve` via aiproxy route

  1. Start aiproxy with the Ollama route:

aiproxy -c aiproxy.toml

  2. Call the route directly (no HTTP proxy vars needed):

OLLAMA_BASE_URL=http://localhost:8080/ollama/v1 \
  python main.py "hello from langgraph"

Option B · Hosted `api.ollama.com` with MITM

  1. Start aiproxy (default config).

  2. Export proxy + CA settings so HTTPX tunnels through aiproxy and trusts it:

export HTTP_PROXY=http://localhost:8080
export HTTPS_PROXY=$HTTP_PROXY
export PROXY_CA=~/.aiproxy/aiproxy-ca-cert.pem
export OLLAMA_BASE_URL=https://api.ollama.com/v1
export OLLAMA_MODEL=llama3
export OLLAMA_TOKEN=your_api_token
python main.py "hello from langgraph"

Unset the proxy variables when you finish testing:

unset HTTP_PROXY HTTPS_PROXY PROXY_CA

Verify traffic reaches aiproxy

  • The script prints the model reply and sends X-Proxy-Chat-Session: <uuid> on every request. Use that UUID to filter the aiproxy UI or logs.
  • For Option A, look for traffic under the /ollama route. For Option B, watch for api.ollama.com requests flowing through the MITM.
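
main() currently prints only the model reply, so if you want the UUID on hand without digging through request headers, one small optional tweak is to echo the session id before invoking the graph. Shown as a standalone sketch; only the print line is new relative to main() above.

import uuid

session_id = uuid.uuid4().hex                 # same as in main()
print("X-Proxy-Chat-Session:", session_id)    # paste this value into the aiproxy UI/log filter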
