Start a LangGraph or LangChain project with aiproxy in the loop. The starter below mirrors our internal LangGraph + Ollama + aiproxy demo: one node calls Ollama, every request carries X-Proxy-Chat-Session, and all traffic routes through the proxy so you can inspect it.
- `aiproxy` running locally (examples use `http://localhost:8080`)
- Install the `aiproxy` certificate authority (see Quick Start → TLS certificate)
- Python 3.10+ with `pip` or `uv`
- Either a local `ollama serve` (`http://127.0.0.1:11434/`) or hosted `api.ollama.com`
Create a folder and add three files modeled after our ollama-test example:
requirements.txt
langgraph
langchain-core
langchain-ollama
python-dotenv
aiproxy.toml (routes local Ollama through /ollama)
[[routes]]
path = "/ollama"
type = "ollama"
upstream = "http://127.0.0.1:11434/"
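Once aiproxy is running with this config (see the run steps below), you can sanity-check the route with a few lines of Python before wiring up LangGraph. This is a sketch, not one of the demo files: it assumes aiproxy listens on its default `http://localhost:8080`, that the route forwards `/ollama/<path>` to the same path on the upstream, and that a local `ollama serve` is running. Under those assumptions, the proxied tags endpoint lists your installed models.

# Quick sanity check for the /ollama route (a sketch; assumes aiproxy on
# localhost:8080 and a local `ollama serve` behind the route).
import json
import urllib.request

# /ollama/api/tags is assumed to forward to Ollama's native /api/tags endpoint.
with urllib.request.urlopen("http://localhost:8080/ollama/api/tags") as resp:
    tags = json.load(resp)

print([m["name"] for m in tags.get("models", [])])

If this 404s, double-check the route path and upstream in aiproxy.toml before moving on.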
main.py — broken into small steps.
Imports and shared header constant
import argparse, os, uuid, operator
from typing import Annotated, Dict, List, TypedDict
from dotenv import load_dotenv
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage
from langchain_core.runnables import RunnableConfig
from langchain_ollama import ChatOllama
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph

# Header name aiproxy uses to group requests into one conversation
SESSION_HEADER = "X-Proxy-Chat-Session"

State type used by LangGraph
class GraphState(TypedDict):
    messages: Annotated[List[BaseMessage], operator.add]

Trust the proxy CA when MITM is enabled
def set_proxy_ca():
    ca = os.getenv("PROXY_CA")
    if ca:
        ca = os.path.expanduser(os.path.expandvars(ca))
        os.environ.setdefault("SSL_CERT_FILE", ca)
        os.environ.setdefault("REQUESTS_CA_BUNDLE", ca)

Build a ChatOllama client that always sends the session header
def build_model(base_url: str, model: str, session_id: str) -> ChatOllama:
    # This header lets aiproxy thread every request into one conversation.
    # Without it, the proxy sees unrelated calls and can't group your logs.
    headers: Dict[str, str] = {SESSION_HEADER: session_id}
    token = os.getenv("OLLAMA_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return ChatOllama(
        model=model,
        base_url=base_url.rstrip("/"),
        client_kwargs={"headers": headers},
    )

Why the header matters: aiproxy groups telemetry by `X-Proxy-Chat-Session`. Setting it here keeps all LangGraph turns for a run in one timeline, even if the graph forks or retries, so you can search a single UUID in the proxy UI/logs.
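If you want to see exactly what aiproxy receives, the same header can be set on a hand-rolled request. A minimal sketch, assuming the Option A route on `http://localhost:8080/ollama`, a locally pulled `llama3` model, and that the route forwards `/api/chat` unchanged; httpx is already installed as a dependency of the Ollama client:

# Manual request carrying the same session header (a sketch; assumes the
# /ollama route from aiproxy.toml and a local `llama3` model).
import uuid
import httpx

session_id = uuid.uuid4().hex
resp = httpx.post(
    "http://localhost:8080/ollama/api/chat",
    headers={"X-Proxy-Chat-Session": session_id},
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "ping"}],
        "stream": False,
    },
    timeout=60.0,
)
print(resp.json()["message"]["content"])
print(f"Search aiproxy for session {session_id}")

Any request sharing that UUID, whether it comes from LangGraph or a manual call like this, lands in the same session timeline.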
Assemble a one-node LangGraph app
def build_app(base_url: str, model: str):
    memory = MemorySaver()
    workflow = StateGraph(GraphState)
    agents: Dict[str, ChatOllama] = {}

    def chat_node(state: GraphState, config: RunnableConfig):
        thread_id = str((config or {}).get("configurable", {}).get("thread_id", "default"))
        agent = agents.setdefault(thread_id, build_model(base_url, model, thread_id))
        reply: AIMessage = agent.invoke(state["messages"])
        return {"messages": [reply]}

    workflow.add_node("chat", chat_node)
    workflow.set_entry_point("chat")
    workflow.add_edge("chat", END)
    return workflow.compile(checkpointer=memory)

Each LangGraph run gets its own `thread_id` (we reuse it as the session header), so concurrent runs stay separate in aiproxy.
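To see that separation in practice, a short sketch can invoke the compiled app twice with different thread IDs; each run then shows up under its own `X-Proxy-Chat-Session` in aiproxy. It assumes the files above are saved as shown (so `build_app` is importable from `main.py`) and falls back to the Option A base URL and a `llama3` model if the environment variables are unset:

# Two runs with distinct thread_ids, hence two separate sessions in aiproxy
# (a sketch reusing build_app from main.py; the fallbacks below are assumptions).
import os
import uuid

from langchain_core.messages import HumanMessage

from main import build_app  # the demo module above

app = build_app(
    os.getenv("OLLAMA_BASE_URL", "http://localhost:8080/ollama/v1"),
    os.getenv("OLLAMA_MODEL", "llama3"),
)

for prompt in ("first run", "second run"):
    thread_id = uuid.uuid4().hex  # also becomes the X-Proxy-Chat-Session value
    result = app.invoke(
        {"messages": [HumanMessage(content=prompt)]},
        config={"configurable": {"thread_id": thread_id}},
    )
    print(thread_id, "->", result["messages"][-1].content)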
Parse CLI args
def parse_args():
    p = argparse.ArgumentParser(description="LangGraph + Ollama proxy demo")
    p.add_argument("message", nargs="?", default="hello from langgraph")
    p.add_argument("--ollama-base-url", default=os.getenv("OLLAMA_BASE_URL", "http://127.0.0.1:11434/v1"))
    p.add_argument("--ollama-model", default=os.getenv("OLLAMA_MODEL", "llama3"))
    return p.parse_args()

Main entrypoint
def main():
    load_dotenv()
    set_proxy_ca()
    args = parse_args()
    session_id = uuid.uuid4().hex  # also becomes X-Proxy-Chat-Session
    app = build_app(args.ollama_base_url, args.ollama_model)
    result = app.invoke(
        {"messages": [HumanMessage(content=args.message)]},
        config={"configurable": {"thread_id": session_id}},
    )
    last = result["messages"][-1]
    print(last.content if isinstance(last, AIMessage) else last)


if __name__ == "__main__":
    main()

Create a virtualenv and install dependencies:
python -m venv .venv && source .venv/bin/activate # or `uv venv`
pip install -r requirements.txt

Option A: local Ollama through the /ollama route

- Start aiproxy with the Ollama route:

  aiproxy -c aiproxy.toml

- Call the route directly (no HTTP proxy vars needed):

  OLLAMA_BASE_URL=http://localhost:8080/ollama/v1 \
  python main.py "hello from langgraph"

Option B: hosted api.ollama.com through the MITM proxy

- Start aiproxy (default config).
- Export proxy + CA settings so HTTPX tunnels through aiproxy and trusts it:

  export HTTP_PROXY=http://localhost:8080
  export HTTPS_PROXY=$HTTP_PROXY
  export PROXY_CA=~/.aiproxy/aiproxy-ca-cert.pem
  export OLLAMA_BASE_URL=https://api.ollama.com/v1
  export OLLAMA_MODEL=llama3
  export OLLAMA_TOKEN=your_api_token
  python main.py "hello from langgraph"

Unset the proxy variables when you finish testing:
unset HTTP_PROXY HTTPS_PROXY PROXY_CA

- The script prints the model reply and sends `X-Proxy-Chat-Session: <uuid>` on every request. Use that UUID to filter the aiproxy UI or logs.
- For Option A, look for traffic under the `/ollama` route. For Option B, watch for `api.ollama.com` requests flowing through the MITM.
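If Option B fails with certificate errors, a short probe can confirm that requests really tunnel through aiproxy and trust its CA before you debug the LangGraph side. This is a sketch, not part of the demo: it assumes the Option B variables above are exported and uses httpx (already installed as a dependency of the Ollama client).

# Standalone TLS/proxy probe (a sketch; run after exporting the Option B vars).
import os
import httpx

ca = os.path.expanduser(os.environ["PROXY_CA"])
# httpx honors HTTP_PROXY/HTTPS_PROXY from the environment by default, so this
# request should be tunneled through aiproxy and re-signed with its CA.
resp = httpx.get("https://api.ollama.com", verify=ca)
print(resp.status_code)  # any HTTP status means the MITM certificate was accepted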