OpenAI-compatible. Point your existing client at the mesh.

Every ClosedMesh peer exposes a standard /v1/chat/completions endpoint. Anything that speaks the OpenAI API — the official SDKs, LangChain, your own scripts — works by changing one base URL. Below is exactly how, and an honest map of what's open today versus what arrives with the paid API.

Three ways to reach the API

What's open today.

ClosedMesh is in early access, so the surfaces aren't all equally open yet. The fully-open path that works right now is running a node and calling your own local runtime — same code, same model quality, and the prompt never leaves your machine.

Path	Base URL	Auth	Status
Local node (run the runtime yourself)	http://localhost:9337/v1	None	Open now
Hosted mesh (the public entry node)	https://mesh.closedmesh.com/v1	Bearer key	Models open · chat gated
Web chat (zero setup, not the API)	closedmesh.com	None	Open now

The hosted entry serves /v1/models openly so you can see what the mesh can run, but /v1/chat/completions is access-gated while monetization is built. Public API keys arrive with the paid inference API — follow the dev log. Until then, run a node for full programmatic access.
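The choice between the two API paths in the table can be sketched as a small helper: with no key it targets the local node, and with a key it targets the hosted entry and adds a Bearer header. This is a minimal stdlib-only sketch, not an official client; the function names `resolve_endpoint` and `build_chat_request` are hypothetical, and it assumes the hosted entry accepts the key in a standard `Authorization: Bearer` header as the table's "Bearer key" suggests.

```python
import json
import urllib.request

LOCAL_BASE_URL = "http://localhost:9337/v1"         # local node, no auth
HOSTED_BASE_URL = "https://mesh.closedmesh.com/v1"  # hosted entry, Bearer key


def resolve_endpoint(api_key=None):
    """Return (base_url, auth_headers): hosted entry if a key is given, else local node."""
    if api_key:
        return HOSTED_BASE_URL, {"Authorization": f"Bearer {api_key}"}
    return LOCAL_BASE_URL, {}


def build_chat_request(model, messages, api_key=None):
    """Build, but do not send, a POST to <base_url>/chat/completions."""
    base_url, auth_headers = resolve_endpoint(api_key)
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json", **auth_headers},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(req, timeout=120)` sends it. Keeping the request builder separate from the network call means the same code path can be checked offline and then pointed at either surface.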

Quickstart

Call it in three lines.

These examples target a local node at http://localhost:9337/v1. Install the desktop app or fetch the runtime with curl first; it autostarts and joins the mesh. Once you have a key, swap the base URL for the hosted entry and pass the key as a Bearer token.

List the models this node can serve (bash)

curl http://localhost:9337/v1/models

Chat completion · curl (bash)

curl http://localhost:9337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-8B",
    "messages": [
      { "role": "user", "content": "Summarize peer-to-peer inference in two sentences." }
    ]
  }'

Python · the official openai SDK (python)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9337/v1",
    api_key="not-needed",  # local node is unauthenticated
)

resp = client.chat.completions.create(
    model="Qwen3-8B",
    messages=[{"role": "user", "content": "Classify: 'battery great, screen dim'."}],
)
print(resp.choices[0].message.content)

JavaScript / TypeScript · the openai package (javascript)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:9337/v1",
  apiKey: "not-needed",
});

const resp = await client.chat.completions.create({
  model: "Qwen3-8B",
  messages: [{ role: "user", content: "Extract every date: shipped 2026-01-09." }],
});
console.log(resp.choices[0].message.content);

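Streaming · ask for token usage too (python)

# Reuses `client` from the Python example above.
stream = client.chat.completions.create(
    model="Qwen3-8B",
    messages=[{"role": "user", "content": "Write a haiku about latency."}],
    stream=True,
    stream_options={"include_usage": True},  # adds a final usage-only chunk
)
for chunk in stream:
    # The usage chunk has an empty choices list, so guard before indexing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    if chunk.usage:
        print(f"\n[{chunk.usage.total_tokens} tokens]")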

Discover what's live

The model set changes as peers come and go.

There's no fixed model list — it's whatever live peers are serving. Two open, no-auth endpoints tell you what's routable right now, so you can pick a model that actually exists.

What the hosted mesh can serve, in OpenAI shape (bash)

curl https://mesh.closedmesh.com/v1/models

Live mesh status: nodes online + routable models (bash)

curl https://closedmesh.com/api/status
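Because the model set shifts with the peers, a client may want to check the list before each job rather than hard-code a name. The sketch below assumes `/v1/models` returns the usual OpenAI list shape (`{"object": "list", "data": [{"id": ...}]}`); the helper names are hypothetical, and the `/api/status` payload is not parsed because its schema isn't described here.

```python
import json
import urllib.request


def model_ids(payload):
    """Extract model ids from an OpenAI-style model list payload."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def fetch_model_ids(base_url="http://localhost:9337/v1", timeout=30):
    """GET <base_url>/models and return the ids currently advertised."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return model_ids(json.load(resp))


def pick_model(preferred, available):
    """Return the first preferred model that is live, or None if none are."""
    live = set(available)
    for name in preferred:
        if name in live:
            return name
    return None
```

A caller might run `pick_model(["Qwen3-8B"], fetch_model_ids())` and skip or queue the job when it returns `None`, instead of sending a request for a model no peer is serving.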

Good to know

Notes for building against the mesh.

It's a real OpenAI-compatible surface

Chat completions, streaming (SSE), and model listing follow the OpenAI schema. Most SDKs and agent frameworks need only the base URL changed.

Latency-tolerant by design

The mesh targets summarization, classification, extraction, and background agent work — not shaving a second off a single reply. Set generous client timeouts and prefer batched / async calls.
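One way to follow that advice is to fan a batch out with bounded concurrency and a generous per-request timeout. This is a sketch under stated assumptions: `run_batch` is a hypothetical helper, the default concurrency and timeout are illustrative numbers rather than recommended values, and `call` stands for any async function that sends one prompt and returns the reply.

```python
import asyncio


async def run_batch(prompts, call, concurrency=4, timeout_s=300.0):
    """Run call(prompt) for every prompt, at most `concurrency` at a time.

    Results come back in input order. A prompt that fails or exceeds
    timeout_s yields its exception object instead of aborting the batch.
    """
    gate = asyncio.Semaphore(concurrency)

    async def one(prompt):
        async with gate:
            # The timeout starts once a slot is free, so time spent
            # waiting in the queue does not count against the request.
            return await asyncio.wait_for(call(prompt), timeout=timeout_s)

    return await asyncio.gather(*(one(p) for p in prompts), return_exceptions=True)
```

With the openai package, `call` could wrap `AsyncOpenAI(...).chat.completions.create(...)`. Returning exceptions in place keeps one slow peer from discarding the rest of a background job.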

Ask for usage when streaming

Set stream_options.include_usage to true. Without it the stream carries no token counts, and the per-model throughput catalog on /status can't record a sample.
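For clients that read the SSE stream directly instead of through an SDK, the usage chunk has to be picked out by hand. The sketch below assumes the standard OpenAI streaming format: `data: {json}` lines, a final chunk with an empty `choices` list and a `usage` object when `include_usage` is set, then `data: [DONE]`. The function name `consume_sse` is hypothetical.

```python
import json


def consume_sse(lines):
    """Collect streamed text and the usage object from OpenAI-style SSE lines."""
    text_parts, usage = [], None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        # Content chunks carry deltas; the usage chunk has no choices.
        for choice in chunk.get("choices") or []:
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                text_parts.append(piece)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage
```

If the second return value is `None`, the request was most likely sent without `include_usage`, which is a cheap check to run before relying on token counts downstream.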

Run your own for full control

The endpoint a hosted peer exposes is the same one your local runtime exposes. For anything you don't want to route through someone else, run a node and keep the whole loop on your hardware.

Run a node, get the API.

The fastest path to programmatic access today is your own node — full quality, no key, nothing leaves your machine.

Run a node · Runtime on GitHub