LightPhon is an OpenAI-compatible API endpoint backed by a decentralized network of GPU nodes. Point any OpenAI client at LightPhon and it handles routing, load balancing, and payment automatically.
LightPhon is an OpenAI-compatible proxy that sits between your application and a decentralized network of GPU nodes running open-source LLMs.
Think of it like this: instead of sending requests to api.openai.com, you send them to lightphon.com. The API format is identical: same endpoints, same JSON structure, same SDKs. LightPhon takes your request, picks the best available GPU node, and returns the response.
Point base_url to https://lightphon.com/v1 and api_key to your token. Everything else stays the same.
Get up and running in 3 steps:
1. Go to lightphon.com/app.html, register, and deposit some sats via Lightning or card.
2. Open the Model Router tab (Step 1) and click + New Key. Copy the key: it's your apiKey.
3. Use any OpenAI-compatible client. Here's a quick Python example:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://lightphon.com/v1",
    api_key="your-api-key-here"
)

response = client.chat.completions.create(
    model="auto",  # router picks the best model
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
```
That's it. Setting "model": "auto" tells the router to pick the best available model automatically.
There are two ways to authenticate with the LightPhon API:
Generate an API Key from the Model Router (Step 1) in the web app. This key never expires and works like a standard OpenAI API key.
Use it as a Bearer token in the Authorization header:

```
Authorization: Bearer lp_abc123def456...
```

Or pass it as apiKey in any OpenAI SDK:

```python
client = OpenAI(base_url="https://lightphon.com/v1", api_key="lp_abc123def456...")
```
The Agent Token embeds authentication directly in the URL. Useful for tools that don't support custom headers (some IDE plugins, automation scripts, etc.).
The base URL carries the token:

```
https://lightphon.com/api/agent/<your-token>
```

The apiKey field can be anything, since the token is in the URL:

```python
client = OpenAI(
    base_url="https://lightphon.com/api/agent/<your-token>",
    api_key="x"
)
```
Core endpoints follow the OpenAI convention under /v1/; agent and router endpoints live under /api/:
| Method | Path | Description |
|---|---|---|
| GET | /v1/models | List all available models |
| GET | /v1/models/{id} | Get details for a specific model |
| POST | /v1/chat/completions | Create a chat completion (main endpoint) |
| POST | /api/agent/&lt;token&gt;/v1/chat/completions | Chat completion via Agent Token URL |
| POST | /api/models/route | Advanced: query the router directly |
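The two GET endpoints above can be exercised with any HTTP client. A minimal sketch using `requests` (the list-object shape follows the OpenAI convention, `{"object": "list", "data": [...]}`; adjust if LightPhon's payload differs):

```python
import requests

BASE_URL = "https://lightphon.com/v1"
HEADERS = {"Authorization": "Bearer your-api-key"}

def model_ids(payload):
    """Pull the model ids out of an OpenAI-style /v1/models list object."""
    return [m["id"] for m in payload.get("data", [])]

def list_models():
    """GET /v1/models and return the ids of all available models."""
    resp = requests.get(f"{BASE_URL}/models", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return model_ids(resp.json())

def get_model(model_id):
    """GET /v1/models/{id} for a single model's details."""
    resp = requests.get(f"{BASE_URL}/models/{model_id}", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()
```

Since the node set changes over time, calling `list_models()` before each session is a reasonable way to discover what is currently online.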
The main endpoint. Send a conversation and get a response, identical to OpenAI's chat completions API.
```http
POST /v1/chat/completions
Authorization: Bearer <api-key>
Content-Type: application/json

{
  "model": "auto",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Explain Bitcoin Lightning in 3 lines." }
  ],
  "max_tokens": 512,
  "temperature": 0.7,
  "stream": false
}
```
The response:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717171200,
  "model": "Qwen/Qwen2.5-Coder-14B-Instruct-GGUF",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Bitcoin Lightning is a Layer 2 payment protocol..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 86,
    "total_tokens": 110
  }
}
```
- model: model ID from /v1/models, or "auto" for automatic routing
- messages: array of { role, content } objects
- max_tokens: maximum tokens to generate (default: 4096)
- temperature: sampling temperature 0–2 (default: 0.7)
- stream: true for Server-Sent Events streaming
- top_p, frequency_penalty, presence_penalty: standard OpenAI params

Set "stream": true to receive tokens in real time as they are generated.
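These parameters map one-to-one onto the JSON request body. A small helper sketch using the documented defaults (the helper name is illustrative, not part of any SDK):

```python
def chat_body(messages, model="auto", max_tokens=4096,
              temperature=0.7, stream=False, **extra):
    """Build a /v1/chat/completions request body.

    Defaults mirror the documented ones; `extra` passes through standard
    OpenAI params such as top_p, frequency_penalty, and presence_penalty.
    """
    body = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    }
    body.update(extra)
    return body

# Example: a short, low-temperature completion
body = chat_body(
    [{"role": "user", "content": "Hi"}],
    max_tokens=64,
    temperature=0.2,
    top_p=0.9,
)
```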
```
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```
```python
from openai import OpenAI

client = OpenAI(base_url="https://lightphon.com/v1", api_key="your-key")

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
```shell
curl -N https://lightphon.com/v1/chat/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"Hi"}],"stream":true}'
```
The Model Router automatically selects the best available node and model for your request. You don't have to hardcode a model name: just use "auto".
When you set "model": "auto", the router picks the best available node and model based on availability, speed, and capabilities.
The response includes a routing_info field showing which node and model were used.
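One way to inspect routing_info is to call the endpoint with a raw HTTP client, since it is an extra field on top of the OpenAI schema. A sketch assuming routing_info sits at the top level of the response body (its exact shape isn't specified here):

```python
import requests

def extract(data):
    """Split a chat-completion response body into (content, routing_info).

    Assumes routing_info is a top-level extra field, as described above;
    returns None for it if the server omits the field.
    """
    content = data["choices"][0]["message"]["content"]
    return content, data.get("routing_info")

def chat_with_routing(base_url, api_key, messages):
    """POST /v1/chat/completions with model "auto", return (content, routing_info)."""
    resp = requests.post(
        f"{base_url}/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": "auto", "messages": messages},
        timeout=120,
    )
    resp.raise_for_status()
    return extract(resp.json())
```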
You can query the router directly to find models matching specific criteria:
```http
POST /api/models/route
Authorization: Bearer <api-key>
Content-Type: application/json

{
  "providers": ["meta", "mistral", "deepseek"],
  "min_params_b": 7,
  "max_params_b": 72,
  "capabilities": ["code", "reasoning"],
  "prefer_size": "balanced",
  "limit": 5
}
```
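This query can be issued with a plain HTTP client; note the path lives under /api/, not /v1/. A hedged sketch (criteria names as in the request above; the response shape isn't documented here, so it's returned as-is):

```python
import requests

def route_models(base_url, api_key, **criteria):
    """POST /api/models/route with filter criteria, return the raw response body."""
    resp = requests.post(
        f"{base_url}/api/models/route",
        headers={"Authorization": f"Bearer {api_key}"},
        json=criteria,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Example: up to 5 balanced-size 7B-72B models good at code and reasoning
# matches = route_models(
#     "https://lightphon.com", "your-api-key",
#     providers=["meta", "mistral", "deepseek"],
#     min_params_b=7, max_params_b=72,
#     capabilities=["code", "reasoning"],
#     prefer_size="balanced", limit=5,
# )
```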
- providers: meta, mistral, deepseek, google, microsoft, alibaba, ...
- min_params_b / max_params_b: parameter range in billions
- size_classes: tiny, small, medium, large, xlarge
- capabilities: chat, code, reasoning, math, vision, tool_use, rag, agent, ...
- families: llama, qwen, mistral, gemma, ...
- min_context_length: minimum context window
- prefer_size: smallest | largest | balanced
- known_only: true to only return cataloged models

When you call GET /v1/models, you'll see models from two types of sources:
These are real hardware machines running open-source LLMs (LLaMA, Qwen, Mistral, DeepSeek, etc.) via the LightPhon node software. They connect to the server over WebSocket and serve inference requests in real-time. Anyone can run a GPU node β see the Download page.
OpenClaw is an external OpenAI-compatible AI platform that the LightPhon server can integrate as a virtual node. When OpenClaw is enabled, the server queries its /v1/models endpoint at startup and makes those models available in the network alongside real GPU nodes.
The owned_by field will show "OpenClaw" for these models.
The router treats both sources equally when selecting the best model. If an OpenClaw model is the best match for your request, the server forwards to OpenClaw directly via HTTP (no WebSocket needed). The response format is always the same regardless of source.
OpenClaw and LightPhon can work together in two directions:
If you use OpenClaw as your AI assistant, you can add LightPhon as one of its model providers. This way, OpenClaw sends inference requests to the LightPhon network.
```yaml
# In your OpenClaw models configuration:
models:
  providers:
    - name: lightphon
      api: openai-completions
      baseUrl: https://lightphon.com/v1
      apiKey: "your-api-key"
      models:
        - name: auto  # router picks best model
        - name: Qwen/Qwen2.5-Coder-14B-Instruct-GGUF  # or a specific model
```
Use auto as the model name to let the router pick the best model, or specify a model ID from GET /v1/models to always use a particular model.
Server administrators can configure LightPhon to pull models from an OpenClaw instance, making them available alongside local GPU nodes.
```ini
OPENCLAW_ENABLED=true
OPENCLAW_BASE_URL=http://<openclaw-host>:<port>
OPENCLAW_API_KEY=           # optional, if OpenClaw requires auth
OPENCLAW_NODE_NAME=OpenClaw
OPENCLAW_PRICE=0            # cost in sats/min (0 = free)
OPENCLAW_TIMEOUT=120        # request timeout in seconds
```
1. On startup, the LightPhon server calls GET /v1/models on the OpenClaw instance and registers its models as a virtual node.
2. The Model Router scores all sources equally (local GPU nodes and OpenClaw) and picks the best match.
3. If an OpenClaw model is selected, the server forwards via POST /v1/chat/completions over HTTP (no WebSocket).
4. The response is returned in standard OpenAI format with an added routing_info field showing the source.
LightPhon works with any tool or library that supports the OpenAI chat completions API. Just change the base URL and API key.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://lightphon.com/v1",
    api_key="your-api-key"
)

resp = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(resp.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://lightphon.com/v1",
  apiKey: "your-api-key"
});

const resp = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Hello!" }]
});

console.log(resp.choices[0].message.content);
```
```shell
curl https://lightphon.com/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role":"user","content":"Hello!"}]
  }'
```
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://lightphon.com/v1",
    api_key="your-api-key",
    model="auto"
)

print(llm.invoke("Explain Lightning in 2 lines").content)
```
Most AI coding tools let you configure a custom OpenAI endpoint. Use these settings:
```
Base URL : https://lightphon.com/v1
API Key  : your-api-key
Model    : auto
```
Or, if the tool only supports a single URL (no separate API key field), use the Agent Token URL:
```
Base URL : https://lightphon.com/api/agent/<your-token>
API Key  : x (any non-empty string)
Model    : auto
```
Errors follow the standard OpenAI format:
```json
{
  "error": {
    "message": "No suitable node found for the requested model",
    "type": "invalid_request_error",
    "code": "model_not_available"
  }
}
```
| HTTP | Meaning | Fix |
|---|---|---|
| 401 | Invalid or missing token | Check your API key or Agent Token URL |
| 402 | Insufficient balance | Deposit more sats to your wallet |
| 404 | Model not found | Use "auto" or check /v1/models for available models |
| 503 | No nodes available | All nodes are offline; wait or check network status |
| 504 | Inference timeout | Reduce max_tokens or try a smaller model |
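The table suggests which failures are worth retrying on the client side. A sketch of one way to handle them (the remedy strings paraphrase the table; treating only 503/504 as transient is a judgment call):

```python
import requests

RETRYABLE = {503, 504}  # no nodes available / inference timeout: transient

REMEDY = {
    401: "check your API key or Agent Token URL",
    402: "deposit more sats to your wallet",
    404: 'use "auto" or check /v1/models for available models',
    503: "all nodes are offline; wait or check network status",
    504: "reduce max_tokens or try a smaller model",
}

def chat_with_retry(base_url, api_key, body, attempts=3):
    """POST /v1/chat/completions, retrying only the transient failures."""
    for attempt in range(attempts):
        resp = requests.post(
            f"{base_url}/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json=body,
            timeout=180,
        )
        if resp.ok:
            return resp.json()
        if resp.status_code not in RETRYABLE or attempt == attempts - 1:
            hint = REMEDY.get(resp.status_code, "unexpected error")
            raise RuntimeError(f"HTTP {resp.status_code}: {hint}")
```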
**What is "auto" and should I use it?** Yes, for most use cases. "auto" tells the Model Router to pick the best available model based on availability, speed, and capabilities. You can also specify a model ID if you need a specific model.
**What's the difference between an API Key and an Agent Token?** Both authenticate your requests. The API Key goes in the Authorization: Bearer header (standard OpenAI style). The Agent Token is embedded in the URL, which is useful for tools that don't support custom headers. Use whichever is more convenient.
**What is OpenClaw?** OpenClaw is an external AI platform. The LightPhon server can integrate it as a virtual node, making its models available alongside local GPU nodes. OpenClaw models are marked with a badge in the model list, and the router treats them the same as local models.
**Do I pay directly from my Lightning wallet?** Not directly. You deposit sats to your LightPhon wallet (via Lightning invoice or EUR card), and the server deducts from your balance when you use the API. No external wallet interaction is needed after funding.
**Are my conversations stored?** No. Conversations are routed directly to the GPU node; no data is stored on the LightPhon server. When the session ends, everything is gone.
**Which models are available?** It depends on which nodes are online. Call GET /v1/models or check the Model Router in the web app to see the current list. Common models include LLaMA, Qwen, Mistral, DeepSeek, Gemma, and more.
**Can I run my own LightPhon server?** Yes! LightPhon is fully open source. See the GitHub repository for deployment instructions.