# infer.x402cloud.ai

Edge AI inference with x402 micropayments. Pay per token with USDC — no signup, no API keys.

## Usage

OpenAI-compatible. Set base_url to https://infer.x402cloud.ai and POST to any model endpoint.

## Payment Scheme

Uses x402 "upto" (metered) payments: authorize a maximum amount, pay only for the tokens actually used.

## Models

- /nano (max $0.001255) — Fastest, simple tasks [@cf/ibm-granite/granite-4.0-h-micro]
- /fast (max $0.003019) — Quick and capable [@cf/meta/llama-4-scout-17b-16e-instruct]
- /smart (max $0.001869) — Reliable workhorse [@cf/meta/llama-3.1-8b-instruct-fast]
- /think (max $0.012012) — Deep reasoning [@cf/deepseek-ai/deepseek-r1-distill-qwen-32b]
- /code (max $0.003563) — Code specialist [@cf/qwen/qwen2.5-coder-32b-instruct]
- /big (max $0.006118) — Highest quality [@cf/meta/llama-3.3-70b-instruct-fp8-fast]
- /embed (max $0.001107) — Text embeddings [@cf/baai/bge-m3]
- /image (max $0.003091) — Image generation [@cf/black-forest-labs/flux-1-schnell]

## Request Format

POST /{model} with a JSON body:

```json
{
  "messages": [{"role": "user", "content": "Hello"}],
  "max_tokens": 512
}
```

## Payment

x402 protocol — USDC on Base. The payment header is attached automatically by an x402 client.

Recipient: 0x207C6D8f63Bf01F70dc6D372693E8D5943848E88
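The request format above can be sketched as a plain HTTP call. This is a minimal illustration using only the Python standard library; the `build_request` helper and the `/smart` endpoint choice are examples, and without a payment header the server would be expected to respond with HTTP 402 per the x402 protocol, so a real client would attach one (see the Payment section).

```python
import json
import urllib.request

BASE_URL = "https://infer.x402cloud.ai"

def build_request(model: str, prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build an OpenAI-style chat request for a model endpoint like /smart.

    Illustrative helper, not part of any official client library.
    """
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/{model}",       # POST /{model}, e.g. /smart
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("smart", "Hello")
# Sending the request (urllib.request.urlopen(req)) would require an
# x402 payment header to succeed; building it is shown here for shape only.
```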