Inference Service
LLM chat completions via AI Gateway
The Inference Service provides access to multiple LLM providers through a unified OpenAI-compatible API.
POST /inference/v1/chat/completions
Send chat messages and receive model completions.
Request
```
POST https://console.mira.network/inference/v1/chat/completions
Authorization: Bearer mk_inference_YOUR_API_KEY
Content-Type: application/json
```
Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model identifier (e.g., `openai/gpt-4o-mini`) |
| `messages` | array | Yes | Array of message objects |
| `temperature` | number | No | Sampling temperature (0–2, default: 1) |
| `max_tokens` | number | No | Maximum tokens in response |
| `stream` | boolean | No | Enable streaming (default: false) |
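When `stream` is `true`, the response typically arrives as server-sent events rather than a single JSON body. A minimal sketch of collecting streamed text, assuming the service follows the OpenAI-style SSE format (`data: {json}` lines carrying `choices[0].delta.content`, terminated by `data: [DONE]`) — the chunk shape is an assumption, not specified in this document:

```javascript
// Sketch: extract the assistant text from an SSE response body.
// Assumes OpenAI-style streaming chunks; verify against actual responses.
function extractStreamText(sseBody) {
  let text = '';
  for (const line of sseBody.split('\n')) {
    if (!line.startsWith('data: ')) continue;          // skip blanks/comments
    const payload = line.slice('data: '.length).trim();
    if (payload === '[DONE]') break;                   // end-of-stream sentinel
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? '';  // incremental token text
  }
  return text;
}
```

With `fetch`, you would read `response.body` incrementally and feed each decoded segment through the same parsing step; the helper above shows only the parsing.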
Message Object
| Field | Type | Description |
|---|---|---|
| `role` | string | `system`, `user`, or `assistant` |
| `content` | string | The message content |
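Malformed messages are easiest to catch client-side. A small sketch that checks a messages array against the shape in the table above before sending a request (the helper name is illustrative, not part of the API):

```javascript
// Sketch: client-side validation of a messages array.
// Roles and fields taken from the Message Object table above.
const VALID_ROLES = new Set(['system', 'user', 'assistant']);

function validateMessages(messages) {
  if (!Array.isArray(messages) || messages.length === 0) return false;
  return messages.every(
    (m) => VALID_ROLES.has(m.role) && typeof m.content === 'string'
  );
}
```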
Available Models
| Model ID | Provider | Description |
|---|---|---|
| `openai/gpt-4o-mini` | OpenAI | Fast, cost-effective model |
| `openai/gpt-4o` | OpenAI | Most capable OpenAI model |
| `anthropic/claude-sonnet-4-5` | Anthropic | Claude Sonnet 4.5 |
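The model IDs in the table follow a `provider/model` pattern. A small helper to split an ID into its parts, useful for routing or logging (the pattern is inferred from the table above; other ID shapes may exist):

```javascript
// Sketch: split a gateway model ID into provider and model name.
// Keeps any additional slashes in the model part.
function parseModelId(modelId) {
  const [provider, ...rest] = modelId.split('/');
  return { provider, model: rest.join('/') };
}
```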
Example Request
```bash
curl -X POST https://console.mira.network/inference/v1/chat/completions \
  -H "Authorization: Bearer mk_inference_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699999999,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
```
Response Schema
```typescript
interface ChatCompletionResponse {
  id: string;
  object: 'chat.completion';
  created: number;
  model: string;
  choices: Choice[];
  usage: Usage;
}

interface Choice {
  index: number;
  message: Message;
  finish_reason: 'stop' | 'length' | 'content_filter';
}

interface Message {
  role: 'assistant';
  content: string;
}

interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}
```
JavaScript Example
```javascript
async function chat(messages, apiKey) {
  const response = await fetch('https://console.mira.network/inference/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'openai/gpt-4o-mini',
      messages,
    }),
  });

  const data = await response.json();
  return data.choices[0].message.content;
}

// Usage
const response = await chat([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Hello!' },
], 'mk_inference_YOUR_API_KEY');

console.log(response);
```
Multi-turn Conversation
Include previous messages to maintain context:
```javascript
const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'My name is Alice.' },
  { role: 'assistant', content: 'Nice to meet you, Alice!' },
  { role: 'user', content: 'What is my name?' },
];

const response = await chat(messages, apiKey);
// "Your name is Alice."
```
GET /inference/v1/health
Health check endpoint. Does not require authentication.
```bash
curl https://console.mira.network/inference/v1/health
```
Response:
```json
{
  "status": "ok",
  "service": "inference-service",
  "version": "1.0.0"
}
```
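Production clients usually retry transient failures when calling the chat completions endpoint. A minimal sketch of a retry wrapper with exponential backoff — the retried status codes (429 and 5xx) and the helper name are assumptions, since this document does not specify the service's rate-limit or error behaviour:

```javascript
// Sketch: POST to the chat completions endpoint, retrying transient errors.
// `fetchFn` is injectable for testing; defaults to the global fetch.
async function chatWithRetry(body, apiKey, { retries = 3, baseDelayMs = 500, fetchFn = fetch } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await fetchFn('https://console.mira.network/inference/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    });
    if (res.ok) return res.json();
    // Retry only on assumed-transient statuses, up to `retries` times.
    if (attempt >= retries || !(res.status === 429 || res.status >= 500)) {
      throw new Error(`Inference request failed: HTTP ${res.status}`);
    }
    await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
  }
}
```

Injecting `fetchFn` keeps the wrapper unit-testable without touching the network; in application code you can omit the option and rely on the global `fetch`.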