LLM Client
Package matrix/neo/internal/llm is Neo's OpenAI-compatible function-calling transport. It reuses matrix/mcl/llm's Config, provider detection, and model registry for gateway mete...
Package matrix/neo/internal/llm is Neo's OpenAI-compatible function-calling transport. It reuses matrix/mcl/llm's Config, provider detection, and model registry for gateway metering and provider routing, but owns its own tool-calling message shape.
Source files: neo/internal/llm/client.go, neo/internal/llm/message.go.
Design decisions
Chat-completions shape only. v1 supports the OpenAI chat-completions API with native tools/tool_calls/tool role. Anthropic Messages and OpenAI Responses shapes are rejected with a clear error — their tool schemas differ and would break the loop.
Gateway metering path. When Config.GatewayURL is set, calls are rewritten to ${GatewayURL}/v1/chat/completions with MATRIX_GATEWAY_TOKEN bearer and X-Matrix-* metadata headers. Spend is attributed to the actor under slot "neo".
Reasoning channel extraction. Some providers inline chain-of-thought inside content as … or <thinking>…</thinking>. splitInlineThink moves this out of the visible channel and into the Reasoning field, so internal monologue never leaks into the chat.
Dry-run support. Any code that constructs an interpreter.Interpreter with llm=nil runs in dry-run mode — prompts are built and interpolated but no LLM call is made.
Client
type Client struct {
model string
provider mcllm.Provider
endpoint string
apiKey string
gatewayURL string
actorDID string
intentID string
slotLabel string
temperature float64
maxTokens int
seed int64
}
client, err := llm.New(mcllm.Config{
Model: "Qwen/Qwen3.7-Max",
Temperature: 0.4,
MaxTokens: 4096,
GatewayURL: cfg.GatewayURL,
ActorDID: cfg.ActorDID,
SlotLabel: "neo",
})
Provider is auto-detected from the model string. Endpoint defaults to the provider's canonical URL. API key is read from environment (FIREWORKS_API_KEY, TOGETHER_API_KEY, OPENCODE_API_KEY) unless overridden in Config.APIKey.
Chat
func (c *Client) Chat(ctx context.Context, req ChatRequest) (*ChatResult, error)
Sends the message list + optional tool schemas, returns the model's single assistant turn.
type ChatRequest struct {
Messages []Message
Tools []Tool
ToolChoice string // "auto" (default), "none", "required"
}
type ChatResult struct {
Message Message
FinishReason string
Usage Usage
}
FinishReason values: "stop", "length" (truncated), "tool_calls". Truncated generation is handled by the agent loop — never emitted raw.
Message types
type Message struct {
Role string // "system" | "user" | "assistant" | "tool"
Content string // text content
ToolCalls []ToolCall // assistant turn: requested calls
ToolCallID string // tool turn: which call this answers
Name string // tool turn: function name
Reasoning string // chain-of-thought (not serialized)
}
Constructors
llm.SystemMessage("be helpful")
llm.UserMessage("what is the PAX price")
llm.AssistantMessage("PAX is trading around $X") // seeding from history
llm.ToolResult("call-1", "paxeer__price", `{"pax": "0.42"}`)
ToolCall
type ToolCall struct {
ID string
Type string // always "function"
Function FunctionCall
}
type FunctionCall struct {
Name string
Arguments string // JSON-encoded args
}
ParseArgs() decodes the JSON string into a map[string]interface{}.
Tool schema
type Tool struct {
Type string // always "function"
Function FunctionDef
}
type FunctionDef struct {
Name string
Description string
Parameters map[string]interface{} // JSON Schema object
}
tool := llm.NewFunctionTool("fs__read_file", "Read a file", map[string]interface{}{
"type": "object",
"properties": map[string]interface{}{
"path": map[string]interface{}{"type": "string"},
},
"required": []string{"path"},
})
Reasoning channel
fromWireRespMessage handles three provider postures:
- Separate
reasoning_contentfield — copied directly toMessage.Reasoning - Inline
…or<thinking>…</thinking>incontent— extracted bysplitInlineThink - Unterminated opening tag — the whole remainder is reasoning (truncated generation safety)
The Reasoning field is never serialized onto the wire and never treated as the answer. It is surfaced as a distinct channel only.
Gateway headers
When gatewayURL is set, the request carries:
Authorization: Bearer ${MATRIX_GATEWAY_TOKEN}
X-Matrix-Actor-DID: <actor DID>
X-Matrix-Intent-ID: <intent ID>
X-Matrix-Slot: neo
This matches the daemon/router environment key MATRIX_GATEWAY_URL and the MCL compiler's gateway posture.
Error handling
HTTP errors are parsed for structured error bodies:
neo/llm: fireworks http 429: Rate limit exceeded (type=rate_limit)
Empty choices, parse failures, and API errors all return wrapped errors with the provider name for attribution.
Modifying the client
| What to change | Where |
|---|---|
| Supported API shapes | llm/client.go — New() shape guard |
| Gateway header set | llm/client.go — newHTTPRequest() |
| Inline reasoning tags | llm/client.go — splitInlineThink() |
| Message constructors | llm/message.go — SystemMessage, UserMessage, etc. |
| Tool schema defaults | llm/message.go — NewFunctionTool() |
