Adapter Workers

Adapters translate native LLM APIs into the mecha worker contract (GET /health, POST /task). They run in-process, so no Docker is required.

When to Use Adapters

| Use case | Worker type |
| --- | --- |
| Claude/Codex in Docker | Managed (docker: section) |
| Ollama, vLLM, LiteLLM, llama.cpp | Adapter (adapter: section) |
| External HTTP endpoint you control | Unmanaged (endpoint: field) |

Adapters are ideal for local LLMs where you don't need Docker isolation but want mecha's lifecycle management (start/stop/health).

Supported Adapters

| Type | Upstream API | Health Check | Source |
| --- | --- | --- | --- |
| ollama | /api/chat | GET / | internal/adapter/ollama.go |
| openai | /v1/chat/completions | GET /v1/models | internal/adapter/openai.go |

Configuration

Ollama

yaml
name: local-ollama
adapter:
  type: ollama
  upstream: http://localhost:11434
  model: gemma2:9b
timeout: 10m

OpenAI-Compatible

Works with vLLM, LiteLLM, llama.cpp server, or any OpenAI-compatible endpoint:

yaml
name: vllm-worker
adapter:
  type: openai
  upstream: http://gpu-server:8000
  model: meta-llama/Llama-3-70b
  api_key: ${VLLM_API_KEY}
timeout: 15m

Fields

| Field | Required | Description |
| --- | --- | --- |
| adapter.type | Yes | ollama or openai |
| adapter.upstream | Yes | Base URL of the LLM API |
| adapter.model | Yes | Model name passed to the API |
| adapter.api_key | No | Inline API key for authenticated endpoints. Not persisted to SQLite (in-memory only); use adapter.token for restart-safe secrets |
| adapter.token | No | ~/.mecha/secrets.yml reference (e.g. codex.default). Resolved at adapter start, persists across restarts |

Prefer adapter.token over adapter.api_key: the token is stored as a reference (not the raw value) and survives mecha serve restarts. The in-memory api_key field is intentionally tagged json:"-" so raw keys never land in mecha.db.
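
To illustrate why a json:"-" tag keeps the raw key out of serialized state, here is a minimal sketch using Go's standard encoding/json. The AdapterConfig struct and its field names are hypothetical stand-ins, not mecha's actual types, and the sketch assumes the persisted form is produced by JSON marshaling.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AdapterConfig is a hypothetical stand-in for mecha's adapter settings;
// the real struct and its field names may differ.
type AdapterConfig struct {
	Type     string `json:"type"`
	Upstream string `json:"upstream"`
	Model    string `json:"model"`
	APIKey   string `json:"-"`               // skipped by encoding/json, so it stays in memory only
	Token    string `json:"token,omitempty"` // a secrets.yml reference, safe to serialize
}

// encode returns the JSON form of the config, as it might be stored.
func encode(c AdapterConfig) string {
	b, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	cfg := AdapterConfig{
		Type:     "openai",
		Upstream: "http://gpu-server:8000",
		Model:    "meta-llama/Llama-3-70b",
		APIKey:   "sk-raw-value",
		Token:    "codex.default",
	}
	// The output contains the token reference but no api key field at all.
	fmt.Println(encode(cfg))
}
```

Because the field is skipped on both marshal and unmarshal, a config reloaded from storage would come back with an empty APIKey, which matches the note that inline keys do not survive a restart.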

How It Works

When mecha serve starts, it auto-starts any offline adapter workers in-process. Each adapter runs an HTTP server that:

  1. Translates GET /health into the upstream's native health endpoint
  2. Translates POST /task into the upstream's chat completion API
  3. Converts the upstream response into the mecha result contract

The adapter server binds to a random loopback port. Mecha records the endpoint in the registry like any other worker, and stops the adapter on graceful shutdown.

Lifecycle

Adapters run in-process and are auto-started by mecha serve. You do not run mecha worker start on an adapter — the CLI rejects that with an error explaining adapters are started by the server.

bash
# Add the worker definition
mecha worker add workers/ollama-gemma.yml

# Start mecha serve — any adapter workers come up automatically
mecha serve

# In another terminal, check status
mecha worker ls
# NAME          TYPE     STATE   ENDPOINT                    HEALTH
# local-ollama  adapter  online  http://127.0.0.1:52431      ok

Adapter workers follow the same state machine as managed workers: offline → online ↔ busy → error. They stop when the server shuts down (the runners are drained as part of graceful shutdown).
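
As a rough illustration, the arrow notation above can be written as a transition table. This sketch encodes only the arrows exactly as written; the State type and canTransition helper are hypothetical, and mecha's real state machine may allow further transitions (for example back to offline on shutdown).

```go
package main

import "fmt"

// State is a hypothetical type naming the worker states listed above.
type State string

const (
	Offline State = "offline"
	Online  State = "online"
	Busy    State = "busy"
	Error   State = "error"
)

// allowed is a literal reading of "offline → online ↔ busy → error".
// It is an assumption for illustration, not mecha's actual table.
var allowed = map[State][]State{
	Offline: {Online},
	Online:  {Busy},
	Busy:    {Online, Error},
}

// canTransition reports whether the table permits moving from one state to another.
func canTransition(from, to State) bool {
	for _, s := range allowed[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(Offline, Online)) // true
	fmt.Println(canTransition(Busy, Online))    // true
	fmt.Println(canTransition(Offline, Busy))   // false
}
```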

Comparison with Unmanaged Workers

| Feature | Adapter | Unmanaged |
| --- | --- | --- |
| Lifecycle management | Yes (start/stop) | No (always running externally) |
| Health translation | Yes (native API → worker contract) | No (must implement /health natively) |
| In-process | Yes | No |
| Docker required | No | No |
| Custom API translation | Automatic | Manual (your endpoint must speak worker contract) |

Adding Custom Adapters

Adapters are compiled-in Go packages implementing the adapter.Adapter interface:

go
type Adapter interface {
    Name() string
    Health(ctx context.Context) error
    SendTask(ctx context.Context, prompt string) ([]byte, error)
}

See internal/adapter/ollama.go for a reference implementation.

Released under the ISC License.