# Hybrid CLI

`aldo run` auto-decides local vs hosted based on the agent's required capabilities; `--route auto|local|hosted` overrides.
`aldo run <agent>` decides where to execute the agent based on three inputs:

- The agent's required capability classes: primary plus fallbacks declared in the spec's `modelPolicy`.
- What's locally reachable: what `@aldo-ai/local-discovery` says is up on the operator's machine (Ollama, vLLM, llama.cpp, MLX, LM Studio).
- Whether the operator has hosted access: `ALDO_API_TOKEN` set against `ALDO_API_URL` (default `https://ai.aldo.tech`).
# The decision
`--route auto` (the default) follows these rules in order (a sketch follows the list):

- If any local model advertises a capability class the agent asks for (primary or any fallback), run locally. This is the LLM-agnostic default: the user already paid for the model.
- If `ALDO_LOCAL_DISCOVERY` is unset or returns no classes, default to local (the gateway router fails downstream with a typed error if no model actually resolves). This preserves the pre-§14-A behaviour for callers that opt out of probing.
- If local can't serve and `ALDO_API_TOKEN` is set, delegate to the hosted plane via REST.
- Otherwise: a typed error with a hint to either pull a matching local model or set `ALDO_API_TOKEN`.
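
As a rough illustration, here is that order as a TypeScript sketch. The names (`decideRoute`, `AutoRouteInputs`) and the thrown error are assumptions for illustration, not the CLI's real internals:

```typescript
// Illustrative sketch of the --route auto order; names are assumptions.
type Route = "local" | "hosted";

interface AutoRouteInputs {
  requiredClasses: string[]; // primary + fallbacks from modelPolicy
  probed: boolean;           // is ALDO_LOCAL_DISCOVERY set?
  localClasses: string[];    // capability classes the probe reported
  hasHostedToken: boolean;   // is ALDO_API_TOKEN set?
}

function decideRoute(i: AutoRouteInputs): Route {
  // 1. A local model advertises a required class: run locally.
  if (i.localClasses.some((c) => i.requiredClasses.includes(c))) return "local";
  // 2. No probe, or the probe returned no classes: default to local and
  //    let the gateway router emit its typed failure if nothing resolves.
  if (!i.probed || i.localClasses.length === 0) return "local";
  // 3. Local can't serve the class, but a hosted token exists: delegate.
  if (i.hasHostedToken) return "hosted";
  // 4. Nothing can run: typed error with a remediation hint.
  throw new Error("pull a matching local model or set ALDO_API_TOKEN");
}
```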
`--route hosted` forces remote dispatch and errors if no token is set. `--route local` forces local execution; the gateway router will produce its own typed failure if no model resolves.
# Configuring
```bash
# Local-only: nothing extra. `aldo run` finds Ollama at
# http://localhost:11434 and runs against whatever you've pulled.
export ALDO_LOCAL_DISCOVERY=ollama   # opt into the probe
ollama pull qwen3:14b                # whatever your spec needs

# Hybrid: add a hosted token.
export ALDO_API_TOKEN=ak_live_…      # mint at /settings/api-keys
# ALDO_API_URL defaults to https://ai.aldo.tech.
```

# Examples
```bash
# Auto. Local for sensitive agents (privacy_tier=sensitive +
# capability_class=local-reasoning). Hosted for cloud-tier ones
# (capability_class=reasoning-large with no local fallback that matches).
aldo run principal --inputs '{"brief":"add /v1/healthz/db"}'

# Force local. Useful when you know your local model is up to it
# even if it doesn't advertise the exact class.
aldo run principal --route local

# Force hosted. Useful for a cost / quality comparison.
aldo run principal --route hosted

# JSON output for piping into a CI step or log aggregator.
aldo run principal --route hosted --json
```

# What hosted dispatch does
`runOnHostedApi` is a thin REST wrapper around the platform API:

- POST `/v1/runs` with `{ agentName, agentVersion?, inputs?, project? }`.
- Poll GET `/v1/runs/:id` every 1.5 s until the status hits a terminal state (`completed`, `failed`, `rejected`, `canceled`).
- Return the final `RunDetail` (events + usage included).
Transient poll non-200s are logged to stderr but do not kill the run; the executor on the hosted side is decoupled from the poll loop. A 10-minute global ceiling fires `HostedRunTimeoutError` if the run never reaches a terminal status.
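
A sketch of that dispatch loop follows. The endpoints, 1.5 s poll interval, terminal states, and 10-minute ceiling come from this doc; the response shapes (`{ id }`, `{ status }`) and the `sleep` helper are assumptions:

```typescript
// Sketch of runOnHostedApi's behaviour; response shapes are assumptions.
const TERMINAL = new Set(["completed", "failed", "rejected", "canceled"]);
const POLL_MS = 1_500;
const CEILING_MS = 10 * 60 * 1_000;

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

async function runOnHostedApi(
  baseUrl: string,
  token: string,
  body: { agentName: string; agentVersion?: string; inputs?: unknown; project?: string }
): Promise<{ status: string }> {
  const headers = {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  };
  const created = await fetch(`${baseUrl}/v1/runs`, {
    method: "POST",
    headers,
    body: JSON.stringify(body),
  });
  const { id } = (await created.json()) as { id: string };

  const deadline = Date.now() + CEILING_MS;
  while (Date.now() < deadline) {
    await sleep(POLL_MS);
    const res = await fetch(`${baseUrl}/v1/runs/${id}`, { headers });
    if (!res.ok) {
      // Transient poll failure: log to stderr and keep going; the hosted
      // executor is decoupled from this loop.
      console.error(`poll ${res.status} for run ${id}, retrying`);
      continue;
    }
    const run = (await res.json()) as { status: string };
    if (TERMINAL.has(run.status)) return run; // final RunDetail
  }
  throw new Error("HostedRunTimeoutError"); // run never went terminal
}
```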
The hosted path produces the same `done in <ms>ms, $<usd> on <model>` line as the local path, so scripts don't have to branch on which side ran.
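
For instance, a successful run might end with a line of this shape (values purely illustrative):

```
done in 8421ms, $0.0042 on qwen3:14b
```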
# Privacy
Privacy-tier enforcement is unchanged: a sensitive agent must have `capabilityClass: local-reasoning` in its policy and is routed fail-closed to a local model. The hybrid CLI never delegates a sensitive agent to hosted; even with `--route hosted`, the gateway router on the hosted side will refuse and return a `privacy_tier_unroutable` error.
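
A sketch of that fail-closed rule, assuming a simplified policy shape; only `capabilityClass: local-reasoning` and the `privacy_tier_unroutable` error name come from this doc, the guard itself is illustrative:

```typescript
// Illustrative fail-closed privacy guard; shapes are assumptions.
interface AgentPolicy {
  privacyTier: "standard" | "sensitive";
  capabilityClass: string;
}

function assertRoutable(policy: AgentPolicy, route: "local" | "hosted"): void {
  if (policy.privacyTier !== "sensitive") return;
  // Sensitive agents must declare local-reasoning and must stay local,
  // whatever --route says; the hosted gateway applies the same refusal.
  if (policy.capabilityClass !== "local-reasoning" || route === "hosted") {
    throw new Error("privacy_tier_unroutable");
  }
}
```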
# What hybrid CLI does NOT do
- Stream tokens through hosted polling. v0 polls; SSE wiring is a follow-up that needs a new API endpoint. The user-visible difference: a hosted run shows `... hosted run <id> status → running` and then jumps to the final output, instead of streaming every delta.
- Auto-discover hosted endpoints. `ALDO_API_URL` is the only knob. Self-hosted ALDO deployments work the same way: point the var at your install.
- Manage local model lifecycle. `aldo run` doesn't `ollama pull` a missing model for you. Pull what you need, then route.