Your LLM reads your entire tool list on every call. That's 50-150 tokens per tool, before it thinks a single thought. mcp-flowgate replaces the whole list with seven fixed tools — and adds governance you declare in YAML.

version: "1.0.0"
connections:
  github:
    transport: stdio
    command: github-mcp-server
proxy:
  import:
    - connection: github          # import all tools
  expose:
    - name: deploy.prod
      executor: { kind: human }   # LLM can't fire this

Every MCP tool you register lands in the model's system prompt. Ten tools? Fine. Fifty? At around 100 tokens each, you're spending roughly 5,000 tokens per call just to describe what's available.
It gets worse. The model has to reason about every tool to pick the right one. More tools means more output tokens spent choosing, more wrong choices, more retries. A model staring at 50 tools and picking the wrong one wastes a full round trip.
And none of it comes with audit, retries, approval gates, or governance. You get a flat list and a prayer.
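The arithmetic behind those figures is easy to check. The sketch below uses only the 50-150 tokens-per-tool range quoted above; the function name and the tool counts are illustrative, and real overhead depends on how verbose each tool's schema is.

```python
# Rough per-call cost of describing tools to the model.
# LOW and HIGH come from the range quoted in the text; the rest is illustrative.

def tool_list_overhead(tool_count: int, tokens_per_tool: int) -> int:
    """Tokens spent on tool definitions before the model does any work."""
    return tool_count * tokens_per_tool

LOW, HIGH = 50, 150  # tokens per tool definition

for count in (10, 50):
    low = tool_list_overhead(count, LOW)
    high = tool_list_overhead(count, HIGH)
    print(f"{count} tools: {low}-{high} tokens per call")

# prints:
# 10 tools: 500-1500 tokens per call
# 50 tools: 2500-7500 tokens per call
```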
mcp-flowgate exposes exactly seven MCP tools regardless of how many capabilities you wire in. The model's tool list never grows.

gateway.home      Browse the full catalog
gateway.search    Find capabilities by keyword
gateway.describe  Get full schema on demand
workflow.start    Begin a governed workflow
workflow.get      Check current state
workflow.submit   Execute a transition
workflow.explain  Inspect guards and rules

Your 50 capabilities surface through search results and response links, loaded one at a time, only when relevant. This pattern has a name: HATEOAS.
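To make the search-then-describe idea concrete, here is a minimal in-memory sketch. It is not mcp-flowgate's implementation: the `Capability` and `Catalog` classes, the naive substring matching, and the sample `tool:deploy.prod` entry are all assumptions made for illustration. The point is that search returns only ids and titles, and the full schema is fetched on demand.

```python
from dataclasses import dataclass, field

@dataclass
class Capability:
    id: str
    title: str
    input_schema: dict = field(default_factory=dict)

class Catalog:
    """Hypothetical catalog: cheap search results, full schema only on request."""

    def __init__(self, capabilities):
        self._by_id = {c.id: c for c in capabilities}

    def search(self, query: str) -> dict:
        # Naive matching (an assumption): every query word must appear
        # somewhere in the capability's id or title.
        words = query.lower().split()
        items = [
            {"id": c.id, "title": c.title}
            for c in self._by_id.values()
            if all(w in f"{c.id} {c.title}".lower() for w in words)
        ]
        return {"items": items}

    def describe(self, capability_id: str) -> dict:
        # The schema is returned only when the model asks for this one capability.
        c = self._by_id[capability_id]
        return {"id": c.id, "title": c.title, "inputSchema": c.input_schema}

catalog = Catalog([
    Capability("workflow:content_publish", "Governed content publishing",
               {"type": "object", "properties": {"topic": {"type": "string"}}}),
    Capability("tool:deploy.prod", "Deploy to production"),
])

hits = catalog.search("publish content")
print(hits["items"])
```

However many capabilities sit behind the catalog, the model only ever sees the handful that match its query.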
Every guardrail is YAML. No glue code, no per-tool wrappers, no host-specific routing.
Define inputSchema on any capability. Bad input never reaches your executor.
Guard on permission, role, expr, or evidence, and compose them with allOf, anyOf, and not.
Tag a transition actor: "human" and the runtime rejects LLM attempts with ACTOR_MISMATCH. Not a hint: a hard gate.
Tag steps actor: "deterministic". The runtime chains through them automatically. Zero LLM round trips for computable work.
Every action emits a structured JSON event. Route to file, stdout, or your observability stack.
Timeout, retry with exponential backoff, fallback executors. Declare it in YAML, not code.
MCP servers, shell commands, REST APIs. Import existing tools with proxy.import, no rewriting.
Send SIGHUP to reload config without restarting. In-flight workflows continue uninterrupted.
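The allOf / anyOf / not composition mentioned above can be pictured as a small recursive evaluator. This is a sketch under stated assumptions, not the runtime's code: the dict shape of a guard, the `ctx` fields, and the `evaluate` function are hypothetical, and the expr kind is left out because the page does not describe its expression language.

```python
def evaluate(guard: dict, ctx: dict) -> bool:
    """Recursively evaluate a guard tree against a caller context (hypothetical shapes)."""
    # Combinators
    if "allOf" in guard:
        return all(evaluate(g, ctx) for g in guard["allOf"])
    if "anyOf" in guard:
        return any(evaluate(g, ctx) for g in guard["anyOf"])
    if "not" in guard:
        return not evaluate(guard["not"], ctx)
    # Leaf guards: simple membership checks against the context
    if "permission" in guard:
        return guard["permission"] in ctx.get("permissions", ())
    if "role" in guard:
        return guard["role"] in ctx.get("roles", ())
    if "evidence" in guard:
        return guard["evidence"] in ctx.get("evidence", ())
    raise ValueError(f"unknown guard: {guard!r}")

# An editor may proceed with either the publish permission or a passed review,
# unless the account is suspended. All names here are made up for the example.
guard = {
    "allOf": [
        {"role": "editor"},
        {"anyOf": [{"permission": "publish"}, {"evidence": "review_passed"}]},
        {"not": {"role": "suspended"}},
    ]
}

print(evaluate(guard, {"roles": ["editor"], "evidence": ["review_passed"]}))
print(evaluate(guard, {"roles": ["editor", "suspended"], "permissions": ["publish"]}))
```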
Claims deserve evidence. Here's the actual wire format from the content-publish workflow.
1. The model searches — not scans
→ gateway.search { "query": "publish content" }
← { "items": [{
      "id": "workflow:content_publish",
      "title": "Governed content publishing"
    }] }

One hit. Not 50 tool definitions.
2. Start the workflow — get a prefilled link back
→ workflow.start {
    "definitionId": "content_publish",
    "input": { "topic": "Q2 launch" }
  }
← { "state": "idea",
    "links": [{
      "rel": "create_outline",
      "method": "workflow.submit",
      "args": { /* prefilled */ }
    }] }

The model doesn't guess the next step. It follows the link.
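On the client side, "following the link" can be as small as the sketch below. The response shape mirrors the exchange above; the `next_call` helper is hypothetical, and the contents of `args` are placeholders, since the page elides what the prefilled arguments actually contain.

```python
def next_call(response: dict, rel: str) -> tuple[str, dict]:
    """Turn a link from a workflow response into the next tool call."""
    for link in response.get("links", []):
        if link.get("rel") == rel:
            # The runtime already filled in the arguments; the caller reuses them as-is.
            return link["method"], link.get("args", {})
    raise LookupError(f"no link with rel={rel!r} in state {response.get('state')!r}")

response = {
    "state": "idea",
    "links": [{
        "rel": "create_outline",
        "method": "workflow.submit",
        # Placeholder values: the real prefilled args are not shown in the source.
        "args": {"workflowId": "example-id", "transition": "create_outline"},
    }],
}

method, args = next_call(response, "create_outline")
print(method, args)
```

If the requested rel is not offered in the current state, the helper raises instead of guessing, which is the behaviour the link-driven design is meant to encourage.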
3. Governance stops the model cold
← { "state": "awaiting_approval",
    "links": [{
      "rel": "approve",
      "actor": "human"
    }] }
No agent link exists. If the model tries anyway: ACTOR_MISMATCH.
The only way forward is a human.
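A hard actor gate of this kind might be enforced roughly as follows. Only the ACTOR_MISMATCH code and the actor values come from the page; the `ActorMismatch` exception, the `submit` function, and the `approved` target state are assumptions for the sake of a runnable sketch.

```python
class ActorMismatch(Exception):
    """Raised when the caller is not the actor a transition requires."""
    code = "ACTOR_MISMATCH"

def submit(transition: dict, caller: str) -> str:
    """Apply a transition only if the caller matches its declared actor."""
    required = transition["actor"]
    if caller != required:
        # Rejected before any executor runs: the gate is not advisory.
        raise ActorMismatch(f"transition requires {required!r}, got {caller!r}")
    return transition["target"]

# Hypothetical transition out of awaiting_approval.
approve = {"actor": "human", "target": "approved"}

try:
    submit(approve, caller="agent")
except ActorMismatch as err:
    print(err.code, "-", err)

print(submit(approve, caller="human"))
```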
Linting, testing, building artifacts: these are computable. Tag them actor: "deterministic" and the runtime chains through them automatically.

states:
  lint:
    transitions:
      run_lint:
        target: test
        actor: deterministic
        executor: { kind: cli, command: lint-check }
  test:
    transitions:
      run_tests:
        target: build
        actor: deterministic
  build:
    transitions:
      build_artifact:
        target: ready_to_deploy
        actor: deterministic
  ready_to_deploy:
    goal: Confirm deployment
    transitions:
      deploy:
        target: deployed
        actor: agent   # chain stops here

The model calls workflow.start. The runtime chains lint, test, build automatically. Three executor calls, zero LLM round trips. The response arrives at ready_to_deploy with a chain trace and guidance for the decision ahead.
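The chaining behaviour can be sketched as a loop that keeps advancing while the current state offers a deterministic transition. This is an illustration, not the runtime's code: the `chain` function, the trace format, the `max_steps` safety limit, and the stubbed executor are assumptions, and the `STATES` dict simply mirrors the YAML above.

```python
STATES = {
    "lint": {"transitions": {"run_lint": {"target": "test", "actor": "deterministic"}}},
    "test": {"transitions": {"run_tests": {"target": "build", "actor": "deterministic"}}},
    "build": {"transitions": {"build_artifact": {"target": "ready_to_deploy", "actor": "deterministic"}}},
    "ready_to_deploy": {"transitions": {"deploy": {"target": "deployed", "actor": "agent"}}},
    "deployed": {"transitions": {}},
}

def chain(states: dict, start: str, run_executor, max_steps: int = 100):
    """Advance through deterministic transitions; stop at the first state that needs a decision."""
    state, trace = start, []
    for _ in range(max_steps):  # guard against accidental cycles
        auto = [
            (name, t) for name, t in states[state]["transitions"].items()
            if t["actor"] == "deterministic"
        ]
        if not auto:
            return state, trace  # an agent or human has to act from here
        name, transition = auto[0]
        run_executor(name)  # e.g. a CLI call; stubbed in this sketch
        trace.append({"from": state, "transition": name, "to": transition["target"]})
        state = transition["target"]
    raise RuntimeError("chain exceeded max_steps")

calls = []
final_state, trace = chain(STATES, "lint", calls.append)
print(final_state, len(trace), calls)
# prints: ready_to_deploy 3 ['run_lint', 'run_tests', 'build_artifact']
```

The loop stops at ready_to_deploy because its only transition is tagged agent, which matches the "chain stops here" comment in the YAML.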
One YAML file. Seven tools. Full governance. Free and open source.