Agent API
Stable, dual-auth API + CLI for terminals, scripts, and agents.
Base URL: https://api.getpromethic.com/api/v2/public
Quickstart
Issue a key in the web app at app.getpromethic.com → Settings → Developer Keys, then:
curl -H "X-API-Key: pmk_..." \
https://api.getpromethic.com/api/v2/public/prompts
Or via the CLI:
npm install -g @soulwarestudio/promethic-cli
promethic auth login # paste pmk_... key
promethic prompts list
promethic run <prompt-id> --input "summarize this article"
For write-scope keys, an agent can author a whole prompt
declaratively via a YAML manifest. Note the nested
parameters object: the server requires the
{ model_id, parameters } shape so that per-model
parameter values stay distinct from envelope fields.
Step one: discover a real model_id via
GET /api/v2/public/models. Catalog IDs are
opaque (e.g., gpt54nano_2c6f9b4d); the wire
name isn't the marketing name. Pick one from the response:
curl -H "X-API-Key: pmk_..." \
https://api.getpromethic.com/api/v2/public/models \
| jq '.recommended_defaults'
# → {"model_id": "gpt55_8e2b1d4f", "reasoning_effort": "medium", ...}
Step two: paste it directly into your manifest — the
catalog response is round-trippable. Copy model_id
into modelSettings.model_id; pick parameter
values from each param's values /
min / max /
provider_default.
# prompt.yaml
name: Article summarizer
promptText: |
Summarize the input in three bullets.
modelSettings:
model_id: gpt54nano_2c6f9b4d # ← from /models response above
parameters:
reasoning_effort: low
attachments:
- type: text
file: ./examples.txt
promethic prompts create --manifest prompt.yaml
Authentication
All requests carry an API key in the X-API-Key header. Keys
start with pmk_ and are scoped to one user. Three scopes are
available in V1.0.1:
| Scope | Grants |
|---|---|
| read | list/get prompts, versions, records, attachments, record images, models catalog |
| execute | run, revise (runId or recordId), finalize, abandon |
| write | create + update + delete prompts, versions, attachments, records (V1.0.1 + V1.1 Phases 2 & 3) |
Scopes are checked literally; an execute-only key cannot
list prompts. To do all three, request all three:
scopes: ["read", "execute", "write"]. As of V1.1, DELETE
is part of write; per-prompt grants limit the blast radius
of a leaked key.
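The literal scope check can be sketched client-side as below. This is only an illustration: the REQUIRED_SCOPE table is a hand-written subset of the endpoint lists in this document, and missing_scope is a hypothetical helper, not part of the API or CLI.

```python
# Assumption: a hand-copied subset of the endpoint tables in this document.
REQUIRED_SCOPE = {
    ("GET", "/prompts"): "read",
    ("POST", "/prompts/{id}/run"): "execute",
    ("POST", "/prompts"): "write",
    ("DELETE", "/prompts/{id}"): "write",
}


def missing_scope(key_scopes, method, route):
    """Return the scope the key lacks for this route, or None if it is allowed."""
    needed = REQUIRED_SCOPE[(method, route)]
    # Literal check: holding "execute" or "write" never implies "read".
    return None if needed in key_scopes else needed
```

For example, missing_scope(["execute"], "GET", "/prompts") reports "read", which matches the rule that an execute-only key cannot list prompts.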
Per-prompt grants V1.1
On top of scopes, each key can be optionally restricted to a specific set of prompts. Managed via the web app at app.getpromethic.com → Settings → Developer Keys → Manage; agents do not configure their own restrictions.
- Unrestricted (default): a freshly-minted key has zero per-prompt grants and can access ALL of your prompts, gated only by its scopes.
- Restricted: once you add ANY prompt to a key's grant list, the key is restricted and only listed prompts are accessible. Calls to other prompts return 403 grant_required.
- Removing the last prompt from a restricted key returns it to unrestricted (the web UI confirms this transition).
Enforcement is per-call: every endpoint that resolves to a single prompt (run, revise on runId, revise on recordId, finalize, abandon, prompt and version reads, prompt PATCH/version create/attachment upload, record GET/image/DELETE/PATCH) re-checks the grant list at the time of the call. If you revoke a grant mid-workflow, the next call to the affected prompt 403s.
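A minimal sketch of the grant rule, assuming the grant list is available to the caller as a plain list of prompt IDs (the API does not expose it to agents; prompt_allowed is a hypothetical name used only to show the logic):

```python
def prompt_allowed(grant_list, prompt_id):
    """Mirror the documented rule: an empty grant list means unrestricted."""
    if not grant_list:
        # Zero grants: the key can reach every prompt its scopes permit.
        return True
    # One or more grants: only the listed prompts are reachable;
    # anything else would come back as 403 grant_required.
    return prompt_id in grant_list
```

Because enforcement is per-call, a result computed earlier in a workflow can go stale if a grant is revoked in the meantime.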
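A request that carries both a session and an API key that resolve to different users is rejected with 400 mixed_credentials_principal_mismatch.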
Endpoints
Read (read scope)
| Method | Path | Description |
|---|---|---|
| GET | /prompts?limit=&cursor= | list (slim DTO) |
| GET | /prompts/{id} | + current version |
| GET | /prompts/{id}/versions | paginated history |
| GET | /prompts/{id}/versions/{vid} | single version |
| GET | /prompts/{id}/attachments | prompt attachments |
| GET | /attachments/{id} | attachment download |
| GET | /records?promptId=&versionId=&source=&createdBy= | list (cursor-paginated) |
| GET | /records/{id} | slim record DTO |
| GET | /records/{id}/image?index=N | image PNG (binary) |
| GET | /models | catalog: model_id, supportedOutputModalities, costs (V1.0.1) |
Execute (execute scope)
| Method | Path | Description |
|---|---|---|
| POST | /prompts/{id}/run | creates RunSession; SSE |
| POST | /runs/{runId}/revise | append turn; SSE |
| POST | /runs/{runId}/finalize | session→record (V1.1: ?persist= removed; always commits) |
| POST | /runs/{runId}/abandon | idempotent |
| GET | /runs/{runId}/images/{N} | session image (active sessions) |
| POST | /records/{id}/revise V1.2 | rehydrate fresh run from record + revise (replaces V1.1 /revise-again); per-record advisory lock; SSE |
| DELETE | /records/{id} V1.1 | self-delete (ApiKey-owned, 24h window) |
| PATCH | /records/{id} V1.1 | amend notes/tag (RFC 7396; no time window) |
Run lifecycle body shapes V1.1
The execute surface mirrors the desktop Avalonia app's Convert / Revise / Copy / Copy & Tag flow. Body shapes for each call:
POST /runs/{runId}/revise
{
"instruction": "make it more concise", // required
"intermediateOutput": "edited prior output" // optional
}
intermediateOutput is the user's edited
prior-turn output. When passed, the model sees this (instead
of the prior turn's actual output) as context for this
revision, AND it lands in the resulting
ConversionDelta.IntermediateOutput. Mirrors the
Avalonia "user edited the textbox before hitting Revise"
flow. Omit for vanilla revise-from-prior-output. Image-
modality runs reject this field (400 intermediate_output_not_supported_image) —
image revisions always source from the model's actual prior
output. 32 KB cap.
POST /runs/{runId}/finalize
{
"finalText": "edited final output", // optional; creates edit delta if differs from model output
"tag": "exemplar", // optional; attaches to edit delta (requires finalText)
"notes": "context for this run" // optional; record-level
}
Three-axis surface that mirrors Avalonia's Copy / Copy & Tag semantics exactly:
- Plain finalize (no body): equivalent to Avalonia Copy with no edits. Saves the record from the model's last output. No edit delta.
- finalText only: Avalonia Copy after editing the output box. If finalText differs from the model's last output, the server creates an edit delta with an empty tag. If it matches, no edit delta is created (treated as a clean Copy).
- finalText + tag: Avalonia Copy & Tag. The tag attaches to the edit delta, which is the Refine wizard signal. The server returns 400 tag_without_delta if finalText is omitted or matches the model output (no edit delta to tag). Avalonia disables the Copy & Tag button in the same situation.
- notes: independent of the above. Record-level free text. Available without finalText on plain finalize, or alongside finalText/tag.
Image-modality runs reject finalText
(400 final_text_not_supported_image) —
record.finalCopiedOutput for image records is
server-derived from the per-turn effectivePromptForImage
accumulation chain (training-data invariant; CLAUDE.md
"Image Records: FinalCopiedOutput as Accumulated Prompt").
Caps: finalText 256 KB
(413 final_text_too_large);
notes 64 KB
(413 notes_too_large);
tag 256 chars
(413 tag_too_large).
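The finalize rules above can be pre-checked before sending a request. The sketch below is a client-side approximation only: it assumes "KB" means 1024 bytes of UTF-8, and the order in which the server evaluates these checks is a guess. predict_finalize_error is a hypothetical helper name.

```python
KB = 1024  # assumption: the caps are counted in UTF-8 bytes, 1 KB = 1024 bytes


def predict_finalize_error(body, model_output, is_image_run=False):
    """Return the reason_code the documented rules suggest, or None if the body looks valid."""
    final_text = body.get("finalText")
    tag = body.get("tag")
    notes = body.get("notes")

    if is_image_run and final_text is not None:
        return "final_text_not_supported_image"
    if final_text is not None and len(final_text.encode("utf-8")) > 256 * KB:
        return "final_text_too_large"
    if notes is not None and len(notes.encode("utf-8")) > 64 * KB:
        return "notes_too_large"
    if tag is not None and len(tag) > 256:
        return "tag_too_large"

    # An edit delta exists only when finalText is present AND differs
    # from the model's last output.
    creates_edit_delta = final_text is not None and final_text != model_output
    if tag is not None and not creates_edit_delta:
        return "tag_without_delta"
    return None
```

A body of {"tag": "exemplar"} with no finalText is flagged as tag_without_delta, while {"notes": "..."} alone passes, since notes are independent of the edit delta.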
Record DTO turns[] field V1.2
Records returned by GET /records, GET /records/{id},
and POST /finalize include a turns[] array that is
a synthesized linear history of the record's states (run + revisions + optional
edit). Each entry has a stable index matching the
fromTurn parameter accepted by /revise and /finalize.
{
"id": "...",
"promptId": "...",
"inputText": "Summarize this article: ...",
"finalCopiedOutput": "Y_edited",
"turns": [
{ "index": 0, "kind": "run", "input": "Summarize this article: ...", "output": "X" },
{ "index": 1, "kind": "revision", "instruction": "make formal",
"intermediateOutput": "X", "output": "Y", "modelId": "...", "costMicroCents": 1234 },
{ "index": 2, "kind": "edit", "intermediateOutput": "Y",
"output": "Y_edited", "tag": "user-edit" }
],
...
}
Three kinds:
- kind: "run" is always index 0. It carries input (the prompt's user input). NOTE: output for the run turn is "what the next turn saw as its prior context, OR the record's finalCopiedOutput if no later turn." If a user edited the textbox client-side before pressing Revise, the edit was committed forward as the next revision's intermediateOutput; the model's literal output text is not preserved.
- kind: "revision" is appended once per /revise call. instruction is the user's revise instruction. intermediateOutput is what the model saw as prior context.
- kind: "edit" appears at most once per record and is always the last entry. It is created when /finalize was called with finalText differing from the model's last output. intermediateOutput is the model's actual last output before the edit; output is the user's edited text (= record's finalCopiedOutput).
Use turns[].index to pass fromTurn on /revise or
/finalize for rewind-and-redo (V1.2).
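One way to choose a fromTurn value is to walk turns[] backwards and skip the trailing edit entry. This is a sketch under the assumption that the caller wants to rewind to the last model-produced turn; last_model_turn is a hypothetical helper, and the sample data is copied from the record example above.

```python
def last_model_turn(turns):
    """Return the index of the last run/revision turn, skipping a trailing edit."""
    for turn in reversed(turns):
        if turn["kind"] in ("run", "revision"):
            return turn["index"]
    raise ValueError("record has no run turn")


# Sample turns[] taken from the record shape shown in this section.
SAMPLE_TURNS = [
    {"index": 0, "kind": "run", "input": "Summarize this article: ...", "output": "X"},
    {"index": 1, "kind": "revision", "instruction": "make formal",
     "intermediateOutput": "X", "output": "Y"},
    {"index": 2, "kind": "edit", "intermediateOutput": "Y", "output": "Y_edited"},
]
```

With the sample above, last_model_turn(SAMPLE_TURNS) selects index 1, the revision that preceded the user's edit.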
Record self-management (V1.1)
DELETE /api/v2/public/records/{id} hard-deletes a record
if and only if (a) the caller is an API key, (b) the record was
created by the same API key (credentialPrincipalType=ApiKey +
matching id), and (c) the record is less than 24h old (anchored to
record.createdAt). Returns 204 No Content on
success, or 403 record_not_owned_by_api_key /
409 record_self_delete_window_expired /
404 record_not_found otherwise. Delete is hard
(cascade to deltas; FK ON DELETE SET NULL on
RunSession.FinalizedRecordId auto-clears any
finalize-replay rendezvous so the next /finalize replay returns
410 record_was_deleted). Image bytes are best-effort
cleaned from blob storage. Retries after a successful DELETE
return 404 — the row is gone, so HTTP-level
idempotency is by-construction (the second call always 404s).
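The three DELETE conditions can be expressed as a small decision function. This is a sketch, not the server's implementation: the credentialPrincipalId field name and the order of the checks are assumptions, and predict_delete_status is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def predict_delete_status(record, caller_key_id, now):
    """Return (http_status, reason_code) that the documented rules suggest."""
    if record is None:
        return 404, "record_not_found"
    # Assumption: the owning key id lives in a field named credentialPrincipalId.
    owned = (record["credentialPrincipalType"] == "ApiKey"
             and record["credentialPrincipalId"] == caller_key_id)
    if not owned:
        return 403, "record_not_owned_by_api_key"
    # The 24h window is anchored to record.createdAt.
    if now - record["createdAt"] >= timedelta(hours=24):
        return 409, "record_self_delete_window_expired"
    return 204, None
```

Passing None for the record models a retry after a successful DELETE, which the text above says always returns 404.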
PATCH /api/v2/public/records/{id} updates
notes and/or the record's edit-delta tag.
Same ApiKey-owned check as DELETE, but no time window —
amends are non-destructive. Body is RFC 7396 JSON Merge Patch
(Content-Type: application/merge-patch+json):
missing key = unchanged, explicit null = clear,
value = set. Setting tag on a record with no edit
delta returns 409 record_no_edit_delta — use
notes for record-level labels instead. The
response shape is { id, notes, tag, lastPatchedAtUtc }.
HIPAA §164.312(b) audit row is written with field
presence/length/SHA-256 prefix metadata only —
never the raw notes/tag content.
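For readers unfamiliar with RFC 7396, the merge rule (missing key = unchanged, null = clear, value = set) can be sketched locally as follows. This follows the algorithm in the RFC; how the server represents a cleared field in its response (absent versus null) is not stated here, so treat the removal below as an assumption.

```python
def merge_patch(target, patch):
    """Apply an RFC 7396 JSON Merge Patch and return the merged value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # Explicit null clears the member.
            result.pop(key, None)
        else:
            # Objects merge recursively; scalars and arrays replace.
            result[key] = merge_patch(result.get(key), value)
    return result
```

Applying {"tag": None} to {"notes": "a", "tag": "t"} leaves only notes, and an empty patch changes nothing.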
Write (write scope) V1.0.1
All mutating POSTs honour the Idempotency-Key header — see
Idempotency. PATCH /prompts/{id}
follows RFC 7396 JSON Merge Patch:
missing keys leave server state untouched; explicit null clears.
| Method | Path | Description |
|---|---|---|
| POST | /prompts | create prompt + initial version |
| PATCH | /prompts/{id} | RFC 7396 merge patch — application/merge-patch+json |
| PUT | /prompts/{id}/current-version | switch which version is "current" |
| POST | /prompts/{id}/versions | append new version (auto-increments versionNumber) |
| POST | /prompts/{id}/attachments | upload file (multipart/form-data) |
| DELETE | /prompts/{id} V1.1 | soft-delete prompt; cascades hide records/versions/attachments under it |
| DELETE | /prompts/{id}/versions/{vid} V1.1 | soft-delete version; rejects current version (409 version_is_current) |
| DELETE | /attachments/{id} V1.1 | soft-delete + storage refund; 409 attachment_referenced_by_active_run if any active run's snapshot references the blob |
Field caps: name ≤ 256 chars, promptText ≤ 256 KB,
modelSettings JSON ≤ 64 KB, text attachments ≤ 5 MB,
image attachments ≤ 10 MB, ≤ 20 attachments per prompt, ≤ 50 MB per prompt,
1 GB per user.
Cursor pagination
List endpoints return { items: [...], nextCursor: string|null }.
Cursors are signed (HMAC-SHA256) with a server-side root key; tampered
cursors return 400 invalid_cursor; cursors issued for one
user can't be replayed by another (400 invalid_cursor);
changing query filters mid-pagination returns 400 cursor_filter_mismatch.
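A pagination loop over the { items, nextCursor } envelope might look like the sketch below. The fetch_page callable is supplied by the caller (for example, a thin wrapper around an HTTP client); fake_fetch and its data exist only to demonstrate the loop and are not real API output.

```python
def iter_items(fetch_page, **filters):
    """Yield every item across pages, keeping the filters identical on each call."""
    cursor = None
    while True:
        # Filters must not change mid-pagination (400 cursor_filter_mismatch).
        page = fetch_page(cursor=cursor, **filters)
        yield from page["items"]
        cursor = page["nextCursor"]
        if cursor is None:
            return


# Stand-in for a real HTTP call, used only to demonstrate the loop.
_PAGES = {
    None: {"items": ["a", "b"], "nextCursor": "c1"},
    "c1": {"items": ["c"], "nextCursor": None},
}


def fake_fetch(cursor=None, **filters):
    return _PAGES[cursor]
```

The cursor is treated as opaque: it is passed back unchanged, since signed cursors that are modified return 400 invalid_cursor.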
SSE protocol
All run-producing endpoints (/run,
/runs/{runId}/revise,
/records/{id}/revise) return 200 OK with
Content-Type: text/event-stream. Errors are in-stream
events, not HTTP status codes — clients should NOT branch on
HTTP status for these endpoints.
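Since outcomes arrive as in-stream events, a client needs an SSE parser rather than a status-code branch. The sketch below parses already-decoded lines and finds the first terminator event. It assumes every data payload is JSON, and the SAMPLE_STREAM contents are illustrative rather than captured output.

```python
import json

TERMINATORS = {"run_completed", "run_failed", "run_replayed"}


def parse_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def run_outcome(events):
    """Return the first terminator (name, payload), or None if the stream had none."""
    for name, payload in events:
        if name in TERMINATORS:
            return name, payload
    return None


# Illustrative stream, not captured from the real API.
SAMPLE_STREAM = [
    "event: run_session",
    'data: {"runId": "r1"}',
    "",
    "event: run_failed",
    'data: {"runId": "r1", "reasonCode": "image_upload_failed", "charged": true}',
    "",
]
```

Here run_outcome(parse_sse(SAMPLE_STREAM)) surfaces the run_failed event even though the HTTP response itself was 200 OK.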
Event taxonomy (protocol v1)
| Event | When | Payload |
|---|---|---|
| run_session | First event | {protocolVersion, runId, turnIndex, modelId, outputModality?, seededFromRecordId?} |
| client_hint V1.1 Phase 0 | Image-modality runs only, immediately after run_session | {filter: "partial_image_b64", reason: "image_modality_inline_payloads", canonicalAccess: "GET /api/v2/public/runs/{runId}/images/{n}"} |
| (upstream events) | Middle | OpenAI's response.output_text.delta etc., passed through verbatim |
| run_completed | Success terminator | {runId, turnIndex, modelId, costMicroCents, imageCount?} |
| run_failed | Failure terminator | {runId, reasonCode, message?, charged, costMicroCents?, usageLogId?} |
| run_replayed V1.1 | Idempotency-Key replay terminator | {runId, turnIndex, modelId, outputModality?, state, streamingInProgress, recordId?, hint} |
| record_finalized V1.2 | Chained auto-finalize succeeded; emitted AFTER run_completed on the same stream when ?autoFinalize=true | {runId, recordId, turns, costMicroCents?} |
| record_finalize_failed V1.2 | Chained auto-finalize failed; the upstream run already succeeded. If retryable, agent calls POST /finalize manually. | {runId, reasonCode, retryable} |
| record_finalize_skipped V1.2 | Informational; emitted AFTER run_failed when ?autoFinalize=true was set. Agents MUST NOT trigger separate failure handling; the failure is already reported in run_failed. | {runId, reason: "run_failed", reasonCode} |
Check protocolVersion in run_session against your
parser's expected version.
- HTTP callers (direct REST to /api/v2/public): the SSE stream includes inline base64 image payloads on response.image_generation_call.partial_image and response.output_item.done. These can be tens of KB to several MB per frame. Most clients will want to filter them out and fetch the canonical bytes via GET /api/v2/public/runs/{runId}/images/{n} after run_completed. A client_hint event is emitted right after run_session to flag this.
- MCP callers (mcp.getpromethic.com/v1): the MCP transport drops partial_image_* events and redacts large base64 from kept frames automatically. Final image bytes are returned inline as MCP image content blocks in the tools/call result alongside the text transcript, so no follow-up fetch is needed. Clients that don't render image content can still call promethic_get_run_image by index, or GET /records/{recordId}/image?index={n} after finalize.
- run_completed.imageCount reports how many images THIS turn produced (per-turn, not session-aggregate). Iterate n in [0, imageCount). A text-only revise on an image session emits imageCount: 0.
- Pre-finalize, GET /runs/{runId}/images/{n} reads the LATEST turn's images. To access prior-turn images after the run ends, finalize first and read record.imageStoredPath via GET /records/{id}/image?index=N; every per-turn image is preserved on the record.
- If image generation completed upstream but blob storage failed (3× retries exhausted), the run terminates with run_failed { reasonCode: "image_upload_failed", charged: true }. Replay with the same Idempotency-Key returns this same failure (no re-attempt, no double-bill).
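For HTTP callers, the image notes above suggest two small helpers: one that skips partial-image frames and one that lists the canonical image URLs for a turn. This is a sketch; both function names are hypothetical, and treating a missing imageCount as zero is an assumption based on the field being optional.

```python
BASE_URL = "https://api.getpromethic.com/api/v2/public"
PARTIAL_IMAGE_EVENT = "response.image_generation_call.partial_image"


def without_partial_images(events):
    """Drop partial-image frames from an iterable of (event_name, payload) pairs."""
    return [(name, payload) for name, payload in events
            if name != PARTIAL_IMAGE_EVENT]


def image_urls(run_id, run_completed_payload):
    """List the canonical image URLs for THIS turn: n in [0, imageCount)."""
    count = run_completed_payload.get("imageCount", 0)  # assumption: absent means 0
    return [f"{BASE_URL}/runs/{run_id}/images/{n}" for n in range(count)]
```

A text-only revise on an image session reports imageCount: 0, so image_urls returns an empty list for that turn.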
run_replayed does not reflect post-replay state.
When you retry a /run or /revise with the
same Idempotency-Key, the server returns the
runId from the original call (the billable work is
already in flight or done) and terminates this stream with
run_replayed instead of run_completed.
The payload's state field is a snapshot from when
the replay was recorded — it does NOT track later state
changes on the same run. For the live state of a replayed
run, the agent must derive it from its own bookkeeping of
the original call (and, once GET /api/v2/public/runs/{runId}
ships in V1.2, poll that). The CLI's RunCallResult
surfaces this as {succeeded: false, reasonCode: "replayed_state_unknown"}
rather than masquerading as success.
Charge visibility
run_failed carries charged: true when the
upstream call was billed despite the local failure
(usage_log_write_failed or session_lost_mid_*
after a successful upstream call). Agents should record the cost from
costMicroCents for reconciliation.
Cost units V1.1
All cost fields on the public API surface use microcents
(1 cent = 1000 microcents; 1 USD = 100,000 microcents).
The integer wire format preserves sub-cent precision for
image-token / reasoning-heavy calls that previously
truncated. Render as cents-with-decimals via
costMicroCents / 1000; as USD via
costMicroCents / 100000; e.g.
costMicroCents: 503 = 0.503¢ = $0.00503.
The field was renamed from costMicros on
2026-05-11 because the Micros suffix
incorrectly suggested microUSD (the value is actually
1/1000 of a cent, off by 10×).
Affected fields: record.costMicroCents,
run_completed.costMicroCents,
run_failed.costMicroCents,
finalize.costMicroCents. The pre-V1.1
costCents field is removed from public DTOs;
pre-V1.1 records (where the column was NULL) backfill via
costCents * 1000 so historical rows still
surface a cost.
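The unit conversions can be sketched with Decimal to avoid float rounding on sub-cent values. The function names are illustrative; only the ratios (1 cent = 1000 microcents, 1 USD = 100,000 microcents) come from this section.

```python
from decimal import Decimal


def micro_cents_to_cents(cost_micro_cents):
    """1 cent = 1000 microcents."""
    return Decimal(cost_micro_cents) / 1000


def micro_cents_to_usd(cost_micro_cents):
    """1 USD = 100,000 microcents."""
    return Decimal(cost_micro_cents) / 100000


def backfill_from_cents(cost_cents):
    """Pre-V1.1 rows: derive microcents from the legacy costCents value."""
    return cost_cents * 1000
```

For costMicroCents: 503 this gives 0.503 cents and 0.00503 USD, matching the worked example in the text.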
Rate-limit headers V1.1
Every public-API response (success + 429) carries:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1746201600
X-RateLimit-Bucket: key=47/60,user=120/300
The standard pair (Limit / Remaining)
reports the most-restrictive bucket — the per-key bucket
when the caller is API-key-attributed (always smaller than
per-user), else the per-user bucket. The diagnostic
X-RateLimit-Bucket reports both
(key=N/L,user=N/L) so agents can observe per-key
vs per-user pressure separately. Session-only callers
see X-RateLimit-Bucket: user=N/L.
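The diagnostic header is easy to split into per-bucket numbers. The sketch below reads N as "remaining", which is inferred from the example headers above (Remaining: 47 alongside key=47/60); parse_bucket is a hypothetical helper.

```python
def parse_bucket(header_value):
    """Parse 'key=47/60,user=120/300' into {'key': (47, 60), 'user': (120, 300)}."""
    buckets = {}
    for part in header_value.split(","):
        name, _, usage = part.strip().partition("=")
        remaining, _, limit = usage.partition("/")
        buckets[name] = (int(remaining), int(limit))
    return buckets
```

Session-only callers would see a single user entry, so code reading the result should not assume the key bucket is present.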
CLI
The promethic Node CLI (@soulwarestudio/promethic-cli
on npm) ships every public-API endpoint behind ergonomic commands.
npm install -g @soulwarestudio/promethic-cli
promethic auth login # paste pmk_... key
promethic auth status
promethic auth logout
promethic prompts list [--limit N] [--cursor C] [--json]
promethic prompts get <id> [--json]
promethic prompts delete <id> # V1.1
promethic prompts delete-version <promptId> <versionId> # V1.1
promethic run <prompt-id> [--input "..."] [--input-file path] [--no-accept] [--auto-finalize true|false] [--json] # V1.2: --auto-finalize
promethic revise <handle> --instruction "..." [--intermediate-output "..."] [--from-turn N] [--no-accept] [--json]
# V1.2: handle is run-id or record-id; --from-turn rewinds
promethic finalize <run-id> [--final-text "..."] [--tag "..."] [--notes "..."] [--from-turn N] [--json]
# V1.2: --from-turn rewinds; finalize on Finalized session amends in place
promethic abandon <run-id> [--json]
promethic records list [--prompt <id>] [--source API] [--json]
promethic records get <id> [--json]
promethic records image <id> --index N --output path.png
promethic records delete <id> # V1.1
promethic records patch <id> [--notes "..."|--clear-notes] [--tag "..."|--clear-tag] [--json] # V1.1
promethic run <promptId> --image <file> [--image <file>] # V1.1 Phase 3 — vision input
promethic attachments add <promptId> <file> [--type image|text] [--filename ...] [--json]
promethic attachments list <promptId> [--json] # V1.1 Phase 3
promethic attachments get <attachmentId> <outputPath> # V1.1 Phase 3
promethic attachments delete <id> # V1.1
promethic mcp [--probe] # V1.1 — Claude Desktop / MCP server
Claude Desktop & other MCP clients V1.1
The CLI doubles as a local MCP
(Model Context Protocol) server. promethic mcp exposes
25 tools covering the full Avalonia desktop / Expo web
workspace surface:
- read: list_prompts (returns name + description + outputModality so agents can decide whether to invoke without an extra round-trip), get_prompt, list_records, list_versions, get_version, list_attachments, get_attachment, get_catalog
- execute: run_prompt (V1.2: optional autoFinalize auto-creates a record on success), revise_run (V1.2: accepts runId XOR recordId; auto-finalized records can be revised by recordId, so there is no need to set autoFinalize=false to "keep editing"; fromTurn rewinds), finalize_run (V1.2: fromTurn rewinds; can amend a Finalized session), abandon_run, get_run_image, delete_record, patch_record (notes / tag merge-patch), create_record (V1.4: agent-curated training data; {promptId, input, output, notes?} creates a finalized record with no LLM call and no spend, for "voice prompt" workflows where the user collaborates with the agent to seed examples before running Generate Prompt on the desktop)
- write (V1.1 Phase 2 + Phase 3): create_prompt, update_prompt, delete_prompt, create_version (with optional setAsCurrent in one transaction), update_version (versionDescription + description), switch_current_version, delete_version, upload_attachment, delete_attachment
All tools mirror the workspace flow agents would otherwise need a desktop or web browser to drive: author prompts, manage versions, attach reference files, run with vision, revise, finalize, edit notes/tags.
Install once, then add this block to
~/Library/Application Support/Claude/claude_desktop_config.json
(macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"promethic": {
"command": "promethic",
"args": ["mcp"]
}
}
}
Restart Claude Desktop. The 25 Promethic tools appear in its tool tray.
The MCP server uses the same pmk_… key as the rest of the CLI
(so promethic auth login covers Claude Desktop too); set
PROMETHIC_API_KEY in the config block's
env to override per-install.
Smoke-test before wiring Claude Desktop:
promethic mcp --probe # auth + URL + connectivity check, exits
Image input/output V1.1 Phase 3
Output: image-modality runs return image bytes inline as base64
(≤ 16 MB raw) or as a local-file path (larger). Same shape via
get_run_image for in-flight runs and
get_attachment for prompt attachments.
Input: run_prompt.images accepts the SAME
media-ref shape as the output. Agents can pipe a prior run's image straight
back in:
{
"images": [
{ "inline": true, "base64": "...", "mimeType": "image/png" },
{ "inline": false, "localPath": "/tmp/photo.png", "mimeType": "image/png" }
]
}
Up to 16 images per run, 10 MB each. Requires the prompt's model to declare
input_image capability (GPT-5.x via Responses API; gpt-image-1.x
as edit inputs). Trust note: localPath is read by the MCP server
with the user's permissions — only use paths the agent is authorized to read.
Attachment management
upload_attachment takes either inline base64 or a
localPath; idempotency keys are derived from
(promptId, filename, content) so retries replay the original
upload (no double-billing). list_attachments,
get_attachment, and delete_attachment round out
the surface. Per-file: 10 MB image / 5 MB text. Per-prompt: 50 MB total.
MCP cancellation propagates to a server-side /abandon so
cancelled runs release their RunSession within ~1 s.
Same pattern works for Cursor, Zed, Continue, Cline — any desktop-class MCP-aware client.
Hosted MCP V1.3
For agents that can't (or shouldn't) run a local stdio server
— Claude iOS, claude.ai web, sandboxed automations — Promethic
hosts the same 24-tool surface at https://mcp.getpromethic.com/v1
speaking MCP
Streamable HTTP transport (spec 2025-03-26). Same tools,
same scopes, same pmk_ keys.
Why hosted MCP exists: an agent calling tools via stdio needs a local CLI install + a long-lived process. A hosted endpoint replaces both with one HTTP URL — Claude iOS just adds a connector, no local binary. The wire shape is identical to the local CLI, so existing scripts don't change.
Connect Claude Desktop / Cursor
{
"mcpServers": {
"promethic": {
"url": "https://mcp.getpromethic.com/v1",
"headers": {
"Authorization": "Bearer pmk_..."
}
}
}
}
Claude Desktop config path is the same as the stdio install
above (claude_desktop_config.json). Cursor: settings
JSON, same shape. The pmk_ key supplies auth; the
server exchanges it at initialize for a connection-bound
short-lived mcps_ session token.
Connect Claude iOS / claude.ai web / ChatGPT (OAuth)
These clients use OAuth 2.1 + PKCE instead of bearer-key paste —
their connector UI doesn't accept a static token. Add a custom
connector pointing at https://mcp.getpromethic.com/v1
and leave the OAuth fields blank; the client auto-discovers them
via /.well-known/oauth-protected-resource. On Connect,
a popup opens to the Promethic consent screen — sign in, click
Allow, the connector activates. Revoke any time at
app.getpromethic.com →
Settings → Connected Apps. The full 24-tool surface appears in
the agent's tool tray. Tools are namespaced
promethic_<name> on the wire (underscore separator
per Anthropic's Tool API name regex).
Image-modality runs over MCP
promethic_run_prompt on an image-modality prompt returns
the generated image bytes inline as MCP image
content blocks in the tools/call result, alongside
the text transcript. Claude.ai / ChatGPT render these directly. The
text transcript shows [image bytes elided ...] for events
that originally carried base64 — those bytes live in the
image content blocks instead. Use
promethic_get_run_image only as a fallback (e.g., when
re-fetching after losing the original tool result).
Per-tool grants (opt-in)
Hosted MCP supports an optional per-tool allow-list on top of the
read/execute/write scopes.
New keys are unconfigured by default and can call any tool the key's
scopes permit — useful while the per-tool config UI is being built.
Once you opt in (set an explicit allow-list on the key), only those
tool names succeed; anything else returns
tool_grant_required. The wildcard ["*"]
explicitly reverts to allow-all, and the empty array []
blocks every non-discovery tool.
Discovery surfaces (list_prompts, get_catalog)
always bypass the per-tool gate so agents can always discover what
tools exist.
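The allow-list semantics can be summarized in a few lines. In this sketch, None stands for an unconfigured key (an assumption about representation), scope checks are out of scope, and tool_allowed is a hypothetical helper rather than anything the server exposes.

```python
DISCOVERY_TOOLS = {"list_prompts", "get_catalog"}


def tool_allowed(tool_name, allow_list):
    """Evaluate the per-tool gate only; read/execute/write scopes are checked separately."""
    if tool_name in DISCOVERY_TOOLS:
        return True  # discovery surfaces always bypass the gate
    if allow_list is None:
        return True  # unconfigured key: any tool its scopes permit
    if "*" in allow_list:
        return True  # explicit wildcard reverts to allow-all
    # Explicit list, including []: only named tools succeed;
    # anything else would return tool_grant_required.
    return tool_name in allow_list
```

The empty list and the unconfigured state behave in opposite ways, which is why the sketch keeps None and [] distinct.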
Differences from local CLI MCP
- upload_attachment requires inline bytes_base64; localPath is rejected (the server has no access to the agent's filesystem). The 10 MB raw cap matches the local CLI; chunked upload (upload_id / chunk_index / chunk_total) is reserved for V2.
- Idempotency keys: vendor-prefixed _meta namespace, _meta["com.getpromethic/idempotency-key"] on the JSON-RPC request. Same byte-identity replay semantics as the HTTP Idempotency-Key header.
- Streaming tools (run_prompt, revise_run) use notifications/progress (MCP spec) for live model output. Path B durable resume: if the live stream drops, GET /v1/sessions/{sid}/calls/{toolCallId} returns the final result once available.
Run lifecycle (sessions vs records, auto-finalize)
A run is a transient session: the model receives your
input, streams an output, and the server tracks state in
RunSession (1h sliding TTL). A record is the
persisted artifact: input + final output + any revision turns + edits +
cost — the structured data Promethic uses to refine your prompt over time.
To go from session to record you call finalize_run. To
discard a session instead, call abandon_run, or do nothing and let it expire.
Auto-finalize (default ON since V1.2) chains a
finalize_run after a successful run_prompt on
the same SSE stream. One call, one round trip, one saved record. The
new recordId arrives via the record_finalized
SSE event. This is what most agents want — call run_prompt,
use the returned record.
Pass autoFinalize: false on a single run_prompt
call to opt out — useful when you want to revise_run the
output before saving, or just inspect it and decide whether to
abandon_run instead. Then revise with the
runId and finalize manually with
--final-text / --tag / --notes
when ready. Use revise with a recordId
instead of a runId to rehydrate from a finalized record.
To set the persistent default (so all your runs from any client behave the same way), use one of:
- Avalonia desktop: Settings → Appearance → Hosted MCP toggle.
- Expo (web / iOS): Settings → toggle "Auto-save MCP runs as records".
- CLI: promethic config set auto-finalize-mcp-runs <true|false> (V1.3+).
Per-call autoFinalize always overrides the persistent
default. The persistent default in turn overrides the V1.2 server
default of true.
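The precedence described above (per-call, then persistent default, then server default) can be sketched as follows. None stands for "not set" here, which is an assumption about representation; effective_auto_finalize is a hypothetical name.

```python
SERVER_DEFAULT_AUTO_FINALIZE = True  # the V1.2 server default stated above


def effective_auto_finalize(per_call=None, persistent=None):
    """Resolve autoFinalize: per-call beats persistent, which beats the server default."""
    if per_call is not None:
        return per_call
    if persistent is not None:
        return persistent
    return SERVER_DEFAULT_AUTO_FINALIZE
```

So a user whose persistent default is off can still pass autoFinalize: true on a single run_prompt call and get a record from that call.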
--no-accept on the CLI saves a JSON artifact at
~/.promethic/runs/<runId>.json (mode 0600) so the
runId + state survives across shell sessions.
Override the API URL
export PROMETHIC_API_URL=http://localhost:8080
promethic auth status
Only https://..., http://localhost, and
http://127.0.0.1 are accepted — the CLI refuses to send
your key to other http:// hosts.
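The URL rule can be approximated with the standard library. This is a sketch of the stated policy, not the CLI's actual code; api_url_accepted is a hypothetical name.

```python
from urllib.parse import urlsplit


def api_url_accepted(url):
    """Accept https:// anywhere, but plain http:// only for loopback hosts."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return bool(parts.hostname)
    if parts.scheme == "http":
        # Compare the parsed hostname, not a string prefix, so that
        # http://localhost.example.com is not mistaken for localhost.
        return parts.hostname in ("localhost", "127.0.0.1")
    return False
```

Checking the parsed hostname rather than the raw string is the design point: a prefix match would let lookalike hosts receive the key over plain HTTP.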
Errors
All 4xx/5xx responses follow
RFC 7807
application/problem+json. Designed for self-healing
agents — every error names what went wrong, what to do about it,
and (where applicable) which exact field tripped:
{
"type": "https://api.getpromethic.com/problems/invalid_model_settings",
"title": "Model settings reference an unknown or inactive model.",
"status": 400,
"detail": "The model_id is not in the catalog, or it has been retired.",
"reason_code": "invalid_model_settings",
"action_hint": "List models via GET /api/v2/public/models, then retry with a current model_id.",
"request_id": "req_01HX...",
"invalid_params": [
{ "name": "modelSettings.model_id", "reason": "unknown_or_inactive_model" }
]
}
Each type URL is also a working redirect:
GET /problems/{reason_code} 302s to the matching
section of these docs (e.g.
/problems/idempotency_key_reused). Agents that
follow the link land on a human description plus the resolution
steps for that specific reason.
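An agent can reduce a problem+json body to the fields it needs for self-healing. The sketch below uses only field names shown in the example above; summarize_problem is a hypothetical helper and the sample body is an abbreviated copy of that example.

```python
import json


def summarize_problem(body_text):
    """Extract the reason code, the suggested action, and the offending field names."""
    problem = json.loads(body_text)
    return {
        "reason_code": problem.get("reason_code"),
        "action_hint": problem.get("action_hint"),
        # invalid_params is only present where a specific field tripped.
        "fields": [p["name"] for p in problem.get("invalid_params", [])],
    }


# Abbreviated copy of the example error body shown above.
SAMPLE_PROBLEM = json.dumps({
    "status": 400,
    "reason_code": "invalid_model_settings",
    "action_hint": "List models via GET /api/v2/public/models, then retry with a current model_id.",
    "invalid_params": [
        {"name": "modelSettings.model_id", "reason": "unknown_or_inactive_model"}
    ],
})
```

Branching on reason_code rather than on title or detail keeps the handling stable if the human-readable wording changes.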
Reason codes — each row's id is the redirect target for /problems/{reason_code}:
| HTTP | reason_code | Meaning |
|---|---|---|
| 400 | invalid_request | shape error — see invalid_params (subsumes null-not-allowed via required_field_clear reason) |
| 400 | invalid_model_settings | unknown / inactive model_id, or missing parameters object; action_hint tells you to GET /api/v2/public/models |
| 400/413 | field_too_large | see V1.0.1 caps in the Write section |
| 400 | idempotency_key_invalid | missing, > 255 chars, comma-joined, or non-visible-ASCII |
| 400 | stream_required | POST /run and POST /revise must include stream: true |
| 400 | invalid_cursor | tampered or cross-user cursor |
| 400 | cursor_filter_mismatch | filter params changed mid-pagination |
| 400 | persist_query_param_removed | V1.1 — ?persist query param removed; /finalize always commits. Use DELETE /records/{id} within 24h to undo. |
| 410 | record_was_deleted | V1.1 — this run finalized to a record that has since been self-deleted. Mint a new run. |
| 500 | snapshot_corrupt | V1.1 — server-side data integrity: the run session's version snapshot failed to parse. Mint a new run. |
| 409 | reopen_limit_exceeded | V1.1 — session has been reopened more than 100 times via /revise after /finalize. Mint a new run. |
| 400 | intermediate_output_not_supported_image | V1.1 — image-modality runs reject intermediateOutput. Mirrors desktop: image revisions always use the model's actual output. |
| 413 | intermediate_output_too_large | V1.1 — intermediateOutput exceeds 32 KB per-turn cap. |
| 413 | final_text_too_large | V1.1 — finalText exceeds 256 KB cap. |
| 413 | notes_too_large | V1.1 — notes exceeds 64 KB cap. |
| 413 | tag_too_large | V1.1 — tag exceeds 256 character cap. |
| 400 | tag_without_delta | V1.1 — /finalize received tag but no edit delta was produced (finalText omitted or matches model output). Use notes for record-level labels, or PATCH /records/{id} to re-tag. |
| 403 | record_not_owned_by_api_key | V1.1 — DELETE/PATCH /records/{id} on the public API is restricted to the API key that created the record. Mutate via the Promethic web/desktop app, or use the original API key. |
| 409 | record_self_delete_window_expired | V1.1 — DELETE /records/{id} on the public API is restricted to the first 24h after record creation. Delete via the Promethic web/desktop app instead. |
| 409 | record_no_edit_delta | V1.1 — PATCH /records/{id} attempted to set tag on a record with no edit delta. Tags attach to edit deltas only; use notes for record-level labels. |
| 403 | grant_required | V1.1 — API key is restricted to a specific set of prompts and the requested prompt is not in that set. Manage at Settings → Developer Keys → Manage prompts, or use an unrestricted key. |
| 409 | version_is_current | V1.1 — DELETE /prompts/{id}/versions/{vid} attempted on the prompt's current version. Switch the current version first via PUT /prompts/{id}/current-version. |
| 409 | attachment_referenced_by_active_run | V1.1 — DELETE /attachments/{id} blocked because an active RunSession's snapshot references this attachment. Wait for runs to finalize/expire (max 1h), or POST /runs/{runId}/abandon them. |
| 409 | prompt_referenced_by_active_run | V1.1 — DELETE /prompts/{id} blocked because at least one non-terminal RunSession is still active for this prompt. Wait or /abandon. |
| 409 | version_referenced_by_active_run | V1.1 — DELETE /prompts/{id}/versions/{vid} blocked because at least one non-terminal RunSession pins this version. Wait or /abandon. |
| 409 / lifted | image_runs_not_supported_v1 | V1.1 Phase 7: lifted. Image-modality runs via API key are now supported on /run, /revise, /revise-again. The accumulated effective prompt is persisted per-turn and surfaces as record.finalCopiedOutput after /finalize. The reason code is kept in the table for back-compat with old SDKs but is no longer emitted. |
| 400 | final_text_not_supported_image | V1.1 — image-modality runs reject finalText. record.FinalCopiedOutput for image records is server-derived from the image-prompt accumulation chain (training-data invariant). |
| 413 | session_deltas_too_large | V1.1 — total session.Deltas jsonb exceeds the 2 MB cap. Finalize and start fresh. |
| 413 | cost_incurred_no_delta_persisted | V1.1 — upstream model call billed but the resulting turn couldn't persist (post-upstream cap exceeded). UsageLog has the charge. |
| 400 | invalid_image_base64 | bad base64 in images[].data |
| 400 | invalid_index | index ≥ 0 violation (e.g. ?index= on record image) |
| 400 | invalid_source | ?source= not in {App, Manual, API} |
| 400 | instruction_required | POST /revise needs a non-empty instruction |
| 400 | from_turn_invalid | V1.2 — fromTurn not a valid non-negative integer |
| 400 | from_turn_out_of_range | V1.2 — fromTurn exceeds current turn count; re-read turns[] |
| 400 | mixed_credentials_principal_mismatch | session + key resolve to different users |
| 400 | mixed_credentials_key_mismatch | two API keys present that don't match |
| 401 | key_unauthorized | missing / invalid / expired / revoked API key |
| 403 | scope_required | key lacks the required scope |
| 403 | api_key_not_permitted | endpoint requires a session, not a key |
| 404 | prompt_not_found | no prompt with that id is visible to this caller |
| 404 | version_not_found | no matching version on the prompt |
| 409 | current_version_missing | prompt's currentVersionId points to a deleted version; no fallback could be self-healed |
| 404 | record_not_found | no record with that id is visible to this caller |
| 404 | run_not_found | run expired, never existed, or not yours |
| 404 | attachment_not_found | no attachment with that id is visible |
| 404 | no_image_stored | this record has no stored images |
| 404 | image_index_out_of_range | ?index past the count of stored images |
| 404 | invalid_image_reference | defense-in-depth validation refused the path |
| 409 | idempotency_key_reused | same key on a different body — won't replay; mint a new key |
| 409 | idempotency_in_flight | same key still being processed; retry after Retry-After |
| 409 | session_busy | another /revise or /finalize in flight |
| 409 | session_not_active | rare CAS-race; re-fetch run state |
| 409 | session_already_finalized | terminal state — fired by /revise on a finalized run. Note: /finalize itself is idempotent and replays the original 200 + recordId on retry; it does NOT 409. |
| 409 | session_expired | past 1h TTL |
| 409 | session_failed | terminal — see reason_code |
| 409 | session_abandoned | abandoned manually or by RevokeAsync's bulk abandon |
| 409 | revision_chain_too_long | 25 turns/session cap |
| 409 | revise_again_attachments_unsupported | record's prompt has attachments |
| 409 | record_revise_in_progress | V1.2 — another caller holds the per-record rehydrate lock for this recordId; retry after a short backoff |
| 409 | snapshot_modality_unreadable | internal snapshot data unreadable |
| 409 | finalize_completion_failed | internal: finalize transaction failed |
| 500 | finalize_conflict | unexpected record conflict during finalize; session reset to Active — retry /finalize |
| 500 | image_upload_failed | V1.1 Phase 0 — image generated upstream but blob storage write failed after retries; upstream charged (charged: true); replay returns this failure for the Idempotency-Key |
| 500 | image_extraction_overflow | V1.1 Phase 0 — upstream produced more images than the 16-per-turn cap; reduce n or split runs |
| 409 | run_already_terminal | cannot abandon a finalized/abandoned/expired/failed run |
| 409 | version_create_contention | concurrent version inserts; retry |
| 413 | storage_quota_exceeded | per-prompt or per-user storage cap reached |
| 429 | rate_limited | per-key or per-user bucket overflow; honour Retry-After |
| 500 | stream_setup_failed | SSE response failed to initialize before the proxy call |
| 503 | auth_store_unavailable | transient idempotency-store race; retry |
| 500 | idempotency_outcome_unknown | V1.3 Phase 4b — process died mid-flight after possibly committing the domain mutation but before recording Complete; retries of the same Idempotency-Key replay this body until 24h TTL. Verify via GET before retry — see Recovery from idempotency_outcome_unknown for per-tool recipes. |
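An agent that branches on reason codes can collect the retry guidance from the table into a small lookup. The following is a minimal sketch under stated assumptions, not an official client: the `next_step` helper and the action labels are hypothetical, and only the codes whose rows state an explicit retry or no-retry instruction are mapped.

```python
# Hypothetical helper: maps a few reason codes from the table above to a
# coarse client action. The action labels are illustrative, not API values.
RETRY_AFTER_WAIT = "retry_after_wait"            # transient; wait, then retry
MINT_NEW_KEY = "mint_new_idempotency_key"        # key clashed with another body
VERIFY_FIRST = "verify_via_get_before_retry"     # outcome unknown
DO_NOT_RETRY = "do_not_retry"

ACTIONS = {
    "rate_limited": RETRY_AFTER_WAIT,            # honour Retry-After
    "idempotency_in_flight": RETRY_AFTER_WAIT,   # retry after Retry-After
    "auth_store_unavailable": RETRY_AFTER_WAIT,
    "version_create_contention": RETRY_AFTER_WAIT,
    "record_revise_in_progress": RETRY_AFTER_WAIT,
    "finalize_conflict": RETRY_AFTER_WAIT,       # session reset to Active
    "idempotency_key_reused": MINT_NEW_KEY,
    "idempotency_outcome_unknown": VERIFY_FIRST,
    "session_failed": DO_NOT_RETRY,
    "run_already_terminal": DO_NOT_RETRY,
}


def next_step(reason_code: str) -> str:
    """Return the coarse action for a reason code.

    Codes that are not in the lookup fall back to DO_NOT_RETRY, which is a
    conservative default chosen for this sketch.
    """
    return ACTIONS.get(reason_code, DO_NOT_RETRY)
```

Defaulting unmapped codes to "do not retry" keeps an agent from looping on errors such as `scope_required` that a retry cannot fix.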
Idempotency V1.0.1
Every mutating POST (the five Write endpoints
above, plus /prompts/{id}/run when its body is the same
as a prior attempt) accepts an Idempotency-Key header.
This is a Stripe-style guarantee: a network glitch mid-call is safe
to retry — the server replays the original response byte-identically
instead of double-applying the side effect.
Contract
- Header value: 1–255 visible-ASCII characters (0x21–0x7E), no commas, sent at most once.
- Same key + same body + same route → server replays the original status, headers, and body.
- Same key + different body → 409 idempotency_key_reused (the agent picked a key it already used for a different request; generate a new one).
- Same key, original still in flight → 409 idempotency_in_flight + Retry-After: 1.
- Records expire 24 h after the original call completes (Stripe parity). After expiry the same key is fresh again.
- Replay returns the response shape from the original call. If we ship a new field on (e.g.) POST /prompts between your first call and your retry, the retry returns the OLD shape, not the new one. This is intentional Stripe parity: replays are byte-identical snapshots. The 24 h TTL bounds staleness; for the freshest shape, mint a new key.
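The contract above can be modelled in memory to see how the four outcomes relate. This is a toy illustration under stated assumptions, not the server's implementation: `ToyIdempotencyStore`, its method names, and the choice to scope rows by (route, key) are all hypothetical, and the body fingerprint is simply a SHA-256 of the raw bytes.

```python
import hashlib
import time


def is_valid_key(key: str) -> bool:
    """1-255 visible-ASCII characters (0x21-0x7E), no commas."""
    return 1 <= len(key) <= 255 and all(
        0x21 <= ord(ch) <= 0x7E and ch != "," for ch in key
    )


class ToyIdempotencyStore:
    """In-memory illustration of the contract; NOT the real server logic."""

    TTL_SECONDS = 24 * 60 * 60

    def __init__(self, clock=time.time):
        self._clock = clock
        # ASSUMPTION: rows are scoped by (route, key).
        self._rows = {}

    def begin(self, route: str, key: str, body: bytes):
        """Return ("run", None), ("replay", response) or ("conflict", code)."""
        fingerprint = hashlib.sha256(body).hexdigest()
        row = self._rows.get((route, key))
        if row is not None and row["completed_at"] is not None:
            if self._clock() - row["completed_at"] >= self.TTL_SECONDS:
                row = None  # expired: the same key is fresh again
        if row is None:
            self._rows[(route, key)] = {
                "fingerprint": fingerprint,
                "response": None,
                "completed_at": None,
            }
            return ("run", None)
        if row["fingerprint"] != fingerprint:
            return ("conflict", "idempotency_key_reused")
        if row["completed_at"] is None:
            return ("conflict", "idempotency_in_flight")
        return ("replay", row["response"])

    def complete(self, route: str, key: str, response) -> None:
        """Record the original response so later retries replay it."""
        row = self._rows[(route, key)]
        row["response"] = response
        row["completed_at"] = self._clock()
```

Injecting the clock makes the 24 h expiry easy to exercise without waiting.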
How the CLI uses it
The CLI auto-generates a fresh UUIDv4 per invocation by default —
each promethic prompts create call is a distinct
attempt. Pass --idempotency-key <uuid> to pin
one if you want a manual retry to be a no-op. The
attachments add command derives a deterministic key
from sha256(promptId + filename + size + content)
so a retry of the same upload is naturally idempotent.
Because the filename is part of that hash, re-uploading the same
content under a different name (e.g. --filename newname.txt)
is treated as a fresh upload and consumes storage twice.
If you want to rename an existing attachment, delete the
original through the web app first (DELETE on attachments
is V1.1; see "Not in V1.1" below).
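The deterministic key for attachment uploads can be approximated as follows. This is a sketch only: the text says the key derives from sha256(promptId + filename + size + content) but does not document the byte layout, so the UTF-8 encoding and NUL separators below are assumptions, and `attachment_idempotency_key` is a hypothetical name.

```python
import hashlib


def attachment_idempotency_key(prompt_id: str, filename: str, content: bytes) -> str:
    """Hash promptId + filename + size + content into a hex digest.

    ASSUMPTION: fields are UTF-8 encoded and joined with NUL separators;
    the real CLI may lay the bytes out differently.
    """
    digest = hashlib.sha256()
    for part in (prompt_id, filename, str(len(content))):
        digest.update(part.encode("utf-8"))
        digest.update(b"\x00")
    digest.update(content)
    return digest.hexdigest()
```

A hex digest is 64 visible-ASCII characters with no commas, so it fits the header contract, and changing only the filename yields a different key.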
Recovery from idempotency_outcome_unknown V1.3 Phase 4b
If the server process dies between committing the domain mutation
and recording the idempotency Complete, a sweep flips the row to
state=failed with a synthetic body:
{
"type": "https://api.getpromethic.com/errors/idempotency_outcome_unknown",
"title": "Idempotent run outcome unknown",
"status": 500,
"reasonCode": "idempotency_outcome_unknown",
"detail": "The original request died mid-flight (process crash or lease expired without heartbeat). The domain change MAY OR MAY NOT have landed. Verify via a GET before any retry — replaying the same Idempotency-Key returns this body verbatim, and a NEW key may duplicate the original mutation.",
"route": "POST /api/v2/public/prompts"
}
Replays of the same key continue to return this body
until the row's 24 h TTL expires. The atomicity refactor
in PR #18 (per-endpoint BeginTransactionAsync wrapping
the domain mutation + Complete) makes this case much rarer post-2026-05-09
— for tools that landed BEFORE PR #18 (or future tools added
without the wrapper), this recovery is still load-bearing.
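A client can recognise the synthetic body before deciding what to do next. The sketch below is hypothetical (`outcome_unknown` is not part of any SDK). It checks the `reasonCode` member shown in the sample body and also accepts `reason_code`, since other parts of this page spell the member that way.

```python
import json


def outcome_unknown(status: int, body_text: str) -> bool:
    """True when a response looks like the idempotency_outcome_unknown body."""
    if status != 500:
        return False
    try:
        problem = json.loads(body_text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    if not isinstance(problem, dict):
        return False
    code = problem.get("reasonCode", problem.get("reason_code"))
    return code == "idempotency_outcome_unknown"
```

When this returns True, the safe next step is a verifying GET rather than a retry with the same or a new key.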
Per-tool verify recipes (use these BEFORE retrying with the same OR a new key):
Note: the recipe for run_prompt below uses a future
clientIdempotencyKey field on the record DTO as the authoritative
disambiguator. That field is not yet shipped, which is why the
recipe says "(when available)". Until Phase 6 lands, agents
fall back to the heuristic match (createdAt +
inputText). The heuristic is unreliable for repeated identical
inputs in the same window: verify carefully, or mint a new key and
accept the duplicate cost when in doubt.
| Tool / route | Verify recipe |
|---|---|
| POST /prompts + MCP create_prompt | GET /api/v2/public/prompts (or MCP list_prompts) then match by the name field you submitted; names are user-chosen + likely unique within your set. If found: the create succeeded; do NOT retry. If not found: safe to mint a new key + retry. |
| POST /prompts/{id}/versions + MCP create_version | GET /api/v2/public/prompts/{id}/versions (or MCP list_versions) then match by versionNumber = (highest from your pre-call read) + 1. If a version with that number exists with your prompt text: succeeded. If not: safe to retry with a new key. |
| POST /prompts/{id}/attachments + MCP upload_attachment | GET /api/v2/public/prompts/{id}/attachments (or MCP list_attachments) then match by originalFilename + fileSize. If found: succeeded. If not: safe to retry. Note: storage quota was reserved at Begin time; a lost call leaves the quota reserved until the idempotency row's 24 h TTL refunds it via the orphan-blob sweep. |
| POST /prompts/{id}/run + MCP run_prompt | Run records auto-finalize by default. Authoritative disambiguator (when available): filter list_records by clientIdempotencyKey; every record carries the originating Idempotency-Key from the call that created it. If found: the run succeeded and the record exists. Cost was billed; you've paid for it. If not found: the run did not complete; safe to mint a new key + retry. Heuristic fallback (use ONLY when the authoritative path isn't available, e.g., a tool that doesn't yet expose clientIdempotencyKey): match by createdAt in your call window AND inputText. Be aware that an agent calling run_prompt with the same input multiple times in 24h cannot disambiguate via output / cost_micros alone; those are nearly identical for deterministic prompts. The heuristic is a guess; do not blindly retry on a match-of-many. |
| MCP finalize_run / revise_run | These take a runId. Step 1: GET the run state via GET /api/v2/public/runs/{runId} (or let MCP list_records filter by runSessionId). If a record exists with your runId: finalize succeeded. If RunSession.State == Finalized with a finalizedRecordId: ditto, succeeded. If State == Active: the run is back to a state you can retry from; mint a new key + retry. If State ∈ {Running, Finalizing}: a concurrent attempt is in flight or recovering; wait + re-poll. If State == Failed: terminal; do not retry. Do NOT mint a new key without GET-checking state first: retrying a fresh-key finalize against a Finalizing session 409s (session_busy) or races the Phase 4b finalize-failure→Active reset. |
| MCP delete_* / patch_record | GET the resource by id. If 404 (delete) or fields match your patch (patch): succeeded. Otherwise safe to retry with a new key. |
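Three of the recipes in the table above reduce to small pure functions over data the agent has already fetched. The sketch below is illustrative only: the function names and return labels are hypothetical, and the `promptText` field on version objects is an assumption (the table says only "your prompt text").

```python
def prompt_create_landed(prompts: list, submitted_name: str) -> bool:
    """create_prompt recipe: the create succeeded if the submitted name is listed."""
    return any(p.get("name") == submitted_name for p in prompts)


def version_create_landed(versions: list, highest_before: int, prompt_text: str) -> bool:
    """create_version recipe: look for (highest pre-call versionNumber) + 1."""
    expected = highest_before + 1
    return any(
        v.get("versionNumber") == expected
        and v.get("promptText") == prompt_text  # ASSUMED field name
        for v in versions
    )


def finalize_next_step(state: str, record_exists: bool = False) -> str:
    """finalize_run / revise_run recipe, keyed on the run's State."""
    if record_exists or state == "Finalized":
        return "succeeded"
    if state == "Active":
        return "retry_with_new_key"
    if state in ("Running", "Finalizing"):
        return "wait_and_repoll"
    if state == "Failed":
        return "do_not_retry"
    # States the recipe does not cover are surfaced rather than guessed at.
    return "unknown_state"
```

Keeping the checks pure makes them easy to run against a GET or MCP list result before any retry is attempted.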
Rate limits
Per-minute fixed-window buckets, evaluated AFTER auth (so an unauthenticated burst can't drain a per-user bucket the caller doesn't own):
| Scope | Per key | Per user |
|---|---|---|
| read | 60/min | 300/min |
| execute | 30/min | 90/min |
On overflow: 429 with a Retry-After header and
an RFC 7807 problem document carrying reason_code: "rate_limited"
plus an action_hint describing whether the key or the user
bucket overflowed.
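Honouring Retry-After on a 429 comes down to turning the header into a wait time. The sketch below is a hypothetical helper, not part of the CLI. The examples on this page show the delta-seconds form (Retry-After: 1); HTTP also permits an HTTP-date, so the helper handles both in case that form is ever sent.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(retry_after: str, now: datetime) -> float:
    """Seconds to wait for a Retry-After value (delta-seconds or HTTP-date).

    `now` must be timezone-aware. Unparseable values raise an exception
    rather than being silently treated as zero.
    """
    value = retry_after.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        # Dates without an explicit zone are treated as UTC in this sketch.
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would sleep for the returned number of seconds and then retry the same request.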
Versioning
The URL path carries the major version (/api/v2/public).
The SSE protocol carries an in-band protocolVersion for
forward-compat extension within the same path version.
- Removing or renaming an existing endpoint or event = major bump.
- Adding a new endpoint, event, or response field = minor (no bump).
- Changing field semantics on an existing field = major bump.
Catalog stability
GET /api/v2/public/models is an agent-facing contract.
What's safe (no major bump) for us to do:
- Add a new model.
- Add a new value to a parameter's values enum (e.g. reasoning_effort: ["none","low","medium","high"] → [..., "xhigh"]). Strict-validating agents should treat unknown enum values as forward-compat additions, not errors.
- Add a new capability bit, parameter, or cost field.
- Retire a model. Once retired the model_id is no longer in the catalog and any prompt referencing it gets 400 invalid_model_settings with action_hint directing the agent to fetch the live catalog and pick a current model.
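One way for an agent to stay forward-compatible with enum additions is to choose from its own preference list and skip catalog values it does not recognise. This is a minimal sketch under assumptions: `pick_enum_value` is a hypothetical helper, and the parameter dict is assumed to carry `values` and `provider_default` as described for the catalog response.

```python
def pick_enum_value(preferred: list, param: dict):
    """Pick the first preferred value that the catalog still offers.

    Values in param["values"] that this agent has never seen (e.g. "xhigh")
    are skipped over instead of being treated as a schema error. If none of
    the preferred values is offered, fall back to provider_default.
    """
    offered = param.get("values") or []
    for candidate in preferred:
        if candidate in offered:
            return candidate
    return param.get("provider_default")
```

For example, an agent that prefers "medium" keeps working unchanged after "xhigh" is added to the enum.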
Known V1.0.1 limitations resolved in V1.1
- Was: cost_cents on records is integer-truncated; sub-cent costs round to 0. Resolved in V1.1 Phase 8: public DTOs now expose costMicroCents (microcents, 1/1000 cent) for sub-cent precision. costCents is removed from the public surface; render via costMicroCents / 1000.
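The rendering rule (costMicroCents / 1000 gives cents) can be applied without floating-point drift by using decimals. This is a sketch: the helper names are hypothetical, and the dollar formatting assumes the cents are US cents, which this page suggests but does not state outright.

```python
from decimal import Decimal


def cost_in_cents(cost_micro_cents: int) -> Decimal:
    """Convert the integer field (1/1000 cent units) to cents."""
    return Decimal(cost_micro_cents) / Decimal(1000)


def format_usd(cost_micro_cents: int) -> str:
    """Render as dollars with five decimals (one unit = $0.00001).

    ASSUMPTION: the currency is USD.
    """
    dollars = cost_in_cents(cost_micro_cents) / Decimal(100)
    return f"${dollars:.5f}"
```

Five decimal places are enough to show a single unit exactly, so no sub-cent cost rounds to zero.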
Not in V1.1
- Was: DELETE on prompts / versions / records / attachments (leaked-key blast radius too high without per-prompt grants). Resolved in V1.1 Phase 5 (records) + Phase 6b (prompts/versions/attachments): per-prompt grants gate every mutation; record self-delete restricted to the originating API key + 24h window; attachment delete blocked while an active RunSession references the blob.
- Was: Image-output runs for API-key callers (409 image_runs_not_supported_v1). Resolved in V1.1 Phase 7 + Phase 0: image-modality runs are supported on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant. V1.1 Phase 0 wired the actual blob upload (Phase 7 lifted the gate but left ImageBlobKeys: null hardcoded; pre-Phase-0 records came back with imageStoredPath: null). All images now persist to blob storage, retrievable via GET /runs/{runId}/images/{n} in-flight and GET /records/{id}/image?index=N post-finalize. Records preserve every per-turn image (re-finalize merges, never shrinks).
- GET /api/v2/public/runs/{runId} polling endpoint: V1.2. Until then, agents derive run state from their own bookkeeping of the original call. The run_replayed event on idempotent retries surfaces {succeeded: false, reasonCode: "replayed_state_unknown"} rather than masquerading as success.
- CLI run --output-dir <dir>: V1.2. Image bytes are fetchable today via GET /runs/{runId}/images/{N} (in-flight) or GET /records/{id}/image?index=N (post-finalize); the auto-save UX is a CLI ergonomics improvement.
- CLI grants management (keys grants list/add/remove): V1.2. Per-prompt restrictions are configured by the user via Settings → Developer Keys → Manage in the web/desktop apps; agents do not configure their own restrictions.
- Searchable prompt picker in Manage view: V1.2. V1.1 ships a plain scrollable checkbox list; search arrives once a user has 30+ prompts.
- Webhooks, OAuth, PAT, team keys: V2.
- Streaming on the CLI: CLI internally buffers SSE for revise chain. V2 may surface raw streaming.
- Was: ?fromTurn=N rewind (RunSession.Deltas is turn-indexed today; surface in V2). Resolved in V1.2: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn in the request body. Drops session.Deltas entries with turnIndex > fromTurn before applying the operation. Image blobs orphaned by the rewind enqueue into blob_cleanup_queue (drained by a background worker with reference-count guard).
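The fromTurn rewind can be pictured as a filter over turn-indexed deltas, combined with the two validation errors from the reason-code table. This is a sketch of the described behaviour, not the server code: `rewind` and `FromTurnError` are hypothetical names, and the exact out-of-range boundary (here, fromTurn greater than the number of deltas) is an assumption.

```python
class FromTurnError(ValueError):
    """Carries the reason code a server would report for a bad fromTurn."""

    def __init__(self, reason_code: str):
        super().__init__(reason_code)
        self.reason_code = reason_code


def rewind(deltas: list, from_turn) -> list:
    """Drop entries with turnIndex > from_turn and return the kept ones."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(from_turn, bool) or not isinstance(from_turn, int) or from_turn < 0:
        raise FromTurnError("from_turn_invalid")
    # ASSUMPTION: "exceeds current turn count" means from_turn > len(deltas).
    if from_turn > len(deltas):
        raise FromTurnError("from_turn_out_of_range")
    return [d for d in deltas if d["turnIndex"] <= from_turn]
```

The function returns a new list and leaves its input untouched, so a caller can compare the two to see which turns (and their image blobs) a rewind would orphan.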
Changelog
v1.3 — 2026-05-11 — BREAKING (per-prompt MCP tools)
- Prompt-level description dropped from every wire surface. POST /api/v2/public/prompts no longer accepts description; PATCH /api/v2/public/prompts/{id} accepts only name and abbreviation. PublicPromptCreatedResponse drops the field. The cloud_prompts.Description column is dropped from the database with no data preservation. Capability descriptions live on versions only.
- Version description is the agent-facing capability description. Every version carries a one-sentence summary that describes what the prompt does: what it expects as input and what it returns. Surfaced in tools/list synthesized tool descriptions and list_prompts, so agents can pick a prompt in one round-trip.
- New descriptionMode field on versions. Numeric on the wire (0=Auto, 1=Manual). In Auto, the server regenerates Description with gpt-5.4-nano on every PUT version that changes promptText (fire-and-forget worker, ~$0.0003 per fire, conditional UPDATE that no-ops on stale starts). In Manual, the user/agent owns the field.
- Description-write rule. Writing description at PATCH /api/v2/public/prompts/{id}/versions/{vid} (or update_version on MCP, or PUT version on the private cloud surface) is treated as the caller taking ownership: Mode auto-flips to Manual if it isn't already. Pass descriptionMode: 0 in the same request to revert to Auto and let the server worker resume regenerating; explicit descriptionMode wins over the implicit description-presence flip. JSON null for description is a no-op (send "" to clear deliberately). Earlier (pre-2026-05-13) silent-ignore-in-Auto + explicit-only-flip rule was a footgun and is no longer in force.
- If-Match precondition (optional) on PUT/PATCH version endpoints. Token format: UpdatedAt.Ticks as lowercase hex (e.g. If-Match: 8db7e12c0e7c100). Mismatch returns 412 Precondition Failed. Absent header keeps last-write-wins legacy semantics. PUT version now returns 200 OK + VersionResponse (was 204) so the client gets the new UpdatedAt for the next If-Match token. Future PUT/PATCH endpoints will follow this convention.
- Per-prompt MCP tools (opt-in). Toggle via POST /api/v2/prompts/{id}/mcp-toggle with { "expose": true }. Each exposed prompt appears in your agent's MCP tools/list as promethic_{slug} (e.g. promethic_clay_cuties). Agents invoke by name in one round-trip, with no list_prompts + get_prompt + run_prompt dance. Cap = 50 per account; cap-hit returns 409 mcp_tool_cap_reached. Tool name is stable across prompt renames so hardcoded agent code keeps working. To re-derive the tool name from the new prompt name, call POST /api/v2/prompts/{id}/mcp-rename; collision returns 409 tool_name_taken or 409 tool_name_reserved.
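The If-Match token format (UpdatedAt.Ticks as lowercase hex) can be reproduced client-side if the tick count is known. The sketch below assumes .NET tick semantics (100 ns intervals since 0001-01-01) and uses hypothetical helper names. Whether the API hands back the token directly or only the timestamp is not stated here, and Python datetimes carry microseconds only, so a timestamp with seven fractional digits would lose its last digit.

```python
from datetime import datetime

TICKS_PER_SECOND = 10_000_000  # one tick = 100 ns (assumed .NET semantics)
TICK_EPOCH = datetime(1, 1, 1)


def ticks_from_datetime(updated_at: datetime) -> int:
    """Ticks since 0001-01-01 for a naive UTC datetime.

    Integer arithmetic avoids float rounding; precision is limited to
    microseconds because that is all a Python datetime holds.
    """
    delta = updated_at - TICK_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


def if_match_token(ticks: int) -> str:
    """Lowercase hex with no prefix and no padding."""
    return format(ticks, "x")
```

Round-tripping the documented example through `int(token, 16)` and `if_match_token` returns the same string, which is a quick check that the formatting matches.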
v1.2 — 2026-05-07
- Auto-finalize on /run: pass ?autoFinalize=true (now the default; toggle per-user via the autoFinalizeMcpRuns setting) and the server chains an internal /finalize after a successful run. The new recordId arrives via the record_finalized SSE event on the same stream as run_completed. Three new SSE events: record_finalized, record_finalize_failed (chain failed; agent decides whether to call POST /finalize manually based on retryable), record_finalize_skipped (informational, after run_failed).
- fromTurn rewind primitive: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn. Drops session turns > fromTurn, then applies the operation. Out-of-range → 400 from_turn_out_of_range.
- Finalize-on-Finalized amend: calling /finalize with new content (finalText / tag / notes / fromTurn) on a Finalized session reopens the session, bumps RunGeneration, and re-finalizes: same record ID, same handle. Fresh idempotency boundary for the new gen.
- Unified turns[] on PublicRecordResponse: every record DTO carries a synthesized turns array (run / revision / edit, indexed contiguously) reconstructed from the input + delta chain + final-copied-output. Resolves the V1.1 stitching gap where agents had to mentally combine inputText + finalCopiedOutput + deltas[].
- POST /records/{id}/revise replaces /revise-again: rehydrate a fresh RunSession from a finalized record's snapshot and revise. Same body shape as /runs/{runId}/revise (carries intermediateOutput + fromTurn). Per-record advisory lock serializes concurrent rehydrate attempts (409 record_revise_in_progress on contention). Old /revise-again route HARD-REMOVED.
- Image blob cleanup queue: fromTurn rewinds legitimately shrink record image history. Dropped per-turn blobs enqueue into blob_cleanup_queue (background worker, single-leader via pg_advisory_lock(3), reference-count guard against both ImageStoredPath storage formats before S3 DELETE).
- Spend audit discriminator: UsageLog.Discriminator column reserved for billing-eligibility tagging; SpendQueryFilters.Billable drives all SUM rollups (admin LIST endpoints intentionally show every row for audit visibility).
- MCP CLI surface: revise_again tool COLLAPSED into revise_run (accepts runId XOR recordId). run_prompt grows autoFinalize?: boolean. finalize_run + revise_run grow fromTurn?: number. Tool count: 25 → 24.
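The three auto-finalize events lend themselves to a small reducer over an already-parsed SSE stream. This is a sketch with assumptions flagged: `finalize_outcome` and its return labels are hypothetical, and the payload member names `recordId` and `retryable` are taken from the prose above without a documented payload schema.

```python
def finalize_outcome(events) -> tuple:
    """Reduce parsed SSE events to an (outcome, record_id) pair.

    `events` is an iterable of (event_name, data_dict) pairs in arrival
    order. Event names this sketch does not know are skipped, in line with
    the rule that new events are a non-breaking addition.
    """
    for name, data in events:
        if name == "record_finalized":
            return ("finalized", data.get("recordId"))  # ASSUMED member name
        if name == "record_finalize_failed":
            if data.get("retryable"):  # ASSUMED member name
                return ("call_finalize_manually", None)
            return ("finalize_failed", None)
        if name == "record_finalize_skipped":
            return ("skipped", None)
    return ("pending", None)
```

A "pending" result means the stream ended, or was read only partway, before any of the three events arrived.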
v1.1 — 2026-05-03
- Per-prompt grants (Phase 6a): API keys can be restricted to a specific set of prompts. Configured via Settings → Developer Keys → Manage in the web/desktop apps. Three new session-only endpoints: GET/POST/DELETE /api/v2/keys/{keyId}/grants. Restricted-key access to non-granted prompts returns 403 grant_required.
- Server-stateful runs: RunSession table replaces the V1 echo-back signed-blob model. Agents hold an opaque runId; the server keeps prompt + version snapshot frozen at /run time, immune to mid-flight prompt edits. POST /runs/{runId}/revise, /finalize, /abandon, /revise-again; GET /runs/{runId}/images/{N} for in-flight image fetch.
- Idempotency-Key on the full execute surface (Phase 3e/3f): /run, /revise, /finalize all replay byte-identically on retry. Route-signature composition with @gen{N} on /finalize so reopen-on-revise creates a fresh idempotency boundary. SSE replay protocol via the new run_replayed event (terminal-with-info).
- Record self-management (Phase 5): DELETE /records/{id} (24h, ApiKey-owned, hard-delete + cascade) and PATCH /records/{id} (notes + tag, no time window). HIPAA §164.312(b) audit row on every mutation with PHI-aware presence/length/SHA-256-prefix metadata.
- DELETE on prompts/versions/attachments (Phase 6b): write-scope + grant check. DELETE /attachments/{id} blocks if any active RunSession references the blob (via VersionSnapshot OR CurrentImageBlobKeys) → 409 attachment_referenced_by_active_run. DELETE /prompts/{id}/versions/{vid} rejects current version atomically. DELETE /prompts/{id} + versions/{vid} reject when active runs are pinned (409 prompt_referenced_by_active_run / version_referenced_by_active_run).
- Image-output runs for API-key callers (Phase 7): 409 image_runs_not_supported_v1 gate lifted on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant.
- Catalog enforcement (Phase 4): ModelSettingsValidator wired ADDITIVELY into POST /prompts + POST /prompts/{id}/versions + PATCH /prompts/{id}. Out-of-enum values like reasoning_effort: "xtreme" now 400 invalid_model_settings at write time instead of silently failing at /run.
- Observability + cost precision (Phase 8): X-RateLimit-* headers on every response (both buckets reported). cost_micros (1/1000 cent) replaces cost_cents on public DTOs for sub-cent precision. response.usage SSE event reasoning_tokens fix for the Responses API shape.
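Since catalog enforcement now rejects out-of-enum values at write time, an agent can pre-check its parameters against the catalog before sending them. The sketch below is a client-side approximation, not the server's ModelSettingsValidator: `find_invalid_params` is a hypothetical name, and the catalog is assumed to map each parameter name to a dict with optional `values`, `min`, and `max` members.

```python
def find_invalid_params(parameters: dict, catalog_params: dict) -> list:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, value in parameters.items():
        spec = catalog_params.get(name)
        if spec is None:
            problems.append(f"{name}: unknown parameter")
            continue
        allowed = spec.get("values")
        if allowed is not None and value not in allowed:
            problems.append(f"{name}: {value!r} is not one of {allowed}")
        # Range checks apply to numbers only (bool is excluded on purpose).
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if "min" in spec and value < spec["min"]:
                problems.append(f"{name}: {value} is below min {spec['min']}")
            if "max" in spec and value > spec["max"]:
                problems.append(f"{name}: {value} is above max {spec['max']}")
    return problems
```

A pre-check like this saves a round-trip, but the server's 400 invalid_model_settings response remains the authority on what is valid.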
v1.0.1 — 2026-04-29
- Write scope + 5 new endpoints: prompt create / patch (RFC 7396) / current-version switch, version create, attachment upload.
- Idempotency-Key header on all mutating POSTs (Stripe parity, 24 h TTL).
- RFC 7807 problem+json errors with action_hint + invalid_params for self-healing agents.
- GET /models: slim catalog endpoint with supportedOutputModalities.
- CLI: prompts create / prompts patch / prompts switch-current / versions create / attachments add, plus YAML manifest mode.
- Developer Keys management UI in the web + desktop apps.
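The prompt patch endpoint is described as RFC 7396, i.e. JSON Merge Patch. The sketch below implements the algorithm from that RFC so an agent can preview the effect of a patch body locally; `merge_patch` is a hypothetical helper name. Individual fields may deviate on the server: the v1.3 notes, for instance, describe JSON null for a version's description as a no-op.

```python
def merge_patch(target, patch):
    """Apply an RFC 7396 JSON Merge Patch and return the merged result.

    A non-object patch replaces the target outright. Inside an object
    patch, null removes the member and anything else is merged recursively.
    The inputs are not mutated.
    """
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result
```

For a prompt with a name and an abbreviation, a patch of {"name": "B"} changes only the name and leaves the abbreviation as it was.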
v1 (alpha) — 2026-04-27
- Initial public surface: read + execute scopes.
- SSE protocol v1 with run_session / run_completed / run_failed taxonomy.
- promethic CLI alpha (Node 18+).