Implements the Replicate API backend (FLUX schnell / FLUX dev) per ImaGen issue #3:

- internal/backend/replicate.go: Backend adapter. Supports model refs as "owner/name" (uses /v1/models/{owner}/{name}/predictions) and "owner/name:hash" (uses /v1/predictions with an explicit version). Polls /v1/predictions/{id} every 500ms with a model-aware timeout (60s schnell, 120s dev). Resilience: a 401 error names api_token_env, 429 retries with exponential backoff up to 3 times (honours Retry-After), 5xx retries once, and the image download retries once on transient failure.
- internal/backend/replicate_pricing.go: hardcoded per-image USD rates for known FLUX models, snapshotted from replicate.com/pricing with a refresh TODO.
- internal/backend/replicate_test.go: mocked-HTTP unit tests covering the happy path (model + version-pinned), 401, 429 retry policy, failed prediction, poll timeout, image-download retry, context cancellation, BackendOpts passthrough, default_steps, aspect-ratio reduction, and the sha256 prompt hash.
- internal/usage/usage.go: Supabase REST sink + read-side query for mai.imagen_usage. Adapter writes are best-effort: failures warn but the image still lands.
- cmd/imagen/usage.go: `imagen usage [--since DATE] [--raw]` reads the table and prints a tab-aligned grouped or raw table with totals.
- cmd/imagen/backends.go: instances of type=replicate now report "ok" or "not configured (set REPLICATE_API_TOKEN)" depending on env.
- internal/config/config.go: sample adds flux-schnell-replicate + flux-dev-replicate; default_backend stays flux-schnell-local.
- Supabase migration mai.imagen_usage (id, created_at, backend, model, seed, prompt_hash, latency_ms, cost_usd_estimate, caller) + indexes on (created_at DESC) and (caller). The raw prompt is never stored.

Caller identity resolves from MAI_FROM_ID, then the tmux pane's @mai-name option, mirroring the maimcp identity logic. The prompt hash is the sha256 of the user-facing prompt.
43 lines
1.5 KiB
Go
package backend

import "strings"

// Replicate pricing snapshot.
//
// Source: https://replicate.com/pricing and the per-model "Run" tab on
// each model page. Replicate bills per second of GPU time, but the
// black-forest-labs FLUX models also publish a flat per-image price for
// the typical settings; that flat number is what we hardcode here.
//
// Snapshot date: 2026-05-08. TODO(refresh): re-check quarterly. If the
// rates drift more than ~10%, update the table and bump
// replicatePricingSnapshotDate.
const replicatePricingSnapshotDate = "2026-05-08"

// replicatePerImageUSD returns the per-image cost estimate for a Replicate
// model identifier ("owner/name", with any ":version" trimmed). It returns
// the rate and true if the model is known, 0 and false otherwise, so that
// an unknown model writes a row with NULL cost rather than a wrong number.
func replicatePerImageUSD(model string) (float64, bool) {
	key := normalisePricingKey(model)
	switch key {
	case "black-forest-labs/flux-schnell":
		return 0.003, true
	case "black-forest-labs/flux-dev":
		return 0.025, true
	case "black-forest-labs/flux-pro":
		return 0.055, true
	case "black-forest-labs/flux-1.1-pro":
		return 0.040, true
	}
	return 0, false
}

// normalisePricingKey strips the optional ":version" suffix and lowercases
// the owner/name pair. "Owner/Name:hash" → "owner/name".
func normalisePricingKey(model string) string {
	if i := strings.IndexByte(model, ':'); i >= 0 {
		model = model[:i]
	}
	return strings.ToLower(model)
}
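The lookup above could be exercised roughly as follows. This is a minimal standalone sketch, not part of the commit: the two functions are copied in so the program compiles on its own, `costOrNil` is a hypothetical helper showing one way a caller might turn "unknown model" into a NULL cost, and the `:abc123` version hash is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// replicatePerImageUSD and normalisePricingKey are copied from
// replicate_pricing.go so that this sketch is self-contained.
func replicatePerImageUSD(model string) (float64, bool) {
	switch normalisePricingKey(model) {
	case "black-forest-labs/flux-schnell":
		return 0.003, true
	case "black-forest-labs/flux-dev":
		return 0.025, true
	case "black-forest-labs/flux-pro":
		return 0.055, true
	case "black-forest-labs/flux-1.1-pro":
		return 0.040, true
	}
	return 0, false
}

func normalisePricingKey(model string) string {
	if i := strings.IndexByte(model, ':'); i >= 0 {
		model = model[:i]
	}
	return strings.ToLower(model)
}

// costOrNil is a hypothetical helper (not in the source). It returns nil for
// an unknown model, which a caller could serialise as a NULL cost column.
func costOrNil(model string) *float64 {
	rate, ok := replicatePerImageUSD(model)
	if !ok {
		return nil
	}
	return &rate
}

func main() {
	models := []string{
		"black-forest-labs/flux-schnell",
		"Black-Forest-Labs/FLUX-Dev:abc123", // mixed case plus a made-up version hash
		"someone/unknown-model",
	}
	for _, m := range models {
		if c := costOrNil(m); c != nil {
			fmt.Printf("%s -> $%.3f\n", m, *c)
		} else {
			fmt.Printf("%s -> NULL\n", m)
		}
	}
}
```

Returning a pointer keeps "no estimate" distinct from a genuine zero cost, which matches the intent stated in the doc comment; the actual adapter may represent this differently.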