How Ezra is built
Ezra is a Slack-native AI agent that connects to affiliate platforms via their APIs. Here's how the architecture works — the stack, the design decisions, and the hard problems we're solving.
The stack
Slack
Primary interface. Bot API for DMs and channels, Block Kit for approval cards and interactive elements. OAuth for workspace install. Ezra lives where your team already works — no new tab, no new login.
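For a sense of what an approval card involves, here is a minimal sketch of a Block Kit payload built in Python. The block and button structure follows Slack's Block Kit format, but the `approval_card` helper, the `ezra_approve` and `ezra_dismiss` action ids, and the Dismiss button are assumptions made for illustration, not Ezra's actual card.

```python
def approval_card(summary: str, action_id: str) -> list:
    """Build Block Kit blocks for one suggestion with two buttons."""
    return [
        {"type": "section", "text": {"type": "mrkdwn", "text": summary}},
        {
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "Approve"},
                    "style": "primary",
                    "action_id": "ezra_approve",  # hypothetical id
                    "value": action_id,  # echoed back in the interaction payload
                },
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "Dismiss"},
                    "action_id": "ezra_dismiss",  # hypothetical id
                    "value": action_id,
                },
            ],
        },
    ]
```

The returned list would be passed as the `blocks` argument when posting the message. When someone clicks a button, Slack sends back an interaction payload that includes the button's `action_id` and `value`.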
Fly.io
Hosts the Python FastAPI backend, which handles Slack webhooks, LLM routing, and platform API calls. Fly.io runs the app in multiple regions for low latency, and deploys are simple.
Anthropic Claude
The reasoning engine. Sonnet for most operations, Haiku for classification, Opus for complex analysis. Per-intent model routing to balance cost and quality — not every question deserves the most expensive model.
Platform adapters
Abstraction layer over the Impact.com, Everflow, Tune, and Trcker APIs. Every adapter supports reading data and suggesting changes. Each one normalizes partner, conversion, and performance data into a common schema so Ezra can reason about your program regardless of which platform you're on.
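To make "a common schema" concrete, here is a minimal sketch of two adapters normalizing the same conversion. Everything in it is an assumption for illustration: the `Conversion` dataclass, the `platform_a` and `platform_b` names, and both raw payload shapes are invented and do not reflect any real platform's API response or Ezra's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class Conversion:
    """Hypothetical common schema shared by every adapter."""
    platform: str
    partner_id: str
    amount: Decimal
    currency: str
    occurred_at: datetime  # always timezone-aware UTC


def normalize_platform_a(raw: dict) -> Conversion:
    # Assumed shape: amount as a decimal string, ISO 8601 timestamp with offset.
    return Conversion(
        platform="platform_a",
        partner_id=str(raw["partnerId"]),
        amount=Decimal(raw["saleAmount"]),
        currency=raw["currency"],
        occurred_at=datetime.fromisoformat(raw["eventDate"]).astimezone(timezone.utc),
    )


def normalize_platform_b(raw: dict) -> Conversion:
    # Assumed shape: amount in integer cents, Unix epoch seconds.
    return Conversion(
        platform="platform_b",
        partner_id=str(raw["affiliate_id"]),
        amount=Decimal(raw["revenue_cents"]) / 100,
        currency=raw["currency_code"].upper(),
        occurred_at=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
    )
```

Once both payloads pass through their adapter, downstream code only ever sees `Conversion` objects, so differences in units, casing, and timestamp formats stay inside the adapter layer.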
Postgres
Workspace configs, conversation history, API credentials (encrypted), and a full audit trail. Boring and good.
The manager-in-the-loop architecture
Every action flows through suggest → approve → execute. This isn't just a UX choice — it's a technical constraint baked into the adapter layer. Write operations require an explicit approval token from the Slack interaction payload. There's no code path that bypasses it.
Ezra will pull data, analyze trends, and draft recommendations on its own. But changing a commission rate, pausing a partner, or sending an outreach message? That requires you to tap "Approve" in Slack first. Every time.
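One way such a gate could be enforced in code is sketched below. This is an assumption-heavy illustration, not Ezra's implementation: the `WriteGate` class, the `issue_token` helper, and the idea of an HMAC token carried in the Approve button are all hypothetical. The sketch only shows the general shape of a write path that refuses to run without a token tied to a specific suggested action.

```python
import hashlib
import hmac


class ApprovalRequired(Exception):
    """Raised when a write is attempted without a valid approval token."""


def issue_token(secret: bytes, action_id: str) -> str:
    # Minted when the suggestion is posted; assumed to travel with the
    # Approve button and come back in the Slack interaction payload.
    return hmac.new(secret, action_id.encode(), hashlib.sha256).hexdigest()


class WriteGate:
    """Hypothetical wrapper that every adapter write goes through."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._used = set()  # action ids that have already been executed

    def execute(self, action_id: str, token: str, write):
        expected = issue_token(self._secret, action_id)
        # Constant-time comparison of the presented token.
        if not hmac.compare_digest(expected.encode(), token.encode()):
            raise ApprovalRequired(f"no valid approval for {action_id}")
        if action_id in self._used:
            raise ApprovalRequired(f"{action_id} was already executed")
        self._used.add(action_id)
        return write()
```

In this sketch a token is bound to one action id and accepted once, so approving "pause partner 17" cannot be reused to change a commission rate or replayed later.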
The hard problems
Platform API inconsistency
Impact.com, Everflow, Tune, and Trcker all model affiliates, conversions, and commissions differently. Same concept, four different schemas, four different auth flows, four different rate limits. The normalization layer is where most of the complexity lives.
LLM routing
Ezra classifies intent first, then routes to Haiku (quick lookups), Sonnet (analysis), or Opus (complex reasoning). Routing to a larger model than the question needs wastes money; routing to a smaller one produces bad answers. Getting the classifier right is an ongoing calibration problem.
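The routing step can be pictured as a lookup from classified intent to model tier. The sketch below is illustrative only: the intent labels, the confidence threshold, and the choice to fall back to Sonnet for unknown or low-confidence intents are assumptions, and the tiers are named generically rather than with real model identifiers.

```python
from enum import Enum


class Tier(str, Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Hypothetical intent labels; the real taxonomy is not described here.
ROUTES = {
    "lookup": Tier.HAIKU,
    "analysis": Tier.SONNET,
    "complex_reasoning": Tier.OPUS,
}


def route(intent: str, confidence: float, min_confidence: float = 0.7) -> Tier:
    """Pick a model tier for a classified intent.

    Unknown intents, and intents the classifier is unsure about,
    fall back to Sonnet as the middle tier (an assumed policy).
    """
    if confidence < min_confidence:
        return Tier.SONNET
    return ROUTES.get(intent, Tier.SONNET)
```

For example, `route("lookup", 0.95)` returns `Tier.HAIKU`, while the same intent at confidence 0.4 returns `Tier.SONNET`. The threshold is exactly the kind of knob the calibration problem is about: set it too high and cheap questions go to a pricier model, too low and misclassified questions get weak answers.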
Morning briefings
Each briefing means pulling overnight data across time zones, computing meaningful deltas against 7-day averages, and presenting the result in a scannable Slack message at exactly 8am local time. Sounds simple. Isn't.
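Two pieces of this are easy to get subtly wrong: the delta calculation and the "8am local" schedule. The sketch below shows one possible approach using only the Python standard library. The function names are hypothetical, and the IANA time zone name is assumed to come from workspace config.

```python
from datetime import datetime, time, timedelta, timezone
from statistics import mean
from zoneinfo import ZoneInfo


def delta_vs_average(today, previous_days):
    """Percent change of today's value against the mean of prior days."""
    baseline = mean(previous_days)  # raises StatisticsError on an empty list
    if baseline == 0:
        return None  # percent change is undefined against a zero baseline
    return (today - baseline) * 100 / baseline


def next_briefing_utc(now_utc: datetime, tz_name: str, at: time = time(8, 0)) -> datetime:
    """Next occurrence of the local briefing time, expressed in UTC."""
    tz = ZoneInfo(tz_name)
    local_date = now_utc.astimezone(tz).date()
    # Build the wall-clock time in the local zone, then convert, so that
    # daylight saving changes shift the UTC instant rather than the local hour.
    candidate = datetime.combine(local_date, at, tzinfo=tz).astimezone(timezone.utc)
    if candidate <= now_utc:
        tomorrow = local_date + timedelta(days=1)
        candidate = datetime.combine(tomorrow, at, tzinfo=tz).astimezone(timezone.utc)
    return candidate
```

Scheduling from the local wall-clock time, rather than adding 24 hours to the previous run, is what keeps the briefing at 8am across a daylight saving transition. In `America/New_York`, 8am falls at 13:00 UTC in winter and 12:00 UTC in summer.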
Conversation memory
Ezra needs to remember your program's patterns — "this partner always spikes on Fridays" — without retaining data longer than necessary. Memory is auditable and deletable on demand.
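As a rough sketch of what retention limits plus delete-on-demand could look like, here is an in-memory stand-in for a memory table. The `MemoryStore` class, its method names, and the audit tuple layout are hypothetical, and the retention period is a parameter because the real value is not stated here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class MemoryItem:
    workspace_id: str
    note: str
    created_at: datetime


class MemoryStore:
    """Hypothetical in-memory stand-in for a database-backed table."""

    def __init__(self, retention: timedelta):
        self.retention = retention
        self.items = []
        self.audit = []  # (timestamp, event, workspace_id) tuples

    def remember(self, workspace_id, note, now):
        self.items.append(MemoryItem(workspace_id, note, now))
        self.audit.append((now, "remember", workspace_id))

    def recall(self, workspace_id, now):
        # Expired items are dropped before anything is returned.
        cutoff = now - self.retention
        self.items = [i for i in self.items if i.created_at >= cutoff]
        return [i.note for i in self.items if i.workspace_id == workspace_id]

    def forget(self, workspace_id, now):
        # Delete on demand: removes every item for the workspace and
        # returns how many were removed.
        before = len(self.items)
        self.items = [i for i in self.items if i.workspace_id != workspace_id]
        self.audit.append((now, "forget", workspace_id))
        return before - len(self.items)
```

The audit list records that something was remembered or forgotten without storing the note itself, which is one way to keep a trail that survives deletion without retaining the deleted data.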
Security and privacy
- API credentials encrypted at rest (AES-256)
- Claude API in no-training mode — your data never trains a model
- Slack's permissions model respected — Ezra only accesses channels it's invited to
- "Delete my data" wipes credentials, conversation history, and cached data within 24 hours
- Full audit trail for every action Ezra suggests and every approval you give