How it's built

The stack, the timeline, the hard parts.

Ezra is an AI friend that lives in iMessage. Here's the whole stack, in case you want to build your own. Or you can just text the existing one.

The stack

1. A dedicated Apple ID

Someone has to own the iMessage handle Ezra replies from. That someone is a separate Apple ID, not yours. Free to create, takes ten minutes. Apple rate-limits new accounts hard for the first couple of weeks, so plan around it. Add a phone number and a payment method to the account, or iMessage will silently refuse to send to half your testers.

2. Loop Message — the iMessage relay

Apple doesn't have a webhook API for iMessage. Loop Message bridges it: inbound texts become HTTP POSTs to your server, and your server's HTTP responses go back out as outbound iMessages. The closest alternative is running a Mac mini as a server with BlueBubbles or a similar bridge, which works, but you'll spend a lot of weekends debugging it. Loop is paying for someone else to debug it.
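
The inbound half of that loop is small enough to sketch without a framework. This is a minimal sketch, not Loop Message's actual schema: the field names (`message_id`, `sender`, `text`) are hypothetical, so check the relay's docs before relying on them. It shows the two things worth having from day one: strict parsing, and deduplication by message id, because webhook senders commonly retry.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InboundMessage:
    message_id: str
    sender: str
    text: str


class WebhookHandler:
    """Turns raw webhook bodies into messages, skipping duplicates."""

    def __init__(self) -> None:
        # Assumption: in a real build this set would live in Postgres or a
        # cache so it survives restarts. Memory is enough for a sketch.
        self._seen_ids = set()

    def parse(self, raw_body: bytes) -> Optional[InboundMessage]:
        """Return a message, or None if the body is malformed or a repeat."""
        try:
            payload = json.loads(raw_body)
        except ValueError:  # covers bad JSON and undecodable bytes
            return None
        if not isinstance(payload, dict):
            return None
        # Hypothetical field names; the real relay's payload may differ.
        message_id = payload.get("message_id")
        sender = payload.get("sender")
        text = payload.get("text")
        if not all(isinstance(v, str) and v for v in (message_id, sender, text)):
            return None
        if message_id in self._seen_ids:
            return None  # a retry of a delivery we already handled
        self._seen_ids.add(message_id)
        return InboundMessage(message_id, sender, text)
```

In a FastAPI app, `parse` would sit behind the POST route the relay calls, and anything that comes back as `None` gets acknowledged and dropped.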

3. Fly.io — the backend host

Python FastAPI app. Fly is good for global low-latency with small machines and a clean deploy story. Render, Railway, Cloudflare Workers, or a cheap VPS would all work fine here. The host is the least interesting choice in the stack.

4. Anthropic API — Claude

The brain. Sonnet for most things, Haiku for the fast cheap stuff. The model is replaceable in principle. In practice you'll spend weeks tuning prompts against one model family, and switching means redoing that work, so pick the one you want to live with.
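
One way to keep the Sonnet and Haiku split honest is to put it in a single function instead of scattering model names through the codebase. This is a sketch under assumptions: the task names and the "user-facing text always gets the bigger model" rule are hypothetical, and the tier labels are placeholders to map onto whatever model IDs your account uses.

```python
from dataclasses import dataclass

# Tier labels, not real model IDs. Map them to actual IDs in one place.
FAST_TIER = "haiku"
MAIN_TIER = "sonnet"

# Hypothetical names for the cheap, latency-sensitive background jobs.
FAST_TASKS = frozenset({"classify_intent", "summarize_thread", "detect_language"})


@dataclass(frozen=True)
class Task:
    kind: str          # e.g. "reply" or "classify_intent"
    user_facing: bool  # True if the output is sent to the user verbatim


def pick_tier(task: Task) -> str:
    """Send user-facing text to the main tier and known chores to the fast tier."""
    if task.user_facing:
        return MAIN_TIER
    if task.kind in FAST_TASKS:
        return FAST_TIER
    # Unknown background work defaults to the main tier until someone
    # decides it is safe to run cheaply.
    return MAIN_TIER
```

Defaulting unknown tasks to the main tier costs more, but it means a new feature never ships on the cheap model by accident.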

5. Composio — OAuth and integrations

Gmail, Calendar, Notion, Drive, and around two hundred other apps. Lets Ezra read your inbox or schedule a meeting without you handing over passwords. Free tier covers early users; paid scales with connection count. The alternative is wiring every OAuth flow yourself, which is a job, not an afternoon.

6. Postgres with pgvector

Message history, OAuth tokens, embeddings, memory. Postgres is boring and good. pgvector handles semantic search on user history without bolting on a separate vector DB. A standalone vector DB is a complexity tax most builds don't need to pay.
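
For a sense of what "pgvector handles semantic search" means in practice, here is a minimal sketch. The table and column names are hypothetical, the three-dimensional vector is only for readability (the dimension has to match your embedding model), and the `%(name)s` placeholders assume a psycopg-style driver. The `<=>` operator is pgvector's cosine distance; the pure-Python function computes the same quantity so you can sanity-check results offline.

```python
import math
from typing import Sequence

# Hypothetical schema. The dimension (3) must match your embedding model.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS memories (
    id BIGSERIAL PRIMARY KEY,
    user_id TEXT NOT NULL,
    body TEXT NOT NULL,
    embedding vector(3) NOT NULL
);
"""

# `<=>` is cosine distance in pgvector: smaller means more similar.
SEARCH_SQL = """
SELECT id, body
FROM memories
WHERE user_id = %(user_id)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def to_pgvector_literal(values: Sequence[float]) -> str:
    """Format a sequence as pgvector's text form, e.g. '[0.1,2.0,3.5]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Pure-Python cosine distance, for tests and intuition."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Note the `WHERE user_id = ...` clause: scoping every search to one user is what keeps one person's history from surfacing in someone else's conversation.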

7. About 1,700 lines of voice rules

This is the moat. Generic Claude wrappers feel like assistants. The voice rules — system prompts, response heuristics, mood and timing logic, things Ezra refuses to say, ways he handles ambiguity, how he ends conversations — are what make him feel like a friend. They get evaluated against around 250 test prompts on every change. Most of the engineering hours go here, not into the infra above.
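
The "evaluated on every change" part can start as a very plain regression harness. This is a sketch of the general idea, not Ezra's actual eval suite: the case fields, the banned-phrase check, and the length ceiling are all assumptions, and `generate` stands in for whatever function calls the model with the current rules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class VoiceCase:
    prompt: str
    # Phrases the persona should never say in reply to this prompt.
    banned_phrases: List[str] = field(default_factory=list)
    # Hypothetical ceiling: texts from a friend should read like texts.
    max_chars: int = 400


def check_reply(case: VoiceCase, reply: str) -> List[str]:
    """Return human-readable failures; an empty list means the case passed."""
    failures = []
    lowered = reply.lower()
    for phrase in case.banned_phrases:
        if phrase.lower() in lowered:
            failures.append(f"banned phrase: {phrase!r}")
    if len(reply) > case.max_chars:
        failures.append(f"too long: {len(reply)} > {case.max_chars}")
    return failures


def run_suite(
    cases: Iterable[VoiceCase], generate: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Run each case through `generate` and collect failures keyed by prompt."""
    report = {}
    for case in cases:
        failures = check_reply(case, generate(case.prompt))
        if failures:
            report[case.prompt] = failures
    return report
```

String checks only catch the obvious regressions. Whether a reply feels like a friend still takes a human reading transcripts, but a harness like this stops old mistakes from coming back unnoticed.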

Timeline

Realistic build timeline: two to four weeks if you've shipped something like this before, two to three months if you haven't. Most of that time is voice rules, not plumbing.

The hard parts (the things nobody tells you)

The voice rules are the product

You can wire up the infra in a weekend. You cannot get the voice right in a weekend. Plan for weeks of iteration — drafting rules, reading transcripts, finding the lines that feel off, rewriting the rule that produced them. The infra is the easy part. The personality is the company.

iMessage delivery has rough edges

Apple rate-limits new accounts. Replies are sometimes delayed for reasons that aren't your fault and aren't your problem to fix. Group chats behave strangely. Schedule a few Saturdays for this and it'll be fine, but don't be surprised on Saturday one.

Privacy is ongoing work, not a doc

It's tempting to write a privacy page and call it done. Real privacy is a delete-me command that actually deletes, audit logs that actually log, encryption keys you can rotate, and a memory model that doesn't quietly retain things it claimed to forget. Our commitments are here. They were architectural decisions before they were copy on a page.
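
To make "a delete-me command that actually deletes" concrete, here is a minimal sketch. It uses the standard library's `sqlite3` in place of Postgres so it runs anywhere, and the table names are hypothetical. The idea it illustrates: one list of every table that holds user data, one transaction that clears them all, and an audit row written in that same transaction so the log and the deletion can never disagree.

```python
import sqlite3
import time

# Hypothetical tables holding per-user data. Any new table that stores user
# data has to be added here, or "delete me" quietly stops being true.
USER_TABLES = ("messages", "memories", "oauth_tokens")


def init_db(conn: sqlite3.Connection) -> None:
    """Create the sketch schema: user tables plus an audit log."""
    for table in USER_TABLES:
        conn.execute(f"CREATE TABLE {table} (user_id TEXT NOT NULL, body TEXT)")
    conn.execute(
        "CREATE TABLE audit_log "
        "(ts REAL, action TEXT, user_id TEXT, rows_deleted INTEGER)"
    )


def delete_user(conn: sqlite3.Connection, user_id: str) -> int:
    """Delete all of a user's rows and record it, in a single transaction."""
    total = 0
    with conn:  # commits on success, rolls back if anything raises
        for table in USER_TABLES:
            cur = conn.execute(
                f"DELETE FROM {table} WHERE user_id = ?", (user_id,)
            )
            total += cur.rowcount
        # Assumption: the audit row keeps the raw user id. A stricter build
        # might store a hash instead.
        conn.execute(
            "INSERT INTO audit_log VALUES (?, ?, ?, ?)",
            (time.time(), "delete_user", user_id, total),
        )
    return total
```

This only covers the primary database. Backups, embeddings stored elsewhere, and third-party logs each need their own answer.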

Memory is the second-hardest problem

"Throw it in a vector DB" is a meme, not an architecture. What gets remembered, when, how recall is gated, what gets forgotten on demand, what's surfaced proactively versus only when asked — these are product decisions before they're engineering ones. Get them wrong and your AI feels either amnesiac or invasive.

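Those product decisions eventually have to show up in a data model. A minimal sketch of one way to encode them, with every name here hypothetical: each memory carries a surfacing policy, recall is gated on that policy, and forgetting removes the record outright instead of flagging it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Set


class Surfacing(Enum):
    PROACTIVE = "proactive"    # may be brought up unprompted
    ON_REQUEST = "on_request"  # recalled only when the user raises the topic


@dataclass(frozen=True)
class Memory:
    topic: str
    body: str
    surfacing: Surfacing


def eligible(memories: Iterable[Memory], topics_in_message: Set[str]) -> List[Memory]:
    """Gate recall: proactive memories always qualify, the rest need a topic match."""
    return [
        m
        for m in memories
        if m.surfacing is Surfacing.PROACTIVE or m.topic in topics_in_message
    ]


def forget(memories: Iterable[Memory], topic: str) -> List[Memory]:
    """Hard delete: return the memories without the topic, keeping no tombstone."""
    return [m for m in memories if m.topic != topic]
```

For example, a memory about someone's birthday might be `PROACTIVE`, while one about a health issue stays `ON_REQUEST` and only comes up when the user mentions it first. Which topics land in which bucket is the product decision; the code is the easy part.
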
The thing that looks the same is usually different

You can describe Ezra as "Claude with iMessage in front of it" and an engineer will agree with that sentence and then build something that doesn't work. The gap between "wired up" and "feels like a friend" is most of the work.

Or just text him

If you'd rather skip the part where you spend two months tuning prompts against a fictional friend persona, the existing one is right there.