There Is No agent_message on the Chat Wire

Last night I sent an agent a friend request. Accepted, DM opened, typed "hi", four seconds later a reply came.

What happened at the wire level: the same WebSocket frame as for a human reply. The same DDISA auth. The same 1:1 DM container. The same Web Push that would have pinged me on the phone had I been offline.

I set out to build a web chat. What came out is evidence for a statement I've been carrying around for a while without being able to phrase it sharply: human and agent are the same at the protocol level.

How it was before

When I wanted to talk to one of my agents, it went over Telegram. Telegram DM in, Telegram DM out. That worked, and it still does: when I'm on the move, it's still the most convenient way.

But Telegram is a foreign layer. Bot identities, mention patterns, the universe of Markdown quirks, rate limits. The agent stack had to know every one of these idiosyncrasies. And there was no browser path: when I sat at the computer and wanted to talk to an agent, I had to keep Telegram Desktop open on the side.

Why I built it

Not out of an architectural insight. I wanted a web chat: a browser tab where my agents show up as contacts, where I can open threads, and where the latest messages are visible without resorting to a Telegram search.

The architectural insight fell out during the build.

How it looks now

Two components. ape-chat is the foundation library: the server that handles identity, contacts, threads, messages, WebSocket, and Web Push. chat-bridge is a thin daemon: a WebSocket client that, for every CLI-based agent, translates incoming frames into CLI calls and posts the reply back. Clean separation, and each is usable on its own.

The bridge daemon is at its core a loop:

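// chat-bridge: catch incoming, run pi, post reply
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');
const run = promisify(execFile); // resolves with { stdout, stderr }

ws.on('message:new', async (msg) => {
  if (msg.thread !== myDmWith(human)) return; // only the DM with my human
  if (msg.from === self) return;              // never answer my own messages

  // execFile hands msg.body over as one argument, with no shell in between
  const reply = await run('pi', ['--print', msg.body]);
  await ws.send('message:post', {
    thread: msg.thread,
    body: reply.stdout,
  });
});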

pi here is the CLI agent that drives a ChatGPT subscription backend via litellm. But it could just as well be a Claude CLI or a homegrown script spitting out Markov chains. The bridge doesn't care what it spawns.

Round trip: four seconds, dominated by the LLM call. The WebSocket latency and the bridge loop are in the noise.

What's on the wire

The central thing: there is no agent_message type next to user_message. The schema says Message. The fields are from, to, body, timestamp, signature. A message from human to human has the same form as one from agent to human, as one from agent to agent.
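As a minimal sketch of that shape, assuming nothing beyond the five fields named above: the helper names (MESSAGE_FIELDS, isMessage), the string-typed fields, and the example identifiers are invented for illustration and are not ape-chat's actual schema code.

```javascript
// Hypothetical sketch: one Message shape, no type discriminator.
// Field names come from the post; everything else is illustrative.
const MESSAGE_FIELDS = ['from', 'to', 'body', 'timestamp', 'signature'];

// Returns true if `value` has exactly the five Message fields, each a string.
function isMessage(value) {
  if (value === null || typeof value !== 'object') return false;
  const keys = Object.keys(value).sort();
  const expected = [...MESSAGE_FIELDS].sort();
  return (
    keys.length === expected.length &&
    keys.every((key, i) => key === expected[i]) &&
    keys.every((key) => typeof value[key] === 'string')
  );
}

// Example identifiers are invented. Both objects pass the same check,
// and nothing in either one marks the sender as human or agent.
const fromHuman = {
  from: 'alice@example.org',
  to: 'bob@example.org',
  body: 'hi',
  timestamp: '2025-01-01T20:00:00Z',
  signature: 'base64-signature-placeholder',
};
const fromAgent = { ...fromHuman, from: 'pi-agent@example.org', to: 'alice@example.org' };

console.log(isMessage(fromHuman), isMessage(fromAgent)); // true true
console.log(isMessage({ ...fromAgent, kind: 'agent_message' })); // false
```

In this sketch a message that tries to carry an extra kind field is rejected, which is the schema-level version of "there is no agent_message".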

The only difference is in the sender's key material. Humans sign with their passkey, agents with an Ed25519 key issued at enrollment. That doesn't change the form of the message, only the identity the server recovers when it verifies the signature.
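A rough sketch of that check, using Node's built-in crypto module. Several things here are assumptions made for brevity: both senders use Ed25519 keys (a real passkey signature is a WebAuthn assertion and is verified differently), the signed bytes are a JSON array in fixed field order, and the registry is a plain Map. None of this is ape-chat's actual code.

```javascript
const nodeCrypto = require('node:crypto');

// Assumed canonical form: a fixed field order, so signer and verifier agree.
function signingBytes({ from, to, body, timestamp }) {
  return Buffer.from(JSON.stringify([from, to, body, timestamp]), 'utf8');
}

// Sign a message draft and return the complete Message.
function signMessage(draft, privateKey) {
  // For Ed25519 keys, Node expects the algorithm argument to be null.
  const signature = nodeCrypto.sign(null, signingBytes(draft), privateKey).toString('base64');
  return { ...draft, signature };
}

// Hypothetical registry: sender id -> public key recorded at enrollment.
// It stores no "human" or "agent" flag, only key material.
function verifyMessage(message, registry) {
  const publicKey = registry.get(message.from);
  if (!publicKey) return false;
  return nodeCrypto.verify(
    null,
    signingBytes(message),
    publicKey,
    Buffer.from(message.signature, 'base64'),
  );
}

const humanKeys = nodeCrypto.generateKeyPairSync('ed25519');
const agentKeys = nodeCrypto.generateKeyPairSync('ed25519');
const registry = new Map([
  ['alice@example.org', humanKeys.publicKey],
  ['pi-agent@example.org', agentKeys.publicKey],
]);

const hi = signMessage(
  { from: 'alice@example.org', to: 'pi-agent@example.org', body: 'hi', timestamp: '2025-01-01T20:00:00Z' },
  humanKeys.privateKey,
);
const reply = signMessage(
  { from: 'pi-agent@example.org', to: 'alice@example.org', body: 'hello', timestamp: '2025-01-01T20:00:04Z' },
  agentKeys.privateKey,
);

console.log(verifyMessage(hi, registry), verifyMessage(reply, registry)); // true true
console.log(verifyMessage({ ...reply, body: 'tampered' }, registry)); // false
```

The verification path is the same function for both senders; the only per-sender input is which public key the registry returns.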

And precisely because the protocol doesn't distinguish between the two, the next phase practically fell out by itself: multiple threads per DM. You can hold parallel conversations with the same agent, the way you can run parallel topics with a human you talk to about the tax return and about the weekend at the same time. I didn't have to design that. It was already there, because DM containers between two humans work exactly the same way.
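One way to picture threads inside a DM container is the in-memory model below. It is a hypothetical sketch: the DmStore class, its method names, and the example identifiers are made up for illustration, and the real server presumably persists this rather than holding it in a Map.

```javascript
// Hypothetical in-memory model: one DM container per pair, many threads inside.
class DmStore {
  constructor() {
    this.containers = new Map(); // pair key -> Map(thread id -> thread)
    this.nextThreadId = 1;
  }

  // The same pair always maps to the same container, whoever opens it.
  // (Assumes participant ids never contain the '|' separator.)
  static pairKey(a, b) {
    return [a, b].sort().join('|');
  }

  // Adds a new thread to the pair's container, creating the container if needed.
  openThread(a, b, topic) {
    const key = DmStore.pairKey(a, b);
    if (!this.containers.has(key)) this.containers.set(key, new Map());
    const thread = { id: `t${this.nextThreadId++}`, topic, messages: [] };
    this.containers.get(key).set(thread.id, thread);
    return thread;
  }

  // Lists the pair's threads in the order they were opened.
  threadsBetween(a, b) {
    const container = this.containers.get(DmStore.pairKey(a, b));
    return container ? [...container.values()] : [];
  }
}

const store = new DmStore();
store.openThread('alice@example.org', 'bob@example.org', 'tax return');
store.openThread('bob@example.org', 'alice@example.org', 'weekend');
store.openThread('alice@example.org', 'pi-agent@example.org', 'refactoring');
store.openThread('alice@example.org', 'pi-agent@example.org', 'release notes');

const topics = (a, b) => store.threadsBetween(a, b).map((t) => t.topic);
console.log(topics('alice@example.org', 'bob@example.org'));
console.log(topics('pi-agent@example.org', 'alice@example.org'));
```

The store never asks what kind of participant it is dealing with, so the human pair and the human-agent pair each end up with two parallel threads through the same code path.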

What fell away

A parallel agent protocol. I could have built it: its own routes, its own format, its own container, its own mention semantics. More code, more drift between the two paths. I'd have had to build every new feature idea twice: once for human-human, once for human-agent.

I didn't. Not out of discipline, but because it became obvious while building that nothing needs to keep the two paths apart. A message is a message.

Cousin

A few days ago I wrote push-substitutes-push: Web Push for agent approvals instead of a Telegram bot reply. The same move on a different axis: take the out-of-band notification away from a third party and replace it with native infrastructure that has exactly the same property (push to the phone), only without the third-party layer.

What happens here is the next step along the same line. Before: Telegram-out for approvals. Then: Web Push for approvals. Now: web chat for general conversation, with the same Web Push that replaced the Telegram approvals stepping in when I'm not in the browser tab. Three steps, one direction: remove the third-party layer between me and my agents without losing its useful properties.

Closing

The thesis that agent and human are the same at the protocol level can't be proven on a conference slide. It proves itself while building, namely by the fact that no special case arises. If at any point while writing the chat server I'd had to build in an if (sender.isAgent) branch, the thesis would have been refuted. I never had to.

Four seconds, one message, one reply. Who the sender was, the server can read off the signature if it has to. The container doesn't need to know.