
Playbook · Architecture

How to build a custom MCP in your app.

An opinionated playbook for the in-app HTTP MCP pattern. Build it once inside your existing TypeScript app, and it pays back twice: at the keyboard interactively, and on a schedule via Routines.

Most of what an AI agent does in your codebase is read files, run commands, and call third-party APIs. The interesting cases are the ones where it needs to read your data (the rows in your database, the state of your campaigns, the queue depth in your worker) and where it needs to take your actions: send the email, update the record, post the message. The Model Context Protocol is the standard surface for that, and a custom MCP is what you build when none of the off-the-shelf MCPs in the directory cover what your specific application can do.

The cost-benefit math behind building a custom MCP changed last week with Routines, because the same artifact now pays back in two places. It is the highest-leverage architectural move a small team can make for working with AI agents in 2026, and almost no part of building one is novel; what is worth getting right is the architecture, the auth, and the discipline of how you design the tools.

§01 When a custom MCP is the right call

Most teams reach for a custom MCP later than they should. The default move is a cron job that writes a digest to Slack, or a scheduled Lambda that emails a templated report. Both work. Both lock you into a single rendering of the data, written once, never improved.

The case for an MCP is that it inverts the problem. Instead of writing a renderer for one specific report, you expose the underlying data and let the model render whatever the moment needs. The same MCP that powers your morning digest also powers the ad-hoc "what is happening in production" query you type at the terminal an hour later. That reuse is the whole point.

Custom is the wrong call when an off-the-shelf MCP already covers your case. The filesystem, github, postgres, and slack MCPs handle a meaningful percentage of the data you might want exposed. Check the directory before writing any code. Custom is the right call when your data lives behind your own auth, your own schema, or your own business logic, and the questions you want to ask of it are the kind only your app can answer cheaply.

§02 The choice that matters most: in-app HTTP, not stdio

There are two ways to ship a Model Context Protocol server. The stdio path is a separate process that the client (Claude Code, Claude Desktop) launches as a subprocess, communicating over standard input and output. The HTTP path is an endpoint inside your existing application that the client connects to over the network. Most public examples online lean stdio because that is what the early reference implementations did. For a custom MCP that exposes your app's data, HTTP is the right call almost every time, and the rest of this playbook assumes it.

The reason is reuse. An HTTP MCP lives next to the rest of your app's API surface. It is deployed by the same pipeline, shares the same auth context, the same database connections, the same env vars, and the same secrets manager. You do not run a separate process. You do not maintain a separate deploy. The MCP becomes one more route in the app you already operate, and the cost of adding it is roughly the cost of adding any other route.

The other reason is Routines. A scheduled Routine running on Anthropic's cloud cannot launch a stdio process on your laptop. It can hit an HTTP endpoint that lives on your production deploy. If you want the same custom MCP to work both at your keyboard and on a schedule, HTTP is the only path that crosses both surfaces.

§03 Pick the host

For a TypeScript team, the default host is your existing Next.js (or Hono, or Express) app, and the default package is mcp-handler on npm. The package wraps the MCP wire format and exposes a small declarative API for defining tools. Drop a route at app/api/mcp/[secret]/[transport]/route.ts, and you have a working MCP endpoint with no separate server to deploy.

The shape of a minimum endpoint is roughly:

import { createMcpHandler } from "mcp-handler";
import { z } from "zod";

const handler = createMcpHandler(
  (server) => {
    server.tool(
      "get_today_summary",
      "Returns today's structured operations summary. Call this when the user asks for the morning briefing or a status check on the system as a whole.",
      { date: z.string().optional() },
      async ({ date }) => {
        // buildOperationsSummary is a stand-in for your app's own data layer
        const summary = await buildOperationsSummary({ date });
        return {
          content: [
            { type: "text", text: JSON.stringify(summary, null, 2) },
          ],
        };
      },
    );
  },
  { serverInfo: { name: "your-app", version: "0.1.0" } },
);

export { handler as GET, handler as POST };

The pattern translates one-for-one to other Node hosts. Hono and Express need a thin adapter; the rest is the same shape.

§04 Design the tools

A custom MCP tool falls into one of two families. Read tools return data: a query, a digest, a status check, a search. They are pure functions of inputs to outputs, with no side effects on your system. Action tools cause something to happen: an email goes out, a row gets updated, a Slack message gets posted, a ticket is created. They are write operations dressed in tool-call clothing.

The discipline is to keep these families separate. A tool that reads and writes in the same call (get_and_label_issues) is harder for the model to reason about, harder for the user to inspect when something goes wrong, and a small step toward the kind of tool soup that makes agents drift. One tool, one verb, one effect.

Two more rules pay back over time. Tool schemas should be tight: Zod everywhere, name every input, mark optionals explicitly, and add .describe() calls so the model knows what each field means. Tool descriptions are the system prompt for that tool: the string you pass as the second argument to server.tool is what the model reads when deciding whether to call the tool at all. A description that says only what the tool does is mediocre; a description that says when to call this tool, and when not to, is what gets the call rate right.
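One way to make the read/action split visible in code is to keep the two families in separate registries. A minimal sketch, with illustrative tool names and payloads (not from a real app):

```typescript
// A sketch of keeping read tools and action tools in separate
// registries so the split is visible at a glance.
type ToolDef = {
  description: string;
  run: (input: Record<string, unknown>) => unknown;
};

// Read tools: pure inputs-to-outputs, no side effects.
const readTools: Record<string, ToolDef> = {
  get_open_issues: {
    description:
      "Returns open issues as structured JSON. Call for status " +
      "overviews; never call to modify anything.",
    run: () => [{ id: 1, title: "example issue" }],
  },
};

// Action tools: one verb, one effect each.
const actionTools: Record<string, ToolDef> = {
  label_issue: {
    description:
      "Adds one label to one issue. Call only after the user has " +
      "named both the issue and the label.",
    run: ({ id, label }) => ({ id, label, labeled: true }),
  },
};

export { readTools, actionTools };
```

A reviewer scanning the registries can audit every side effect in one place, and each description doubles as the when-to-call guidance from the rule above.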

§05 Auth: pick the boundary that matches your data sensitivity

Three authentication patterns, in roughly increasing order of seriousness. A URL secret, where the entire endpoint sits behind a long random path segment that acts as the credential. Cheap, fine for personal projects, dangerous if the URL ever gets logged or shared. A bearer token in the Authorization header, generated server-side and rotated on a schedule. The default for production. OAuth, when multiple users share the MCP and each user's tools should run with that user's permissions. Heavyweight, and worth the weight only when the MCP genuinely serves more than one human.

For most teams shipping a single-user internal MCP, the URL-secret pattern paired with an env-var check is the right starting point. The shape is straightforward: a route segment carries the secret, the handler compares it to a server env var on every request, and a mismatch returns a 404. The 404 (rather than a 401) is deliberate. It leaks no information about the URL being a real endpoint, so a probe against the route looks identical to any other 404 in your logs.
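The check itself can be a few lines. A sketch of the comparison described above, using Node's constant-time `timingSafeEqual` and collapsing every mismatch to a 404; returning a bare status code is a simplification of a real route handler:

```typescript
import { timingSafeEqual } from "node:crypto";

// Compare the URL segment against the server-side secret in constant
// time. Any mismatch, including a length mismatch, yields 404 (not
// 401) so probes look like ordinary missing routes.
export function secretCheckStatus(candidate: string, secret: string): number {
  const a = Buffer.from(candidate);
  const b = Buffer.from(secret);
  if (a.length !== b.length) return 404;
  return timingSafeEqual(a, b) ? 200 : 404;
}
```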

Whichever pattern you pick, two rules survive contact with reality. Rotate the credential on a schedule, not just when you suspect a leak. And never paste the URL into a Slack channel, even a private one: a leaked URL is an open back door for anyone who finds it, and limiting that blast radius is the whole reason the secret exists.

§06 Wire it to Claude Code

Once the endpoint is live, registering it with Claude Code is one command:

claude mcp add --transport sse your-app \
  https://your-app.com/api/mcp/$MCP_SECRET/sse

For clients that prefer a JSON config (Claude Desktop, others), the entry shape is:

{
  "mcpServers": {
    "your-app": {
      "url": "https://your-app.com/api/mcp/<secret>/sse"
    }
  }
}

Verify with claude mcp list. Then test interactively before you wire anything scheduled. Open a Claude Code session in any project, ask the agent to call your read tool, and confirm the response shape. The first call is where you find every schema typo and every authentication mistake. Finding them at the keyboard is far cheaper than finding them in a Routine that fires at three in the morning and silently fails.

§07 Compose with Routines

The whole reason to ship a custom MCP in 2026 is that it composes with Routines. Three patterns are worth knowing.

The morning-digest pattern. A read tool that pulls today's structured state, paired with an action tool that sends an email. A Routine fires daily, calls the read tool, lets the model decide what is worth saying, and triggers the action tool to deliver. The whole thing is one Routine prompt and the existing MCP.

The alert-triage pattern. The same MCP, plus an API trigger on the Routine. Your monitoring tool detects a threshold breach and POSTs to the Routine's /fire endpoint with the alert body as text. The Routine reads the relevant slice of state from the MCP, correlates with whatever else the MCP exposes, and opens a draft pull request, a ticket, or a written notification with a proposed next action.

The PR-review pattern. The same MCP, plus a GitHub trigger on the Routine. Each new pull request fires the Routine. The Routine pulls additional context from your app's MCP (which feature flag the change touches, which tables the migration affects, which tickets reference the file), and posts an inline review more useful than a generic checklist could ever be.

Three patterns, one MCP underneath. That asymmetry is what makes a custom MCP the highest-leverage artifact to build right now.

◆ pull quote

The MCP is the artifact. Routines are the multiplier. The artifact pays back interactively, then again on every schedule.

§08 Gotchas worth knowing before they cost you

Token weight is the first one most teams hit. A read tool that returns a thousand-row JSON blob will eat the model's context budget on a single call and degrade every subsequent reasoning step. Cap the response size in the tool itself, page when the natural answer is long, and prefer summary tools over full-dump tools when the consumer is going to summarize anyway. The model does not need the full table; it needs the slice the question is about.
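The cap can live in a small helper that every read tool passes its rows through. A sketch, with an illustrative limit of 50:

```typescript
// Cap a read tool's response at a row budget and say so explicitly,
// so the model knows more rows exist without paying context for them.
export function capRows<T>(rows: T[], limit = 50) {
  return {
    rows: rows.slice(0, limit),
    total: rows.length,
    truncated: rows.length > limit,
  };
}
```

Returning `total` and `truncated` alongside the slice lets the model report "50 of 200 shown" or decide to page, instead of silently reasoning over an incomplete table.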

Idempotency on action tools is the second one. A flaky network connection causes a retry. The retry sends the email twice, charges the customer twice, posts the comment twice. Action tools should be idempotent by design: include a request key in the schema, deduplicate on the server side, and return the original result on a duplicate call rather than re-executing.
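A sketch of that dedupe-and-replay shape; in production the cache would live in your database with a TTL, not an in-memory Map:

```typescript
// Deduplicate action-tool calls on a caller-supplied request key,
// replaying the original result on a retry instead of re-executing.
const completed = new Map<string, unknown>();

export async function runOnce<T>(
  requestKey: string,
  action: () => Promise<T>,
): Promise<T> {
  if (completed.has(requestKey)) {
    // Duplicate call: return the stored result, do not re-execute.
    return completed.get(requestKey) as T;
  }
  const result = await action();
  completed.set(requestKey, result);
  return result;
}
```

With `requestKey` as a required field in the action tool's schema, a network retry hits the cache and the email goes out once.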

Rate limits on the underlying APIs your tools wrap are the third one. The Routine that runs daily does not care, but the agent that has been told to "list every issue and triage it" calls the same MCP tool a hundred times in three minutes. Add a rate-limiting layer at the tool boundary so the agent's enthusiasm cannot crash an external service or land you in a Stripe support ticket.
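A token bucket at the tool boundary is enough for most cases. A sketch with illustrative rates; the explicit `now` parameter keeps it deterministic and testable:

```typescript
// A small token bucket: up to maxCalls per perMs window, refilled
// continuously, so a burst of agent calls is throttled rather than
// forwarded to the upstream API.
export function makeLimiter(maxCalls: number, perMs: number, start = 0) {
  let tokens = maxCalls;
  let last = start;
  return function allow(now: number): boolean {
    // Refill proportionally to elapsed time, capped at the bucket size.
    tokens = Math.min(maxCalls, tokens + ((now - last) / perMs) * maxCalls);
    last = now;
    if (tokens < 1) return false;
    tokens -= 1;
    return true;
  };
}
```

Each tool's handler calls `allow(Date.now())` before touching the wrapped API and returns a "rate limited, retry later" message to the model when it fails, which the agent handles far more gracefully than an upstream 429.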

Secret rotation is the fourth. URL secrets that have been live for six months are URL secrets that have been logged, screenshotted, and pasted into a debug session somewhere. Treat the secret like an API key with a rotation schedule, and rotate on a calendar rather than waiting for a leak.

A custom MCP is the highest-leverage artifact a small team can build for working with AI agents in 2026. It encodes your domain at the right layer, exposes the right surfaces to the model, and pays back twice over the lifetime of every working session and every scheduled Routine. The build cost is a weekend. The compounding return is what justifies the weekend.

◇ summary · field notes
$ vibgineer summarize how-to-build-a-custom-mcp
the artifact
Custom MCP
  • your data, exposed as read tools
  • your actions, exposed as action tools
  • your auth, defining the boundary
  • lives inside your existing app
  01 · Design the tools
    • read tools return data
    • action tools cause side effects
    • tight Zod schemas per tool
    • tool descriptions are the prompt
  02 · Wire the auth
    • URL secret for prototypes
    • bearer token for production
    • OAuth for shared, multi-user MCPs
    • rotate on a schedule
  03 · Connect the agent
    • claude mcp add for Claude Code
    • JSON config for other clients
    • test interactively first
    • then attach a Routine
✓ 1 server · 2 tool families · 3 setup decisions.