Hey HN – Metrx is a scorecard for AI agents, so you (and the agents themselves) can understand and optimize what each one is worth.
I was paying for 11 agents with zero accountability. It's the same visibility gap that existed for human workforces before performance reviews.
What Metrx does: It's an MCP server with 23 tools across 10 domains that gives each agent a scorecard — a P&L statement showing both what it costs and what it produces. An agent can:
- Check its own ROI (cost vs. attributed revenue via Stripe, HubSpot, or a generic webhook)
- See its performance grade (A+ to F) relative to other agents
- Get optimization recommendations (switch models, reduce waste)
- Generate board-ready ROI audits for human decision-makers
- Run A/B model experiments with statistical significance testing
The server sits between your agents and your LLM providers, tracks every call through a Cloudflare AI Gateway, tags each call by agent identity, and exposes both cost and revenue attribution as MCP tools.
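To make the tagging concrete, here is a minimal sketch of how a call could be routed through Cloudflare AI Gateway while carrying an agent identity. The gateway URL shape and the `cf-aig-metadata` header follow Cloudflare's documented custom-metadata feature; the agent-id scheme (`support-bot-3`) and the helper itself are illustrative, not Metrx's actual code.

```typescript
// Builds a provider call proxied through Cloudflare AI Gateway,
// tagged with the calling agent's identity for per-agent cost grouping.
type TaggedCall = { url: string; headers: Record<string, string>; body: string };

function buildTaggedCall(
  accountId: string,
  gatewayId: string,
  agentId: string,
  apiKey: string,
  messages: { role: string; content: string }[]
): TaggedCall {
  return {
    // AI Gateway proxies OpenAI-style traffic at this endpoint shape
    url: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai/chat/completions`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      // Custom metadata attributes the call to one agent; the gateway
      // logs it alongside token usage, enabling cost-per-agent rollups.
      "cf-aig-metadata": JSON.stringify({ agent: agentId }),
    },
    body: JSON.stringify({ model: "gpt-4o-mini", messages }),
  };
}
```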
The key insight: Cost tracking tells you what you spent. Revenue attribution tells you what you earned. Together, that's a P&L per agent — and that's what lets you manage AI agents like a real workforce. Most "AI cost tools" stop at the cost side. The revenue attribution is what makes this a scorecard, not a billing dashboard.
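The per-agent P&L boils down to a small calculation. This is a sketch under assumed field names (`costUsd` from gateway-tracked spend, `revenueUsd` from attributed conversions), not Metrx's internal schema:

```typescript
// One agent's ledger: what it cost vs. what revenue was attributed to it.
interface AgentLedger { agentId: string; costUsd: number; revenueUsd: number }

function pnl(ledger: AgentLedger) {
  const profit = ledger.revenueUsd - ledger.costUsd;
  // ROI as a multiple of cost; an agent with zero cost is trivially infinite ROI
  const roi = ledger.costUsd > 0 ? profit / ledger.costUsd : Infinity;
  return { agentId: ledger.agentId, profit, roi };
}
```

An agent that cost $100 and drove $300 in attributed revenue nets $200 profit at 2x ROI; that's the number a grade or an audit report can hang off.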
The 10 tool domains (all prefixed `metrx_*`):
1. Agent fleet overview & performance summaries (3 tools)
2. Optimization — model routing, provider arbitrage (4 tools)
3. Budget management & enforcement (3 tools)
4. Alert monitoring & failure prediction (3 tools)
5. A/B model experiments with statistical analysis (3 tools)
6. Cost leak detection & waste scanning (1 tool)
7. Revenue attribution & ROI calculation (3 tools)
8. Alert threshold configuration (1 tool)
9. Board-ready ROI audit reports (1 tool)
10. Upgrade justification business cases (1 tool)
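For the A/B experiments, "statistical significance" is the standard kind of check you'd expect; a minimal sketch using a two-proportion z-test (1.96 ≈ 95% confidence), with illustrative field names rather than the actual experiment schema:

```typescript
// Success counts for one model variant in an A/B experiment.
interface Variant { successes: number; trials: number }

function zScore(a: Variant, b: Variant): number {
  const pA = a.successes / a.trials;
  const pB = b.successes / b.trials;
  // Pooled success rate under the null hypothesis (no real difference)
  const pooled = (a.successes + b.successes) / (a.trials + b.trials);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / a.trials + 1 / b.trials));
  return (pA - pB) / se;
}

// True when the observed difference is unlikely to be noise.
function isSignificant(a: Variant, b: Variant, z = 1.96): boolean {
  return Math.abs(zScore(a, b)) > z;
}
```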
Stack: Next.js 14, Supabase (Postgres + RLS), Cloudflare AI Gateway, Vercel, TypeScript. The MCP server is published on npm and listed on Smithery.
Try it in 30 seconds — no signup needed:

```
npx @metrxbot/mcp-server --demo
```
Or if you want the full scorecard dashboard + revenue attribution + team features: https://metrxbot.com
Pricing: Free (3 agents) → Lite $19/mo (10 agents) → Pro $49/mo (unlimited)
What I'd love feedback on:
1. For those running AI agents in production — can you attribute revenue to specific agents today? Or is it all aggregate spend?
2. Is the MCP approach (giving agents themselves performance awareness) useful, or do you prefer human-only dashboards?
3. The "agent generates its own ROI audit" feature — clever or too weird?
Happy to answer any technical questions about the architecture.