For thirty years, customer service was a menu. Press 1 for English. Press 2 to block your card. Press 3 for your last transaction. Even when your card had been stolen and was bleeding cash, you pressed 2. Then 3. Then 0 to scream at a human. We didn't fix this when we built smarter menus. We fixed it when we built a translator.
The interesting move in AI right now isn't "replace the rule engine with the LLM" — that's a category error you can lose your job over. It's "keep the rule engine, swap the menu for a conversation."
1. The bug was never the rules. It was the doorway.
Your IVR business logic is correct. Block card. Fetch last transaction. Transfer to agent. These are deterministic, audited, traceable — and they should stay that way. None of that needs an LLM.
What needs an LLM is the doorway — turning "my card was stolen and I need to know if anything moved on it" into three function calls against systems you already trust.
2. The LLM is a dispatcher, not a database.
Tool-calling flipped the architecture. The model doesn't know your last transaction — it knows it should call getLastTransaction(userId) and wait for the result. Your APIs stay canonical. The model becomes the routing layer.
This is the part most teams get backwards. They try to stuff knowledge into the prompt, fine-tune the model on their schema, build a RAG pipeline that re-derives data the database already owns. Stop. Your database is the source of truth. The LLM's job is to figure out which question is being asked and which function answers it.
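The dispatcher role described above can be sketched in a few lines. This is a minimal illustration, not a production design: the `ToolCall` shape, the `handlers` map, the handler return strings, and the `u-123` user id are all hypothetical stand-ins for whatever your model API and backend actually expose. The point it shows is that the model only proposes calls by name, and your own code decides whether a handler exists and runs it.

```typescript
// Hypothetical shape of a tool call proposed by the model.
type ToolCall = { name: string; args: Record<string, string> };
type Handler = (args: Record<string, string>) => string;

// Stand-ins for the deterministic, audited systems you already run.
const handlers = new Map<string, Handler>([
  ["blockCard", (a) => `card blocked for ${a.userId}`],
  ["getLastTransaction", (a) => `last transaction fetched for ${a.userId}`],
  ["transferToAgent", (a) => `transferred to agent: ${a.reason}`],
]);

// The model proposes; this function executes only tools it recognizes.
function dispatch(calls: ToolCall[]): string[] {
  return calls.map((call) => {
    const handler = handlers.get(call.name);
    if (handler === undefined) {
      return `rejected unknown tool: ${call.name}`;
    }
    return handler(call.args);
  });
}

// What a model might propose for
// "my card was stolen and I need to know if anything moved on it".
const proposed: ToolCall[] = [
  { name: "blockCard", args: { userId: "u-123" } },
  { name: "getLastTransaction", args: { userId: "u-123" } },
];
const results = dispatch(proposed);
```

Keeping the handlers in a closed map means a hallucinated tool name is rejected rather than executed, and nothing about the transaction data ever has to live in the prompt.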
// IVR routing
switch (digit) {
  case "1": setLanguage("en"); break;
  case "2": blockCard(userId); break;
  case "3": showLast(userId); break;
  case "0": transferToAgent(); break;
  default: replayMenu();
}

// Tool schemas: the LLM decides which to call
const tools = [
  { name: "blockCard",
    params: { userId: "string" } },
  { name: "getLastTransaction",
    params: { userId: "string" } },
  { name: "transferToAgent",
    params: { reason: "string" } },
];
// The model picks tools and args from
// "my card was stolen, what moved?"

A quick gut-check before you scale this
The math on hybrid systems isn't obvious. Most teams either over-route to the LLM (expensive) or under-route (defeats the point).
What does the hybrid actually cost at scale?
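One rough way to reason about the routing trade-off is a blended per-request cost. The sketch below is only arithmetic under stated assumptions: every field in `CostInputs` is a hypothetical input you would replace with your own measured numbers, and it deliberately ignores second-order effects such as callers who give up on the menu and escalate to a human agent.

```typescript
// All inputs are hypothetical; plug in your own measured values.
interface CostInputs {
  requestsPerMonth: number;
  llmShare: number;           // fraction of requests routed through the LLM, 0..1
  llmCostPerRequest: number;  // cost of one request on the LLM path
  menuCostPerRequest: number; // cost of one request on the deterministic path
}

// Blended monthly cost: a weighted average of the two paths times volume.
function monthlyCost(c: CostInputs): number {
  if (c.llmShare < 0 || c.llmShare > 1) {
    throw new RangeError("llmShare must be between 0 and 1");
  }
  const perRequest =
    c.llmShare * c.llmCostPerRequest +
    (1 - c.llmShare) * c.menuCostPerRequest;
  return c.requestsPerMonth * perRequest;
}
```

Sweeping `llmShare` from 0 to 1 with your own numbers shows how quickly over-routing adds up. The cost of under-routing is harder to see in this formula, because it shows up as abandoned calls and agent time rather than as a line item.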
3. Always wire the kill switch.
The LLM will hallucinate, mis-route, or pick the wrong tool. Not often. Not catastrophically. But unpredictably enough that production needs a fallback.
Every system I've shipped with an LLM in the loop has the same escape valve: low confidence → fall back to the deterministic path. The model says "block their card" with 80% confidence — fine, execute. Says "refund last three charges" with 45% confidence — don't execute, hand to a human or replay a small menu.
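Take a customer who types "block my card and refund last 3 charges". The model returns blockCard at 0.82 confidence and refundCharges at 0.45. The best move is to split the request: execute blockCard now, and hand the refund to a human or a small menu rather than guess.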
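The escape valve can be sketched as a simple gate in front of execution. This assumes you already have a confidence score attached to each proposed call; how you obtain that score is outside this sketch. The `ScoredCall` shape, the `gate` function, the `u-123` user id, and the 0.8 threshold are all hypothetical, with the threshold chosen only to match the 80% and 45% examples above.

```typescript
// Hypothetical shape: a proposed tool call plus a confidence score in 0..1.
type ScoredCall = { name: string; args: Record<string, string>; confidence: number };
type Decision = { execute: ScoredCall[]; escalate: ScoredCall[] };

// Assumed threshold; tune it against your own traffic.
const CONFIDENCE_THRESHOLD = 0.8;

// Split proposed calls into "run now" and "hand to a human or a small menu".
function gate(calls: ScoredCall[], threshold: number = CONFIDENCE_THRESHOLD): Decision {
  const out: Decision = { execute: [], escalate: [] };
  for (const call of calls) {
    // A NaN score fails this comparison and escalates, which is the safe default.
    if (call.confidence >= threshold) {
      out.execute.push(call);
    } else {
      out.escalate.push(call);
    }
  }
  return out;
}

// "block my card and refund last 3 charges"
const decision = gate([
  { name: "blockCard", args: { userId: "u-123" }, confidence: 0.82 },
  { name: "refundCharges", args: { userId: "u-123", count: "3" }, confidence: 0.45 },
]);
// decision.execute holds blockCard; decision.escalate holds refundCharges.
```

Gating each call separately, rather than the whole request, lets the urgent and unambiguous part (blocking the card) go through immediately while the risky part waits for a human.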
So what's the actual shape?
Deterministic systems do what they've always done — execute business rules predictably, leave an audit trail, fail in legible ways. The LLM sits in front, listens to messy human intent, picks tools, and gets out of the way. When it's unsure, the deterministic path takes over.
It's not that AI replaced the rule engine. It's that we finally have something that can stand at the door and understand what people are actually asking for — then pass them through to the system that already knew the answer.
The menu was never the answer. It was the apology for not having one.
What do you push back on?