When AI chatbot vendors pitch per-resolution pricing, it sounds reasonable. You only pay when the bot actually resolves a customer issue. No resolution, no charge. Fair, right?
Not when you do the math.
Let's run the numbers
A mid-size retailer with a website getting decent traffic will generate somewhere between 5,000 and 15,000 chatbot conversations per month. At the low end of industry pricing -- $0.99 per resolution -- you're looking at $5,000 to $15,000 monthly. That's $60,000 to $180,000 per year.
For a chatbot.
To put that in perspective, that's the fully loaded cost of one to three support agents, assuming roughly $60,000 per agent including benefits and overhead. The very people the chatbot was supposed to augment.
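The arithmetic above can be sketched in a few lines. This is a back-of-envelope check, not a pricing model; the conversation volumes and the $0.99 rate are the article's assumed figures, and the helper name is just illustrative:

```python
# Assumed figures from the article: 5,000-15,000 conversations/month,
# $0.99 billed per resolution.
PRICE_PER_RESOLUTION = 0.99

def annual_cost(conversations_per_month: int,
                price: float = PRICE_PER_RESOLUTION) -> float:
    """Annual spend, assuming every conversation is billed as a resolution."""
    return conversations_per_month * price * 12

low = annual_cost(5_000)    # roughly $59,400/year
high = annual_cost(15_000)  # roughly $178,200/year
```

Note the worst-case assumption baked in here: every conversation resolves and therefore bills. Real resolution rates are lower, but so is the value you're getting for the unresolved ones.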
How we got here
Per-resolution pricing was invented by vendors who host massive shared infrastructure. They need to recoup the cost of running large language models at scale, and per-conversation billing is the model that Wall Street understands. It maps to "usage-based revenue," which looks great in investor decks.
But it creates a perverse incentive. The vendor benefits when your customers have more problems. The more questions your customers ask, the more you pay. There is zero incentive for the vendor to help you reduce inquiry volume.
The alternative model
Modern LLMs have changed the cost structure fundamentally. Running a customer conversation through Claude or GPT-4 costs between $0.01 and $0.03 in API tokens. That's the actual computational cost. Everything above that is margin for the chatbot vendor.
A self-hosted AI platform eliminates the middleman. You pay the LLM provider directly for tokens, host the infrastructure yourself, and your cost per conversation drops by 95% or more.
At 100,000 conversations per year:

- Per-resolution vendor: $99,000
- Self-hosted with direct LLM tokens: $2,000
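The comparison above works out as follows. A minimal sketch, assuming the $0.02 per-conversation token cost is the midpoint of the $0.01 to $0.03 range quoted earlier:

```python
# Rough cost comparison at 100,000 conversations/year.
CONVERSATIONS_PER_YEAR = 100_000
VENDOR_PRICE = 0.99       # per-resolution vendor rate from the article
TOKEN_COST = 0.02         # assumed midpoint of $0.01-$0.03 in LLM API tokens

vendor_total = CONVERSATIONS_PER_YEAR * VENDOR_PRICE  # $99,000
self_hosted_total = CONVERSATIONS_PER_YEAR * TOKEN_COST  # $2,000
savings_pct = (1 - self_hosted_total / vendor_total) * 100  # roughly 98%
```

Self-hosting isn't free, of course: infrastructure and engineering time sit on top of the token bill. But even generous estimates for those costs leave a wide gap between the two columns.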
The math isn't subtle. The question is whether you want to keep subsidizing someone else's margins or invest that savings into your own business.
What to ask your current vendor
If you're locked into a per-resolution contract, ask your vendor one question: "What is my actual cost per conversation in LLM tokens, separate from your platform fee?"
If they can't answer, or if the answer reveals a 50x markup, you have your answer about whether the relationship is working for you.
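If the vendor does answer, the implied markup is a one-line calculation. The $0.02 token cost here is an assumption carried over from the midpoint of the range above, and the function name is illustrative:

```python
# Implied markup: what you pay per resolution vs. the underlying token cost.
def implied_markup(price_per_resolution: float, token_cost: float) -> float:
    """Ratio of the vendor's per-resolution price to the raw LLM token cost."""
    return price_per_resolution / token_cost

implied_markup(0.99, 0.02)  # roughly a 50x markup
```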