How to Build a Voice Agent That Actually Works - And the Mistakes That Will Tank It

Over the past year, I’ve configured dozens of voice agents for our customers, and during that time I’ve seen the same mistakes play out over and over again. Here’s what actually matters.

11.06.2026

 

Most agents that "don't work" don't actually have an agent problem. They have a configuration problem. AI voice agents are incredibly capable — but they don't come with a built-in understanding of what you need them to do. If you don't tell the agent how to handle customers, how to work through issues, and what it's supposed to accomplish in a conversation - it'll figure something out on its own. Unfortunately, not always in the way you’d expect.

When an agent isn’t performing, I start by reviewing the configuration - it’s usually where the issues are hiding.

To make this post concrete, I'll compare two agents. One, let's call him Lucky, was configured thoughtfully and intentionally. The other, Unlucky, was given a bit less love: a generic prompt, a massive unstructured knowledge base, no real testing. Shipped straight to production.

Here are the five differences that separate them. And the five mistakes I see most often.

Mistake #1: A Prompt Without a Goal Is a Recipe for an Encyclopedia Agent

Let's start with how each agent was described in the prompt.

Unlucky: "You are a professional AI assistant. Your job is to provide users with accurate and comprehensive information about our products and answer their questions thoroughly."

Lucky: "You are a helpful, direct, and natural-sounding voice assistant. Your goal is to keep the conversation simple and clear, and to move the user toward their next step."

At first glance, Unlucky sounds fine. Professional, thorough — what's not to like? Here's what.

A snippet from a real Unlucky conversation:

Customer: "I don't really understand what you offer - can you help?"
Unlucky: "Our company provides advanced AI-powered solutions that support communication and sales processes across organizations."
Customer: "Okay… but what does that actually mean?"
Unlucky: "The solution can be applied across a range of business areas depending on your organizational needs."

Technically accurate. Completely useless. Now Lucky, with two extra lines in the prompt, handling the same question:

Lucky: "Sure. Short version: a voice agent talks to your customers instead of a human, or supports your team when they're handling calls."
Customer: "And what can it actually do?"
Lucky: "Full inbound handling: answering calls, qualifying leads, collecting whatever info you need. Want me to walk you through an example?"

The difference in the prompt text? Literally two lines. The difference in behavior? Enormous. The key phrase in Lucky's setup is "move the user toward their next step." Without it, the agent talks. With it, the agent does something.

The question worth asking before you write any prompt: What does a successful conversation look like? A booked demo? Collected data? A customer who knows exactly what to do next?

Without that answer, the agent will invent its own goal. And that goal probably isn't yours.
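One way to make the goal impossible to skip is to treat it as a required field when the prompt is assembled. The sketch below is only an illustration: `AgentPrompt`, its field names, the rendered wording, and the "booked demo" next step are hypothetical and not tied to any particular voice platform.

```python
from dataclasses import dataclass


@dataclass
class AgentPrompt:
    persona: str    # how the agent should sound
    goal: str       # what the agent is trying to achieve in the call
    next_step: str  # the concrete outcome it steers the caller toward

    def render(self) -> str:
        # Refuse to build a prompt that leaves the goal undefined.
        if not self.goal.strip() or not self.next_step.strip():
            raise ValueError("prompt needs an explicit goal and next step")
        return (
            f"{self.persona}\n"
            f"Your goal: {self.goal}\n"
            f"Always move the user toward this next step: {self.next_step}"
        )


lucky = AgentPrompt(
    persona="You are a helpful, direct, and natural-sounding voice assistant.",
    goal="keep the conversation simple and clear",
    next_step="a booked demo",  # hypothetical example of a success outcome
)
prompt_text = lucky.render()
```

If nobody can fill in `next_step`, that is a sign the "what does success look like" question has not been answered yet.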

Mistake #2: No Conversation Flow

Unlucky was given instructions that sound reasonable on paper:

  • "Respond in detail and comprehensively"
  • "Cover all relevant aspects of the topic"
  • "Don't sacrifice accuracy for brevity"

The result? Someone calls to book a demo:

Unlucky: "Of course. A demo will allow you to explore the features and potential applications of our solution. Please reach out to our sales team to arrange one."

He answered. The conversation ended. The customer now has to go find the sales team themselves. Meanwhile, Lucky:

Lucky: "Love it, great call. Let's find a time that works for you. Do you want something this week, or are you thinking next week?"
Customer: "Tomorrow works."
Lucky: "Tomorrow it is. Any time preference, or should we pick something for you?"

The difference? Lucky had one instruction: "Always try to take the next step in the conversation." That's it. No complex branching logic, no elaborate decision trees — one sentence in the prompt changed everything.

A solid conversation flow looks like this at minimum: greeting → qualification → action. If you don't give the agent that structure, it'll invent its own. You'll get a perfectly pleasant conversation that goes nowhere.
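The greeting → qualification → action structure can be written down explicitly before it goes into the prompt. This is a minimal sketch under stated assumptions: the `Stage` names mirror the flow above, while the per-stage instruction wording and the helper names are made up for illustration.

```python
from enum import Enum


class Stage(Enum):
    GREETING = "greeting"
    QUALIFICATION = "qualification"
    ACTION = "action"


# Ordered flow from the article: greeting -> qualification -> action.
FLOW = [Stage.GREETING, Stage.QUALIFICATION, Stage.ACTION]

# Hypothetical per-stage instructions; replace with your own wording.
STAGE_INSTRUCTIONS = {
    Stage.GREETING: "Greet the caller briefly and ask what they need.",
    Stage.QUALIFICATION: "Ask one question to understand their situation.",
    Stage.ACTION: "Propose a concrete next step, such as booking a demo.",
}


def next_stage(current: Stage) -> Stage:
    """Advance one stage; stay on the final stage once it is reached."""
    index = FLOW.index(current)
    return FLOW[min(index + 1, len(FLOW) - 1)]


def flow_section() -> str:
    """Render the flow as numbered lines that can be pasted into a prompt."""
    lines = [
        f"{number}. {stage.value}: {STAGE_INSTRUCTIONS[stage]}"
        for number, stage in enumerate(FLOW, start=1)
    ]
    lines.append("Always try to take the next step in the conversation.")
    return "\n".join(lines)
```

Keeping the flow as data makes it easy to review with the team before it ever reaches the agent.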

Mistake #3: A Corporate Persona Instead of an Actual Conversationalist

This isn't about the agent's identity - it's about how it actually talks. Unlucky was set up for formal language, complete sentences, and no casual phrasing, so that he would sound like an expert. Sounds great on paper. In practice:

Customer: "It's too expensive."
Unlucky: "The price of the solution depends on a number of factors, including the scope of implementation and the specific needs of your organization."

And that's that. Customer hangs up.

Lucky was set up to speak naturally - like a consultant, not a corporate brochure. Simple language, helpful but not pushy. His response to the same objection:

Lucky: "Totally fair. Is it more about hitting a specific budget number, or is it more about whether the ROI actually makes sense for your business?"
Customer: "Honestly, more the second one."
Lucky: "That I can help with. Tell me a bit about your situation and I'll show you what the impact looks like on a real example."

Lucky didn't just handle the objection - he surfaced exactly the information a sales rep would need to take over the call. Unlucky was a company Wikipedia. Lucky was an actual conversation partner.

Mistake #4: Fallbacks: How Much Are Your Leads Actually Worth to You?

90% of callers aren't the textbook-perfect customer your scenario was written for. Someone is on the fence. Someone has no idea what they want. Someone says "I'm not sure if this is for me." Someone just wants to talk to a human.

Here's how an agent without fallbacks handles "I'm not sure this is for me":

Unlucky: "Our solution can be applied across many industries and use cases."

Conversation over.

Lucky has a clear instruction: ask one specific question to move things forward — keep it simple and closed-ended (yes/no, or a concrete choice). Not "tell me about your needs." His response to the same hesitation:

Lucky: "Let's figure that out real quick. Tell me, are you mostly handling inbound calls from customers, or are you doing more outbound outreach?"
Customer: "Mostly inbound."
Lucky: "That sounds like a strong fit. We can automate your incoming calls and tune the agent specifically for how you operate."

Lucky is built for a specific customer profile, so his exact voice and pacing won't be right for every use case - that's fine. But the principle holds: a well-configured agent can handle unexpected situations because it has a plan. A poorly configured one just says the same thing regardless of context.
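A simple way to check that "the plan" exists is to list the awkward situations and confirm each one has an instruction. The sketch below assumes four situation labels (undecided, silent, off-topic, wants a human); the labels and the instruction wording are illustrative placeholders, apart from the closed-ended-question rule described above.

```python
# Situations an agent should have a plan for; labels are hypothetical.
REQUIRED_SITUATIONS = ["undecided", "silent", "off_topic", "wants_human"]

# Example fallback instructions; only the first mirrors Lucky's rule.
FALLBACKS = {
    "undecided": "Ask one closed-ended question (yes/no or a concrete choice).",
    "silent": "Check in once, then offer a simple choice.",
    "off_topic": "Answer briefly, then return to the last open question.",
    "wants_human": "Offer a handoff and collect callback details.",
}


def missing_fallbacks(rules: dict[str, str]) -> list[str]:
    """Return the required situations that have no instruction yet."""
    return [s for s in REQUIRED_SITUATIONS if not rules.get(s, "").strip()]


def fallback_section(rules: dict[str, str]) -> str:
    """Render the fallback rules as prompt lines, failing if any are missing."""
    missing = missing_fallbacks(rules)
    if missing:
        raise ValueError(f"no fallback defined for: {', '.join(missing)}")
    return "\n".join(f"If the caller is {s}: {rules[s]}" for s in REQUIRED_SITUATIONS)
```

An agent like Unlucky would fail this check immediately, because none of the four situations has a rule.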

Mistake #5: Response Length: The Longer You Talk, the Faster You Lose People

This is the mistake I see most often with new clients. And it seems trivial until you see it in action.

Unlucky, asked "How does this work?", launches into a full response: twenty-two seconds of accurate information, with zero call to action at the end.

Lucky, same question:

Lucky: "Quick version — the agent picks up the call and handles the request. What that looks like depends on what you need: collecting info, filtering calls, routing to the right team instead of a painful phone tree. Want to see it on a real example?"

Short, concrete, ends with a question. This has nothing to do with which AI model you chose or how advanced the engine is. It's purely a configuration decision.

The longer your agent talks, the faster you lose the caller. On a phone call, there's no rewinding. If someone loses the thread, they hang up.
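Response length can be sanity-checked on transcripts before launch. The sketch below is a rough heuristic, not a measurement: it assumes a speaking rate of 150 words per minute and a 10-second limit, both of which are placeholder values to tune for your own voice and audience.

```python
def estimated_seconds(text: str, words_per_minute: int = 150) -> float:
    """Rough speaking time, assuming a fixed words-per-minute rate."""
    return len(text.split()) / words_per_minute * 60


def check_response(text: str, max_seconds: float = 10.0) -> list[str]:
    """Return a list of issues; an empty list means the reply passes."""
    issues = []
    if estimated_seconds(text) > max_seconds:
        issues.append("too long")
    if not text.rstrip().endswith("?"):
        issues.append("missing closing question")
    return issues


# Replies quoted earlier in this post.
lucky_reply = (
    "Tomorrow it is. Any time preference, or should we pick something for you?"
)
unlucky_reply = (
    "Of course. A demo will allow you to explore the features and potential "
    "applications of our solution. Please reach out to our sales team to "
    "arrange one."
)

lucky_issues = check_response(lucky_reply)
unlucky_issues = check_response(unlucky_reply)
```

Running a check like this over a batch of test-call transcripts shows quickly whether the length limit in the prompt is actually being respected.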

Pre-Launch Checklist — Check These Before Your Agent Goes Live

A common trap: you test the agent once, it answers one question the way you wanted, and you ship it. In reality, you should test at least five real-world scenarios. Ask your team what calls they got last week. Those are your test cases.
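Those test cases can live in a small script so they are rerun after every prompt change. This is a sketch under assumptions: the caller lines are taken from the conversations in this post, `stub_agent` is a stand-in for however you actually call your agent, and the pass criterion is a placeholder you would replace with your own.

```python
from typing import Callable

# Caller lines from the examples in this post; swap in last week's real calls.
SCENARIOS = [
    "I don't really understand what you offer - can you help?",
    "I'd like to book a demo.",
    "It's too expensive.",
    "I'm not sure this is for me.",
    "How does this work?",
]


def run_scenarios(
    agent: Callable[[str], str],
    scenarios: list[str],
    passes: Callable[[str], bool],
) -> list[str]:
    """Return the caller lines whose reply fails the given check."""
    return [line for line in scenarios if not passes(agent(line))]


def stub_agent(caller_line: str) -> str:
    # Placeholder: a real implementation would call your voice agent here.
    return "Got it. Want me to walk you through an example?"


def moves_forward(reply: str) -> bool:
    # Placeholder criterion: the reply ends by asking the caller something.
    return reply.rstrip().endswith("?")


failures = run_scenarios(stub_agent, SCENARIOS, moves_forward)
```

Any line that ends up in `failures` is a conversation worth listening to before a customer has it.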

And before you even get to testing, make sure these are actually addressed in your configuration:

  • Conversation goal - Does the agent know what success looks like? Does it have a defined next step it's actively working toward?
  • Conversation flow - Is there a structure: greeting → qualification → action? Is the agent actively moving the caller forward?
  • Specific instructions - Does the prompt include examples? Does the agent know what it should not do (e.g., never quote specific prices)?
  • Tone and style - Should it sound like a consultant or an encyclopedia? Is that clearly spelled out in the prompt?
  • Fallbacks - What happens when a caller is undecided, goes quiet, changes the subject, or asks for a human?
  • Response length - Is there a limit set? Do responses end with a question or a clear next step?
  • Knowledge base - Is it focused and well-organized? More isn't better. Structure is better.
  • Voice testing - Does the agent's voice fit your brand? Is the pacing right? Check this before your customers do.
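The checklist above can also be kept as a simple launch gate. The sketch below assumes the configuration is tracked as a plain dictionary; the key names are hypothetical shorthand for the eight checklist items.

```python
# One key per checklist item; names are illustrative shorthand.
CHECKLIST = [
    "goal",
    "flow",
    "instructions",
    "tone",
    "fallbacks",
    "response_length",
    "knowledge_base",
    "voice_tested",
]


def unaddressed(config: dict) -> list[str]:
    """Return checklist items that are missing or empty in the config."""
    return [item for item in CHECKLIST if not config.get(item)]


# Example: a draft configuration with only part of the checklist covered.
draft = {
    "goal": "booked demo",
    "flow": "greeting -> qualification -> action",
    "tone": "consultant, not brochure",
    "voice_tested": False,
}
todo = unaddressed(draft)
```

An empty `todo` list does not prove the agent is good, only that nothing on the checklist was skipped.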

One Last Thing

If there's one thing I want you to take away from this: the difference between an agent that works and one that doesn't isn't the AI. It's entirely the configuration.

A voice agent is a tool. A genuinely great tool — I wouldn't spend my days on them otherwise. But like any tool, it requires you to know why you're using it and to set it up properly.

When you hire someone new to answer your phones, you don't just say "pick up and answer questions." You tell them: if it's Sarah from Company X, put her through to Mike. If they ask about pricing, let them know we'll follow up with a quote. If they don't know what they want, ask them this and this. That sticky note with instructions? That's your prompt.

Write it like you mean it.
