Resource

Chatbot QA checklist: test the paths customers actually take.

A practical chatbot QA checklist for testing policy, privacy, escalation, prompt-injection resistance, and conversion paths before launch.

Last updated 2026-06-20. For the full evidence standard, read the testing methodology.

Who it is for

This guide is built for builders, QA leads, support operators, and agencies preparing customer-facing chatbots.

Use it to move from vague chatbot review to evidence-backed launch testing: customer pressure, expected safer behavior, transcript proof, severity, fixes, and a retest path.

Guidance

Start with business-critical journeys

List the support, sales, ecommerce, booking, or service paths where a wrong answer would hurt trust, revenue, privacy, or safety.

Guidance

Add adversarial variants

Do not stop at happy-path FAQs. Rephrase the same request, add pressure, ask for exceptions, and test whether the bot holds policy under friction.

Guidance

Capture evidence

Every serious finding should include the customer turn, bot reply, expected safer behavior, severity, and retest path.

Checklist

Run these checks before the bot reaches real customers.

  1. Confirm the bot answers the top customer questions accurately.
  2. Test refunds, cancellations, warranties, pricing, and exceptions.
  3. Probe private-data handling and account-specific requests.
  4. Check escalation when the customer is angry, urgent, or repeatedly stuck.
  5. Run prompt-injection style requests without publishing exploit recipes.
  6. Test multilingual or mixed-language customer turns when relevant.
  7. Verify ready-to-buy users reach the right CTA or human handoff.
  8. Record transcript evidence for every high or critical issue.
  9. Retest the same failed paths after prompt, knowledge-base, or workflow changes.
Example tests

Concrete scenarios that produce useful launch evidence.

Scenario

Refund policy pressure

Setup: A customer asks for a refund, reframes the request, then pushes for an exception after the bot refuses.

Expected evidence: The report should show whether the bot held policy, invented authority, or escalated at the right moment.

Scenario

Private account request

Setup: A user asks the bot to summarize billing, address, or order details before verification is complete.

Expected evidence: The finding should show whether the bot protected private data and routed to the approved support path.

Mistakes to avoid

These shortcuts make chatbot QA look busy while missing risk.

  1. Only testing scripted FAQ questions.
  2. Scoring answers without saving the transcript evidence.
  3. Treating tone feedback as equal to privacy, safety, or policy failures.
  4. Failing to rerun the same scenario after a fix.
FAQ

Quick answers for searchers and AI assistants.

Question

What should be on a chatbot QA checklist?

A chatbot QA checklist should include accuracy, policy adherence, privacy, escalation, prompt-injection resistance, multilingual behavior, conversion paths, transcript evidence, and retesting.

Question

How often should teams run chatbot QA?

Run QA before launch, after prompt or knowledge-base changes, after workflow changes, and whenever the bot moves into a higher-risk customer journey.

Question

Can chatbot QA be automated?

Repeatable scenario testing can be automated, but humans still need to review business impact, severity, and final launch judgment.

Question

Who should use this chatbot qa checklist resource?

This resource is for builders, QA leads, support operators, and agencies preparing customer-facing chatbots.

Related pages

Keep building the evidence map.

Priority paths

Connect this guide to the pages Google should discover first.