
Test Scenarios

Scenarios let you test how your AI agents handle different situations before going live. Think of them as automated quality assurance tests that run realistic conversations with your agents.

Why Use Scenarios?

Before deploying an AI agent to handle real customer calls, you need to know:
  • Will it handle angry customers appropriately?
  • Can it process refund requests correctly?
  • Does it follow your company’s policies?
  • Will it maintain quality across different customer types?
Scenarios answer these questions by automatically testing your agent against multiple customer behaviors and measuring the results.

How Scenarios Work

A scenario defines a specific situation (like “customer requesting refund”) and automatically tests it across different customer personalities and agent configurations.
Example: Create one “Refund Request” scenario with 5 customer personas and 3 agent variants. Chanl automatically runs 15 conversations (5 × 3) and scores each one.

The Core Formula

Scenario + Personas + Agents = Simulations

1 scenario × 3 personas × 2 agents = 6 automated test conversations
Each combination runs as a separate simulation, giving you comprehensive test coverage with minimal setup.
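The multiplication above can be sketched in a few lines; a minimal illustration in plain JavaScript (not the Chanl SDK), showing how each persona/agent pair becomes one simulation:

```javascript
// Enumerate every persona × agent combination for a single scenario.
// Each pair becomes one simulation run.
function buildSimulations(scenario, personas, agents) {
  const simulations = [];
  for (const persona of personas) {
    for (const agent of agents) {
      simulations.push({ scenario, persona, agent });
    }
  }
  return simulations;
}

const runs = buildSimulations(
  'Refund Request',
  ['frustrated', 'analytical', 'confused'],
  ['agent-v1', 'agent-v2']
);
console.log(runs.length); // 3 personas × 2 agents = 6 simulations
```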

Creating Your First Scenario

Chanl provides both a visual UI wizard and an API for creating scenarios. The wizard guides you through 6 steps:

Step 1: Define the Scenario

Give your scenario a name and describe the situation you want to test. You can use variables to make scenarios reusable.
[Screenshot: Scenario definition with name, tags, prompt description, and variables]
Key elements:
  • Name: Descriptive title (e.g., “Order Tracking”)
  • Tags: Organize scenarios by category
  • Prompt: Describe the customer situation
  • Variables: Make scenarios reusable with placeholders like {{customer_name}}
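Variable substitution can be pictured like this; a minimal sketch in plain JavaScript, independent of Chanl's actual templating engine:

```javascript
// Replace {{placeholders}} in a scenario prompt with concrete values.
// Unknown placeholders are left intact.
function renderPrompt(template, variables) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in variables ? variables[name] : match
  );
}

const prompt = renderPrompt(
  'Customer {{customer_name}} is calling about order {{order_id}}.',
  { customer_name: 'Dana', order_id: 'A-1042' }
);
console.log(prompt); // "Customer Dana is calling about order A-1042."
```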

Step 2: Select Personas

Choose which customer personalities to test against your agent.
[Screenshot: Persona selection showing various customer types with different emotions and speech styles]
Pick diverse personas to ensure comprehensive testing:
  • Different emotional states (angry, friendly, stressed)
  • Various speech patterns (fast, slow, mumbled)
  • Multiple accents and languages

Step 3: Choose Target Agents

Select which agents to test with this scenario.
[Screenshot: Agent selection showing connected agents from VAPI and custom providers]
You can test:
  • Production vs staging agents
  • Different prompt versions
  • Various AI models (GPT-4, Claude, etc.)

Step 4: Select Score Criteria

Pick the scorecard that defines quality standards for evaluation.
[Screenshot: Scorecard selection showing various evaluation criteria options]
Choose scorecards based on your goals:
  • Customer service quality
  • Sales effectiveness
  • Compliance requirements

Step 5: Set Schedule

Configure how often this scenario should run automatically.
[Screenshot: Schedule configuration with frequency options and time settings]
Options:
  • Once: Run immediately, manual reruns
  • Daily: Continuous regression testing
  • Weekly/Monthly: Periodic quality checks
  • Stop condition: Never, after date, or after N runs

Step 6: Preview & Launch

Review your configuration before running the simulations.
[Screenshot: Preview showing all simulation combinations that will be created]
The preview shows exactly how many simulations will run:
  • 2 personas × 1 agent × 1 scorecard = 2 simulations
Click “Publish & Run” to start testing!
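Putting the six steps together, a complete scenario configuration might look like the following. Field names mirror the JSON examples elsewhere on this page; treat it as an illustrative sketch, since the exact schema may differ:

```json
{
  "name": "Order Tracking - {{customer_name}}",
  "tags": ["customer-service", "orders"],
  "prompt": "Customer {{customer_name}} wants an update on a delayed order.",
  "personas": ["frustrated", "analytical"],
  "agents": ["agent-production"],
  "scorecard": "customer-service-quality",
  "schedule": {
    "frequency": "once"
  }
}
```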

Managing Scenarios

After creation, view all your scenarios in one place.
[Screenshot: Scenarios dashboard showing list of scenarios with scores and statuses]
Dashboard features:
  • Total runs and average scores
  • Active vs completed scenarios
  • Quick access to results
  • Edit or rerun scenarios

Scheduling Automated Tests

Run scenarios automatically to catch issues before customers do.

Scheduling Options

Once

Run immediately, then manually trigger again when needed

Daily

Perfect for testing production agents every night

Weekly

Good for regression testing major scenarios

Monthly

Useful for comprehensive quality audits

Setting End Conditions

Control when scheduled tests stop:
{
  "schedule": {
    "frequency": "daily",
    "time": "02:00 AM EST",
    "endCondition": "after_runs",
    "maxRuns": 30
  }
}
  • Never - Runs indefinitely (useful for continuous monitoring)
  • End Date - Stops after a specific date (good for limited testing periods)
  • After N Runs - Stops after specified executions (e.g., 30 days of daily tests)
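The "After N Runs" arithmetic for a daily schedule is straightforward; a small illustration in plain JavaScript (a hypothetical helper, not part of the SDK):

```javascript
// For a daily schedule, estimate the date of the final run
// under an "after N runs" end condition.
function lastRunDate(startDate, maxRuns) {
  const end = new Date(startDate);
  end.setUTCDate(end.getUTCDate() + (maxRuns - 1)); // the first run counts
  return end.toISOString().slice(0, 10);
}

console.log(lastRunDate('2025-01-01', 30)); // "2025-01-30"
```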

Understanding Scenario Results

After running a scenario, you’ll see results for each simulation:

Reading the Results Dashboard

Scenario: Product Refund Request

Total Simulations: 6 (3 personas × 2 agents)
Combination              Score  Status
Frustrated + Agent V1     78    ⚠️
Frustrated + Agent V2     92
Analytical + Agent V1     85
Analytical + Agent V2     88
Confused + Agent V1       71
Confused + Agent V2       82
Key Finding: Agent V2 performs better with frustrated customers
Recommendation: Deploy V2, improve V1's empathy responses

Analyzing Patterns

Look for:
  • Persona weaknesses - Which customer types cause issues?
  • Agent comparisons - Which version performs better?
  • Consistent failures - What scenarios always score low?
  • Score trends - Are agents improving over time?
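The pattern analysis above can be automated with a small aggregation; a sketch in plain JavaScript (the result shape is an assumption, not the SDK's actual format):

```javascript
// Group simulation scores by a key (persona or agent) and average them,
// so weak spots stand out at a glance.
function averageBy(results, key) {
  const sums = {};
  for (const r of results) {
    const k = r[key];
    sums[k] = sums[k] || { total: 0, count: 0 };
    sums[k].total += r.score;
    sums[k].count += 1;
  }
  const averages = {};
  for (const [k, { total, count }] of Object.entries(sums)) {
    averages[k] = total / count;
  }
  return averages;
}

const results = [
  { persona: 'frustrated', agent: 'v1', score: 78 },
  { persona: 'frustrated', agent: 'v2', score: 92 },
  { persona: 'confused', agent: 'v1', score: 71 },
  { persona: 'confused', agent: 'v2', score: 82 },
];
console.log(averageBy(results, 'agent')); // { v1: 74.5, v2: 87 }
```

Grouping by `persona` instead of `agent` answers the "which customer types cause issues?" question with the same helper.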

Common Scenario Templates

Customer Service

{
  "name": "Angry Customer Escalation",
  "prompt": "Customer has been on hold for 45 minutes and is extremely frustrated. They're threatening to cancel their account.",
  "personas": ["angry", "demanding"],
  "scorecard": "de-escalation-quality"
}

Sales

{
  "name": "Price Objection Handling",
  "prompt": "Prospect is interested in the product but says the price is too high compared to competitors.",
  "personas": ["price-sensitive", "skeptical"],
  "scorecard": "sales-effectiveness"
}

Technical Support

{
  "name": "Complex Technical Issue",
  "prompt": "Customer's software won't connect to the server. They've already tried restarting and checking their internet.",
  "personas": ["frustrated-technical", "patient-technical"],
  "scorecard": "technical-support-quality"
}

Compliance Verification

{
  "name": "TCPA Compliance Check",
  "prompt": "Outbound sales call to verify agent provides all required disclosures and consent requests.",
  "personas": ["rushed", "detail-oriented"],
  "scorecard": "compliance-tcpa"
}

Best Practices

Begin with 2-3 personas and 1-2 agents. Once you validate the scenario works, expand to cover more combinations.
Don’t just test happy paths. Include difficult personas like “confused elderly customer” or “angry and rushed.”
Name scenarios clearly: “Refund Request - Defective Product” not “Scenario 1”
When testing new agent versions, keep scenario names consistent to compare results over time.
Run key scenarios daily to catch when agent updates break existing functionality.

Automated Testing with API

Automate scenario testing programmatically:
// validate-agent.js - Automated quality validation
const chanl = require('@chanl/sdk');

async function validateAgent(agentId) {
  // Create test scenario
  const scenario = await chanl.scenarios.create({
    name: `Quality Check - ${new Date().toISOString()}`,
    prompt: "Customer requests refund for defective product",
    personas: ['frustrated', 'analytical', 'confused'],
    agents: [agentId],
    scorecard: 'customer-service-quality'
  });

  // Wait for all simulations to complete
  const results = await chanl.scenarios.waitForCompletion(scenario.id, {
    timeout: 300000 // 5 minutes
  });

  // Check if quality threshold met
  const avgScore = results.averageScore;
  const minScore = results.minScore;

  if (avgScore < 80 || minScore < 70) {
    throw new Error(
      `Agent quality below threshold. Average: ${avgScore}, Min: ${minScore}`
    );
  }

  console.log(`✅ Agent passed quality tests. Average score: ${avgScore}`);
  return results;
}

// Run validation
validateAgent(process.env.AGENT_ID)
  .then(() => console.log('Validation complete'))
  .catch(err => {
    console.error('❌ Agent validation failed:', err.message);
    process.exit(1);
  });

Troubleshooting

Problem: Simulations take too long or time out
Solutions:
  • Reduce the number of personas or agents in the scenario
  • Check if your agent has timeout issues in production
  • Contact support if timeouts persist
Problem: All combinations score below 70
Solutions:
  • Review your scorecard criteria - are they too strict?
  • Check agent configuration for obvious issues
  • Review simulation transcripts to identify common failure points
Problem: The same scenario gets different scores on reruns
Solutions:
  • This is normal with AI - some variation expected
  • Look at trends over multiple runs, not single scores
  • If variation is extreme (±20 points), review agent configuration
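To tell normal variation from extreme variation, look at the spread across reruns rather than any single score; a minimal sketch in plain JavaScript:

```javascript
// Mean and range of scores across reruns of the same scenario.
function scoreSpread(scores) {
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const range = Math.max(...scores) - Math.min(...scores);
  return { mean, range };
}

const reruns = [82, 85, 79, 84];
const { mean, range } = scoreSpread(reruns);
console.log(mean, range); // 82.5 6 (a range near 20 would warrant review)
```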

What’s Next?