Test Schedules
Schedules let you run your test scenarios automatically on a recurring basis. Think of them as your agent’s automated quality assurance system that works while you sleep.
Why Automate Testing?
Manual testing works for initial validation, but you need automation to:
- Catch regressions - Know immediately when code changes break existing functionality
- Monitor production - Run daily tests against live agents to detect quality degradation
- Save time - Test continuously without manual effort
- Build confidence - Deploy knowing your agents are consistently tested
Example: Schedule a nightly test that runs 20 scenarios against your production agent. Wake up to a report showing whether everything still works.
How Schedules Work
When you create a scenario, you can optionally add a schedule to run it automatically.
Creating a Schedule
Schedules are configured directly in your scenarios, either via the UI or via the API.
Via UI
When creating or editing a scenario:
- Navigate to the scenario settings
- Toggle “Enable Schedule”
- Choose frequency (Once, Daily, Weekly, Monthly)
- Set end condition (Never, End Date, After N Runs)
- Save the scenario
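The API request itself is not shown here, so the following is only a sketch of how the choices in the steps above might be expressed as data. The function name and every key (`enabled`, `frequency`, `end_condition`, `end_date`, `max_runs`, `schedule`) are hypothetical and are not taken from the product's API reference.

```python
import json

# Option values mirror the UI steps above; the key names are assumptions.
FREQUENCIES = {"once", "daily", "weekly", "monthly"}
END_CONDITIONS = {"never", "end_date", "after_n_runs"}


def build_schedule_config(frequency, end_condition="never", end_date=None, max_runs=None):
    """Return a hypothetical schedule block for a scenario payload."""
    if frequency not in FREQUENCIES:
        raise ValueError(f"unknown frequency: {frequency!r}")
    if end_condition not in END_CONDITIONS:
        raise ValueError(f"unknown end condition: {end_condition!r}")
    if end_condition == "end_date" and end_date is None:
        raise ValueError("end_date is required when end_condition is 'end_date'")
    if end_condition == "after_n_runs" and (max_runs is None or max_runs < 1):
        raise ValueError("max_runs must be >= 1 when end_condition is 'after_n_runs'")

    config = {"enabled": True, "frequency": frequency, "end_condition": end_condition}
    if end_condition == "end_date":
        config["end_date"] = end_date
    elif end_condition == "after_n_runs":
        config["max_runs"] = max_runs
    return config


if __name__ == "__main__":
    # Hypothetical scenario payload: daily runs that stop after 30 executions.
    payload = {
        "name": "Nightly checks",
        "schedule": build_schedule_config("daily", "after_n_runs", max_runs=30),
    }
    print(json.dumps(payload, indent=2))
```

Validating the combination up front means an incomplete end condition fails before any request is sent.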
Scheduling Options
Frequency Types
Once
Use for: One-time validation
Runs immediately when created. Can be manually triggered again later.
Daily
Use for: Continuous monitoring
Runs every day at a specified time. Perfect for regression testing.
Weekly
Use for: Regular checkpoints
Runs once per week on a specified day.
Monthly
Use for: Quarterly audits
Runs once per month on a specified date.
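To make the four frequency types concrete, here is a minimal sketch of how a next run time could be derived from the previous one. It is an illustration only: the `next_run` function is hypothetical, and how the product actually handles short months is not documented here, so the day-clamping rule below is an assumption.

```python
import calendar
from datetime import datetime, timedelta


def next_run(last_run, frequency):
    """Return the next run time after last_run, or None for a one-off schedule."""
    if frequency == "once":
        return None  # a "Once" schedule only runs again when triggered manually
    if frequency == "daily":
        return last_run + timedelta(days=1)
    if frequency == "weekly":
        return last_run + timedelta(weeks=1)
    if frequency == "monthly":
        year = last_run.year + (last_run.month // 12)
        month = last_run.month % 12 + 1
        # Assumption: clamp the day so the 31st falls back to the month's last day.
        day = min(last_run.day, calendar.monthrange(year, month)[1])
        return last_run.replace(year=year, month=month, day=day)
    raise ValueError(f"unknown frequency: {frequency!r}")


if __name__ == "__main__":
    print(next_run(datetime(2025, 1, 31, 2, 0), "monthly"))
```

Because this sketch steps from the previous run, a clamped date stays clamped in later months; a real scheduler would more likely store the configured day of the month and recompute from that.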
End Conditions
Control when your schedule stops running:
Never
Schedule runs indefinitely until manually stopped.
Use for: Production monitoring that should run continuously
End Date
Schedule stops after a specific date.
Use for: Time-limited testing periods or trials
After N Runs
Schedule stops after a specified number of executions.
Use for: Fixed testing cycles (e.g., 30 days of daily tests)
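The three end conditions can be read as a single stop check. The sketch below is illustrative only: the function and parameter names are hypothetical, and treating the end date as inclusive (the schedule still runs on that date) is an assumption drawn from the phrase "stops after a specific date".

```python
from datetime import date


def schedule_finished(end_condition, today, runs_completed, end_date=None, max_runs=None):
    """Return True once the schedule should stop running."""
    if end_condition == "never":
        return False  # runs until manually stopped
    if end_condition == "end_date":
        return today > end_date  # assumption: the end date itself still runs
    if end_condition == "after_n_runs":
        return runs_completed >= max_runs
    raise ValueError(f"unknown end condition: {end_condition!r}")


if __name__ == "__main__":
    # A 30-run cycle of daily tests is finished once the 30th run completes.
    print(schedule_finished("after_n_runs", date(2025, 1, 30), 30, max_runs=30))
```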
Common Scheduling Patterns
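As a rough overview, the three patterns in this section could be written down as presets that combine a frequency with an end condition. All key names, the preset names, and the choice of Saturday for the weekend run are assumptions made for illustration; they are not taken from the product.

```python
# Hypothetical presets for the patterns described in this section.
SCHEDULE_PATTERNS = {
    # Nightly monitoring: runs every day with no end condition.
    "continuous_production_monitoring": {"frequency": "daily", "end_condition": "never"},
    # Pre-deployment: a one-off run, triggered before each deployment.
    "pre_deployment_validation": {"frequency": "once"},
    # Weekend regression suite: weekly; Saturday is an assumed default.
    "weekly_regression_suite": {
        "frequency": "weekly",
        "day_of_week": "saturday",
        "end_condition": "never",
    },
}


def pattern_for(name):
    """Return a copy of a preset so callers can adjust it without mutating the original."""
    return dict(SCHEDULE_PATTERNS[name])


if __name__ == "__main__":
    print(pattern_for("weekly_regression_suite"))
```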
Continuous Production Monitoring
Run critical scenarios every night to catch issues early.
Pre-Deployment Validation
Test before each deployment.
Weekly Regression Suite
Run comprehensive tests every weekend.
Managing Schedules
Viewing Active Schedules
Via UI
Navigate to Schedules in the sidebar to see:
- Active schedules
- Next run time
- Last run date and status
- Total number of runs
Pausing a Schedule
Temporarily stop a schedule without deleting it, for example:
- During maintenance windows
- When testing major agent changes
- When temporarily reducing API usage
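As a toy illustration that pausing keeps a schedule and its settings intact, here is a small in-memory model. The `Schedule` class, its fields, and its status values are hypothetical and do not represent the product's API; resuming is included only to show the round trip.

```python
from dataclasses import dataclass


@dataclass
class Schedule:
    name: str
    frequency: str = "daily"
    status: str = "active"  # hypothetical states: "active" or "paused"

    def pause(self):
        """Stop future runs while keeping the schedule and its configuration."""
        if self.status != "active":
            raise RuntimeError(f"cannot pause a schedule that is {self.status}")
        self.status = "paused"

    def resume(self):
        """Reactivate a paused schedule."""
        if self.status != "paused":
            raise RuntimeError(f"cannot resume a schedule that is {self.status}")
        self.status = "active"

    def should_run(self):
        return self.status == "active"


if __name__ == "__main__":
    nightly = Schedule("Nightly checks")
    nightly.pause()    # e.g. at the start of a maintenance window
    print(nightly.should_run())
    nightly.resume()   # once maintenance is over
    print(nightly.should_run())
```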
Reactivating a Schedule
Resume a paused schedule.
Schedule Notifications
Get alerted when tests complete or fail.
Email Notifications
Slack Integration
Webhook Integration
Send results to your own systems.
Monitoring Schedule Performance
Key Metrics to Track
Success Rate
Percentage of scheduled runs that complete successfully
Average Score
Mean quality score across all scheduled runs
Score Trends
Whether quality is improving, stable, or degrading over time
Run Duration
How long tests take to complete
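If you export run results, the success rate, average score, and run duration described above are simple to compute yourself. The sketch below assumes a hypothetical record shape (`status`, `score`, `duration_s`) and averages scores over completed runs only, on the assumption that a failed run may not have a score.

```python
from statistics import mean


def summarize_runs(runs):
    """Summarize run records shaped like {"status": str, "score": float, "duration_s": float}."""
    if not runs:
        raise ValueError("no runs to summarize")
    completed = [r for r in runs if r["status"] == "completed"]
    return {
        # Share of scheduled runs that completed successfully.
        "success_rate": len(completed) / len(runs),
        # Mean quality score; None when no run completed.
        "average_score": mean(r["score"] for r in completed) if completed else None,
        # Mean wall-clock duration across all runs, in seconds.
        "average_duration_s": mean(r["duration_s"] for r in runs),
    }


if __name__ == "__main__":
    history = [
        {"status": "completed", "score": 80.0, "duration_s": 60.0},
        {"status": "completed", "score": 90.0, "duration_s": 120.0},
        {"status": "failed", "score": None, "duration_s": 30.0},
    ]
    print(summarize_runs(history))
```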
Analyzing Trends
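One simple way to decide whether quality is improving, stable, or degrading is to compare the average score of the older half of your runs with the newer half. This is a minimal sketch of that idea rather than the product's own trend analysis; the function name and the `tolerance` default of 1.0 score points are assumptions.

```python
from statistics import mean


def score_trend(scores, tolerance=1.0):
    """Classify chronological scores as 'improving', 'stable', or 'degrading'."""
    if len(scores) < 4:
        raise ValueError("need at least 4 scores to compare two windows")
    half = len(scores) // 2
    # Positive delta means recent runs score higher than earlier runs.
    delta = mean(scores[half:]) - mean(scores[:half])
    if delta > tolerance:
        return "improving"
    if delta < -tolerance:
        return "degrading"
    return "stable"


if __name__ == "__main__":
    print(score_trend([90, 88, 85, 80]))
```

Comparing window averages smooths out a single bad run, which is the kind of noise that makes individual results misleading.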
Best Practices
Start with Critical Scenarios
Schedule your most important test scenarios first. Don’t try to automate everything at once.
Run During Low-Traffic Periods
Schedule tests for nights or weekends to avoid impacting production systems during peak hours.
Set Up Alerts
Configure notifications so you know immediately when quality drops below acceptable levels.
Review Trends Weekly
Don’t just look at individual runs. Monitor trends over time to catch gradual degradation.
Programmatic Schedule Management
Create and manage schedules via API.
Troubleshooting
Schedule not running
Problem: Schedule shows as active but isn’t executing
Check:
- Verify timezone is correct
- Ensure end condition hasn’t been reached
- Check if schedule was manually paused
- Verify API key has necessary permissions
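For the timezone check in particular, it can help to convert the configured local run time to UTC and compare it with when runs actually start. The helper below is a sketch with a hypothetical name; it uses a fixed UTC offset for simplicity, which ignores daylight saving time (Python's `zoneinfo` module handles named zones if you need that).

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_time, utc_offset_hours):
    """Interpret a naive local datetime at a fixed UTC offset and convert it to UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=local_tz).astimezone(timezone.utc)


if __name__ == "__main__":
    # A run configured for 02:00 at UTC-5 starts at 07:00 UTC.
    print(to_utc(datetime(2025, 1, 15, 2, 0), -5))
```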
Tests failing consistently
Problem: Scheduled tests suddenly showing low scores
Investigate:
- Review recent agent configuration changes
- Check if test scenarios need updating
- Verify scorecard criteria haven’t changed
- Look for patterns (specific personas failing?)
Missing notifications
Problem: Not receiving schedule completion alerts
Solutions:
- Verify notification settings in schedule configuration
- Check spam/junk folders for emails
- Test that webhook URLs are accessible
- Confirm Slack integration is properly configured
High API usage
Problem: Schedules consuming too many API calls
Solutions:
- Reduce frequency of less critical schedules
- Decrease number of personas/agents in scenarios
- Pause schedules during testing periods
- Contact support to discuss quota limits
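Before changing anything, a back-of-the-envelope estimate shows which schedules dominate usage. The sketch below assumes, purely for illustration, one API call per scenario and persona pair per run, and approximates a month as 30 daily or 4 weekly runs; actual billing or quota accounting may differ.

```python
# Rough run counts per month for each recurring frequency (approximations).
RUNS_PER_MONTH = {"daily": 30, "weekly": 4, "monthly": 1}


def estimated_monthly_calls(schedules):
    """Estimate calls per month, assuming one call per scenario/persona pair per run."""
    total = 0
    for s in schedules:
        total += RUNS_PER_MONTH[s["frequency"]] * s["scenarios"] * s["personas"]
    return total


if __name__ == "__main__":
    schedules = [
        {"frequency": "daily", "scenarios": 20, "personas": 3},
        {"frequency": "weekly", "scenarios": 50, "personas": 2},
    ]
    print(estimated_monthly_calls(schedules))
```

In this example the daily schedule accounts for most of the total, so reducing its frequency or persona count would have the largest effect.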