How AI Test Generation Works in TestInspector: From Chat Prompt to Running Test

Avanish Pandey

June 12, 2026

TestInspector generates browser tests by converting natural language or Gherkin input into structured step records stored in a database, then executes those steps via Selenium WebDriver. No test code is written during generation; the system maintains a structured representation of test intent that can be run, edited, and exported independently of the original prompt. The workflow moves through five stages: chat input, AI agent orchestration, step validation and database persistence, Selenium execution, and optional export or CI integration.

What Is AI Test Generation in TestInspector?

Most test automation tools produce code — a Python file, a JavaScript spec, a Java class. TestInspector produces structured step records: each action a browser should perform is stored as a database row with a command, a target selector or URL, and an optional value. The AI agent writes these rows; the Selenium-based runner reads them at execution time.

This separation of test intent from execution has concrete consequences. Tests are immediately editable through the UI without touching code. They export to multiple formats — Playwright TypeScript, Selenium IDE, Gherkin — without any code migration. The AI can modify them through follow-up chat instructions because the underlying representation is structured data rather than natural language embedded in a source file.

The platform runs on Django with Anthropic Claude Sonnet 4.6 as its default model and executes via Selenium WebDriver on Chrome, Firefox, Edge, or Safari. Teams interact through a web dashboard, a browser extension for recording, or TestInspector MCP tokens for IDE integration with Claude Code and Cursor. For teams evaluating broader automation options, Astaqc's test automation services cover framework selection, implementation, and long-term maintenance.

Stage 1: Chat Input and WebSocket Message Handling

A user opens a Generate Tests session and types a description of what should be tested — a login flow, an e-commerce checkout, an API response assertion. Input can be free-form English, structured Gherkin (Given/When/Then), or a link to an Azure DevOps or Jira work item. No template is required; the agent interprets intent from context.

The message travels over a persistent WebSocket connection to a Django Channels consumer called ChatConsumer. ChatConsumer authenticates using a short-lived JWT embedded in the WebSocket URL, retrieves the user's organization context, and locates or creates the Room record that tracks the active LLM and cumulative token usage for that session.

Once the message is persisted, ChatConsumer triggers the AI agent pipeline asynchronously. The user immediately begins receiving a streaming response — tool call announcements, planning thoughts, partial answers — as the agent works through the test design.

Stage 2: AI Agent Orchestration and Tool Calls

The AIAgent class orchestrates the process using a Smolagents-based manager-agent architecture. The primary sub-agent for test creation is qa_agent, which has access to database CRUD tools and integration tools. The manager agent reads the user's prompt alongside system instruction files that define the agent's methodology, tool selection logic, output formatting rules, and security guardrails.

A typical test generation session involves these tool calls in sequence:

  • get_organization_data — retrieves existing suites, folders, and org-level variables for context
  • create_suite — inserts a new Suite record if the test belongs to a new context
  • create_test — inserts a Test record linked to the suite
  • validate_step — checks each proposed step: command must be in the allowed list of 36 commands, target must be non-empty for selector-based commands, value must be present where required
  • create_step — inserts validated step records with ascending sequence numbers
  • submit_generated_tests — bundles all created tests into a structured JSON payload and sends it to the frontend

If validation fails — a command name is misspelled or a required field is absent — the agent corrects the step rather than persisting an invalid record. If the agent needs clarification, it calls ask_user_input, which pauses execution and returns a question through the chat stream.
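The validation rules described above can be sketched as a small function. This is an illustrative sketch, not TestInspector's actual code: the function signature and the contents of the three sets are assumptions, and only a handful of the 36 commands (those named in this article) are listed.

```python
from typing import Optional

# Hypothetical subset of the allowed command list; the real list has 36 entries.
ALLOWED_COMMANDS = {"click", "assign", "assertText", "requestGET", "requestPOST"}
# Assumed: commands that act on a page element and therefore need a selector.
SELECTOR_COMMANDS = {"click", "assign", "assertText"}
# Assumed: commands that make no sense without a value.
VALUE_REQUIRED = {"assign", "assertText"}


def validate_step(command: str, target: Optional[str], value: Optional[str]) -> list:
    """Return a list of validation errors; an empty list means the step is valid."""
    errors = []
    if command not in ALLOWED_COMMANDS:
        errors.append(f"unknown command: {command!r}")
        return errors  # the remaining checks only make sense for a known command
    if command in SELECTOR_COMMANDS and not (target or "").strip():
        errors.append(f"{command} requires a non-empty target")
    if command in VALUE_REQUIRED and value is None:
        errors.append(f"{command} requires a value")
    return errors


print(validate_step("click", "#login-button", None))  # valid step
print(validate_step("clik", "#login-button", None))   # misspelled command
print(validate_step("assign", "", None))              # missing target and value
```

Returning a list of errors rather than raising on the first problem gives an agent everything it needs to correct the step in a single pass.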

For tasks that involve analyzing a screenshot, the agentic_vision sub-agent is invoked. For Azure DevOps or Jira tickets, the devops_ticket_agent parses ticket content into a structured test design brief that the qa_agent then acts on.

Stage 3: Step Validation and Database Persistence

Each Step record stores: a sequence number, the command (one of 36 supported commands), a target (CSS selector, XPath, or URL), an optional value (input text, expected text, or JSON request body), optional modifier flags for keypress steps, an optional variable_name for store and extract commands, and optional HTTP assertion fields for request steps.
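To make the record shape concrete, here is a sketch of a step as a Python dataclass. Only variable_name is a field name taken from the description above; every other field name, type, and default is an assumption made for illustration, and the real Django model may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Step:
    sequence: int                          # execution order within the test
    command: str                           # one of the supported commands
    target: str = ""                       # CSS selector, XPath, or URL
    value: Optional[str] = None            # input text, expected text, or JSON body
    modifiers: List[str] = field(default_factory=list)  # keypress flags (assumed shape)
    variable_name: Optional[str] = None    # used by store and extract commands
    expected_status: Optional[int] = None  # HTTP assertion field (assumed name)


# A login flow expressed as step records rather than test code.
login_steps = [
    Step(1, "assign", "#email", "{{USER_EMAIL}}"),
    Step(2, "assign", "#password", "{{USER_PASSWORD}}"),
    Step(3, "click", "button[type=submit]"),
    Step(4, "assertText", "h1", "Dashboard"),
]

for step in sorted(login_steps, key=lambda s: s.sequence):
    print(asdict(step))
```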

Variable references use double-brace syntax. {{VARIABLE_NAME}} resolves through a four-level hierarchy at runtime: values injected at runtime via the CI/CD API override everything else, followed by test-level, then suite-level, then org-level variables. Special tokens resolve differently: {{TIMESTAMP}} becomes the current Unix timestamp, {{ALPHANUMERIC}} generates a random alphanumeric string, and {{TOTP:secret}} generates a valid TOTP one-time password from the stored secret.
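A resolver along these lines can be sketched with the standard library alone. Several things here are assumptions rather than documented behavior: that runtime-injected values win over stored ones (consistent with the custom_config override described under Stage 5), that the part after TOTP: names a stored variable holding a base32 secret, that unknown names are left untouched, and that random strings are 8 characters long. The TOTP function follows RFC 6238 with HMAC-SHA1.

```python
import base64
import hashlib
import hmac
import re
import secrets
import string
import struct
import time
from collections import ChainMap
from typing import Optional

TOKEN_PATTERN = re.compile(r"\{\{([^{}]+)\}\}")
ALPHABET = string.ascii_letters + string.digits


def totp(secret_b32: str, now: Optional[float] = None, digits: int = 6, period: int = 30) -> str:
    """RFC 6238 TOTP using HMAC-SHA1, the default in most authenticator apps."""
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)
    counter = int((time.time() if now is None else now) // period)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def resolve(text, test_vars=None, suite_vars=None, org_vars=None, runtime_vars=None, now=None):
    """Replace {{...}} tokens in text; unknown names are left untouched."""
    # ChainMap searches left to right, so earlier mappings take precedence.
    scopes = ChainMap(runtime_vars or {}, test_vars or {}, suite_vars or {}, org_vars or {})

    def substitute(match):
        name = match.group(1).strip()
        if name == "TIMESTAMP":
            return str(int(time.time() if now is None else now))
        if name == "ALPHANUMERIC":
            return "".join(secrets.choice(ALPHABET) for _ in range(8))  # length is assumed
        if name.startswith("TOTP:"):
            secret_name = name.split(":", 1)[1]
            if secret_name in scopes:
                return totp(scopes[secret_name], now=now)
            return match.group(0)
        return str(scopes[name]) if name in scopes else match.group(0)

    return TOKEN_PATTERN.sub(substitute, text)


print(resolve(
    "{{BASE_URL}}/login?user={{USER}}",
    test_vars={"USER": "qa"},
    org_vars={"BASE_URL": "https://staging.example.com", "USER": "org-user"},
    runtime_vars={"BASE_URL": "https://ci.example.com"},
))
```

In the usage example the test-level USER shadows the org-level one, and the runtime BASE_URL shadows the org-level default.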

Test-level and suite-level variables are stored encrypted using a custom EncodedChar field. Org-level variables use an unencrypted JSONField. Message content is streamed through a Redis buffer and persisted to Message.content_v1 when the agent finishes. On page reload, the bootstrap API returns the stored message history and the UI re-renders from content_v1 — the agent does not re-execute.

Stage 4: Test Execution via Selenium

When a user triggers a run, the frontend calls POST /api/v1/suites/tests/run/. The backend enqueues a Celery task on RabbitMQ. A Celery worker picks it up, instantiates TestRunner, and initializes a Selenium WebDriver session on the configured browser.

TestRunner resolves all variable references in step targets and values, then iterates through steps in ascending sequence order. Each command dispatches to a dedicated handler: a click step calls the element's click method; an assign step clears the field and types the value; an assertText step reads element text and compares to the expected value; a requestGET step issues an HTTP request and checks the response against the configured assertion.
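The dispatch pattern can be illustrated with a handler table. This is a sketch under assumptions, not the actual TestRunner: the handler names, the dict-based step shape, and the stop-on-first-failure behavior are hypothetical. FakeDriver and FakeElement stand in for Selenium's WebDriver and WebElement so the example runs without a browser; they mimic the real find_element, click, clear, send_keys, and text members.

```python
CSS = "css selector"  # the string value of Selenium's By.CSS_SELECTOR


class FakeElement:
    """Minimal stand-in for a Selenium WebElement."""

    def __init__(self, text=""):
        self.text = text
        self.log = []

    def click(self):
        self.log.append("click")

    def clear(self):
        self.log.append("clear")

    def send_keys(self, value):
        self.log.append(f"send_keys:{value}")


class FakeDriver:
    """Minimal stand-in for a Selenium WebDriver."""

    def __init__(self, elements):
        self.elements = elements

    def find_element(self, by, selector):
        return self.elements[selector]


def handle_click(driver, step):
    driver.find_element(CSS, step["target"]).click()


def handle_assign(driver, step):
    element = driver.find_element(CSS, step["target"])
    element.clear()  # clear the field, then type the value
    element.send_keys(step["value"])


def handle_assert_text(driver, step):
    actual = driver.find_element(CSS, step["target"]).text
    if actual != step["value"]:
        raise AssertionError(f"expected {step['value']!r}, got {actual!r}")


HANDLERS = {"click": handle_click, "assign": handle_assign, "assertText": handle_assert_text}


def run_steps(driver, steps):
    """Run steps in ascending sequence order; stop at the first failure (assumed)."""
    results = []
    for step in sorted(steps, key=lambda s: s["sequence"]):
        try:
            HANDLERS[step["command"]](driver, step)
            results.append((step["sequence"], "pass"))
        except Exception as exc:
            results.append((step["sequence"], f"fail: {exc}"))
            break
    return results


driver = FakeDriver({"#email": FakeElement(), "h1": FakeElement(text="Dashboard")})
steps = [
    {"sequence": 2, "command": "assertText", "target": "h1", "value": "Dashboard"},
    {"sequence": 1, "command": "assign", "target": "#email", "value": "qa@example.com"},
]
print(run_steps(driver, steps))
```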

Run progress is broadcast over a TestRunConsumer WebSocket, giving the frontend live step-by-step pass/fail status. Screenshots are uploaded to Cloudinary. If a baseline screenshot exists, the runner computes a structural similarity (SSIM) score between the new screenshot and the baseline to detect visual regressions.
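For intuition about the SSIM comparison, the following sketch computes a simplified, single-window SSIM over two equally sized grayscale images given as flat lists of pixel values. This is not necessarily how the runner computes it: production implementations (for example scikit-image's structural_similarity) average the score over sliding local windows. The pixel data is made up for illustration.

```python
from statistics import fmean


def global_ssim(a, b, dynamic_range=255.0):
    """Single-window SSIM over two equal-length sequences of grayscale pixels."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_a, mu_b = fmean(a), fmean(b)
    var_a = fmean([(x - mu_a) * (x - mu_a) for x in a])
    var_b = fmean([(y - mu_b) * (y - mu_b) for y in b])
    cov = fmean([(x - mu_a) * (y - mu_b) for x, y in zip(a, b)])
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return numerator / denominator


baseline = [10, 50, 200, 255, 0, 128]
same = list(baseline)
inverted = [255 - p for p in baseline]  # a drastic visual change

print(global_ssim(baseline, same))      # identical images score at the maximum
print(global_ssim(baseline, inverted))  # a changed image scores lower
```

A runner would compare the score against a threshold to decide whether a regression occurred; the article does not state what threshold TestInspector uses.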

If a step fails and auto_retry is enabled, the self_healing_test module queries the AI for alternative selectors based on the live page structure and original step intent, and retries. If healing succeeds, the corrected selector is stored for future runs. For professional support on test maintenance strategy, see Astaqc's software testing services.
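The retry-and-persist flow can be outlined as follows. This is a hypothetical sketch: run_with_healing, the execute and suggest_selectors callables, and the max_candidates cap are all invented for illustration, and the AI query is replaced by a stub that returns a fixed list of candidate selectors.

```python
def run_with_healing(step, execute, suggest_selectors, auto_retry=True, max_candidates=3):
    """Try the step; on failure, try alternative selectors and return the healed step.

    Returns (step, healed), where healed is True if an alternative selector was used.
    """
    try:
        execute(step["target"])
        return step, False
    except Exception:
        if not auto_retry:
            raise
    for candidate in suggest_selectors(step)[:max_candidates]:
        try:
            execute(candidate)
        except Exception:
            continue  # this candidate failed too; try the next one
        # Keep the working selector so future runs use it directly.
        return {**step, "target": candidate}, True
    raise RuntimeError(f"self-healing failed for step {step['sequence']}")


# Stubs standing in for the live page and the AI suggestion call.
SELECTORS_ON_PAGE = {"button[data-test=submit]", "#email"}


def execute(selector):
    if selector not in SELECTORS_ON_PAGE:
        raise LookupError(f"no element matches {selector}")


def suggest_selectors(step):
    return ["#submit-btn", "button[data-test=submit]"]


stale = {"sequence": 3, "command": "click", "target": "#old-submit"}
healed_step, healed = run_with_healing(stale, execute, suggest_selectors)
print(healed_step, healed)
```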

Stage 5: Export, Scheduling, and CI Integration

Tests export to multiple formats entirely client-side. The Playwright TypeScript export maps each step to the corresponding Playwright API call, adds the request fixture only when HTTP steps are present, and wraps optional steps in try/catch blocks. The Selenium IDE (.side) export produces the Selenium IDE 2.0 JSON format. The Gherkin export produces Given/When/Then scenarios with step notes as comments.
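The step-to-Playwright mapping can be sketched as a small code generator. The real export runs client-side, so this Python version only illustrates the idea; the "optional" flag name, the dict-based step shape, and the exact Playwright call chosen for each command are assumptions. The emitted calls (locator().click(), locator().fill(), expect().toHaveText(), request.get()) are standard Playwright Test APIs.

```python
import json

HTTP_COMMANDS = {"requestGET", "requestPOST", "requestPUT", "requestPATCH", "requestDELETE"}


def js(value):
    """Render a Python string as a quoted JavaScript string literal."""
    return json.dumps(value)


def step_to_ts(step):
    """Map one step record to a single Playwright statement (assumed mapping)."""
    command, target, value = step["command"], step.get("target", ""), step.get("value")
    if command == "click":
        return f"await page.locator({js(target)}).click();"
    if command == "assign":
        return f"await page.locator({js(target)}).fill({js(value)});"
    if command == "assertText":
        return f"await expect(page.locator({js(target)})).toHaveText({js(value)});"
    if command == "requestGET":
        return f"await request.get({js(target)});"
    raise ValueError(f"no export mapping for {command!r} in this sketch")


def export_test(name, steps):
    """Emit a Playwright .spec.ts source string for one test."""
    uses_http = any(s["command"] in HTTP_COMMANDS for s in steps)
    fixtures = "{ page, request }" if uses_http else "{ page }"  # request only when needed
    lines = [
        "import { test, expect } from '@playwright/test';",
        "",
        f"test({js(name)}, async ({fixtures}) => {{",
    ]
    for step in sorted(steps, key=lambda s: s["sequence"]):
        call = step_to_ts(step)
        if step.get("optional"):
            lines.append(f"  try {{ {call} }} catch {{ /* optional step */ }}")
        else:
            lines.append(f"  {call}")
    lines.append("});")
    return "\n".join(lines)


print(export_test("login", [
    {"sequence": 1, "command": "assign", "target": "#email", "value": "qa@example.com"},
    {"sequence": 2, "command": "click", "target": "#dismiss-banner", "optional": True},
    {"sequence": 3, "command": "assertText", "target": "h1", "value": "Dashboard"},
]))
```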

Scheduling uses Django Celery Beat. Tests and suites have a many-to-many relationship with PeriodicTask records, supporting cron expressions, fixed intervals, and one-time clocked schedules. For CI/CD integration, the run endpoint accepts a custom_config payload where pipelines inject environment-specific variables that override all other variable levels. See the outsource QA guide and manual testing services for context on integrating automated and manual workflows.
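As a rough illustration of a CI trigger, the sketch below builds (but does not send) a POST request to the run endpoint with a custom_config payload. Only the endpoint path and the custom_config key come from this article; the host name, the test_ids field, the bearer-token header, and the variable names inside custom_config are assumptions that would need to be checked against the actual API.

```python
import json
import urllib.request


def build_run_request(base_url, api_token, test_ids, custom_config):
    """Build a POST request for the run endpoint without sending it."""
    body = json.dumps({
        "test_ids": test_ids,            # hypothetical field name
        "custom_config": custom_config,  # runtime variables that override stored ones
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/suites/tests/run/",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",  # auth scheme is assumed
        },
    )


request = build_run_request(
    base_url="https://testinspector.example.com",  # placeholder host
    api_token="ci-token-from-secret-store",
    test_ids=[101, 102],
    custom_config={"BASE_URL": "https://staging.example.com", "USER_EMAIL": "ci@example.com"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# In a pipeline, urllib.request.urlopen(request) would send it.
```

Supplying the environment-specific values at trigger time is what lets one stored test definition run against staging, QA, or production without edits.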

AI-Generated Steps vs Coded Frameworks: Key Differences

| Dimension | TestInspector (AI-Generated Steps) | Coded Framework (Playwright/Selenium) |
| --- | --- | --- |
| Test representation | Structured database records | Source code files |
| Creation method | Chat prompt or browser recording | Manual coding in IDE |
| Editability | UI or follow-up chat | Code editor with version control |
| Export to code | Yes: Playwright TS, Selenium IDE, Gherkin | Code is the native format |
| Self-healing | Built-in AI selector retry | Manual update or third-party plugin |
| Scheduling | Built-in (cron/interval/one-time) | CI/CD pipeline configuration |
| Skill requirement | QA analyst or product role, no coding | Developer or SDET with framework knowledge |

Frequently Asked Questions

Can TestInspector test login flows that require TOTP two-factor authentication?

Yes. The {{TOTP:secret}} variable token generates a valid TOTP one-time password at execution time from a stored secret. The secret is saved as an encrypted test-level or suite-level variable and is never exposed in exported files or execution logs.

What happens if the AI generates a step with an incorrect CSS selector?

If the step fails at runtime and auto_retry is enabled, the self-healing module queries the AI for alternative selectors based on the live page structure and original step intent, and retries. If healing succeeds, the corrected selector is stored for future runs without requiring manual intervention.

Does the AI agent re-run when I reload the page after a generation session?

No. Generated content is persisted to Message.content_v1 when the agent finishes. On reload, the bootstrap API returns the stored message history and the UI re-renders from the saved data. The AI does not re-execute.

Can I use TestInspector to test API endpoints without a browser?

Yes. The HTTP request step commands (requestGET, requestPOST, requestPUT, requestPATCH, requestDELETE) issue actual HTTP requests independently of the browser session. Each step accepts JSON headers, a request body, and assertions on response status or body content. These steps also export to Playwright's request context when generating a .spec.ts file.

How does variable injection work for CI/CD pipeline runs?

The run endpoint accepts a custom_config payload where pipelines inject key-value pairs overriding all other variable levels. A single test definition handles multiple environments with environment-specific credentials and base URLs supplied at trigger time. See Astaqc's test automation services for CI/CD pipeline integration support.

Which LLMs does TestInspector support for test generation?

The default is Claude Sonnet 4.6 via OpenRouter. The platform also supports Qwen3 Max, Gemini 3 Pro, and custom models via LiteLLM. The active model is configurable per chat room. Read the AI in software testing guide for context on how language models are being applied across the QA workflow.
