Integrate with Playwright
Playwright.js is an open-source browser automation library developed by Microsoft, mainly used for end-to-end testing of web applications and for web scraping.
There are two ways to integrate with Playwright:
- Directly integrate and call the Midscene Agent via script, suitable for quick prototyping, data scraping, and automation scripts.
- Integrate Midscene into Playwright test cases, suitable for UI testing scenarios.
Set up API keys for model
Set your model configuration in environment variables. For details, refer to Model strategy and Model configuration.
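Before launching a browser, it can help to fail fast when the model settings are missing. The sketch below is a hypothetical preflight helper and is not part of Midscene. The variable names in REQUIRED_MODEL_ENV are assumptions for illustration only; replace them with the names that Model configuration lists for your model.

```typescript
// Hypothetical preflight check; not part of Midscene.
// ASSUMPTION: these variable names are illustrative. Use the names that
// Model configuration lists for the model you have chosen.
const REQUIRED_MODEL_ENV = [
  'MIDSCENE_MODEL_BASE_URL',
  'MIDSCENE_MODEL_API_KEY',
  'MIDSCENE_MODEL_NAME',
];

type Env = Record<string, string | undefined>;

// Returns the required names that are unset or blank in the given environment.
function missingModelEnv(
  env: Env,
  required: string[] = REQUIRED_MODEL_ENV,
): string[] {
  return required.filter((name) => (env[name] ?? '').trim() === '');
}

// Usage in a Node.js script:
//   const missing = missingModelEnv(process.env);
//   if (missing.length > 0) {
//     throw new Error(`Missing model settings: ${missing.join(', ')}`);
//   }
```

Passing the environment in as a parameter keeps the helper easy to test and avoids reading global state inside it.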
Direct integration with Midscene agent
You can find an example project of direct Playwright integration here: https://github.com/web-infra-dev/midscene-example/blob/main/playwright-demo
Step 1: Install dependencies
Step 2: Write the script
Save the following code as ./demo.ts:
For more Agent API details, please refer to API Reference.
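The core of such a script can be sketched as a function that receives the agent and runs an act, wait, query sequence. This is a minimal sketch, not the example project's code: the AgentLike interface is a hypothetical structural subset that a PlaywrightAgent instance is expected to satisfy, and the prompts and the target site in the wiring comment are assumptions chosen for illustration.

```typescript
// Hypothetical structural subset of the agent; only the methods used below.
interface AgentLike {
  aiAct(prompt: string): Promise<unknown>;
  aiWaitFor(prompt: string): Promise<unknown>;
  aiQuery<T = unknown>(prompt: string): Promise<T>;
}

interface Item {
  itemTitle: string;
  price: number;
}

// Searches for a product, waits for results, then extracts structured data.
async function findHeadphones(agent: AgentLike): Promise<Item[]> {
  // Describe the interaction in natural language.
  await agent.aiAct('type "Headphones" in search box, hit Enter');
  // Block until the page matches the described state.
  await agent.aiWaitFor('there is at least one headphone item on page');
  // Describe the shape of the data to extract.
  return agent.aiQuery<Item[]>(
    '{itemTitle: string, price: number}[], find item in list and corresponding price',
  );
}

// Wiring in ./demo.ts (inside an async function; needs playwright and
// @midscene/web installed; the URL is an illustrative assumption):
//   import { chromium } from 'playwright';
//   import { PlaywrightAgent } from '@midscene/web/playwright';
//   const browser = await chromium.launch();
//   const page = await browser.newPage();
//   await page.goto('https://www.ebay.com');
//   console.log(await findHeadphones(new PlaywrightAgent(page)));
//   await browser.close();
```

Keeping the flow in a function that only depends on the agent makes it reusable from both a standalone script and a test case.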
Step 3: Run the script
Run the script with tsx, for example npx tsx demo.ts. The product information is printed in the terminal.
Step 4: View the run report
After the above command executes successfully, it will output: Midscene - report file updated: /path/to/report/some_id.html. Open this file in your browser to view the report.
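In CI it can be handy to pick the report path out of the captured output, for example to upload the file as a build artifact. The helper below is a hypothetical sketch that assumes the output line has exactly the form quoted above.

```typescript
// Extracts the report path from a log line such as
// "Midscene - report file updated: /path/to/report/some_id.html".
// Returns null when the line does not contain such a message.
function reportPathFromLog(line: string): string | null {
  const match = /Midscene - report file updated:\s*(.+\.html)/.exec(line);
  return match?.[1] ?? null;
}

// Usage: scan captured stdout line by line and keep the last non-null result.
```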
Integration in Playwright test cases
Here we assume you already have a repository with Playwright integration.
You can find an example project of Playwright test integration here: https://github.com/web-infra-dev/midscene-example/blob/main/playwright-testing-demo
Step 1: Add dependencies and update configuration
Add dependencies
Update playwright.config.ts
Reporter configuration options:
- type: Report mode, either merged (default) or separate. merged generates one combined report for multiple test cases; separate generates one report per test case.
- outputFormat: Controls how the report is generated. 'single-html' (default) embeds all screenshots as base64 in a single HTML file. 'html-and-external-assets' saves screenshots as separate PNG files in a subdirectory, which is useful when report files are too large.
Note: When using 'html-and-external-assets', reports must be served via an HTTP server and cannot be opened directly using the file:// protocol (because browser CORS restrictions block loading local images via relative paths from the file protocol). Navigate to the report directory and run one of the following commands:
  - Using Node.js: npx serve
  - Using Python: python -m http.server or python3 -m http.server
Then access the report via http://localhost:3000 (or the port shown in the terminal).
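A configuration using these options might look like the sketch below. It is written as a plain object so that it carries no imports. The option names type and outputFormat come from the list above; the reporter module path, testDir, and timeout values are assumptions for illustration and should be checked against the example project.

```typescript
// Options for the Midscene reporter, as described in the list above.
const midsceneReporterOptions = {
  type: 'merged', // or 'separate'
  outputFormat: 'single-html', // or 'html-and-external-assets'
};

// ASSUMPTION: the reporter module path, testDir and timeout are illustrative.
const config = {
  testDir: './e2e',
  timeout: 90 * 1000, // AI-driven steps can be slow; pick a value that fits
  reporter: [
    ['list'],
    ['@midscene/web/playwright-reporter', midsceneReporterOptions],
  ],
};

// In playwright.config.ts, end the file with: export default config;
```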
Step 2: Extend the test instance
Save the following code as ./e2e/fixture.ts:
Step 3: Write test cases
Review the full catalog of action, query, and utility methods in the Agent API reference. When you need lower-level control, you can use agentForPage to obtain the underlying PageAgent instance and call any API directly:
Example code
For more Agent API details, please refer to API Reference.
Step 4: Run test cases
Step 5: View test report
After the command executes successfully, it will output: Midscene - report file updated: ./current_cwd/midscene_run/report/some_id.html. Open this file in your browser to view the report.
Advanced
About opening in a new tab
Each Agent instance is bound to a single page. To make debugging easier, Midscene intercepts new tabs by default (for example, links with target="_blank") and opens them in the current page.
If you want links to open in a new tab again, set forceSameTabNavigation to false. You will then need to create a new Agent instance for each new tab.
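One way to manage one agent per tab is a small registry keyed by page. The helper below is a hypothetical sketch, not a Midscene API; the caller supplies the factory that constructs the agent, for example with forceSameTabNavigation set to false.

```typescript
// Hypothetical helper: returns a lookup function that creates an agent the
// first time a page is seen and reuses it afterwards.
function createAgentRegistry<P extends object, A>(
  createAgent: (page: P) => A,
): (page: P) => A {
  // WeakMap lets closed pages and their agents be garbage collected.
  const agents = new WeakMap<P, A>();
  return (page: P): A => {
    let agent = agents.get(page);
    if (agent === undefined) {
      agent = createAgent(page);
      agents.set(page, agent);
    }
    return agent;
  };
}

// Usage with Playwright (inside an async function):
//   const agentFor = createAgentRegistry(
//     (p: Page) => new PlaywrightAgent(p, { forceSameTabNavigation: false }),
//   );
//   const newTab = await context.waitForEvent('page'); // resolves when a tab opens
//   await agentFor(newTab).aiAct('...');
```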
Connect Midscene Agent to a Remote Playwright Browser
You can find an example project of remote Playwright integration here: https://github.com/web-infra-dev/midscene-example/tree/main/remote-playwright-demo
Connect to a remote Playwright browser when you already run browsers in your own infra or vendor grid. This keeps the browser close to the target environment, avoids repeated launches, and still lets Midscene drive it with the same AI APIs.
Prerequisites
Getting a CDP WebSocket URL
You can get a CDP WebSocket URL from various sources, for example:
- BrowserBase: Sign up at https://browserbase.com and get your CDP URL
- Browserless: Use https://browserless.io or run your own instance
- Local Chrome: Run Chrome with --remote-debugging-port=9222 and use ws://localhost:9222/devtools/browser/...
- Docker: Run Chrome in a Docker container with debugging port exposed
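For the Local Chrome option, Chrome's DevTools HTTP endpoint /json/version reports the full WebSocket URL in its webSocketDebuggerUrl field, so the URL does not have to be copied by hand. A minimal sketch, assuming Chrome listens on localhost:9222 and a runtime with a global fetch (such as Node.js 18 or later); the function names are hypothetical:

```typescript
// The part of Chrome's /json/version response that this sketch needs.
interface VersionInfo {
  webSocketDebuggerUrl?: string;
}

// Validates and returns the CDP WebSocket URL from a /json/version payload.
function cdpUrlFromVersionInfo(info: VersionInfo): string {
  const url = info.webSocketDebuggerUrl;
  if (!url || !/^wss?:\/\//.test(url)) {
    throw new Error('No CDP WebSocket URL in /json/version response');
  }
  return url;
}

// Queries a Chrome instance started with --remote-debugging-port=9222.
async function discoverLocalCdpUrl(
  origin = 'http://localhost:9222',
): Promise<string> {
  const res = await fetch(`${origin}/json/version`);
  return cdpUrlFromVersionInfo((await res.json()) as VersionInfo);
}
```

Hosted providers usually hand out the WebSocket URL directly, so discovery like this is only needed for a browser you start yourself.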
Code example
Once connected, keep using PlaywrightAgent the same way you would with a locally launched browser.
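A remote browser may already have a context and a page open, so a common pattern is to reuse them and only create new ones when none exist. The sketch below shows that selection step under stated assumptions: ContextLike and BrowserLike are hypothetical structural subsets of Playwright's BrowserContext and Browser, and cdpWsUrl in the usage comment stands for the URL obtained from your provider.

```typescript
// Hypothetical structural subsets of Playwright's BrowserContext and Browser.
interface ContextLike<P> {
  pages(): P[];
  newPage(): Promise<P>;
}
interface BrowserLike<P> {
  contexts(): ContextLike<P>[];
  newContext(): Promise<ContextLike<P>>;
}

// Reuses the remote browser's first context and page when present,
// otherwise creates them.
async function pickPage<P>(browser: BrowserLike<P>): Promise<P> {
  const context = browser.contexts()[0] ?? (await browser.newContext());
  return context.pages()[0] ?? (await context.newPage());
}

// Usage (inside an async function):
//   import { chromium } from 'playwright';
//   import { PlaywrightAgent } from '@midscene/web/playwright';
//   const browser = await chromium.connectOverCDP(cdpWsUrl);
//   const page = await pickPage(browser);
//   const agent = new PlaywrightAgent(page);
//   await agent.aiAct('...');
```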
Provide custom actions
Use the customActions option to extend the agent's action space with your own actions defined via defineAction. When provided, these actions will be appended to the built-in ones so the agent can call them during planning.
Check Integrate with any interface for more details about defining custom actions.
FAQ
The webpage continues to flash when running in headed mode
In the local visualization interface, continuous flashing is usually caused by a mismatch between the viewport's deviceScaleFactor and the system/browser's pixel ratio (common on high-resolution or Retina screens).
This flashing does not affect Midscene's screenshots or automation execution, but it does affect the local preview experience. To resolve this, set deviceScaleFactor to match your browser's window.devicePixelRatio, or use Puppeteer's auto-adaptation feature.
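With Playwright, deviceScaleFactor is set when the browser context is created. The helper below is a hypothetical sketch that builds the options object; viewport and deviceScaleFactor are standard Playwright context options, while the viewport size is an illustrative assumption.

```typescript
// Builds options for Playwright's browser.newContext() so that the context's
// scale factor matches the pixel ratio of the display.
function contextOptionsFor(devicePixelRatio: number) {
  if (!(devicePixelRatio > 0)) {
    throw new RangeError('devicePixelRatio must be a positive number');
  }
  return {
    viewport: { width: 1280, height: 768 }, // illustrative size
    deviceScaleFactor: devicePixelRatio,
  };
}

// Usage (inside an async function), for a display whose ratio is 2:
//   const context = await browser.newContext(contextOptionsFor(2));
//   const page = await context.newPage();
```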
If you are unsure of your browser's pixel ratio, press F12 on any page to open the console and type window.devicePixelRatio to check. Alternatively, paste javascript:alert(window.devicePixelRatio) into the Chrome address bar and press Enter to see the value in a popup.
Customize the network timeout
When interacting with or navigating a web page, Midscene automatically waits for the network to become idle, which helps keep the automation stable. If the wait times out, nothing happens: no error is raised and execution continues.
The default timeout is configured as follows:
- If it's a page navigation, the default wait timeout is 5000ms (the waitForNavigationTimeout)
- If it's a click, input, etc., the default wait timeout is 2000ms (the waitForNetworkIdleTimeout)
You can also customize or disable the timeout with the following options:
- Use the waitForNetworkIdleTimeout and waitForNavigationTimeout parameters in Agent.
- Use the waitForNetworkIdle parameter in Yaml or PlaywrightAiFixture.
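The "wait, but carry on if it takes too long" behavior described in this section can be illustrated with a promise race. This is a sketch of the general technique only, not Midscene's implementation; waitOrGiveUp is a hypothetical name.

```typescript
type WaitOutcome = 'idle' | 'timeout';

// Resolves with 'idle' if `idle` settles first, or with 'timeout' once
// timeoutMs has elapsed. The timeout itself never causes a rejection, so the
// caller simply continues either way.
function waitOrGiveUp(
  idle: Promise<unknown>,
  timeoutMs: number,
): Promise<WaitOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<WaitOutcome>((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  return Promise.race([idle.then((): WaitOutcome => 'idle'), timeout]).finally(
    () => {
      // Clear the pending timer so it does not keep the process alive.
      if (timer !== undefined) clearTimeout(timer);
    },
  );
}

// The real timeouts are passed as agent options, for example:
//   new PlaywrightAgent(page, {
//     waitForNetworkIdleTimeout: 2000,
//     waitForNavigationTimeout: 5000,
//   });
```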
waiting for fonts to load or page.screenshot: Timeout ... exceeded when taking screenshots
If you see an error like this in a Playwright-based environment:
This is usually not caused by Midscene itself. Playwright waits for fonts to finish loading before taking a screenshot. In some CI, container, or restricted network environments, font resources may load very slowly or never finish, which can eventually cause the screenshot to time out.
You can work around it by setting this environment variable:
If you want to set it only for a single command, you can also write:
For more background, see the Playwright issue: [BUG] Page.screenshot method hangs indefinitely.
More
- For all the methods on the Agent, please refer to API Reference.
- For the Playwright API reference, see Playwright Agent API.
- Demo projects
  - Direct integration: https://github.com/web-infra-dev/midscene-example/blob/main/playwright-demo
  - Playwright test integration: https://github.com/web-infra-dev/midscene-example/blob/main/playwright-testing-demo
  - Remote Playwright integration: https://github.com/web-infra-dev/midscene-example/tree/main/remote-playwright-demo

