Integrate with Playwright
Playwright.js is an open-source browser automation library developed by Microsoft, mainly used for end-to-end testing of web applications and for web scraping.
There are two ways to integrate with Playwright:
- Directly integrate and call the Midscene Agent via script, suitable for quick prototyping, data scraping, and automation scripts.
- Integrate Midscene into Playwright test cases, suitable for UI testing scenarios.
Setup AI model service
Set your model configuration in environment variables. See "Choose a model" for more details.
# replace with your own
export OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
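A missing key usually only surfaces when the first AI call fails, so it can help to check the environment before launching a browser. The following is a minimal sketch, not part of Midscene: `missingEnv` is a hypothetical helper, and the only variable name taken from this guide is `OPENAI_API_KEY`.

```typescript
// Hypothetical preflight helper (not a Midscene API): returns the names of
// required variables that are unset or blank in the given environment map.
type Env = Record<string, string | undefined>;

function missingEnv(env: Env, required: string[] = ['OPENAI_API_KEY']): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Usage in a script (assumes Node.js, where `process.env` is available):
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(', ')}`);
```

Passing the environment in as an argument keeps the check easy to exercise without touching the real process environment.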
Direct Integration with Midscene Agent
Step 1: Install dependencies
npm install @midscene/web playwright @playwright/test tsx dotenv --save-dev
Step 2: Write the script
Save the following code as ./demo.ts:
import { chromium } from 'playwright';
import { PlaywrightAgent } from '@midscene/web/playwright';
import 'dotenv/config'; // read environment variables from .env file
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
Promise.resolve(
(async () => {
const browser = await chromium.launch({
headless: true, // 'true' means we can't see the browser window
args: ['--no-sandbox', '--disable-setuid-sandbox'],
});
const page = await browser.newPage();
await page.setViewportSize({
width: 1280,
height: 768,
});
await page.goto('https://www.ebay.com');
await sleep(5000);
// 👀 init Midscene agent
const agent = new PlaywrightAgent(page);
// 👀 type keywords, perform a search
await agent.aiAction('type "Headphones" in search box, hit Enter');
// 👀 wait for the loading
await agent.aiWaitFor('there is at least one headphone item on page');
// or you may use a plain sleep:
// await sleep(5000);
// 👀 understand the page content, find the items
const items = await agent.aiQuery(
'{itemTitle: string, price: Number}[], find item in list and corresponding price',
);
console.log('headphones in stock', items);
const isMoreThan1000 = await agent.aiBoolean(
'Is the price of the headphones more than 1000?',
);
console.log('isMoreThan1000', isMoreThan1000);
const price = await agent.aiNumber(
'What is the price of the first headphone?',
);
console.log('price', price);
const name = await agent.aiString(
'What is the name of the first headphone?',
);
console.log('name', name);
const location = await agent.aiLocate(
'What is the location of the first headphone?',
);
console.log('location', location);
// 👀 assert by AI
await agent.aiAssert('There is a category filter on the left');
// 👀 click on the first item
await agent.aiTap('the first item in the list');
await browser.close();
})(),
);
For more Agent API details, please refer to API Reference.
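The aiQuery prompt in the script describes the expected shape in plain text, and the result is produced by a model, so a script that passes the data on may want to check it at runtime. A minimal sketch, assuming the result is an array of objects with `itemTitle` and `price` as requested in the prompt; `Item` and `isItemList` are illustrative names, not part of Midscene.

```typescript
// Shape requested by the aiQuery prompt in demo.ts.
interface Item {
  itemTitle: string;
  price: number;
}

// Hypothetical type guard (not a Midscene API): checks that an unknown value
// is an array whose entries all have a string title and a finite numeric price.
function isItemList(value: unknown): value is Item[] {
  if (!Array.isArray(value)) return false;
  return value.every((entry: unknown) => {
    if (typeof entry !== 'object' || entry === null) return false;
    const record = entry as Record<string, unknown>;
    const title = record['itemTitle'];
    const price = record['price'];
    return typeof title === 'string' && typeof price === 'number' && isFinite(price);
  });
}

// Usage inside the script:
//   const items: unknown = await agent.aiQuery('...');
//   if (!isItemList(items)) throw new Error('Unexpected aiQuery result shape');
```

An empty array passes the guard; whether that should count as a failure depends on the script.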
Step 3: Run the script
Use tsx to run the script. The product information is printed in the terminal:
# run
npx tsx demo.ts
# The terminal should output something like:
# [
# {
# itemTitle: 'JBL Tour Pro 2 - True wireless Noise Cancelling earbuds with Smart Charging Case',
# price: 551.21
# },
# {
# itemTitle: 'Soundcore Space One Wireless Headphones 40H ANC Playtime 2XStronger Voice',
# price: 543.94
# }
# ]
Step 4: View the run report
After the above command executes successfully, it will output: Midscene - report file updated: /path/to/report/some_id.html. Open this file in your browser to view the report.
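In CI it can be convenient to pull the report path out of the captured output so the file can be archived. A minimal sketch, assuming the log line keeps the `Midscene - report file updated: <path>` format quoted above (this format is not a documented contract and may change); `extractReportPath` is a hypothetical helper.

```typescript
// Hypothetical helper (not a Midscene API): finds the last
// "Midscene - report file updated: <path>" line in captured output
// and returns the path, or null if no such line is present.
function extractReportPath(output: string): string | null {
  const pattern = /Midscene - report file updated:\s*(\S.*?)\s*$/gm;
  let last: string | null = null;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(output)) !== null) {
    last = match[1];
  }
  return last;
}
```

Returning the last match reflects the "updated" wording: if the line is printed several times during a run, the final one is taken.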
How to force navigation in the current tab
If you want to force the page to open in the current tab (for example, when clicking a link with target="_blank"), you can set the forceSameTabNavigation option to true:
const mid = new PlaywrightAgent(page, {
forceSameTabNavigation: true,
});
Integration in Playwright Test Cases
Here we assume you already have a repository with Playwright integration.
Step 1: add dependencies and update configuration
Add dependencies
npm install @midscene/web --save-dev
Update playwright.config.ts
export default defineConfig({
testDir: './e2e',
+ timeout: 90 * 1000,
+ reporter: [["list"], ["@midscene/web/playwright-reporter", { type: "merged" }]], // type is optional: "merged" (default) writes one report for all test cases, "separate" writes one report per test case
});
The type option of the reporter configuration can be merged or separate. The default value is merged, which produces one merged report for all test cases. The alternative, separate, produces one report for each test case.
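To make the two modes concrete, the sketch below builds the reporter entry for either one. The package path `@midscene/web/playwright-reporter` and the `type` option come from the configuration above; `midsceneReporter` and `ReportMode` are hypothetical names, not something Midscene exports.

```typescript
type ReportMode = 'merged' | 'separate';

// Hypothetical helper (not exported by Midscene): returns a Playwright
// reporter entry that selects the Midscene report mode.
function midsceneReporter(mode: ReportMode = 'merged'): [string, { type: ReportMode }] {
  return ['@midscene/web/playwright-reporter', { type: mode }];
}

// Usage in playwright.config.ts, one report per test case:
//   reporter: [['list'], midsceneReporter('separate')],
```

Writing the tuple inline in the config works just as well; the helper only restricts the mode to the two documented values.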
Step 2: extend the test instance
Save the following code as ./e2e/fixture.ts:
import { test as base } from '@playwright/test';
import type { PlayWrightAiFixtureType } from '@midscene/web/playwright';
import { PlaywrightAiFixture } from '@midscene/web/playwright';
export const test = base.extend<PlayWrightAiFixtureType>(
PlaywrightAiFixture({
waitForNetworkIdleTimeout: 2000, // optional, the timeout for waiting for network idle between each action, default is 2000ms
}),
);
Step 3: write test cases
Basic AI Operation APIs
- ai or aiAction: General AI interaction
- aiTap: Click operation
- aiHover: Hover operation
- aiInput: Input operation
- aiKeyboardPress: Keyboard operation
- aiScroll: Scroll operation
Query
- aiAsk: Ask the AI model anything about the current page
- aiQuery: Extract structured data from the current page
- aiNumber: Extract a number from the current page
- aiString: Extract a string from the current page
- aiBoolean: Extract a boolean from the current page
More APIs
- aiAssert: AI assertion
- aiWaitFor: AI wait
- aiLocate: Locate an element
Besides the exposed shortcut methods, if you need to call other APIs provided by the agent, use agentForPage to get the PageAgent instance and call the API through it:
test('case demo', async ({ agentForPage, page }) => {
const agent = await agentForPage(page);
await agent.logScreenshot();
const logContent = agent._unstableLogContent();
console.log(logContent);
});
Example Code
Save the following code as ./e2e/ebay-search.spec.ts:
import { expect } from '@playwright/test';
import { test } from './fixture';
test.beforeEach(async ({ page }) => {
await page.setViewportSize({ width: 400, height: 905 });
await page.goto('https://www.ebay.com');
await page.waitForLoadState('networkidle');
});
test('search headphone on ebay', async ({
ai,
aiQuery,
aiAssert,
aiInput,
aiTap,
aiScroll,
aiWaitFor,
}) => {
// Use aiInput to enter search keyword
await aiInput('Headphones', 'search box');
// Use aiTap to click search button
await aiTap('search button');
// Wait for search results to load
await aiWaitFor('search results list loaded', { timeoutMs: 5000 });
// Use aiScroll to scroll to bottom
await aiScroll(
{
direction: 'down',
scrollType: 'untilBottom',
},
'search results list',
);
// Use aiQuery to get product information
const items = await aiQuery<Array<{ title: string; price: number }>>(
'get product titles and prices from search results',
);
console.log('headphones in stock', items);
expect(items?.length).toBeGreaterThan(0);
// Use aiAssert to verify filter functionality
await aiAssert('category filter exists on the left side');
});
For more Agent API details, please refer to API Reference.
Step 4: run test cases
npx playwright test ./e2e/ebay-search.spec.ts
Step 5: view test report
After the command executes successfully, it will output: Midscene - report file updated: ./current_cwd/midscene_run/report/some_id.html. Open this file in your browser to view the report.
More