Midscene.js - Joyful Automation by AI

Interact, query and assert by natural language

There are three main capabilities: action, query, assert.

  • Use action (.ai, .aiAction) to execute a series of actions by describing the steps.
  • Use query (.aiQuery) to extract customized data from the UI. Describe the JSON format you want, and the AI will answer based on its understanding of the page.
  • Use assert (.aiAssert) to perform assertions on the page.

All of these methods accept a natural-language prompt as their parameter. Because the scripts describe intent rather than selectors, the cost of maintaining them can drop considerably.

Start with Chrome extension

To quickly experience the main features of Midscene, you can use the Midscene Chrome extension. It allows you to use Midscene on any webpage without writing any code.

Install the Midscene extension from the Chrome Web Store.

For instructions, please refer to Quick Experience.

Multiple ways to integrate

Maintaining automation scripts with Midscene can be a brand new experience. For example, to search for headphones on a website, you can do this:

// 👀 type keywords, perform a search
await ai('type "Headphones" in search box, hit Enter');

// 👀 find the items, return in JSON
const items = await aiQuery(
  "{itemTitle: string, price: Number}[], find item in list and corresponding price"
);

console.log("headphones in stock", items);

// 👀 assert by natural language
await aiAssert("There is a category filter on the left");
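Because the query result is produced by a model, it can be worth checking its shape before using it. The following is a minimal sketch, not part of Midscene itself: the Item interface and the isItemList and parseItems helpers are hypothetical names introduced here, and the shape is assumed from the prompt in the example above.

```typescript
// Hypothetical shape, assumed from the prompt
// "{itemTitle: string, price: Number}[]" used in the example above.
interface Item {
  itemTitle: string;
  price: number;
}

// Type guard: true only when the value is an array whose entries all have
// a string itemTitle and a finite numeric price.
function isItemList(value: unknown): value is Item[] {
  return (
    Array.isArray(value) &&
    value.every((entry) => {
      if (typeof entry !== "object" || entry === null) return false;
      const record = entry as Record<string, unknown>;
      return (
        typeof record.itemTitle === "string" &&
        typeof record.price === "number" &&
        Number.isFinite(record.price)
      );
    })
  );
}

// Narrow an untyped query result to Item[], or fail loudly if it does not match.
function parseItems(raw: unknown): Item[] {
  if (!isItemList(raw)) {
    throw new Error("query result does not match {itemTitle, price}[]");
  }
  return raw;
}
```

With these helpers, the result of the aiQuery call above could be passed through parseItems before it is logged or compared, so a malformed answer surfaces as an error instead of as bad data later in the script.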

There are several ways to integrate Midscene into your code project:

Visualized report

Midscene provides a visual report after each run. With this report, you can review the animated replay and view the details of each step in the process. What's more, there is a playground in the report file for you to adjust your prompt without re-running all your scripts.


Customize model

The default model is currently OpenAI GPT-4o. If needed, you can switch to a different multimodal model, such as Gemini or Qwen.

Just you and model provider, no third-party services

All data gathered from pages is sent directly to OpenAI or to the custom model provider you configure. No other third-party platform has access to the data.

For more details, please refer to Data Privacy.

Follow us