There are three main capabilities: action, query, and assert.
Action (.ai, .aiAction): execute a series of actions by describing the steps.
Query (.aiQuery): extract customized data from the UI. Describe the JSON format you want, and the AI will give the answer based on its "understanding" of the page.
Assert (.aiAssert): perform assertions on the page.
All of these methods accept a natural-language prompt as their parameter, which can greatly reduce the cost of script maintenance.
To quickly experience the main features of Midscene, you can use the Midscene Chrome extension. It allows you to use Midscene on any webpage without writing any code.
Install the Midscene extension from the Chrome Web Store.
For instructions, please refer to Quick Experience.
Maintaining automation scripts with Midscene can be a brand new experience. For example, to search for headphones on a website, you describe the search and the data you want back in natural language.
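The headphone search can be sketched roughly as follows. This is a minimal illustration, not Midscene's actual API surface: the method names `ai`, `aiQuery`, and `aiAssert` come from the capabilities described above, but the `UiAgentLike` interface, the `Item` type, the `searchHeadphones` helper, and the exact signatures are assumptions made here so the example is self-contained.

```typescript
// Hypothetical minimal shape of an agent exposing the three methods named above.
// The real Midscene agent's signatures may differ; check the integration docs.
interface UiAgentLike {
  ai(prompt: string): Promise<unknown>;
  aiQuery<T>(prompt: string): Promise<T>;
  aiAssert(prompt: string): Promise<unknown>;
}

// Illustrative shape of the data we ask the model to return for each result.
interface Item {
  itemTitle: string;
  price: number;
}

// Search for headphones, read the results as structured data, then assert on the page.
async function searchHeadphones(agent: UiAgentLike): Promise<Item[]> {
  // Action: describe the steps in natural language.
  await agent.ai('type "Headphones" in the search box, hit Enter');

  // Query: describe the JSON format you want back.
  const items = await agent.aiQuery<Item[]>(
    "{itemTitle: string, price: number}[], find the items in the list and their prices",
  );

  // Assert: state what should be true of the page.
  await agent.aiAssert("there is at least one item in the search results");

  return items;
}
```

In a real project, the agent passed to `searchHeadphones` would come from whichever Midscene integration you choose. Because the prompts describe intent rather than selectors, a change to the page layout would typically mean adjusting a sentence rather than rewriting locators.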
There are several ways to integrate Midscene into your code project:
To make automation more stable and easier to debug, Midscene generates a visual report after each run. With this report, you can review the animated replay and view the details of each step in the process.
The report file also includes a playground where you can adjust your prompts without re-running all your scripts.
Midscene supports both general-purpose LLMs and open-source models. You can use a general-purpose LLM such as gpt-4o as the default model; it works well for most cases.
You can also use the open-source model UI-TARS, an end-to-end GUI agent model based on a VLM architecture. You can deploy it on your own server, which can dramatically improve performance and data privacy.
Read more about it in Choose a model.
There are so many UI automation tools out there, and each one seems to be all-powerful. What's special about Midscene.js?
Debugging Experience: You will soon realize that debugging and maintaining automation scripts is the real challenge. No matter how magical the demo looks, ensuring stability over time requires careful debugging. Midscene.js offers a visualized report file, a built-in playground, and a Chrome Extension to simplify the debugging process. These are the tools most developers truly need, and we’re continually working to improve the debugging experience.
Open Source, Free, Deploy as you want: Midscene.js is an open-source project. It is decoupled from any cloud service or model provider, so you can choose either public or private deployment. There is always a suitable plan for your business.
Integrate with JavaScript: You can always bet on JavaScript 😎
All data gathered from pages is sent directly to OpenAI or to the custom model provider you configure, so no other third-party platform has access to the data.
For more details, please refer to Data Privacy.