Midscene.js provides a Chrome extension. By using it, you can quickly experience the main features of Midscene on any webpage, without needing to set up a code project.
The extension shares the same code as the npm @midscene/ packages, so you can think of it as a playground or a way to debug with Midscene.
Prepare an API key for one of these models: OpenAI GPT-4o, Qwen-2.5-VL, UI-TARS, or any other supported provider. You will need it shortly.
You can check the supported models in "Choose a model".
Install the Midscene extension from the Chrome Web Store: Midscene
Start the extension (its icon may be folded into Chrome's extensions menu), then set it up by pasting your configuration in K=V format:
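To make the K=V format concrete, here is a minimal sketch in TypeScript that reads such a block into a key/value map. It is an illustration only and not the extension's own parser. The helper name `parseKeyValueConfig` is hypothetical, and the key names and values in the sample are assumed placeholders; check "Choose a model" for the keys your provider actually needs.

```typescript
// Parse a K=V configuration block into a key/value map.
// Illustrative only: this is not the extension's own parser.
function parseKeyValueConfig(text: string): Record<string, string> {
  const config: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    // Ignore lines that have no key before the first "=".
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const quoted =
      value.length >= 2 &&
      ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.slice(1, -1);
    config[key] = value;
  }
  return config;
}

// Hypothetical sample: key names and values are assumed placeholders.
const sampleConfig = `
OPENAI_API_KEY="sk-replace-with-your-own"
MIDSCENE_MODEL_NAME=gpt-4o
`;
const parsed = parseKeyValueConfig(sampleConfig);
console.log(Object.keys(parsed));
```

Each non-empty line holds one setting, and everything after the first "=" is taken as the value, so values that themselves contain "=" survive intact.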
After the configuration, you can immediately experience Midscene. There are three main tabs in the extension:
Enjoy!
After trying the extension, you may want to write some code to integrate Midscene. There are several ways to do that. Please refer to the documents below:
It's mainly due to conflicts with other extensions that inject <iframe /> or <script /> into the page. Try disabling the suspicious plugins and refreshing the page.
To find the suspicious plugins:
Open the page's DevTools and look for a <script> or <iframe> with a URL like chrome-extension://{ID-of-the-suspicious-plugin}/...
Then open chrome://extensions/, use cmd+f to find the plugin with the same ID, and disable it.
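The search for injected elements can also be scripted. The sketch below is a hypothetical helper, not part of Midscene: `extractExtensionIds` takes a list of URLs and returns the distinct extension IDs found in chrome-extension:// URLs. It assumes extension IDs are 32 characters in the range a to p, which is how Chrome normally formats them.

```typescript
// Collect the distinct extension IDs from a list of resource URLs.
// Hypothetical helper for illustration; assumes IDs are 32 letters in a-p.
function extractExtensionIds(urls: string[]): string[] {
  const ids = new Set<string>();
  for (const url of urls) {
    const match = /^chrome-extension:\/\/([a-p]{32})(?:\/|$)/.exec(url);
    if (match) ids.add(match[1]);
  }
  // Sort so the output is stable regardless of DOM order.
  return Array.from(ids).sort();
}

// In the DevTools console of the affected page, the URLs could be gathered
// with plain JavaScript along these lines:
//   const urls = Array.from(
//     document.querySelectorAll("script[src], iframe[src]"),
//     (el) => el.src,
//   );
//   console.log(extractExtensionIds(urls));
```

Any ID it reports can then be looked up on chrome://extensions/ as described above.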