Quick experience with iOS

With the Midscene.js playground, you can try the main features of Midscene on iOS devices without writing any code.

The playground shares the same codebase as the @midscene/ios package, so you can treat it as a playground or debugging tool for the Midscene iOS SDK.

Preparation

Install Node.js

Install Node.js 18 or higher.
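
If you want to confirm the version requirement from a script, a minimal sketch follows. The helper names nodeMajor and meetsMinimumNode are hypothetical and not part of Midscene; the only fact taken from this page is the minimum major version of 18.

```typescript
// Hypothetical helpers (not part of Midscene): check a Node.js version string
// against the minimum major version stated above.
function nodeMajor(version: string): number {
  // Accepts both "v18.19.0" (the form of process.version) and "18.19.0".
  return Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
}

function meetsMinimumNode(version: string, minMajor = 18): boolean {
  const major = nodeMajor(version);
  // An unparsable string yields NaN, which is treated as "requirement not met".
  return Number.isInteger(major) && major >= minMajor;
}
```

Passing process.version to meetsMinimumNode tells you whether the current runtime satisfies the requirement.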

Prepare API Key

Prepare an API Key for a visual language (VL) model.

You can find supported models and configurations for Midscene.js in the Choose a Model documentation.

Prepare WebDriver Server

Before getting started, you need to set up the iOS development environment:

  • macOS (required for iOS development)
  • Xcode and Xcode command line tools
  • iOS Simulator or real device

Environment Configuration

Before using Midscene iOS, you need to prepare the WebDriverAgent service. Please refer to the official documentation for setup:

Verify Environment Configuration

After completing the configuration, you can verify whether the service is working properly by accessing WebDriverAgent's status endpoint:

Access URL: http://localhost:8100/status

Correct Response Example:

{
  "value": {
    "build": {
      "version": "10.1.1",
      "time": "Sep 24 2025 18:56:41",
      "productBundleIdentifier": "com.facebook.WebDriverAgentRunner"
    },
    "os": {
      "testmanagerdVersion": 65535,
      "name": "iOS",
      "sdkVersion": "26.0",
      "version": "26.0"
    },
    "device": "iphone",
    "ios": {
      "ip": "10.91.115.63"
    },
    "message": "WebDriverAgent is ready to accept commands",
    "state": "success",
    "ready": true
  },
  "sessionId": "BCAD9603-F714-447C-A9E6-07D58267966B"
}

If you can access this endpoint and receive a JSON response similar to the one above, WebDriverAgent is properly configured and running.
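
The same check can be scripted. The sketch below assumes the response has the shape shown in the example above and treats value.ready === true as the readiness signal. The names WdaStatus, isWdaReady, and checkWda are hypothetical, and the built-in fetch requires Node.js 18 or higher.

```typescript
// Hypothetical type covering only the fields of the status response used here.
interface WdaStatus {
  value?: { ready?: boolean; state?: string; message?: string };
  sessionId?: string | null;
}

// Returns true only when the response explicitly reports ready: true.
function isWdaReady(status: WdaStatus): boolean {
  return status.value?.ready === true;
}

// Fetches <baseUrl>/status and reports whether WebDriverAgent looks ready.
// The default base URL is the one used in this guide.
async function checkWda(baseUrl = "http://localhost:8100"): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/status`);
    if (!res.ok) return false;
    return isWdaReady((await res.json()) as WdaStatus);
  } catch {
    // Connection refused, timeout, or a non-JSON body: treat as not ready.
    return false;
  }
}
```

Calling checkWda() resolves to false when the service is unreachable, so it can serve as a quick pre-flight check before you launch the playground.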

Run the playground

npx --yes @midscene/ios-playground

Configure the API key

Click the gear button to enter the configuration page and paste your API key config.

To configure the API key, refer to the Config Model and Provider document.

Start experiencing

Once the configuration is done, you can start using Midscene right away. The playground provides several operation tabs, including but not limited to:

  • Action: interact with the web page. This is also known as "Auto Planning". For example:
      type Midscene in the search box
      click the login button
  • Query: extract JSON data from the web page. For example:
      extract the user id from the page, return in { id: string }
  • Assert: validate the page. For example:
      the page title is "Midscene"
  • Tap: perform a single tap on the element where you want to click. This is also known as "Instant Action". For example:
      the login button
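
To relate the tabs to code, the sketch below maps each tab to the agent method assumed to back it. The method names aiAction, aiQuery, aiAssert, and aiTap are an assumption about the Midscene Agent API rather than something stated on this page. AgentLike, methodForTab, and runTab are hypothetical stand-ins so the snippet does not depend on a device or on the SDK.

```typescript
// Hypothetical minimal agent shape; the four method names are assumed to
// correspond to the Action, Query, Assert, and Tap tabs.
interface AgentLike {
  aiAction(prompt: string): Promise<unknown>;
  aiQuery(prompt: string): Promise<unknown>;
  aiAssert(prompt: string): Promise<unknown>;
  aiTap(prompt: string): Promise<unknown>;
}

type Tab = "Action" | "Query" | "Assert" | "Tap";

// Assumed mapping from each playground tab to an agent method.
const methodForTab: Record<Tab, keyof AgentLike> = {
  Action: "aiAction",
  Query: "aiQuery",
  Assert: "aiAssert",
  Tap: "aiTap",
};

// Dispatches a natural-language prompt to the method for the given tab.
async function runTab(agent: AgentLike, tab: Tab, prompt: string): Promise<unknown> {
  return agent[methodForTab[tab]](prompt);
}
```

With a real agent object, runTab(agent, "Tap", "the login button") would correspond to typing the same prompt into the Tap tab, under the mapping assumed above.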

All Agent APIs can be debugged and run directly in the Playground! Interaction, extraction, and verification methods are all covered, and the visual operations and verification help speed up your automation development.

Enjoy!

For the difference between "Auto Planning" and "Instant Action", please refer to the API document.

Want to write some code?

After trying the playground, you may want to write some code to integrate Midscene. There are multiple ways to do that. Please refer to the documents below: