Expose devices as an MCP service

MCP (Model Context Protocol) is a standard that lets AI models interact with external tools and capabilities.

Midscene provides MCP services that expose the atomic operations of Midscene Agent (each Action in the Action Space) as MCP tools. Upstream Agents can use natural language to inspect the UI, operate UI elements precisely, and run automation tasks without needing to understand the underlying implementation.

Midscene Agent relies on a vision model, so set the environment variables that Midscene requires in the MCP service's own configuration rather than reusing the upstream Agent's model configuration.

MCP tool list

Tool name                                                              | Description
Device connections such as web_connect, ios_connect, android_connect  | Connect to target devices such as browsers, iOS devices, or Android devices
take_screenshot                                                        | Get the latest screenshot
Device actions                                                         | Each Action in the Action Space, such as Tap, Scroll, etc.

View execution reports

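After each interaction finishes, Midscene generates a task report. You can open it from the command line (the open command shown here is the macOS one, and report_file_name.html stands for the generated report file):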

open report_file_name.html

The report includes detailed interaction information such as screenshots, operation logs, and error details to help with debugging and troubleshooting.

Configure MCP

Browser Bridge Mode

@midscene/web-bridge-mcp exposes the Chrome extension Bridge Mode as an MCP service.

Environment preparation

Refer to Chrome Bridge Mode to ensure the browser extension starts correctly and that you have clicked Allow connection in Bridge Mode.

Configuration

Add the Midscene Web Bridge MCP server (@midscene/web-bridge-mcp) in your MCP client. For model configuration parameters, see Model strategy.

{
  "mcpServers": {
    "midscene-web": {
      "command": "npx",
      "args": ["-y", "@midscene/web-bridge-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API Key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "600000"
      }
    }
  }
}
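
The placeholder values in the env block above must be replaced before the server can reach a model, so a small pre-flight check can catch a forgotten entry. The following is a minimal sketch and not part of Midscene: findUnsetModelEnv is a hypothetical helper, and treating all four MIDSCENE_MODEL_* variables as mandatory is an assumption based on the configuration shown here.

```typescript
// Hypothetical pre-flight helper; not part of any Midscene package.
// Assumption: the four MIDSCENE_MODEL_* keys from the config above are all needed.
const MODEL_ENV_KEYS = [
  'MIDSCENE_MODEL_BASE_URL',
  'MIDSCENE_MODEL_API_KEY',
  'MIDSCENE_MODEL_NAME',
  'MIDSCENE_MODEL_FAMILY',
] as const;

type Env = Record<string, string | undefined>;

// Returns the keys that are missing, empty, or still hold a "replace with ..." placeholder.
function findUnsetModelEnv(env: Env): string[] {
  return MODEL_ENV_KEYS.filter((key) => {
    const value = env[key]?.trim();
    return !value || value.toLowerCase().startsWith('replace with');
  });
}

// Example input: one placeholder left in place, one key missing entirely.
const unset = findUnsetModelEnv({
  MIDSCENE_MODEL_BASE_URL: 'https://example.invalid/v1', // illustrative value only
  MIDSCENE_MODEL_API_KEY: 'replace with your API Key',
  MIDSCENE_MODEL_NAME: 'example-model', // illustrative value only
});
console.log(unset);
```

You could run a check like this against the env object of your MCP client configuration, or against process.env inside a wrapper script, before starting the server.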

iOS MCP service

Environment preparation

  • AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more details.
  • Device setup: Follow iOS Getting Started to configure WebDriverAgent, certificates, and device connections, and make sure WebDriverAgent is running. You can verify screenshots and basic operations in iOS Playground.

Configuration

Add the Midscene iOS MCP server (@midscene/ios-mcp) in your MCP client. For model configuration parameters, see Model strategy.

{
  "mcpServers": {
    "midscene-ios": {
      "command": "npx",
      "args": ["-y", "@midscene/ios-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API Key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}

Android MCP service

Environment preparation

  • AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more details.
  • Device setup: Follow Android Getting Started to configure adb and connect your device. Make sure the adb devices command lists the target device. Use Android Playground to verify screenshots and basic operations.

Configuration

Add the Midscene Android MCP server (@midscene/android-mcp) in your MCP client. For model configuration parameters, see Model strategy.

{
  "mcpServers": {
    "midscene-android": {
      "command": "npx",
      "args": ["-y", "@midscene/android-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API Key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}
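
The three configurations above differ only in the server key, the package name, and the request timeout. As a rough illustration of that shared shape, the sketch below generates the same JSON from one function. buildMcpConfig, SERVERS, and ModelEnv are hypothetical names introduced here; the package names, server keys, and timeout values are copied from the configurations in this document.

```typescript
// Hypothetical helper; the output mirrors the JSON configurations in this document.
type Platform = 'web' | 'ios' | 'android';

interface ModelEnv {
  MIDSCENE_MODEL_BASE_URL: string;
  MIDSCENE_MODEL_API_KEY: string;
  MIDSCENE_MODEL_NAME: string;
  MIDSCENE_MODEL_FAMILY: string;
}

// Package names and timeouts are copied from the three configurations above.
const SERVERS: Record<Platform, { pkg: string; timeoutMs: string }> = {
  web: { pkg: '@midscene/web-bridge-mcp', timeoutMs: '600000' },
  ios: { pkg: '@midscene/ios-mcp', timeoutMs: '800000' },
  android: { pkg: '@midscene/android-mcp', timeoutMs: '800000' },
};

// Builds the "mcpServers" object for one platform.
function buildMcpConfig(platform: Platform, modelEnv: ModelEnv) {
  const { pkg, timeoutMs } = SERVERS[platform];
  return {
    mcpServers: {
      [`midscene-${platform}`]: {
        command: 'npx',
        args: ['-y', pkg],
        env: { ...modelEnv, MCP_SERVER_REQUEST_TIMEOUT: timeoutMs },
      },
    },
  };
}

// Placeholder values, to be replaced exactly as in the JSON above.
const exampleEnv: ModelEnv = {
  MIDSCENE_MODEL_BASE_URL: 'replace with your model service URL',
  MIDSCENE_MODEL_API_KEY: 'replace with your API Key',
  MIDSCENE_MODEL_NAME: 'replace with your model name',
  MIDSCENE_MODEL_FAMILY: 'replace with your model family',
};

console.log(JSON.stringify(buildMcpConfig('ios', exampleEnv), null, 2));
```

Generating the file this way is optional; writing the JSON by hand as shown above works just as well.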

Implement your own MCP

If you want to integrate Midscene tools into your own MCP service, you can use the mcpKitForAgent function to get tool definitions and expose your own MCP service as needed.

The tools provided by mcpKitForAgent include screenshots and every Action in the Action Space.

Using mcpKitForAgent

The mcpKitForAgent function takes an Agent instance and returns an object containing a description and a list of tools:

import { mcpKitForAgent } from '@midscene/web/mcp-server';
import { Agent } from '@midscene/core/agent';

const agent = new Agent();
const { description, tools } = await mcpKitForAgent(agent);

// description - "Control the browser / device using natural language commands"
// tools - Tool[] - array of tool definitions

Platform support

Each platform package exports its own mcpKitForAgent function:

Web platform

import { mcpKitForAgent } from '@midscene/web/mcp-server';

iOS platform

import { mcpKitForAgent } from '@midscene/ios/mcp-server';

Android platform

import { mcpKitForAgent } from '@midscene/android/mcp-server';

Integrate into a custom MCP service

You can integrate the obtained tools into your own MCP service:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { Agent } from '@midscene/core/agent';
import { mcpKitForAgent } from '@midscene/web/mcp-server';

const agent = new Agent();
const { description, tools } = await mcpKitForAgent(agent);
const server = new McpServer({
  name: 'my-custom-mcp',
  version: '1.0.0',
  description
});

// Register Midscene tools to your MCP service
for (const tool of tools) {
  server.tool(tool.name, tool.description, tool.schema, tool.handler);
}
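
If your custom service should expose only some of the tools, you can filter the list before the registration loop. The sketch below shows one way to do that with an allowlist. It is an illustration under assumptions: ToolLike is a local stand-in that assumes each tool carries the name, description, schema, and handler fields used in the loop above, pickTools is a hypothetical helper, and apart from take_screenshot the stub tool names are made up.

```typescript
// Minimal local stand-in for the tool objects used in the registration loop.
// Assumption: the definitions returned by mcpKitForAgent carry at least these fields.
interface ToolLike {
  name: string;
  description: string;
  schema: Record<string, unknown>;
  handler: (args: Record<string, unknown>) => Promise<unknown>;
}

// Keep only the tools whose names appear in the allowlist.
function pickTools<T extends ToolLike>(tools: T[], allowed: Iterable<string>): T[] {
  const allowedNames = new Set(allowed);
  return tools.filter((tool) => allowedNames.has(tool.name));
}

// Stub tools for illustration; apart from take_screenshot the names are made up.
const stubTools: ToolLike[] = ['take_screenshot', 'example_tap', 'example_scroll'].map(
  (name) => ({
    name,
    description: `stub for ${name}`,
    schema: {},
    handler: async () => ({ ok: true }),
  }),
);

// Expose a read-only subset: screenshots but no device actions.
const readOnlyTools = pickTools(stubTools, ['take_screenshot']);
console.log(readOnlyTools.map((tool) => tool.name));
```

In a real service you would pass the tools array obtained from mcpKitForAgent instead of stubTools, then register the filtered result.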