Expose devices as an MCP service

MCP (Model Context Protocol) is a protocol standard that lets AI models interact with external tools and capabilities.

Midscene provides MCP services that expose atomic operations in Midscene Agent (each Action in the Action Space) as MCP tools. Upper-layer Agents can use natural language to inspect the UI, precisely operate UI elements, and run automation tasks without needing to understand the underlying implementation.

Because Midscene Agent relies on a vision model, configure the environment variables that Midscene requires in the MCP service itself. The service does not reuse the upstream Agent's model configuration.

MCP tool list

Tool name	Description
Device connections such as `web_connect`, `ios_connect`, `android_connect`, `computer_connect`	Connect to target devices such as browsers, iOS devices, Android devices, or computer desktops
`take_screenshot`	Get the latest screenshot
Device actions	Each Action in the Action Space, such as `Tap`, `Scroll`, etc.

View execution reports

After each interaction finishes, Midscene generates a task report. You can open it from the command line (the `open` command below is the macOS one):

open report_file_name.html

The report includes detailed interaction information such as screenshots, operation logs, and error details to help with debugging and troubleshooting.

Configure MCP

Browser Bridge Mode

@midscene/web-bridge-mcp exposes the Chrome extension Bridge Mode as an MCP service.

Environment preparation

Refer to Chrome Bridge Mode to ensure the browser extension starts correctly. We recommend enabling Background Bridge Mode, which allows the connection to run persistently in the background without manual intervention and won't disconnect when closing the extension popup.

Background Bridge Mode

With background bridge mode enabled, the MCP service can connect at any time without user intervention. See Background Bridge Mode for details.

Configuration

Add the Midscene Web Bridge MCP server (@midscene/web-bridge-mcp) in your MCP client. For model configuration parameters, see Model strategy.

{
  "mcpServers": {
    "midscene-web": {
      "command": "npx",
      "args": ["-y", "@midscene/web-bridge-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API Key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "600000"
      }
    }
  }
}

iOS MCP service

Environment preparation

AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more details.
Device setup: Follow iOS Getting Started to configure WebDriverAgent, certificates, and device connections, and make sure WebDriverAgent is running. You can verify screenshots and basic operations in iOS Playground.

Configuration

Add the Midscene iOS MCP server (@midscene/ios-mcp) in your MCP client. For model configuration parameters, see Model strategy.

{
  "mcpServers": {
    "midscene-ios": {
      "command": "npx",
      "args": ["-y", "@midscene/ios-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API Key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}

Android MCP service

Environment preparation

AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more details.
Device setup: Follow Android Getting Started to configure adb and connect your device. Make sure the target device appears in the output of `adb devices`. Use Android Playground to verify screenshots and basic operations.

Configuration

Add the Midscene Android MCP server (@midscene/android-mcp) in your MCP client. For model configuration parameters, see Model strategy.

{
  "mcpServers": {
    "midscene-android": {
      "command": "npx",
      "args": ["-y", "@midscene/android-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API Key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}

Computer Desktop MCP service

@midscene/computer-mcp exposes the computer desktop automation capabilities as an MCP service, allowing AI to control your computer through mouse, keyboard, and screenshot operations.

Environment preparation

AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more details.
System permissions: On macOS, you need to grant accessibility and screen recording permissions to the terminal or application running the MCP service.

Configuration

Add the Midscene Computer MCP server (@midscene/computer-mcp) in your MCP client. For model configuration parameters, see Model strategy.

{
  "mcpServers": {
    "midscene-computer": {
      "command": "npx",
      "args": ["-y", "@midscene/computer-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API Key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}
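The four configurations above share one shape and differ only in the package name and the timeout value. As a minimal sketch, assuming you generate the client configuration from a script rather than editing JSON by hand, the entries could be built as follows. The names `midsceneServerEntry`, `ModelEnv`, and `McpServerEntry` are hypothetical helpers for illustration and are not part of Midscene; the package names, environment variable names, and timeout values are copied from the configurations above.

```typescript
// Hypothetical helper types; not part of any Midscene package.
type Platform = 'web-bridge' | 'ios' | 'android' | 'computer';

interface ModelEnv {
  baseUrl: string;
  apiKey: string;
  modelName: string;
  modelFamily: string;
}

interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds one entry of the "mcpServers" map for a Midscene MCP package.
function midsceneServerEntry(
  platform: Platform,
  model: ModelEnv,
  timeoutMs: number,
): McpServerEntry {
  return {
    command: 'npx',
    args: ['-y', `@midscene/${platform}-mcp`],
    env: {
      MIDSCENE_MODEL_BASE_URL: model.baseUrl,
      MIDSCENE_MODEL_API_KEY: model.apiKey,
      MIDSCENE_MODEL_NAME: model.modelName,
      MIDSCENE_MODEL_FAMILY: model.modelFamily,
      // MCP clients expect environment values as strings.
      MCP_SERVER_REQUEST_TIMEOUT: String(timeoutMs),
    },
  };
}

// Placeholder values, to be replaced with your own model settings.
const model: ModelEnv = {
  baseUrl: 'replace with your model service URL',
  apiKey: 'replace with your API Key',
  modelName: 'replace with your model name',
  modelFamily: 'replace with your model family',
};

const config = {
  mcpServers: {
    'midscene-web': midsceneServerEntry('web-bridge', model, 600000),
    'midscene-android': midsceneServerEntry('android', model, 800000),
  },
};

// Paste the printed JSON into your MCP client's configuration file.
console.log(JSON.stringify(config, null, 2));
```

Keeping the model settings in a single object means every Midscene MCP server you register receives the same vision-model configuration.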

Implement your own MCP

If you want to integrate Midscene tools into your own MCP service, you can use the mcpKitForAgent function to get tool definitions and expose your own MCP service as needed.

The tools provided by mcpKitForAgent include screenshots and every Action in the Action Space.

Using mcpKitForAgent

The mcpKitForAgent function takes an Agent instance and returns an object containing a description and a list of tools:

import { mcpKitForAgent } from '@midscene/web/mcp-server';
import { Agent } from '@midscene/core/agent';

const agent = new Agent();
const { description, tools } = await mcpKitForAgent(agent);

// description - "Control the browser / device using natural language commands"
// tools - Tool[] - array of tool definitions

Platform support

Each platform provides its corresponding mcpKitForAgent function:

Web platform

import { mcpKitForAgent } from '@midscene/web/mcp-server';

iOS platform

import { mcpKitForAgent } from '@midscene/ios/mcp-server';

Android platform

import { mcpKitForAgent } from '@midscene/android/mcp-server';

Computer platform

import { mcpKitForAgent } from '@midscene/computer/mcp-server';

Integrate into custom MCP service

You can integrate the obtained tools into your own MCP service:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { mcpKitForAgent } from '@midscene/web/mcp-server';
import { Agent } from '@midscene/core/agent';

const agent = new Agent();
const { description, tools } = await mcpKitForAgent(agent);
const server = new McpServer({
  name: 'my-custom-mcp',
  version: '1.0.0',
  description
});

// Register Midscene tools to your MCP service
for (const tool of tools) {
  server.tool(tool.name, tool.description, tool.schema, tool.handler);
}
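If your service should expose only some of the tools, you can filter the list before the registration loop. The following is a minimal sketch under stated assumptions: `ToolLike` is a hypothetical type that mirrors only the four fields the loop above reads (`name`, `description`, `schema`, `handler`), `selectTools` is an illustrative helper, and the stub tools stand in for the array returned by mcpKitForAgent so that the snippet runs without a device. The real tool names may differ from the stubs.

```typescript
// Assumed tool shape, inferred from the fields the registration loop reads.
interface ToolLike {
  name: string;
  description: string;
  schema: Record<string, unknown>;
  handler: (args: Record<string, unknown>) => Promise<unknown>;
}

// Hypothetical helper: keeps only the tools whose names are on the allowlist.
function selectTools(tools: ToolLike[], allowed: string[]): ToolLike[] {
  const allow = new Set(allowed);
  return tools.filter((tool) => allow.has(tool.name));
}

// Stub tools standing in for the array returned by mcpKitForAgent.
const stubTools: ToolLike[] = ['take_screenshot', 'Tap', 'Scroll'].map(
  (name) => ({
    name,
    description: `stub for ${name}`,
    schema: {},
    handler: async () => ({ ok: true }),
  }),
);

// Expose a read-only subset: screenshots but no device actions.
const readOnlyTools = selectTools(stubTools, ['take_screenshot']);
console.log(readOnlyTools.map((tool) => tool.name));
```

In a real service, you would pass the `tools` array from mcpKitForAgent to the filter and register the result with the same `server.tool(...)` loop.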
