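Exposing devices as an MCP service

MCP (Model Context Protocol) is a protocol standard that lets AI models interact with external tools and capabilities.

Midscene provides MCP services that expose the atomic operations of a Midscene Agent (that is, each Action in its Action Space) as MCP tools. An upper-level Agent can then use natural language to inspect the interface, operate UI elements precisely, and run automation tasks, without needing to understand the underlying implementation.

Because the Midscene Agent relies on a vision model, you need to configure the environment variables Midscene requires in the MCP service itself, rather than reusing the upper-level Agent's model configuration.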
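
As a rough illustration of that requirement, the sketch below checks a set of environment values for the four model variables that appear in the configuration examples in this document. The helper name `missingMidsceneModelEnv` is hypothetical and not part of Midscene, and treating all four variables as mandatory is an assumption; consult the model strategy documentation for what your model actually needs.

```typescript
// Hypothetical helper, not part of Midscene: reports which model variables are unset.
const MIDSCENE_MODEL_VARS = [
  "MIDSCENE_MODEL_BASE_URL",
  "MIDSCENE_MODEL_API_KEY",
  "MIDSCENE_MODEL_NAME",
  "MIDSCENE_MODEL_FAMILY",
] as const;

type EnvLike = Record<string, string | undefined>;

function missingMidsceneModelEnv(env: EnvLike): string[] {
  // A variable counts as missing when it is absent or only whitespace.
  return MIDSCENE_MODEL_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example input with two variables left out (placeholder values only).
const missing = missingMidsceneModelEnv({
  MIDSCENE_MODEL_BASE_URL: "https://example.invalid/v1",
  MIDSCENE_MODEL_NAME: "example-model",
});
console.log(missing);
```

Passing the environment in as a plain object keeps the check independent of how your MCP client injects the `env` block.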

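MCP tool list

Tool name	Description
Device connection, such as `web_connect`, `ios_connect`, `android_connect`, `computer_connect`	Connects to the target device, such as a browser, an iOS device, an Android device, or the computer desktop
`take_screenshot`	Captures the latest screenshot
Device operations	One tool for each Action in the Action Space, such as `Tap` and `Scroll`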
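
To make the tool list more concrete, here is a minimal sketch of the JSON-RPC message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` fields come from the MCP specification; the helper `buildToolCall` is hypothetical, and calling `take_screenshot` with no arguments is an assumption, since the tool's input schema is not listed here. In practice an MCP client library builds this message for you.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the request for a given tool name.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumption: take_screenshot needs no arguments.
const screenshotRequest = buildToolCall(1, "take_screenshot");
console.log(JSON.stringify(screenshotRequest));
```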

Viewing the execution report

Each time an interaction finishes, Midscene generates a task report. You can open it directly from the command line:

open report_file_name.html

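The report contains the details of the interaction, including screenshots, operation logs, and error messages, which helps with debugging and troubleshooting.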
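
Since a new report is produced after every interaction, it can be handy to pick out the most recent one. The sketch below does that over a file listing supplied by the caller. The `FileEntry` type and `newestHtmlReport` function are hypothetical, and where Midscene writes reports or how it names them is not stated here, so the listing itself is left to you.

```typescript
// Hypothetical description of a file: name plus last-modified time in milliseconds.
interface FileEntry {
  name: string;
  modifiedMs: number;
}

// Returns the name of the most recently modified .html file, if any.
function newestHtmlReport(files: FileEntry[]): string | undefined {
  const reports = files.filter((f) => f.name.toLowerCase().endsWith(".html"));
  if (reports.length === 0) return undefined;
  return reports.reduce((a, b) => (b.modifiedMs > a.modifiedMs ? b : a)).name;
}

// Made-up file names, for illustration only.
const latest = newestHtmlReport([
  { name: "report-a.html", modifiedMs: 1000 },
  { name: "notes.txt", modifiedMs: 5000 },
  { name: "report-b.html", modifiedMs: 3000 },
]);
console.log(latest);
```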

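Configuring MCP

Browser bridge mode

@midscene/web-bridge-mcp publishes the Chrome extension's bridge mode as an MCP service.

Prerequisites

Follow the Chrome bridge mode guide and make sure the browser extension can start. We recommend enabling Background Bridge mode: you no longer need to click "Allow connection" manually, and the connection keeps running in the background, so closing the extension popup does not drop it.

Background Bridge mode

With Background Bridge mode enabled, the MCP service can connect at any time without manual intervention. See Background Bridge mode for details.

Configuration

Add the Midscene Web Bridge MCP server (@midscene/web-bridge-mcp) to your MCP client. For the model configuration parameters, see Model strategy.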

{
  "mcpServers": {
    "midscene-web": {
      "command": "npx",
      "args": ["-y", "@midscene/web-bridge-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "600000"
      }
    }
  }
}
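
The server entries in this document differ only in the server key, the package name, and the timeout value, so the JSON above can also be generated. The following is a minimal sketch under that observation; `ModelEnv`, `McpServerEntry`, and `midsceneServerEntry` are hypothetical names, and the model values are placeholders.

```typescript
// Hypothetical types describing one entry under "mcpServers".
interface ModelEnv {
  baseUrl: string;
  apiKey: string;
  name: string;
  family: string;
}

interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds an entry with the same fields as the JSON configuration above.
function midsceneServerEntry(pkg: string, model: ModelEnv, timeoutValue: string): McpServerEntry {
  return {
    command: "npx",
    args: ["-y", pkg],
    env: {
      MIDSCENE_MODEL_BASE_URL: model.baseUrl,
      MIDSCENE_MODEL_API_KEY: model.apiKey,
      MIDSCENE_MODEL_NAME: model.name,
      MIDSCENE_MODEL_FAMILY: model.family,
      MCP_SERVER_REQUEST_TIMEOUT: timeoutValue,
    },
  };
}

// Placeholder model settings; replace with your own.
const exampleModel: ModelEnv = {
  baseUrl: "https://example.invalid/v1",
  apiKey: "your-api-key",
  name: "your-model-name",
  family: "your-model-family",
};

const exampleConfig = {
  mcpServers: {
    "midscene-web": midsceneServerEntry("@midscene/web-bridge-mcp", exampleModel, "600000"),
  },
};
console.log(JSON.stringify(exampleConfig, null, 2));
```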

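iOS MCP service

Prerequisites

AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more information.
Device environment: Follow the iOS Getting Started guide to set up WebDriverAgent, certificates, and the device connection, and make sure WebDriverAgent is running properly. You can verify that screenshots and basic operations work in the iOS Playground.

Configuration

Add the Midscene iOS MCP server (@midscene/ios-mcp) to your MCP client. For the model configuration parameters, see Model strategy.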

{
  "mcpServers": {
    "midscene-ios": {
      "command": "npx",
      "args": ["-y", "@midscene/ios-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}

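Android MCP service

Prerequisites

AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more information.
Device environment: Follow the Android Getting Started guide to set up the adb tool and the device connection, and make sure adb devices lists the target device. You can use the Android Playground to take a screenshot and run a simple operation to check the environment.

Configuration

Add the Midscene Android MCP server (@midscene/android-mcp) to your MCP client. For the model configuration parameters, see Model strategy.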

{
  "mcpServers": {
    "midscene-android": {
      "command": "npx",
      "args": ["-y", "@midscene/android-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}

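Computer desktop MCP service

@midscene/computer-mcp publishes desktop automation capabilities as an MCP service, so that AI can control your computer through mouse, keyboard, and screenshot operations.

Prerequisites

AI model service: Prepare an OpenAI API Key or another supported AI model service. See Model strategy for more information.
System permissions: On macOS, you need to grant Accessibility and Screen Recording permissions to the terminal or application that runs the MCP service.

Configuration

Add the Midscene Computer MCP server (@midscene/computer-mcp) to your MCP client. For the model configuration parameters, see Model strategy.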

{
  "mcpServers": {
    "midscene-computer": {
      "command": "npx",
      "args": ["-y", "@midscene/computer-mcp"],
      "env": {
        "MIDSCENE_MODEL_BASE_URL": "replace with your model service URL",
        "MIDSCENE_MODEL_API_KEY": "replace with your API key",
        "MIDSCENE_MODEL_NAME": "replace with your model name",
        "MIDSCENE_MODEL_FAMILY": "replace with your model family",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}

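Implementing your own MCP

If you want to integrate Midscene tools into your own MCP service, you can use the mcpKitForAgent function to get the tool definitions and then expose the MCP service however you need.

The tools provided by mcpKitForAgent include screenshot capture and each Action in the Action Space.

Using mcpKitForAgent

The mcpKitForAgent function accepts an Agent instance and returns an object containing a description and a list of tools: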

import { mcpKitForAgent } from '@midscene/web/mcp-server';
import { Agent } from '@midscene/core/agent';

const agent = new Agent();
const { description, tools } = await mcpKitForAgent(agent);

// description - "Control the browser / device using natural language commands"
// tools - Tool[] - array of tool definitions
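
To give a feel for working with the returned tool list, the sketch below looks up a tool by name. The `ToolLike` interface is an assumption inferred from the fields used elsewhere in this document (`name`, `description`, `schema`, `handler`); the real `Tool` type may differ. The stub tools stand in for a real Agent so the example runs without a device.

```typescript
// Assumed shape, inferred from the fields used in this document; not the official type.
interface ToolLike {
  name: string;
  description: string;
  schema: Record<string, unknown>;
  handler: (args: Record<string, unknown>) => Promise<unknown>;
}

// Hypothetical helper: finds a tool definition by its name.
function findTool(tools: ToolLike[], name: string): ToolLike | undefined {
  return tools.find((t) => t.name === name);
}

// Stand-in tools; a real list would come from mcpKitForAgent(agent).
const stubTools: ToolLike[] = [
  {
    name: "take_screenshot",
    description: "Capture the latest screenshot",
    schema: {},
    handler: async () => "stub-screenshot",
  },
  {
    name: "Tap",
    description: "Tap an element",
    schema: {},
    handler: async (args) => `tapped ${String(args["target"])}`,
  },
];

console.log(stubTools.map((t) => t.name).join(", "));
console.log(findTool(stubTools, "Tap")?.description);
```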

Platform support

Each platform provides its own mcpKitForAgent function:

Web platform

import { mcpKitForAgent } from '@midscene/web/mcp-server';

iOS platform

import { mcpKitForAgent } from '@midscene/ios/mcp-server';

Android platform

import { mcpKitForAgent } from '@midscene/android/mcp-server';

Computer platform

import { mcpKitForAgent } from '@midscene/computer/mcp-server';
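
If your service needs to choose the platform at runtime, the four import paths can be kept in a lookup table. This is only a sketch: the `Platform` type and `mcpKitModuleFor` function are hypothetical, and the paths are the ones listed in this section.

```typescript
// Hypothetical platform identifiers used as lookup keys.
type Platform = "web" | "ios" | "android" | "computer";

// Module specifiers copied from the import statements in this section.
const MCP_KIT_MODULES: Record<Platform, string> = {
  web: "@midscene/web/mcp-server",
  ios: "@midscene/ios/mcp-server",
  android: "@midscene/android/mcp-server",
  computer: "@midscene/computer/mcp-server",
};

function mcpKitModuleFor(platform: Platform): string {
  return MCP_KIT_MODULES[platform];
}

// In a project with the package installed, the result could be passed to a dynamic import:
//   const { mcpKitForAgent } = await import(mcpKitModuleFor("android"));
console.log(mcpKitModuleFor("android"));
```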

Integrating into a custom MCP service

You can integrate the tools you obtain into your own MCP service:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { mcpKitForAgent } from '@midscene/web/mcp-server';
import { Agent } from '@midscene/core/agent';

const agent = new Agent();
const { description, tools } = await mcpKitForAgent(agent);
const server = new McpServer({
  name: 'my-custom-mcp',
  version: '1.0.0',
  description
});

// Register the Midscene tools with your MCP service
for (const tool of tools) {
  server.tool(tool.name, tool.description, tool.schema, tool.handler);
}
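
If you only want to expose some of the tools, the same registration loop can be combined with an allowlist. The sketch below assumes the tool fields used in the loop above and replaces the real server with a minimal stand-in, so it runs without any MCP dependency; `RegisterableTool`, `ToolRegistrar`, and `registerAllowed` are hypothetical names.

```typescript
// Assumed tool shape, based on the fields passed to server.tool above.
interface RegisterableTool {
  name: string;
  description: string;
  schema: Record<string, unknown>;
  handler: (args: Record<string, unknown>) => Promise<unknown>;
}

// Minimal stand-in for the part of the server used during registration.
interface ToolRegistrar {
  tool(
    name: string,
    description: string,
    schema: Record<string, unknown>,
    handler: (args: Record<string, unknown>) => Promise<unknown>,
  ): void;
}

// Registers only the tools whose names are in the allowlist; returns the registered names.
function registerAllowed(
  server: ToolRegistrar,
  tools: RegisterableTool[],
  allowed: ReadonlySet<string>,
): string[] {
  const registered: string[] = [];
  for (const tool of tools) {
    if (!allowed.has(tool.name)) continue;
    server.tool(tool.name, tool.description, tool.schema, tool.handler);
    registered.push(tool.name);
  }
  return registered;
}

// Fake server that records registrations, used here instead of a real MCP server.
const recordedNames: string[] = [];
const fakeServer: ToolRegistrar = {
  tool: (name) => {
    recordedNames.push(name);
  },
};

const sampleTools: RegisterableTool[] = [
  { name: "take_screenshot", description: "Capture the latest screenshot", schema: {}, handler: async () => null },
  { name: "Tap", description: "Tap an element", schema: {}, handler: async () => null },
  { name: "Scroll", description: "Scroll the view", schema: {}, handler: async () => null },
];

const exposed = registerAllowed(fakeServer, sampleTools, new Set(["take_screenshot", "Tap"]));
console.log(exposed.join(", "));
```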
