Integrate with Android (adb)

After connecting the Android device with adb, you can use Midscene to control Android devices with visual-language (VL) models.

Demo Project

Preparation

Configure API Key

Configure the API key for the VL model. For example, if you use qwen-2.5-vl, you can configure it like this:

# replace with your own
OPENAI_BASE_URL="https://dashscope.aliyuncs.com/compatible-mode/v1"
OPENAI_API_KEY="......"
MIDSCENE_MODEL_NAME="qwen-vl-max-latest"
MIDSCENE_USE_QWEN_VL=1

Android devices can only be controlled by visual-language (VL) models. For more details, refer to "Choose a model".
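Before running a script, it can help to confirm that these variables are actually set. The following is a minimal sketch and not part of Midscene: `missingModelEnv` is a hypothetical helper, and the list of required names is taken from the qwen example above (other models may need a different set).

```typescript
// Hypothetical helper (not a Midscene API): report which variables from the
// example configuration above are unset or blank.
const REQUIRED_ENV = [
  'OPENAI_BASE_URL',
  'OPENAI_API_KEY',
  'MIDSCENE_MODEL_NAME',
  'MIDSCENE_USE_QWEN_VL',
] as const;

function missingModelEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Usage: in Node.js you would typically pass process.env instead of a literal.
const missing = missingModelEnv({
  OPENAI_BASE_URL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
  MIDSCENE_MODEL_NAME: 'qwen-vl-max-latest',
});
console.log(missing); // [ 'OPENAI_API_KEY', 'MIDSCENE_USE_QWEN_VL' ]
```

Failing fast on an empty result list keeps configuration mistakes separate from device or model errors.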

Adb environment

adb is a command-line tool that allows you to communicate with an Android device.

Verify adb is installed successfully:

adb --version

When you see the following output, adb is installed successfully:

Android Debug Bridge version 1.0.41
Version 34.0.4-10411341
Installed as /usr/local/bin//adb
Running on Darwin 24.3.0 (arm64)

Connect Android device with adb

In the developer options of the system settings, enable 'USB debugging' on the Android device. If 'USB debugging (secure settings)' exists, enable it as well. Then connect the Android device with a USB cable.

Verify the connection:

adb devices -l

When you see the following output, the connection is successful:

List of devices attached
s4ey59	device usb:34603008X product:cezanne model:M2006J device:cezan transport_id:3
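If you want to check the connection from a script rather than by eye, the output above can be parsed into records. This is a rough sketch: `parseAdbDevices` and `AdbDeviceLine` are hypothetical names, and the parser assumes the layout shown above (serial, a single-word state, then `key:value` pairs), so multi-word states would need extra handling.

```typescript
// Hypothetical helper: parse the text printed by `adb devices -l`.
interface AdbDeviceLine {
  serial: string;
  state: string; // e.g. "device", "offline", "unauthorized"
  props: Record<string, string>; // key:value pairs such as model or transport_id
}

function parseAdbDevices(output: string): AdbDeviceLine[] {
  return output
    .split('\n')
    .map((line) => line.trim())
    // Skip blank lines, the header, and adb daemon messages (they start with "*").
    .filter(
      (line) =>
        line !== '' && !line.startsWith('List of devices') && !line.startsWith('*'),
    )
    .map((line) => {
      const [serial, state, ...rest] = line.split(/\s+/);
      const props: Record<string, string> = {};
      for (const token of rest) {
        const idx = token.indexOf(':');
        if (idx > 0) props[token.slice(0, idx)] = token.slice(idx + 1);
      }
      return { serial, state: state ?? 'unknown', props };
    });
}

// Usage with the sample output shown above.
const sample = [
  'List of devices attached',
  's4ey59\tdevice usb:34603008X product:cezanne model:M2006J device:cezan transport_id:3',
].join('\n');
const [first] = parseAdbDevices(sample);
console.log(first.serial, first.state, first.props.model); // s4ey59 device M2006J
```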

Step 1. Install dependencies

npm install @midscene/android --save-dev

Step 2. Write scripts

Let's take a simple example: search for headphones on eBay using the browser on the Android device. (You can also use any other app on the device.)

Write the following code and save it as ./demo.ts:

./demo.ts
import { AndroidAgent, AndroidDevice, getConnectedDevices } from '@midscene/android';

// small helper: pause for the given number of milliseconds
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
Promise.resolve(
  (async () => {
    const devices = await getConnectedDevices();
    const page = new AndroidDevice(devices[0].udid);

    // 👀 init Midscene agent
    const agent = new AndroidAgent(page,{
      aiActionContext:
        'If any location, permission, user agreement, etc. popup, click agree. If login page pops up, close it.',
    });
    await page.connect();

    // 👀 open browser and navigate to ebay.com (Please ensure that the current page has a browser app)
    await agent.aiAction('open browser and navigate to ebay.com');

    await sleep(5000);

    // 👀 type keywords, perform a search
    await agent.aiAction('type "Headphones" in search box, hit Enter');

    // 👀 wait for loading completed
    await agent.aiWaitFor("There is at least one headphone product");
    // or you can use a normal sleep:
    // await sleep(5000);

    // 👀 understand the page content, extract data
    const items = await agent.aiQuery(
      "{itemTitle: string, price: Number}[], find item in list and corresponding price"
    );
    console.log("headphones in stock", items);

    // 👀 assert by AI
    await agent.aiAssert("There is a category filter on the left");
  })()
);

Step 3. Run

Use tsx to run the script:

# run
npx tsx demo.ts

After a while, you will see the following output:

[
{
  itemTitle: 'Beats by Dr. Dre Studio Buds Totally Wireless Noise Cancelling In Ear + OPEN BOX',
  price: 505.15
},
{
  itemTitle: 'Skullcandy Indy Truly Wireless Earbuds-Headphones Green Mint',
  price: 186.69
}
]
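Because the queried data is produced by a model, it may be worth validating its shape before relying on it. The sketch below is an illustration only: `HeadphoneItem` and `isHeadphoneItems` are hypothetical names that mirror the `{itemTitle: string, price: Number}[]` shape requested in the `aiQuery` prompt.

```typescript
// Shape requested in the aiQuery prompt of the demo script.
interface HeadphoneItem {
  itemTitle: string;
  price: number;
}

// Hypothetical type guard (not a Midscene API): narrows unknown data to HeadphoneItem[].
function isHeadphoneItems(value: unknown): value is HeadphoneItem[] {
  if (!Array.isArray(value)) return false;
  return value.every((item) => {
    if (typeof item !== 'object' || item === null) return false;
    const record = item as Record<string, unknown>;
    return (
      typeof record.itemTitle === 'string' &&
      typeof record.price === 'number' &&
      Number.isFinite(record.price)
    );
  });
}

// Usage: in the demo script, `raw` would be the value returned by agent.aiQuery(...).
const raw: unknown = [
  { itemTitle: 'Skullcandy Indy Truly Wireless Earbuds-Headphones Green Mint', price: 186.69 },
];
if (isHeadphoneItems(raw)) {
  const cheapest = [...raw].sort((a, b) => a.price - b.price)[0];
  console.log(cheapest.itemTitle);
}
```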

Step 4. View the report

After the above command executes successfully, the console will output: Midscene - report file updated: /path/to/report/some_id.html. You can open this file in a browser to view the report.
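If the script runs in an automated environment, you may want to pick the report path out of the captured console output. This is a small sketch that assumes the message format is exactly the one quoted above; `extractReportPath` is a hypothetical helper, not a Midscene API.

```typescript
// Assumption: the console line has the exact prefix quoted in this section.
const REPORT_PREFIX = 'Midscene - report file updated: ';

// Hypothetical helper: return the most recently reported file path, if any.
function extractReportPath(consoleOutput: string): string | undefined {
  const lines = consoleOutput.split('\n');
  // Scan from the end so the latest report line wins.
  for (let i = lines.length - 1; i >= 0; i--) {
    const idx = lines[i].indexOf(REPORT_PREFIX);
    if (idx !== -1) return lines[i].slice(idx + REPORT_PREFIX.length).trim();
  }
  return undefined;
}

// Usage with the placeholder path from the text above.
console.log(
  extractReportPath('Midscene - report file updated: /path/to/report/some_id.html'),
); // /path/to/report/some_id.html
```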

More interfaces in AndroidAgent

In addition to the common agent interfaces in the API Reference, AndroidAgent provides some other interfaces:

agent.launch()

Launch a webpage or native page.

  • Type
function launch(uri: string): Promise<void>;
  • Parameters:

    • uri: string - The uri to open. It can be a webpage url, a native app's package name, or a package name followed by an activity name. If an activity name is given, separate it from the package name with / (e.g. com.android.settings/.Settings).
  • Return Value:

    • Returns a Promise that resolves to void when the page is opened.
  • Examples:

import { AndroidAgent, AndroidDevice } from '@midscene/android';

const page = new AndroidDevice('s4ey59ytbitot4yp');
const agent = new AndroidAgent(page);

await agent.launch('https://www.ebay.com'); // open a webpage
await agent.launch('com.android.settings'); // open a native page
await agent.launch('com.android.settings/.Settings'); // open a native page
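To make the three accepted `uri` forms concrete, here is a small sketch that classifies a string the way the parameter description reads. It only illustrates the documented format: `describeLaunchTarget` and `LaunchTarget` are hypothetical names, and Midscene's own parsing may differ.

```typescript
type LaunchTarget =
  | { kind: 'url'; url: string }
  | { kind: 'package'; packageName: string }
  | { kind: 'activity'; packageName: string; activityName: string };

// Hypothetical helper (not a Midscene API).
function describeLaunchTarget(uri: string): LaunchTarget {
  // Assumption: anything starting with a scheme such as "https://" is a webpage url.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(uri)) return { kind: 'url', url: uri };

  const slash = uri.indexOf('/');
  if (slash === -1) return { kind: 'package', packageName: uri };

  // "package/activity": everything before the first "/" is the package name.
  return {
    kind: 'activity',
    packageName: uri.slice(0, slash),
    activityName: uri.slice(slash + 1),
  };
}

console.log(describeLaunchTarget('https://www.ebay.com').kind); // url
console.log(describeLaunchTarget('com.android.settings').kind); // package
console.log(describeLaunchTarget('com.android.settings/.Settings').kind); // activity
```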

agentFromAdbDevice()

Create an AndroidAgent from a connected adb device.

  • Type
function agentFromAdbDevice(deviceId?: string, opts?: PageAgentOpt): Promise<AndroidAgent>;
  • Parameters:

    • deviceId?: string - Optional, the adb device id to connect. If not provided, the first connected device will be used.
    • opts?: PageAgentOpt - Optional, the options for the AndroidAgent, refer to constructor.
  • Return Value:

    • Promise<AndroidAgent> Returns a Promise that resolves to an AndroidAgent.
  • Examples:

import { agentFromAdbDevice } from '@midscene/android';

const agent = await agentFromAdbDevice('s4ey59ytbitot4yp'); // create an AndroidAgent from a specific adb device
const defaultAgent = await agentFromAdbDevice(); // no deviceId, use the first connected device

getConnectedDevices()

Get all connected Android devices.

  • Type
function getConnectedDevices(): Promise<Device[]>;
interface Device {
  /**
   * The device udid.
   */
  udid: string;
  /**
   * Current device state, as it is visible in
   * _adb devices -l_ output.
   */
  state: string;
  port?: number;
}
  • Return Value:

    • Promise<Device[]> Returns a Promise that resolves to an array of Device.
  • Examples:

import { agentFromAdbDevice, getConnectedDevices } from '@midscene/android';

const devices = await getConnectedDevices();
console.log(devices);
const agent = await agentFromAdbDevice(devices[0].udid);
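Taking `devices[0]` works when exactly one healthy device is attached. With several devices, or with one that is offline or unauthorized, a small selection step can give clearer errors. The sketch below is an assumption-laden illustration: `pickReadyDevice` is a hypothetical helper, and it assumes `state` carries the adb state string, with `device` meaning ready as in the `adb devices -l` output shown earlier.

```typescript
// Mirrors the Device interface shown above.
interface Device {
  udid: string;
  state: string;
  port?: number;
}

// Hypothetical helper (not a Midscene API): choose a device that is ready for use.
function pickReadyDevice(devices: Device[], preferredUdid?: string): Device {
  // Assumption: a ready device reports the state "device".
  const ready = devices.filter((d) => d.state === 'device');
  if (ready.length === 0) {
    const seen = devices.map((d) => `${d.udid} (${d.state})`).join(', ') || 'none';
    throw new Error(`No ready Android device. Seen: ${seen}`);
  }
  if (preferredUdid === undefined) return ready[0];

  const match = ready.find((d) => d.udid === preferredUdid);
  if (!match) throw new Error(`Device ${preferredUdid} is not connected or not ready`);
  return match;
}

// Usage: in a real script, the list would come from getConnectedDevices().
const chosen = pickReadyDevice([
  { udid: 'emulator-5554', state: 'offline' },
  { udid: 's4ey59', state: 'device' },
]);
console.log(chosen.udid); // s4ey59
```

The returned `udid` can then be passed to `agentFromAdbDevice` as in the example above.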

More

FAQ

Why can't I control the device even though I've connected it?

Please check if the device is unlocked in the developer options of the system settings.

Why can't I see the device even though I've connected it?

After connecting the device, make sure it is unlocked.