Midscene supports caching Plan steps and matched DOM element information to reduce AI model calls and greatly improve execution efficiency. Please note that DOM element cache is only supported for web automation tasks.
Effect
When the cache is hit, execution time drops significantly. In one example run, execution time fell from 51 seconds to 28 seconds.
Midscene's caching mechanism is based on input stability and output reusability. When the same task instructions are repeatedly executed in similar page environments, Midscene will prioritize using cached results to avoid repeated AI model calls, significantly improving execution efficiency.
The core caching mechanisms include:
- For planning calls (ai, aiAction), Midscene uses the prompt instruction as the cache key to store the execution plan returned by the AI.
- For locate calls (aiLocate, aiTap), the system uses the location prompt as the cache key to store element XPath information, and verifies whether the XPath is still valid on the next execution.
- aiBoolean, aiQuery, and aiAssert will never be cached.

Cache contents are saved in the ./midscene_run/cache directory, with .cache.yaml as the file extension.
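The two cache kinds described above can be pictured with a small toy model. This is only a sketch of the idea (prompt as cache key, XPath re-validated before reuse), not Midscene's actual implementation; the class name PromptCache and the askAi and isValid callbacks are hypothetical stand-ins for the real planner, locator, and DOM check.

```typescript
// Illustrative model only: names and shapes here are assumptions, not Midscene internals.
type Plan = string[];

class PromptCache {
  private plans = new Map<string, Plan>();
  private xpaths = new Map<string, string>();
  aiCalls = 0; // counts simulated AI model calls

  // Planning: the prompt instruction is the cache key.
  plan(prompt: string, askAi: (prompt: string) => Plan): Plan {
    const cached = this.plans.get(prompt);
    if (cached) return cached; // cache hit: no AI call
    this.aiCalls++;
    const plan = askAi(prompt);
    this.plans.set(prompt, plan);
    return plan;
  }

  // Locating: the location prompt is the cache key, and the cached XPath
  // is re-validated before it is reused.
  locate(
    prompt: string,
    isValid: (xpath: string) => boolean,
    askAi: (prompt: string) => string,
  ): string {
    const cached = this.xpaths.get(prompt);
    if (cached !== undefined && isValid(cached)) return cached;
    this.aiCalls++; // miss or stale XPath: fall back to the AI
    const xpath = askAi(prompt);
    this.xpaths.set(prompt, xpath);
    return xpath;
  }
}
```

In this model, repeating the same prompt costs one AI call in total, while a stale XPath costs one extra call and refreshes the cached entry.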
By configuring the cache option, you can enable caching for your agent.
Configuration: cache: false, or leave the cache option unset.
Completely disables caching, so the AI model is called for every operation. Suitable when you need real-time results or for debugging. Caching is disabled by default when the cache option is not configured.
Configuration: cache: { id: "my-cache-id" } or cache: { strategy: "read-write", id: "my-cache-id" }
Automatically reads the existing cache and updates cache files during execution. The default value of strategy is read-write.
YAML mode also supports cache: true, which automatically uses the file name as the cache ID.
Configuration: cache: { strategy: "read-only", id: "my-cache-id" }
Only reads the cache; cache files are not written automatically. To write cache files, call agent.flushCache() manually. Suitable for production environments to ensure cache consistency.
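To make the defaults concrete, here is a minimal sketch of how the option shapes described so far map to behaviour. The types and the helper names resolveCacheOption and writesAutomatically are hypothetical and written only for illustration; they model the rules stated above (unset or false means disabled, strategy defaults to read-write) and leave out the YAML-only cache: true form.

```typescript
// Hypothetical types modelling the documented option shapes; not Midscene's own typings.
type CacheStrategy = "read-write" | "read-only";
type CacheOption = false | undefined | { id: string; strategy?: CacheStrategy };

interface ResolvedCache {
  enabled: boolean;
  id?: string;
  strategy?: CacheStrategy;
}

// Apply the documented defaults to a user-supplied cache option.
function resolveCacheOption(option: CacheOption): ResolvedCache {
  if (!option) return { enabled: false }; // cache: false, or not configured
  return {
    enabled: true,
    id: option.id,
    strategy: option.strategy ?? "read-write", // documented default
  };
}

// Only read-write mode updates cache files during execution;
// read-only mode needs an explicit agent.flushCache() call.
function writesAutomatically(resolved: ResolvedCache): boolean {
  return resolved.enabled && resolved.strategy === "read-write";
}
```

For example, resolveCacheOption({ id: "my-cache-id" }) resolves to read-write mode, while adding strategy: "read-only" yields a configuration that never writes on its own.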
Configuration: set the MIDSCENE_CACHE=1 environment variable together with cacheId. This is equivalent to read-write mode.
@midscene/web/playwright
When using PlaywrightAiFixture from @midscene/web/playwright, pass the same cache options to control caching behaviour.
When you run the fixture in read-only mode, you need to persist the cache manually after your test steps. Use the agentForPage helper provided by the fixture to fetch the underlying agent, then call agent.flushCache() at the point where you want to write the cache file.
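The flush step can be sketched as a small helper. The interface shapes below are assumptions inferred from the description (agentForPage returns an agent, and the agent exposes flushCache()); the real types come from @midscene/web/playwright, and the helper name persistCache is hypothetical.

```typescript
// Minimal shapes assumed from the text; the real typings live in @midscene/web/playwright.
interface FlushableAgent {
  flushCache(): Promise<void> | void;
}
type AgentForPage<P> = (page: P) => Promise<FlushableAgent> | FlushableAgent;

// Fetch the agent bound to a page, then write its cache file.
async function persistCache<P>(page: P, agentForPage: AgentForPage<P>): Promise<void> {
  const agent = await agentForPage(page);
  await agent.flushCache();
}

// Hypothetical wiring in a Playwright test file that uses the fixture:
// test.afterEach(async ({ page, agentForPage }) => {
//   await persistCache(page, agentForPage);
// });
```

Calling the helper from a hook such as afterEach keeps the write in one place, but it can equally be called inline at whatever point in a test the cache file should be written.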
Please ensure you have correctly configured caching:
- Set cache: { id: "your-cache-id" } in the agent constructor
- Set cache: true or cache: { id: "your-cache-id" } in the fixture configuration
- Set agent.cache.id in the YAML file
- Call the agent.flushCache() method when using read-only mode
- Set cacheId and enable the MIDSCENE_CACHE=1 environment variable

You can view the report file. If the cache is hit, you will see the cache tip, and the time cost is noticeably reduced.
You need to commit the cache files to the repository in CI and recheck the cache hit conditions.
Caching is a way to accelerate execution, but it is not a tool for ensuring long-term script stability. We have seen many scenarios where the cache misses after the DOM structure changes. When a cache miss occurs, AI services are still needed to re-evaluate the task.
You can remove the cache file in the ./midscene_run/cache directory, or edit the contents of the cache file.
You can use the cacheable option to disable the cache for a single API.
Please refer to the documentation of the corresponding API for details.
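As a rough illustration, a per-call opt-out might look like the sketch below. The TapAgent interface is a simplified assumption about how aiTap accepts the cacheable option, and tapWithoutCache is a hypothetical wrapper; check the API documentation for the exact option shape of each method.

```typescript
// Simplified, assumed shape of an agent method that accepts the cacheable option.
interface CacheableOptions {
  cacheable?: boolean;
}
interface TapAgent {
  aiTap(locatePrompt: string, options?: CacheableOptions): Promise<unknown>;
}

// Tap an element while skipping the cache for this one call only.
async function tapWithoutCache(agent: TapAgent, locatePrompt: string): Promise<unknown> {
  return agent.aiTap(locatePrompt, { cacheable: false });
}
```

Other calls on the same agent are unaffected, so the agent-level cache configuration stays in force everywhere else.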
Midscene uses XPath to cache the element location. We are using a relatively strict strategy to prevent false matches. In these situations, the cache will not be accessed.
When the cache is not hit, the process falls back to using AI services to find the element.
You can set the DEBUG=midscene:cache:* environment variable to get more debug logs for caching.