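🎭 Playwright MCP: Microsoft's Official Server with 32.6k Stars, One Command to Let Your AI Agent Drive a Real Browser
Key highlights
Microsoft's Playwright MCP is the most popular browser automation server in the Model Context Protocol (MCP) ecosystem. Start it with a single npx command and your Claude / Cursor / Copilot can drive a real Chrome or Firefox directly: no screenshots, no vision models, just structured data from Playwright's accessibility tree.
Its biggest difference from browser-use: browser-use takes the screenshot plus vision model route, while Playwright MCP works from the accessibility tree and structured DOM information, which makes it lighter and more deterministic. It has 32.6k stars and 5,000+ npm downloads a day, and the community is already going all in.
One-line start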
npx @playwright/mcp@latest
Then add this JSON to your MCP client's config file:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
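If the config file already lists other servers, the entry has to be merged in rather than pasted over the whole file. A minimal Python sketch of that merge, assuming the config uses the top-level "mcpServers" key shown above. The function name add_playwright_server and the "other" sample entry are hypothetical, and where the file lives depends on your client:

```python
import json


def add_playwright_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the Playwright server added.

    Existing entries under "mcpServers" are kept; only the "playwright"
    key is set (or overwritten). The input dict is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["playwright"] = {
        "command": "npx",
        "args": ["@playwright/mcp@latest"],
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    print(json.dumps(add_playwright_server(existing), indent=2))
```

Returning a copy instead of editing in place makes it easy to diff the result against the current file before writing anything back.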
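Once it is set up, you can ask the agent to "open GitHub, search for hermes-agent, and save screenshots of the top three results", all described in natural language.
Real commands
Claude Code client setup: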
claude mcp add playwright npx @playwright/mcp@latest
VS Code one-step install (on VS Code Insiders, the CLI is code-insiders instead of code):
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
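Supported clients: Claude Code, Cursor, VS Code, Windsurf, Goose, Copilot, Gemini CLI, Cline, Codex, opencode, Warp… which covers essentially every mainstream MCP client on the market.
Advanced usage
Playwright MCP can do more than "open a page and click a button". It also supports:
- CDP connection: attach to an already running Chrome instance and reuse its existing login state
- Browser extension: with the Playwright extension installed, drive your current browser tabs directly, with all cookies and sessions intact
- Vision mode: add the --caps vision flag to enable coordinate-level interaction (clicking a specific pixel position)
- Code generation: --codegen typescript has the agent turn its actions into Playwright test code
- Persistent profile: login information is saved automatically, so you do not have to log in again the next time the browser opens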
# Start with vision capabilities
npx @playwright/mcp@latest --caps vision
# Connect to an existing browser instance
npx @playwright/mcp@latest --cdp-endpoint http://localhost:9222
# Emulate an iPhone 15
npx @playwright/mcp@latest --device "iPhone 15" --headless
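When these flags go into a client config instead of a shell, each flag and each value becomes its own element of the "args" array, and the shell quotes around "iPhone 15" disappear. A small Python sketch of that translation, using only the flags shown above. The helper name build_mcp_args and its parameters are hypothetical, not part of Playwright MCP:

```python
from typing import List, Optional


def build_mcp_args(
    caps: Optional[str] = None,
    cdp_endpoint: Optional[str] = None,
    device: Optional[str] = None,
    headless: bool = False,
) -> List[str]:
    """Build the "args" list for an MCP config entry from a few options."""
    args = ["@playwright/mcp@latest"]
    if caps:
        args += ["--caps", caps]
    if cdp_endpoint:
        args += ["--cdp-endpoint", cdp_endpoint]
    if device:
        # No shell quoting here: the value is a single list element.
        args += ["--device", device]
    if headless:
        args.append("--headless")
    return args


if __name__ == "__main__":
    entry = {"command": "npx", "args": build_mcp_args(device="iPhone 15", headless=True)}
    print(entry)
    # {'command': 'npx', 'args': ['@playwright/mcp@latest', '--device', 'iPhone 15', '--headless']}
```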
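Autopilot testing vs. manual debugging
If you write tests, Playwright MCP effectively gives your AI agent a browser driver: the agent can run E2E tests, compare screenshots, and fill in and validate forms on its own. Combined with --save-session, it can also save the full session log and screenshots for later review.
One caveat: every MCP call has to carry the full accessibility tree, so token consumption is not low. Microsoft itself suggests CLI+SKILLS for coding scenarios and MCP for automation and testing scenarios. So if you only want a quick check while writing code, @playwright/cli saves more tokens.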
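To get a feel for how that cost adds up, here is a back-of-the-envelope estimate in Python. Every number in it is an assumption for illustration: the rough 4-characters-per-token rule of thumb, the snapshot size, and the call count are not measurements of Playwright MCP.

```python
def estimate_snapshot_tokens(
    snapshot_chars: int,
    calls: int,
    chars_per_token: float = 4.0,  # assumed rule of thumb, varies by tokenizer
) -> int:
    """Estimate total tokens spent on page snapshots across a session."""
    tokens_per_call = snapshot_chars / chars_per_token
    return round(tokens_per_call * calls)


if __name__ == "__main__":
    # Hypothetical session: a 20,000-character snapshot on each of 15 tool calls.
    print(estimate_snapshot_tokens(20_000, 15))  # 75000
```

The point of the sketch is only that the cost scales with both page size and the number of calls, which is why short, occasional checks are the case where a lighter tool pays off.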
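Who it is for
- Coding agent users: the agent verifies the page directly while it writes code
- Test teams: let the agent run E2E automatically, with screenshot comparison, form filling, and render checks
- Scraping and data collection: the agent drives a browser to fetch JS-rendered pages
- Anyone who wants an AI to operate a browser: it takes one command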
🎭 Playwright MCP: 32.6k★ — Microsoft's Official MCP Server, One Command to Give Your AI Agent a Real Browser
The gist: Microsoft's Playwright MCP is the most popular browser automation MCP server out there. One npx command and your Claude/Cursor/Copilot can drive a real Chrome/Firefox—no screenshots, no vision models, just Playwright's accessibility tree.
One-liner install:
npx @playwright/mcp@latest
Then register it under mcpServers in your MCP client config and you're done. Your AI Agent can "open GitHub, search hermes-agent, screenshot the top 3 results" in natural language.
Supported clients: Claude Code, Cursor, VS Code, Windsurf, Goose, Copilot, Gemini CLI, Cline, Codex, opencode, Warp… basically everything that speaks MCP.
Pro tips:
- --caps vision for coordinate-based interactions
- --cdp-endpoint http://localhost:9222 to connect your existing browser
- --device "iPhone 15" for mobile emulation
- --codegen typescript to turn operations into Playwright test code
Trade-off: MCP sends the full accessibility tree on every call—token-heavy. Microsoft recommends CLI+SKILLS for coding workflows, MCP for testing/automation. Pick your poison.