Page Agent: a 17.7k-Star JavaScript GUI Agent in a Single Script Tag

Project: alibaba/page-agent | ⭐ 17.7k Stars | 🛠 TypeScript | 🏢 Alibaba

Let's be honest: web automation used to mean installing the whole Selenium stack, spinning up a headless browser with Puppeteer, or shipping a browser extension, and every option felt like a major project. Alibaba's open-source Page Agent takes a different route. It is pure JavaScript: a single script tag gives your web page GUI agent capabilities, with no backend changes.

🎯 What It Actually Does

Page Agent is a GUI agent library written in pure JavaScript that runs inside the page. Drop it into your web page and you can drive the interface with natural language: click buttons, fill in forms, and extract data, all without writing a single selector.

The slick part? It needs no screenshots, no multimodal model, and no browser extension. It works on the DOM as text rather than as pixels, so an ordinary text-only LLM can drive it.
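To make "text-based DOM operation" concrete, here is a minimal sketch of the general idea. It is not Page Agent's actual implementation. The simplified element model (`UiNode`), the list of interactive tags, the `[index]<tag>` output format, and the `click 1` action syntax are all assumptions made up for illustration.

```typescript
// Hypothetical, simplified element model standing in for real DOM nodes.
interface UiNode {
  tag: string;
  text?: string;
  attrs?: Record<string, string>;
  children?: UiNode[];
}

// Tags treated as interactive in this sketch (an assumption, not the library's rule set).
const INTERACTIVE_TAGS = new Set(["a", "button", "input", "select", "textarea"]);

interface Snapshot {
  prompt: string;    // plain text a text-only LLM could read
  targets: UiNode[]; // targets[i] is the node rendered as [i]
}

// Walk the tree and emit one indexed line per interactive element.
function describePage(root: UiNode): Snapshot {
  const lines: string[] = [];
  const targets: UiNode[] = [];

  const visit = (node: UiNode): void => {
    if (INTERACTIVE_TAGS.has(node.tag)) {
      const attrs = Object.entries(node.attrs ?? {})
        .map(([key, value]) => ` ${key}="${value}"`)
        .join("");
      lines.push(`[${targets.length}]<${node.tag}${attrs}>${node.text ?? ""}</${node.tag}>`);
      targets.push(node);
    }
    for (const child of node.children ?? []) visit(child);
  };

  visit(root);
  return { prompt: lines.join("\n"), targets };
}

// Map a text action such as "click 1" back to the node it refers to.
function resolveAction(action: string, snap: Snapshot): { verb: string; node: UiNode } {
  const [verb = "", indexText = ""] = action.trim().split(/\s+/);
  if (!["click", "fill"].includes(verb) || !/^\d+$/.test(indexText)) {
    throw new Error(`Unrecognized action: ${action}`);
  }
  const node = snap.targets[Number(indexText)];
  if (node === undefined) throw new Error(`No element with index ${indexText}`);
  return { verb, node };
}

// A tiny login form described with the hypothetical model.
const loginPage: UiNode = {
  tag: "form",
  children: [
    { tag: "input", attrs: { name: "user", placeholder: "Username" } },
    { tag: "p", text: "Forgot your password?" },
    { tag: "button", text: "Log in" },
  ],
};

const loginSnapshot = describePage(loginPage);
console.log(loginSnapshot.prompt);

// Pretend the model answered "click 1" after reading the prompt.
const resolved = resolveAction("click 1", loginSnapshot);
console.log(resolved.verb, resolved.node.tag);
```

The model only ever sees a short indexed text listing and answers with an index, which is why no vision capability is needed in a scheme like this.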

⚡ Try It in One Line

The fastest way to try it is to add one script tag to your page:

<script src="https://cdn.jsdelivr.net/npm/page-agent@1.8.1/dist/iife/page-agent.demo.js" crossorigin="true"></script>

⚠️ This demo build uses a free testing LLM API provided by Alibaba. It is for technical evaluation only.

🛠 Production Integration

Install via npm:

npm install page-agent

Then initialize in your code:

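import { PageAgent } from 'page-agent'

// Point the agent at an OpenAI-compatible LLM endpoint.
const agent = new PageAgent({
    model: 'qwen3.5-plus',
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
    apiKey: 'YOUR_API_KEY', // replace with your own key
    language: 'en-US',
})

// Describe the task in natural language (top-level await requires an ES module).
await agent.execute('Click the login button')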

That's it. No more hand-writing document.querySelector('#login-btn').click().
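In a real flow you rarely stop at one instruction. The sketch below chains several natural-language steps and halts at the first failure. It relies only on the `execute(task)` call shown above. The `TaskAgent` interface, the `runSteps` helper, and the `fakeAgent` stand-in are hypothetical names, and it assumes `execute` rejects when a step fails, which the real library may or may not do.

```typescript
// Only the `execute` method from the snippet above is assumed here;
// its resolved value is treated as opaque.
interface TaskAgent {
  execute(task: string): Promise<unknown>;
}

interface StepResult {
  task: string;
  ok: boolean;
  error?: string;
}

// Run tasks one at a time and stop at the first failure, because later
// steps usually depend on the page state left by earlier ones.
async function runSteps(agent: TaskAgent, tasks: string[]): Promise<StepResult[]> {
  const results: StepResult[] = [];
  for (const task of tasks) {
    try {
      await agent.execute(task);
      results.push({ task, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ task, ok: false, error: message });
      break;
    }
  }
  return results;
}

// Stand-in agent so the helper can be tried without a browser or an API key.
// It fails on any task that mentions "submit".
const fakeAgent: TaskAgent = {
  async execute(task: string): Promise<void> {
    if (task.includes("submit")) throw new Error("button not found");
  },
};

const stepsPromise = runSteps(fakeAgent, [
  "Fill in the username field with 'demo'",
  "Click the submit button",
  "Open the settings page",
]);

stepsPromise.then((results) => console.log(results));
```

In a page that already has a configured `PageAgent` instance, you would pass that instance instead of `fakeAgent`, assuming its `execute` method behaves as described.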

🔧 More Use Cases

Beyond clicking buttons and filling in forms, you can build a SaaS AI copilot (adding an AI assistant to your product in a few lines of code), run cross-page agents with the Chrome extension, or let external agent clients control your browser through the MCP Server.

Page Agent is deliberately narrow in scope: it is designed for client-side page enhancement, not server-side automation. In practice, that means your ERP, CRM, or admin dashboard can gain an AI operator that takes natural-language instructions.

Key Takeaways

Pure frontend JS GUI agent: no backend, no plugins, no headless browser

One script tag or one npm install to get started

Text-driven DOM manipulation runs on an ordinary LLM, with no multimodal model required

Supports custom LLMs, a Chrome extension, and an MCP Server

Well suited to SaaS copilots, smart forms, and accessibility enhancement
