🎬 video-use: 7.6k Stars — Edit Videos with Claude Code, No Premiere Required
github.com/browser-use/video-use | ⭐ 7,664 | 🛠 Python | 👤 browser-use
Honestly, I've always dreaded video editing. Open Premiere, wait two minutes, tweak the timeline, wait for render, fix a subtitle typo, export the whole thing again. And those filler words — "umm," "uh," false starts — manually cutting them out is soul-crushing.
The browser-use team just shipped video-use: drop raw footage in a folder, tell Claude Code "edit these into a launch video," and it handles the rest. 100% open source, no paid services required (except the optional ElevenLabs API).
What It Does
🔥 Auto-cuts filler words — umm, ah, false starts, silence between takes. Word-level precision via transcript analysis, not timeline scrubbing.
⚡ Auto color grading — warm cinematic, neutral punch, or your own custom ffmpeg chain. Applied per segment (see the first sketch after this list).
🎯 30ms audio fades at every cut — no pops, guaranteed.
📝 Burns subtitles — 2-word UPPERCASE chunks by default, fully customizable.
🎨 Animation overlays via HyperFrames, Remotion, Manim, or PIL — spawned as parallel sub-agents, one sub-process per animation.
🔄 Self-evaluates rendered output at every cut boundary. Catches visual jumps and audio pops, and auto-fixes with up to 3 retries (see the second sketch after this list).
💾 Session memory in project.md — pick up next week where you left off.
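To make the per-segment processing concrete, here is a hand-rolled ffmpeg sketch of the same idea. These are not video-use's actual commands; the cut points, filter settings, and filenames are all illustrative:
# Keep 3.21s-8.74s of a take (boundaries come from word-level timestamps),
# apply a warm-ish grade, burn subtitles, and fade audio in/out over 30ms
# so the cut doesn't pop. captions.srt must be timed relative to the segment;
# the fade-out starts at segment length (5.53s) minus 0.03s.
ffmpeg -ss 3.21 -to 8.74 -i take_01.mp4 \
  -vf "eq=saturation=1.15:contrast=1.05,subtitles=captions.srt" \
  -af "afade=t=in:st=0:d=0.03,afade=t=out:st=5.50:d=0.03" \
  segment_01.mp4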
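And a rough illustration of the self-evaluation idea: check the audio around a cut boundary for a loud transient. Again, this shows the concept, not the skill's actual evaluation code:
# Measure the peak level in a 100ms window around a cut at 8.74s;
# a spike near 0 dB suggests an audible pop. astats logs to stderr.
ffmpeg -ss 8.69 -t 0.1 -i edit/final.mp4 -af astats -f null - 2>&1 \
  | grep -i "peak level"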
Quick Start
# 1. Clone and symlink
git clone https://github.com/browser-use/video-use ~/Developer/video-use
ln -sfn ~/Developer/video-use ~/.claude/skills/video-use
# 2. Install deps
cd ~/Developer/video-use
uv sync
brew install ffmpeg
# 3. Add API key
cp .env.example .env
# Edit .env with ELEVENLABS_API_KEY
# 4. Point your agent at raw footage
cd /path/to/your/videos
claude
Then just say:
edit these into a launch video
It inventories the sources, proposes a strategy, waits for your OK, then produces edit/final.mp4.
How It Works
The LLM never watches the video — it reads it.
Layer 1 — Audio transcript. One ElevenLabs Scribe call per source gives word-level timestamps, speaker diarization, and audio events (laughter, applause). All takes are packed into a single ~12KB takes_packed.md, which is what the LLM actually reads.
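A minimal sketch of such a call, assuming the current ElevenLabs speech-to-text endpoint; the field names here are from memory, so verify them against the API reference:
# One Scribe call per source file; the response's words array carries
# start/end times, a type (word / spacing / audio_event), and a speaker id.
curl -s https://api.elevenlabs.io/v1/speech-to-text \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -F model_id=scribe_v1 \
  -F diarize=true \
  -F tag_audio_events=true \
  -F file=@take_01.mp4 \
  | jq '.words[:5]'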
Layer 2 — Visual composite (on demand). timeline_view produces a filmstrip + waveform + word labels PNG for any time range. Called only at decision points: judging whether a pause is a hesitation, comparing two retakes, or verifying a cut lands where it should.
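You can approximate that composite with stock ffmpeg filters. A rough sketch (video-use's own timeline_view is more elaborate; the window and sizes here are arbitrary):
# Tile 10 thumbnails from a 5s window above its waveform in a single PNG.
# fps=2 over 5s yields 10 frames; 10 x 160px thumbnails match the 1600px waveform.
ffmpeg -ss 10 -t 5 -i take_01.mp4 -filter_complex \
  "[0:v]fps=2,scale=160:90,tile=10x1,format=rgba[strip];[0:a]showwavespic=s=1600x120[wave];[strip][wave]vstack" \
  -frames:v 1 timeline_10s_15s.png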
Naive approach: 30,000 frames × 1,500 tokens = 45M tokens of noise.
video-use's approach: 12KB of text + a handful of PNGs.
Same philosophy as browser-use giving an LLM a structured DOM instead of a screenshot — but for video.
Gotchas
- Default transcriber is ElevenLabs (not open source). You can swap in Whisper locally (see the sketch after this list), but timestamp accuracy drops.
- Have your ElevenLabs API key ready before the first run.
- Currently no "pick the best take" logic — pre-filter your footage before dumping it in.
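If you do go the local route, a sketch with the openai-whisper CLI (flag names per its current help output; verify against your installed version):
# Word-level timestamps from a local Whisper run: fully offline,
# but slower and less precise than Scribe.
pip install openai-whisper
whisper take_01.mp4 --model medium --word_timestamps True --output_format json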
Summary
- 7.6k Stars from the browser-use team — quality pedigree
- Chat-based video editing with Claude Code, zero learning curve
- Auto filler-word removal + color grading + subtitles + animations
- Open ffmpeg chain, no creative limitations
- Self-evaluation loop ensures output quality