cmux Official Skills Deep Dive — Terminal Control, Browser Automation & Markdown Viewer

ZenChAIne
Tags: AI, cmux, Claude Code, Terminal

Introduction

What makes cmux an "AI agent terminal" is its official skills — instruction sets published in the cmux repository that teach AI agents how to operate cmux. With these skills loaded, Claude Code and other agents can autonomously control the terminal environment.

Key Takeaways

  • cmux official skills cover three pillars: terminal control, browser automation, and Markdown viewer
  • The browser skill's snapshot + refs approach eliminates CSS selector guessing entirely
  • The Markdown viewer auto-detects file changes and updates in real time
  • Combining all three skills enables a complete development workflow inside cmux

This article continues the cmux series, building on the Getting Started guide, Vol.1 lazygit, and Vol.2 Yazi + Neovim — adding autonomous AI operation capabilities to the environment we've built.

What Are cmux Skills?

The cmux repository (9k+ stars) includes a skills/ directory with several official skills. The three primary ones for end users are:

Skill                 | Purpose            | Primary Use
skills/cmux/          | Topology control   | Pane splitting, workspace management, surface navigation
skills/cmux-browser/  | Browser automation | URL opening, form interaction, screenshots
skills/cmux-markdown/ | Markdown viewer    | Task list display, live document preview

When Claude Code loads these skill files (SKILL.md), the AI can execute these operations autonomously.

Core Skill — AI Controls Your Terminal Layout

The Handle Model — Foundation for All Operations

cmux uses a 4-tier resource model:

Window (macOS window)
  └── Workspace (tab-like groups)
        └── Pane (split containers)
              └── Surface (tabs within panes — terminal or browser)

Each resource gets a handle like window:1, workspace:2, pane:3, surface:7. The AI uses these handles to target operations.
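
Because handles follow a simple kind:index pattern, a script can validate one before passing it to cmux. The following is a minimal sketch in plain bash. The names handle_kind, handle_index, and is_handle are hypothetical helpers, not part of the cmux CLI, and the accepted kinds are only the four listed in the model above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for cmux-style handles such as "surface:7".

handle_kind()  { printf '%s\n' "${1%%:*}"; }  # text before the first colon
handle_index() { printf '%s\n' "${1#*:}"; }   # text after the first colon

# Succeeds only for one of the four resource kinds followed by digits.
is_handle() {
  case "$1" in
    window:*|workspace:*|pane:*|surface:*) ;;
    *) return 1 ;;
  esac
  case "${1#*:}" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric index
  esac
  return 0
}

for h in window:1 workspace:2 pane:3 surface:7; do
  is_handle "$h" && echo "$(handle_kind "$h") -> $(handle_index "$h")"
done
```

Rejecting malformed handles up front keeps a typo such as pane:x from reaching a command that moves or closes something.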

bash
# Check your current position
cmux identify --json
 
# List topology
cmux list-windows
cmux list-workspaces
cmux list-panes
cmux list-pane-surfaces --pane pane:1

Workspace and Pane Operations

bash
# Create a new workspace (for project-level switching)
cmux new-workspace
 
# Split a pane (open new terminal to the right)
cmux new-split right --panel pane:1
 
# Move a surface to another pane
cmux move-surface --surface surface:7 --pane pane:2 --focus true
 
# Reorder workspaces
cmux reorder-workspace --workspace workspace:4 --before workspace:2

Flash and Health Check — Essential for Automation

bash
# Visually flash a specific surface (draw attention)
cmux trigger-flash --surface surface:7
 
# Check surface state (is it visible? detached?)
cmux surface-health --workspace workspace:2

surface-health reports whether the surfaces in a workspace are visible or detached. Checking it before moving focus in an automation loop confirms that the UI has settled, a small step that makes automation noticeably more reliable.
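
One way to use a health check in a loop is to retry it a bounded number of times before acting. This is a sketch under assumptions: retry is a hypothetical helper, and surface_ready is a stand-in predicate. The output format of cmux surface-health is not covered in this article, so the real check is left for you to write around that command.

```bash
#!/usr/bin/env bash
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. Hypothetical helper.
retry() {
  local max=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1                      # give up after MAX failed attempts
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Demo with a stand-in check that only succeeds on its third call.
# In real use this would inspect `cmux surface-health --workspace ...`.
calls=0
surface_ready() { calls=$((calls + 1)); [ "$calls" -ge 3 ]; }

if retry 5 0.1 surface_ready; then
  echo "ready after $calls checks"
fi
```

Bounding the attempts matters in automation: a surface that never becomes healthy should fail the step rather than hang the agent.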

Browser Skill — No More CSS Selector Guessing with snapshot + refs

The browser skill is one of cmux's most powerful features. It gives AI full automated control over the built-in WebKit browser.

The Fundamental Difference from Traditional Browser Automation

Traditional: Fetch entire DOM → Guess CSS selectors → Operate
cmux:        snapshot → refs (e1, e2, ...) → Direct operation

Running snapshot --interactive assigns compact reference numbers (e1, e2, e3...) to every interactive element on the page. The AI operates using these numbers directly. No CSS selector guessing required.

Basic Workflow

bash
# 1. Open browser (surface:7 below stands in for the surface handle this command reports)
cmux --json browser open https://example.com
 
# 2. Wait for page load
cmux browser surface:7 wait --load-state complete --timeout-ms 15000
 
# 3. Get element refs via snapshot
cmux browser surface:7 snapshot --interactive
 
# 4. Operate directly via refs
cmux browser surface:7 fill e1 "hello"
cmux --json browser surface:7 click e2 --snapshot-after
 
# 5. Re-snapshot after DOM changes (refs are invalidated)
cmux browser surface:7 snapshot --interactive

Form Auto-Fill Example

bash
# Auto-fill a signup form
cmux --json browser open https://example.com/signup
cmux browser surface:7 wait --load-state complete --timeout-ms 15000
cmux browser surface:7 snapshot --interactive
cmux browser surface:7 fill e1 "Jane Doe"
cmux browser surface:7 fill e2 "jane@example.com"
cmux --json browser surface:7 click e3 --snapshot-after
cmux browser surface:7 wait --url-contains "/welcome" --timeout-ms 15000

Rich Wait Patterns

bash
# Wait for element by CSS selector
cmux browser surface:7 wait --selector "#ready" --timeout-ms 10000
 
# Wait for text to appear
cmux browser surface:7 wait --text "Success" --timeout-ms 10000
 
# Wait for URL change
cmux browser surface:7 wait --url-contains "/dashboard" --timeout-ms 10000
 
# Wait on JavaScript condition
cmux browser surface:7 wait --function "document.readyState === 'complete'" --timeout-ms 10000

--snapshot-after — Operate and Re-Snapshot in One Command

bash
# Auto re-snapshot after clicking
cmux --json browser surface:7 click e5 --snapshot-after

After any DOM-changing operation (clicks, form submissions), refs are invalidated. --snapshot-after combines the operation and re-snapshot into a single command.
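
The invalidation rule can be enforced in a wrapper so that a script never acts on stale refs. The sketch below is an assumption-laden illustration, not part of cmux: browser and refs_fresh are hypothetical names, fill is treated as leaving refs valid (as in the form example above), and a click invalidates refs unless --snapshot-after is passed.

```bash
#!/usr/bin/env bash
# Hypothetical guard around `cmux browser` that tracks whether refs are fresh.
CMUX_BIN=${CMUX_BIN:-cmux}   # set CMUX_BIN=echo to dry-run without cmux
refs_fresh=0                 # 1 after a snapshot, 0 after a DOM-changing click

browser() {  # usage: browser SURFACE ACTION [ARGS...]
  local action=$2
  case "$action" in
    fill|click)
      # Ref-based actions need refs from a current snapshot.
      if [ "$refs_fresh" -ne 1 ]; then
        echo "refs are stale: run snapshot --interactive first" >&2
        return 1
      fi ;;
  esac
  "$CMUX_BIN" browser "$@" || return
  case "$action" in
    snapshot) refs_fresh=1 ;;
    click)
      case " $* " in
        *" --snapshot-after "*) refs_fresh=1 ;;  # click re-snapshotted
        *) refs_fresh=0 ;;                       # refs now invalid
      esac ;;
  esac
}
```

With CMUX_BIN=echo, the wrapper prints each command instead of running it, which makes the sequencing easy to check: `browser surface:7 click e2` followed by `browser surface:7 fill e1 "x"` fails until another snapshot is taken.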

WKWebView limitations: cmux's browser is WebKit-based, so Chrome DevTools Protocol features (viewport emulation, network interception, screencast recording) return not_supported.

Markdown Skill — Live Preview for Task Lists

When you need to check task lists or documentation during development, there's no need to switch to another app. cmux's Markdown viewer renders rich Markdown right next to your terminal.

Basic Usage

bash
# Open a Markdown file in the viewer
cmux markdown open plan.md
 
# Open in a specific workspace
cmux markdown open design.md --workspace workspace:2

Live File Watch — Automatic Updates

When the file changes, the viewer updates automatically. This is the killer feature.

bash
# Create a task list
cat > plan.md << 'EOF'
# Task List
## Steps
1. Analyze codebase
2. Implement feature
3. Write tests
4. Verify build
EOF
 
# Open in viewer
cmux markdown open plan.md
 
# AI updates progress → viewer auto-refreshes
echo "## Step 1: Complete ✅" >> plan.md

Whether you save from an editor, redirect with echo, or do an atomic file replacement (write to temp, then rename) — the viewer auto-updates.
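
The atomic replacement pattern mentioned above (write to a temp file, then rename) can be sketched as a small bash function. atomic_write is a hypothetical helper name. The temp file is created next to the target so the final mv is a rename within the same directory.

```bash
#!/usr/bin/env bash
# atomic_write FILE: read new content from stdin and swap it into place.
# Readers see either the old file or the new one, never a half-written file.
# Note: mktemp creates the file with mode 0600, which the result inherits.
atomic_write() {
  local target=$1 tmp
  tmp=$(mktemp "${target}.XXXXXX") || return 1
  if cat > "$tmp" && mv -f "$tmp" "$target"; then
    return 0
  fi
  rm -f "$tmp"     # clean up the temp file if anything failed
  return 1
}

# Demo in a throwaway directory.
dir=$(mktemp -d)
printf '# Task List\n1. Analyze codebase\n' | atomic_write "$dir/plan.md"
cat "$dir/plan.md"
rm -rf "$dir"
```

This is a reasonable default when an agent rewrites the whole plan on each update, since the file on disk is always a complete document.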

Visualizing AI Agent Task Progress in Real Time

The most powerful application of live updates is making AI agent task management visible.

Normally, when you give Claude Code a task, logs scroll through the terminal and it is hard to tell how far along the work is. If the AI writes its progress to a Markdown file and you display that file in cmux's viewer, you can track progress in real time from an adjacent pane.

For example, when an AI processes 5 tasks sequentially:

markdown
# Implementation Tasks
 
## Phase 1: Setup
- [x] Project initialization
- [x] Install dependencies
 
## Phase 2: Implementation
- [x] Create API endpoint
- [ ] Frontend implementation ← current
- [ ] Write tests

As the AI completes each task and updates the checkboxes, the viewer reflects changes instantly. You can see "3 out of 5 done" at a glance without following code details.
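
The same "done out of total" figure can be computed from the file itself, for example to print it in a status line. This is a minimal sketch: progress is a hypothetical helper that counts Markdown task-list items with grep, and it is not something cmux provides.

```bash
#!/usr/bin/env bash
# progress FILE: print "<checked>/<total>" for Markdown task-list items.
progress() {
  local file=$1 done_count total
  done_count=$(grep -c -E '^[[:space:]]*- \[[xX]\] ' "$file")
  total=$(grep -c -E '^[[:space:]]*- \[[ xX]\] ' "$file")
  printf '%s/%s\n' "$done_count" "$total"
}

# Demo using the task list shown above.
demo=$(mktemp)
cat > "$demo" << 'EOF'
# Implementation Tasks

## Phase 1: Setup
- [x] Project initialization
- [x] Install dependencies

## Phase 2: Implementation
- [x] Create API endpoint
- [ ] Frontend implementation ← current
- [ ] Write tests
EOF
progress "$demo"   # prints 3/5
rm -f "$demo"
```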

To automate this, add these rules to your project's AGENTS.md:

markdown
## Plan Display
Before starting tasks, write your plan to a .md file and open it in cmux:
    cmux markdown open plan.md
Update checkboxes as you complete each task.
The viewer auto-detects file changes and updates in real time.

Once the AI reads this rule, it will automatically follow the "write plan to file → open in viewer → update progress" flow.
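
The "update progress" step amounts to flipping one checkbox in the plan file. As a rough sketch of what that edit looks like, the hypothetical helper check_task below marks the first unchecked item whose text starts with a given label. It rewrites through a temp file and a rename, so the viewer never sees a partial file.

```bash
#!/usr/bin/env bash
# check_task FILE LABEL: turn the first "- [ ] LABEL..." item into "- [x] ...".
# Returns 1 and leaves FILE untouched if no such unchecked item exists.
check_task() {
  local file=$1 label=$2 tmp
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  awk -v label="$label" '
    !hit && index($0, "- [ ] " label) { sub(/- \[ \] /, "- [x] "); hit = 1 }
    { print }
    END { if (!hit) exit 1 }
  ' "$file" > "$tmp" && mv -f "$tmp" "$file" && return 0
  rm -f "$tmp"
  return 1
}

# Demo in a throwaway directory.
dir=$(mktemp -d)
printf -- '- [x] Create API endpoint\n- [ ] Frontend implementation\n- [ ] Write tests\n' > "$dir/plan.md"
check_task "$dir/plan.md" "Frontend implementation"
cat "$dir/plan.md"
rm -rf "$dir"
```

Labels are matched as plain text, so keep them free of backslashes, which awk's -v option would interpret as escape sequences.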

How Do the Three Skills Work Together?

Combining all three skills creates a development-to-debugging workflow that never leaves cmux.

bash
# 1. Core skill: Prepare layout
cmux new-split right --panel pane:1    # Add pane on the right
 
# 2. Markdown skill: Show task list
cmux markdown open tasks.md            # Task list on the left
 
# 3. Browser skill: Dev server preview
cmux browser open http://localhost:3000  # Browser on the right
 
# 4. AI modifies code → checks browser → updates task list
# Everything happens inside cmux

In this flow:

  • Core skill handles automatic layout configuration
  • Browser skill handles UI verification and testing
  • Markdown skill handles progress visualization

The AI freely combines all three skills to auto-configure the optimal development environment.
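
The combined flow can also be wrapped in a script with a dry-run mode, which is handy for reviewing what an agent is about to do. This is a sketch, not an official tool: cmux_do and setup_dev_layout are hypothetical names, the cmux commands are exactly the three shown above, and the session check relies on the CMUX_SOCKET_PATH variable that cmux sets inside its terminals.

```bash
#!/usr/bin/env bash
CMUX_BIN=${CMUX_BIN:-cmux}

# Run a cmux command, or only print it when DRY_RUN=1.
cmux_do() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s' "$CMUX_BIN"; printf ' %s' "$@"; printf '\n'
  else
    "$CMUX_BIN" "$@"
  fi
}

# setup_dev_layout [TASKS_FILE] [URL]: split, open the task list, open the browser.
setup_dev_layout() {
  local tasks=${1:-tasks.md} url=${2:-http://localhost:3000}
  # Outside dry-run, refuse to run when not inside a cmux session.
  if [ "${DRY_RUN:-0}" != "1" ] && [ -z "${CMUX_SOCKET_PATH:-}" ]; then
    echo "not inside a cmux session (CMUX_SOCKET_PATH is unset)" >&2
    return 1
  fi
  cmux_do new-split right --panel pane:1 &&
  cmux_do markdown open "$tasks" &&
  cmux_do browser open "$url"
}

# Dry run: prints the three commands without executing them.
DRY_RUN=1
setup_dev_layout tasks.md http://localhost:3000
```

Chaining the steps with && stops the setup at the first failure, so a failed split does not leave a browser opened in the wrong pane.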

FAQ

Q. How do I install the official skills?

A. Skill files are not bundled with the cmux app. Download the skills/ directory contents from the cmux GitHub repository and place them in Claude Code's skill directory (~/.claude/skills/ or your project's .claude/skills/). The cmux CLI commands themselves (cmux send, etc.) are automatically available inside any cmux terminal session (the CMUX_SOCKET_PATH environment variable is set automatically).
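
As a rough illustration of the copy step, the hypothetical install_skills function below copies every subdirectory that contains a SKILL.md into a destination directory. It assumes you already have a local copy of the repository's skills/ directory; the paths in the usage comment are examples only.

```bash
#!/usr/bin/env bash
# install_skills SRC DEST: copy each SRC/<name>/ that contains SKILL.md to DEST/<name>.
# An existing DEST/<name> is replaced. Returns 1 if no skill was found.
install_skills() {
  local src=$1 dest=$2 dir name count=0
  mkdir -p "$dest" || return 1
  for dir in "$src"/*/; do
    [ -f "${dir}SKILL.md" ] || continue     # skip non-skill directories
    name=$(basename "$dir")
    rm -rf "${dest:?}/$name"                # drop any older copy first
    cp -R "${dir%/}" "$dest/$name" || return 1
    count=$((count + 1))
    echo "installed $name"
  done
  [ "$count" -gt 0 ]
}

# Example (paths are assumptions about your local setup):
#   install_skills ./cmux/skills "$HOME/.claude/skills"
```

Pointing DEST at a project's .claude/skills/ instead keeps the skills scoped to that project.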

Q. When do browser snapshot refs become invalid?

A. After page navigation, modal open/close, or significant DOM changes. Always re-snapshot with snapshot --interactive after such operations, or use the --snapshot-after flag.

Q. What Markdown formats does the viewer support?

A. Headings (h1–h6), code blocks, tables, lists, blockquotes, bold/italic/strikethrough, links, and inline images. Both light and dark modes are supported.

Q. Does this work on Linux or Windows?

A. cmux is macOS-only (macOS 14.0+).

Summary

cmux's official skills give AI agents "eyes" and "hands" within the terminal environment.

  • Core skill for freely manipulating window and pane layouts
  • Browser skill with snapshot + refs for CSS-selector-free web automation
  • Markdown skill for live-previewing task lists and documents

Combine all three and coding, previewing, and progress tracking all happen inside cmux. If you're using cmux, these skills are a must-have.

At ZenChAIne, we continue to explore and push the boundaries of cmux-powered development.

References