Browser Side Panel (Chrome + Firefox Extension + Daemon)
Goal: Chrome Side Panel (“real sidebar”) summarizes what you see on the current tab. Panel open → navigation → auto summarize (optional) → streaming Markdown rendered in-panel.
Quickstart:
- Install summarize (choose one):
  - `npm i -g @steipete/summarize`
  - `brew install steipete/tap/summarize` (macOS arm64)
- Build/load extension: `apps/chrome-extension/README.md`
- Firefox sidebar build: `pnpm -C apps/chrome-extension build:firefox` (load via `about:debugging` → temporary add-on)
- Open side panel → copy token install command → run:
  - `summarize daemon install --token <TOKEN>` (macOS: LaunchAgent, Linux: systemd user, Windows: Scheduled Task)
- Verify: `summarize daemon status`
- Restart (if needed): `summarize daemon restart`
Firefox notes:
- Sidebar UX differs from Chrome’s side panel (persistent sidebar instead of slide-in panel).
- Firefox testing is limited in Playwright; see `apps/chrome-extension/tests/README-firefox.md`.
- Compatibility details: `apps/chrome-extension/docs/firefox.md`.
Dev (repo checkout):
- Use: `pnpm summarize daemon install --token <TOKEN> --dev` (autostart service runs `src/cli.ts` via `tsx`, no `dist/` build required).
- E2E (Playwright): `pnpm -C apps/chrome-extension test:e2e`
  - First run: `pnpm -C apps/chrome-extension exec playwright install chromium`
  - Headless: `HEADLESS=1 pnpm -C apps/chrome-extension test:e2e` (headful is more reliable for extensions)
Troubleshooting
- “Daemon not reachable”:
  - `summarize daemon status`
  - Logs: `~/.summarize/logs/daemon.err.log`
- “Need extension-side traces”:
  - Options → Logs → `extension.log` (panel/background events).
  - Enable “Extended logging” in Advanced settings for full pipeline traces.
- “Stream ended unexpectedly” / empty chat response:
  - The daemon likely stopped mid-stream. Restart it, then click “Try again”.
  - `summarize daemon restart`
- Tweet video not transcribing / no progress:
  - Ensure `yt-dlp` is available on your PATH (or set `YT_DLP_PATH`) and you have a transcription provider (`whisper.cpp` installed or `OPENAI_API_KEY` / `FAL_KEY`).
  - Re-run `summarize daemon install --token <TOKEN>` to refresh the daemon env snapshot (launchd won’t inherit your shell PATH).
- “Could not establish connection / Receiving end does not exist”:
- The content script wasn’t injected (yet), or Chrome blocked site access.
- Chrome → extension details → “Site access” → “On all sites” (or allow the domain), then reload the tab.
- “… wants to look for and connect to any device on your local network”:
  - Trigger: content scripts (page context) hitting the daemon on `http://127.0.0.1:8787` (hover summaries) can cause Chrome to attribute the request to the current origin and prompt per-site.
  - Fix: hover summaries must proxy daemon calls via the extension background service worker (reload the extension after updating).
  - Verify daemon: `summarize daemon status` (or `curl http://127.0.0.1:8787/health`).
  - Repro/dev: `pnpm -C apps/chrome-extension dev` then enable “Hover summaries” and hover a link.
Architecture
- Extension (MV3, WXT)
  - Side Panel UI: length + typography controls (font family + size), auto/manual toggle.
  - Background service worker: tab + navigation tracking, content extraction, starts summarize runs.
  - Content script: extract readable article text from the rendered DOM via Readability; also detect SPA URL changes.
  - Panel page streams SSE directly (MV3 service workers can be flaky for long-lived streams).
- Daemon (local, autostart service)
  - HTTP server on `127.0.0.1:8787` only.
  - Token-authenticated API.
  - Runs the existing summarize pipeline (env/config-based) and streams tokens to client via SSE.
Data Flow
1) User opens side panel (click extension icon).
2) Panel sends a “ready” message to the background (plus periodic “ping” heartbeats while open).
3) On nav/tab change (and auto enabled): background asks the content script to extract { url, title, text } (best-effort).
4) Background POSTs payload to daemon /v1/summarize with Authorization: Bearer <token>.
5) Panel opens /v1/summarize/<id>/events (SSE) and renders streamed Markdown.
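
Steps 4 and 5 can be sketched as two small helpers. This is a minimal illustration, not the extension's actual code: the helper names (`buildSummarizeRequest`, `eventsUrl`) and the `Content-Type` header are assumptions, and how the panel attaches the token to the SSE request is left out because the doc does not specify it.

```typescript
// Sketch of steps 4 and 5 of the data flow. Helper names are hypothetical.
const DAEMON_ORIGIN = "http://127.0.0.1:8787";

interface PagePayload {
  url: string;
  title: string | null;
  text?: string; // best-effort extracted page text
}

// Step 4: describe the POST that the background sends to the daemon.
function buildSummarizeRequest(token: string, payload: PagePayload) {
  return {
    url: `${DAEMON_ORIGIN}/v1/summarize`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json", // assumed; not stated in this doc
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({ ...payload, mode: "auto" }),
    },
  };
}

// Step 5: the panel streams events for the run id returned by step 4.
function eventsUrl(id: string): string {
  return `${DAEMON_ORIGIN}/v1/summarize/${encodeURIComponent(id)}/events`;
}
```

In use, the background would pass `url` and `init` to `fetch`, read `id` from the `{ ok: true, id }` response, and hand `eventsUrl(id)` to the panel.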
Auto Mode (URL + Page Text)
The extension always sends the same request shape:
- Always: `url`, `title`
- When available: extracted `text` + `truncated`
- `mode: "auto"`
The daemon decides the best pipeline:
- YouTube / video / podcast / direct media URLs → prefer URL pipeline (transcripts, yt-dlp, Whisper, readability, …).
- Normal articles with extracted text → prefer page pipeline (“what you see”).
- Fallback: if the preferred path fails before output starts, try the other input (when available).
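
The decision rules above can be read as "return the pipelines in the order to try them". The sketch below illustrates that reading only. The host list, the extension pattern, and the names `looksLikeMedia` and `pipelineOrder` are hypothetical, and the daemon's real media detection is richer (see docs/media.md).

```typescript
type Pipeline = "url" | "page";

interface AutoInput {
  url: string;
  text?: string; // extracted page text, when available
}

// Hypothetical heuristics, for illustration only.
const MEDIA_HOSTS = ["youtube.com", "www.youtube.com", "youtu.be"];
const MEDIA_EXTENSIONS = /\.(mp3|mp4|m4a|wav|webm)$/i;

function looksLikeMedia(rawUrl: string): boolean {
  const parsed = new URL(rawUrl);
  return MEDIA_HOSTS.indexOf(parsed.hostname) !== -1 || MEDIA_EXTENSIONS.test(parsed.pathname);
}

// Returns pipelines in the order they should be tried. The second entry,
// when present, is the fallback used if the first fails before output starts.
function pipelineOrder(input: AutoInput): Pipeline[] {
  const hasText = typeof input.text === "string" && input.text.trim().length > 0;
  if (looksLikeMedia(input.url)) return hasText ? ["url", "page"] : ["url"];
  return hasText ? ["page", "url"] : ["url"];
}
```

Returning an ordered list keeps the fallback rule explicit: a caller tries the first entry and moves to the next only if nothing has been streamed yet.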
Video Selection (Page vs Video)
When the page contains embedded audio/video, the Summarize button gains a dropdown caret. Click the caret to pick Page vs Video/Audio. Selecting Video/Audio forces URL mode with transcript-first extraction (captions → yt-dlp/Whisper fallback). Selection is per-run (not persisted).
See docs/media.md for detection and transcript rules.
Slides (Side Panel)
- The slides toggle lights up on media-friendly URLs (YouTube watch/shorts, youtu.be, direct media) or when the page reports video/audio. Defaults to Video on those pages.
- Turning slides on refreshes the current summary and requests slide extraction (`yt-dlp`, `ffmpeg`). OCR text is opt-in (Advanced setting) and requires `tesseract`. Missing tools surface a footer notice with install instructions; restart the daemon after installing.
- Slides stay off elsewhere and the toggle is disabled on non-media pages.
SPA Navigation
- Background listens to `chrome.webNavigation.onHistoryStateUpdated` (SPA route changes) and `tabs.onUpdated` (page loads).
- Only triggers summarize when the side panel is open (and auto is enabled).
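
Both listeners can share one gate that decides whether a navigation should start a run. The sketch below is a pure function so it can be reasoned about without the `chrome.*` APIs. The `PanelState` shape, the http(s) filter, and the skipping of repeat URLs are assumptions of this example, not documented behavior.

```typescript
interface PanelState {
  panelOpen: boolean;
  autoEnabled: boolean;
  lastSummarizedUrl: string | null;
}

// Decide whether a navigation event should start a summarize run.
function shouldAutoSummarize(state: PanelState, navigatedUrl: string): boolean {
  // Documented rule: only when the panel is open and auto is enabled.
  if (!state.panelOpen || !state.autoEnabled) return false;
  // Assumption: ignore non-web URLs such as chrome:// or about: pages.
  if (!/^https?:/i.test(navigatedUrl)) return false;
  // Assumption: an SPA route change and a page load may both fire for the
  // same URL, so a URL that was just summarized is skipped.
  return navigatedUrl !== state.lastSummarizedUrl;
}
```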
Markdown Rendering
- Use `markdown-it` in the panel.
- Disable raw HTML: `html: false` (avoid sanitizing libraries).
- `linkify: true`.
- Render links with `target=_blank` + `rel=noopener noreferrer`.
Timestamp Links (Chat)
- When timed transcripts are available, chat context includes `[mm:ss]` lines.
- Assistant is prompted to cite timestamps; clicking them seeks the current media (video/audio) while preserving play/pause state.
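
Turning `[mm:ss]` citations into seek targets comes down to a small parser. The following is a sketch under assumptions: the function name `findTimestamps` is hypothetical, and accepting an optional hour field (`[h:mm:ss]`) goes beyond what this doc states.

```typescript
interface TimestampMatch {
  label: string; // the literal text, e.g. "[12:34]"
  seconds: number; // offset from the start of the media
}

// Find [mm:ss] (and, as an assumption, [h:mm:ss]) markers in assistant text.
function findTimestamps(text: string): TimestampMatch[] {
  const pattern = /\[(?:(\d+):)?(\d{1,2}):(\d{2})\]/g;
  const matches: TimestampMatch[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const hours = m[1] ? Number(m[1]) : 0;
    const minutes = Number(m[2]);
    const seconds = Number(m[3]);
    if (seconds >= 60) continue; // not a valid clock value
    matches.push({ label: m[0], seconds: hours * 3600 + minutes * 60 + seconds });
  }
  return matches;
}
```

A click handler could then assign `seconds` to the media element's `currentTime`. Setting `currentTime` moves the playhead without changing whether the element is paused, which is consistent with preserving play/pause state.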
Model Selection UX
- Settings:
  - Model preset (Options → Advanced): `auto` | `free` | custom string (e.g. `openai/gpt-5-mini`, `openrouter/...`).
  - Length: `short|medium|long|xl|xxl` (or a character target like `20k`). Tooltips show target ranges + paragraph guidance (from `packages/core/src/prompts/summary-lengths.ts`).
  - Language: `auto` (match source) or a tag like `en`, `de`, `pt-BR` (or free-form like “German”).
  - Prompt override (advanced): custom instruction prefix (context + content still appended).
  - Auto summarize: on/off.
  - Hover summaries: on/off (side panel drawer, default off).
  - Typography: font family (dropdown + custom), font size (slider).
- Advanced overrides (Options → Advanced tab).
  - Leave blank to use daemon config/defaults; set a value to override.
  - Chat (advanced): enable/disable the side panel chat input (default on; summary is the first message).
  - Summary timestamps (advanced): include `[mm:ss]` links in summaries for media when available (default on).
  - Slides parallel (advanced): show summary first and extract slides in parallel (default on).
  - Slides OCR text (advanced): allow OCR text as a slide text source (default off).
  - Extended logging: send full input/output to daemon logs (requires daemon logging enabled).
  - Hover summary prompt: customize the prompt used for link hover summaries (prefilled; reset to default).
  - Pipeline mode: `page|url` (default auto).
  - Firecrawl: `off|auto|always`.
  - Markdown mode: `readability|llm|auto|off`.
  - Preprocess: `off|auto|always`.
  - YouTube mode: `no-auto|yt-dlp|web|apify` (default auto).
  - Timeout (e.g. `90s`, `2m`), retries, max output tokens (e.g. `2k`).
  - Process manager: live list of daemon-spawned tools (ffmpeg, yt-dlp, tesseract, etc.) with logs.
- Extension includes current settings in request; daemon treats them like CLI flags (`--model`, `--length`, `--language`, `--prompt`).
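
To make the "treated like CLI flags" mapping concrete, the sketch below turns panel settings into the equivalent argument list. It only illustrates the correspondence: the `PanelSettings` shape and `toCliArgs` are hypothetical, and the daemon does not necessarily build an argv internally.

```typescript
interface PanelSettings {
  model?: string;
  length?: string;
  language?: string;
  prompt?: string;
}

// Translate extension settings into the equivalent CLI arguments.
// Blank values are omitted so daemon config/defaults apply.
function toCliArgs(settings: PanelSettings): string[] {
  const flags: Array<[string, string | undefined]> = [
    ["--model", settings.model],
    ["--length", settings.length],
    ["--language", settings.language],
    ["--prompt", settings.prompt],
  ];
  const args: string[] = [];
  for (const [flag, value] of flags) {
    const trimmed = value === undefined ? "" : value.trim();
    if (trimmed !== "") args.push(flag, trimmed);
  }
  return args;
}
```

For example, a panel set to model `openai/gpt-5-mini` and length `20k` with everything else blank corresponds to `--model openai/gpt-5-mini --length 20k`.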
Token Pairing / Setup Mode
Problem: daemon must be secured; extension must discover and pair with it.
- Side panel “Setup” state:
  - Generates token (random, 32+ bytes).
  - Shows:
    - `summarize daemon install --token <TOKEN>` (macOS: LaunchAgent, Linux: systemd user, Windows: Scheduled Task)
    - `summarize daemon status`
  - “Copy command” button.
- Daemon stores token in `~/.summarize/daemon.json`.
- Extension stores token in `chrome.storage.local`.
- If daemon unreachable or 401: show Setup state + troubleshooting.
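
Token generation in the Setup state might look like the sketch below. The doc only specifies "random, 32+ bytes", so the hex encoding, the function names, and the injectable random source are assumptions made for illustration.

```typescript
type RandomFill = (buffer: Uint8Array) => void;

// Default source: Web Crypto, available in extension pages and recent Node versions.
const webCryptoFill: RandomFill = (buffer) => {
  crypto.getRandomValues(buffer);
};

// Hex encoding is an assumption; the doc only says "random, 32+ bytes".
function generateToken(byteLength = 32, fill: RandomFill = webCryptoFill): string {
  if (byteLength < 32) throw new RangeError("token must be at least 32 bytes");
  const bytes = new Uint8Array(byteLength);
  fill(bytes);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// The command shown in the Setup state for the user to copy.
function installCommand(token: string): string {
  return `summarize daemon install --token ${token}`;
}
```

Passing the random source as a parameter keeps the function deterministic under test while defaulting to a cryptographically secure generator in normal use.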
Daemon Endpoints
- `GET /health`
  - 200 JSON: `{ ok: true, pid }`
- `GET /v1/ping`
  - Requires auth; returns `{ ok: true }`
- `POST /v1/summarize`
  - Headers: `Authorization: Bearer <token>`
  - Body:
    - `url: string` (required)
    - `title: string | null`
    - `model?: string` (e.g. `auto`, `free`, `openai/gpt-5-mini`, …)
    - `length?: string` (e.g. `short`, `xl`, `20k`)
    - `language?: string` (e.g. `auto`, `en`, `de`, `pt-BR`)
    - `prompt?: string` (custom instruction prefix)
    - `mode?: "auto" | "page" | "url"` (default: `"auto"`)
    - `maxCharacters?: number | null` (caps URL-mode extraction before summarization; ignored for extract-only unless explicitly provided)
    - `format?: "text" | "markdown"` (default: `"text"`)
    - `markdownMode?: "readability" | "auto" | "llm" | "off"` (only when `format: "markdown"`)
    - `preprocess?: "off" | "auto" | "always"` (markitdown/HTML preprocess)
    - `extractOnly?: boolean` (when `true`, returns extracted content without summarizing; requires `mode: "url"`)
    - `text?: string` (required for `mode: "page"`; optional for `auto`)
    - `truncated?: boolean` (optional; indicates extracted `text` was shortened)
  - 200 JSON: `{ ok: true, id }`
- `GET /v1/summarize/<id>/slides/events`
  - Headers: `Authorization: Bearer <token>`
  - SSE stream of slide updates (`slides`, `status`, `done`, `error`) independent of summary stream.
- `POST /v1/agent` (SSE by default; JSON via `Accept: application/json` or `?format=json`)
  - Headers: `Authorization: Bearer <token>`
  - Body:
    - `url: string` (required)
    - `title?: string | null`
    - `pageContent: string`
    - `cacheContent?: string` (used for cache key; defaults to `pageContent`)
    - `messages: Array<Message>` (pi-ai format)
    - `model?: string`
    - `length?: string` (e.g. `short`, `xl`, `20k`)
    - `language?: string` (e.g. `auto`, `en`, `de`)
    - `tools?: string[]`
    - `automationEnabled?: boolean`
  - SSE events:
    - `event: chunk` `data: { text }`
    - `event: assistant` `data: { ...assistant }`
    - `event: done` `data: {}`
    - `event: error` `data: { message }`
- `POST /v1/agent/history`
  - Headers: `Authorization: Bearer <token>`
  - Body:
    - `url: string` (required)
    - `pageContent: string`
    - `cacheContent?: string` (used for cache key; defaults to `pageContent`)
    - `model?: string`
    - `length?: string`
    - `language?: string`
    - `automationEnabled?: boolean`
  - 200 JSON: `{ ok: true, messages }`
- `GET /v1/summarize/:id/events` (SSE)
  - `event: chunk` `data: { text }`
  - `event: meta` `data: { model }`
  - `event: status` `data: { text }` (progress messages before output starts)
  - `event: metrics` `data: { elapsedMs, summary, details, summaryDetailed, detailsDetailed }`
  - `event: done` `data: {}`
  - `event: error` `data: { message }`
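
The constraints between `POST /v1/summarize` body fields (`text` required for `mode: "page"`, `extractOnly` requiring `mode: "url"`, `markdownMode` only with `format: "markdown"`) can be captured in a type plus a small check. This is a sketch covering a subset of the fields. The name `validateSummarizeBody` is hypothetical, and whether the daemon rejects or silently ignores a violating request is an assumption.

```typescript
type Mode = "auto" | "page" | "url";

// Subset of the documented body fields, enough to express the cross-field rules.
interface SummarizeRequestBody {
  url: string;
  title: string | null;
  mode?: Mode;
  text?: string;
  truncated?: boolean;
  extractOnly?: boolean;
  format?: "text" | "markdown";
  markdownMode?: "readability" | "auto" | "llm" | "off";
}

// Returns a list of problems; an empty list means the documented rules hold.
function validateSummarizeBody(body: SummarizeRequestBody): string[] {
  const problems: string[] = [];
  const mode: Mode = body.mode === undefined ? "auto" : body.mode;
  if (!body.url) problems.push("url is required");
  if (mode === "page" && !body.text) problems.push('text is required for mode "page"');
  if (body.extractOnly && mode !== "url") problems.push('extractOnly requires mode "url"');
  if (body.markdownMode !== undefined && body.format !== "markdown") {
    problems.push('markdownMode only applies when format is "markdown"');
  }
  return problems;
}
```

Running the check on the client before sending gives clearer errors than waiting for the daemon's response.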
Notes:
- SSE keeps the extension simple + streaming-friendly.
- Requests keyed by `id`; daemon keeps a small in-memory map while streaming.
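
On the client side, consuming the summary event stream means splitting the response into frames and folding `chunk` events into the rendered Markdown. The parser below is a minimal sketch, not the panel's actual implementation: the names are hypothetical, and it assumes every `data:` payload is JSON, as the event shapes listed for the endpoint suggest.

```typescript
interface SseEvent {
  event: string;
  data: unknown;
}

// Parse complete SSE frames from a buffer. Returns the parsed events plus any
// trailing partial frame, which should be prepended to the next network chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop() || "";
  const events: SseEvent[] = [];
  for (const frame of frames) {
    let event = "message"; // SSE default when no event field is present
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.indexOf("event:") === 0) event = line.slice(6).trim();
      else if (line.indexOf("data:") === 0) dataLines.push(line.slice(5).trim());
    }
    if (dataLines.length === 0) continue; // comment or keep-alive frame
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return { events, rest };
}

// Fold chunk events into the Markdown shown in the panel.
function appendChunks(markdown: string, events: SseEvent[]): string {
  for (const e of events) {
    if (e.event === "chunk") markdown += (e.data as { text: string }).text;
  }
  return markdown;
}
```

Returning `rest` lets the caller handle frames that are split across network reads: it keeps the remainder and concatenates it with the next decoded chunk before calling `parseSse` again.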
Daemon Autostart
- CLI commands:
  - `summarize daemon install --token <token> [--port 8787]`
    - Writes `~/.summarize/daemon.json`
    - Installs platform autostart service; verifies `/health`
  - `summarize daemon uninstall`
  - `summarize daemon status`
  - `summarize daemon run` (foreground; used by autostart service)
- Ensure “single daemon”:
  - Stable service name + predictable unit/task path
  - `install` replaces previous install and validates token match
Platform details:
- macOS: LaunchAgent plist in `~/Library/LaunchAgents/<label>.plist`
- Linux: systemd user unit in `~/.config/systemd/user/summarize-daemon.service`
- Windows: Scheduled Task “Summarize Daemon” + `~/.summarize/daemon.cmd`
Docs
- `docs/chrome-extension.md` (this file): architecture + setup + troubleshooting.
- Main `README.md`: link to extension doc and “Quickstart: 2 commands + load unpacked”.
- `apps/chrome-extension/README.md`: extension-specific dev/build/load-unpacked instructions.
Status
- Implemented (daemon + CLI + Chrome extension).