# LLM / summarization mode
By default, `summarize` calls an LLM using direct provider API keys. When CLI tools are
installed and `cli.enabled` is set, auto mode can also use local CLI models (see `docs/cli.md`).
## Defaults
- Default model: `auto`. Override with `SUMMARIZE_MODEL`, the config file (`model`), or `--model`.
## Env
- `.env` (optional): when running the CLI, `summarize` also reads `.env` in the current working directory and merges it into the environment (real env vars win).
- `XAI_API_KEY` (required for `xai/...` models)
- `XAI_BASE_URL` (optional; override xAI API endpoint)
- `OPENAI_API_KEY` (required for `openai/...` models)
- `OPENAI_BASE_URL` (optional; OpenAI-compatible API endpoint, e.g. OpenRouter)
- `OPENAI_USE_CHAT_COMPLETIONS` (optional; force OpenAI chat completions)
- `OPENROUTER_API_KEY` (optional; required for `openrouter/...` models; also used when `OPENAI_BASE_URL` points to OpenRouter)
- `Z_AI_API_KEY` (required for `zai/...` models; supports `ZAI_API_KEY` alias)
- `Z_AI_BASE_URL` (optional; override default Z.AI base URL)
- `GEMINI_API_KEY` (required for `google/...` models; also accepts `GOOGLE_GENERATIVE_AI_API_KEY` / `GOOGLE_API_KEY`)
- `GOOGLE_BASE_URL` / `GEMINI_BASE_URL` (optional; override Google API endpoint)
- `ANTHROPIC_API_KEY` (required for `anthropic/...` models)
- `ANTHROPIC_BASE_URL` (optional; override Anthropic API endpoint)
- `SUMMARIZE_MODEL` (optional; overrides default model selection)
- `CLAUDE_PATH` / `CODEX_PATH` / `GEMINI_PATH` (optional; override CLI binary paths)
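A minimal `.env` in the working directory might look like the following. The keys and values here are illustrative placeholders only; real environment variables take precedence over anything in this file.

```
# Read by summarize from the current working directory; real env vars win.
OPENAI_API_KEY=sk-placeholder            # required for openai/... models
ANTHROPIC_API_KEY=sk-ant-placeholder     # required for anthropic/... models
SUMMARIZE_MODEL=openai/gpt-5-mini        # optional: override default model selection
```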
## Flags
- `--model <model>` - Examples:
  - `cli/codex/gpt-5.2`
  - `cli/claude/sonnet`
  - `cli/gemini/gemini-3-flash-preview`
  - `google/gemini-3-flash-preview`
  - `openai/gpt-5-mini`
  - `zai/glm-4.7`
  - `xai/grok-4-fast-non-reasoning`
  - `google/gemini-2.0-flash`
  - `anthropic/claude-sonnet-4-5`
  - `openrouter/meta-llama/llama-3.3-70b-instruct:free` (force OpenRouter)
- `--cli [provider]` - Examples: `--cli claude`, `--cli gemini`, `--cli codex` (equivalent to `--model cli/<provider>`); `--cli` alone uses auto selection with CLI enabled.
- `--model auto` - See `docs/model-auto.md`.
- `--model <preset>` - Uses a config-defined preset (see `docs/config.md` → “Presets”).
- `--prompt <text>` / `--prompt-file <path>` - Overrides the built-in summary instructions (the prompt becomes the instruction prefix).
  - Prompts are wrapped in `<instructions>`, `<context>`, `<content>` tags.
  - When `--length` is numeric, we add `Output is X characters.` When `--language` is explicitly set, we add `Output should be <language>.`
- `--no-cache` - Bypass summary cache reads and writes only (LLM output). Extract/transcript caches still apply.
- `--cache-stats` - Print cache stats and exit.
- `--clear-cache` - Delete the cache database and exit. Must be used alone.
- `--video-mode auto|transcript|understand` - Only relevant for video inputs / video-only pages.
- `--length short|medium|long|xl|xxl|<chars>` - Soft guidance to the model (no hard truncation).
  - Minimum numeric value: 50 chars.
  - Default: `long`.
  - Output format is Markdown; use short paragraphs and only add bullets when they improve scannability.
- `--force-summary` - Always run the LLM even when extracted content is shorter than the requested length.
- `--max-output-tokens <count>` - Hard cap for output tokens (optional).
  - If omitted, no max-token parameter is sent (provider default).
  - Minimum numeric value: 16.
  - Recommendation: prefer `--length` unless you need a hard cap (some providers count “reasoning” tokens toward the cap).
- `--retries <count>` - LLM retry attempts on timeout (default: 1).
- `--json` - Includes prompt + summary in one JSON object.
## Prompt rules
- Video and podcast summaries omit sponsor/ad/promotional segments; do not include them in the summary.
- Do not mention or acknowledge sponsors/ads, and do not say you skipped or ignored anything.
- If a standout line is present, include 1-2 short exact excerpts formatted as Markdown italics with single asterisks. Do not use quotation marks of any kind (straight or curly); if a title or excerpt would normally use quotes, remove them and optionally italicize the text instead. Apostrophes in contractions are OK. Never include ad/sponsor/boilerplate excerpts and do not mention them; avoid sponsor/ad/promo language, brand names like Squarespace, and CTA phrases like discount code.
- Final check: remove sponsor/ad references or mentions of skipping/ignoring content. Remove any quotation marks. Ensure standout excerpts are italicized; otherwise omit them.
- Hard rules: never mention sponsor/ads; never output quotation marks of any kind (straight or curly), even for titles.
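A final cleanup pass like the one these rules describe (strip straight and curly quotation marks while leaving apostrophes inside contractions alone) might look like this. `strip_quotes` is a hypothetical helper written for illustration, not part of the tool:

```python
import re

def strip_quotes(text: str) -> str:
    # Remove double quotes, straight and curly.
    text = re.sub(r'["\u201c\u201d]', "", text)
    # Remove single/curly quotes unless they sit between letters,
    # which preserves apostrophes in contractions like can't or it's.
    text = re.sub(r"(?<![A-Za-z])['\u2018\u2019]|['\u2018\u2019](?![A-Za-z])", "", text)
    return text

print(strip_quotes('He said "hello" and \u2018waved\u2019.'))
```

Note this crude sketch also drops trailing possessive apostrophes (e.g. in James'); a real implementation would need a sharper rule there.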
## Z.AI
Use `--model zai/<model>` (e.g. `zai/glm-4.7`). Defaults to Z.AI’s base URL and uses chat completions.
## Input limits
- Text prompts are checked against the model’s max input tokens (from the LiteLLM catalog) using a GPT tokenizer.
- Text files over 10 MB are rejected before tokenization.
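The two-stage check could be sketched like this. The 10 MB pre-check comes from the text above; the token counter here is a crude whitespace-split stand-in for the real GPT tokenizer, and the function name is invented for illustration:

```python
MAX_FILE_BYTES = 10 * 1024 * 1024  # files over this are rejected before tokenizing

def check_input(text: str, max_input_tokens: int) -> None:
    # Stage 1: cheap size check before any tokenization work.
    if len(text.encode("utf-8")) > MAX_FILE_BYTES:
        raise ValueError("file exceeds 10 MB; rejected before tokenization")
    # Stage 2: compare a token count against the model's max input tokens.
    # (Stand-in count; the tool uses a GPT tokenizer and the LiteLLM catalog limit.)
    tokens = len(text.split())
    if tokens > max_input_tokens:
        raise ValueError(f"prompt is ~{tokens} tokens, over the limit of {max_input_tokens}")

check_input("a short prompt", max_input_tokens=128_000)  # passes silently
```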
## PDF attachments
- For PDF inputs, `--preprocess auto` will send the PDF directly to Anthropic/OpenAI/Gemini when a fixed model supports documents; otherwise we fall back to markitdown.
- `--preprocess always` forces markitdown (no direct attachments).
- Streaming is disabled for document attachments.
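The routing described above reduces to a small decision, sketched here under stated assumptions: `pdf_route` and the `supports_documents` flag are hypothetical names, and the real tool's capability check is more involved.

```python
def pdf_route(preprocess: str, model_is_fixed: bool, supports_documents: bool) -> str:
    # --preprocess always forces markitdown conversion, never direct attachment.
    if preprocess == "always":
        return "markitdown"
    # --preprocess auto attaches the PDF directly only when a fixed model
    # is known to accept document attachments (streaming is disabled then).
    if preprocess == "auto" and model_is_fixed and supports_documents:
        return "direct-attachment"
    # Everything else falls back to markitdown.
    return "markitdown"

print(pdf_route("auto", model_is_fixed=True, supports_documents=True))
```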