Extract mode
--extract prints the extracted content and exits.
Deprecated alias: --extract-only.
Notes
- No summarization LLM call happens in this mode.
- No extraction cap is applied. Use --max-extract-characters <count> to cap output if needed.
- --format md may still convert HTML to Markdown (depending on --markdown-mode and available tools).
- --length is intended for summarization guidance; extraction prints full content.
- --timestamps keeps the plain transcript text but also exposes transcriptSegments and transcriptTimedText (JSON) and prints a timed transcript block when available.
- --slides runs slide detection (YouTube/direct video URLs). Slide metadata is included in JSON output and written to slides.json in the slide directory.
  - When combined with --extract for videos that have timed transcripts, the CLI interleaves slide images inline at matching timestamps.
  - Scene detection auto-tunes using sampled frame hashes.
- For non-YouTube URLs with --format md, the CLI uses Readability article HTML as the default Markdown input (--markdown-mode readability).
  - Use --markdown-mode auto to prefer LLM/markitdown conversion without Readability preprocessing.
  - Use --markdown-mode llm to force an LLM conversion.
  - Use --firecrawl always to try Firecrawl first.
- For non-YouTube URLs with --format md, --markdown-mode auto can convert HTML to Markdown via an LLM when configured.
  - Force it with --markdown-mode llm.
  - If no LLM is configured, --markdown-mode auto may fall back to uvx markitdown when available.
- --markdown-mode readability uses Readability to extract article HTML before Markdown conversion.
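
The Markdown conversion notes above can be read as a small decision table. The following is a minimal sketch of that reading, not the CLI's actual implementation. The function name pick_markdown_path, its parameters, and the returned labels are hypothetical, and the behavior when no converter is available (or when llm is forced without a configured LLM) is an assumption.

```python
def pick_markdown_path(mode: str, llm_configured: bool, markitdown_available: bool) -> str:
    """Return a label for the conversion path suggested by the notes above.

    Hypothetical helper: the labels and the error behavior are assumptions,
    not the CLI's real internals.
    """
    if mode == "readability":
        # Documented default for non-YouTube URLs with --format md:
        # Readability article HTML is the Markdown input.
        return "readability"
    if mode == "llm":
        # Forced LLM conversion. Assumption: this fails without a configured LLM.
        if not llm_configured:
            raise ValueError("--markdown-mode llm needs a configured LLM (assumed)")
        return "llm"
    if mode == "auto":
        # auto prefers an LLM when one is configured...
        if llm_configured:
            return "llm"
        # ...and may fall back to uvx markitdown when it is available.
        if markitdown_available:
            return "markitdown"
        # Assumption: with no converter available, nothing is converted.
        return "none"
    raise ValueError(f"unknown markdown mode: {mode}")
```

For example, pick_markdown_path("auto", False, True) selects the markitdown fallback, which matches the note about auto mode when no LLM is configured.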
Daemon note:
/v1/summarize supports format: "markdown" + markdownMode for extract-only output (use extractOnly: true).
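
As a rough illustration of the daemon note, the sketch below builds a JSON body for an extract-only request. Only format, markdownMode, and extractOnly come from the note above; the url field name and the helper build_extract_request are assumptions, and the daemon's host, port, and HTTP method are not shown because they are not stated here.

```python
import json


def build_extract_request(url: str, markdown_mode: str = "readability") -> str:
    """Serialize a request body for /v1/summarize in extract-only mode.

    Hypothetical helper: the "url" key is an assumed field name.
    """
    payload = {
        "url": url,                     # assumption: not named in the note above
        "format": "markdown",           # from the daemon note
        "markdownMode": markdown_mode,  # from the daemon note
        "extractOnly": True,            # from the daemon note
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_extract_request("https://example.com/article", "auto"))
```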