cspresso

CSPresso logo

Crawl up to N pages of a site using headless Chromium (via Playwright), observe which assets are loaded, and emit a draft Content Security Policy (CSP).

Project site: https://cspresso.cafe

This is meant as a starting point. Review and tighten the resulting policy before enforcing it.

Why "draft"?

  • A crawl rarely covers all user flows (auth-only pages, A/B tests, conditional loads, etc.).
  • Inline script/style handling is tricky:
    • If your pages use nonces, you must generate a fresh nonce per HTML response and insert it in both the CSP header and the HTML tags (see the example after this list).
    • Hashes work only if the inline content is stable byte-for-byte.
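
For example, a nonce-based policy pairs a value in the header with a matching attribute on each inline tag (the nonce value here is illustrative):

Content-Security-Policy: script-src 'nonce-d2xkQ9rA'
<script nonce="d2xkQ9rA">...</script>

A hash-based policy instead commits to the exact bytes of the inline content, e.g. script-src 'sha256-<base64 digest of the script body>', so any change to the inline code invalidates the hash.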

Requirements

  • Python 3.10+
  • Poetry
  • Playwright's Chromium browser binaries (auto-installed by this tool if missing)

Install

If you are using my artifacts from the Releases page, you may wish to verify their GPG signatures with my public key.

The key can be found at https://mig5.net/static/mig5.asc. The fingerprint is 00AE817C24A10C2540461A9C1D7CDE0234DB458D.
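
For example, assuming each release artifact has a detached .asc signature alongside it (file names here are illustrative):

curl -O https://mig5.net/static/mig5.asc
gpg --import mig5.asc
gpg --verify CSPresso.AppImage.asc CSPresso.AppImage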

Poetry

poetry install

pip/pipx

pip install cspresso
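
Or with pipx, which installs the CLI into its own isolated environment:

pipx install cspresso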

AppImage

Download the CSPresso.AppImage from the releases page, make it executable with chmod +x, and run it.
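
For example, assuming the AppImage accepts the same arguments as the cspresso command:

chmod +x CSPresso.AppImage
./CSPresso.AppImage https://example.com --max-pages 10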

Run

cspresso https://example.com --max-pages 10

The tool will:

  1. attempt to launch headless Chromium
  2. run python -m playwright install chromium if Chromium isn't installed
  3. crawl same-origin links up to the page limit
  4. print the visited URLs and a CSP header
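
If you use --no-install (or want to pre-install the browser, e.g. in CI), you can run the same install step yourself, pointing PLAYWRIGHT_BROWSERS_PATH at the directory the tool expects (./.pw-browsers by default):

PLAYWRIGHT_BROWSERS_PATH=./.pw-browsers poetry run python -m playwright install chromium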

Where Playwright installs browsers

By default, this project installs Playwright browsers into a local folder: ./.pw-browsers. This makes installs deterministic and easy to cache in CI.

You can override with --browsers-path or by setting PLAYWRIGHT_BROWSERS_PATH yourself.
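
For example (the directory here is illustrative):

poetry run cspresso https://example.com --browsers-path /opt/pw-browsers

or, via the environment variable:

PLAYWRIGHT_BROWSERS_PATH=/opt/pw-browsers poetry run cspresso https://example.com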

Linux notes

If Chromium fails to start due to missing system libraries, try:

poetry run cspresso https://example.com --with-deps

That runs python -m playwright install --with-deps chromium (may require sudo depending on your environment).

Output

Default output is a single CSP header line.
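
Purely as an illustration (the directives and sources in real output depend entirely on what the crawl observes), a draft header might look something like:

Content-Security-Policy: default-src 'self'; script-src 'self' https://cdn.example.com; style-src 'self'; img-src 'self' data:; connect-src 'self'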

For JSON:

poetry run cspresso https://example.com --json
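
The JSON form is convenient for post-processing, for example piping through jq:

poetry run cspresso https://example.com --json | jq .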

Full usage info

usage: cspresso [-h] [--max-pages MAX_PAGES] [--timeout-ms TIMEOUT_MS] [--settle-ms SETTLE_MS] [--headed] [--no-install] [--with-deps] [--browsers-path BROWSERS_PATH] [--allow-blob] [--unsafe-eval]
                [--upgrade-insecure-requests] [--include-sourcemaps] [--ignore-non-html] [--json]
                url

Crawl up to N pages (same-origin) with Playwright and generate a draft CSP.

positional arguments:
  url                   Start URL (e.g. https://example.com)

options:
  -h, --help            show this help message and exit
  --max-pages MAX_PAGES
                        Maximum number of pages to visit (default: 10)
  --timeout-ms TIMEOUT_MS
                        Navigation timeout in ms (default: 20000)
  --settle-ms SETTLE_MS
                        Extra time after networkidle to allow hydration/delayed requests (default: 1500)
  --headed              Run with a visible browser window (not headless)
  --no-install          Do not auto-install Chromium if missing
  --with-deps           When installing, include Playwright OS deps (Linux). May require elevated privileges.
  --browsers-path BROWSERS_PATH
                        Directory to install Playwright browsers into (default: ./.pw-browsers).
  --allow-blob          Include blob: in common directives (drafty)
  --unsafe-eval         Include 'unsafe-eval' in script-src (not recommended)
  --upgrade-insecure-requests
                        Add upgrade-insecure-requests directive
  --include-sourcemaps  Analyze JS/CSS for sourceMappingURL and add map origins to connect-src
  --ignore-non-html     Ignore non-HTML pages that get crawled (which might trigger Chromium's word-wrap hash: https://stackoverflow.com/a/69838710)
  --json                Output JSON instead of a header line