cspresso
Crawl up to N pages of a site using a headless Chromium (via Playwright), observe what assets are loaded, and emit a draft Content Security Policy (CSP).
This is meant as a starting point. Review and tighten the resulting policy before enforcing it.
Why "draft"?
- A crawl rarely covers all user flows (auth-only pages, A/B tests, conditional loads, etc.).
- Inline script/style handling is tricky:
  - If your pages use nonces, you must generate a new nonce per HTML response and insert it both in the CSP header and in the HTML tags.
  - Hashes work only if the inline content is stable byte-for-byte.
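Both mechanisms can be produced with the standard library alone. The sketch below generates a fresh per-response nonce and computes a `'sha256-…'` source for a stable inline script; the inline script content is a made-up example:

```python
import base64
import hashlib
import secrets

# Per-response nonce: generate a fresh value for every HTML response and
# use the same value in the CSP header and the <script nonce="..."> attribute.
nonce = secrets.token_urlsafe(16)
header_nonce = f"script-src 'nonce-{nonce}'"

# Hash: only valid while the inline content stays byte-for-byte identical.
inline_script = "console.log('hello');"  # example inline content
digest = hashlib.sha256(inline_script.encode("utf-8")).digest()
header_hash = f"script-src 'sha256-{base64.b64encode(digest).decode('ascii')}'"

print(header_nonce)
print(header_hash)
```

Any whitespace or encoding change to the inline content invalidates the hash, which is why a crawl can only propose hashes, not guarantee them.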
Requirements
- Python 3.10+
- Poetry
- Playwright's Chromium browser binaries (auto-installed by this tool if missing)
Install
Poetry
poetry install
pip/pipx
pip install cspresso
AppImage
Download the CSPresso.AppImage from the releases page, make it executable with chmod +x, and run it.
Run
poetry run cspresso https://example.com --max-pages 10
The tool will:
- attempt to launch Chromium headless
- if Chromium isn't installed, it will run: python -m playwright install chromium
- crawl same-origin links up to the page limit
- print the visited URLs and a CSP header
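The same-origin restriction in the crawl step can be sketched with the standard library (a simplified illustration, not the tool's actual code):

```python
from urllib.parse import urljoin, urlsplit

def same_origin_links(base_url: str, hrefs: list[str]) -> list[str]:
    """Resolve hrefs against the page URL and keep only same-origin ones."""
    base = urlsplit(base_url)
    keep = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        target = urlsplit(absolute)
        # Same origin = same scheme and same host:port.
        if (target.scheme, target.netloc) == (base.scheme, base.netloc):
            keep.append(absolute)
    return keep

links = same_origin_links(
    "https://example.com/start",
    ["/about", "https://example.com/docs", "https://cdn.example.net/app.js"],
)
print(links)  # the cross-origin CDN URL is filtered out
```

Cross-origin URLs are still *observed* as loaded assets for the policy; they are just never navigated to.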
Where Playwright installs browsers
By default, this project installs Playwright browsers into a local folder: ./.pw-browsers.
This makes installs deterministic and easy to cache in CI.
You can override with --browsers-path or by setting PLAYWRIGHT_BROWSERS_PATH yourself.
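For example, to share one browser cache across projects (the cache path here is just an illustration):

```shell
# Override via environment variable (picked up by all Playwright-based commands):
export PLAYWRIGHT_BROWSERS_PATH="$HOME/.cache/pw-browsers"
poetry run cspresso https://example.com

# ...or per invocation via the CLI flag:
poetry run cspresso https://example.com --browsers-path "$HOME/.cache/pw-browsers"
```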
Linux notes
If Chromium fails to start due to missing system libraries, try:
poetry run cspresso https://example.com --with-deps
That runs python -m playwright install --with-deps chromium (may require sudo depending on your environment).
Output
Default output is a single CSP header line.
For JSON:
poetry run cspresso https://example.com --json
Full usage info
usage: csp-crawl [-h] [--max-pages MAX_PAGES] [--timeout-ms TIMEOUT_MS] [--settle-ms SETTLE_MS] [--headed] [--no-install] [--with-deps] [--browsers-path BROWSERS_PATH] [--allow-blob] [--unsafe-eval]
[--upgrade-insecure-requests] [--include-sourcemaps] [--json]
url
Crawl up to N pages (same-origin) with Playwright and generate a draft CSP.
positional arguments:
url Start URL (e.g. https://example.com)
options:
-h, --help show this help message and exit
--max-pages MAX_PAGES
Maximum number of pages to visit (default: 10)
--timeout-ms TIMEOUT_MS
Navigation timeout in ms (default: 20000)
--settle-ms SETTLE_MS
Extra time after networkidle to allow hydration/delayed requests (default: 1500)
--headed Run with a visible browser window (not headless)
--no-install Do not auto-install Chromium if missing
--with-deps When installing, include Playwright OS deps (Linux). May require elevated privileges.
--browsers-path BROWSERS_PATH
Directory to install Playwright browsers into (default: ./.pw-browsers).
--allow-blob Include blob: in common directives (loosens the draft policy)
--unsafe-eval Include 'unsafe-eval' in script-src (not recommended)
--upgrade-insecure-requests
Add upgrade-insecure-requests directive
--include-sourcemaps Analyze JS/CSS for sourceMappingURL and add map origins to connect-src
--json Output JSON instead of a header line