Initial commit
commit fe58397da7
17 changed files with 2547 additions and 0 deletions
112
README.md
Normal file
# cspresso

<div align="center">
  <img src="https://git.mig5.net/mig5/cspresso/raw/branch/main/cspresso.svg" alt="CSPresso logo" width="240" />
</div>

Crawl up to *N* pages of a site using a headless Chromium (via Playwright), observe what assets are loaded, and emit a **draft** Content Security Policy (CSP).

This is meant as a **starting point**. Review and tighten the resulting policy before enforcing it.

## Why "draft"?

- A crawl rarely covers all user flows (auth-only pages, A/B tests, conditional loads, etc.).
- Inline script/style handling is tricky:
  - If your pages use nonces, you must generate a **new nonce per HTML response** and insert it both in the CSP header and in the HTML tags.
  - Hashes work only if the inline content is stable *byte-for-byte*.

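
The two inline-content options above can be sketched in a few lines of Python. This is a minimal illustration, not code from cspresso: the helper names `make_nonce` and `inline_hash` are hypothetical, and the policy shown is only an example.

```python
import base64
import hashlib
import secrets


def make_nonce() -> str:
    """Return a fresh, unpredictable nonce. Call this once per HTML response."""
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


def inline_hash(content: str) -> str:
    """Return a CSP hash source for the exact bytes of an inline script or style."""
    digest = hashlib.sha256(content.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


# The same nonce value must appear in the header and in the tag it authorises.
nonce = make_nonce()
header = f"Content-Security-Policy: script-src 'self' 'nonce-{nonce}'"
tag = f'<script nonce="{nonce}">console.log("hi")</script>'

# Hashes cover every byte, so a single extra space yields a different source.
assert inline_hash("console.log(1)") != inline_hash("console.log(1) ")
```

Because the hash covers every byte, templating that injects per-request values into an inline script generally rules out hashes and leaves nonces as the practical option.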
## Requirements

- Python 3.10+
- Poetry
- Playwright's Chromium browser binaries (auto-installed by this tool if missing)

## Install

### Poetry

```bash
poetry install
```

### pip/pipx

```bash
pip install cspresso
```

### AppImage

Download the CSPresso.AppImage from the releases page, make it executable with `chmod +x`, and run it.

## Run

```bash
poetry run cspresso https://example.com --max-pages 10
```

The tool will:

1) attempt to launch headless Chromium
2) if Chromium isn't installed, run `python -m playwright install chromium` (unless `--no-install` is given)
3) crawl same-origin links up to the page limit
4) print the visited URLs and a CSP header

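
Step 3 can be pictured as a bounded walk over same-origin links. The sketch below is not cspresso's implementation: it assumes a breadth-first order, and `get_links` is a hypothetical callback standing in for whatever extracts hrefs from a rendered page.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return (scheme, host, port), filling in the default port when omitted."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or _DEFAULT_PORTS.get(parts.scheme))


def crawl_order(start: str, get_links, max_pages: int = 10) -> list:
    """Visit at most max_pages URLs that share the start URL's origin."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            # Resolve relative links and drop fragments so /b and /b#top count once.
            target, _fragment = urldefrag(urljoin(url, href))
            if target not in seen and origin(target) == origin(start):
                seen.add(target)
                queue.append(target)
    return visited


# A tiny in-memory "site" used in place of a real browser.
site = {
    "https://example.com/": ["/a", "https://other.example/x", "/b#top"],
    "https://example.com/a": ["/", "/c"],
}
pages = crawl_order("https://example.com/", lambda u: site.get(u, []), max_pages=3)
```

Here `pages` holds the start URL followed by `/a` and `/b`; the cross-origin link is skipped and `/c` is cut off by the page limit.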
## Where Playwright installs browsers

By default, this project installs Playwright browsers into a local folder: `./.pw-browsers`.
This makes installs deterministic and easy to cache in CI.

You can override this with `--browsers-path` or by setting `PLAYWRIGHT_BROWSERS_PATH` yourself.

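
One way to picture the override rules is a small resolver that picks a directory and exports it before Playwright starts. This is a sketch under assumptions: the function name `resolve_browsers_path` is hypothetical, and the precedence shown (flag, then existing environment variable, then the local default) is a guess rather than documented behaviour.

```python
import os
from pathlib import Path

DEFAULT_BROWSERS_DIR = "./.pw-browsers"  # default stated in this README


def resolve_browsers_path(cli_value=None, environ=None) -> str:
    """Choose the browsers directory and export it as PLAYWRIGHT_BROWSERS_PATH."""
    env = os.environ if environ is None else environ
    # Assumed precedence: explicit flag, then a pre-set variable, then the default.
    chosen = cli_value or env.get("PLAYWRIGHT_BROWSERS_PATH") or DEFAULT_BROWSERS_DIR
    path = str(Path(chosen).expanduser().resolve())
    # Playwright reads this variable, so it has to be set before any launch or install.
    env["PLAYWRIGHT_BROWSERS_PATH"] = path
    return path


# Demonstrated against a plain dict so the real environment is left untouched.
fake_env = {}
resolved = resolve_browsers_path(None, fake_env)
```

Resolving to an absolute path keeps the location stable even if the working directory changes between the install step and the browser launch.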
## Linux notes

If Chromium fails to start due to missing system libraries, try:

```bash
poetry run cspresso https://example.com --with-deps
```

That runs `python -m playwright install --with-deps chromium` (may require sudo depending on your environment).

## Output

Default output is a single CSP header line.

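
To illustrate what a single header line looks like, the sketch below joins a mapping of directives to sources. It is not taken from cspresso: `build_csp_header` is a hypothetical name, and the exact prefix, ordering, and directives the tool prints may differ.

```python
def build_csp_header(directives: dict) -> str:
    """Join directive -> set-of-sources pairs into one CSP header line."""
    parts = []
    for name in sorted(directives):
        sources = " ".join(sorted(directives[name]))
        # Valueless directives (e.g. upgrade-insecure-requests) have no sources.
        parts.append(f"{name} {sources}".strip())
    return "Content-Security-Policy: " + "; ".join(parts)


example = build_csp_header(
    {
        "default-src": {"'self'"},
        "script-src": {"'self'", "https://cdn.example"},
        "upgrade-insecure-requests": set(),
    }
)
```

Sorting directives and sources keeps the line stable between runs, which makes successive drafts easy to diff.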
For JSON:

```bash
poetry run cspresso https://example.com --json
```

## Full usage info

```
usage: csp-crawl [-h] [--max-pages MAX_PAGES] [--timeout-ms TIMEOUT_MS] [--settle-ms SETTLE_MS] [--headed] [--no-install] [--with-deps] [--browsers-path BROWSERS_PATH] [--allow-blob] [--unsafe-eval]
                 [--upgrade-insecure-requests] [--include-sourcemaps] [--json]
                 url

Crawl up to N pages (same-origin) with Playwright and generate a draft CSP.

positional arguments:
  url                   Start URL (e.g. https://example.com)

options:
  -h, --help            show this help message and exit
  --max-pages MAX_PAGES
                        Maximum number of pages to visit (default: 10)
  --timeout-ms TIMEOUT_MS
                        Navigation timeout in ms (default: 20000)
  --settle-ms SETTLE_MS
                        Extra time after networkidle to allow hydration/delayed requests (default: 1500)
  --headed              Run with a visible browser window (not headless)
  --no-install          Do not auto-install Chromium if missing
  --with-deps           When installing, include Playwright OS deps (Linux). May require elevated privileges.
  --browsers-path BROWSERS_PATH
                        Directory to install/playwright browsers (default: ./.pw-browsers).
  --allow-blob          Include blob: in common directives (drafty)
  --unsafe-eval         Include 'unsafe-eval' in script-src (not recommended)
  --upgrade-insecure-requests
                        Add upgrade-insecure-requests directive
  --include-sourcemaps  Analyze JS/CSS for sourceMappingURL and add map origins to connect-src
  --json                Output JSON instead of a header line
```

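
The `--include-sourcemaps` description can be made concrete with a small sketch of finding a `sourceMappingURL` comment and reducing it to an origin. This is an illustration only, not the tool's code: `source_map_origin` and the regular expression are assumptions about one reasonable way to do it.

```python
import re
from typing import Optional
from urllib.parse import urljoin, urlsplit

# Matches both "//# sourceMappingURL=..." (JS) and "/*# sourceMappingURL=... */" (CSS).
_SOURCE_MAP_RE = re.compile(r"(?://|/\*)[#@]\s*sourceMappingURL=([^\s*]+)")


def source_map_origin(asset_url: str, body: str) -> Optional[str]:
    """Return the origin hosting the asset's source map, or None if there is none."""
    matches = _SOURCE_MAP_RE.findall(body)
    if not matches:
        return None
    ref = matches[-1]  # the last such comment in a file is the one that applies
    if ref.startswith("data:"):
        return None  # inline maps need no network fetch
    parts = urlsplit(urljoin(asset_url, ref))
    return f"{parts.scheme}://{parts.netloc}"


js = "console.log(1)\n//# sourceMappingURL=https://maps.example.net/app.js.map\n"
map_origin = source_map_origin("https://example.com/static/app.js", js)
```

A relative reference resolves against the asset's own URL, so in that case the map origin is simply the origin the script or stylesheet was served from.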