README.md update

Include files from /usr/local/bin and /usr/local/etc in harvest (assuming they aren't binaries or symlinks) and store in usr_local_custom role, similar to etc_custom.
Add diff mode
2025-12-18 17:24:45 +11:00 · 2025-12-18 17:11:04 +11:00 · 2025-12-18 14:59:51 +11:00 · 2025-12-18 13:50:00 +11:00 · 2025-12-18 13:45:59 +11:00 · 2025-12-18 13:44:26 +11:00
18 changed files with 1715 additions and 232 deletions
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@ -0,0 +1,25 @@
+repos:
+  - repo: https://github.com/pycqa/flake8
+    rev: 7.3.0
+    hooks:
+      - id: flake8
+        args: ["--select=F"]
+        types: [python]
+
+  - repo: https://github.com/psf/black-pre-commit-mirror
+    rev: 25.11.0
+    hooks:
+      - id: black
+        language_version: python3
+
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.4.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+
+  - repo: https://github.com/PyCQA/bandit
+    rev: 1.9.2
+    hooks:
+    - id: bandit
+      files: ^enroll/
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -1,3 +1,13 @@
+# 0.1.2
+
+ * Include files from `/usr/local/bin` and `/usr/local/etc` in harvest (assuming they aren't binaries or
+   symlinks) and store in `usr_local_custom` role, similar to `etc_custom`.
+
+# 0.1.1
+
+ * Add `diff` subcommand which can compare two harvests and send email or webhook notifications in different
+   formats.
+
 # 0.1.0

 * Add remote mode for harvesting a remote machine via a local workstation (no need to install enroll remotely)
@ -39,7 +49,7 @@

 * Merge pkg_ and roles created based on file/service detection
 * Avoid idempotency issue with users (`password_lock`)
- * Rename subcommands/args ('export' is now 'enroll', '--bundle' is now '--harvest') 
+ * Rename subcommands/args ('export' is now 'enroll', '--bundle' is now '--harvest')
 * Don't try and start systemd services that were Inactive at harvest time
 * Capture miscellaneous files in /etc under their own `etc_custom` role, but not backup files
 * Add tests
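The 0.1.1 entry above describes `diff` as a comparison of two harvests. The added/removed part of such a report can be pictured as plain set arithmetic. This is a minimal sketch, assuming each harvest has already been reduced to a flat list of package names; `diff_names` and the sample lists are hypothetical and are not taken from enroll's `compare_harvests`.

```python
from typing import Dict, Iterable, List


def diff_names(old: Iterable[str], new: Iterable[str]) -> Dict[str, List[str]]:
    """Return the names that were added to or removed from `old` to reach `new`."""
    old_set, new_set = set(old), set(new)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
    }


# Hypothetical package lists from two snapshots of the same host.
old_packages = ["curl", "nginx", "vim"]
new_packages = ["curl", "nginx", "postgresql"]

changes = diff_names(old_packages, new_packages)
print(changes)
```

The same helper would apply to any collection identified by name (services, users), which is why it takes generic iterables rather than a package-specific type.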
--- a/README.md
+++ b/README.md
@ -4,217 +4,176 @@
  <img src="https://git.mig5.net/mig5/enroll/raw/branch/main/enroll.svg" alt="Enroll logo" width="240" />
 </div>

-**enroll** inspects a Linux machine (currently Debian-only) and generates Ansible roles for things it finds running on the machine.
+**enroll** inspects a Linux machine (currently Debian-only) and generates Ansible roles/playbooks (and optionally inventory) for what it finds.

 It aims to be **optimistic and noninteractive**:
- Detects packages that have been installed
- Detects Debian package ownership of `/etc` files using dpkg's local database.
+- Detects packages that have been installed.
+- Detects Debian package ownership of `/etc` files using dpkg’s local database.
 - Captures config that has **changed from packaged defaults** (dpkg conffile hashes + package md5sums when available).
 - Also captures **service-relevant custom/unowned files** under `/etc/<service>/...` (e.g. drop-in config includes).
 - Defensively excludes likely secrets (path denylist + content sniff + size caps).
- Captures non-system users that exist on the system, and their SSH public keys
- Captures miscellaneous `/etc` files that it can't attribute to a package, and installs it in an `etc_custom` role
- Avoids trying to start systemd services that were detected as being Inactive during harvest
+- Captures non-system users and their SSH public keys.
+- Captures miscellaneous `/etc` files it can’t attribute to a package and installs them in an `etc_custom` role.
+- Likewise captures files under `/usr/local/bin` and `/usr/local/etc` (skipping binaries and symlinks) and installs them in a `usr_local_custom` role.
+- Avoids trying to start systemd services that were detected as inactive during harvest.

 ---

-# Two modes: single-site vs multi-site (`--fqdn`)
+## Mental model

-**enroll** has two distinct ways to generate Ansible:
+`enroll` works in two phases:

-## 1) Single-site mode (default: *no* `--fqdn`)
-Use this when you're enrolling **one server** (or you're generating a "golden" role set you intend to reuse).
+1) **Harvest**: collect host facts + relevant files into a harvest bundle (`state.json` + harvested artifacts)
+2) **Manifest**: turn that harvest into Ansible roles/playbooks (and optionally inventory)

-**What you get**
- Config, templates, and defaults are primarily **contained inside each role**.
- Raw config files (when not templated) live in the role's `files/`.
- Template variables (when templated) live in the role's `defaults/main.yml`.
+Additionally:

-**Pros**
- Roles are more **self-contained** and easier to understand.
- Better starting point for **provisioning new servers**, because the role contains most of what it needs.
- Less inventory abstraction/duplication.
+- **Diff**: compare two harvests and report what changed (packages/services/users/files) since the previous snapshot.

-**Cons**
- Less convenient for quickly enrolling multiple hosts with divergent configs (you'll do more manual work to make roles flexible across hosts).
+---

-## 2) Multi-site mode (`--fqdn`)
-Use this when you want to enroll **several existing servers** quickly, especially if they differ.
+## Output modes: single-site vs multi-site (`--fqdn`)

-**What you get**
- Roles are **shared** across hosts, but host-specific data lives in inventory.
- Host inventory drives what's managed:
-  - which files to deploy for that host
-  - which packages are relevant for that host
-  - which services should be enabled/started for that host
- For non-templated config, raw files live in host-specific inventory under `.files/` (per role).
+`enroll manifest` (and `enroll single-shot`) support two distinct output styles.

-**Pros**
- Fastest way to retrofit **multiple servers** into config management.
- Avoids shared-role "host A breaks host B" problems by keeping host-specific state in inventory.
- Better fit when you already have a fleet and want to capture/reflect reality first.
+### Single-site mode (default: *no* `--fqdn`)
+Use when enrolling **one server** (or generating a “golden” role set you intend to reuse).

-**Cons**
- More abstraction: roles become more "data-driven".
- Potential duplication: raw files may exist per-host in inventory (even if identical).
- Harder to use the roles to **provision a brand-new server** without also building an inventory for that new host, because multi-site output assumes the server already exists and is being retrofitted.
+**Characteristics**
+- Roles are more self-contained.
+- Raw config files live in the role’s `files/`.
+- Template variables live in the role’s `defaults/main.yml`.
+
+### Multi-site mode (`--fqdn`)
+Use when enrolling **several existing servers** quickly, especially if they differ.
+
+**Characteristics**
+- Roles are shared; host-specific state lives in inventory.
+- Host inventory drives what gets managed (files/packages/services).
+- Non-templated raw files live per-host under `inventory/host_vars/<fqdn>/<role>/.files/...`.

 **Rule of thumb**
- If your goal is *"make this one server reproducible / provisionable"* → start with **single-site**.
- If your goal is *"get several already-running servers under management quickly"* → use **multi-site**.
+- “Make this one server reproducible/provisionable” → start with **single-site**
+- “Get multiple already-running servers under management quickly” → use **multi-site**

 ---

-# Key concepts
+## Subcommands

-## Harvest
+### `enroll harvest`
+Harvest state about a host and write a harvest bundle.

-**enroll** begins by 'harvesting' known state about your host. This includes detecting what running services exist, what packages have been installed 'manually' (that is, stuff that doesn't come out of the box with the OS), and anything 'custom' in `/etc` that it can't attribute to a specific package.
+**What it captures (high level)**
+- Detected services + service-relevant packages
+- “Manual” packages
+- Changed-from-default config (plus related custom/unowned files under service dirs)
+- Non-system users + SSH public keys
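+- Miscellaneous `/etc` files that can’t be attributed to a package (`etc_custom` role)
+- Files under `/usr/local/bin` and `/usr/local/etc` that aren’t binaries or symlinks (`usr_local_custom` role)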

-It also detects if any config files have been *changed* from their packaged defaults. If they have, it will attempt to 'harvest' them. If the config file is identical to how it comes with the package, then it doesn't bother harvesting it, because there's little value in config-managing it if it's identical to what you get by simply installing the package!
+**Common flags**
+- Remote harvesting:
+  - `--remote-host`, `--remote-user`, `--remote-port`
+  - `--no-sudo` (if you don’t want/need sudo)
+- Sensitive-data behaviour:
+  - default: tries to avoid likely secrets
+  - `--dangerous`: disables secret-safety checks (see “Sensitive data” below)
+- Encrypt bundles at rest:
+  - `--sops <FINGERPRINT...>`: writes a single encrypted `harvest.tar.gz.sops` instead of a plaintext directory

-The harvest writes a state.json file explaining all the data it harvested and, if it chose not to harvest something, explanations as to why that is the case (see below: sensitive data).
+---

-### Remote harvesting (workstation → remote)
+### `enroll manifest`
+Generate Ansible output from an existing harvest bundle.

-If you'd prefer not to install **enroll** on the target host, you can run the harvest over SSH from your workstation and pull the harvest bundle back locally:
+**Inputs**
+- `--harvest /path/to/harvest` (directory)
+  or `--harvest /path/to/harvest.tar.gz.sops` (if using `--sops`)

-```bash
-enroll harvest --remote-host myhost.example.com --remote-user myuser --out /tmp/enroll-harvest
-```
+**Output**
+- In plaintext mode: an Ansible repo-like directory structure (roles/playbooks, and inventory in multi-site mode).
+- In `--sops` mode: a single encrypted file `manifest.tar.gz.sops` containing the generated output.

- `--remote-port` defaults to `22`
- `--remote-user` defaults to your local `$USER`
+**Common flags**
+- `--fqdn <host>`: enables **multi-site** output style

-This uploads a self-contained `enroll` zipapp to a temporary directory on the remote host, runs `harvest` there, then downloads the resulting harvest bundle to the `--out` directory on your workstation.
+---

-**Privilege note:** A "full" harvest typically needs root access. Remote harvesting assumes the remote user can run `sudo` **without a password prompt** (NOPASSWD) so the harvest can run non-interactively. If you don't want this, pass `--no-sudo` as well.
+### `enroll single-shot`
+Convenience wrapper that runs **harvest → manifest** in one command.
+
+Use this when you want “get me something workable ASAP”.
+
+Supports the same general flags as harvest/manifest, including `--fqdn`, remote harvest flags, and `--sops`.
+
+---
+
+### `enroll diff`
+Compare two harvest bundles and report what changed.
+
+**What it reports**
+- Packages added/removed
+- Enabled services added/removed, plus key state changes
+- Users added/removed, plus field changes (uid/gid/home/shell/groups, etc.)
+- Managed files added/removed/changed (metadata + content hash changes where available)
+
+**Inputs**
+- `--old <harvest>` and `--new <harvest>` (a directory, a path to `state.json`, a tarball, or a SOPS-encrypted bundle)
+- `--sops` when comparing SOPS-encrypted harvest bundles
+
+**Output formats**
+- `--format text` (default) / `--format markdown` (human-oriented)
+- `--format json` (machine-readable; JSON is also the default `--webhook-format`)
+
+**Notifications**
+- Webhook:
+  - `--webhook <url>`
+  - `--webhook-format json|markdown|text`
+  - `--webhook-header 'Header-Name: value'` (repeatable)
+- Email (optional):
+  - `--email-to <addr>` (repeatable), plus optional `--email-from`, `--email-subject`, `--smtp <host[:port]>`, `--smtp-user`, and `--smtp-password-env`; without `--smtp`, local sendmail is used
+
+---

 ## Sensitive data

-**enroll** doesn't make any assumptions about how you might handle sensitive data from your config files, in Ansible. Some people might use SOPS, others might use Vault, others might do something else entirely.
+By default, `enroll` does **not** assume how you handle secrets in Ansible. It will attempt to avoid harvesting likely sensitive data (private keys, passwords, tokens, etc.). This can mean it skips some config files you may ultimately want to manage.

-For this reason, **enroll** will attempt to read config files, and if it detects data that looks like a sensitive SSH/SSL private key, or password, or API key, etc, then it won't harvest it for config management.
+If you opt in to collecting everything:

-This inevitably means that it will deliberately miss some important config files that you probably *want* to manage in Ansible.
+### `--dangerous`
+**WARNING:** disables “likely secret” safety checks. This can copy private keys, TLS key material, API tokens, database passwords, and other credentials into the harvest output **in plaintext**.

-Nonetheless, in the Harvest 'state' file, there should be an explanation of 'excluded files'. You can parse or inspect this file to find what it chose to ignore, and then you know what you might want to augment the results with later, once you 'manifest' the harvest into Ansible configuration.
+If you intend to keep harvests/manifests long-term (especially in git), strongly consider encrypting them at rest.

-Nonetheless, in some cases it may be appropriate to truly grab as much as you can, including secrets. For that, read on for the `--dangerous` flag.
+### Encrypt bundles at rest with `--sops`
+`--sops` encrypts the harvest and/or manifest outputs into a single `.tar.gz.sops` file (GPG). This is for **storage-at-rest**, not for direct “Ansible SOPS inventory” workflows.

-### Opting in to fetching sensitive data: `--dangerous`
-
-**WARNING:** `--dangerous` disables enroll's "likely a secret" safety checks. This can cause private keys, TLS key material, API tokens, database passwords, and other credentials to be copied into your harvest output **in plaintext**.
-
-Only use `--dangerous` if you explicitly want to scoop up sensitive files and you understand where the harvest output is stored, who can read it, and how it will be handled (backups, git commits, etc, as well as risk of using `--out` with a shared `/tmp` location where other users could see the data). We offer no liability if your sensitive data is compromised through the use of this tool!
-
-**Strong recommendation:** If you plan to keep harvested files long-term (especially in git), encrypt secrets at rest. A common approach is to use **SOPS** and then use the **community.sops** Ansible collection to load/decrypt encrypted content during deploy.
-
-Install the collection:
-
-```bash
-ansible-galaxy collection install community.sops
-```
-
-Then you can use the collection's lookup/vars plugins or modules to decrypt or load SOPS-encrypted vars at runtime.
-
-Note the section below **also** talks about SOPS, but this is in the context of simply encrypting the data generated by `enroll` at rest for safe-keeping, **not** for direct integration with Ansible.
-
-
-### Encrypting harvest/manifests at rest with `--sops`
-
-If you want to use `--dangerous` (or you simply want to keep the harvested artifacts private when they're sitting on disk, in git, etc), you can pass `--sops` to `harvest`, `manifest`, or `single-shot`.
-
-To use `--sops`, you will need to have [sops](https://github.com/getsops/sops) installed on your `$PATH`.
-
- `--sops` expects one or more **GPG key fingerprints**. If `sops` is not on the `$PATH`, **enroll** will error.
- `harvest --sops ...` writes a *single* encrypted file (`harvest.tar.gz.sops`) instead of a plaintext directory.
- `manifest --sops ...` (and `single-shot --sops ...`) will:
-  - decrypt the harvest bundle with `sops -d` (if the `--harvest` input is an encrypted file), then generate manifests as normal
-  - bundle the entire generated Ansible output into a *single* encrypted file (`manifest.tar.gz.sops`)
-
-⚠️ **Important:** `manifest --sops` (and `single-shot --sops`) produces **one encrypted file**. It is **not** an Ansible repo you can point `ansible-playbook` at directly. It is **not** the same as using SOPS inventory with the Ansible SOPS collection.
-
-To use the encrypted SOPS manifest, decrypt and extract it first, then run Ansible from inside the extracted `manifest/` directory:
-
-```bash
-sops -d /path/to/manifest.tar.gz.sops | tar -xzvf -
-cd manifest
-ansible-playbook ...
-```
-
-Example:
-
-```bash
-# Harvest (encrypted-at-rest)
-enroll harvest --out /tmp/enroll-harvest --dangerous --sops <FINGERPRINT(s)>
-
-# Manifest (encrypted-at-rest)
-enroll manifest --harvest /tmp/enroll-harvest/harvest.tar.gz.sops --out /tmp/enroll-ansible --sops <FINGERPRINT(s)>
-
-# Decrypt/extract manifest output for inspection / ansible runs
-cd /tmp/enroll-ansible
-sops -d manifest.tar.gz.sops | tar -xzvf -
-cd manifest
-```
-
-(If you want to manually inspect an encrypted harvest bundle, extract it into its own directory, e.g. `mkdir -p harvest && sops -d harvest.tar.gz.sops | tar -xzvf - -C harvest`.)
-
-
-
-## Manifest
-
-The 'manifest' subcommand expects to be given a path to the 'harvest' obtained in the first step. It will then attempt to generate Ansible roles and playbooks (and potentially 'inventory') from that harvest.
-
-Manifesting is the most complex step because a lot of people will have opinions on how Ansible roles and inventory should work. No solution is perfect for everyone. However, **enroll** tries to strike a reasonable balance.
-
-Remember, the purpose of this tool is to save **time** getting your systems into a decently-managed state. It's still up to you to wrangle it into a form that works for you on an ongoing basis.
+⚠️ Important: `manifest --sops` produces one encrypted file. You must decrypt + extract it before running `ansible-playbook`.

 ---

-# Single-shot mode for the impatient sysadmin
+## JinjaTurtle integration (both modes)

-**enroll** has a 'single-shot' subcommand which combines the two other phases (harvest and manifest) into one. Use it to generate both the harvest and then manifest ansible from that harvest all in one go. Perfect if you're in a hurry!
+If [JinjaTurtle](https://git.mig5.net/mig5/jinjaturtle) is installed, `enroll` can generate Jinja2 templates for ini/json/xml/toml-style config.

---
-
-# JinjaTurtle integration (both modes)
-
-If you also have my other tool [JinjaTurtle](https://git.mig5.net/mig5/jinjaturtle) installed, **enroll** will attempt to create Jinja2 templates for any ini/json/xml/toml style configuration that it finds.
-
- Templates live in the **role** (`roles/<role>/templates/...`)
+- Templates live in `roles/<role>/templates/...`
 - Variables live in:
-  - **single-site**: `roles/<role>/defaults/main.yml`
-  - **multi-site** (`--fqdn`): `inventory/host_vars/<fqdn>/<role>.yml`
+  - single-site: `roles/<role>/defaults/main.yml`
+  - multi-site: `inventory/host_vars/<fqdn>/<role>.yml`

-JinjaTurtle will be used automatically if it is detected on the `$PATH`. You can also be explicit and pass `--jinjaturtle`, but this will throw an error if JinjaTurtle is not on the `$PATH`.
-
-If you *do* have JinjaTurtle installed, but *don't* wish to make use of it, you can use `--no-jinjaturtle`, in which case all config files will be kept as 'raw' files.
+You can force it on with `--jinjaturtle` or disable with `--no-jinjaturtle`.

 ---

-# How multi-site avoids "shared role breaks a host"
+## How multi-site avoids “shared role breaks a host”

-In multi-site mode, **roles are data-driven**. The role contains generic tasks like:
-
- "deploy all files listed for this host"
- "install packages listed for this host"
- "apply systemd enable/start state listed for this host"
-
-The host inventory is what decides which files/packages/services apply to that host. This prevents the classic failure mode where host2 adds a config file to a shared role and host1 then fails trying to deploy a file it never had.
-
-Raw non-templated files are stored under:
-
- `inventory/host_vars/<fqdn>/<role>/.files/...`
-
-…and the host's role variables describe which of those files should be deployed.
+In multi-site mode, roles are **data-driven**. The role tasks are generic (“deploy the files listed for this host”, “install the packages listed for this host”, “apply systemd enable/start state listed for this host”). Host inventory decides what applies per-host, avoiding the classic “host2 adds config, host1 breaks” failure mode.

 ---

 # Install

 ## Ubuntu/Debian apt repository
-
 ```bash
 sudo mkdir -p /usr/share/keyrings
 curl -fsSL https://mig5.net/static/mig5.asc | sudo gpg --dearmor -o /usr/share/keyrings/mig5.gpg
@ -224,25 +183,19 @@ sudo apt install enroll
 ```

 ## AppImage
-
-Download the AppImage file from the Releases page (verify with GPG if you wish, my fingerprint is [here](https://mig5.net/static/mig5.asc)),
-then make it executable and run it:
+Download the AppImage from my Releases page, then make it executable and run it:

 ```bash
 chmod +x Enroll.AppImage
 ./Enroll.AppImage
 ```

-### Pip/PipX
-
+## Pip/PipX
 ```bash
 pip install enroll
 ```

-### Poetry
-
-Clone this repository with git, then:
-
+## Poetry (dev)
 ```bash
 poetry install
 poetry run enroll --help
@ -250,111 +203,110 @@ poetry run enroll --help

 ---

-# Usage
+## Found a bug / have a suggestion?

-## 1. Harvest state/information about the host
+My Forgejo doesn’t currently support federation, so I haven’t opened registration/login for issues.

-On the host (root recommended to harvest as much data as possible):
+Instead, email me (see `pyproject.toml`) or contact me on the Fediverse:

+https://goto.mig5.net/@mig5
+
+---
+
+# Examples
+
+## Harvest
+
+### Local harvest
 ```bash
 enroll harvest --out /tmp/enroll-harvest
 ```
-### Remote harvest over SSH (no enroll install required on the remote host, no need for --out)

+### Remote harvest over SSH
 ```bash
-enroll harvest --remote-host myhost.example.com --remote-user myuser
+enroll harvest --remote-host myhost.example.com --remote-user myuser --out /tmp/enroll-harvest
 ```

-### `--dangerous` (captures potentially sensitive files — read the warning above)
-
+### `--dangerous`
 ```bash
 enroll harvest --out /tmp/enroll-harvest --dangerous
 ```

-Remote + dangerous:
-
+### Remote + dangerous
 ```bash
 enroll harvest --remote-host myhost.example.com --remote-user myuser --dangerous
 ```

-### `--sops` (encrypt bundles at rest)
-
-`--sops` bundles and encrypts the output as a single SOPS-encrypted `.tar.gz.sops` file (GPG). This is particularly useful if you're using `--dangerous`.
-
+### `--sops` (encrypt at rest)
 ```bash
 # Encrypted harvest bundle (writes /tmp/enroll-harvest/harvest.tar.gz.sops)
 enroll harvest --out /tmp/enroll-harvest --dangerous --sops <FINGERPRINT(s)>
+```

-# Encrypted manifest bundle (writes /tmp/enroll-ansible/manifest.tar.gz.sops)
+---
+
+## Manifest
+
+### Single-site (default: no --fqdn)
+```bash
+enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible
+```
+
+### Multi-site (--fqdn)
+```bash
+enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
+```
+
+### Manifest with `--sops`
+```bash
+# Generate encrypted manifest bundle (writes /tmp/enroll-ansible/manifest.tar.gz.sops)
 enroll manifest --harvest /tmp/enroll-harvest/harvest.tar.gz.sops --out /tmp/enroll-ansible --sops <FINGERPRINT(s)>

 # Decrypt/extract the manifest bundle, then run Ansible from inside ./manifest/
 cd /tmp/enroll-ansible
 sops -d manifest.tar.gz.sops | tar -xzvf -
 cd manifest
-ansible-playbook ./playbook.yml
 ```

-
-## 2. Generate Ansible manifests (roles/playbook) from that harvest
-
-### Single-site (default: no --fqdn)
-
-Good for one server, or for producing roles you want to reuse to provision new machines:
-
-```bash
-enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible
-```
-
-### Multi-site (--fqdn)
-
-Best when enrolling multiple already-running servers into one repo:
-
-```bash
-enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
-```
+---

 ## Single-shot

-Alternatively, do both steps in one shot:
-
 ```bash
 enroll single-shot --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
 ```
-Remote single-shot (run harvest over SSH, then manifest locally):

+Remote single-shot (run harvest over SSH, then manifest locally):
 ```bash
-enroll single-shot --remote-host myhost.example.com --remote-user myuser --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "myhost.example.com"
+enroll single-shot --remote-host myhost.example.com --remote-user myuser --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "myhost.example.com"
 ```

-In multi-site mode (`--fqdn`), you can run single-shot repeatedly against multiple hosts while reusing the same `--out` directory so each host merges into the existing Ansible repo.
+---

+## Diff

-## 3. Run Ansible
+### Compare two harvest directories
+```bash
+enroll diff --old /path/to/harvestA --new /path/to/harvestB --format json
+```
+
+### Diff + webhook notify
+```bash
+enroll diff --old /path/to/golden/harvest --new /path/to/new/harvest --webhook https://nr.mig5.net/forms/webhooks/xxxx --webhook-format json --webhook-header 'X-Enroll-Secret: xxxx'
+```
+
+`diff` also supports email notifications and `text` or `markdown` output. With `--exit-code`, it exits with status 2 when differences are detected (useful for cron jobs or CI).
+
+---
+
+## Run Ansible

 ### Single-site
-
-You can run it however you prefer (local connection or your own inventory). Example:
-
 ```bash
 ansible-playbook -i "localhost," -c local /tmp/enroll-ansible/playbook.yml
 ```

 ### Multi-site (--fqdn)
-
-In multi-site mode, enroll generates an ansible.cfg, `host_vars` inventory, and a host-specific playbook:
-
 ```bash
 ansible-playbook /tmp/enroll-ansible/playbooks/"$(hostname -f)".yml
 ```
-
---
-
-# Found a bug, have a suggestion?
-
-My Forgejo doesn't yet support proper federation, and for that reason I've not opened up registration/login to use the issue queue.
-
-Instead, you can e-mail me (see the pyproject.toml for details) or contact me on the Fediverse:
-
-https://goto.mig5.net/@mig5
-
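The README's note about `--exit-code` suggests a wrapper for cron or CI that branches on the exit status. This is a rough sketch under stated assumptions: only status 2 ("differences detected") is documented in this change. Treating 0 as "unchanged" and any other status as an error is an assumption, and `build_diff_argv`, `classify_exit_status`, and `check_drift` are hypothetical helper names.

```python
import subprocess
from typing import List


def build_diff_argv(old: str, new: str) -> List[str]:
    """Build an `enroll diff` command line that signals drift via its exit status."""
    return ["enroll", "diff", "--old", old, "--new", new, "--format", "text", "--exit-code"]


def classify_exit_status(returncode: int) -> str:
    """Map an exit status to a label. Only status 2 is documented; the rest is assumed."""
    if returncode == 0:
        return "unchanged"
    if returncode == 2:
        return "drift"
    return "error"


def check_drift(old: str, new: str) -> str:
    """Run the comparison (requires `enroll` on PATH) and classify the result."""
    result = subprocess.run(build_diff_argv(old, new), check=False)
    return classify_exit_status(result.returncode)
```

One caveat: the CLI is built on `argparse`, which also exits with status 2 on usage errors, so a wrapper like this may want to validate its arguments before trusting a "drift" result.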
--- a/debian/changelog
+++ b/debian/changelog
@ -1,3 +1,17 @@
+enroll (0.1.2) unstable; urgency=medium
+
+  * Include files from `/usr/local/bin` and `/usr/local/etc` in harvest (assuming they aren't binaries or
+    symlinks) and store in `usr_local_custom` role, similar to `etc_custom`.
+
+ -- Miguel Jacq <mig@mig5.net>  Thu, 18 Dec 2025 17:07:00 +1100
+
+enroll (0.1.1) unstable; urgency=medium
+
+  * Add `diff` subcommand which can compare two harvests and send email or webhook notifications in different
+    formats.
+
+ -- Miguel Jacq <mig@mig5.net>  Thu, 18 Dec 2025 15:00:00 +1100
+
 enroll (0.1.0) unstable; urgency=medium

  * Add remote mode for harvesting a remote machine via a local workstation (no need to install enroll remotely)
@ -12,7 +26,7 @@ enroll (0.1.0) unstable; urgency=medium
    ship or manage those files.
  * Don't collect files ending in `.log`

- -- Miguel Jacq <mig@mig5.net>  Tue, 17 Dec 2025 18:00:00 +1100
+ -- Miguel Jacq <mig@mig5.net>  Wed, 17 Dec 2025 18:00:00 +1100

 enroll (0.0.5) unstable; urgency=medium

--- a/enroll.svg
+++ b/enroll.svg
@ -109,4 +109,3 @@
    <tspan class="text-dark">en</tspan><tspan class="text-light">roll</tspan>
  </text>
 </svg>
-
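The `diff` branch added to `enroll/cli.py` in this change picks a `Content-Type` from `--webhook-format` and then merges each repeatable `--webhook-header` value. The standalone sketch below mirrors that logic for illustration; `build_webhook_headers` is a hypothetical name, not a function in enroll.

```python
from typing import Dict, Iterable


def build_webhook_headers(payload_format: str, extra: Iterable[str]) -> Dict[str, str]:
    """Choose a Content-Type for the payload and merge 'Name: value' entries."""
    headers: Dict[str, str] = {}
    if payload_format == "json":
        headers["Content-Type"] = "application/json"
    else:
        headers["Content-Type"] = "text/plain; charset=utf-8"
    for item in extra:
        if ":" not in item:
            continue  # entries without a colon are silently skipped
        # Split on the first colon only, so values may themselves contain colons.
        name, value = item.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers


headers = build_webhook_headers(
    "json",
    ["X-Enroll-Secret: xxxx", "Authorization: Bearer a:b", "malformed"],
)
print(headers)
```

Because user-supplied entries are merged after the default is set, a `Content-Type: ...` header passed on the command line would override the one derived from the format.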
--- a/enroll/cli.py
+++ b/enroll/cli.py
@ -8,6 +8,7 @@ from pathlib import Path
 from typing import Optional

 from .cache import new_harvest_cache_dir
+from .diff import compare_harvests, format_report, post_webhook, send_email
 from .harvest import harvest
 from .manifest import manifest
 from .remote import remote_harvest
@ -211,6 +212,90 @@ def main() -> None:
    _add_common_manifest_args(s)
    _add_remote_args(s)

+    d = sub.add_parser("diff", help="Compare two harvests and report differences")
+    d.add_argument(
+        "--old",
+        required=True,
+        help=(
+            "Old/baseline harvest (directory, a path to state.json, a tarball, or a SOPS-encrypted bundle)."
+        ),
+    )
+    d.add_argument(
+        "--new",
+        required=True,
+        help=(
+            "New/current harvest (directory, a path to state.json, a tarball, or a SOPS-encrypted bundle)."
+        ),
+    )
+    d.add_argument(
+        "--sops",
+        action="store_true",
+        help="Allow SOPS-encrypted harvest bundle inputs (requires `sops` on PATH).",
+    )
+    d.add_argument(
+        "--format",
+        choices=["text", "markdown", "json"],
+        default="text",
+        help="Report output format (default: text).",
+    )
+    d.add_argument(
+        "--out",
+        help="Write the report to this file instead of stdout.",
+    )
+    d.add_argument(
+        "--exit-code",
+        action="store_true",
+        help="Exit with status 2 if differences are detected.",
+    )
+    d.add_argument(
+        "--notify-always",
+        action="store_true",
+        help="Send webhook/email even when there are no differences.",
+    )
+    d.add_argument(
+        "--webhook",
+        help="POST the report to this URL (only when differences are detected, unless --notify-always).",
+    )
+    d.add_argument(
+        "--webhook-format",
+        choices=["json", "text", "markdown"],
+        default="json",
+        help="Payload format for --webhook (default: json).",
+    )
+    d.add_argument(
+        "--webhook-header",
+        action="append",
+        default=[],
+        metavar="K:V",
+        help="Extra HTTP header for --webhook (repeatable), e.g. 'Authorization: Bearer ...'.",
+    )
+    d.add_argument(
+        "--email-to",
+        action="append",
+        default=[],
+        help="Email the report to this address (repeatable; only when differences are detected unless --notify-always).",
+    )
+    d.add_argument(
+        "--email-from",
+        help="From address for --email-to (default: enroll@<hostname>).",
+    )
+    d.add_argument(
+        "--email-subject",
+        help="Subject for --email-to (default: 'enroll diff report').",
+    )
+    d.add_argument(
+        "--smtp",
+        help="SMTP server host[:port] for --email-to. If omitted, uses local sendmail.",
+    )
+    d.add_argument(
+        "--smtp-user",
+        help="SMTP username (optional).",
+    )
+    d.add_argument(
+        "--smtp-password-env",
+        help="Environment variable containing SMTP password (optional).",
+    )
+
    args = ap.parse_args()

    remote_host: Optional[str] = getattr(args, "remote_host", None)
@ -287,6 +372,61 @@ def main() -> None:
            )
            if getattr(args, "sops", None) and out_enc:
                print(str(out_enc))
+        elif args.cmd == "diff":
+            report, has_changes = compare_harvests(
+                args.old,
+                args.new,
+                sops_mode=bool(getattr(args, "sops", False)),
+            )
+
+            txt = format_report(report, fmt=str(getattr(args, "format", "text")))
+            out_path = getattr(args, "out", None)
+            if out_path:
+                p = Path(out_path).expanduser()
+                p.parent.mkdir(parents=True, exist_ok=True)
+                p.write_text(txt, encoding="utf-8")
+            else:
+                print(txt, end="" if txt.endswith("\n") else "\n")
+
+            should_notify = has_changes or bool(getattr(args, "notify_always", False))
+
+            webhook = getattr(args, "webhook", None)
+            if webhook and should_notify:
+                wf = str(getattr(args, "webhook_format", "json"))
+                payload = format_report(report, fmt=wf)
+                body = payload.encode("utf-8")
+                headers = {}
+                if wf == "json":
+                    headers["Content-Type"] = "application/json"
+                else:
+                    headers["Content-Type"] = "text/plain; charset=utf-8"
+                for hv in getattr(args, "webhook_header", []) or []:
+                    if ":" in hv:
+                        k, v = hv.split(":", 1)
+                        headers[k.strip()] = v.strip()
+                status, _resp = post_webhook(webhook, body, headers=headers)
+                if status and status >= 400:
+                    raise SystemExit(f"error: webhook returned HTTP {status}")
+
+            to_addrs = getattr(args, "email_to", []) or []
+            if to_addrs and should_notify:
+                subject = getattr(args, "email_subject", None) or "enroll diff report"
+                smtp_pw = None
+                pw_env = getattr(args, "smtp_password_env", None)
+                if pw_env:
+                    smtp_pw = os.environ.get(str(pw_env))
+                send_email(
+                    to_addrs=list(to_addrs),
+                    subject=str(subject),
+                    body=txt,
+                    from_addr=getattr(args, "email_from", None),
+                    smtp=getattr(args, "smtp", None),
+                    smtp_user=getattr(args, "smtp_user", None),
+                    smtp_password=smtp_pw,
+                )
+
+            if getattr(args, "exit_code", False) and has_changes:
+                raise SystemExit(2)
        elif args.cmd == "single-shot":
            sops_fps = getattr(args, "sops", None)
            if remote_host:
@ -379,5 +519,55 @@ def main() -> None:
                        fqdn=args.fqdn,
                        jinjaturtle=_jt_mode(args),
                    )
+        elif args.cmd == "diff":
+            report, has_changes = compare_harvests(
+                args.old, args.new, sops_mode=bool(getattr(args, "sops", False))
+            )
+
+            rendered = format_report(report, fmt=str(args.format))
+            if args.out:
+                Path(args.out).expanduser().write_text(rendered, encoding="utf-8")
+            else:
+                print(rendered, end="")
+
+            do_notify = bool(has_changes or getattr(args, "notify_always", False))
+
+            if do_notify and getattr(args, "webhook", None):
+                wf = str(getattr(args, "webhook_format", "json"))
+                body = format_report(report, fmt=wf).encode("utf-8")
+                headers = {"User-Agent": "enroll"}
+                if wf == "json":
+                    headers["Content-Type"] = "application/json"
+                else:
+                    headers["Content-Type"] = "text/plain; charset=utf-8"
+                for hv in getattr(args, "webhook_header", []) or []:
+                    if ":" not in hv:
+                        raise SystemExit(
+                            "error: --webhook-header must be in the form 'K:V'"
+                        )
+                    k, v = hv.split(":", 1)
+                    headers[k.strip()] = v.strip()
+                status, _ = post_webhook(str(args.webhook), body, headers=headers)
+                if status and status >= 400:
+                    raise SystemExit(f"error: webhook returned HTTP {status}")
+
+            if do_notify and (getattr(args, "email_to", []) or []):
+                subject = getattr(args, "email_subject", None) or "enroll diff report"
+                smtp_password = None
+                pw_env = getattr(args, "smtp_password_env", None)
+                if pw_env:
+                    smtp_password = os.environ.get(str(pw_env))
+                send_email(
+                    to_addrs=list(getattr(args, "email_to", []) or []),
+                    subject=str(subject),
+                    body=rendered,
+                    from_addr=getattr(args, "email_from", None),
+                    smtp=getattr(args, "smtp", None),
+                    smtp_user=getattr(args, "smtp_user", None),
+                    smtp_password=smtp_password,
+                )
+
+            if getattr(args, "exit_code", False) and has_changes:
+                raise SystemExit(2)
    except SopsError as e:
        raise SystemExit(f"error: {e}")
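A note on the `--webhook-header` handling in the `diff` branch above: each value is split on the first colon only, so header values that themselves contain colons (URLs, timestamps) survive intact. The following is a minimal standalone sketch of that parsing, not code from this commit. `parse_webhook_headers` and its `strict` flag are hypothetical names introduced here for illustration.

```python
from typing import Dict, Iterable


def parse_webhook_headers(
    values: Iterable[str], *, strict: bool = True
) -> Dict[str, str]:
    """Turn 'K:V' strings into a header dict.

    Hypothetical helper for illustration; it is not part of enroll.
    """
    headers: Dict[str, str] = {}
    for hv in values:
        if ":" not in hv:
            if strict:
                # Reject malformed input loudly instead of dropping it.
                raise ValueError(f"header must be in the form 'K:V': {hv!r}")
            continue
        # Split on the first colon only, so values may contain colons.
        key, value = hv.split(":", 1)
        headers[key.strip()] = value.strip()
    return headers


if __name__ == "__main__":
    print(parse_webhook_headers(["X-Token: abc", "X-Origin: http://example.com"]))
```

Whether a malformed header should be skipped or rejected is a design choice; the sketch exposes both through the assumed `strict` flag.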
--- a/enroll/diff.py
+++ b/enroll/diff.py
@ -0,0 +1,763 @@
+from __future__ import annotations
+
+import hashlib
+import json
+import os
+import shutil
+import subprocess  # nosec
+import tarfile
+import tempfile
+import urllib.request
+from contextlib import ExitStack
+from dataclasses import dataclass
+from datetime import datetime, timezone
+from email.message import EmailMessage
+from pathlib import Path
+from typing import Any, Dict, Iterable, List, Optional, Tuple
+
+from .remote import _safe_extract_tar
+from .sopsutil import decrypt_file_binary_to, require_sops_cmd
+
+
+def _utc_now_iso() -> str:
+    return datetime.now(tz=timezone.utc).isoformat()
+
+
+def _sha256(path: Path) -> str:
+    h = hashlib.sha256()
+    with open(path, "rb") as f:
+        while True:
+            chunk = f.read(1024 * 1024)
+            if not chunk:
+                break
+            h.update(chunk)
+    return h.hexdigest()
+
+
+@dataclass
+class BundleRef:
+    """A prepared harvest bundle.
+
+    `dir` is a directory containing state.json + artifacts/.
+    `tempdir` is set when the bundle needed extraction into a temp directory.
+    """
+
+    dir: Path
+    tempdir: Optional[tempfile.TemporaryDirectory] = None
+
+    @property
+    def state_path(self) -> Path:
+        return self.dir / "state.json"
+
+
+def _bundle_from_input(path: str, *, sops_mode: bool) -> BundleRef:
+    """Resolve a user-supplied path to a harvest bundle directory.
+
+    Accepts:
+      - a bundle directory
+      - a path to state.json inside a bundle directory
+      - (sops mode or .sops) a SOPS-encrypted tar.gz bundle
+      - a plain tar.gz/tgz bundle
+    """
+
+    p = Path(path).expanduser()
+
+    # Accept the state.json path directly (harvest often prints this).
+    if p.is_file() and p.name == "state.json":
+        p = p.parent
+
+    if p.is_dir():
+        return BundleRef(dir=p)
+
+    if not p.exists():
+        raise RuntimeError(f"Harvest path not found: {p}")
+
+    # Auto-enable sops mode if it looks like an encrypted bundle.
+    is_sops = p.name.endswith(".sops")
+    if sops_mode or is_sops:
+        require_sops_cmd()
+        td = tempfile.TemporaryDirectory(prefix="enroll-harvest-")
+        td_path = Path(td.name)
+        try:
+            os.chmod(td_path, 0o700)
+        except OSError:
+            pass
+
+        tar_path = td_path / "harvest.tar.gz"
+        out_dir = td_path / "bundle"
+        out_dir.mkdir(parents=True, exist_ok=True)
+        try:
+            os.chmod(out_dir, 0o700)
+        except OSError:
+            pass
+
+        decrypt_file_binary_to(p, tar_path, mode=0o600)
+        with tarfile.open(tar_path, mode="r:gz") as tf:
+            _safe_extract_tar(tf, out_dir)
+
+        return BundleRef(dir=out_dir, tempdir=td)
+
+    # Plain tarballs (useful for operators who rsync/zip harvests around).
+    if p.suffixes[-2:] == [".tar", ".gz"] or p.suffix == ".tgz":
+        td = tempfile.TemporaryDirectory(prefix="enroll-harvest-")
+        td_path = Path(td.name)
+        try:
+            os.chmod(td_path, 0o700)
+        except OSError:
+            pass
+        out_dir = td_path / "bundle"
+        out_dir.mkdir(parents=True, exist_ok=True)
+        try:
+            os.chmod(out_dir, 0o700)
+        except OSError:
+            pass
+        with tarfile.open(p, mode="r:gz") as tf:
+            _safe_extract_tar(tf, out_dir)
+        return BundleRef(dir=out_dir, tempdir=td)
+
+    raise RuntimeError(
+        f"Harvest path is not a directory, state.json, encrypted bundle, or tarball: {p}"
+    )
+
+
+def _load_state(bundle_dir: Path) -> Dict[str, Any]:
+    sp = bundle_dir / "state.json"
+    with open(sp, "r", encoding="utf-8") as f:
+        return json.load(f)
+
+
+def _all_packages(state: Dict[str, Any]) -> List[str]:
+    pkgs = set(state.get("manual_packages", []) or [])
+    pkgs |= set(state.get("manual_packages_skipped", []) or [])
+    for s in state.get("services", []) or []:
+        for p in s.get("packages", []) or []:
+            pkgs.add(p)
+    return sorted(pkgs)
+
+
+def _service_units(state: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
+    out: Dict[str, Dict[str, Any]] = {}
+    for s in state.get("services", []) or []:
+        unit = s.get("unit")
+        if unit:
+            out[str(unit)] = s
+    return out
+
+
+def _users_by_name(state: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
+    users = (state.get("users") or {}).get("users") or []
+    out: Dict[str, Dict[str, Any]] = {}
+    for u in users:
+        name = u.get("name")
+        if name:
+            out[str(name)] = u
+    return out
+
+
+@dataclass(frozen=True)
+class FileRec:
+    path: str
+    role: str
+    src_rel: str
+    owner: Optional[str]
+    group: Optional[str]
+    mode: Optional[str]
+    reason: Optional[str]
+
+
+def _iter_managed_files(state: Dict[str, Any]) -> Iterable[Tuple[str, Dict[str, Any]]]:
+    # Services
+    for s in state.get("services", []) or []:
+        role = s.get("role_name") or "unknown"
+        for mf in s.get("managed_files", []) or []:
+            yield str(role), mf
+
+    # Package roles
+    for p in state.get("package_roles", []) or []:
+        role = p.get("role_name") or "unknown"
+        for mf in p.get("managed_files", []) or []:
+            yield str(role), mf
+
+    # Users
+    u = state.get("users") or {}
+    u_role = u.get("role_name") or "users"
+    for mf in u.get("managed_files", []) or []:
+        yield str(u_role), mf
+
+    # etc_custom
+    ec = state.get("etc_custom") or {}
+    ec_role = ec.get("role_name") or "etc_custom"
+    for mf in ec.get("managed_files", []) or []:
+        yield str(ec_role), mf
+
+    # usr_local_custom
+    ul = state.get("usr_local_custom") or {}
+    ul_role = ul.get("role_name") or "usr_local_custom"
+    for mf in ul.get("managed_files", []) or []:
+        yield str(ul_role), mf
+
+
+def _file_index(bundle_dir: Path, state: Dict[str, Any]) -> Dict[str, FileRec]:
+    """Return mapping of absolute path -> FileRec.
+
+    If duplicates occur, the first one wins (should be rare by design).
+    """
+
+    out: Dict[str, FileRec] = {}
+    for role, mf in _iter_managed_files(state):
+        p = mf.get("path")
+        src_rel = mf.get("src_rel")
+        if not p or not src_rel:
+            continue
+        p = str(p)
+        if p in out:
+            continue
+        out[p] = FileRec(
+            path=p,
+            role=str(role),
+            src_rel=str(src_rel),
+            owner=mf.get("owner"),
+            group=mf.get("group"),
+            mode=mf.get("mode"),
+            reason=mf.get("reason"),
+        )
+    return out
+
+
+def _artifact_path(bundle_dir: Path, rec: FileRec) -> Path:
+    return bundle_dir / "artifacts" / rec.role / rec.src_rel
+
+
+def compare_harvests(
+    old_path: str,
+    new_path: str,
+    *,
+    sops_mode: bool = False,
+) -> Tuple[Dict[str, Any], bool]:
+    """Compare two harvests.
+
+    Returns (report, has_changes).
+    """
+    with ExitStack() as stack:
+        old_b = _bundle_from_input(old_path, sops_mode=sops_mode)
+        new_b = _bundle_from_input(new_path, sops_mode=sops_mode)
+        if old_b.tempdir:
+            stack.callback(old_b.tempdir.cleanup)
+        if new_b.tempdir:
+            stack.callback(new_b.tempdir.cleanup)
+
+        old_state = _load_state(old_b.dir)
+        new_state = _load_state(new_b.dir)
+
+        old_pkgs = set(_all_packages(old_state))
+        new_pkgs = set(_all_packages(new_state))
+
+        pkgs_added = sorted(new_pkgs - old_pkgs)
+        pkgs_removed = sorted(old_pkgs - new_pkgs)
+
+        old_units = _service_units(old_state)
+        new_units = _service_units(new_state)
+        units_added = sorted(set(new_units) - set(old_units))
+        units_removed = sorted(set(old_units) - set(new_units))
+
+        units_changed: List[Dict[str, Any]] = []
+        for unit in sorted(set(old_units) & set(new_units)):
+            a = old_units[unit]
+            b = new_units[unit]
+            ch: Dict[str, Any] = {}
+            for k in [
+                "active_state",
+                "sub_state",
+                "unit_file_state",
+                "condition_result",
+            ]:
+                if a.get(k) != b.get(k):
+                    ch[k] = {"old": a.get(k), "new": b.get(k)}
+            a_pk = set(a.get("packages", []) or [])
+            b_pk = set(b.get("packages", []) or [])
+            if a_pk != b_pk:
+                ch["packages"] = {
+                    "added": sorted(b_pk - a_pk),
+                    "removed": sorted(a_pk - b_pk),
+                }
+            if ch:
+                units_changed.append({"unit": unit, "changes": ch})
+
+        old_users = _users_by_name(old_state)
+        new_users = _users_by_name(new_state)
+        users_added = sorted(set(new_users) - set(old_users))
+        users_removed = sorted(set(old_users) - set(new_users))
+
+        users_changed: List[Dict[str, Any]] = []
+        for name in sorted(set(old_users) & set(new_users)):
+            a = old_users[name]
+            b = new_users[name]
+            ch: Dict[str, Any] = {}
+            for k in [
+                "uid",
+                "gid",
+                "gecos",
+                "home",
+                "shell",
+                "primary_group",
+            ]:
+                if a.get(k) != b.get(k):
+                    ch[k] = {"old": a.get(k), "new": b.get(k)}
+            a_sg = set(a.get("supplementary_groups", []) or [])
+            b_sg = set(b.get("supplementary_groups", []) or [])
+            if a_sg != b_sg:
+                ch["supplementary_groups"] = {
+                    "added": sorted(b_sg - a_sg),
+                    "removed": sorted(a_sg - b_sg),
+                }
+            if ch:
+                users_changed.append({"name": name, "changes": ch})
+
+        old_files = _file_index(old_b.dir, old_state)
+        new_files = _file_index(new_b.dir, new_state)
+        old_paths_set = set(old_files)
+        new_paths_set = set(new_files)
+
+        files_added = sorted(new_paths_set - old_paths_set)
+        files_removed = sorted(old_paths_set - new_paths_set)
+
+        # Hash cache to avoid reading the same file more than once.
+        hash_cache: Dict[str, str] = {}
+
+        def _hash_for(bundle_dir: Path, rec: FileRec) -> Optional[str]:
+            ap = _artifact_path(bundle_dir, rec)
+            if not ap.exists() or not ap.is_file():
+                return None
+            key = str(ap)
+            if key in hash_cache:
+                return hash_cache[key]
+            hash_cache[key] = _sha256(ap)
+            return hash_cache[key]
+
+        files_changed: List[Dict[str, Any]] = []
+        for p in sorted(old_paths_set & new_paths_set):
+            a = old_files[p]
+            b = new_files[p]
+            ch: Dict[str, Any] = {}
+
+            # Role movement is itself interesting (e.g., file ownership attribution changed).
+            if a.role != b.role:
+                ch["role"] = {"old": a.role, "new": b.role}
+            for k in ["owner", "group", "mode", "reason"]:
+                av = getattr(a, k)
+                bv = getattr(b, k)
+                if av != bv:
+                    ch[k] = {"old": av, "new": bv}
+
+            ha = _hash_for(old_b.dir, a)
+            hb = _hash_for(new_b.dir, b)
+            if ha is None or hb is None:
+                if ha != hb:
+                    ch["content"] = {
+                        "old": "missing" if ha is None else "present",
+                        "new": "missing" if hb is None else "present",
+                    }
+            else:
+                if ha != hb:
+                    ch["content"] = {"old_sha256": ha, "new_sha256": hb}
+
+            if ch:
+                files_changed.append({"path": p, "changes": ch})
+
+        has_changes = any(
+            [
+                pkgs_added,
+                pkgs_removed,
+                units_added,
+                units_removed,
+                units_changed,
+                users_added,
+                users_removed,
+                users_changed,
+                files_added,
+                files_removed,
+                files_changed,
+            ]
+        )
+
+        def _mtime_iso(p: Path) -> Optional[str]:
+            try:
+                ts = p.stat().st_mtime
+            except OSError:
+                return None
+            return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
+
+        report: Dict[str, Any] = {
+            "generated_at": _utc_now_iso(),
+            "old": {
+                "input": old_path,
+                "bundle_dir": str(old_b.dir),
+                "state_mtime": _mtime_iso(old_b.state_path),
+                "host": (old_state.get("host") or {}).get("hostname"),
+            },
+            "new": {
+                "input": new_path,
+                "bundle_dir": str(new_b.dir),
+                "state_mtime": _mtime_iso(new_b.state_path),
+                "host": (new_state.get("host") or {}).get("hostname"),
+            },
+            "packages": {"added": pkgs_added, "removed": pkgs_removed},
+            "services": {
+                "enabled_added": units_added,
+                "enabled_removed": units_removed,
+                "changed": units_changed,
+            },
+            "users": {
+                "added": users_added,
+                "removed": users_removed,
+                "changed": users_changed,
+            },
+            "files": {
+                "added": [
+                    {
+                        "path": p,
+                        "role": new_files[p].role,
+                        "reason": new_files[p].reason,
+                    }
+                    for p in files_added
+                ],
+                "removed": [
+                    {
+                        "path": p,
+                        "role": old_files[p].role,
+                        "reason": old_files[p].reason,
+                    }
+                    for p in files_removed
+                ],
+                "changed": files_changed,
+            },
+        }
+
+        return report, has_changes
+
+
+def format_report(report: Dict[str, Any], *, fmt: str = "text") -> str:
+    fmt = (fmt or "text").lower()
+    if fmt == "json":
+        return json.dumps(report, indent=2, sort_keys=True)
+    if fmt == "markdown":
+        return _report_markdown(report)
+    return _report_text(report)
+
+
+def _report_text(report: Dict[str, Any]) -> str:
+    lines: List[str] = []
+    old = report.get("old", {})
+    new = report.get("new", {})
+    lines.append(
+        f"enroll diff report (generated {report.get('generated_at')})\n"
+        f"old: {old.get('input')} (host={old.get('host')}, state_mtime={old.get('state_mtime')})\n"
+        f"new: {new.get('input')} (host={new.get('host')}, state_mtime={new.get('state_mtime')})"
+    )
+
+    pk = report.get("packages", {})
+    lines.append("\nPackages")
+    lines.append(f"  added:   {len(pk.get('added', []) or [])}")
+    lines.append(f"  removed: {len(pk.get('removed', []) or [])}")
+    for p in pk.get("added", []) or []:
+        lines.append(f"    + {p}")
+    for p in pk.get("removed", []) or []:
+        lines.append(f"    - {p}")
+
+    sv = report.get("services", {})
+    lines.append("\nServices (enabled systemd units)")
+    for u in sv.get("enabled_added", []) or []:
+        lines.append(f"  + {u}")
+    for u in sv.get("enabled_removed", []) or []:
+        lines.append(f"  - {u}")
+    for ch in sv.get("changed", []) or []:
+        unit = ch.get("unit")
+        lines.append(f"  * {unit} changed")
+        for k, v in (ch.get("changes") or {}).items():
+            if k == "packages":
+                a = (v or {}).get("added", [])
+                r = (v or {}).get("removed", [])
+                if a:
+                    lines.append(f"      packages +: {', '.join(a)}")
+                if r:
+                    lines.append(f"      packages -: {', '.join(r)}")
+            else:
+                lines.append(f"      {k}: {v.get('old')} -> {v.get('new')}")
+
+    us = report.get("users", {})
+    lines.append("\nUsers")
+    for u in us.get("added", []) or []:
+        lines.append(f"  + {u}")
+    for u in us.get("removed", []) or []:
+        lines.append(f"  - {u}")
+    for ch in us.get("changed", []) or []:
+        name = ch.get("name")
+        lines.append(f"  * {name} changed")
+        for k, v in (ch.get("changes") or {}).items():
+            if k == "supplementary_groups":
+                a = (v or {}).get("added", [])
+                r = (v or {}).get("removed", [])
+                if a:
+                    lines.append(f"      groups +: {', '.join(a)}")
+                if r:
+                    lines.append(f"      groups -: {', '.join(r)}")
+            else:
+                lines.append(f"      {k}: {v.get('old')} -> {v.get('new')}")
+
+    fl = report.get("files", {})
+    lines.append("\nFiles")
+    for e in fl.get("added", []) or []:
+        lines.append(
+            f"  + {e.get('path')}  (role={e.get('role')}, reason={e.get('reason')})"
+        )
+    for e in fl.get("removed", []) or []:
+        lines.append(
+            f"  - {e.get('path')}  (role={e.get('role')}, reason={e.get('reason')})"
+        )
+    for ch in fl.get("changed", []) or []:
+        p = ch.get("path")
+        lines.append(f"  * {p} changed")
+        for k, v in (ch.get("changes") or {}).items():
+            if k == "content":
+                if "old_sha256" in (v or {}):
+                    lines.append("      content: sha256 changed")
+                else:
+                    lines.append(f"      content: {v.get('old')} -> {v.get('new')}")
+            else:
+                lines.append(f"      {k}: {v.get('old')} -> {v.get('new')}")
+
+    if not any(
+        [
+            (pk.get("added") or []),
+            (pk.get("removed") or []),
+            (sv.get("enabled_added") or []),
+            (sv.get("enabled_removed") or []),
+            (sv.get("changed") or []),
+            (us.get("added") or []),
+            (us.get("removed") or []),
+            (us.get("changed") or []),
+            (fl.get("added") or []),
+            (fl.get("removed") or []),
+            (fl.get("changed") or []),
+        ]
+    ):
+        lines.append("\nNo differences detected.")
+
+    return "\n".join(lines) + "\n"
+
+
+def _report_markdown(report: Dict[str, Any]) -> str:
+    old = report.get("old", {})
+    new = report.get("new", {})
+    out: List[str] = []
+    out.append("# enroll diff report\n")
+    out.append(f"Generated: `{report.get('generated_at')}`\n")
+    out.append(
+        f"- **Old**: `{old.get('input')}` (host={old.get('host')}, state_mtime={old.get('state_mtime')})\n"
+        f"- **New**: `{new.get('input')}` (host={new.get('host')}, state_mtime={new.get('state_mtime')})\n"
+    )
+
+    pk = report.get("packages", {})
+    out.append("## Packages\n")
+    out.append(f"- Added: {len(pk.get('added', []) or [])}\n")
+    for p in pk.get("added", []) or []:
+        out.append(f"  - `+ {p}`\n")
+    out.append(f"- Removed: {len(pk.get('removed', []) or [])}\n")
+    for p in pk.get("removed", []) or []:
+        out.append(f"  - `- {p}`\n")
+
+    sv = report.get("services", {})
+    out.append("## Services (enabled systemd units)\n")
+    if sv.get("enabled_added"):
+        out.append("- Enabled added\n")
+        for u in sv.get("enabled_added", []) or []:
+            out.append(f"  - `+ {u}`\n")
+    if sv.get("enabled_removed"):
+        out.append("- Enabled removed\n")
+        for u in sv.get("enabled_removed", []) or []:
+            out.append(f"  - `- {u}`\n")
+    if sv.get("changed"):
+        out.append("- Changed\n")
+        for ch in sv.get("changed", []) or []:
+            unit = ch.get("unit")
+            out.append(f"  - `{unit}`\n")
+            for k, v in (ch.get("changes") or {}).items():
+                if k == "packages":
+                    a = (v or {}).get("added", [])
+                    r = (v or {}).get("removed", [])
+                    if a:
+                        out.append(
+                            f"    - packages added: {', '.join('`'+x+'`' for x in a)}\n"
+                        )
+                    if r:
+                        out.append(
+                            f"    - packages removed: {', '.join('`'+x+'`' for x in r)}\n"
+                        )
+                else:
+                    out.append(f"    - {k}: `{v.get('old')}` → `{v.get('new')}`\n")
+
+    us = report.get("users", {})
+    out.append("## Users\n")
+    if us.get("added"):
+        out.append("- Added\n")
+        for u in us.get("added", []) or []:
+            out.append(f"  - `+ {u}`\n")
+    if us.get("removed"):
+        out.append("- Removed\n")
+        for u in us.get("removed", []) or []:
+            out.append(f"  - `- {u}`\n")
+    if us.get("changed"):
+        out.append("- Changed\n")
+        for ch in us.get("changed", []) or []:
+            name = ch.get("name")
+            out.append(f"  - `{name}`\n")
+            for k, v in (ch.get("changes") or {}).items():
+                if k == "supplementary_groups":
+                    a = (v or {}).get("added", [])
+                    r = (v or {}).get("removed", [])
+                    if a:
+                        out.append(
+                            f"    - groups added: {', '.join('`'+x+'`' for x in a)}\n"
+                        )
+                    if r:
+                        out.append(
+                            f"    - groups removed: {', '.join('`'+x+'`' for x in r)}\n"
+                        )
+                else:
+                    out.append(f"    - {k}: `{v.get('old')}` → `{v.get('new')}`\n")
+
+    fl = report.get("files", {})
+    out.append("## Files\n")
+    if fl.get("added"):
+        out.append("- Added\n")
+        for e in fl.get("added", []) or []:
+            out.append(
+                f"  - `+ {e.get('path')}` (role={e.get('role')}, reason={e.get('reason')})\n"
+            )
+    if fl.get("removed"):
+        out.append("- Removed\n")
+        for e in fl.get("removed", []) or []:
+            out.append(
+                f"  - `- {e.get('path')}` (role={e.get('role')}, reason={e.get('reason')})\n"
+            )
+    if fl.get("changed"):
+        out.append("- Changed\n")
+        for ch in fl.get("changed", []) or []:
+            p = ch.get("path")
+            out.append(f"  - `{p}`\n")
+            for k, v in (ch.get("changes") or {}).items():
+                if k == "content":
+                    if "old_sha256" in (v or {}):
+                        out.append("    - content: sha256 changed\n")
+                    else:
+                        out.append(
+                            f"    - content: `{v.get('old')}` → `{v.get('new')}`\n"
+                        )
+                else:
+                    out.append(f"    - {k}: `{v.get('old')}` → `{v.get('new')}`\n")
+
+    if not any(
+        [
+            (pk.get("added") or []),
+            (pk.get("removed") or []),
+            (sv.get("enabled_added") or []),
+            (sv.get("enabled_removed") or []),
+            (sv.get("changed") or []),
+            (us.get("added") or []),
+            (us.get("removed") or []),
+            (us.get("changed") or []),
+            (fl.get("added") or []),
+            (fl.get("removed") or []),
+            (fl.get("changed") or []),
+        ]
+    ):
+        out.append("\n_No differences detected._\n")
+
+    return "".join(out)
+
+
+def post_webhook(
+    url: str,
+    body: bytes,
+    *,
+    headers: Optional[Dict[str, str]] = None,
+    timeout_s: int = 10,
+) -> Tuple[int, str]:
+    req = urllib.request.Request(url=url, data=body, method="POST")
+    for k, v in (headers or {}).items():
+        req.add_header(k, v)
+    try:
+        with urllib.request.urlopen(req, timeout=timeout_s) as resp:  # nosec
+            status = int(getattr(resp, "status", 0) or 0)
+            text = resp.read().decode("utf-8", errors="replace")
+            return status, text
+    except Exception as e:
+        raise RuntimeError(f"webhook POST failed: {e}") from e
+
+
+def send_email(
+    *,
+    to_addrs: List[str],
+    subject: str,
+    body: str,
+    from_addr: Optional[str] = None,
+    smtp: Optional[str] = None,
+    smtp_user: Optional[str] = None,
+    smtp_password: Optional[str] = None,
+) -> None:
+    if not to_addrs:
+        raise RuntimeError("email: no recipients")
+
+    msg = EmailMessage()
+    msg["To"] = ", ".join(to_addrs)
+    if from_addr:
+        msg["From"] = from_addr
+    else:
+        host = os.uname().nodename
+        msg["From"] = f"enroll@{host}"
+    msg["Subject"] = subject
+    msg.set_content(body)
+
+    # Default: fall back to local sendmail when no SMTP server was specified.
+    if not smtp:
+        sendmail = shutil.which("sendmail")
+        if not sendmail:
+            raise RuntimeError(
+                "email: no --smtp provided and sendmail not found on PATH"
+            )
+        p = subprocess.run(
+            [sendmail, "-t", "-i"],
+            input=msg.as_bytes(),
+            capture_output=True,
+            check=False,
+        )  # nosec
+        if p.returncode != 0:
+            raise RuntimeError(
+                "email: sendmail failed:\n"
+                f"  rc: {p.returncode}\n"
+                f"  stderr: {p.stderr.decode('utf-8', errors='replace').strip()}"
+            )
+        return
+
+    import smtplib
+
+    host = smtp
+    port = 25
+    if ":" in smtp:
+        host, port_s = smtp.rsplit(":", 1)
+        try:
+            port = int(port_s)
+        except ValueError:
+            raise RuntimeError(f"email: invalid smtp port in {smtp!r}") from None
+
+    with smtplib.SMTP(host, port, timeout=10) as s:
+        s.ehlo()
+        try:
+            s.starttls()
+            s.ehlo()
+        except smtplib.SMTPNotSupportedError:
+            # STARTTLS is optional; ignore if unsupported.
+            pass  # nosec
+        if smtp_user:
+            s.login(smtp_user, smtp_password or "")
+        s.send_message(msg)
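The comparison logic in `compare_harvests` above comes down to two patterns: set differences for names that appear or disappear, and a per-key old/new record for fields that differ on names present in both harvests. The following is a rough sketch of those two patterns in isolation. `diff_names` and `diff_fields` are hypothetical helper names and do not exist in `enroll/diff.py`.

```python
from typing import Any, Dict, Iterable, List


def diff_names(old: Iterable[str], new: Iterable[str]) -> Dict[str, List[str]]:
    """Report which names were added or removed between two collections."""
    old_set, new_set = set(old), set(new)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
    }


def diff_fields(
    a: Dict[str, Any], b: Dict[str, Any], keys: Iterable[str]
) -> Dict[str, Dict[str, Any]]:
    """Record old/new values for each listed key whose value differs."""
    return {
        k: {"old": a.get(k), "new": b.get(k)}
        for k in keys
        if a.get(k) != b.get(k)
    }


if __name__ == "__main__":
    print(diff_names(["nginx", "vim"], ["vim", "curl"]))
    print(
        diff_fields(
            {"uid": 1000, "shell": "/bin/sh"},
            {"uid": 1000, "shell": "/bin/bash"},
            ["uid", "shell"],
        )
    )
```

Sorting the results keeps the report stable across runs, which matters when reports are themselves compared or sent as notifications.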
--- a/enroll/harvest.py
+++ b/enroll/harvest.py
@ -78,6 +78,14 @@ class EtcCustomSnapshot:
    notes: List[str]


+@dataclass
+class UsrLocalCustomSnapshot:
+    role_name: str
+    managed_files: List[ManagedFile]
+    excluded: List[ExcludedFile]
+    notes: List[str]
+
+
 ALLOWED_UNOWNED_EXTS = {
    ".conf",
    ".cfg",
@ -701,6 +709,103 @@ def harvest(
        notes=etc_notes,
    )

+    # -------------------------
+    # usr_local_custom role (/usr/local/etc + /usr/local/bin scripts)
+    # -------------------------
+    ul_notes: List[str] = []
+    ul_excluded: List[ExcludedFile] = []
+    ul_managed: List[ManagedFile] = []
+    ul_role_name = "usr_local_custom"
+
+    # Extend the already-captured set with etc_custom.
+    already_all: Set[str] = set(already)
+    for mf in etc_managed:
+        already_all.add(mf.path)
+
+    def _scan_usr_local_tree(
+        root: str, *, require_executable: bool, cap: int, reason: str
+    ) -> None:
+        scanned = 0
+        if not os.path.isdir(root):
+            return
+        for dirpath, _, filenames in os.walk(root):
+            for fn in filenames:
+                path = os.path.join(dirpath, fn)
+                if path in already_all:
+                    continue
+                if not os.path.isfile(path) or os.path.islink(path):
+                    continue
+                if require_executable:
+                    try:
+                        owner, group, mode = stat_triplet(path)
+                    except OSError:
+                        ul_excluded.append(ExcludedFile(path=path, reason="unreadable"))
+                        continue
+                    try:
+                        if (int(mode, 8) & 0o111) == 0:
+                            continue
+                    except ValueError:
+                        # If mode parsing fails, be conservative and skip.
+                        continue
+                else:
+                    try:
+                        owner, group, mode = stat_triplet(path)
+                    except OSError:
+                        ul_excluded.append(ExcludedFile(path=path, reason="unreadable"))
+                        continue
+
+                deny = policy.deny_reason(path)
+                if deny:
+                    ul_excluded.append(ExcludedFile(path=path, reason=deny))
+                    continue
+
+                src_rel = path.lstrip("/")
+                try:
+                    _copy_into_bundle(bundle_dir, ul_role_name, path, src_rel)
+                except OSError:
+                    ul_excluded.append(ExcludedFile(path=path, reason="unreadable"))
+                    continue
+
+                ul_managed.append(
+                    ManagedFile(
+                        path=path,
+                        src_rel=src_rel,
+                        owner=owner,
+                        group=group,
+                        mode=mode,
+                        reason=reason,
+                    )
+                )
+
+                already_all.add(path)
+                scanned += 1
+                if scanned >= cap:
+                    ul_notes.append(f"Reached file cap ({cap}) while scanning {root}.")
+                    return
+
+    # /usr/local/etc: capture all non-binary regular files (filtered by IgnorePolicy)
+    _scan_usr_local_tree(
+        "/usr/local/etc",
+        require_executable=False,
+        cap=2000,
+        reason="usr_local_etc_custom",
+    )
+
+    # /usr/local/bin: capture executable scripts only (skip non-executable text)
+    _scan_usr_local_tree(
+        "/usr/local/bin",
+        require_executable=True,
+        cap=2000,
+        reason="usr_local_bin_script",
+    )
+
+    usr_local_custom_snapshot = UsrLocalCustomSnapshot(
+        role_name=ul_role_name,
+        managed_files=ul_managed,
+        excluded=ul_excluded,
+        notes=ul_notes,
+    )
+
    state = {
        "host": {"hostname": os.uname().nodename, "os": "debian"},
        "users": asdict(users_snapshot),
@ -709,6 +814,7 @@ def harvest(
        "manual_packages_skipped": manual_pkgs_skipped,
        "package_roles": [asdict(p) for p in pkg_snaps],
        "etc_custom": asdict(etc_custom_snapshot),
+        "usr_local_custom": asdict(usr_local_custom_snapshot),
    }

    state_path = os.path.join(bundle_dir, "state.json")
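
Review note: the scan added to `harvest()` above skips symlinks and non-regular files, optionally requires an executable bit, and stops at a file cap. A minimal standalone sketch of that filtering follows. It assumes direct `os.lstat` calls in place of the project's `stat_triplet` helper and leaves out the `IgnorePolicy` check and the bundle copy; `scan_tree` is a hypothetical name, not part of enroll.

```python
import os
import stat
from typing import List, Tuple


def scan_tree(
    root: str, *, require_executable: bool, cap: int
) -> Tuple[List[str], bool]:
    """Return (captured_paths, cap_reached) for regular files under root.

    Hypothetical helper: mirrors the filtering in the diff, not the real API.
    """
    captured: List[str] = []
    if not os.path.isdir(root):
        return captured, False
    for dirpath, _dirs, filenames in os.walk(root):
        for fn in sorted(filenames):
            path = os.path.join(dirpath, fn)
            try:
                st = os.lstat(path)  # lstat does not follow symlinks
            except OSError:
                continue  # unreadable entry: skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # symlinks, sockets, devices, etc.
            if require_executable and (stat.S_IMODE(st.st_mode) & 0o111) == 0:
                continue  # no execute bit for user, group, or other
            captured.append(path)
            if len(captured) >= cap:
                # As in the diff, the cap is reported as soon as it is hit,
                # even if no further files remain.
                return captured, True
    return captured, False
```

As in the diff, the cap counts captured files rather than files visited, so a tree with many skipped entries is still walked in full.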
--- a/enroll/ignore.py
+++ b/enroll/ignore.py
@ -23,6 +23,11 @@ DEFAULT_DENY_GLOBS = [
    "/etc/gshadow",
    "/etc/*shadow",
    "/etc/letsencrypt/*",
+    "/usr/local/etc/ssl/private/*",
+    "/usr/local/etc/ssh/ssh_host_*",
+    "/usr/local/etc/*shadow",
+    "/usr/local/etc/*gshadow",
+    "/usr/local/etc/letsencrypt/*",
 ]

 SENSITIVE_CONTENT_PATTERNS = [
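
Review note: a small sketch of how glob entries like the ones added above could be evaluated. It assumes shell-style matching via `fnmatch`; the actual `IgnorePolicy` implementation is not shown in this diff and may differ. `glob_deny_reason` is a hypothetical name, and `"denied_path"` is the reason string asserted in `tests/test_misc_coverage.py`.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Optional

# Subset of the globs added in this change. "/usr/local/etc/*gshadow" is
# omitted because any path it matches also matches "/usr/local/etc/*shadow".
DENY_GLOBS = [
    "/usr/local/etc/ssl/private/*",
    "/usr/local/etc/ssh/ssh_host_*",
    "/usr/local/etc/*shadow",
    "/usr/local/etc/letsencrypt/*",
]


def glob_deny_reason(
    path: str, globs: Iterable[str] = DENY_GLOBS
) -> Optional[str]:
    """Return "denied_path" if path matches any deny glob, else None."""
    for pattern in globs:
        # fnmatchcase is case-sensitive on every platform; note that "*"
        # also matches "/" here, unlike shell pathname expansion.
        if fnmatchcase(path, pattern):
            return "denied_path"
    return None


if __name__ == "__main__":
    for candidate in ("/usr/local/etc/shadow", "/usr/local/etc/myapp.conf"):
        print(candidate, glob_deny_reason(candidate))
```

Because the check works on the path string alone, it can run before any `stat()` call, which is why the test can use a path that does not exist.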
--- a/enroll/manifest.py
+++ b/enroll/manifest.py
@ -567,7 +567,7 @@ def _tar_dir_to_with_progress(
            cols = shutil.get_terminal_size((80, 20)).columns
            msg = msg[: cols - 1]
        except Exception:
-            pass # nosec
+            pass  # nosec
        os.write(2, ("\r" + msg).encode("utf-8", errors="replace"))

    with tarfile.open(tar_path, mode="w:gz") as tf:
@ -629,6 +629,7 @@ def _manifest_from_bundle_dir(
    package_roles: List[Dict[str, Any]] = state.get("package_roles", [])
    users_snapshot: Dict[str, Any] = state.get("users", {})
    etc_custom_snapshot: Dict[str, Any] = state.get("etc_custom", {})
+    usr_local_custom_snapshot: Dict[str, Any] = state.get("usr_local_custom", {})

    site_mode = fqdn is not None and fqdn != ""

@ -661,6 +662,7 @@ def _manifest_from_bundle_dir(

    manifested_users_roles: List[str] = []
    manifested_etc_custom_roles: List[str] = []
+    manifested_usr_local_custom_roles: List[str] = []
    manifested_service_roles: List[str] = []
    manifested_pkg_roles: List[str] = []

@ -999,6 +1001,105 @@ Unowned /etc config files not attributed to packages or services.

    # -------------------------

+    # -------------------------
+
+    # -------------------------
+    # usr_local_custom role (/usr/local/etc + /usr/local/bin scripts)
+    # -------------------------
+    if usr_local_custom_snapshot and usr_local_custom_snapshot.get("managed_files"):
+        role = usr_local_custom_snapshot.get("role_name", "usr_local_custom")
+        role_dir = os.path.join(roles_root, role)
+        _write_role_scaffold(role_dir)
+
+        var_prefix = role
+
+        managed_files = usr_local_custom_snapshot.get("managed_files", [])
+        excluded = usr_local_custom_snapshot.get("excluded", [])
+        notes = usr_local_custom_snapshot.get("notes", [])
+
+        templated, jt_vars = _jinjify_managed_files(
+            bundle_dir,
+            role,
+            role_dir,
+            managed_files,
+            jt_exe=jt_exe,
+            jt_enabled=jt_enabled,
+            overwrite_templates=not site_mode,
+        )
+
+        # Copy only the non-templated artifacts (templates live in the role).
+        if site_mode:
+            _copy_artifacts(
+                bundle_dir,
+                role,
+                _host_role_files_dir(out_dir, fqdn or "", role),
+                exclude_rels=templated,
+            )
+        else:
+            _copy_artifacts(
+                bundle_dir,
+                role,
+                os.path.join(role_dir, "files"),
+                exclude_rels=templated,
+            )
+
+        files_var = _build_managed_files_var(
+            managed_files,
+            templated,
+            notify_other=None,
+            notify_systemd=None,
+        )
+
+        jt_map = _yaml_load_mapping(jt_vars) if jt_vars.strip() else {}
+        vars_map: Dict[str, Any] = {f"{var_prefix}_managed_files": files_var}
+        vars_map = _merge_mappings_overwrite(vars_map, jt_map)
+
+        if site_mode:
+            _write_role_defaults(role_dir, {f"{var_prefix}_managed_files": []})
+            _write_hostvars(out_dir, fqdn or "", role, vars_map)
+        else:
+            _write_role_defaults(role_dir, vars_map)
+
+        tasks = "---\n" + _render_generic_files_tasks(
+            var_prefix, include_restart_notify=False
+        )
+        with open(
+            os.path.join(role_dir, "tasks", "main.yml"), "w", encoding="utf-8"
+        ) as f:
+            f.write(tasks.rstrip() + "\n")
+
+        # No handlers needed for this role, but keep a valid YAML document.
+        with open(
+            os.path.join(role_dir, "handlers", "main.yml"), "w", encoding="utf-8"
+        ) as f:
+            f.write("---\n")
+
+        with open(
+            os.path.join(role_dir, "meta", "main.yml"), "w", encoding="utf-8"
+        ) as f:
+            f.write("---\ndependencies: []\n")
+
+        readme = (
+            """# usr_local_custom\n\n"""
+            "Unowned /usr/local files (scripts in /usr/local/bin and config under /usr/local/etc).\n\n"
+            "## Managed files\n"
+            + ("\n".join([f"- {mf.get('path')}" for mf in managed_files]) or "- (none)")
+            + "\n\n## Excluded\n"
+            + (
+                "\n".join([f"- {e.get('path')} ({e.get('reason')})" for e in excluded])
+                or "- (none)"
+            )
+            + "\n\n## Notes\n"
+            + ("\n".join([f"- {n}" for n in notes]) or "- (none)")
+            + "\n"
+        )
+        with open(os.path.join(role_dir, "README.md"), "w", encoding="utf-8") as f:
+            f.write(readme)
+
+        manifested_usr_local_custom_roles.append(role)
+
+    # -------------------------
+
    # -------------------------
    # Service roles
    # -------------------------
@ -1310,6 +1411,7 @@ Generated for package `{pkg}`.
        manifested_pkg_roles
        + manifested_service_roles
        + manifested_etc_custom_roles
+        + manifested_usr_local_custom_roles
        + manifested_users_roles
    )

--- a/enroll/remote.py
+++ b/enroll/remote.py
@ -200,7 +200,7 @@ def remote_harvest(

            # Stream a tarball back to the local machine (avoid creating a tar file on the remote).
            cmd = f"tar -cz -C {rbundle} ."
-            _stdin, stdout, stderr = ssh.exec_command(cmd) # nosec
+            _stdin, stdout, stderr = ssh.exec_command(cmd)  # nosec
            with open(local_tgz, "wb") as f:
                while True:
                    chunk = stdout.read(1024 * 128)
--- a/enroll/sopsutil.py
+++ b/enroll/sopsutil.py
@ -2,7 +2,7 @@ from __future__ import annotations

 import os
 import shutil
-import subprocess # nosec
+import subprocess  # nosec
 import tempfile
 from pathlib import Path
 from typing import Iterable, List, Optional
@ -62,7 +62,7 @@ def encrypt_file_binary(
        ],
        capture_output=True,
        check=False,
-    ) # nosec
+    )  # nosec
    if res.returncode != 0:
        raise SopsError(
            "sops encryption failed:\n"
@ -112,7 +112,7 @@ def decrypt_file_binary_to(
        ],
        capture_output=True,
        check=False,
-    ) # nosec
+    )  # nosec
    if res.returncode != 0:
        raise SopsError(
            "sops decryption failed:\n"
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,6 +1,6 @@
 [tool.poetry]
 name = "enroll"
-version = "0.1.0"
+version = "0.1.2"
 description = "Enroll a server's running state retrospectively into Ansible"
 authors = ["Miguel Jacq <mig@mig5.net>"]
 license = "GPL-3.0-or-later"
--- a/tests.sh
+++ b/tests.sh
@ -15,10 +15,10 @@ poetry run \
 	  --harvest "${BUNDLE_DIR}" \
 	  --out "${ANSIBLE_DIR}"

-builtin cd "${ANSIBLE_DIR}" 
+builtin cd "${ANSIBLE_DIR}"

 # Lint
-ansible-lint "${ANSIBLE_DIR}" 
+ansible-lint "${ANSIBLE_DIR}"

 # Run
 ansible-playbook playbook.yml -i "localhost," -c local --check --diff
--- a/tests/test_diff_usr_local_custom.py
+++ b/tests/test_diff_usr_local_custom.py
@ -0,0 +1,111 @@
+import json
+from pathlib import Path
+
+from enroll.diff import compare_harvests
+
+
+def _write_bundle(root: Path, state: dict, artifacts: dict[str, bytes]) -> None:
+    root.mkdir(parents=True, exist_ok=True)
+    (root / "state.json").write_text(json.dumps(state, indent=2), encoding="utf-8")
+    for rel, data in artifacts.items():
+        p = root / rel
+        p.parent.mkdir(parents=True, exist_ok=True)
+        p.write_bytes(data)
+
+
+def test_diff_includes_usr_local_custom_files(tmp_path: Path):
+    old = tmp_path / "old"
+    new = tmp_path / "new"
+
+    old_state = {
+        "host": {"hostname": "h1", "os": "debian"},
+        "users": {
+            "role_name": "users",
+            "users": [],
+            "managed_files": [],
+            "excluded": [],
+            "notes": [],
+        },
+        "services": [],
+        "package_roles": [],
+        "manual_packages": ["curl"],
+        "manual_packages_skipped": [],
+        "etc_custom": {
+            "role_name": "etc_custom",
+            "managed_files": [],
+            "excluded": [],
+            "notes": [],
+        },
+        "usr_local_custom": {
+            "role_name": "usr_local_custom",
+            "managed_files": [
+                {
+                    "path": "/usr/local/etc/myapp.conf",
+                    "src_rel": "usr/local/etc/myapp.conf",
+                    "owner": "root",
+                    "group": "root",
+                    "mode": "0644",
+                    "reason": "usr_local_etc_custom",
+                }
+            ],
+            "excluded": [],
+            "notes": [],
+        },
+    }
+    new_state = {
+        **old_state,
+        "manual_packages": ["curl", "htop"],
+        "usr_local_custom": {
+            "role_name": "usr_local_custom",
+            "managed_files": [
+                {
+                    "path": "/usr/local/etc/myapp.conf",
+                    "src_rel": "usr/local/etc/myapp.conf",
+                    "owner": "root",
+                    "group": "root",
+                    "mode": "0644",
+                    "reason": "usr_local_etc_custom",
+                },
+                {
+                    "path": "/usr/local/bin/myscript",
+                    "src_rel": "usr/local/bin/myscript",
+                    "owner": "root",
+                    "group": "root",
+                    "mode": "0755",
+                    "reason": "usr_local_bin_script",
+                },
+            ],
+            "excluded": [],
+            "notes": [],
+        },
+    }
+
+    _write_bundle(
+        old,
+        old_state,
+        {
+            "artifacts/usr_local_custom/usr/local/etc/myapp.conf": b"myapp=1\n",
+        },
+    )
+    _write_bundle(
+        new,
+        new_state,
+        {
+            "artifacts/usr_local_custom/usr/local/etc/myapp.conf": b"myapp=2\n",
+            "artifacts/usr_local_custom/usr/local/bin/myscript": b"#!/bin/sh\necho hi\n",
+        },
+    )
+
+    report, has_changes = compare_harvests(str(old), str(new))
+    assert has_changes is True
+
+    # Packages: htop was added.
+    assert report["packages"]["added"] == ["htop"]
+
+    # Files: /usr/local/etc/myapp.conf should be detected as changed (content sha differs).
+    changed_paths = {c["path"] for c in report["files"]["changed"]}
+    assert "/usr/local/etc/myapp.conf" in changed_paths
+
+    # Files: new script was added.
+    added_paths = {a["path"] for a in report["files"]["added"]}
+    assert "/usr/local/bin/myscript" in added_paths
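
Review note: the test above expects a file to be reported as changed when its content hash differs between two harvests. A hedged sketch of that comparison follows. It assumes the `artifacts/<role>/<src_rel>` layout used by the test and SHA-256 as the digest; `file_digests` and `diff_files` are hypothetical names, and the real `compare_harvests` may work differently.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digests(bundle: Path, role: str, src_rels: List[str]) -> Dict[str, str]:
    """Map absolute path -> SHA-256 hex digest of the harvested copy."""
    digests: Dict[str, str] = {}
    for rel in src_rels:
        data = (bundle / "artifacts" / role / rel).read_bytes()
        digests["/" + rel] = hashlib.sha256(data).hexdigest()
    return digests


def diff_files(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify paths as added, removed, or changed between two harvests."""
    old_paths, new_paths = set(old), set(new)
    return {
        "added": sorted(new_paths - old_paths),
        "removed": sorted(old_paths - new_paths),
        "changed": sorted(
            p for p in old_paths & new_paths if old[p] != new[p]
        ),
    }
```

This sketch compares content only; ownership and mode changes recorded in `state.json` would need a separate comparison.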
--- a/tests/test_harvest.py
+++ b/tests/test_harvest.py
@ -23,30 +23,51 @@ def test_harvest_dedup_manual_packages_and_builds_etc_custom(
    real_islink = os.path.islink

    # Fake filesystem: two /etc files exist, only one is dpkg-owned.
+    # Also include some /usr/local files to populate usr_local_custom.
    files = {
        "/etc/openvpn/server.conf": b"server",
        "/etc/default/keyboard": b"kbd",
+        "/usr/local/etc/myapp.conf": b"myapp=1\n",
+        "/usr/local/bin/myscript": b"#!/bin/sh\necho hi\n",
+        # non-executable text under /usr/local/bin should be skipped
+        "/usr/local/bin/readme.txt": b"hello\n",
+    }
+    dirs = {
+        "/etc",
+        "/etc/openvpn",
+        "/etc/default",
+        "/usr",
+        "/usr/local",
+        "/usr/local/etc",
+        "/usr/local/bin",
    }
-    dirs = {"/etc", "/etc/openvpn", "/etc/default"}

    def fake_isfile(p: str) -> bool:
        if p.startswith("/etc/") or p == "/etc":
            return p in files
+        if p.startswith("/usr/local/"):
+            return p in files
        return real_isfile(p)

    def fake_isdir(p: str) -> bool:
        if p.startswith("/etc"):
            return p in dirs
+        if p.startswith("/usr/local") or p in ("/usr", "/usr/local"):
+            return p in dirs
        return real_isdir(p)

    def fake_islink(p: str) -> bool:
        if p.startswith("/etc"):
            return False
+        if p.startswith("/usr/local"):
+            return False
        return real_islink(p)

    def fake_exists(p: str) -> bool:
        if p.startswith("/etc"):
            return p in files or p in dirs
+        if p.startswith("/usr/local") or p in ("/usr", "/usr/local"):
+            return p in files or p in dirs
        return real_exists(p)

    def fake_walk(root: str):
@ -57,6 +78,10 @@ def test_harvest_dedup_manual_packages_and_builds_etc_custom(
            yield ("/etc/openvpn", [], ["server.conf"])
        elif root == "/etc/default":
            yield ("/etc/default", [], ["keyboard"])
+        elif root == "/usr/local/etc":
+            yield ("/usr/local/etc", [], ["myapp.conf"])
+        elif root == "/usr/local/bin":
+            yield ("/usr/local/bin", [], ["myscript", "readme.txt"])
        else:
            yield (root, [], [])

@ -109,7 +134,13 @@ def test_harvest_dedup_manual_packages_and_builds_etc_custom(
    monkeypatch.setattr(h, "list_manual_packages", lambda: ["openvpn", "curl"])
    monkeypatch.setattr(h, "collect_non_system_users", lambda: [])

-    monkeypatch.setattr(h, "stat_triplet", lambda p: ("root", "root", "0644"))
+    def fake_stat_triplet(p: str):
+        if p == "/usr/local/bin/myscript":
+            return ("root", "root", "0755")
+        # /usr/local/bin/readme.txt remains non-executable
+        return ("root", "root", "0644")
+
+    monkeypatch.setattr(h, "stat_triplet", fake_stat_triplet)

    # Avoid needing source files on disk by implementing our own bundle copier
    def fake_copy(bundle_dir: str, role_name: str, abs_path: str, src_rel: str):
@ -139,3 +170,9 @@ def test_harvest_dedup_manual_packages_and_builds_etc_custom(
    assert any(
        mf["path"] == "/etc/default/keyboard" for mf in etc_custom["managed_files"]
    )
+
+    # /usr/local content is attributed to usr_local_custom
+    ul = st["usr_local_custom"]
+    assert any(mf["path"] == "/usr/local/etc/myapp.conf" for mf in ul["managed_files"])
+    assert any(mf["path"] == "/usr/local/bin/myscript" for mf in ul["managed_files"])
+    assert all(mf["path"] != "/usr/local/bin/readme.txt" for mf in ul["managed_files"])
--- a/tests/test_manifest.py
+++ b/tests/test_manifest.py
@ -47,6 +47,29 @@ def test_manifest_writes_roles_and_playbook_with_clean_when(tmp_path: Path):
            "excluded": [],
            "notes": [],
        },
+        "usr_local_custom": {
+            "role_name": "usr_local_custom",
+            "managed_files": [
+                {
+                    "path": "/usr/local/etc/myapp.conf",
+                    "src_rel": "usr/local/etc/myapp.conf",
+                    "owner": "root",
+                    "group": "root",
+                    "mode": "0644",
+                    "reason": "usr_local_etc_custom",
+                },
+                {
+                    "path": "/usr/local/bin/myscript",
+                    "src_rel": "usr/local/bin/myscript",
+                    "owner": "root",
+                    "group": "root",
+                    "mode": "0755",
+                    "reason": "usr_local_bin_script",
+                },
+            ],
+            "excluded": [],
+            "notes": [],
+        },
        "services": [
            {
                "unit": "foo.service",
@ -92,6 +115,26 @@ def test_manifest_writes_roles_and_playbook_with_clean_when(tmp_path: Path):
        "kbd", encoding="utf-8"
    )

+    # Create artifacts for usr_local_custom files so copy works
+    (bundle / "artifacts" / "usr_local_custom" / "usr" / "local" / "etc").mkdir(
+        parents=True, exist_ok=True
+    )
+    (
+        bundle
+        / "artifacts"
+        / "usr_local_custom"
+        / "usr"
+        / "local"
+        / "etc"
+        / "myapp.conf"
+    ).write_text("myapp=1\n", encoding="utf-8")
+    (bundle / "artifacts" / "usr_local_custom" / "usr" / "local" / "bin").mkdir(
+        parents=True, exist_ok=True
+    )
+    (
+        bundle / "artifacts" / "usr_local_custom" / "usr" / "local" / "bin" / "myscript"
+    ).write_text("#!/bin/sh\necho hi\n", encoding="utf-8")
+
    manifest(str(bundle), str(out))

    # Service role: systemd management should be gated on foo_manage_unit and a probe.
@ -119,6 +162,7 @@ def test_manifest_writes_roles_and_playbook_with_clean_when(tmp_path: Path):
    pb = (out / "playbook.yml").read_text(encoding="utf-8")
    assert "- users" in pb
    assert "- etc_custom" in pb
+    assert "- usr_local_custom" in pb
    assert "- curl" in pb
    assert "- foo" in pb

@ -168,6 +212,21 @@ def test_manifest_site_mode_creates_host_inventory_and_raw_files(tmp_path: Path)
            "excluded": [],
            "notes": [],
        },
+        "usr_local_custom": {
+            "role_name": "usr_local_custom",
+            "managed_files": [
+                {
+                    "path": "/usr/local/etc/myapp.conf",
+                    "src_rel": "usr/local/etc/myapp.conf",
+                    "owner": "root",
+                    "group": "root",
+                    "mode": "0644",
+                    "reason": "usr_local_etc_custom",
+                }
+            ],
+            "excluded": [],
+            "notes": [],
+        },
        "services": [
            {
                "unit": "foo.service",
@ -197,6 +256,20 @@ def test_manifest_site_mode_creates_host_inventory_and_raw_files(tmp_path: Path)
    bundle.mkdir(parents=True, exist_ok=True)
    (bundle / "state.json").write_text(json.dumps(state, indent=2), encoding="utf-8")

+    # Artifacts for usr_local_custom file so copy works.
+    (bundle / "artifacts" / "usr_local_custom" / "usr" / "local" / "etc").mkdir(
+        parents=True, exist_ok=True
+    )
+    (
+        bundle
+        / "artifacts"
+        / "usr_local_custom"
+        / "usr"
+        / "local"
+        / "etc"
+        / "myapp.conf"
+    ).write_text("myapp=1\n", encoding="utf-8")
+
    manifest(str(bundle), str(out), fqdn=fqdn)

    # Host playbook exists.
--- a/tests/test_misc_coverage.py
+++ b/tests/test_misc_coverage.py
@ -0,0 +1,96 @@
+import stat
+from pathlib import Path
+
+import pytest
+
+from enroll.cache import _safe_component, new_harvest_cache_dir
+from enroll.ignore import IgnorePolicy
+from enroll.sopsutil import (
+    SopsError,
+    _pgp_arg,
+    decrypt_file_binary_to,
+    encrypt_file_binary,
+)
+
+
+def test_safe_component_sanitizes_and_bounds_length():
+    assert _safe_component("  ") == "unknown"
+    assert _safe_component("a/b c") == "a_b_c"
+    assert _safe_component("x" * 200) == "x" * 64
+
+
+def test_new_harvest_cache_dir_uses_xdg_cache_home(tmp_path: Path, monkeypatch):
+    monkeypatch.setenv("XDG_CACHE_HOME", str(tmp_path / "xdg"))
+    hc = new_harvest_cache_dir(hint="my host/01")
+    assert hc.dir.exists()
+    assert "my_host_01" in hc.dir.name
+    assert str(hc.dir).startswith(str(tmp_path / "xdg"))
+    # best-effort: ensure directory is not world-readable on typical FS
+    try:
+        mode = stat.S_IMODE(hc.dir.stat().st_mode)
+        assert mode & 0o077 == 0
+    except OSError:
+        pass
+
+
+def test_ignore_policy_denies_binary_and_sensitive_content(tmp_path: Path):
+    p_bin = tmp_path / "binfile"
+    p_bin.write_bytes(b"abc\x00def")
+    assert IgnorePolicy().deny_reason(str(p_bin)) == "binary_like"
+
+    p_secret = tmp_path / "secret.conf"
+    p_secret.write_text("password=foo\n", encoding="utf-8")
+    assert IgnorePolicy().deny_reason(str(p_secret)) == "sensitive_content"
+
+    # dangerous mode disables heuristic scanning (but still checks file-ness/size)
+    assert IgnorePolicy(dangerous=True).deny_reason(str(p_secret)) is None
+
+
+def test_ignore_policy_denies_usr_local_shadow_by_glob():
+    # This should short-circuit before stat() (path doesn't need to exist).
+    assert IgnorePolicy().deny_reason("/usr/local/etc/shadow") == "denied_path"
+
+
+def test_sops_pgp_arg_and_encrypt_decrypt_roundtrip(tmp_path: Path, monkeypatch):
+    assert _pgp_arg(["  ABC ", "DEF"]) == "ABC,DEF"
+    with pytest.raises(SopsError):
+        _pgp_arg([])
+
+    # Stub out sops and subprocess.
+    import enroll.sopsutil as s
+
+    monkeypatch.setattr(s, "require_sops_cmd", lambda: "sops")
+
+    class R:
+        def __init__(self, rc: int, out: bytes, err: bytes = b""):
+            self.returncode = rc
+            self.stdout = out
+            self.stderr = err
+
+    calls = []
+
+    def fake_run(cmd, capture_output, check):
+        calls.append(cmd)
+        # Return a deterministic payload so we can assert file writes.
+        if "--encrypt" in cmd:
+            return R(0, b"ENCRYPTED")
+        if "--decrypt" in cmd:
+            return R(0, b"PLAINTEXT")
+        return R(1, b"", b"bad")
+
+    monkeypatch.setattr(s.subprocess, "run", fake_run)
+
+    src = tmp_path / "src.bin"
+    src.write_bytes(b"x")
+    enc = tmp_path / "out.sops"
+    dec = tmp_path / "out.bin"
+
+    encrypt_file_binary(src, enc, pgp_fingerprints=["ABC"], mode=0o600)
+    assert enc.read_bytes() == b"ENCRYPTED"
+
+    decrypt_file_binary_to(enc, dec, mode=0o644)
+    assert dec.read_bytes() == b"PLAINTEXT"
+
+    # Sanity: we invoked encrypt and decrypt.
+    assert any("--encrypt" in c for c in calls)
+    assert any("--decrypt" in c for c in calls)
Author	SHA1	Message	Date
Miguel Jacq	25add369dc	README.md update	2025-12-18 17:24:45 +11:00
Miguel Jacq	4660a0703e	Include files from `/usr/local/bin` and `/usr/local/etc` in harvest (assuming they aren't binaries or symlinks) and store in `usr_local_custom` role, similar to `etc_custom`.	2025-12-18 17:11:04 +11:00
Miguel Jacq	b5d2b99174	Add diff mode	2025-12-18 14:59:51 +11:00
Miguel Jacq	55e50ebf59	Fix end of file/whitespace per pre-commit	2025-12-18 13:50:00 +11:00
Miguel Jacq	e94bd86c75	Add files param to bandit pre-commit	2025-12-18 13:45:59 +11:00
Miguel Jacq	bfa2f4a724	Add bandit to pre-commit	2025-12-18 13:44:26 +11:00
Miguel Jacq	591ecaa235	Add pre-commit config	2025-12-18 13:41:22 +11:00
Miguel Jacq	a235028f3b	black	2025-12-18 13:34:37 +11:00