Compare commits

...

85 commits
0.1.0 ... main

Author SHA1 Message Date
3fcfefe644
0.5.0
All checks were successful
CI / test (push) Successful in 8m28s
Lint / test (push) Successful in 1m5s
2026-05-12 12:24:00 +10:00
618dd20e7c
Update deps 2026-05-12 12:23:52 +10:00
5695f4258e
Add support for ssh configs as templates, via JinjaTurtle 2026-05-12 12:23:41 +10:00
5c686d27cc
Remove trivy..
All checks were successful
CI / test (push) Successful in 8m16s
Lint / test (push) Successful in 33s
2026-03-23 11:20:56 +11:00
4ea7267b92
Update my GPG key
All checks were successful
CI / test (push) Successful in 8m26s
Lint / test (push) Successful in 33s
Trivy / test (push) Successful in 25s
2026-03-11 12:02:39 +11:00
d403dcb918
0.4.4
All checks were successful
CI / test (push) Successful in 8m14s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 24s
2026-02-17 10:58:38 +11:00
778237740a
Add ability to gracefully handle an encrypted private key for SSH (can be forced or automated with an env var too)
All checks were successful
CI / test (push) Successful in 8m22s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 24s
2026-02-17 10:35:51 +11:00
87ddf52e81
Update cryptography dependency
All checks were successful
CI / test (push) Successful in 8m22s
Lint / test (push) Successful in 33s
Trivy / test (push) Successful in 26s
2026-02-17 10:00:39 +11:00
5f6b0f49d9
Update dependencies
All checks were successful
CI / test (push) Successful in 8m22s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 26s
2026-01-16 10:59:22 +11:00
1856e3a79d
Add support for AddressFamily and ConnectTimeout in the .ssh/config when using --remote-ssh-config. 2026-01-16 10:58:39 +11:00
478b0e1b9d
Add README example for --remote-ssh-config
All checks were successful
CI / test (push) Successful in 8m19s
Lint / test (push) Successful in 33s
Trivy / test (push) Successful in 25s
2026-01-13 22:03:58 +11:00
f5eaac9f75
Support --remote-ssh-config [path-to-ssh-config] as an argument in case extra params are required beyond --remote-port or --remote-user.
All checks were successful
CI / test (push) Successful in 8m18s
Lint / test (push) Successful in 33s
Trivy / test (push) Successful in 25s
Note: `--remote-host` must still be set, but it can be an 'alias' represented by the 'Host' value in the ssh config.
2026-01-13 21:56:28 +11:00
5754ef1aad
Add interactive output when 'enroll diff --enforce' is invoking Ansible.
All checks were successful
CI / test (push) Successful in 8m18s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 24s
2026-01-11 10:01:16 +11:00
d172d848c4
Relax python3-jsonschema version for Fedora support
All checks were successful
CI / test (push) Successful in 8m16s
Lint / test (push) Successful in 34s
Trivy / test (push) Successful in 24s
2026-01-10 11:44:51 +11:00
f84d795c49
Rename test file
All checks were successful
CI / test (push) Successful in 8m15s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 24s
2026-01-10 11:24:01 +11:00
95b784c1a0
Fix and add tests
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2026-01-10 11:16:28 +11:00
ebd30247d1
Add --enforce mode to enroll diff and add --ignore-package-versions
Some checks failed
CI / test (push) Failing after 1m48s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 22s
If there is diff detected between the two harvests, and it can
enforce restoring the state from the older harvest, it will
manifest the state and apply it with ansible. Only the specific
roles that had diffed will be applied (via the new tags capability).

`--ignore-package-versions` will skip reporting when packages are
upgraded/downgraded in the diff.
2026-01-10 10:51:41 +11:00
9a249cc973
Initial pass at an --enforce mode for enroll diff, to manifest and restore state of old harvest if ansible is on the PATH
All checks were successful
CI / test (push) Successful in 8m13s
Lint / test (push) Successful in 33s
Trivy / test (push) Successful in 23s
2026-01-10 09:50:28 +11:00
9749190cd8
Fix test
All checks were successful
CI / test (push) Successful in 8m14s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 21s
2026-01-10 09:15:29 +11:00
ca3d958a96
Add --exclude-path to enroll diff command
Some checks failed
CI / test (push) Failing after 1m45s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
So that you can ignore certain churn from the diff

(stuff you still wanted to harvest as a baseline but don't care if it changes day to day)
2026-01-10 08:56:35 +11:00
8be821c494
Update pynacl dependency to resolve CVE-2025-69277
All checks were successful
CI / test (push) Successful in 8m1s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-08 17:16:58 +11:00
8daed96b7c
Attempt to generate Jinja2 templates of systemd unit files and Postfix main.cf (now that JinjaTurtle supports it)
All checks were successful
CI / test (push) Successful in 8m13s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-06 12:47:12 +11:00
e0ef5ede98
Run validate in CLI tests
All checks were successful
CI / test (push) Successful in 8m28s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 26s
2026-01-05 21:30:14 +11:00
025f00f924
Fix tests
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2026-01-05 21:25:46 +11:00
66d032d981
Introduce 'enroll validate' to check a harvest meets the schema spec and isn't lacking artifacts or contains orphaned ones
Some checks failed
CI / test (push) Failing after 1m47s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-05 21:17:50 +11:00
45e0d9bb16
0.3.0
All checks were successful
CI / test (push) Successful in 8m25s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-05 17:13:43 +11:00
9f30c56e8a
Don't remove apache2 (it breaks the manifest run)
Some checks failed
CI / test (push) Successful in 8m22s
Lint / test (push) Successful in 31s
Trivy / test (push) Has been cancelled
2026-01-05 17:04:06 +11:00
7a9a0abcd1
Add tests for symlinks management
Some checks failed
CI / test (push) Failing after 7m32s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 23s
2026-01-05 16:54:39 +11:00
aea58c8684
Install Apache2 to test symlinks management in the ansible manifests
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2026-01-05 16:48:08 +11:00
ca4cf00e84
Changelog entry for symlinks
All checks were successful
CI / test (push) Successful in 7m36s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-05 16:30:14 +11:00
d3fdfc9ef7
Manage certain symlinks e.g for apache2/nginx sites-enabled and so on
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2026-01-05 16:29:21 +11:00
bcf3dd7422
Fix tests
All checks were successful
CI / test (push) Successful in 7m18s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 22s
2026-01-05 15:52:25 +11:00
91ec1b8791
Ignore files ending in - in the /etc/ dir e.g /etc/shadow-
Some checks failed
CI / test (push) Failing after 1m43s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 23s
2026-01-05 15:48:17 +11:00
b5e32770a3
Ignore files that end with a tilde (probably backup files generated by editors) 2026-01-05 15:23:45 +11:00
e04b158c39
Fix non-interactive test
All checks were successful
CI / test (push) Successful in 7m19s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 23s
2026-01-05 15:06:20 +11:00
a1433d645f
Capture other files in the user's home directory
Some checks failed
CI / test (push) Failing after 1m57s
Lint / test (push) Successful in 32s
Trivy / test (push) Successful in 27s
Such as `.bashrc`, `.bash_aliases`, `.profile`, if these files differ from the `/etc/skel` defaults
2026-01-05 15:02:22 +11:00
e68ec0bffc
More test coverage 2026-01-05 14:27:56 +11:00
24cedc8c8d
Centralise the cron and logrotate stuff into their respective roles.
All checks were successful
CI / test (push) Successful in 7m52s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 23s
We had a bit of duplication between roles based on harvest discovery.

Arguably some crons/logrotate scripts are specific to other packages,
but it helps to go to one place to find them all. We'll apply these
roles last in the playbook, to give an opportunity for all other
packages / non-system users to have been installed already.
2026-01-05 12:01:25 +11:00
c9003d589d
Fix test. Update README
All checks were successful
CI / test (push) Successful in 8m1s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 23s
2026-01-05 10:23:15 +11:00
59674d4660
Introduce enroll explain
Some checks failed
CI / test (push) Failing after 1m45s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
A tool to analyze and explain what's in (or not in) a harvest and why.
2026-01-05 10:16:44 +11:00
56d0148614
Update README
All checks were successful
CI / test (push) Successful in 6m53s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 22s
2026-01-04 21:27:23 +11:00
04234e296f
0.2.3
All checks were successful
CI / test (push) Successful in 6m55s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 23s
2026-01-04 21:05:49 +11:00
a2be708a31
Support for remote hosts that require password for sudo.
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
Introduce --ask-become-pass or -K to support password-required sudo on remote hosts, just like Ansible.

It will also fall back to this prompt if a password is required but the arg wasn't passed in.

With thanks to slhck from HN for the initial patch, advice and feedback.
2026-01-04 20:49:10 +11:00
9df4dc862d
Add CONTRIBUTORS.md 2026-01-04 15:53:33 +11:00
fd55bcde9b
fix fedora release
All checks were successful
CI / test (push) Successful in 7m2s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-03 12:56:59 +11:00
1d3ce6191e
remove 'fc' from release root
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2026-01-03 12:49:14 +11:00
626d76c755
Update README for RPM repo URL
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2026-01-03 12:46:32 +11:00
f82fd894ca
More test coverage (71%)
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2026-01-03 12:34:39 +11:00
9a2516d858
Fix release date
All checks were successful
CI / test (push) Successful in 7m0s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-03 12:17:43 +11:00
6c3275b44a
Fix tests
All checks were successful
CI / test (push) Successful in 7m4s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 22s
2026-01-03 11:46:40 +11:00
824010b2ab
Several bug fixes and prep for 0.2.2
Some checks failed
CI / test (push) Failing after 1m40s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 24s
- Fix stat() of parent directory so that we set directory perms correct on --include paths.
 - Set pty for remote calls when sudo is required, to help systems with limits on sudo without pty
2026-01-03 11:39:57 +11:00
29b52d451d
0.2.1
Some checks failed
CI / test (push) Failing after 2m37s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 23s
2026-01-02 21:29:16 +11:00
c88405ef01
Ensure directories in the tree of anything included with --include are defined in the state and manifest so we make dirs before we try to create files 2026-01-02 21:10:32 +11:00
781efef467
Don't accidentally add extra_paths role to usr_local_custom list, resulting in extra_paths appearing twice in manifested playbook 2026-01-02 20:19:47 +11:00
09438246ae
Build for Fedora 43
All checks were successful
CI / test (push) Successful in 6m42s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-01 15:24:21 +11:00
e4887b7add
Update README.md
All checks were successful
CI / test (push) Successful in 6m39s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 23s
2026-01-01 11:02:30 +11:00
e44e4aaf3a
0.2.0
All checks were successful
CI / test (push) Successful in 4m52s
Lint / test (push) Successful in 27s
Trivy / test (push) Successful in 17s
2025-12-29 17:39:39 +11:00
f01603dac4
Better attribution of config files to parent service/role (not systemd helpers)
All checks were successful
CI / test (push) Successful in 4m51s
Lint / test (push) Successful in 27s
Trivy / test (push) Successful in 15s
2025-12-29 17:19:59 +11:00
081739fd19
Fix tests
All checks were successful
CI / test (push) Successful in 5m7s
Lint / test (push) Successful in 29s
Trivy / test (push) Successful in 18s
2025-12-29 16:35:21 +11:00
043802e800
Refactor state structure and capture versions of packages 2025-12-29 16:10:27 +11:00
984b0fa81b
Add ability to enroll RH-style systems (DNF5/DNF/RPM)
All checks were successful
CI / test (push) Successful in 5m9s
Lint / test (push) Successful in 27s
Trivy / test (push) Successful in 17s
2025-12-29 14:59:34 +11:00
ad2abed612
Add version CLI arg 2025-12-29 14:29:11 +11:00
8c19473e18
Fix an attribution bug for certain files ending up in the wrong package/role.
All checks were successful
CI / test (push) Successful in 5m2s
Lint / test (push) Successful in 29s
Trivy / test (push) Successful in 21s
2025-12-28 18:37:14 +11:00
921801caa6
0.1.6
All checks were successful
CI / test (push) Successful in 5m24s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 16s
2025-12-28 15:32:40 +11:00
3fc5aec5fc
0.1.5
All checks were successful
CI / test (push) Successful in 5m4s
Lint / test (push) Successful in 28s
Trivy / test (push) Successful in 17s
2025-12-28 09:56:52 +11:00
8c6b51be3e
Manage apt stuff in its own role, not in etc_custom
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2025-12-28 09:39:14 +11:00
303c1b0dd8
Consolidate logrotate and cron files into their main service/package roles if they exist. Standardise on MAX_FILES_CAP in one place 2025-12-28 09:30:21 +11:00
cae6246177
Add Fedora install steps to README
All checks were successful
CI / test (push) Successful in 5m1s
Lint / test (push) Successful in 29s
Trivy / test (push) Successful in 18s
2025-12-27 19:14:01 +11:00
40aad9e798
0.1.4
All checks were successful
CI / test (push) Successful in 5m0s
Lint / test (push) Successful in 28s
Trivy / test (push) Successful in 18s
2025-12-27 19:04:00 +11:00
054a6192d1
Capture more singletons in /etc and avoid apt duplication
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2025-12-27 19:02:22 +11:00
4d2250f974
Add fedora rpm building
All checks were successful
CI / test (push) Successful in 4m45s
Lint / test (push) Successful in 28s
Trivy / test (push) Successful in 17s
2025-12-27 16:56:30 +11:00
8c478249d9
Add build-deb action workflow
All checks were successful
CI / test (push) Successful in 4m48s
Lint / test (push) Successful in 27s
Trivy / test (push) Successful in 17s
2025-12-23 17:22:50 +11:00
51196a0a2b
Fix trivy exit code
All checks were successful
CI / test (push) Successful in 4m48s
Lint / test (push) Successful in 28s
Trivy / test (push) Successful in 17s
2025-12-22 17:28:10 +11:00
59239eb2d2
Fix formatting in README
All checks were successful
CI / test (push) Successful in 5m33s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 17s
2025-12-20 18:38:05 +11:00
cf819f755a
0.1.3
All checks were successful
CI / test (push) Successful in 5m35s
Lint / test (push) Successful in 28s
Trivy / test (push) Successful in 20s
2025-12-20 18:26:04 +11:00
9641637d4d
Add support for an enroll.ini config file to store arguments per subcommand, to avoid having to remember them all for repetitive executions.
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2025-12-20 18:24:46 +11:00
240e79706f
Allow the user to add extra paths to harvest, or paths to ignore, using `--exclude-path` and `--include-path` arguments.
All checks were successful
CI / test (push) Successful in 5m31s
Lint / test (push) Successful in 34s
Trivy / test (push) Successful in 19s
2025-12-20 17:47:00 +11:00
25add369dc
README.md update
All checks were successful
CI / test (push) Successful in 5m3s
Lint / test (push) Successful in 27s
Trivy / test (push) Successful in 18s
2025-12-18 17:24:45 +11:00
4660a0703e
Include files from /usr/local/bin and /usr/local/etc in harvest (assuming they aren't binaries or symlinks) and store in usr_local_custom role, similar to etc_custom.
All checks were successful
CI / test (push) Successful in 5m43s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 19s
2025-12-18 17:11:04 +11:00
b5d2b99174
Add diff mode
All checks were successful
CI / test (push) Successful in 5m14s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 23s
2025-12-18 14:59:51 +11:00
55e50ebf59
Fix end of file/whitespace per pre-commit
All checks were successful
CI / test (push) Successful in 5m11s
Lint / test (push) Successful in 27s
Trivy / test (push) Successful in 17s
2025-12-18 13:50:00 +11:00
e94bd86c75
Add files param to bandit pre-commit
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2025-12-18 13:45:59 +11:00
bfa2f4a724
Add bandit to pre-commit 2025-12-18 13:44:26 +11:00
591ecaa235
Add pre-commit config
Some checks failed
Lint / test (push) Waiting to run
Trivy / test (push) Waiting to run
CI / test (push) Has been cancelled
2025-12-18 13:41:22 +11:00
a235028f3b
black
All checks were successful
CI / test (push) Successful in 5m38s
Lint / test (push) Successful in 27s
Trivy / test (push) Successful in 21s
2025-12-18 13:34:37 +11:00
69 changed files with 15288 additions and 1197 deletions

@@ -0,0 +1,66 @@
name: CI
on:
  push:
jobs:
  test:
    runs-on: docker
    steps:
      - name: Install system dependencies
        run: |
          apt-get update
          DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
            build-essential \
            devscripts \
            debhelper \
            dh-python \
            pybuild-plugin-pyproject \
            python3-all \
            python3-poetry-core \
            python3-yaml \
            python3-paramiko \
            python3-jsonschema \
            rsync \
            ca-certificates
      - name: Checkout
        uses: actions/checkout@v4
        with:
          submodules: recursive
      - name: Build deb
        run: |
          mkdir /out
          rsync -a --delete \
            --exclude '.git' \
            --exclude '.venv' \
            --exclude 'dist' \
            --exclude 'build' \
            --exclude '__pycache__' \
            --exclude '.pytest_cache' \
            --exclude '.mypy_cache' \
            ./ /out/
          cd /out/
          export DEBEMAIL="mig@mig5.net"
          export DEBFULLNAME="Miguel Jacq"
          dch --distribution "trixie" --local "~trixie" "CI build for trixie"
          dpkg-buildpackage -us -uc -b
      # Notify if any previous step in this job failed
      - name: Notify on failure
        if: ${{ failure() }}
        env:
          WEBHOOK_URL: ${{ secrets.NODERED_WEBHOOK_URL }}
          REPOSITORY: ${{ forgejo.repository }}
          RUN_NUMBER: ${{ forgejo.run_number }}
          SERVER_URL: ${{ forgejo.server_url }}
        run: |
          curl -X POST \
            -H "Content-Type: application/json" \
            -d "{\"repository\":\"$REPOSITORY\",\"run_number\":\"$RUN_NUMBER\",\"status\":\"failure\",\"url\":\"$SERVER_URL/$REPOSITORY/actions/runs/$RUN_NUMBER\"}" \
            "$WEBHOOK_URL"

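The `Notify on failure` step in the workflow above posts a small JSON document to a webhook. As a rough sketch of what that body looks like, the snippet below rebuilds the same string locally without calling `curl`. The helper name `build_payload` and the sample repository, run number and server URL are hypothetical; only the JSON field names are taken from the workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: prints the JSON body that the workflow step posts.
# Arguments: repository, run number, server URL (sample values are used below).
build_payload() {
  local repository="$1" run_number="$2" server_url="$3"
  printf '{"repository":"%s","run_number":"%s","status":"failure","url":"%s/%s/actions/runs/%s"}\n' \
    "${repository}" "${run_number}" "${server_url}" "${repository}" "${run_number}"
}

# Made-up values, standing in for the forgejo.* context variables.
build_payload "mig5/enroll" "42" "https://git.example.org"
```

Like the original step, this interpolates values straight into the string, so a value containing a double quote or backslash would produce invalid JSON. Building the body with `jq -n --arg ...` would be one way to avoid that.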
@@ -15,7 +15,7 @@ jobs:
         run: |
           apt-get update
           DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
-            ansible ansible-lint python3-venv pipx systemctl python3-apt
+            ansible ansible-lint python3-venv pipx systemctl python3-apt jq python3-jsonschema
       - name: Install Poetry
         run: |

@@ -1,40 +0,0 @@
name: Trivy
on:
  schedule:
    - cron: '0 1 * * *'
  push:
jobs:
  test:
    runs-on: docker
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install system dependencies
        run: |
          DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends wget gnupg
          wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | gpg --dearmor | tee /usr/share/keyrings/trivy.gpg > /dev/null
          echo "deb [signed-by=/usr/share/keyrings/trivy.gpg] https://aquasecurity.github.io/trivy-repo/deb generic main" | tee -a /etc/apt/sources.list.d/trivy.list
          apt-get update
          DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends trivy
      - name: Run trivy
        run: |
          trivy fs --no-progress --ignore-unfixed --format table --disable-telemetry .
      # Notify if any previous step in this job failed
      - name: Notify on failure
        if: ${{ failure() }}
        env:
          WEBHOOK_URL: ${{ secrets.NODERED_WEBHOOK_URL }}
          REPOSITORY: ${{ forgejo.repository }}
          RUN_NUMBER: ${{ forgejo.run_number }}
          SERVER_URL: ${{ forgejo.server_url }}
        run: |
          curl -X POST \
            -H "Content-Type: application/json" \
            -d "{\"repository\":\"$REPOSITORY\",\"run_number\":\"$RUN_NUMBER\",\"status\":\"failure\",\"url\":\"$SERVER_URL/$REPOSITORY/actions/runs/$RUN_NUMBER\"}" \
            "$WEBHOOK_URL"

25
.pre-commit-config.yaml Normal file
@@ -0,0 +1,25 @@
repos:
  - repo: https://github.com/pycqa/flake8
    rev: 7.3.0
    hooks:
      - id: flake8
        args: ["--select=F"]
        types: [python]
  - repo: https://github.com/psf/black-pre-commit-mirror
    rev: 25.11.0
    hooks:
      - id: black
        language_version: python3
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/PyCQA/bandit
    rev: 1.9.2
    hooks:
      - id: bandit
        files: ^enroll/

@@ -1,3 +1,96 @@
# 0.4.4
* Update cryptography dependency
* Add capability to handle passphrases on encrypted SSH private keys. Prompting can be forced with `--ask-key-passphrase` or automated (e.g for CI) with `--ssh-key-passphrase env SOMEVAR`
# 0.4.3
* Add support for AddressFamily and ConnectTimeout in the .ssh/config when using `--remote-ssh-config`.
* Update dependencies
# 0.4.2
* Support `--remote-ssh-config [path-to-ssh-config]` as an argument in case extra params are required beyond `--remote-port` or `--remote-user`. Note: `--remote-host` must still be set, but it can be an 'alias' represented by the 'Host' value in the ssh config.
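As a minimal sketch of the `--remote-ssh-config` entry above, the snippet below writes a throwaway ssh config with a `Host` alias and prints the kind of command the changelog describes. The alias `web1`, the address, user and port are made-up sample values. The `harvest` subcommand name and the set of ssh config keys enroll honours beyond `AddressFamily` and `ConnectTimeout` are assumptions, not confirmed by this diff.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a throwaway ssh config. The alias "web1", the address (from the
# 192.0.2.0/24 documentation range), the user and the port are sample values.
cfg="$(mktemp)"
cat > "${cfg}" <<'EOF'
Host web1
    HostName 192.0.2.10
    User deploy
    Port 2222
    AddressFamily inet
    ConnectTimeout 10
EOF

# Assumed invocation: --remote-host carries the 'Host' alias and
# --remote-ssh-config points at the file. The 'harvest' subcommand name is an
# assumption and any other arguments are omitted, so the command is only
# printed here, not executed.
echo "enroll harvest --remote-host web1 --remote-ssh-config ${cfg}"
```

Keeping the port and user in the ssh config rather than on the command line is the case the changelog entry targets: `--remote-host` still has to be given, but it only needs to match the `Host` alias.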
# 0.4.1
* Add interactive output when 'enroll diff --enforce' is invoking Ansible.
# 0.4.0
* Introduce `enroll validate` - a tool to validate a harvest against the state schema, or check for missing or orphaned obsolete artifacts in a harvest.
* Attempt to generate Jinja2 templates of systemd unit files and Postfix main.cf (now that JinjaTurtle supports it)
* Update pynacl dependency to resolve CVE-2025-69277
* Add `--exclude-path` to `enroll diff` command, so that you can ignore certain churn from the diff (stuff you still wanted to harvest as a baseline but don't care if it changes day to day)
* Add `--ignore-package-versions` to `enroll diff` command, to optionally ignore package upgrades (e.g due to patching) from the diff.
* Add tags to the playbook for each role, to allow easier targeting of specific roles during play later.
* Add `--enforce` mode to `enroll diff`. If there is diff detected between the two harvests, and it can enforce restoring the state from the older harvest, it will manifest the state and apply it with ansible. Only the specific roles that had diffed will be applied (via the new tags capability)
# 0.3.0
* Introduce `enroll explain` - a tool to analyze and explain what's in (or not in) a harvest and why.
* Centralise the cron and logrotate stuff into their respective roles, we had a bit of duplication between roles based on harvest discovery.
* Capture other files in the user's home directory such as `.bashrc`, `.bash_aliases`, `.profile`, if these files differ from the `/etc/skel` defaults
* Ignore files that end with a tilde or - (probably backup files generated by editors or shadow file changes)
* Manage certain symlinks e.g for apache2/nginx sites-enabled and so on
# 0.2.3
* Introduce --ask-become-pass or -K to support password-required sudo on remote hosts, just like Ansible. It will also fall back to this prompt if a password is required but the arg wasn't passed in.
# 0.2.2
* Fix stat() of parent directory so that we set directory perms correct on --include paths.
* Set pty for remote calls when sudo is required, to help systems with limits on sudo without pty
# 0.2.1
* Don't accidentally add `extra_paths` role to `usr_local_custom` list, resulting in `extra_paths` appearing twice in manifested playbook
* Ensure directories in the tree of anything included with --include are defined in the state and manifest so we make dirs before we try to create files
# 0.2.0
* Add version CLI arg
* Add ability to enroll RH-style systems (DNF5/DNF/RPM)
* Refactor harvest state to track package versions
# 0.1.7
* Fix an attribution bug for certain files ending up in the wrong package/role.
# 0.1.6
* DRY up some code logic
* More test coverage
# 0.1.5
* Consolidate logrotate and cron files into their main service/package roles if they exist.
* Standardise on `MAX_FILES_CAP` in one place
* Manage apt stuff in its own role, not in `etc_custom`
# 0.1.4
* Attempt to capture more stuff from /etc that might not be attributable to a specific package. This includes common singletons and systemd timers
* Avoid duplicate apt data in package-specific roles.
# 0.1.3
* Allow the user to add extra paths to harvest, or paths to ignore, using `--exclude-path` and `--include-path`
arguments.
* Add support for an enroll.ini config file to store arguments per subcommand, to avoid having to remember
them all for repetitive executions.
# 0.1.2
* Include files from `/usr/local/bin` and `/usr/local/etc` in harvest (assuming they aren't binaries or
symlinks) and store in `usr_local_custom` role, similar to `etc_custom`.
# 0.1.1
* Add `diff` subcommand which can compare two harvests and send email or webhook notifications in different
formats.
# 0.1.0
* Add remote mode for harvesting a remote machine via a local workstation (no need to install enroll remotely)
@@ -39,7 +132,7 @@
* Merge pkg_ and roles created based on file/service detection
* Avoid idempotency issue with users (`password_lock`)
* Rename subcommands/args ('export' is now 'enroll', '--bundle' is now '--harvest')
* Rename subcommands/args ('export' is now 'enroll', '--bundle' is now '--harvest')
* Don't try and start systemd services that were Inactive at harvest time
* Capture miscellaneous files in /etc under their own `etc_custom` role, but not backup files
* Add tests

5
CONTRIBUTORS.md Normal file
@@ -0,0 +1,5 @@
## Contributors
mig5 would like to thank the following people for their contributions to Enroll.
* [slhck](https://slhck.info/)

@@ -26,6 +26,7 @@ RUN set -eux; \
     python3-poetry-core \
     python3-yaml \
     python3-paramiko \
+    python3-jsonschema \
     rsync \
     ca-certificates \
     ; \

88
Dockerfile.rpmbuild Normal file
@@ -0,0 +1,88 @@
# syntax=docker/dockerfile:1
ARG BASE_IMAGE=fedora:42
FROM ${BASE_IMAGE}
RUN set -eux; \
    dnf -y update; \
    dnf -y install \
        rpm-build \
        rpmdevtools \
        redhat-rpm-config \
        gcc \
        make \
        findutils \
        tar \
        gzip \
        rsync \
        python3 \
        python3-devel \
        python3-setuptools \
        python3-wheel \
        pyproject-rpm-macros \
        python3-rpm-macros \
        python3-yaml \
        python3-paramiko \
        python3-jsonschema \
        openssl-devel \
        python3-poetry-core ; \
    dnf -y clean all
# Build runner script (copies repo, tars, runs rpmbuild)
RUN set -eux; cat > /usr/local/bin/build-rpm <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
SRC="${SRC:-/src}"
WORKROOT="${WORKROOT:-/work}"
OUT="${OUT:-/out}"
VERSION_ID="$(grep VERSION_ID /etc/os-release | cut -d= -f2)"
echo "Version ID is ${VERSION_ID}"
mkdir -p "${WORKROOT}" "${OUT}"
WORK="${WORKROOT}/src"
rm -rf "${WORK}"
mkdir -p "${WORK}"
rsync -a --delete \
  --exclude '.git' \
  --exclude '.venv' \
  --exclude 'dist' \
  --exclude 'build' \
  --exclude '__pycache__' \
  --exclude '.pytest_cache' \
  --exclude '.mypy_cache' \
  "${SRC}/" "${WORK}/"
cd "${WORK}"
# Determine version from pyproject.toml unless provided
if [ -n "${VERSION:-}" ]; then
  ver="${VERSION}"
else
  ver="$(grep -m1 '^version = ' pyproject.toml | sed -E 's/version = "([^"]+)".*/\1/')"
fi
TOPDIR="${WORKROOT}/rpmbuild"
mkdir -p "${TOPDIR}"/{BUILD,BUILDROOT,RPMS,SOURCES,SPECS,SRPMS}
tarball="${TOPDIR}/SOURCES/enroll-${ver}.tar.gz"
tar -czf "${tarball}" --transform "s#^#enroll/#" .
spec_src="rpm/enroll.spec"
cp -v "${spec_src}" "${TOPDIR}/SPECS/enroll.spec"
rpmbuild -ba "${TOPDIR}/SPECS/enroll.spec" \
  --define "_topdir ${TOPDIR}" \
  --define "upstream_version ${ver}"
shopt -s nullglob
cp -v "${TOPDIR}"/RPMS/*/*.rpm "${OUT}/" || true
cp -v "${TOPDIR}"/SRPMS/*.src.rpm "${OUT}/" || true
echo "Artifacts copied to ${OUT}"
EOF
RUN chmod +x /usr/local/bin/build-rpm
WORKDIR /work
ENTRYPOINT ["/usr/local/bin/build-rpm"]
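
The build script above derives the package version from `pyproject.toml` unless `VERSION` is set. The sketch below isolates that `grep`/`sed` pipeline so it can be tried on its own. The function name `extract_version` and the sample `pyproject.toml` contents are hypothetical; the pipeline itself is copied from the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper around the pipeline used in build-rpm:
# take the first line starting with 'version = ' and strip the quotes.
extract_version() {
  grep -m1 '^version = ' "$1" | sed -E 's/version = "([^"]+)".*/\1/'
}

# Sample file for illustration only; the real build reads the repo's pyproject.toml.
sample="$(mktemp)"
cat > "${sample}" <<'EOF'
[tool.poetry]
name = "enroll"
version = "0.5.0"
description = "sample"
EOF

# Mirror the script's logic: an explicit VERSION wins over the file.
if [ -n "${VERSION:-}" ]; then
  ver="${VERSION}"
else
  ver="$(extract_version "${sample}")"
fi
echo "${ver}"
```

Because the pattern is anchored with `^version = `, it only matches an unindented `version` key written with exactly that spacing. A differently formatted file would need the `VERSION` override instead.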

696
README.md
@@ -4,217 +4,304 @@
<img src="https://git.mig5.net/mig5/enroll/raw/branch/main/enroll.svg" alt="Enroll logo" width="240" />
</div>
**enroll** inspects a Linux machine (currently Debian-only) and generates Ansible roles for things it finds running on the machine.
**enroll** inspects a Linux machine (Debian-like or RedHat-like) and generates Ansible roles/playbooks (and optionally inventory) for what it finds.
It aims to be **optimistic and noninteractive**:
- Detects packages that have been installed
- Detects Debian package ownership of `/etc` files using dpkg's local database.
- Captures config that has **changed from packaged defaults** (dpkg conffile hashes + package md5sums when available).
- Detects packages that have been installed.
- Detects package ownership of `/etc` files where possible
- Captures config that has **changed from packaged defaults** where possible (e.g dpkg conffile hashes + package md5sums when available).
- Also captures **service-relevant custom/unowned files** under `/etc/<service>/...` (e.g. drop-in config includes).
- Defensively excludes likely secrets (path denylist + content sniff + size caps).
- Captures non-system users that exist on the system, and their SSH public keys
- Captures miscellaneous `/etc` files that it can't attribute to a package, and installs it in an `etc_custom` role
- Avoids trying to start systemd services that were detected as being Inactive during harvest
- Captures non-system users and their SSH public keys and any .bashrc or .bash_aliases or .profile files that deviate from the skel defaults.
- Captures miscellaneous `/etc` files it can't attribute to a package and installs them in an `etc_custom` role.
- Captures symlinks in common applications that rely on them, e.g apache2/nginx 'sites-enabled'
- Ditto for /usr/local/bin (for non-binary files) and /usr/local/etc
- Avoids trying to start systemd services that were detected as inactive during harvest.
---
# Two modes: single-site vs multi-site (`--fqdn`)
## Mental model
**enroll** has two distinct ways to generate Ansible:
`enroll` works in two phases:
## 1) Single-site mode (default: *no* `--fqdn`)
Use this when you're enrolling **one server** (or you're generating a "golden" role set you intend to reuse).
1) **Harvest**: collect host facts + relevant files into a harvest bundle (`state.json` + harvested artifacts)
2) **Manifest**: turn that harvest into Ansible roles/playbooks (and optionally inventory)
**What you get**
- Config, templates, and defaults are primarily **contained inside each role**.
- Raw config files (when not templated) live in the role's `files/`.
- Template variables (when templated) live in the role's `defaults/main.yml`.
Additionally, some other functionalities exist:
**Pros**
- Roles are more **self-contained** and easier to understand.
- Better starting point for **provisioning new servers**, because the role contains most of what it needs.
- Less inventory abstraction/duplication.
- **Diff**: compare two harvests and report what changed (packages/services/users/files) since the previous snapshot.
- **Single-shot mode**: run both harvest and manifest at once.
**Cons**
- Less convenient for quickly enrolling multiple hosts with divergent configs (you'll do more manual work to make roles flexible across hosts).
---
## 2) Multi-site mode (`--fqdn`)
Use this when you want to enroll **several existing servers** quickly, especially if they differ.
## Output modes: single-site vs multi-site (`--fqdn`)
**What you get**
- Roles are **shared** across hosts, but host-specific data lives in inventory.
- Host inventory drives what's managed:
- which files to deploy for that host
- which packages are relevant for that host
- which services should be enabled/started for that host
- For non-templated config, raw files live in host-specific inventory under `.files/` (per role).
`enroll manifest` (and `enroll single-shot`) support two distinct output styles.
**Pros**
- Fastest way to retrofit **multiple servers** into config management.
- Avoids shared-role "host A breaks host B" problems by keeping host-specific state in inventory.
- Better fit when you already have a fleet and want to capture/reflect reality first.
### Single-site mode (default: *no* `--fqdn`)
Use when enrolling **one server** (or generating a “golden” role set you intend to reuse).
**Cons**
- More abstraction: roles become more "data-driven".
- Potential duplication: raw files may exist per-host in inventory (even if identical).
- Harder to use the roles to **provision a brand-new server** without also building an inventory for that new host, because multi-site output assumes the server already exists and is being retrofitted.
**Characteristics**
- Roles are more self-contained.
- Raw config files live in the role's `files/`.
- Template variables live in the role's `defaults/main.yml`.
### Multi-site mode (`--fqdn`)
Use when enrolling **several existing servers** quickly, especially if they differ.
**Characteristics**
- Roles are shared, host-specific state lives in inventory.
- Host inventory drives what gets managed (files/packages/services).
- Non-templated raw files live per-host under `inventory/host_vars/<fqdn>/<role>/.files/...`.
**Rule of thumb**
- If your goal is *"make this one server reproducible / provisionable"* → start with **single-site**.
- If your goal is *"get several already-running servers under management quickly"* → use **multi-site**.
- “Make this one server reproducible/provisionable” → start with **single-site**
- “Get multiple already-running servers under management quickly” → use **multi-site**
---
# Key concepts
## Subcommands
## Harvest
### `enroll harvest`
Harvest state about a host and write a harvest bundle.
**enroll** begins by 'harvesting' known state about your host. This includes detecting what running services exist, what packages have been installed 'manually' (that is, stuff that doesn't come out of the box with the OS), and anything 'custom' in `/etc` that it can't attribute to a specific package.
**What it captures (high level)**
- Detected services + service-relevant packages
- “Manual” packages
- Changed-from-default config (plus related custom/unowned files under service dirs)
- Non-system users + SSH public keys
- Misc `/etc` that can't be attributed to a package (`etc_custom` role)
- Optional user-specified extra files/dirs via `--include-path` (emitted as an `extra_paths` role at manifest time)
It also detects if any config files have been *changed* from their packaged defaults. If they have, it will attempt to 'harvest' them. If the config file is identical to how it comes with the package, then it doesn't bother harvesting it, because there's little value in config-managing it if it's identical to what you get by simply installing the package!
**Common flags**
- Remote harvesting:
- `--remote-host`, `--remote-user`, `--remote-port`, `--remote-ssh-config`
- `--no-sudo` (if you don't want/need sudo)
- Sensitive-data behaviour:
- default: tries to avoid likely secrets
- `--dangerous`: disables secret-safety checks (see “Sensitive data” below)
- Encrypt bundles at rest:
- `--sops <FINGERPRINT...>`: writes a single encrypted `harvest.tar.gz.sops` instead of a plaintext directory
- Path selection (include/exclude):
- `--include-path <PATTERN>` (repeatable): add extra files/dirs to harvest (even from locations normally ignored, like `/home`). Still subject to secret-safety checks unless `--dangerous`.
- `--exclude-path <PATTERN>` (repeatable): skip files/dirs even if they would normally be harvested.
- Pattern syntax:
- plain path: matches that file; directories match the directory + everything under it
- glob (default): supports `*` and `**` (prefix with `glob:` to force)
- regex: prefix with `re:` or `regex:`
- Precedence: excludes win over includes.
* Using remote mode where authentication requires secrets:
* sudo password:
* `--ask-become-pass` (or `-K`) prompts for the sudo password.
* If you forget, and remote sudo requires a password, Enroll will still fall back to prompting in interactive mode (slightly slower due to retry).
* SSH private-key passphrase:
* `--ask-key-passphrase` prompts for the SSH key passphrase.
* `--ssh-key-passphrase-env ENV_VAR` reads the SSH key passphrase from an environment variable (useful for CI/non-interactive runs).
* If neither is provided, and Enroll detects an encrypted key in an interactive session, it will still fall back to prompting on-demand.
* In non-interactive sessions, pass `--ask-key-passphrase` or `--ssh-key-passphrase-env ENV_VAR` when using encrypted private keys.
* Note: `--ask-key-passphrase` and `--ssh-key-passphrase-env` are mutually exclusive.
The harvest writes a state.json file explaining all the data it harvested and, if it chose not to harvest something, explanations as to why that is the case (see below: sensitive data).
### Remote harvesting (workstation → remote)
If you'd prefer not to install **enroll** on the target host, you can run the harvest over SSH from your workstation and pull the harvest bundle back locally:
Examples (encrypted SSH key)
```bash
enroll harvest --remote-host myhost.example.com --remote-user myuser --out /tmp/enroll-harvest
# Interactive
enroll harvest --remote-host myhost.example.com --remote-user myuser --ask-key-passphrase --out /tmp/enroll-harvest
# Non-interactive / CI
export ENROLL_SSH_KEY_PASSPHRASE='correct horse battery staple'
enroll single-shot --remote-host myhost.example.com --remote-user myuser --ssh-key-passphrase-env ENROLL_SSH_KEY_PASSPHRASE --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn myhost.example.com
```
- `--remote-port` defaults to `22`
- `--remote-user` defaults to your local `$USER`
---
This uploads a self-contained `enroll` zipapp to a temporary directory on the remote host, runs `harvest` there, then downloads the resulting harvest bundle to the `--out` directory on your workstation.
### `enroll manifest`
Generate Ansible output from an existing harvest bundle.
**Privilege note:** A "full" harvest typically needs root access. Remote harvesting assumes the remote user can run `sudo` **without a password prompt** (NOPASSWD) so the harvest can run non-interactively. If you don't want this, pass `--no-sudo` as well.
**Inputs**
- `--harvest /path/to/harvest` (directory)
or `--harvest /path/to/harvest.tar.gz.sops` (if using `--sops`)
**Output**
- In plaintext mode: an Ansible repo-like directory structure (roles/playbooks, and inventory in multi-site mode).
- In `--sops` mode: a single encrypted file `manifest.tar.gz.sops` containing the generated output.
**Common flags**
- `--fqdn <host>`: enables **multi-site** output style
**Role tags**
Generated playbooks tag each role so you can target just the parts you need:
- Tag format: `role_<role_name>` (e.g. `role_services`, `role_users`)
- Fallback/safe tag: `role_other`
Example:
```bash
ansible-playbook -i "localhost," -c local /tmp/enroll-ansible/playbook.yml --tags role_services,role_users
```
---
### `enroll single-shot`
Convenience wrapper that runs **harvest → manifest** in one command.
Use this when you want “get me something workable ASAP”.
Supports the same general flags as harvest/manifest, including `--fqdn`, remote harvest flags, and `--sops`.
---
### `enroll diff`
Compare two harvest bundles and report what changed.
**What it reports**
- Packages added/removed
- Services enabled added/removed, plus key state changes
- Users added/removed, plus field changes (uid/gid/home/shell/groups, etc.)
- Managed files added/removed/changed (metadata + content hash changes where available)
**Inputs**
- `--old <harvest>` and `--new <harvest>` (directories or `state.json` paths)
- `--sops` when comparing SOPS-encrypted harvest bundles
- `--exclude-path <PATTERN>` (repeatable) to ignore file/dir drift under matching paths (same pattern syntax as harvest)
- `--ignore-package-versions` to ignore package version-only drift (upgrades/downgrades)
- `--enforce` to apply the **old** harvest state locally (requires `ansible-playbook` on `PATH`)
**Noise suppression**
- `--exclude-path` is useful for things that change often but you still want in the harvest baseline (e.g. `/var/anacron`).
- `--ignore-package-versions` keeps routine upgrades from alerting; package add/remove drift is still reported.
**Enforcement (`--enforce`)**
If a diff exists and `ansible-playbook` is available, Enroll will:
1) generate a manifest from the **old** harvest into a temporary directory
2) run `ansible-playbook -i localhost, -c local <tmp>/playbook.yml` (often with `--tags role_<...>` to limit runtime)
3) record in the diff report that the old harvest was enforced
Enforcement is intentionally “safe”:
- reinstalls packages that were removed (`state: present`), but does **not** attempt downgrades/pinning
- restores users, files (contents + permissions/ownership), and service enable/start state
If `ansible-playbook` is not on `PATH`, Enroll returns an error and does not enforce.
**Output formats**
- `--format json` (default for webhooks)
- `--format markdown` / `--format text` (human-oriented)
**Notifications**
- Webhook:
- `--webhook <url>`
- `--webhook-format json|markdown|text`
- `--webhook-header 'Header-Name: value'` (repeatable)
- Email (optional):
- `--email-to <addr>` (plus optional SMTP/sendmail-related flags, depending on your install)
---
### `enroll explain`
Analyze a harvest and provide user-friendly explanations for what's in it and why.
This may also explain why something *wasn't* included (e.g. a binary file, a file that was too large, unreadable due to permissions, or looked like a log file/secret).
Provide either the path to the harvest or the path to its state.json. It can also handle SOPS-encrypted harvests.
Output can be provided in plaintext or json.
---
### `enroll validate`
Validates a harvest by checking:
* state.json exists and is valid JSON
* state.json validates against a JSON Schema (by default the vendored one)
* Every `managed_file` entry has a corresponding artifact at: `artifacts/<role_name>/<src_rel>`
* That there are no **unreferenced files** sitting in `artifacts/` that aren't in the state.
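The two artifact checks above can be sketched in Python roughly as follows. This is an illustration only, not enroll's actual code: `check_artifacts` is a hypothetical helper, and it takes the `(role_name, src_rel)` pairs as an argument rather than assuming how `state.json` is laid out.

```python
from pathlib import Path


def check_artifacts(bundle_dir, referenced):
    """Compare referenced artifacts against what is actually on disk.

    `referenced` is an iterable of (role_name, src_rel) pairs, however the
    caller extracted them from state.json (hypothetical input shape).
    Returns (missing, unreferenced): sorted lists of paths relative to the
    bundle's artifacts/ directory.
    """
    root = Path(bundle_dir) / "artifacts"
    # Every referenced entry should exist at artifacts/<role_name>/<src_rel>
    expected = {root / role / src_rel for role, src_rel in referenced}
    # Every regular file under artifacts/ should be referenced by the state
    present = {p for p in root.rglob("*") if p.is_file()} if root.is_dir() else set()
    missing = sorted(p.relative_to(root).as_posix() for p in expected - present)
    unreferenced = sorted(p.relative_to(root).as_posix() for p in present - expected)
    return missing, unreferenced
```

A non-empty `missing` list corresponds to the third check, and a non-empty `unreferenced` list to the fourth.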
#### Schema location + overrides
The master schema lives at: `enroll/schema/state.schema.json`.
You can override with a local file or URL:
```
enroll validate /path/to/harvest --schema ./state.schema.json
enroll validate /path/to/harvest --schema https://enroll.sh/schema/state.schema.json
```
Or skip schema checks (still does artifact consistency checks):
```
enroll validate /path/to/harvest --no-schema
```
#### CLI usage examples
Validate a local harvest:
```
enroll validate ./harvest
```
Validate a harvest tarball or a sops bundle:
```
enroll validate ./harvest.tar.gz
enroll validate ./harvest.sops --sops
```
JSON output + write to file:
```
enroll validate ./harvest --format json --out validate.json
```
Return exit code 1 for any warnings, not just errors (useful for CI):
```
enroll validate ./harvest --fail-on-warnings
```
---
## Sensitive data
**enroll** doesn't make any assumptions about how you might handle sensitive data from your config files, in Ansible. Some people might use SOPS, others might use Vault, others might do something else entirely.
By default, `enroll` does **not** assume how you handle secrets in Ansible. It will attempt to avoid harvesting likely sensitive data (private keys, passwords, tokens, etc.). This can mean it skips some config files you may ultimately want to manage.
For this reason, **enroll** will attempt to read config files, and if it detects data that looks like a sensitive SSH/SSL private key, or password, or API key, etc, then it won't harvest it for config management.
If you opt in to collecting everything:
This inevitably means that it will deliberately miss some important config files that you probably *want* to manage in Ansible.
### `--dangerous`
**WARNING:** disables “likely secret” safety checks. This can copy private keys, TLS key material, API tokens, database passwords, and other credentials into the harvest output **in plaintext**.
Nonetheless, in the Harvest 'state' file, there should be an explanation of 'excluded files'. You can parse or inspect this file to find what it chose to ignore, and then you know what you might want to augment the results with later, once you 'manifest' the harvest into Ansible configuration.
If you intend to keep harvests/manifests long-term (especially in git), strongly consider encrypting them at rest.
Nonetheless, in some cases it may be appropriate to truly grab as much as you can, including secrets. For that, read on for the `--dangerous` flag.
### Encrypt bundles at rest with `--sops`
`--sops` encrypts the harvest and/or manifest outputs into a single `.tar.gz.sops` file (GPG). This is for **storage-at-rest**, not for direct “Ansible SOPS inventory” workflows.
### Opting in to fetching sensitive data: `--dangerous`
**WARNING:** `--dangerous` disables enroll's "likely a secret" safety checks. This can cause private keys, TLS key material, API tokens, database passwords, and other credentials to be copied into your harvest output **in plaintext**.
Only use `--dangerous` if you explicitly want to scoop up sensitive files and you understand where the harvest output is stored, who can read it, and how it will be handled (backups, git commits, etc, as well as risk of using `--out` with a shared `/tmp` location where other users could see the data). We offer no liability if your sensitive data is compromised through the use of this tool!
**Strong recommendation:** If you plan to keep harvested files long-term (especially in git), encrypt secrets at rest. A common approach is to use **SOPS** and then use the **community.sops** Ansible collection to load/decrypt encrypted content during deploy.
Install the collection:
```bash
ansible-galaxy collection install community.sops
```
Then you can use the collection's lookup/vars plugins or modules to decrypt or load SOPS-encrypted vars at runtime.
Note the section below **also** talks about SOPS, but this is in the context of simply encrypting the data generated by `enroll` at rest for safe-keeping, **not** for direct integration with Ansible.
### Encrypting harvest/manifests at rest with `--sops`
If you want to use `--dangerous` (or you simply want to keep the harvested artifacts private when they're sitting on disk, in git, etc), you can pass `--sops` to `harvest`, `manifest`, or `single-shot`.
To use `--sops`, you will need to have [sops](https://github.com/getsops/sops) installed on your `$PATH`.
- `--sops` expects one or more **GPG key fingerprints**. If `sops` is not on the `$PATH`, **enroll** will error.
- `harvest --sops ...` writes a *single* encrypted file (`harvest.tar.gz.sops`) instead of a plaintext directory.
- `manifest --sops ...` (and `single-shot --sops ...`) will:
- decrypt the harvest bundle with `sops -d` (if the `--harvest` input is an encrypted file), then generate manifests as normal
- bundle the entire generated Ansible output into a *single* encrypted file (`manifest.tar.gz.sops`)
⚠️ **Important:** `manifest --sops` (and `single-shot --sops`) produces **one encrypted file**. It is **not** an Ansible repo you can point `ansible-playbook` at directly. It is **not** the same as using SOPS inventory with the Ansible SOPS collection.
To use the encrypted SOPS manifest, decrypt and extract it first, then run Ansible from inside the extracted `manifest/` directory:
```bash
sops -d /path/to/manifest.tar.gz.sops | tar -xzvf -
cd manifest
ansible-playbook ...
```
Example:
```bash
# Harvest (encrypted-at-rest)
enroll harvest --out /tmp/enroll-harvest --dangerous --sops <FINGERPRINT(s)>
# Manifest (encrypted-at-rest)
enroll manifest --harvest /tmp/enroll-harvest/harvest.tar.gz.sops --out /tmp/enroll-ansible --sops <FINGERPRINT(s)>
# Decrypt/extract manifest output for inspection / ansible runs
cd /tmp/enroll-ansible
sops -d manifest.tar.gz.sops | tar -xzvf -
cd manifest
```
(If you want to manually inspect an encrypted harvest bundle, extract it into its own directory, e.g. `mkdir -p harvest && sops -d harvest.tar.gz.sops | tar -xzvf - -C harvest`.)
## Manifest
The 'manifest' subcommand expects to be given a path to the 'harvest' obtained in the first step. It will then attempt to generate Ansible roles and playbooks (and potentially 'inventory') from that harvest.
Manifesting is the most complex step because a lot of people will have opinions on how Ansible roles and inventory should work. No solution is perfect for everyone. However, **enroll** tries to strike a reasonable balance.
Remember, the purpose of this tool is to save **time** getting your systems into a decently-managed state. It's still up to you to wrangle it into a form that works for you on an ongoing basis.
⚠️ Important: `manifest --sops` produces one encrypted file. You must decrypt + extract it before running `ansible-playbook`.
---
# Single-shot mode for the impatient sysadmin
## JinjaTurtle integration (both modes)
**enroll** has a 'single-shot' subcommand which combines the two other phases (harvest and manifest) into one. Use it to generate both the harvest and then manifest ansible from that harvest all in one go. Perfect if you're in a hurry!
If [JinjaTurtle](https://git.mig5.net/mig5/jinjaturtle) is installed, `enroll` can generate Jinja2 templates for ini/json/xml/toml-style config.
---
# JinjaTurtle integration (both modes)
If you also have my other tool [JinjaTurtle](https://git.mig5.net/mig5/jinjaturtle) installed, **enroll** will attempt to create Jinja2 templates for any ini/json/xml/toml style configuration that it finds.
- Templates live in the **role** (`roles/<role>/templates/...`)
- Templates live in `roles/<role>/templates/...`
- Variables live in:
- **single-site**: `roles/<role>/defaults/main.yml`
- **multi-site** (`--fqdn`): `inventory/host_vars/<fqdn>/<role>.yml`
- single-site: `roles/<role>/defaults/main.yml`
- multi-site: `inventory/host_vars/<fqdn>/<role>.yml`
JinjaTurtle will be used automatically if it is detected on the `$PATH`. You can also be explicit and pass `--jinjaturtle`, but this will throw an error if JinjaTurtle is not on the `$PATH`.
If you *do* have JinjaTurtle installed, but *don't* wish to make use of it, you can use `--no-jinjaturtle`, in which case all config files will be kept as 'raw' files.
You can force it on with `--jinjaturtle` or disable with `--no-jinjaturtle`.
---
# How multi-site avoids "shared role breaks a host"
## How multi-site avoids “shared role breaks a host”
In multi-site mode, **roles are data-driven**. The role contains generic tasks like:
- "deploy all files listed for this host"
- "install packages listed for this host"
- "apply systemd enable/start state listed for this host"
The host inventory is what decides which files/packages/services apply to that host. This prevents the classic failure mode where host2 adds a config file to a shared role and host1 then fails trying to deploy a file it never had.
Raw non-templated files are stored under:
- `inventory/host_vars/<fqdn>/<role>/.files/...`
…and the host's role variables describe which of those files should be deployed.
In multi-site mode, roles are **data-driven**. The role tasks are generic (“deploy the files listed for this host”, “install the packages listed for this host”, “apply systemd enable/start state listed for this host”). Host inventory decides what applies per-host, avoiding the classic “host2 adds config, host1 breaks” failure mode.
---
# Install
## Ubuntu/Debian apt repository
```bash
sudo mkdir -p /usr/share/keyrings
curl -fsSL https://mig5.net/static/mig5.asc | sudo gpg --dearmor -o /usr/share/keyrings/mig5.gpg
@ -223,26 +310,39 @@ sudo apt update
sudo apt install enroll
```
## AppImage
## Fedora
Download the AppImage file from the Releases page (verify with GPG if you wish, my fingerprint is [here](https://mig5.net/static/mig5.asc)),
then make it executable and run it:
```bash
sudo rpm --import https://mig5.net/static/mig5.asc
sudo tee /etc/yum.repos.d/mig5.repo > /dev/null << 'EOF'
[mig5]
name=mig5 Repository
baseurl=https://rpm.mig5.net/$releasever/rpm/$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mig5.net/static/mig5.asc
EOF
sudo dnf upgrade --refresh
sudo dnf install enroll
```
## AppImage
Download it from my Releases page, then:
```bash
chmod +x Enroll.AppImage
./Enroll.AppImage
```
### Pip/PipX
## Pip/PipX
```bash
pip install enroll
```
### Poetry
Clone this repository with git, then:
## Poetry (dev)
```bash
poetry install
poetry run enroll --help
@ -250,111 +350,279 @@ poetry run enroll --help
---
# Usage
## Found a bug / have a suggestion?
## 1. Harvest state/information about the host
My Forgejo doesn't currently support federation, so I haven't opened registration/login for issues.
On the host (root recommended to harvest as much data as possible):
Instead, email me (see `pyproject.toml`) or contact me on the Fediverse:
https://goto.mig5.net/@mig5
---
# Examples
## Harvest
### Local harvest
```bash
enroll harvest --out /tmp/enroll-harvest
```
### Remote harvest over SSH (no enroll install required on the remote host, no need for --out)
### Remote harvest over SSH
```bash
enroll harvest --remote-host myhost.example.com --remote-user myuser
enroll harvest --remote-host myhost.example.com --remote-user myuser --out /tmp/enroll-harvest
```
### `--dangerous` (captures potentially sensitive files — read the warning above)
### Remote harvest over SSH, where the SSH configuration is in ~/.ssh/config (e.g. a different SSH key)
Note: you must still pass `--remote-host`, but in this case, its value can be the 'Host' alias of an entry in your `~/.ssh/config`.
```bash
enroll harvest --remote-host myhostalias --remote-ssh-config ~/.ssh/config --out /tmp/enroll-harvest
```
### Include paths (`--include-path`)
```bash
# Add a few dotfiles from /home (still secret-safe unless --dangerous)
enroll harvest --out /tmp/enroll-harvest --include-path '/home/*/.bashrc' --include-path '/home/*/.profile'
```
### Exclude paths (`--exclude-path`)
```bash
# Skip specific /usr/local/bin entries (or patterns)
enroll harvest --out /tmp/enroll-harvest --exclude-path '/usr/local/bin/docker-*' --exclude-path '/usr/local/bin/some-tool'
```
### Regex include
```bash
enroll harvest --out /tmp/enroll-harvest --include-path 're:^/home/[^/]+/\.config/myapp/.*$'
```
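For intuition about how the three pattern styles and the "excludes win over includes" rule fit together, here is a minimal Python sketch. It is not enroll's implementation: `matches` and `selected` are hypothetical names, and `fnmatch` is used as a stand-in glob engine (its `*` also crosses `/`, so it does not distinguish `*` from `**` the way enroll may).

```python
import fnmatch
import re


def matches(pattern, path):
    """Return True if `path` matches one pattern (plain path, glob, or regex)."""
    if pattern.startswith(("re:", "regex:")):
        # Regex patterns: everything after the first ":" is the expression
        return re.search(pattern.split(":", 1)[1], path) is not None
    force_glob = pattern.startswith("glob:")
    if force_glob:
        pattern = pattern[len("glob:"):]
    if force_glob or any(ch in pattern for ch in "*?["):
        # Simplification: fnmatch's "*" matches "/" as well
        return fnmatch.fnmatchcase(path, pattern)
    # Plain path: the path itself, or anything underneath it if it is a directory
    base = pattern.rstrip("/")
    return path == base or path.startswith(base + "/")


def selected(path, includes, excludes):
    """Apply the precedence rule: an exclude match always beats an include match."""
    if any(matches(p, path) for p in excludes):
        return False
    return any(matches(p, path) for p in includes)
```

For example, `selected("/usr/local/bin/docker-compose", ["/usr/local/bin"], ["/usr/local/bin/docker-*"])` is false under this sketch, because the exclude pattern matches even though the include does too.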
### `--dangerous`
```bash
enroll harvest --out /tmp/enroll-harvest --dangerous
```
Remote + dangerous:
### Remote + dangerous:
```bash
enroll harvest --remote-host myhost.example.com --remote-user myuser --dangerous
```
### `--sops` (encrypt bundles at rest)
`--sops` bundles and encrypts the output as a single SOPS-encrypted `.tar.gz.sops` file (GPG). This is particularly useful if you're using `--dangerous`.
### `--sops` (encrypt at rest)
```bash
# Encrypted harvest bundle (writes /tmp/enroll-harvest/harvest.tar.gz.sops)
enroll harvest --out /tmp/enroll-harvest --dangerous --sops <FINGERPRINT(s)>
```
# Encrypted manifest bundle (writes /tmp/enroll-ansible/manifest.tar.gz.sops)
---
## Manifest
### Single-site (default: no --fqdn)
```bash
enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible
```
### Multi-site (--fqdn)
```bash
enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
```
### Manifest with `--sops`
```bash
# Generate encrypted manifest bundle (writes /tmp/enroll-ansible/manifest.tar.gz.sops)
enroll manifest --harvest /tmp/enroll-harvest/harvest.tar.gz.sops --out /tmp/enroll-ansible --sops <FINGERPRINT(s)>
# Decrypt/extract the manifest bundle, then run Ansible from inside ./manifest/
cd /tmp/enroll-ansible
sops -d manifest.tar.gz.sops | tar -xzvf -
cd manifest
ansible-playbook ./playbook.yml
```
## 2. Generate Ansible manifests (roles/playbook) from that harvest
### Single-site (default: no --fqdn)
Good for one server, or for producing roles you want to reuse to provision new machines:
```bash
enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible
```
### Multi-site (--fqdn)
Best when enrolling multiple already-running servers into one repo:
```bash
enroll manifest --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
```
---
## Single-shot
Alternatively, do both steps in one shot:
```bash
enroll single-shot --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "$(hostname -f)"
```
Remote single-shot (run harvest over SSH, then manifest locally):
Remote single-shot (run harvest over SSH, then manifest locally):
```bash
enroll single-shot --remote-host myhost.example.com --remote-user myuser --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "myhost.example.com"
enroll single-shot --remote-host myhost.example.com --remote-user myuser --harvest /tmp/enroll-harvest --out /tmp/enroll-ansible --fqdn "myhost.example.com"
```
In multi-site mode (`--fqdn`), you can run single-shot repeatedly against multiple hosts while reusing the same `--out` directory so each host merges into the existing Ansible repo.
---
## Diff
## 3. Run Ansible
### Compare two harvest directories, output in json
```bash
enroll diff --old /path/to/harvestA --new /path/to/harvestB --format json
```
### Diff + webhook notify
```bash
enroll diff --old /path/to/golden/harvest --new /path/to/new/harvest --webhook https://nr.mig5.net/forms/webhooks/xxxx --webhook-format json --webhook-header 'X-Enroll-Secret: xxxx'
```
`diff` mode also supports email notifications and text or markdown output, as well as an `--exit-code` mode that triggers a return code of 2 (useful for cron jobs or CI).
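A cron or CI wrapper around `--exit-code` might look something like the Python sketch below. The flags are the ones shown in this README; the helper name `check_drift`, the assumption that `0` means "no differences", and the assumption that the report is printed to stdout are mine, not confirmed behaviour.

```python
import subprocess


def check_drift(old, new, run=subprocess.run):
    """Run `enroll diff --exit-code` and classify the result.

    `run` is injectable so the function can be exercised without enroll
    installed. Assumes: 0 = no drift, 2 = drift, anything else = failure.
    """
    cmd = [
        "enroll", "diff",
        "--old", old,
        "--new", new,
        "--format", "json",
        "--exit-code",
    ]
    proc = run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return "no-drift", proc.stdout
    if proc.returncode == 2:
        return "drift", proc.stdout
    return "error", proc.stderr
```

Treating any other return code as an error keeps a broken invocation from being mistaken for a clean comparison.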
### Ignore a specific directory or file from the diff
```bash
enroll diff --old /path/to/harvestA --new /path/to/harvestB --exclude-path /var/anacron
```
### Ignore package version drift (routine upgrades) but still alert on add/remove
```bash
enroll diff --old /path/to/harvestA --new /path/to/harvestB --ignore-package-versions
```
### Enforce the old harvest state when drift is detected (requires Ansible)
```bash
enroll diff --old /path/to/harvestA --new /path/to/harvestB --enforce --ignore-package-versions --exclude-path /var/anacron
```
---
## Explain
### Explain a harvest
All of these do the same thing:
```bash
enroll explain /path/to/state.json
enroll explain /path/to/bundle_dir
enroll explain /path/to/harvest.tar.gz
```
### Explain a SOPS-encrypted harvest
```bash
enroll explain /path/to/harvest.tar.gz.sops --sops
```
### Explain with JSON output and more examples
```bash
enroll explain /path/to/state.json --format json --max-examples 25
```
### Example output
```
enroll explain /tmp/syrah.harvest
Enroll explain: /tmp/syrah.harvest
Host: syrah.mig5.net (os: debian, pkg: dpkg)
Enroll: 0.2.3
Inventory
- Packages: 254
- Why packages were included (observed_via):
- user_installed: 248 Package appears explicitly installed (as opposed to only pulled in as a dependency).
- package_role: 232 Package was referenced by an enroll packages snapshot/role. (e.g. acl, acpid, adduser)
- systemd_unit: 22 Package is associated with a systemd unit that was harvested. (e.g. postfix.service, tor.service, apparmor.service)
Roles collected
- users: 1 user(s), 1 file(s), 0 excluded
- services: 19 unit(s), 111 file(s), 6 excluded
- packages: 232 package snapshot(s), 41 file(s), 0 excluded
- apt_config: 26 file(s), 7 dir(s), 10 excluded
- dnf_config: 0 file(s), 0 dir(s), 0 excluded
- etc_custom: 70 file(s), 20 dir(s), 0 excluded
- usr_local_custom: 35 file(s), 1 dir(s), 0 excluded
- extra_paths: 0 file(s), 0 dir(s), 0 excluded
Why files were included (managed_files.reason)
- custom_unowned (179): A file not owned by any package (often custom/operator-managed).. Examples: /etc/apparmor.d/local/lsb_release, /etc/apparmor.d/local/nvidia_modprobe, /etc/apparmor.d/local/sbin.dhclient
- usr_local_bin_script (35): Executable scripts under /usr/local/bin (often operator-installed).. Examples: /usr/local/bin/check_firewall, /usr/local/bin/awslogs
- apt_keyring (13): Repository signing key material used by APT.. Examples: /etc/apt/keyrings/openvpn-repo-public.asc, /etc/apt/trusted.gpg, /etc/apt/trusted.gpg.d/deb.torproject.org-keyring.gpg
- modified_conffile (10): A package-managed conffile differs from the packaged/default version.. Examples: /etc/dnsmasq.conf, /etc/ssh/moduli, /etc/tor/torrc
- logrotate_snippet (9): logrotate snippets/configs referenced in system configuration.. Examples: /etc/logrotate.d/rsyslog, /etc/logrotate.d/tor, /etc/logrotate.d/apt
- apt_config (7): APT configuration affecting package installation and repository behavior.. Examples: /etc/apt/apt.conf.d/01autoremove, /etc/apt/apt.conf.d/20listchanges, /etc/apt/apt.conf.d/70debconf
[...]
```
---
## Run Ansible
### Single-site
You can run it however you prefer (local connection or your own inventory). Example:
```bash
ansible-playbook -i "localhost," -c local /tmp/enroll-ansible/playbook.yml
```
### Multi-site (--fqdn)
In multi-site mode, enroll generates an ansible.cfg, `host_vars` inventory, and a host-specific playbook:
```bash
ansible-playbook /tmp/enroll-ansible/playbooks/"$(hostname -f)".yml
```
---
### Run only specific roles (tags)
Generated playbooks tag each role as `role_<name>` (e.g. `role_users`, `role_services`), so you can speed up targeted runs:
```bash
ansible-playbook -i "localhost," -c local /tmp/enroll-ansible/playbook.yml --tags role_users
```
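If you drive these runs from a script, the `role_<name>` tag convention is easy to build programmatically. The following Python sketch only assembles the command line shown above; `playbook_command` is a hypothetical helper, not something enroll ships.

```python
def playbook_command(playbook, roles, inventory="localhost,", local=True):
    """Build an ansible-playbook argv limited to the given role names."""
    # Generated playbooks tag each role as role_<name>
    tags = ",".join(f"role_{name}" for name in roles)
    cmd = ["ansible-playbook", "-i", inventory]
    if local:
        cmd += ["-c", "local"]
    cmd.append(playbook)
    if tags:
        cmd += ["--tags", tags]
    return cmd
```

Passing the resulting list to `subprocess.run` avoids shell quoting issues with the trailing comma in the inline inventory.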
# Found a bug, have a suggestion?
## Configuration file
My Forgejo doesn't yet support proper federation, and for that reason I've not opened up registration/login to use the issue queue.
As can be seen above, there are a lot of powerful 'permutations' available across the subcommands.
Instead, you can e-mail me (see the pyproject.toml for details) or contact me on the Fediverse:
Sometimes, it can be easier to store them in a config file so you don't have to remember them!
https://goto.mig5.net/@mig5
Enroll supports reading an ini-style file of all the arguments for each subcommand.
### Location of the config file
The path to the config file can be specified with `-c` or `--config` on the command-line. Otherwise,
Enroll will look for `./enroll.ini`, `./.enroll.ini` (in the current working directory),
`~/.config/enroll/enroll.ini` (or `$XDG_CONFIG_HOME/enroll/enroll.ini`).
You may also pass `--no-config` if you deliberately want to ignore the config file even if one exists.
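The lookup described above could be approximated in Python as follows. This is a sketch under assumptions: the function names are hypothetical, the search order simply follows the order the locations are listed in, and `$XDG_CONFIG_HOME` is assumed to replace `~/.config` when it is set.

```python
import os
from pathlib import Path


def candidate_config_paths(cwd=None, env=None):
    """List the default config locations in the order they are searched."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    xdg = env.get("XDG_CONFIG_HOME")
    user_base = Path(xdg) if xdg else Path.home() / ".config"
    return [
        cwd / "enroll.ini",
        cwd / ".enroll.ini",
        user_base / "enroll" / "enroll.ini",
    ]


def find_config(explicit=None, no_config=False, cwd=None, env=None):
    """Resolve which config file (if any) would be used."""
    if no_config:          # --no-config: ignore any config file
        return None
    if explicit:           # -c / --config: use exactly this path
        return Path(explicit)
    for path in candidate_config_paths(cwd=cwd, env=env):
        if path.is_file():
            return path
    return None
```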
### Precedence
Highest wins:
* Explicit CLI flags
* INI config ([cmd], [enroll])
* argparse defaults
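One common way to get this layering with `argparse` and `configparser` is to feed the INI values in as parser defaults, so that explicit flags naturally override them. The sketch below illustrates the idea with two flags from this README; `resolve_args` is a hypothetical function, and letting the `[cmd]` section override `[enroll]` is my assumption about the order within the INI layer.

```python
import argparse
import configparser


def resolve_args(cmd, ini_text, argv):
    """Resolve options as: CLI flags > INI ([cmd] over [enroll]) > argparse defaults."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(ini_text)

    parser = argparse.ArgumentParser(prog=f"enroll {cmd}")
    parser.add_argument("--remote-port", type=int, default=22)
    parser.add_argument("--remote-user", default=None)

    # INI keys use underscores where the CLI flag uses hyphens
    ini_values = {}
    for section in ("enroll", cmd):
        if cfg.has_section(section):
            ini_values.update(cfg[section])

    # Only override defaults for options this parser knows about.
    # argparse converts string defaults using the argument's `type`.
    known = ("remote_port", "remote_user")
    parser.set_defaults(**{k: v for k, v in ini_values.items() if k in known})
    return parser.parse_args(argv)
```

With `remote_port = 2222` in `[harvest]`, `resolve_args("harvest", ini, [])` yields port 2222, while passing `--remote-port 2200` on the command line still wins.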
### Example config file
Here is an example.
Whenever an argument on the command-line has a 'hyphen' in it, just be sure to change it to an underscore in the ini file.
```ini
[enroll]
# (future global flags may live here)
[harvest]
dangerous = false
include_path =
/home/*/.bashrc
/home/*/.profile
exclude_path = /usr/local/bin/docker-*, /usr/local/bin/some-tool
# remote_host = yourserver.example.com
# remote_user = you
# remote_port = 2222
[manifest]
# you can set defaults here too, e.g.
no_jinjaturtle = true
sops = 54A91143AE0AB4F7743B01FE888ED1B423A3BC99
[diff]
# ignore noisy drift
exclude_path = /var/anacron
ignore_package_versions = true
# enforce = true # requires ansible-playbook on PATH
[single-shot]
# if you use single-shot, put its defaults here.
# It does not inherit those of the subsections above, so you
# may wish to repeat them here.
include_path = re:^/home/[^/]+/\.config/myapp/.*$
```
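To check how such a file is read, the sketch below parses a cut-down copy of the `[harvest]` section with Python's standard `configparser`. It is an illustration only: the `split_list` helper is hypothetical and merely approximates splitting a value on newlines or commas. Continuation lines of a multi-line value must be indented for `configparser` to attach them to the preceding key.

```python
from __future__ import annotations

import configparser

SAMPLE = """\
[harvest]
dangerous = false
include_path =
    /home/*/.bashrc
    /home/*/.profile
exclude_path = /usr/local/bin/docker-*, /usr/local/bin/some-tool
"""


def split_list(value: str) -> list[str]:
    """Split a multi-line or comma-separated value into items (illustrative helper)."""
    sep = "\n" if "\n" in value else ","
    return [part.strip() for part in value.split(sep) if part.strip()]


cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE)

harvest = cfg["harvest"]
print(harvest.getboolean("dangerous"))
print(split_list(harvest["include_path"]))
print(split_list(harvest["exclude_path"]))
```

Each item in a list value corresponds to one repetition of the matching command-line flag, e.g. two `--include-path` arguments for the two entries above.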

debian/changelog

@ -1,3 +1,132 @@
enroll (0.5.0) unstable; urgency=medium
* Add ssh config support where JinjaTurtle is used
-- Miguel Jacq <mig@mig5.net> Tue, 12 May 2026 12:00 +1000
enroll (0.4.4) unstable; urgency=medium
* Add capability to handle passphrases on encrypted SSH private keys. Prompting can be forced with `--ask-key-passphrase` or automated (e.g. for CI) with `--ssh-key-passphrase-env SOMEVAR`
-- Miguel Jacq <mig@mig5.net> Tue, 17 Feb 2026 11:00 +1100
enroll (0.4.3) unstable; urgency=medium
* Add support for AddressFamily and ConnectTimeout in the .ssh/config when using `--remote-ssh-config`.
-- Miguel Jacq <mig@mig5.net> Fri, 16 Jan 2026 11:00 +1100
enroll (0.4.2) unstable; urgency=medium
* Support `--remote-ssh-config [path-to-ssh-config]` as an argument in case extra params are required beyond `--remote-port` or `--remote-user`. Note: `--remote-host` must still be set, but it can be an 'alias' represented by the 'Host' value in the ssh config.
-- Miguel Jacq <mig@mig5.net> Tue, 13 Jan 2026 21:55:00 +1100
enroll (0.4.1) unstable; urgency=medium
* Add interactive output when 'enroll diff --enforce' is invoking Ansible.
-- Miguel Jacq <mig@mig5.net> Sun, 11 Jan 2026 10:00:00 +1100
enroll (0.4.0) unstable; urgency=medium
* Introduce `enroll validate` - a tool to validate a harvest against the state schema, or check for missing or orphaned obsolete artifacts in a harvest.
* Attempt to generate Jinja2 templates of systemd unit files and Postfix main.cf (now that JinjaTurtle supports it)
* Update pynacl dependency to resolve CVE-2025-69277
* Add `--exclude-path` to `enroll diff` command, so that you can ignore certain churn from the diff (stuff you still wanted to harvest as a baseline but don't care if it changes day to day)
* Add `--ignore-package-versions` to `enroll diff` command, to optionally ignore package upgrades (e.g due to patching) from the diff.
* Add tags to the playbook for each role, to allow easier targeting of specific roles during play later.
* Add `--enforce` mode to `enroll diff`. If a difference is detected between the two harvests and the state of the older harvest can be enforced, enroll will manifest that state and apply it with Ansible.
Only the specific roles that differ will be applied (via the new tags capability)
-- Miguel Jacq <mig@mig5.net> Sat, 10 Jan 2026 10:30:00 +1100
enroll (0.3.0) unstable; urgency=medium
* Introduce `enroll explain` - a tool to analyze and explain what's in (or not in) a harvest and why.
* Centralise the cron and logrotate stuff into their respective roles, we had a bit of duplication between roles based on harvest discovery.
* Capture other files in the user's home directory such as `.bashrc`, `.bash_aliases`, `.profile`, if these files differ from the `/etc/skel` defaults
* Ignore files that end with a tilde or - (probably backup files generated by editors or shadow file changes)
* Manage certain symlinks e.g for apache2/nginx sites-enabled and so on
-- Miguel Jacq <mig@mig5.net> Mon, 05 Jan 2026 17:00:00 +1100
enroll (0.2.3) unstable; urgency=medium
* Introduce --ask-become-pass or -K to support password-required sudo on remote hosts, just like Ansible. It will also fall back to this prompt if a password is required but the arg wasn't passed in.
-- Miguel Jacq <mig@mig5.net> Sun, 04 Jan 2026 20:38:00 +1100
enroll (0.2.2) unstable; urgency=medium
* Fix stat() of parent directory so that we set directory perms correctly on --include paths.
* Set pty for remote calls when sudo is required, to help systems with limits on sudo without pty
-- Miguel Jacq <mig@mig5.net> Sat, 03 Jan 2026 09:56:00 +1100
enroll (0.2.1) unstable; urgency=medium
* Don't accidentally add extra_paths role to usr_local_custom list, resulting in extra_paths appearing twice in manifested playbook
* Ensure directories in the tree of anything included with --include are defined in the state and manifest so we make dirs before we try to create files
-- Miguel Jacq <mig@mig5.net> Fri, 02 Jan 2026 21:30:00 +1100
enroll (0.2.0) unstable; urgency=medium
* Add version CLI arg
* Add ability to enroll RH-style systems (DNF5/DNF/RPM)
* Refactor harvest state to track package versions
-- Miguel Jacq <mig@mig5.net> Mon, 29 Dec 2025 17:30:00 +1100
enroll (0.1.7) unstable; urgency=medium
* Fix an attribution bug for certain files ending up in the wrong package/role.
-- Miguel Jacq <mig@mig5.net> Sun, 28 Dec 2025 18:30:00 +1100
enroll (0.1.6) unstable; urgency=medium
* DRY up some code logic
* More test coverage
-- Miguel Jacq <mig@mig5.net> Sun, 28 Dec 2025 15:30:00 +1100
enroll (0.1.5) unstable; urgency=medium
* Consolidate logrotate and cron files into their main service/package roles if they exist.
* Standardise on MAX_FILES_CAP in one place
* Manage apt stuff in its own role, not in etc_custom
-- Miguel Jacq <mig@mig5.net> Sun, 28 Dec 2025 10:00:00 +1100
enroll (0.1.4) unstable; urgency=medium
* Attempt to capture more stuff from /etc that might not be attributable to a specific package. This includes common singletons and systemd timers
* Avoid duplicate apt data in package-specific roles.
-- Miguel Jacq <mig@mig5.net> Sat, 27 Dec 2025 19:00:00 +1100
enroll (0.1.3) unstable; urgency=medium
* Allow the user to add extra paths to harvest, or paths to ignore, using `--exclude-path` and `--include-path`
arguments.
* Add support for an enroll.ini config file to store arguments per subcommand, to avoid having to remember
them all for repetitive executions.
-- Miguel Jacq <mig@mig5.net> Sat, 20 Dec 2025 18:24:00 +1100
enroll (0.1.2) unstable; urgency=medium
* Include files from `/usr/local/bin` and `/usr/local/etc` in harvest (assuming they aren't binaries or
symlinks) and store in `usr_local_custom` role, similar to `etc_custom`.
-- Miguel Jacq <mig@mig5.net> Thu, 18 Dec 2025 17:07:00 +1100
enroll (0.1.1) unstable; urgency=medium
* Add `diff` subcommand which can compare two harvests and send email or webhook notifications in different
formats.
-- Miguel Jacq <mig@mig5.net> Thu, 18 Dec 2025 15:00:00 +1100
enroll (0.1.0) unstable; urgency=medium
* Add remote mode for harvesting a remote machine via a local workstation (no need to install enroll remotely)
@ -12,7 +141,7 @@ enroll (0.1.0) unstable; urgency=medium
ship or manage those files.
* Don't collect files ending in `.log`
-- Miguel Jacq <mig@mig5.net> Tue, 17 Dec 2025 18:00:00 +1100
-- Miguel Jacq <mig@mig5.net> Wed, 17 Dec 2025 18:00:00 +1100
enroll (0.0.5) unstable; urgency=medium

debian/control

@ -10,12 +10,13 @@ Build-Depends:
python3-all,
python3-yaml,
python3-poetry-core,
python3-paramiko
python3-paramiko,
python3-jsonschema
Standards-Version: 4.6.2
Homepage: https://git.mig5.net/mig5/enroll
Package: enroll
Architecture: all
Depends: ${misc:Depends}, ${python3:Depends}, python3-yaml, python3-paramiko
Depends: ${misc:Depends}, ${python3:Depends}, python3-yaml, python3-paramiko, python3-jsonschema
Description: Harvest a host into Ansible roles
A tool that inspects a system and emits Ansible roles/playbooks to reproduce it.


@ -109,4 +109,3 @@
<tspan class="text-dark">en</tspan><tspan class="text-light">roll</tspan>
</text>
</svg>


@ -1,17 +1,261 @@
from __future__ import annotations
import argparse
import configparser
import json
import os
import sys
import tarfile
import tempfile
from pathlib import Path
from typing import Optional
from .cache import new_harvest_cache_dir
from .diff import (
compare_harvests,
enforce_old_harvest,
format_report,
has_enforceable_drift,
post_webhook,
send_email,
)
from .explain import explain_state
from .harvest import harvest
from .manifest import manifest
from .remote import remote_harvest
from .remote import (
remote_harvest,
RemoteSudoPasswordRequired,
RemoteSSHKeyPassphraseRequired,
)
from .sopsutil import SopsError, encrypt_file_binary
from .validate import validate_harvest
from .version import get_enroll_version
def _discover_config_path(argv: list[str]) -> Optional[Path]:
    """Return the config path to use, if any.

    Precedence:
      1) --no-config disables loading.
      2) --config PATH (or -c PATH)
      3) $ENROLL_CONFIG
      4) ./enroll.ini, ./.enroll.ini
      5) $XDG_CONFIG_HOME/enroll/enroll.ini (or ~/.config/enroll/enroll.ini)

    The config file is optional; if no file is found, returns None.
    """
    # Quick scan for explicit flags without needing to build the full parser.
    if "--no-config" in argv:
        return None

    def _value_after(flag: str) -> Optional[str]:
        try:
            i = argv.index(flag)
        except ValueError:
            return None
        if i + 1 >= len(argv):
            return None
        return argv[i + 1]

    p = _value_after("--config") or _value_after("-c")
    if p:
        return Path(p).expanduser()

    envp = os.environ.get("ENROLL_CONFIG")
    if envp:
        return Path(envp).expanduser()

    cwd = Path.cwd()
    for name in ("enroll.ini", ".enroll.ini"):
        cp = cwd / name
        if cp.exists() and cp.is_file():
            return cp

    xdg = os.environ.get("XDG_CONFIG_HOME")
    if xdg:
        base = Path(xdg).expanduser()
    else:
        base = Path.home() / ".config"
    cp = base / "enroll" / "enroll.ini"
    if cp.exists() and cp.is_file():
        return cp
    return None
def _parse_bool(s: str) -> Optional[bool]:
    v = str(s).strip().lower()
    if v in {"1", "true", "yes", "y", "on"}:
        return True
    if v in {"0", "false", "no", "n", "off"}:
        return False
    return None
def _action_lookup(p: argparse.ArgumentParser) -> dict[str, argparse.Action]:
    """Map config keys -> argparse actions for a parser.

    Accepts both dest names and long option names without leading dashes,
    normalized with '-' -> '_'.
    """
    m: dict[str, argparse.Action] = {}
    for a in p._actions:  # noqa: SLF001 (argparse internal)
        if not getattr(a, "dest", None):
            continue
        dest = str(a.dest).strip().lower()
        if dest:
            m[dest] = a
        for opt in getattr(a, "option_strings", []) or []:
            k = opt.lstrip("-").strip().lower()
            if k:
                m[k.replace("-", "_")] = a
                m[k] = a
    return m
def _choose_flag(a: argparse.Action) -> Optional[str]:
    # Prefer a long flag if available (e.g. --dangerous over -d)
    for s in getattr(a, "option_strings", []) or []:
        if s.startswith("--"):
            return s
    for s in getattr(a, "option_strings", []) or []:
        return s
    return None
def _split_list_value(v: str) -> list[str]:
    # Support comma-separated and/or multi-line lists.
    raw = str(v)
    if "\n" in raw:
        parts = [p.strip() for p in raw.splitlines()]
        return [p for p in parts if p]
    if "," in raw:
        parts = [p.strip() for p in raw.split(",")]
        return [p for p in parts if p]
    raw = raw.strip()
    return [raw] if raw else []
def _section_to_argv(
    p: argparse.ArgumentParser, cfg: configparser.ConfigParser, section: str
) -> list[str]:
    """Translate an INI section into argv tokens for this parser."""
    if not cfg.has_section(section):
        return []
    lookup = _action_lookup(p)
    out: list[str] = []
    for k, v in cfg.items(section):
        key = str(k).strip().lower().replace("-", "_")
        # Avoid recursion / confusing self-configuration.
        if key in {"config", "no_config"}:
            continue
        a = lookup.get(key)
        if not a:
            # Unknown keys are ignored (but we try to be helpful).
            print(
                f"warning: config [{section}] contains unknown option '{k}' (ignored)",
                file=sys.stderr,
            )
            continue
        flag = _choose_flag(a)
        if not flag:
            continue
        # Boolean flags
        if isinstance(a, argparse._StoreTrueAction):  # noqa: SLF001
            b = _parse_bool(v)
            if b is True:
                out.append(flag)
            continue
        if isinstance(a, argparse._StoreFalseAction):  # noqa: SLF001
            b = _parse_bool(v)
            if b is False:
                out.append(flag)
            continue
        # Repeated options
        if isinstance(a, argparse._AppendAction):  # noqa: SLF001
            for item in _split_list_value(v):
                out.extend([flag, item])
            continue
        # Count flags (rare, but easy to support)
        if isinstance(a, argparse._CountAction):  # noqa: SLF001
            b = _parse_bool(v)
            if b is True:
                out.append(flag)
            else:
                try:
                    n = int(str(v).strip())
                except ValueError:
                    n = 0
                out.extend([flag] * max(0, n))
            continue
        # Standard scalar options
        sval = str(v).strip()
        if sval:
            out.extend([flag, sval])
    return out
def _inject_config_argv(
    argv: list[str],
    *,
    cfg_path: Optional[Path],
    root_parser: argparse.ArgumentParser,
    subparsers: dict[str, argparse.ArgumentParser],
) -> list[str]:
    """Return argv with config-derived tokens inserted.

    We insert:
      - [enroll] options before the subcommand
      - [<subcommand>] options immediately after the subcommand token

    CLI flags always win because they come later in argv.
    """
    if not cfg_path:
        return argv
    cfg_path = Path(cfg_path).expanduser()
    if not (cfg_path.exists() and cfg_path.is_file()):
        return argv
    cfg = configparser.ConfigParser()
    try:
        cfg.read(cfg_path, encoding="utf-8")
    except (OSError, configparser.Error) as e:
        raise SystemExit(f"error: failed to read config file {cfg_path}: {e}")
    global_tokens = _section_to_argv(root_parser, cfg, "enroll")
    # Find the subcommand token position.
    cmd_pos: Optional[int] = None
    cmd_name: Optional[str] = None
    for i, tok in enumerate(argv):
        if tok in subparsers:
            cmd_pos = i
            cmd_name = tok
            break
    if cmd_pos is None or cmd_name is None:
        # No subcommand found (argparse will handle the error); only apply global.
        return global_tokens + argv
    cmd_tokens = _section_to_argv(subparsers[cmd_name], cfg, cmd_name)
    # Also accept section names with '_' in place of '-' (e.g. [single_shot])
    if "-" in cmd_name:
        alt = cmd_name.replace("-", "_")
        if alt != cmd_name:
            cmd_tokens += _section_to_argv(subparsers[cmd_name], cfg, alt)
    return global_tokens + argv[: cmd_pos + 1] + cmd_tokens + argv[cmd_pos + 1 :]
def _resolve_sops_out_file(out: Optional[str], *, hint: str) -> Path:
@ -89,29 +333,100 @@ def _jt_mode(args: argparse.Namespace) -> str:
return "auto"
def _add_config_args(p: argparse.ArgumentParser) -> None:
    p.add_argument(
        "-c",
        "--config",
        help=(
            "Path to an INI config file for default options. If omitted, enroll will look for "
            "./enroll.ini, ./.enroll.ini, or ~/.config/enroll/enroll.ini (or $XDG_CONFIG_HOME/enroll/enroll.ini)."
        ),
    )
    p.add_argument(
        "--no-config",
        action="store_true",
        help="Do not load any INI config file (even if one would be auto-discovered).",
    )
def _add_remote_args(p: argparse.ArgumentParser) -> None:
p.add_argument(
"--remote-host",
help="SSH host to run harvesting on (if set, harvest runs remotely and is pulled locally).",
)
p.add_argument(
"--remote-ssh-config",
nargs="?",
const=str(Path.home() / ".ssh" / "config"),
default=None,
help=(
"Use OpenSSH-style ssh_config settings for --remote-host. "
"If provided without a value, defaults to ~/.ssh/config. "
"(Applies HostName/User/Port/IdentityFile/ProxyCommand/HostKeyAlias when supported.)"
),
)
p.add_argument(
"--remote-port",
type=int,
default=22,
help="SSH port for --remote-host (default: 22).",
default=None,
help=(
"SSH port for --remote-host. If omitted, defaults to 22, or a value from ssh_config when "
"--remote-ssh-config is set."
),
)
p.add_argument(
"--remote-user",
default=os.environ.get("USER") or None,
help="SSH username for --remote-host (default: local $USER).",
default=None,
help=(
"SSH username for --remote-host. If omitted, defaults to local $USER, or a value from ssh_config when "
"--remote-ssh-config is set."
),
)
# Align terminology with Ansible: "become" == sudo.
p.add_argument(
"--ask-become-pass",
"-K",
action="store_true",
help=(
"Prompt for the remote sudo (become) password when using --remote-host "
"(similar to ansible --ask-become-pass)."
),
)
keyp = p.add_mutually_exclusive_group()
keyp.add_argument(
"--ask-key-passphrase",
action="store_true",
help=(
"Prompt for the SSH private key passphrase when using --remote-host. "
"If not set, enroll will still prompt on-demand if it detects an encrypted key in an interactive session."
),
)
keyp.add_argument(
"--ssh-key-passphrase-env",
metavar="ENV_VAR",
help=(
"Read the SSH private key passphrase from environment variable ENV_VAR "
"(useful for non-interactive runs/CI)."
),
)
def main() -> None:
ap = argparse.ArgumentParser(prog="enroll")
ap.add_argument(
"-v",
"--version",
action="version",
version=f"{get_enroll_version()}",
)
_add_config_args(ap)
sub = ap.add_subparsers(dest="cmd", required=True)
h = sub.add_parser("harvest", help="Harvest service/package/config state")
_add_config_args(h)
_add_remote_args(h)
h.add_argument(
"--out",
help=(
@ -124,6 +439,27 @@ def main() -> None:
action="store_true",
help="Collect files more aggressively (may include secrets). Disables secret-avoidance checks.",
)
h.add_argument(
"--include-path",
action="append",
default=[],
metavar="PATTERN",
help=(
"Include extra file paths to harvest (repeatable). Supports globs (including '**') and regex via 're:<regex>'. "
"Included files are still filtered by IgnorePolicy unless --dangerous is used."
),
)
h.add_argument(
"--exclude-path",
action="append",
default=[],
metavar="PATTERN",
help=(
"Exclude file paths from harvesting (repeatable). Supports globs (including '**') and regex via 're:<regex>'. "
"Excludes apply to all harvesting, including defaults."
),
)
h.add_argument(
"--sops",
nargs="+",
@ -138,9 +474,9 @@ def main() -> None:
action="store_true",
help="Don't use sudo on the remote host (when using --remote options). This may result in a limited harvest due to permission restrictions.",
)
_add_remote_args(h)
m = sub.add_parser("manifest", help="Render Ansible roles from a harvest")
_add_config_args(m)
m.add_argument(
"--harvest",
required=True,
@ -173,6 +509,8 @@ def main() -> None:
s = sub.add_parser(
"single-shot", help="Harvest state, then manifest Ansible code, in one shot"
)
_add_config_args(s)
_add_remote_args(s)
s.add_argument(
"--harvest",
help=(
@ -185,13 +523,34 @@ def main() -> None:
action="store_true",
help="Collect files more aggressively (may include secrets). Disables secret-avoidance checks.",
)
s.add_argument(
"--include-path",
action="append",
default=[],
metavar="PATTERN",
help=(
"Include extra file paths to harvest (repeatable). Supports globs (including '**') and regex via 're:<regex>'. "
"Included files are still filtered by IgnorePolicy unless --dangerous is used."
),
)
s.add_argument(
"--exclude-path",
action="append",
default=[],
metavar="PATTERN",
help=(
"Exclude file paths from harvesting (repeatable). Supports globs (including '**') and regex via 're:<regex>'. "
"Excludes apply to all harvesting, including defaults."
),
)
s.add_argument(
"--sops",
nargs="+",
metavar="GPG_FINGERPRINT",
help=(
"Encrypt the harvest as a SOPS-encrypted tarball, and bundle+encrypt the manifest output in --out "
"(same behavior as `harvest --sops` and `manifest --sops`)."
"(same behaviour as `harvest --sops` and `manifest --sops`)."
),
)
s.add_argument(
@ -209,18 +568,222 @@ def main() -> None:
),
)
_add_common_manifest_args(s)
_add_remote_args(s)
args = ap.parse_args()
d = sub.add_parser("diff", help="Compare two harvests and report differences")
_add_config_args(d)
d.add_argument(
"--old",
required=True,
help=(
"Old/baseline harvest (directory, a path to state.json, a tarball, or a SOPS-encrypted bundle)."
),
)
d.add_argument(
"--new",
required=True,
help=(
"New/current harvest (directory, a path to state.json, a tarball, or a SOPS-encrypted bundle)."
),
)
d.add_argument(
"--sops",
action="store_true",
help="Allow SOPS-encrypted harvest bundle inputs (requires `sops` on PATH).",
)
d.add_argument(
"--format",
choices=["text", "markdown", "json"],
default="text",
help="Report output format (default: text).",
)
d.add_argument(
"--exclude-path",
action="append",
default=[],
metavar="PATTERN",
help=(
"Exclude file paths from the diff report (repeatable). Supports globs (including '**') and regex via 're:<regex>'. "
"This affects file drift reporting only (added/removed/changed files), not package/service/user diffs."
),
)
d.add_argument(
"--ignore-package-versions",
action="store_true",
help=(
"Ignore package version changes in the diff report and exit status. "
"Package additions/removals are still reported. Useful when routine upgrades would otherwise create noisy drift."
),
)
d.add_argument(
"--enforce",
action="store_true",
help=(
"If differences are detected, attempt to enforce the old harvest state locally by generating a manifest and "
"running ansible-playbook. Requires ansible-playbook on PATH. "
"Enroll does not attempt to downgrade packages; if the only drift is package version upgrades (or newly installed packages), enforcement is skipped."
),
)
d.add_argument(
"--out",
help="Write the report to this file instead of stdout.",
)
d.add_argument(
"--exit-code",
action="store_true",
help="Exit with status 2 if differences are detected.",
)
d.add_argument(
"--notify-always",
action="store_true",
help="Send webhook/email even when there are no differences.",
)
d.add_argument(
"--webhook",
help="POST the report to this URL (only when differences are detected, unless --notify-always).",
)
d.add_argument(
"--webhook-format",
choices=["json", "text", "markdown"],
default="json",
help="Payload format for --webhook (default: json).",
)
d.add_argument(
"--webhook-header",
action="append",
default=[],
metavar="K:V",
help="Extra HTTP header for --webhook (repeatable), e.g. 'Authorization: Bearer ...'.",
)
d.add_argument(
"--email-to",
action="append",
default=[],
help="Email the report to this address (repeatable; only when differences are detected unless --notify-always).",
)
d.add_argument(
"--email-from",
help="From address for --email-to (default: enroll@<hostname>).",
)
d.add_argument(
"--email-subject",
help="Subject for --email-to (default: 'enroll diff report').",
)
d.add_argument(
"--smtp",
help="SMTP server host[:port] for --email-to. If omitted, uses local sendmail.",
)
d.add_argument(
"--smtp-user",
help="SMTP username (optional).",
)
d.add_argument(
"--smtp-password-env",
help="Environment variable containing SMTP password (optional).",
)
remote_host: Optional[str] = getattr(args, "remote_host", None)
e = sub.add_parser("explain", help="Explain a harvest state.json")
_add_config_args(e)
e.add_argument(
"harvest",
help=(
"Harvest input (directory, a path to state.json, a tarball, or a SOPS-encrypted bundle)."
),
)
e.add_argument(
"--sops",
action="store_true",
help="Treat the input as a SOPS-encrypted bundle (auto-detected if the filename ends with .sops).",
)
e.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format.",
)
e.add_argument(
"--max-examples",
type=int,
default=3,
help="How many example paths/refs to show per reason.",
)
v = sub.add_parser(
"validate", help="Validate a harvest bundle (state.json + artifacts)"
)
_add_config_args(v)
v.add_argument(
"harvest",
help=(
"Harvest input (directory, a path to state.json, a tarball, or a SOPS-encrypted bundle)."
),
)
v.add_argument(
"--sops",
action="store_true",
help="Treat the input as a SOPS-encrypted bundle (auto-detected if the filename ends with .sops).",
)
v.add_argument(
"--schema",
help=(
"Optional JSON schema source (file path or https:// URL). "
"If omitted, uses the schema vendored in the enroll codebase."
),
)
v.add_argument(
"--no-schema",
action="store_true",
help="Skip JSON schema validation and only perform bundle consistency checks.",
)
v.add_argument(
"--fail-on-warnings",
action="store_true",
help="Exit non-zero if validation produces warnings.",
)
v.add_argument(
"--format",
choices=["text", "json"],
default="text",
help="Output format.",
)
v.add_argument(
"--out",
help="Write the report to this file instead of stdout.",
)
argv = sys.argv[1:]
cfg_path = _discover_config_path(argv)
argv = _inject_config_argv(
argv,
cfg_path=cfg_path,
root_parser=ap,
subparsers={
"harvest": h,
"manifest": m,
"single-shot": s,
"diff": d,
"explain": e,
"validate": v,
},
)
args = ap.parse_args(argv)
# Preserve historical defaults for remote harvesting unless ssh_config lookup is enabled.
# This lets ssh_config values take effect when the user did not explicitly set
# --remote-user / --remote-port.
if hasattr(args, "remote_host"):
rsc = getattr(args, "remote_ssh_config", None)
if not rsc:
if getattr(args, "remote_port", None) is None:
setattr(args, "remote_port", 22)
if getattr(args, "remote_user", None) is None:
setattr(args, "remote_user", os.environ.get("USER") or None)
try:
if args.cmd == "harvest":
sops_fps = getattr(args, "sops", None)
if remote_host:
if args.remote_host:
if sops_fps:
out_file = _resolve_sops_out_file(args.out, hint=remote_host)
out_file = _resolve_sops_out_file(args.out, hint=args.remote_host)
with tempfile.TemporaryDirectory(prefix="enroll-harvest-") as td:
tmp_bundle = Path(td) / "bundle"
tmp_bundle.mkdir(parents=True, exist_ok=True)
@ -229,12 +792,20 @@ def main() -> None:
except OSError:
pass
remote_harvest(
ask_become_pass=args.ask_become_pass,
ask_key_passphrase=bool(args.ask_key_passphrase),
ssh_key_passphrase_env=getattr(
args, "ssh_key_passphrase_env", None
),
local_out_dir=tmp_bundle,
remote_host=remote_host,
remote_port=int(args.remote_port),
remote_host=args.remote_host,
remote_port=args.remote_port,
remote_user=args.remote_user,
remote_ssh_config=args.remote_ssh_config,
dangerous=bool(args.dangerous),
no_sudo=bool(args.no_sudo),
include_paths=list(getattr(args, "include_path", []) or []),
exclude_paths=list(getattr(args, "exclude_path", []) or []),
)
_encrypt_harvest_dir_to_sops(
tmp_bundle, out_file, list(sops_fps)
@ -244,15 +815,23 @@ def main() -> None:
out_dir = (
Path(args.out)
if args.out
else new_harvest_cache_dir(hint=remote_host).dir
else new_harvest_cache_dir(hint=args.remote_host).dir
)
state = remote_harvest(
ask_become_pass=args.ask_become_pass,
ask_key_passphrase=bool(args.ask_key_passphrase),
ssh_key_passphrase_env=getattr(
args, "ssh_key_passphrase_env", None
),
local_out_dir=out_dir,
remote_host=remote_host,
remote_port=int(args.remote_port),
remote_host=args.remote_host,
remote_port=args.remote_port,
remote_user=args.remote_user,
remote_ssh_config=args.remote_ssh_config,
dangerous=bool(args.dangerous),
no_sudo=bool(args.no_sudo),
include_paths=list(getattr(args, "include_path", []) or []),
exclude_paths=list(getattr(args, "exclude_path", []) or []),
)
print(str(state))
else:
@ -265,18 +844,68 @@ def main() -> None:
os.chmod(tmp_bundle, 0o700)
except OSError:
pass
harvest(str(tmp_bundle), dangerous=bool(args.dangerous))
harvest(
str(tmp_bundle),
dangerous=bool(args.dangerous),
include_paths=list(getattr(args, "include_path", []) or []),
exclude_paths=list(getattr(args, "exclude_path", []) or []),
)
_encrypt_harvest_dir_to_sops(
tmp_bundle, out_file, list(sops_fps)
)
print(str(out_file))
else:
if not args.out:
raise SystemExit(
"error: --out is required unless --remote-host is set"
if args.out:
out_dir = args.out
else:
out_dir = (
Path(args.out)
if args.out
else new_harvest_cache_dir(hint=args.remote_host).dir
)
path = harvest(args.out, dangerous=bool(args.dangerous))
path = harvest(
out_dir,
dangerous=bool(args.dangerous),
include_paths=list(getattr(args, "include_path", []) or []),
exclude_paths=list(getattr(args, "exclude_path", []) or []),
)
print(path)
elif args.cmd == "explain":
out = explain_state(
args.harvest,
sops_mode=bool(getattr(args, "sops", False)),
fmt=str(getattr(args, "format", "text")),
max_examples=int(getattr(args, "max_examples", 3)),
)
sys.stdout.write(out)
elif args.cmd == "validate":
res = validate_harvest(
args.harvest,
sops_mode=bool(getattr(args, "sops", False)),
schema=getattr(args, "schema", None),
no_schema=bool(getattr(args, "no_schema", False)),
)
fmt = str(getattr(args, "format", "text"))
if fmt == "json":
txt = json.dumps(res.to_dict(), indent=2, sort_keys=True) + "\n"
else:
txt = res.to_text()
out_path = getattr(args, "out", None)
if out_path:
p = Path(out_path).expanduser()
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(txt, encoding="utf-8")
else:
sys.stdout.write(txt)
if res.errors:
raise SystemExit(1)
if res.warnings and bool(getattr(args, "fail_on_warnings", False)):
raise SystemExit(1)
elif args.cmd == "manifest":
out_enc = manifest(
args.harvest,
@ -287,11 +916,107 @@ def main() -> None:
)
if getattr(args, "sops", None) and out_enc:
print(str(out_enc))
elif args.cmd == "diff":
report, has_changes = compare_harvests(
args.old,
args.new,
sops_mode=bool(getattr(args, "sops", False)),
exclude_paths=list(getattr(args, "exclude_path", []) or []),
ignore_package_versions=bool(
getattr(args, "ignore_package_versions", False)
),
)
# Optional enforcement: if drift is detected, attempt to restore the
# system to the *old* (baseline) state using ansible-playbook.
if bool(getattr(args, "enforce", False)):
if has_changes:
if not has_enforceable_drift(report):
report["enforcement"] = {
"requested": True,
"status": "skipped",
"reason": (
"no enforceable drift detected (only additions and/or package version changes); "
"enroll does not attempt to downgrade packages"
),
}
else:
try:
info = enforce_old_harvest(
args.old,
sops_mode=bool(getattr(args, "sops", False)),
report=report,
)
except Exception as e:
raise SystemExit(
f"error: could not enforce old harvest state: {e}"
) from e
report["enforcement"] = {
"requested": True,
**(info or {}),
}
else:
report["enforcement"] = {
"requested": True,
"status": "skipped",
"reason": "no differences detected",
}
txt = format_report(report, fmt=str(getattr(args, "format", "text")))
out_path = getattr(args, "out", None)
if out_path:
p = Path(out_path).expanduser()
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(txt, encoding="utf-8")
else:
print(txt, end="" if txt.endswith("\n") else "\n")
should_notify = has_changes or bool(getattr(args, "notify_always", False))
webhook = getattr(args, "webhook", None)
if webhook and should_notify:
wf = str(getattr(args, "webhook_format", "json"))
payload = format_report(report, fmt=wf)
body = payload.encode("utf-8")
headers = {}
if wf == "json":
headers["Content-Type"] = "application/json"
else:
headers["Content-Type"] = "text/plain; charset=utf-8"
for hv in getattr(args, "webhook_header", []) or []:
if ":" in hv:
k, v = hv.split(":", 1)
headers[k.strip()] = v.strip()
status, _resp = post_webhook(webhook, body, headers=headers)
if status and status >= 400:
raise SystemExit(f"error: webhook returned HTTP {status}")
to_addrs = getattr(args, "email_to", []) or []
if to_addrs and should_notify:
subject = getattr(args, "email_subject", None) or "enroll diff report"
smtp_pw = None
pw_env = getattr(args, "smtp_password_env", None)
if pw_env:
smtp_pw = os.environ.get(str(pw_env))
send_email(
to_addrs=list(to_addrs),
subject=str(subject),
body=txt,
from_addr=getattr(args, "email_from", None),
smtp=getattr(args, "smtp", None),
smtp_user=getattr(args, "smtp_user", None),
smtp_password=smtp_pw,
)
if getattr(args, "exit_code", False) and has_changes:
raise SystemExit(2)
elif args.cmd == "single-shot":
sops_fps = getattr(args, "sops", None)
if remote_host:
if args.remote_host:
if sops_fps:
out_file = _resolve_sops_out_file(args.harvest, hint=remote_host)
out_file = _resolve_sops_out_file(
args.harvest, hint=args.remote_host
)
with tempfile.TemporaryDirectory(prefix="enroll-harvest-") as td:
tmp_bundle = Path(td) / "bundle"
tmp_bundle.mkdir(parents=True, exist_ok=True)
@ -300,12 +1025,20 @@ def main() -> None:
except OSError:
pass
remote_harvest(
ask_become_pass=args.ask_become_pass,
+ask_key_passphrase=bool(args.ask_key_passphrase),
+ssh_key_passphrase_env=getattr(
+args, "ssh_key_passphrase_env", None
+),
local_out_dir=tmp_bundle,
-remote_host=remote_host,
-remote_port=int(args.remote_port),
+remote_host=args.remote_host,
+remote_port=args.remote_port,
remote_user=args.remote_user,
+remote_ssh_config=args.remote_ssh_config,
dangerous=bool(args.dangerous),
+no_sudo=bool(args.no_sudo),
+include_paths=list(getattr(args, "include_path", []) or []),
+exclude_paths=list(getattr(args, "exclude_path", []) or []),
)
_encrypt_harvest_dir_to_sops(
tmp_bundle, out_file, list(sops_fps)
@ -324,15 +1057,23 @@ def main() -> None:
harvest_dir = (
Path(args.harvest)
if args.harvest
-else new_harvest_cache_dir(hint=remote_host).dir
+else new_harvest_cache_dir(hint=args.remote_host).dir
)
remote_harvest(
ask_become_pass=args.ask_become_pass,
+ask_key_passphrase=bool(args.ask_key_passphrase),
+ssh_key_passphrase_env=getattr(
+args, "ssh_key_passphrase_env", None
+),
local_out_dir=harvest_dir,
-remote_host=remote_host,
-remote_port=int(args.remote_port),
+remote_host=args.remote_host,
+remote_port=args.remote_port,
remote_user=args.remote_user,
+remote_ssh_config=args.remote_ssh_config,
dangerous=bool(args.dangerous),
+no_sudo=bool(args.no_sudo),
+include_paths=list(getattr(args, "include_path", []) or []),
+exclude_paths=list(getattr(args, "exclude_path", []) or []),
)
manifest(
str(harvest_dir),
@ -353,7 +1094,12 @@ def main() -> None:
os.chmod(tmp_bundle, 0o700)
except OSError:
pass
-harvest(str(tmp_bundle), dangerous=bool(args.dangerous))
+harvest(
+str(tmp_bundle),
+dangerous=bool(args.dangerous),
+include_paths=list(getattr(args, "include_path", []) or []),
+exclude_paths=list(getattr(args, "exclude_path", []) or []),
+)
_encrypt_harvest_dir_to_sops(
tmp_bundle, out_file, list(sops_fps)
)
@ -372,12 +1118,29 @@ def main() -> None:
raise SystemExit(
"error: --harvest is required unless --remote-host is set"
)
-harvest(args.harvest, dangerous=bool(args.dangerous))
+harvest(
+args.harvest,
+dangerous=bool(args.dangerous),
+include_paths=list(getattr(args, "include_path", []) or []),
+exclude_paths=list(getattr(args, "exclude_path", []) or []),
+)
manifest(
args.harvest,
args.out,
fqdn=args.fqdn,
jinjaturtle=_jt_mode(args),
)
+except RemoteSudoPasswordRequired:
+raise SystemExit(
+"error: remote sudo requires a password. Re-run with --ask-become-pass."
+) from None
+except RemoteSSHKeyPassphraseRequired as e:
+msg = str(e).strip() or (
+"SSH private key passphrase is required. "
+"Re-run with --ask-key-passphrase or --ssh-key-passphrase-env VAR."
+)
+raise SystemExit(f"error: {msg}") from None
+except RuntimeError as e:
+raise SystemExit(f"error: {e}") from None
except SopsError as e:
-raise SystemExit(f"error: {e}")
+raise SystemExit(f"error: {e}") from None
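The webhook branch earlier in this hunk builds request headers from repeated `--webhook-header` values by splitting each one on the first colon and skipping values that contain no colon. A minimal sketch of that parsing in isolation follows; `parse_webhook_headers` is a hypothetical helper name used only for illustration, not a function from enroll.

```python
from typing import Dict, Iterable


def parse_webhook_headers(values: Iterable[str]) -> Dict[str, str]:
    """Turn ["Name: value", ...] into a header dict (hypothetical helper)."""
    headers: Dict[str, str] = {}
    for hv in values:
        if ":" not in hv:
            # As in the diff, entries without a colon are silently ignored.
            continue
        # Split on the first colon only, so the value may itself contain colons.
        key, value = hv.split(":", 1)
        headers[key.strip()] = value.strip()
    return headers


example = parse_webhook_headers(["X-Token: abc:def", "no-colon-here"])
```

Splitting with `maxsplit=1` is what keeps values such as URLs or `host:port` pairs intact.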


@ -1,7 +1,6 @@
from __future__ import annotations
import glob
-import hashlib
import os
import subprocess # nosec
from typing import Dict, List, Optional, Set, Tuple
@ -64,6 +63,50 @@ def list_manual_packages() -> List[str]:
return sorted(set(pkgs))
def list_installed_packages() -> Dict[str, List[Dict[str, str]]]:
"""Return mapping of installed package name -> installed instances.
Uses dpkg-query and is expected to work on Debian/Ubuntu-like systems.
Output format:
{"pkg": [{"version": "...", "arch": "..."}, ...], ...}
"""
try:
p = subprocess.run(
[
"dpkg-query",
"-W",
"-f=${Package}\t${Version}\t${Architecture}\n",
],
text=True,
capture_output=True,
check=False,
) # nosec
except Exception:
return {}
out: Dict[str, List[Dict[str, str]]] = {}
for raw in (p.stdout or "").splitlines():
line = raw.strip("\n")
if not line:
continue
parts = line.split("\t")
if len(parts) < 3:
continue
name, ver, arch = parts[0].strip(), parts[1].strip(), parts[2].strip()
if not name:
continue
out.setdefault(name, []).append({"version": ver, "arch": arch})
# Stable ordering for deterministic JSON dumps.
for k in list(out.keys()):
out[k] = sorted(
out[k], key=lambda x: (x.get("arch") or "", x.get("version") or "")
)
return out
def build_dpkg_etc_index(
info_dir: str = "/var/lib/dpkg/info",
) -> Tuple[Set[str], Dict[str, str], Dict[str, Set[str]], Dict[str, List[str]]]:
@ -154,7 +197,9 @@ def parse_status_conffiles(
if ":" in line:
k, v = line.split(":", 1)
key = k
-cur[key] = v.lstrip()
+# Preserve leading spaces in continuation lines, but strip
+# the trailing newline from the initial key line value.
+cur[key] = v.lstrip().rstrip("\n")
if cur:
flush()
@ -178,28 +223,3 @@ def read_pkg_md5sums(pkg: str) -> Dict[str, str]:
md5, rel = line.split(None, 1)
m[rel.strip()] = md5.strip()
return m
-def file_md5(path: str) -> str:
-h = hashlib.md5() # nosec
-with open(path, "rb") as f:
-for chunk in iter(lambda: f.read(1024 * 1024), b""):
-h.update(chunk)
-return h.hexdigest()
-def stat_triplet(path: str) -> Tuple[str, str, str]:
-st = os.stat(path, follow_symlinks=True)
-mode = oct(st.st_mode & 0o777)[2:].zfill(4)
-import pwd, grp
-try:
-owner = pwd.getpwuid(st.st_uid).pw_name
-except KeyError:
-owner = str(st.st_uid)
-try:
-group = grp.getgrgid(st.st_gid).gr_name
-except KeyError:
-group = str(st.st_gid)
-return owner, group, mode
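The parsing loop in `list_installed_packages` (added earlier in this file's diff) can be tried without calling `dpkg-query`. The sketch below applies the same tab-splitting and sorting steps to a sample string; `parse_dpkg_query_output` and the sample text are illustrative assumptions, not part of enroll.

```python
from typing import Dict, List


def parse_dpkg_query_output(stdout: str) -> Dict[str, List[Dict[str, str]]]:
    """Parse "name<TAB>version<TAB>arch" lines into the documented mapping."""
    out: Dict[str, List[Dict[str, str]]] = {}
    for raw in stdout.splitlines():
        parts = raw.split("\t")
        if len(parts) < 3:
            # Malformed lines are skipped rather than raising.
            continue
        name, ver, arch = (p.strip() for p in parts[:3])
        if not name:
            continue
        out.setdefault(name, []).append({"version": ver, "arch": arch})
    # Sort instances by (arch, version) so repeated runs give the same JSON.
    for name in out:
        out[name].sort(key=lambda x: (x["arch"], x["version"]))
    return out


sample = "libc6\t2.36-9\ti386\nlibc6\t2.36-9\tamd64\nbroken line\n"
parsed = parse_dpkg_query_output(sample)
```

A package maps to a list because multi-arch systems can have the same package installed for more than one architecture.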

enroll/diff.py (1353 changed lines)

File diff suppressed because it is too large.

enroll/explain.py (new file, 578 lines)

@ -0,0 +1,578 @@
from __future__ import annotations
import json
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Tuple
from .diff import _bundle_from_input, _load_state # reuse existing bundle handling
@dataclass(frozen=True)
class ReasonInfo:
title: str
why: str
_MANAGED_FILE_REASONS: Dict[str, ReasonInfo] = {
# Package manager / repo config
"apt_config": ReasonInfo(
"APT configuration",
"APT configuration affecting package installation and repository behavior.",
),
"apt_source": ReasonInfo(
"APT repository source",
"APT source list entries (e.g. sources.list or sources.list.d).",
),
"apt_keyring": ReasonInfo(
"APT keyring",
"Repository signing key material used by APT.",
),
"apt_signed_by_keyring": ReasonInfo(
"APT Signed-By keyring",
"Keyring referenced via a Signed-By directive in an APT source.",
),
"yum_conf": ReasonInfo(
"YUM/DNF main config",
"Primary YUM configuration (often /etc/yum.conf).",
),
"yum_config": ReasonInfo(
"YUM/DNF config",
"YUM/DNF configuration files (including conf.d).",
),
"yum_repo": ReasonInfo(
"YUM/DNF repository",
"YUM/DNF repository definitions (e.g. yum.repos.d).",
),
"dnf_config": ReasonInfo(
"DNF configuration",
"DNF configuration affecting package installation and repositories.",
),
"rpm_gpg_key": ReasonInfo(
"RPM GPG key",
"Repository signing keys used by RPM/YUM/DNF.",
),
# SSH
"authorized_keys": ReasonInfo(
"SSH authorized keys",
"User authorized_keys files (controls who can log in with SSH keys).",
),
"ssh_public_key": ReasonInfo(
"SSH public key",
"SSH host/user public keys relevant to authentication.",
),
# System config / security
"system_security": ReasonInfo(
"Security configuration",
"Security-sensitive configuration (SSH, sudoers, PAM, auth, etc.).",
),
"system_network": ReasonInfo(
"Network configuration",
"Network configuration (interfaces, resolv.conf, network managers, etc.).",
),
"system_firewall": ReasonInfo(
"Firewall configuration",
"Firewall rules/configuration (ufw, nftables, iptables, etc.).",
),
"system_sysctl": ReasonInfo(
"sysctl configuration",
"Kernel sysctl tuning (sysctl.conf / sysctl.d).",
),
"system_modprobe": ReasonInfo(
"modprobe configuration",
"Kernel module configuration (modprobe.d).",
),
"system_mounts": ReasonInfo(
"Mount configuration",
"Mount configuration (e.g. /etc/fstab and related).",
),
"system_rc": ReasonInfo(
"Startup/rc configuration",
"Startup scripts / rc configuration that can affect boot behavior.",
),
# systemd + timers
"systemd_dropin": ReasonInfo(
"systemd drop-in",
"systemd override/drop-in files that modify a unit's behavior.",
),
"systemd_envfile": ReasonInfo(
"systemd EnvironmentFile",
"Files referenced by systemd units via EnvironmentFile.",
),
"related_timer": ReasonInfo(
"Related systemd timer",
"A systemd timer captured because it is related to a unit/service.",
),
# cron / logrotate
"system_cron": ReasonInfo(
"System cron",
"System cron configuration (crontab, cron.d, etc.).",
),
"cron_snippet": ReasonInfo(
"Cron snippet",
"Cron snippets referenced/used by harvested services or configs.",
),
"system_logrotate": ReasonInfo(
"System logrotate",
"System logrotate configuration.",
),
"logrotate_snippet": ReasonInfo(
"logrotate snippet",
"logrotate snippets/configs referenced in system configuration.",
),
# Custom paths / drift signals
"modified_conffile": ReasonInfo(
"Modified package conffile",
"A package-managed conffile differs from the packaged/default version.",
),
"modified_packaged_file": ReasonInfo(
"Modified packaged file",
"A file owned by a package differs from the packaged version.",
),
"custom_unowned": ReasonInfo(
"Unowned custom file",
"A file not owned by any package (often custom/operator-managed).",
),
"custom_specific_path": ReasonInfo(
"Custom specific path",
"A specific path included by a custom rule or snapshot.",
),
"usr_local_bin_script": ReasonInfo(
"/usr/local/bin script",
"Executable scripts under /usr/local/bin (often operator-installed).",
),
"usr_local_etc_custom": ReasonInfo(
"/usr/local/etc custom",
"Custom configuration under /usr/local/etc.",
),
# User includes
"user_include": ReasonInfo(
"User-included path",
"Included because you specified it via --include-path / include patterns.",
),
}
_MANAGED_DIR_REASONS: Dict[str, ReasonInfo] = {
"parent_of_managed_file": ReasonInfo(
"Parent directory",
"Included so permissions/ownership can be recreated for managed files.",
),
"user_include_dir": ReasonInfo(
"User-included directory",
"Included because you specified it via --include-path / include patterns.",
),
}
_EXCLUDED_REASONS: Dict[str, ReasonInfo] = {
"user_excluded": ReasonInfo(
"User excluded",
"Excluded because you explicitly excluded it (e.g. --exclude-path / patterns).",
),
"unreadable": ReasonInfo(
"Unreadable",
"Enroll could not read this path with the permissions it had.",
),
"log_file": ReasonInfo(
"Log file",
"Excluded because it appears to be a log file (usually noisy/large).",
),
"denied_path": ReasonInfo(
"Denied path",
"Excluded because the path is in a denylist for safety.",
),
"too_large": ReasonInfo(
"Too large",
"Excluded because it exceeded the size limit for harvested files.",
),
"not_regular_file": ReasonInfo(
"Not a regular file",
"Excluded because it was not a regular file (device, socket, etc.).",
),
"binary_like": ReasonInfo(
"Binary-like",
"Excluded because it looked like binary content (not useful for config management).",
),
"sensitive_content": ReasonInfo(
"Sensitive content",
"Excluded because it likely contains secrets (e.g. shadow, private keys).",
),
}
_OBSERVED_VIA: Dict[str, ReasonInfo] = {
"user_installed": ReasonInfo(
"User-installed",
"Package appears explicitly installed (as opposed to only pulled in as a dependency).",
),
"systemd_unit": ReasonInfo(
"Referenced by systemd unit",
"Package is associated with a systemd unit that was harvested.",
),
"package_role": ReasonInfo(
"Referenced by package role",
"Package was referenced by an enroll packages snapshot/role.",
),
}
def _ri(mapping: Dict[str, ReasonInfo], key: str) -> ReasonInfo:
return mapping.get(key) or ReasonInfo(key, f"Captured with reason '{key}'")
def _role_common_counts(role_obj: Dict[str, Any]) -> Tuple[int, int, int, int]:
"""Return (managed_files, managed_dirs, excluded, notes) counts for a RoleCommon object."""
mf = len(role_obj.get("managed_files") or [])
md = len(role_obj.get("managed_dirs") or [])
ex = len(role_obj.get("excluded") or [])
nt = len(role_obj.get("notes") or [])
return mf, md, ex, nt
def _summarize_reasons(
items: Iterable[Dict[str, Any]],
reason_key: str,
*,
mapping: Dict[str, ReasonInfo],
max_examples: int,
) -> List[Dict[str, Any]]:
by_reason: Dict[str, List[str]] = defaultdict(list)
counts: Counter[str] = Counter()
for it in items:
if not isinstance(it, dict):
continue
r = it.get(reason_key)
if not r:
continue
r = str(r)
counts[r] += 1
p = it.get("path")
if (
max_examples > 0
and isinstance(p, str)
and p
and len(by_reason[r]) < max_examples
):
by_reason[r].append(p)
out: List[Dict[str, Any]] = []
for reason, count in counts.most_common():
info = _ri(mapping, reason)
out.append(
{
"reason": reason,
"count": count,
"title": info.title,
"why": info.why,
"examples": by_reason.get(reason, []),
}
)
return out
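As a rough illustration of what `_summarize_reasons` computes, the sketch below counts items per reason and keeps at most `max_examples` example paths for each. The sample items are invented for the example, and it returns plain tuples rather than the dictionaries used above.

```python
from collections import Counter, defaultdict

# Invented sample data shaped like managed_files entries.
items = [
    {"path": "/etc/apt/sources.list", "reason": "apt_source"},
    {"path": "/etc/apt/sources.list.d/x.list", "reason": "apt_source"},
    {"path": "/etc/fstab", "reason": "system_mounts"},
]
max_examples = 1

counts = Counter()
examples = defaultdict(list)
for it in items:
    reason = it["reason"]
    counts[reason] += 1
    # Keep only the first few paths per reason as examples.
    if len(examples[reason]) < max_examples:
        examples[reason].append(it["path"])

# most_common() orders reasons by descending count.
summary = [(reason, n, examples[reason]) for reason, n in counts.most_common()]
```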
def explain_state(
harvest: str,
*,
sops_mode: bool = False,
fmt: str = "text",
max_examples: int = 3,
) -> str:
"""Explain a harvest bundle's state.json.
`harvest` may be:
- a bundle directory
- a path to state.json
- a tarball (.tar.gz/.tgz)
- a SOPS-encrypted bundle (.sops)
"""
bundle = _bundle_from_input(harvest, sops_mode=sops_mode)
state = _load_state(bundle.dir)
host = state.get("host") or {}
enroll = state.get("enroll") or {}
roles = state.get("roles") or {}
inv = state.get("inventory") or {}
inv_pkgs = (inv.get("packages") or {}) if isinstance(inv, dict) else {}
role_summaries: List[Dict[str, Any]] = []
# Users
users_obj = roles.get("users") or {}
user_entries = users_obj.get("users") or []
mf, md, ex, _nt = (
_role_common_counts(users_obj) if isinstance(users_obj, dict) else (0, 0, 0, 0)
)
role_summaries.append(
{
"role": "users",
"summary": f"{len(user_entries)} user(s), {mf} file(s), {ex} excluded",
"notes": users_obj.get("notes") or [],
}
)
# Services
services_list = roles.get("services") or []
if isinstance(services_list, list):
total_mf = sum(
len((s.get("managed_files") or []))
for s in services_list
if isinstance(s, dict)
)
total_ex = sum(
len((s.get("excluded") or [])) for s in services_list if isinstance(s, dict)
)
role_summaries.append(
{
"role": "services",
"summary": f"{len(services_list)} unit(s), {total_mf} file(s), {total_ex} excluded",
"units": [
{
"unit": s.get("unit"),
"active_state": s.get("active_state"),
"sub_state": s.get("sub_state"),
"unit_file_state": s.get("unit_file_state"),
"condition_result": s.get("condition_result"),
}
for s in services_list
if isinstance(s, dict)
],
}
)
# Package snapshots
pkgs_list = roles.get("packages") or []
if isinstance(pkgs_list, list):
total_mf = sum(
len((p.get("managed_files") or []))
for p in pkgs_list
if isinstance(p, dict)
)
total_ex = sum(
len((p.get("excluded") or [])) for p in pkgs_list if isinstance(p, dict)
)
role_summaries.append(
{
"role": "packages",
"summary": f"{len(pkgs_list)} package snapshot(s), {total_mf} file(s), {total_ex} excluded",
"packages": [
p.get("package") for p in pkgs_list if isinstance(p, dict)
],
}
)
# Single snapshots
for rname in [
"apt_config",
"dnf_config",
"etc_custom",
"usr_local_custom",
"extra_paths",
]:
robj = roles.get(rname) or {}
if not isinstance(robj, dict):
continue
mf, md, ex, _nt = _role_common_counts(robj)
extra: Dict[str, Any] = {}
if rname == "extra_paths":
extra = {
"include_patterns": robj.get("include_patterns") or [],
"exclude_patterns": robj.get("exclude_patterns") or [],
}
role_summaries.append(
{
"role": rname,
"summary": f"{mf} file(s), {md} dir(s), {ex} excluded",
"notes": robj.get("notes") or [],
**extra,
}
)
# Flatten managed/excluded across roles
all_managed_files: List[Dict[str, Any]] = []
all_managed_dirs: List[Dict[str, Any]] = []
all_excluded: List[Dict[str, Any]] = []
def _consume_role(role_obj: Dict[str, Any]) -> None:
for f in role_obj.get("managed_files") or []:
if isinstance(f, dict):
all_managed_files.append(f)
for d in role_obj.get("managed_dirs") or []:
if isinstance(d, dict):
all_managed_dirs.append(d)
for e in role_obj.get("excluded") or []:
if isinstance(e, dict):
all_excluded.append(e)
if isinstance(users_obj, dict):
_consume_role(users_obj)
if isinstance(services_list, list):
for s in services_list:
if isinstance(s, dict):
_consume_role(s)
if isinstance(pkgs_list, list):
for p in pkgs_list:
if isinstance(p, dict):
_consume_role(p)
for rname in [
"apt_config",
"dnf_config",
"etc_custom",
"usr_local_custom",
"extra_paths",
]:
robj = roles.get(rname)
if isinstance(robj, dict):
_consume_role(robj)
managed_file_reasons = _summarize_reasons(
all_managed_files,
"reason",
mapping=_MANAGED_FILE_REASONS,
max_examples=max_examples,
)
managed_dir_reasons = _summarize_reasons(
all_managed_dirs,
"reason",
mapping=_MANAGED_DIR_REASONS,
max_examples=max_examples,
)
excluded_reasons = _summarize_reasons(
all_excluded,
"reason",
mapping=_EXCLUDED_REASONS,
max_examples=max_examples,
)
# Inventory observed_via breakdown (count packages that contain at least one entry for that kind)
observed_kinds: Counter[str] = Counter()
observed_refs: Dict[str, Counter[str]] = defaultdict(Counter)
for _pkg, entry in inv_pkgs.items():
if not isinstance(entry, dict):
continue
seen_kinds = set()
for ov in entry.get("observed_via") or []:
if not isinstance(ov, dict):
continue
kind = ov.get("kind")
if not kind:
continue
kind = str(kind)
seen_kinds.add(kind)
ref = ov.get("ref")
if isinstance(ref, str) and ref:
observed_refs[kind][ref] += 1
for k in seen_kinds:
observed_kinds[k] += 1
observed_via_summary: List[Dict[str, Any]] = []
for kind, cnt in observed_kinds.most_common():
info = _ri(_OBSERVED_VIA, kind)
top_refs = [
r for r, _ in observed_refs.get(kind, Counter()).most_common(max_examples)
]
observed_via_summary.append(
{
"kind": kind,
"count": cnt,
"title": info.title,
"why": info.why,
"top_refs": top_refs,
}
)
report: Dict[str, Any] = {
"bundle_dir": str(bundle.dir),
"host": host,
"enroll": enroll,
"inventory": {
"package_count": len(inv_pkgs),
"observed_via": observed_via_summary,
},
"roles": role_summaries,
"reasons": {
"managed_files": managed_file_reasons,
"managed_dirs": managed_dir_reasons,
"excluded": excluded_reasons,
},
}
if fmt == "json":
return json.dumps(report, indent=2, sort_keys=True)
# Text rendering
out: List[str] = []
out.append(f"Enroll explained: {harvest}")
hn = host.get("hostname") or "(unknown host)"
os_family = host.get("os") or "unknown"
pkg_backend = host.get("pkg_backend") or "?"
ver = enroll.get("version") or "?"
out.append(f"Host: {hn} (os: {os_family}, pkg: {pkg_backend})")
out.append(f"Enroll: {ver}")
out.append("")
out.append("Inventory")
out.append(f"- Packages: {len(inv_pkgs)}")
if observed_via_summary:
out.append("- Why packages were included (observed_via):")
for ov in observed_via_summary:
extra = ""
if ov.get("top_refs"):
extra = f" (e.g. {', '.join(ov['top_refs'])})"
out.append(f" - {ov['kind']}: {ov['count']} {ov['why']}{extra}")
out.append("")
out.append("Roles collected")
for rs in role_summaries:
out.append(f"- {rs['role']}: {rs['summary']}")
if rs["role"] == "extra_paths":
inc = rs.get("include_patterns") or []
exc = rs.get("exclude_patterns") or []
if inc:
suffix = ", ..." if len(inc) > max_examples else ""
out.append(
f" include_patterns: {', '.join(map(str, inc[:max_examples]))}{suffix}"
)
if exc:
suffix = ", ..." if len(exc) > max_examples else ""
out.append(
f" exclude_patterns: {', '.join(map(str, exc[:max_examples]))}{suffix}"
)
notes = rs.get("notes") or []
if notes:
for n in notes[:max_examples]:
out.append(f" note: {n}")
if len(notes) > max_examples:
out.append(
f" note: (+{len(notes) - max_examples} more. Use --format json to see them all)"
)
out.append("")
out.append("Why files were included (managed_files.reason)")
if managed_file_reasons:
for r in managed_file_reasons[:15]:
exs = r.get("examples") or []
ex_txt = f" Examples: {', '.join(exs)}" if exs else ""
out.append(f"- {r['reason']} ({r['count']}): {r['why'].rstrip('.')}.{ex_txt}")
if len(managed_file_reasons) > 15:
out.append(
f"- (+{len(managed_file_reasons) - 15} more reasons. Use --format json to see them all)"
)
else:
out.append("- (no managed files)")
if managed_dir_reasons:
out.append("")
out.append("Why directories were included (managed_dirs.reason)")
for r in managed_dir_reasons:
out.append(f"- {r['reason']} ({r['count']}): {r['why']}")
out.append("")
out.append("Why paths were excluded")
if excluded_reasons:
for r in excluded_reasons:
exs = r.get("examples") or []
ex_txt = f" Examples: {', '.join(exs)}" if exs else ""
out.append(f"- {r['reason']} ({r['count']}): {r['why'].rstrip('.')}.{ex_txt}")
else:
out.append("- (no excluded paths)")
return "\n".join(out) + "\n"
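The `observed_via` breakdown in `explain_state` counts each package at most once per kind, even when a package has several entries of the same kind. A small sketch of that counting rule, using invented inventory data:

```python
from collections import Counter

# Invented sample shaped like state["inventory"]["packages"].
inv_pkgs = {
    "nginx": {
        "observed_via": [
            {"kind": "systemd_unit", "ref": "nginx.service"},
            {"kind": "systemd_unit", "ref": "nginx-debug.service"},
            {"kind": "user_installed"},
        ]
    },
    "curl": {"observed_via": [{"kind": "user_installed"}]},
}

observed_kinds = Counter()
for entry in inv_pkgs.values():
    # A set removes duplicate kinds within one package.
    kinds = {ov["kind"] for ov in entry["observed_via"]}
    observed_kinds.update(kinds)
```

Here `nginx` contributes one count to `systemd_unit` despite having two unit references.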

enroll/fsutil.py (new file, 40 lines)

@ -0,0 +1,40 @@
from __future__ import annotations
import hashlib
import os
from typing import Tuple
def file_md5(path: str) -> str:
"""Return hex MD5 of a file.
Used for Debian dpkg baseline comparisons.
"""
h = hashlib.md5() # nosec
with open(path, "rb") as f:
for chunk in iter(lambda: f.read(1024 * 1024), b""):
h.update(chunk)
return h.hexdigest()
def stat_triplet(path: str) -> Tuple[str, str, str]:
"""Return (owner, group, mode) for a path.
owner/group are usernames/group names when resolvable, otherwise numeric ids.
mode is a zero-padded octal string (e.g. "0644").
"""
st = os.stat(path, follow_symlinks=True)
mode = oct(st.st_mode & 0o7777)[2:].zfill(4)
import grp
import pwd
try:
owner = pwd.getpwuid(st.st_uid).pw_name
except KeyError:
owner = str(st.st_uid)
try:
group = grp.getgrgid(st.st_gid).gr_name
except KeyError:
group = str(st.st_gid)
return owner, group, mode
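The `stat_triplet` above masks the mode with `0o7777`, whereas the copy removed from the dpkg module earlier in this diff used `0o777`. The wider mask keeps the setuid, setgid and sticky bits. A short sketch of the difference; `mode_string` is a hypothetical helper written for this example:

```python
import stat


def mode_string(st_mode: int, mask: int = 0o7777) -> str:
    """Format permission bits as a zero-padded octal string."""
    return oct(st_mode & mask)[2:].zfill(4)


# A regular file with the setuid bit set and rwxr-xr-x permissions.
setuid_mode = stat.S_IFREG | 0o4755

with_special_bits = mode_string(setuid_mode)               # mask 0o7777
without_special_bits = mode_string(setuid_mode, mask=0o777)
```

With the narrower mask the setuid bit is dropped, so a recreated file would lose it.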

File diff suppressed because it is too large.

@ -23,6 +23,27 @@ DEFAULT_DENY_GLOBS = [
"/etc/gshadow",
"/etc/*shadow",
"/etc/letsencrypt/*",
"/usr/local/etc/ssl/private/*",
"/usr/local/etc/ssh/ssh_host_*",
"/usr/local/etc/*shadow",
"/usr/local/etc/*gshadow",
"/usr/local/etc/letsencrypt/*",
]
# Allow a small set of binary config artifacts that are commonly required to
# reproduce system configuration (notably APT keyrings). These are still subject
# to size and readability limits, but are exempt from the "binary_like" denial.
DEFAULT_ALLOW_BINARY_GLOBS = [
"/etc/apt/trusted.gpg",
"/etc/apt/trusted.gpg.d/*.gpg",
"/etc/apt/keyrings/*.gpg",
"/etc/apt/keyrings/*.pgp",
"/etc/apt/keyrings/*.asc",
"/usr/share/keyrings/*.gpg",
"/usr/share/keyrings/*.pgp",
"/usr/share/keyrings/*.asc",
"/etc/pki/rpm-gpg/*",
]
SENSITIVE_CONTENT_PATTERNS = [
@ -39,6 +60,7 @@ BLOCK_END = b"*/"
@dataclass
class IgnorePolicy:
deny_globs: Optional[list[str]] = None
allow_binary_globs: Optional[list[str]] = None
max_file_bytes: int = 256_000
sample_bytes: int = 64_000
# If True, be much less conservative about collecting potentially
@ -49,6 +71,8 @@ class IgnorePolicy:
def __post_init__(self) -> None:
if self.deny_globs is None:
self.deny_globs = list(DEFAULT_DENY_GLOBS)
if self.allow_binary_globs is None:
self.allow_binary_globs = list(DEFAULT_ALLOW_BINARY_GLOBS)
def iter_effective_lines(self, content: bytes):
in_block = False
@ -76,6 +100,12 @@ class IgnorePolicy:
# Always ignore plain *.log files (rarely useful as config, often noisy).
if path.endswith(".log"):
return "log_file"
# Ignore editor/backup files that end with a trailing tilde.
if path.endswith("~"):
return "backup_file"
# Ignore backup copies under /etc whose names end with "-" (e.g. /etc/passwd-, /etc/shadow-).
if path.startswith("/etc/") and path.endswith("-"):
return "backup_file"
if not self.dangerous:
for g in self.deny_globs or []:
@ -100,6 +130,10 @@ class IgnorePolicy:
return "unreadable"
if b"\x00" in data:
for g in self.allow_binary_globs or []:
if fnmatch.fnmatch(path, g):
# Binary is acceptable for explicitly-allowed paths.
return None
return "binary_like"
if not self.dangerous:
@ -109,3 +143,75 @@ class IgnorePolicy:
return "sensitive_content"
return None
def deny_reason_dir(self, path: str) -> Optional[str]:
"""Directory-specific deny logic.
deny_reason() is file-oriented (it rejects directories as "not_regular_file").
For directory metadata capture (so roles can recreate directory trees), we need
a lighter-weight check:
- apply deny_globs (unless dangerous)
- require the path to be a real directory (no symlink)
- ensure it's stat'able/readable
No size checks or content scanning are performed for directories.
"""
if not self.dangerous:
for g in self.deny_globs or []:
if fnmatch.fnmatch(path, g):
return "denied_path"
try:
os.stat(path, follow_symlinks=True)
except OSError:
return "unreadable"
if os.path.islink(path):
return "symlink"
if not os.path.isdir(path):
return "not_directory"
return None
def deny_reason_link(self, path: str) -> Optional[str]:
"""Symlink-specific deny logic.
Symlinks are meaningful configuration state (e.g. Debian-style
*-enabled directories). deny_reason() is file-oriented and rejects
symlinks as "not_regular_file".
For symlinks we:
- apply the usual deny_globs (unless dangerous)
- ensure the path is a symlink and we can readlink() it
No size checks or content scanning are performed for symlinks.
"""
# Keep the same fast-path filename ignores as deny_reason().
if path.endswith(".log"):
return "log_file"
if path.endswith("~"):
return "backup_file"
if path.startswith("/etc/") and path.endswith("-"):
return "backup_file"
if not self.dangerous:
for g in self.deny_globs or []:
if fnmatch.fnmatch(path, g):
return "denied_path"
try:
os.lstat(path)
except OSError:
return "unreadable"
if not os.path.islink(path):
return "not_symlink"
try:
os.readlink(path)
except OSError:
return "unreadable"
return None
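The binary allow-list in this file is checked with `fnmatch.fnmatch`, as shown in the `binary_like` branch above. The sketch below isolates that check with a shortened glob list; `binary_allowed` is a hypothetical helper name. Note that `fnmatch` does not treat `/` specially, so `*` can span directory separators.

```python
import fnmatch

# Shortened copy of the allow-list, for illustration only.
ALLOW_BINARY_GLOBS = ["/etc/apt/keyrings/*.gpg", "/etc/pki/rpm-gpg/*"]


def binary_allowed(path: str, globs=ALLOW_BINARY_GLOBS) -> bool:
    """Return True if a binary file at `path` is exempt from 'binary_like'."""
    return any(fnmatch.fnmatch(path, g) for g in globs)


keyring_ok = binary_allowed("/etc/apt/keyrings/docker.gpg")
other_denied = binary_allowed("/etc/apt/keyrings/docker.txt")
# '*' also matches '/', so nested paths under rpm-gpg are allowed too.
nested_ok = binary_allowed("/etc/pki/rpm-gpg/sub/key")
```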


@ -8,7 +8,51 @@ from pathlib import Path
from typing import Optional
SUPPORTED_EXTS = {".ini", ".json", ".toml", ".yaml", ".yml", ".xml"}
SYSTEMD_SUFFIXES = {
".service",
".socket",
".target",
".timer",
".path",
".mount",
".automount",
".slice",
".swap",
".scope",
".link",
".netdev",
".network",
}
SUPPORTED_SUFFIXES = {
".ini",
".cfg",
".json",
".toml",
".yaml",
".yml",
".xml",
".repo",
} | SYSTEMD_SUFFIXES
def infer_other_formats(dest_path: str) -> Optional[str]:
p = Path(dest_path)
name = p.name.lower()
suffix = p.suffix.lower()
# postfix
if name == "main.cf":
return "postfix"
# systemd units
if suffix in SYSTEMD_SUFFIXES:
return "systemd"
# OpenSSH system config files and snippets
parts = {part.lower() for part in p.parts}
if name in {"sshd_config", "ssh_config"}:
return "ssh"
if suffix == ".conf" and {"sshd_config.d", "ssh_config.d"} & parts:
return "ssh"
return None
@dataclass(frozen=True)
@ -22,9 +66,15 @@ def find_jinjaturtle_cmd() -> Optional[str]:
return shutil.which("jinjaturtle")
-def can_jinjify_path(path: str) -> bool:
-p = Path(path)
-return p.suffix.lower() in SUPPORTED_EXTS
+def can_jinjify_path(dest_path: str) -> bool:
+p = Path(dest_path)
+suffix = p.suffix.lower()
+if infer_other_formats(dest_path):
+return True
+# allow unambiguous structured formats
+if suffix in SUPPORTED_SUFFIXES:
+return True
+return False
def run_jinjaturtle(
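The OpenSSH detection in `infer_other_formats` relies on the file name and on whether any path component is `sshd_config.d` or `ssh_config.d`. A minimal sketch of just that rule follows; `looks_like_ssh_config` is a hypothetical name, and `PurePosixPath` is used here so the example behaves the same on any platform.

```python
from pathlib import PurePosixPath


def looks_like_ssh_config(dest_path: str) -> bool:
    """Return True for OpenSSH main configs and *.conf drop-in snippets."""
    p = PurePosixPath(dest_path)
    name = p.name.lower()
    if name in {"sshd_config", "ssh_config"}:
        return True
    # Drop-ins: any *.conf that lives under an sshd_config.d/ssh_config.d dir.
    parts = {part.lower() for part in p.parts}
    return p.suffix.lower() == ".conf" and bool(
        {"sshd_config.d", "ssh_config.d"} & parts
    )


main_config = looks_like_ssh_config("/etc/ssh/sshd_config")
drop_in = looks_like_ssh_config("/etc/ssh/sshd_config.d/50-hardening.conf")
unrelated = looks_like_ssh_config("/etc/nginx/nginx.conf")
```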

File diff suppressed because it is too large.

enroll/pathfilter.py (new file, 293 lines)

@ -0,0 +1,293 @@
from __future__ import annotations
import glob
import os
import re
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Optional, Sequence, Set, Tuple
_REGEX_PREFIXES = ("re:", "regex:")
def _has_glob_chars(s: str) -> bool:
return any(ch in s for ch in "*?[")
def _norm_abs(p: str) -> str:
"""Normalise a path-ish string to an absolute POSIX path.
We treat inputs that don't start with '/' as being relative to '/'.
"""
p = p.strip()
if not p:
return "/"
if not p.startswith("/"):
p = "/" + p
# `normpath` keeps a leading '/' for absolute paths.
return os.path.normpath(p)
def _posix_match(path: str, pattern: str) -> bool:
"""Path matching with glob semantics.
Uses PurePosixPath.match, which:
- treats '/' as a segment separator
- does not expand '**' recursively ('**' behaves like a single-segment '*')
Both `path` and `pattern` are treated as absolute paths.
"""
# The leading '/' is stripped from both sides, so PurePosixPath.match sees a
# relative pattern and matches it from the right-hand end of the path.
p = path.lstrip("/")
pat = pattern.lstrip("/")
try:
return PurePosixPath(p).match(pat)
except Exception:
# If the pattern is somehow invalid, fail closed.
return False
def _regex_literal_prefix(regex: str) -> str:
"""Best-effort literal prefix extraction for a regex.
This lets us pick a starting directory to walk when expanding regex-based
include patterns.
"""
s = regex
if s.startswith("^"):
s = s[1:]
out: List[str] = []
escaped = False
meta = set(".^$*+?{}[]\\|()")
for ch in s:
if escaped:
out.append(ch)
escaped = False
continue
if ch == "\\":
escaped = True
continue
if ch in meta:
break
out.append(ch)
return "".join(out)
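To see how a literal prefix turns into a walk root for regex includes, the sketch below follows the same steps as `_regex_literal_prefix` and the root selection used later in `expand_includes`. The function names are illustrative, and `posixpath` is used so the result does not depend on the host OS.

```python
import posixpath


def regex_literal_prefix(regex: str) -> str:
    """Collect leading literal characters until the first metacharacter."""
    s = regex[1:] if regex.startswith("^") else regex
    out = []
    escaped = False
    for ch in s:
        if escaped:
            out.append(ch)  # the character after a backslash is kept as-is
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch in ".^$*+?{}[]|()":
            break
        else:
            out.append(ch)
    return "".join(out)


def walk_root(regex: str) -> str:
    """Pick the directory to walk: dirname of an absolute prefix, else '/'."""
    prefix = regex_literal_prefix(regex)
    if prefix.startswith("/"):
        return posixpath.dirname(prefix) or "/"
    return "/"


home_root = walk_root(r"^/home/[^/]+/.config/myapp/.*$")
fallback_root = walk_root(r"myapp/.*\.conf$")
```

One limitation of this best-effort approach: a class shorthand such as `\d` is read as a literal `d`, so `^/srv/app\d/conf` yields the prefix `/srv/appd/conf`, which may point at a directory that does not exist.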
@dataclass(frozen=True)
class CompiledPathPattern:
raw: str
kind: str # 'prefix' | 'glob' | 'regex'
value: str
regex: Optional[re.Pattern[str]] = None
def matches(self, path: str) -> bool:
p = _norm_abs(path)
if self.kind == "regex":
if not self.regex:
return False
# Search (not match) so users can write unanchored patterns.
return self.regex.search(p) is not None
if self.kind == "glob":
return _posix_match(p, self.value)
# prefix
pref = self.value.rstrip("/")
return p == pref or p.startswith(pref + "/")
def compile_path_pattern(raw: str) -> CompiledPathPattern:
s = raw.strip()
for pre in _REGEX_PREFIXES:
if s.startswith(pre):
rex = s[len(pre) :].strip()
try:
return CompiledPathPattern(
raw=raw, kind="regex", value=rex, regex=re.compile(rex)
)
except re.error:
# Treat invalid regexes as non-matching.
return CompiledPathPattern(raw=raw, kind="regex", value=rex, regex=None)
# If the user explicitly says glob:, honour it.
if s.startswith("glob:"):
pat = s[len("glob:") :].strip()
return CompiledPathPattern(raw=raw, kind="glob", value=_norm_abs(pat))
# Heuristic: if it contains glob metacharacters, treat as a glob.
if _has_glob_chars(s) or "**" in s:
return CompiledPathPattern(raw=raw, kind="glob", value=_norm_abs(s))
# Otherwise treat as an exact path-or-prefix (dir subtree).
return CompiledPathPattern(raw=raw, kind="prefix", value=_norm_abs(s))
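The classification order in `compile_path_pattern` is: explicit regex prefix, explicit `glob:` prefix, glob metacharacters, then plain prefix. A compact sketch that returns only the kind (`classify_pattern` is a hypothetical name):

```python
def classify_pattern(raw: str) -> str:
    """Return 'regex', 'glob' or 'prefix' for a user-supplied pattern."""
    s = raw.strip()
    if s.startswith(("re:", "regex:")):
        return "regex"
    if s.startswith("glob:"):
        return "glob"
    # Heuristic: any glob metacharacter makes it a glob.
    if any(ch in s for ch in "*?["):
        return "glob"
    return "prefix"


kinds = {
    pattern: classify_pattern(pattern)
    for pattern in [
        "/etc/nginx",
        "/home/*/.bashrc",
        "re:^/home/[^/]+/x$",
        "glob:/etc/hosts",
    ]
}
```

The explicit `re:` check comes first because regexes routinely contain `*`, `?` and `[`, which would otherwise be mistaken for glob syntax.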
@dataclass
class PathFilter:
"""User-provided path filters.
Semantics:
- exclude patterns always win
- include patterns are used only to expand *additional* files to harvest
(they do not restrict the default harvest set)
Patterns:
- By default: glob-like (supports '**')
- Regex: prefix with 're:' or 'regex:'
- Force glob: prefix with 'glob:'
- A plain path without wildcards matches that path and everything under it
(directory-prefix behaviour).
Examples:
--exclude-path /usr/local/bin/docker-*
--include-path /home/*/.bashrc
--include-path 're:^/home/[^/]+/.config/myapp/.*$'
"""
include: Sequence[str] = ()
exclude: Sequence[str] = ()
def __post_init__(self) -> None:
self._include = [
compile_path_pattern(p) for p in self.include if str(p).strip()
]
self._exclude = [
compile_path_pattern(p) for p in self.exclude if str(p).strip()
]
def is_excluded(self, path: str) -> bool:
for pat in self._exclude:
if pat.matches(path):
return True
return False
def iter_include_patterns(self) -> List[CompiledPathPattern]:
return list(self._include)
def expand_includes(
patterns: Sequence[CompiledPathPattern],
*,
exclude: Optional[PathFilter] = None,
max_files: int,
) -> Tuple[List[str], List[str]]:
"""Expand include patterns into concrete file paths.
Returns (paths, notes). The returned paths are absolute paths.
This function is intentionally conservative:
- symlinks are ignored (both dirs and files)
- the number of collected files is capped
Regex patterns are expanded by walking a best-effort inferred root.
"""
out: List[str] = []
notes: List[str] = []
seen: Set[str] = set()
def _maybe_add_file(p: str) -> None:
if len(out) >= max_files:
return
p = _norm_abs(p)
if exclude and exclude.is_excluded(p):
return
if p in seen:
return
if not os.path.isfile(p) or os.path.islink(p):
return
seen.add(p)
out.append(p)
def _walk_dir(root: str, match: Optional[CompiledPathPattern] = None) -> None:
root = _norm_abs(root)
if not os.path.isdir(root) or os.path.islink(root):
return
for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
# Prune excluded directories early.
if exclude:
dirnames[:] = [
d
for d in dirnames
if not exclude.is_excluded(os.path.join(dirpath, d))
and not os.path.islink(os.path.join(dirpath, d))
]
for fn in filenames:
if len(out) >= max_files:
return
p = os.path.join(dirpath, fn)
if os.path.islink(p) or not os.path.isfile(p):
continue
if exclude and exclude.is_excluded(p):
continue
if match is not None and not match.matches(p):
continue
if p in seen:
continue
seen.add(p)
out.append(_norm_abs(p))
for pat in patterns:
if len(out) >= max_files:
notes.append(
f"Include cap reached ({max_files}); some includes were not expanded."
)
break
matched_any = False
if pat.kind == "prefix":
p = pat.value
if os.path.isfile(p) and not os.path.islink(p):
_maybe_add_file(p)
matched_any = True
elif os.path.isdir(p) and not os.path.islink(p):
before = len(out)
_walk_dir(p)
matched_any = len(out) > before
else:
# A prefix pattern may point at a path that does not exist on this host
# (e.g. it only exists on a different machine); match nothing rather than erroring.
matched_any = False
elif pat.kind == "glob":
# Use glob for expansion; also walk directories that match.
gpat = pat.value
hits = glob.glob(gpat, recursive=True)
for h in hits:
if len(out) >= max_files:
break
h = _norm_abs(h)
if exclude and exclude.is_excluded(h):
continue
if os.path.isdir(h) and not os.path.islink(h):
before = len(out)
_walk_dir(h)
if len(out) > before:
matched_any = True
elif os.path.isfile(h) and not os.path.islink(h):
_maybe_add_file(h)
matched_any = True
else: # regex
rex = pat.value
prefix = _regex_literal_prefix(rex)
# Determine a walk root. If we can infer an absolute prefix, use its
# directory; otherwise fall back to '/'.
if prefix.startswith("/"):
root = os.path.dirname(prefix) or "/"
else:
root = "/"
before = len(out)
_walk_dir(root, match=pat)
matched_any = len(out) > before
if not matched_any:
notes.append(f"Include pattern matched no files: {pat.raw!r}")
return out, notes
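The conservative behaviour described in the `expand_includes` docstring (skip symlinks, cap the number of collected files) can be illustrated in isolation. This is a minimal sketch, not the project's implementation: `walk_regular_files` is a hypothetical name, and exclusion filters and pattern matching are deliberately left out.

```python
import os
from typing import List


def walk_regular_files(root: str, max_files: int) -> List[str]:
    """Collect up to max_files regular, non-symlink files under root.

    Hypothetical helper for illustration only.
    """
    out: List[str] = []
    root = os.path.abspath(root)
    if not os.path.isdir(root) or os.path.islink(root):
        return out
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Drop symlinked directories from the traversal list. os.walk with
        # followlinks=False would not descend into them anyway; this just
        # makes the intent explicit. Sorting keeps the order deterministic.
        dirnames[:] = sorted(
            d for d in dirnames if not os.path.islink(os.path.join(dirpath, d))
        )
        for fn in sorted(filenames):
            if len(out) >= max_files:
                return out  # cap reached: stop collecting
            p = os.path.join(dirpath, fn)
            if os.path.islink(p) or not os.path.isfile(p):
                continue
            out.append(p)
    return out
```

Calling `walk_regular_files("/etc/ssh", max_files=50)` would return at most 50 absolute paths and never a symlink, which is the property the cap and the `islink` checks above are meant to guarantee.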

enroll/platform.py Normal file

@ -0,0 +1,282 @@
from __future__ import annotations
import shutil
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple
from .fsutil import file_md5
def _read_os_release(path: str = "/etc/os-release") -> Dict[str, str]:
out: Dict[str, str] = {}
try:
with open(path, "r", encoding="utf-8", errors="replace") as f:
for raw in f:
line = raw.strip()
if not line or line.startswith("#") or "=" not in line:
continue
k, v = line.split("=", 1)
k = k.strip()
v = v.strip().strip('"')
out[k] = v
except OSError:
return {}
return out
@dataclass
class PlatformInfo:
os_family: str # debian|redhat|unknown
pkg_backend: str # dpkg|rpm|unknown
os_release: Dict[str, str]
def detect_platform() -> PlatformInfo:
"""Detect platform family and package backend.
Uses /etc/os-release when available, with a conservative fallback to
checking for dpkg/rpm binaries.
"""
osr = _read_os_release()
os_id = (osr.get("ID") or "").strip().lower()
likes = (osr.get("ID_LIKE") or "").strip().lower().split()
deb_ids = {"debian", "ubuntu", "linuxmint", "raspbian", "kali"}
rhel_ids = {
"fedora",
"rhel",
"centos",
"rocky",
"almalinux",
"ol",
"oracle",
"scientific",
}
if os_id in deb_ids or "debian" in likes:
return PlatformInfo(os_family="debian", pkg_backend="dpkg", os_release=osr)
if os_id in rhel_ids or any(
x in likes for x in ("rhel", "fedora", "centos", "redhat")
):
return PlatformInfo(os_family="redhat", pkg_backend="rpm", os_release=osr)
# Fallback heuristics.
if shutil.which("dpkg"):
return PlatformInfo(os_family="debian", pkg_backend="dpkg", os_release=osr)
if shutil.which("rpm"):
return PlatformInfo(os_family="redhat", pkg_backend="rpm", os_release=osr)
return PlatformInfo(os_family="unknown", pkg_backend="unknown", os_release=osr)
class PackageBackend:
"""Backend abstraction for package ownership, config detection, and manual package lists."""
name: str
pkg_config_prefixes: Tuple[str, ...]
def owner_of_path(self, path: str) -> Optional[str]: # pragma: no cover
raise NotImplementedError
def list_manual_packages(self) -> List[str]: # pragma: no cover
raise NotImplementedError
def installed_packages(self) -> Dict[str, List[Dict[str, str]]]: # pragma: no cover
"""Return mapping of package name -> installed instances.
Each instance is a dict with at least:
- version: package version string
- arch: architecture string
Backends should be best-effort and return an empty mapping on failure.
"""
raise NotImplementedError
def build_etc_index(
self,
) -> Tuple[
Set[str], Dict[str, str], Dict[str, Set[str]], Dict[str, List[str]]
]: # pragma: no cover
raise NotImplementedError
def specific_paths_for_hints(self, hints: Set[str]) -> List[str]:
return []
def is_pkg_config_path(self, path: str) -> bool:
for pfx in self.pkg_config_prefixes:
if path == pfx or path.startswith(pfx):
return True
return False
def modified_paths(self, pkg: str, etc_paths: List[str]) -> Dict[str, str]:
"""Return a mapping of modified file paths -> reason label."""
return {}
class DpkgBackend(PackageBackend):
name = "dpkg"
pkg_config_prefixes = ("/etc/apt/",)
def __init__(self) -> None:
from .debian import parse_status_conffiles
self._conffiles_by_pkg = parse_status_conffiles()
def owner_of_path(self, path: str) -> Optional[str]:
from .debian import dpkg_owner
return dpkg_owner(path)
def list_manual_packages(self) -> List[str]:
from .debian import list_manual_packages
return list_manual_packages()
def installed_packages(self) -> Dict[str, List[Dict[str, str]]]:
from .debian import list_installed_packages
return list_installed_packages()
def build_etc_index(self):
from .debian import build_dpkg_etc_index
return build_dpkg_etc_index()
def specific_paths_for_hints(self, hints: Set[str]) -> List[str]:
paths: List[str] = []
for h in hints:
paths.extend(
[
f"/etc/default/{h}",
f"/etc/init.d/{h}",
f"/etc/sysctl.d/{h}.conf",
]
)
return paths
def modified_paths(self, pkg: str, etc_paths: List[str]) -> Dict[str, str]:
from .debian import read_pkg_md5sums
out: Dict[str, str] = {}
conff = self._conffiles_by_pkg.get(pkg, {})
md5sums = read_pkg_md5sums(pkg)
for path in etc_paths:
if not path.startswith("/etc/"):
continue
if self.is_pkg_config_path(path):
continue
if path in conff:
try:
current = file_md5(path)
except OSError:
continue
if current != conff[path]:
out[path] = "modified_conffile"
continue
rel = path.lstrip("/")
baseline = md5sums.get(rel)
if baseline:
try:
current = file_md5(path)
except OSError:
continue
if current != baseline:
out[path] = "modified_packaged_file"
return out
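The dpkg check above boils down to comparing a file's current MD5 with a recorded baseline and skipping files that cannot be read. The sketch below shows that comparison on its own. It makes several assumptions: `file_md5` here is a stand-in written for this example (the real `enroll.fsutil.file_md5` is not shown in this diff), `modified_against_baseline` is a hypothetical name, and the baseline is keyed by absolute path for simplicity, whereas the diff looks up md5sums by the path with its leading `/` stripped.

```python
import hashlib
from typing import Dict, Iterable


def file_md5(path: str) -> str:
    """Stand-in for enroll.fsutil.file_md5 (assumed behaviour: hex MD5 of the file)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def modified_against_baseline(
    paths: Iterable[str], baseline: Dict[str, str]
) -> Dict[str, str]:
    """Return {path: reason} for files whose MD5 differs from the baseline."""
    out: Dict[str, str] = {}
    for p in paths:
        expected = baseline.get(p)
        if not expected:
            continue  # no recorded checksum, so nothing to compare against
        try:
            current = file_md5(p)
        except OSError:
            continue  # unreadable or missing: skip rather than fail
        if current != expected:
            out[p] = "modified_packaged_file"
    return out
```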
class RpmBackend(PackageBackend):
name = "rpm"
pkg_config_prefixes = (
"/etc/dnf/",
"/etc/yum/",
"/etc/yum.repos.d/",
"/etc/yum.conf",
)
def __init__(self) -> None:
self._modified_cache: Dict[str, Set[str]] = {}
self._config_cache: Dict[str, Set[str]] = {}
def owner_of_path(self, path: str) -> Optional[str]:
from .rpm import rpm_owner
return rpm_owner(path)
def list_manual_packages(self) -> List[str]:
from .rpm import list_manual_packages
return list_manual_packages()
def installed_packages(self) -> Dict[str, List[Dict[str, str]]]:
from .rpm import list_installed_packages
return list_installed_packages()
def build_etc_index(self):
from .rpm import build_rpm_etc_index
return build_rpm_etc_index()
def specific_paths_for_hints(self, hints: Set[str]) -> List[str]:
paths: List[str] = []
for h in hints:
paths.extend(
[
f"/etc/sysconfig/{h}",
f"/etc/sysconfig/{h}.conf",
f"/etc/sysctl.d/{h}.conf",
]
)
return paths
def _config_files(self, pkg: str) -> Set[str]:
if pkg in self._config_cache:
return self._config_cache[pkg]
from .rpm import rpm_config_files
s = rpm_config_files(pkg)
self._config_cache[pkg] = s
return s
def _modified_files(self, pkg: str) -> Set[str]:
if pkg in self._modified_cache:
return self._modified_cache[pkg]
from .rpm import rpm_modified_files
s = rpm_modified_files(pkg)
self._modified_cache[pkg] = s
return s
def modified_paths(self, pkg: str, etc_paths: List[str]) -> Dict[str, str]:
out: Dict[str, str] = {}
modified = self._modified_files(pkg)
if not modified:
return out
config = self._config_files(pkg)
for path in etc_paths:
if not path.startswith("/etc/"):
continue
if self.is_pkg_config_path(path):
continue
if path not in modified:
continue
out[path] = (
"modified_conffile" if path in config else "modified_packaged_file"
)
return out
def get_backend(info: Optional[PlatformInfo] = None) -> PackageBackend:
info = info or detect_platform()
if info.pkg_backend == "dpkg":
return DpkgBackend()
if info.pkg_backend == "rpm":
return RpmBackend()
# Unknown: be conservative and use an rpm backend if rpm exists, otherwise dpkg.
if shutil.which("rpm"):
return RpmBackend()
return DpkgBackend()
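To see how the `ID` / `ID_LIKE` logic in `detect_platform` behaves, the parsing and classification can be tried on literal os-release text. This is a simplified sketch: `parse_os_release` and `classify_family` are hypothetical names, the ID sets are shortened, and the binary fallback (`shutil.which`) is omitted.

```python
from typing import Dict


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks and comments (double quotes stripped)."""
    out: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        k, v = line.split("=", 1)
        out[k.strip()] = v.strip().strip('"')
    return out


def classify_family(osr: Dict[str, str]) -> str:
    """Return 'debian', 'redhat' or 'unknown' from ID and ID_LIKE."""
    os_id = (osr.get("ID") or "").strip().lower()
    likes = (osr.get("ID_LIKE") or "").strip().lower().split()
    if os_id in {"debian", "ubuntu"} or "debian" in likes:
        return "debian"
    if os_id in {"fedora", "rhel", "centos"} or any(
        x in likes for x in ("rhel", "fedora", "centos", "redhat")
    ):
        return "redhat"
    return "unknown"


# A derivative distro whose ID is not in the shortened sets is still
# classified through ID_LIKE.
ROCKY = 'NAME="Rocky Linux"\nID="rocky"\nID_LIKE="rhel centos fedora"\n'
print(classify_family(parse_os_release(ROCKY)))
```

The `ID_LIKE` check is what lets derivatives that are missing from the hard-coded ID sets still resolve to the right backend.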


@ -1,13 +1,183 @@
from __future__ import annotations
import getpass
import os
import shlex
import shutil
import sys
import time
import tarfile
import tempfile
import zipapp
from pathlib import Path
from pathlib import PurePosixPath
from typing import Optional
from typing import Optional, Callable, TextIO
class RemoteSudoPasswordRequired(RuntimeError):
"""Raised when sudo requires a password but none was provided."""
class RemoteSSHKeyPassphraseRequired(RuntimeError):
"""Raised when SSH private key decryption needs a passphrase."""
def _sudo_password_required(out: str, err: str) -> bool:
"""Return True if sudo output indicates it needs a password/TTY."""
blob = (out + "\n" + err).lower()
patterns = (
"a password is required",
"password is required",
"a terminal is required to read the password",
"no tty present and no askpass program specified",
"must have a tty to run sudo",
"sudo: sorry, you must have a tty",
"askpass",
)
return any(p in blob for p in patterns)
def _sudo_not_permitted(out: str, err: str) -> bool:
"""Return True if sudo output indicates the user cannot sudo at all."""
blob = (out + "\n" + err).lower()
patterns = (
"is not in the sudoers file",
"not allowed to execute",
"may not run sudo",
"sorry, user",
)
return any(p in blob for p in patterns)
def _sudo_tty_required(out: str, err: str) -> bool:
"""Return True if sudo output indicates it requires a TTY (sudoers requiretty)."""
blob = (out + "\n" + err).lower()
patterns = (
"must have a tty",
"sorry, you must have a tty",
"sudo: sorry, you must have a tty",
"must have a tty to run sudo",
)
return any(p in blob for p in patterns)
def _resolve_become_password(
ask_become_pass: bool,
*,
prompt: str = "sudo password: ",
getpass_fn: Callable[[str], str] = getpass.getpass,
) -> Optional[str]:
if ask_become_pass:
return getpass_fn(prompt)
return None
def _resolve_ssh_key_passphrase(
ask_key_passphrase: bool,
*,
env_var: Optional[str] = None,
prompt: str = "SSH key passphrase: ",
getpass_fn: Callable[[str], str] = getpass.getpass,
) -> Optional[str]:
"""Resolve SSH private-key passphrase from env and/or prompt.
Precedence:
1) --ssh-key-passphrase-env style input (env_var)
2) --ask-key-passphrase style interactive prompt
3) None
"""
if env_var:
val = os.environ.get(str(env_var))
if val is None:
raise RuntimeError(
"SSH key passphrase environment variable is not set: " f"{env_var}"
)
return val
if ask_key_passphrase:
return getpass_fn(prompt)
return None
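The precedence in the docstring (environment variable first, then interactive prompt, then `None`) is easy to exercise when the environment and the prompt are passed in explicitly. The following is a sketch under that assumption; `resolve_passphrase` and its `environ` / `prompt_fn` parameters are illustrative names, not part of the diff.

```python
from typing import Callable, Mapping, Optional


def resolve_passphrase(
    ask: bool,
    env_var: Optional[str],
    environ: Mapping[str, str],
    prompt_fn: Callable[[str], str],
) -> Optional[str]:
    """Resolve a passphrase: env var wins, then prompt, else None."""
    if env_var:
        # A named variable that is unset is treated as an error rather than
        # silently falling through to the prompt.
        if env_var not in environ:
            raise RuntimeError(f"environment variable is not set: {env_var}")
        return environ[env_var]
    if ask:
        return prompt_fn("SSH key passphrase: ")
    return None
```

Injecting `prompt_fn` mirrors the `getpass_fn` parameter in the diff: tests can supply a lambda instead of blocking on a real terminal prompt.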
def remote_harvest(
*,
ask_become_pass: bool = False,
ask_key_passphrase: bool = False,
ssh_key_passphrase_env: Optional[str] = None,
no_sudo: bool = False,
prompt: str = "sudo password: ",
key_prompt: str = "SSH key passphrase: ",
getpass_fn: Optional[Callable[[str], str]] = None,
stdin: Optional[TextIO] = None,
**kwargs,
):
"""Call _remote_harvest, with a safe sudo password fallback.
Behavior:
- Run without a password unless --ask-become-pass is set.
- If the remote sudo policy requires a password and none was provided,
prompt and retry when running interactively.
"""
# Resolve defaults at call time (easier to test/monkeypatch, and avoids capturing
# sys.stdin / getpass.getpass at import time).
if getpass_fn is None:
getpass_fn = getpass.getpass
if stdin is None:
stdin = sys.stdin
sudo_password = _resolve_become_password(
ask_become_pass and not no_sudo,
prompt=prompt,
getpass_fn=getpass_fn,
)
ssh_key_passphrase = _resolve_ssh_key_passphrase(
ask_key_passphrase,
env_var=ssh_key_passphrase_env,
prompt=key_prompt,
getpass_fn=getpass_fn,
)
while True:
try:
return _remote_harvest(
sudo_password=sudo_password,
no_sudo=no_sudo,
ssh_key_passphrase=ssh_key_passphrase,
**kwargs,
)
except RemoteSSHKeyPassphraseRequired:
# Already tried a passphrase and still failed.
if ssh_key_passphrase is not None:
raise RemoteSSHKeyPassphraseRequired(
"SSH private key could not be decrypted with the supplied "
"passphrase."
) from None
# Fallback prompt if interactive.
if stdin is not None and getattr(stdin, "isatty", lambda: False)():
ssh_key_passphrase = getpass_fn(key_prompt)
continue
raise RemoteSSHKeyPassphraseRequired(
"SSH private key is encrypted and needs a passphrase. "
"Re-run with --ask-key-passphrase or "
"--ssh-key-passphrase-env VAR."
)
except RemoteSudoPasswordRequired:
if sudo_password is not None:
raise
# Fallback prompt if interactive.
if stdin is not None and getattr(stdin, "isatty", lambda: False)():
sudo_password = getpass_fn(prompt)
continue
raise RemoteSudoPasswordRequired(
"Remote sudo requires a password. Re-run with --ask-become-pass."
)
def _safe_extract_tar(tar: tarfile.TarFile, dest: Path) -> None:
@ -15,7 +185,6 @@ def _safe_extract_tar(tar: tarfile.TarFile, dest: Path) -> None:
Protects against path traversal (e.g. entries containing ../).
"""
# Note: tar member names use POSIX separators regardless of platform.
dest = dest.resolve()
@ -79,30 +248,169 @@ def _build_enroll_pyz(tmpdir: Path) -> Path:
return pyz_path
def _ssh_run(ssh, cmd: str) -> tuple[int, str, str]:
"""Run a command over a Paramiko SSHClient."""
_stdin, stdout, stderr = ssh.exec_command(cmd)
out = stdout.read().decode("utf-8", errors="replace")
err = stderr.read().decode("utf-8", errors="replace")
rc = stdout.channel.recv_exit_status()
def _ssh_run(
ssh,
cmd: str,
*,
get_pty: bool = False,
stdin_text: Optional[str] = None,
close_stdin: bool = False,
) -> tuple[int, str, str]:
"""Run a command over a Paramiko SSHClient.
Paramiko's exec_command runs commands without a TTY by default.
Some hosts have sudoers "requiretty" enabled, which causes sudo to
fail even when passwordless sudo is configured. For those commands,
request a PTY.
We do not request a PTY for commands that stream binary data
(e.g. tar/gzip output), as a PTY can corrupt the byte stream.
"""
stdin, stdout, stderr = ssh.exec_command(cmd, get_pty=get_pty)
# All three file-like objects share the same underlying Channel.
chan = stdout.channel
if stdin_text is not None and stdin is not None:
try:
stdin.write(stdin_text)
stdin.flush()
except Exception:
# If the remote side closed stdin early, ignore.
pass # nosec
finally:
if close_stdin:
# For sudo -S, a wrong password causes sudo to re-prompt and wait
# forever for more input. We try hard to deliver EOF so sudo can
# fail fast.
try:
chan.shutdown_write() # sends EOF to the remote process
except Exception:
pass # nosec
try:
stdin.close()
except Exception:
pass # nosec
# Read incrementally to avoid blocking forever on stdout.read()/stderr.read()
# if the remote process is waiting for more input (e.g. sudo password retry).
out_chunks: list[bytes] = []
err_chunks: list[bytes] = []
# Keep a small tail of stderr to detect sudo retry messages without
# repeatedly joining potentially large buffers.
err_tail = b""
while True:
progressed = False
if chan.recv_ready():
out_chunks.append(chan.recv(1024 * 64))
progressed = True
if chan.recv_stderr_ready():
chunk = chan.recv_stderr(1024 * 64)
err_chunks.append(chunk)
err_tail = (err_tail + chunk)[-4096:]
progressed = True
# If we just attempted sudo -S with a single password line and sudo is
# asking again, detect it and stop waiting.
if close_stdin and stdin_text is not None:
blob = err_tail.lower()
if b"sorry, try again" in blob or b"incorrect password" in blob:
try:
chan.close()
except Exception:
pass # nosec
break
# Exit once the process has exited and we have drained the buffers.
if (
chan.exit_status_ready()
and not chan.recv_ready()
and not chan.recv_stderr_ready()
):
break
if not progressed:
time.sleep(0.05)
out = b"".join(out_chunks).decode("utf-8", errors="replace")
err = b"".join(err_chunks).decode("utf-8", errors="replace")
rc = chan.recv_exit_status() if chan.exit_status_ready() else 1
return rc, out, err
def remote_harvest(
def _ssh_run_sudo(
ssh,
cmd: str,
*,
sudo_password: Optional[str] = None,
get_pty: bool = True,
) -> tuple[int, str, str]:
"""Run cmd via sudo with a safe non-interactive-first strategy.
Strategy:
1) Try `sudo -n`.
2) If sudo reports a password is required and we have one, retry with
`sudo -S` and feed it via stdin.
3) If sudo reports a password is required and we *don't* have one, raise
RemoteSudoPasswordRequired.
We avoid requesting a PTY unless the remote sudo policy requires it.
This makes sudo -S behavior more reliable (wrong passwords fail fast
instead of blocking on a PTY).
"""
cmd_n = f"sudo -n -p '' -- {cmd}"
# First try: never prompt, and prefer no PTY.
rc, out, err = _ssh_run(ssh, cmd_n, get_pty=False)
need_pty = False
# Some sudoers configurations require a TTY even for passwordless sudo.
if get_pty and rc != 0 and _sudo_tty_required(out, err):
need_pty = True
rc, out, err = _ssh_run(ssh, cmd_n, get_pty=True)
if rc == 0:
return rc, out, err
if _sudo_not_permitted(out, err):
return rc, out, err
if _sudo_password_required(out, err):
if sudo_password is None:
raise RemoteSudoPasswordRequired(
"Remote sudo requires a password, but none was provided."
)
cmd_s = f"sudo -S -p '' -- {cmd}"
return _ssh_run(
ssh,
cmd_s,
get_pty=need_pty,
stdin_text=str(sudo_password) + "\n",
close_stdin=True,
)
return rc, out, err
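The non-interactive-first strategy in `_ssh_run_sudo` can be sketched without SSH by abstracting the transport behind a callable. Everything below is an assumption made for illustration: `run_with_sudo`, `SudoPasswordRequired`, and the `fake_runner` test double are hypothetical, the password detection is reduced to a single message, and the PTY retry is left out.

```python
from typing import Callable, List, Optional, Tuple

Result = Tuple[int, str, str]  # (exit code, stdout, stderr)
Runner = Callable[[str, Optional[str]], Result]  # (command, stdin_text) -> Result


class SudoPasswordRequired(RuntimeError):
    """Raised when sudo needs a password and none is available."""


def run_with_sudo(run: Runner, cmd: str, password: Optional[str]) -> Result:
    # Step 1: never prompt.
    rc, out, err = run(f"sudo -n -p '' -- {cmd}", None)
    if rc == 0:
        return rc, out, err
    if "a password is required" not in (out + "\n" + err).lower():
        return rc, out, err  # some other failure: report it unchanged
    # Step 3: password needed but not supplied.
    if password is None:
        raise SudoPasswordRequired("sudo needs a password")
    # Step 2: retry, feeding the password on stdin.
    return run(f"sudo -S -p '' -- {cmd}", password + "\n")


def fake_runner(calls: List[str]) -> Runner:
    """Test double: sudo -n always asks for a password; -S accepts 'hunter2'."""

    def run(command: str, stdin_text: Optional[str]) -> Result:
        calls.append(command)
        if command.startswith("sudo -n"):
            return 1, "", "sudo: a password is required\n"
        if stdin_text == "hunter2\n":
            return 0, "ok\n", ""
        return 1, "", "Sorry, try again.\n"

    return run
```

Trying `sudo -n` first means hosts with passwordless sudo never see a password on the wire, and the password is only sent when sudo has explicitly asked for one.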
def _remote_harvest(
*,
local_out_dir: Path,
remote_host: str,
remote_port: int = 22,
remote_port: Optional[int] = None,
remote_user: Optional[str] = None,
remote_ssh_config: Optional[str] = None,
remote_python: str = "python3",
dangerous: bool = False,
no_sudo: bool = False,
sudo_password: Optional[str] = None,
ssh_key_passphrase: Optional[str] = None,
include_paths: Optional[list[str]] = None,
exclude_paths: Optional[list[str]] = None,
) -> Path:
"""Run enroll harvest on a remote host via SSH and pull the bundle locally.
Returns the local path to state.json inside local_out_dir.
"""
try:
import paramiko # type: ignore
except Exception as e:
@ -130,13 +438,120 @@ def remote_harvest(
# Users should add the key to known_hosts.
ssh.set_missing_host_key_policy(paramiko.RejectPolicy())
ssh.connect(
hostname=remote_host,
port=int(remote_port),
username=remote_user,
allow_agent=True,
look_for_keys=True,
)
# Resolve SSH connection parameters.
connect_host = remote_host
connect_port = int(remote_port) if remote_port is not None else 22
connect_user = remote_user
key_filename = None
sock = None
hostkey_name = connect_host
# Timeouts derived from ssh_config if set (ConnectTimeout).
# Used both for socket connect (when we create one) and Paramiko handshake/auth.
connect_timeout: Optional[float] = None
if remote_ssh_config:
from paramiko.config import SSHConfig # type: ignore
from paramiko.proxy import ProxyCommand # type: ignore
import socket as _socket
cfg_path = Path(str(remote_ssh_config)).expanduser()
if not cfg_path.exists():
raise RuntimeError(f"SSH config file not found: {cfg_path}")
cfg = SSHConfig()
with cfg_path.open("r", encoding="utf-8") as _fp:
cfg.parse(_fp)
hcfg = cfg.lookup(remote_host)
connect_host = str(hcfg.get("hostname") or remote_host)
hostkey_name = str(hcfg.get("hostkeyalias") or connect_host)
if remote_port is None and hcfg.get("port"):
try:
connect_port = int(str(hcfg.get("port")))
except ValueError:
pass
if connect_user is None and hcfg.get("user"):
connect_user = str(hcfg.get("user"))
ident = hcfg.get("identityfile")
if ident:
if isinstance(ident, (list, tuple)):
key_filename = [str(Path(p).expanduser()) for p in ident]
else:
key_filename = str(Path(str(ident)).expanduser())
# Honour OpenSSH ConnectTimeout (seconds) if present.
if hcfg.get("connecttimeout"):
try:
connect_timeout = float(str(hcfg.get("connecttimeout")))
except (TypeError, ValueError):
connect_timeout = None
proxycmd = hcfg.get("proxycommand")
# AddressFamily support: inet (IPv4 only), inet6 (IPv6 only), any (default).
addrfam = str(hcfg.get("addressfamily") or "any").strip().lower()
family: Optional[int] = None
if addrfam == "inet":
family = _socket.AF_INET
elif addrfam == "inet6":
family = _socket.AF_INET6
if proxycmd:
# ProxyCommand provides the transport; AddressFamily doesn't apply here.
sock = ProxyCommand(str(proxycmd))
elif family is not None:
# Enforce the requested address family by pre-connecting the socket and
# passing it into Paramiko via sock=.
last_err: Optional[OSError] = None
infos = _socket.getaddrinfo(
connect_host, connect_port, family, _socket.SOCK_STREAM
)
for af, socktype, proto, _, sa in infos:
s = _socket.socket(af, socktype, proto)
if connect_timeout is not None:
s.settimeout(connect_timeout)
try:
s.connect(sa)
sock = s
break
except OSError as e:
last_err = e
try:
s.close()
except Exception:
pass # nosec
if sock is None and last_err is not None:
raise last_err
elif hostkey_name != connect_host:
# If HostKeyAlias is used, connect to HostName via a socket but
# use HostKeyAlias for known_hosts lookups.
sock = _socket.create_connection(
(connect_host, connect_port), timeout=connect_timeout
)
# If we created a socket (sock!=None), pass hostkey_name as hostname so
# known_hosts lookup uses HostKeyAlias (or whatever hostkey_name resolved to).
try:
ssh.connect(
hostname=hostkey_name if sock is not None else connect_host,
port=connect_port,
username=connect_user,
key_filename=key_filename,
sock=sock,
allow_agent=True,
look_for_keys=True,
timeout=connect_timeout,
banner_timeout=connect_timeout,
auth_timeout=connect_timeout,
passphrase=ssh_key_passphrase,
)
except paramiko.PasswordRequiredException as e: # type: ignore[attr-defined]
raise RemoteSSHKeyPassphraseRequired(
"SSH private key is encrypted and no passphrase was provided."
) from e
# If no username was explicitly provided, SSH may have selected a default.
# We need a concrete username for the (sudo) chown step below.
@ -165,42 +580,65 @@ def remote_harvest(
sftp.put(str(pyz), rapp)
# Run remote harvest.
_cmd = f"{remote_python} {rapp} harvest --out {rbundle}"
argv: list[str] = [
remote_python,
rapp,
"harvest",
"--out",
rbundle,
]
if dangerous:
argv.append("--dangerous")
for p in include_paths or []:
argv.extend(["--include-path", str(p)])
for p in exclude_paths or []:
argv.extend(["--exclude-path", str(p)])
_cmd = " ".join(map(shlex.quote, argv))
if not no_sudo:
# Prefer non-interactive sudo first; retry with -S only when needed.
rc, out, err = _ssh_run_sudo(
ssh, _cmd, sudo_password=sudo_password, get_pty=True
)
cmd = f"sudo {_cmd}"
else:
cmd = _cmd
if dangerous:
cmd += " --dangerous"
rc, out, err = _ssh_run(ssh, cmd)
rc, out, err = _ssh_run(ssh, cmd, get_pty=False)
if rc != 0:
raise RuntimeError(
"Remote harvest failed.\n"
f"Command: {cmd}\n"
f"Exit code: {rc}\n"
f"Stdout: {out.strip()}\n"
f"Stderr: {err.strip()}"
)
if not no_sudo:
# Ensure user can read the files, before we tar it
# Ensure user can read the files, before we tar it.
if not resolved_user:
raise RuntimeError(
"Unable to determine remote username for chown. "
"Pass --remote-user explicitly or use --no-sudo."
)
cmd = f"sudo chown -R {resolved_user} {rbundle}"
rc, out, err = _ssh_run(ssh, cmd)
chown_cmd = f"chown -R {resolved_user} {rbundle}"
rc, out, err = _ssh_run_sudo(
ssh,
chown_cmd,
sudo_password=sudo_password,
get_pty=True,
)
if rc != 0:
raise RuntimeError(
"chown of harvest failed.\n"
f"Command: {cmd}\n"
f"Command: sudo {chown_cmd}\n"
f"Exit code: {rc}\n"
f"Stdout: {out.strip()}\n"
f"Stderr: {err.strip()}"
)
# Stream a tarball back to the local machine (avoid creating a tar file on the remote).
cmd = f"tar -cz -C {rbundle} ."
_stdin, stdout, stderr = ssh.exec_command(cmd) # nosec
_stdin, stdout, stderr = ssh.exec_command(cmd) # nosec
with open(local_tgz, "wb") as f:
while True:
chunk = stdout.read(1024 * 128)
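The hunk above replaces the hand-built f-string command with an argv list joined through `shlex.quote`. A small sketch of why that matters for user-supplied include paths follows; `build_harvest_command` is a hypothetical helper and its parameters only loosely mirror the ones in the diff.

```python
import shlex
from typing import Sequence


def build_harvest_command(
    python: str,
    app: str,
    out_dir: str,
    include_paths: Sequence[str] = (),
    dangerous: bool = False,
) -> str:
    """Build a single shell-safe command string from an argv list."""
    argv = [python, app, "harvest", "--out", out_dir]
    if dangerous:
        argv.append("--dangerous")
    for p in include_paths:
        argv.extend(["--include-path", str(p)])
    # Quote each argument individually so spaces, globs and ';' in a path
    # reach the remote program as literal text, not as shell syntax.
    return " ".join(map(shlex.quote, argv))


cmd = build_harvest_command(
    "python3", "/tmp/enroll.pyz", "/tmp/bundle", ["/home/*/my file; rm -rf /"]
)
print(cmd)
```

Round-tripping the result through `shlex.split` gives back the original argv, which is a convenient property to assert in tests.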

enroll/rpm.py Normal file

@ -0,0 +1,323 @@
from __future__ import annotations
import os
import re
import shutil
import subprocess # nosec
from typing import Dict, List, Optional, Set, Tuple
def _run(
cmd: list[str], *, allow_fail: bool = False, merge_err: bool = False
) -> tuple[int, str]:
"""Run a command and return (rc, stdout).
If merge_err is True, stderr is merged into stdout to preserve ordering.
"""
p = subprocess.run(
cmd,
check=False,
text=True,
stdout=subprocess.PIPE,
stderr=(subprocess.STDOUT if merge_err else subprocess.PIPE),
) # nosec
out = p.stdout or ""
if (not allow_fail) and p.returncode != 0:
err = "" if merge_err else (p.stderr or "")
raise RuntimeError(f"Command failed: {cmd}\n{err}{out}")
return p.returncode, out
def rpm_owner(path: str) -> Optional[str]:
"""Return owning package name for a path, or None if unowned."""
if not path:
return None
rc, out = _run(
["rpm", "-qf", "--qf", "%{NAME}\n", path], allow_fail=True, merge_err=True
)
if rc != 0:
return None
for line in out.splitlines():
line = line.strip()
if not line:
continue
if "is not owned" in line:
return None
# With --qf we expect just the package name.
if re.match(r"^[A-Za-z0-9_.+:-]+$", line):
# Strip any accidental epoch/name-version-release output.
return line.split(":", 1)[-1].strip() if line else None
return None
_ARCH_SUFFIXES = {
"noarch",
"x86_64",
"i686",
"aarch64",
"armv7hl",
"ppc64le",
"s390x",
"riscv64",
}
def _strip_arch(token: str) -> str:
"""Strip a trailing .ARCH from a yum/dnf package token."""
t = token.strip()
if "." not in t:
return t
head, tail = t.rsplit(".", 1)
if tail in _ARCH_SUFFIXES:
return head
return t
def list_manual_packages() -> List[str]:
"""Return packages considered "user-installed" on RPM-based systems.
Best-effort:
1) dnf repoquery --userinstalled
2) dnf history userinstalled
3) yum history userinstalled
If none are available, returns an empty list.
"""
def _dedupe(pkgs: List[str]) -> List[str]:
return sorted({p for p in (pkgs or []) if p})
if shutil.which("dnf"):
# Prefer a machine-friendly output.
for cmd in (
["dnf", "-q", "repoquery", "--userinstalled", "--qf", "%{name}\n"],
["dnf", "-q", "repoquery", "--userinstalled"],
):
rc, out = _run(cmd, allow_fail=True, merge_err=True)
if rc == 0 and out.strip():
pkgs = []
for line in out.splitlines():
line = line.strip()
if not line or line.startswith("Loaded plugins"):
continue
pkgs.append(_strip_arch(line.split()[0]))
if pkgs:
return _dedupe(pkgs)
# Fallback
rc, out = _run(
["dnf", "-q", "history", "userinstalled"], allow_fail=True, merge_err=True
)
if rc == 0 and out.strip():
pkgs = []
for line in out.splitlines():
line = line.strip()
if not line or line.startswith("Installed") or line.startswith("Last"):
continue
# Often: "vim-enhanced.x86_64"
tok = line.split()[0]
pkgs.append(_strip_arch(tok))
if pkgs:
return _dedupe(pkgs)
if shutil.which("yum"):
rc, out = _run(
["yum", "-q", "history", "userinstalled"], allow_fail=True, merge_err=True
)
if rc == 0 and out.strip():
pkgs = []
for line in out.splitlines():
line = line.strip()
if (
not line
or line.startswith("Installed")
or line.startswith("Loaded")
):
continue
tok = line.split()[0]
pkgs.append(_strip_arch(tok))
if pkgs:
return _dedupe(pkgs)
return []
def list_installed_packages() -> Dict[str, List[Dict[str, str]]]:
"""Return mapping of installed package name -> installed instances.
Uses `rpm -qa` and is expected to work on RHEL/Fedora-like systems.
Output format:
{"pkg": [{"version": "...", "arch": "..."}, ...], ...}
The version string is formatted as:
- "<version>-<release>" for typical packages
- "<epoch>:<version>-<release>" if a non-zero epoch is present
"""
try:
_, out = _run(
[
"rpm",
"-qa",
"--qf",
"%{NAME}\t%{EPOCHNUM}\t%{VERSION}\t%{RELEASE}\t%{ARCH}\n",
],
allow_fail=False,
merge_err=True,
)
except Exception:
return {}
pkgs: Dict[str, List[Dict[str, str]]] = {}
for raw in (out or "").splitlines():
line = raw.strip("\n")
if not line:
continue
parts = line.split("\t")
if len(parts) < 5:
continue
name, epoch, ver, rel, arch = [p.strip() for p in parts[:5]]
if not name or not ver:
continue
# Normalise epoch.
epoch = epoch.strip()
if epoch.lower() in ("(none)", "none", ""):
epoch = "0"
v = f"{ver}-{rel}" if rel else ver
if epoch and epoch.isdigit() and epoch != "0":
v = f"{epoch}:{v}"
pkgs.setdefault(name, []).append({"version": v, "arch": arch})
for k in list(pkgs.keys()):
pkgs[k] = sorted(
pkgs[k], key=lambda x: (x.get("arch") or "", x.get("version") or "")
)
return pkgs
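The version string rules from the `list_installed_packages` docstring (`<version>-<release>`, prefixed with `<epoch>:` only for a non-zero epoch) can be checked with a small standalone function. `format_rpm_version` is a hypothetical name; the logic follows the normalisation shown in the diff.

```python
def format_rpm_version(epoch: str, version: str, release: str) -> str:
    """Format an RPM version as [epoch:]version[-release]."""
    epoch = epoch.strip()
    # Treat a textual "(none)" or an empty epoch as 0.
    if epoch.lower() in ("(none)", "none", ""):
        epoch = "0"
    v = f"{version}-{release}" if release else version
    # Only a non-zero numeric epoch is shown.
    if epoch.isdigit() and epoch != "0":
        v = f"{epoch}:{v}"
    return v


print(format_rpm_version("0", "8.2", "1.el9"))
print(format_rpm_version("2", "9.0.2", "3.fc40"))
```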
def _walk_etc_files() -> List[str]:
out: List[str] = []
for dirpath, _, filenames in os.walk("/etc"):
for fn in filenames:
p = os.path.join(dirpath, fn)
if os.path.islink(p) or not os.path.isfile(p):
continue
out.append(p)
return out
def build_rpm_etc_index() -> (
Tuple[Set[str], Dict[str, str], Dict[str, Set[str]], Dict[str, List[str]]]
):
"""Best-effort equivalent of build_dpkg_etc_index for RPM systems.
This builds indexes by walking the live /etc tree and querying RPM ownership
for each file.
Returns:
owned_etc_paths: set of /etc paths owned by rpm
etc_owner_map: /etc/path -> pkg
topdir_to_pkgs: "nginx" -> {"nginx", ...} based on /etc/<topdir>/...
pkg_to_etc_paths: pkg -> list of owned /etc paths
"""
owned: Set[str] = set()
owner: Dict[str, str] = {}
topdir_to_pkgs: Dict[str, Set[str]] = {}
pkg_to_etc: Dict[str, List[str]] = {}
paths = _walk_etc_files()
# Query in chunks to avoid excessive process spawns.
chunk_size = 250
not_owned_re = re.compile(
r"^file\s+(?P<path>.+?)\s+is\s+not\s+owned\s+by\s+any\s+package", re.IGNORECASE
)
for i in range(0, len(paths), chunk_size):
chunk = paths[i : i + chunk_size]
rc, out = _run(
["rpm", "-qf", "--qf", "%{NAME}\n", *chunk],
allow_fail=True,
merge_err=True,
)
lines = [ln.strip() for ln in out.splitlines() if ln.strip()]
# Heuristic: rpm prints one output line per input path. If that isn't
# true (warnings/errors), fall back to per-file queries for this chunk.
if len(lines) != len(chunk):
for p in chunk:
pkg = rpm_owner(p)
if not pkg:
continue
owned.add(p)
owner.setdefault(p, pkg)
pkg_to_etc.setdefault(pkg, []).append(p)
parts = p.split("/", 3)
if len(parts) >= 3 and parts[2]:
topdir_to_pkgs.setdefault(parts[2], set()).add(pkg)
continue
for pth, line in zip(chunk, lines):
if not line:
continue
if not_owned_re.match(line) or "is not owned" in line:
continue
pkg = line.split()[0].strip()
if not pkg:
continue
owned.add(pth)
owner.setdefault(pth, pkg)
pkg_to_etc.setdefault(pkg, []).append(pth)
parts = pth.split("/", 3)
if len(parts) >= 3 and parts[2]:
topdir_to_pkgs.setdefault(parts[2], set()).add(pkg)
for k, v in list(pkg_to_etc.items()):
pkg_to_etc[k] = sorted(set(v))
return owned, owner, topdir_to_pkgs, pkg_to_etc
def rpm_config_files(pkg: str) -> Set[str]:
"""Return config files for a package (rpm -qc)."""
rc, out = _run(["rpm", "-qc", pkg], allow_fail=True, merge_err=True)
if rc != 0:
return set()
files: Set[str] = set()
for line in out.splitlines():
line = line.strip()
if line.startswith("/"):
files.add(line)
return files
def rpm_modified_files(pkg: str) -> Set[str]:
"""Return files reported as modified by rpm verification (rpm -V).
rpm -V only prints lines for differences/missing files.
"""
rc, out = _run(["rpm", "-V", pkg], allow_fail=True, merge_err=True)
# rc is non-zero when there are differences; we still want the output.
files: Set[str] = set()
for raw in out.splitlines():
line = raw.strip()
if not line:
continue
# Typical forms:
# S.5....T. c /etc/foo.conf
# missing /etc/bar
m = re.search(r"\s(/\S+)$", line)
if m:
files.add(m.group(1))
continue
if line.startswith("missing"):
parts = line.split()
if parts and parts[-1].startswith("/"):
files.add(parts[-1])
return files
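The two line shapes mentioned in the comments of `rpm_modified_files` can be fed through the same trailing-path regex to see what it extracts. This sketch uses a hypothetical `parse_rpm_verify` function and a hand-written sample rather than real `rpm -V` output. It also shares a limitation with the regex in the diff: a path containing whitespace would only be captured from its last space onward.

```python
import re
from typing import Set


def parse_rpm_verify(output: str) -> Set[str]:
    """Extract the trailing absolute path from each non-empty line."""
    files: Set[str] = set()
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        # The path is the last whitespace-separated token starting with '/'.
        m = re.search(r"\s(/\S+)$", line)
        if m:
            files.add(m.group(1))
    return files


SAMPLE = """S.5....T.  c /etc/foo.conf
missing     /etc/bar
"""
print(sorted(parse_rpm_verify(SAMPLE)))
```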


@ -0,0 +1,4 @@
"""Vendored JSON schemas.
These are used by `enroll validate` so validation can run offline.
"""


@ -0,0 +1,712 @@
{
"$defs": {
"AptConfigSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"role_name": {
"const": "apt_config"
}
},
"type": "object"
}
],
"unevaluatedProperties": false
},
"DnfConfigSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"role_name": {
"const": "dnf_config"
}
},
"type": "object"
}
],
"unevaluatedProperties": false
},
"EtcCustomSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"role_name": {
"const": "etc_custom"
}
},
"type": "object"
}
],
"unevaluatedProperties": false
},
"ExcludedFile": {
"additionalProperties": false,
"properties": {
"path": {
"minLength": 1,
"pattern": "^/.*",
"type": "string"
},
"reason": {
"enum": [
"user_excluded",
"unreadable",
"backup_file",
"log_file",
"denied_path",
"too_large",
"not_regular_file",
"not_symlink",
"binary_like",
"sensitive_content"
],
"type": "string"
}
},
"required": [
"path",
"reason"
],
"type": "object"
},
"ExtraPathsSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"exclude_patterns": {
"items": {
"type": "string"
},
"type": "array"
},
"include_patterns": {
"items": {
"type": "string"
},
"type": "array"
},
"role_name": {
"const": "extra_paths"
}
},
"required": [
"include_patterns",
"exclude_patterns"
],
"type": "object"
}
],
"unevaluatedProperties": false
},
"InstalledPackageInstance": {
"additionalProperties": false,
"properties": {
"arch": {
"minLength": 1,
"type": "string"
},
"version": {
"minLength": 1,
"type": "string"
}
},
"required": [
"version",
"arch"
],
"type": "object"
},
"ManagedDir": {
"additionalProperties": false,
"properties": {
"group": {
"minLength": 1,
"type": "string"
},
"mode": {
"pattern": "^[0-7]{4}$",
"type": "string"
},
"owner": {
"minLength": 1,
"type": "string"
},
"path": {
"minLength": 1,
"pattern": "^/.*",
"type": "string"
},
"reason": {
"enum": [
"parent_of_managed_file",
"user_include_dir"
],
"type": "string"
}
},
"required": [
"path",
"owner",
"group",
"mode",
"reason"
],
"type": "object"
},
"ManagedFile": {
"additionalProperties": false,
"properties": {
"group": {
"minLength": 1,
"type": "string"
},
"mode": {
"pattern": "^[0-7]{4}$",
"type": "string"
},
"owner": {
"minLength": 1,
"type": "string"
},
"path": {
"minLength": 1,
"pattern": "^/.*",
"type": "string"
},
"reason": {
"enum": [
"apt_config",
"apt_keyring",
"apt_signed_by_keyring",
"apt_source",
"authorized_keys",
"cron_snippet",
"custom_specific_path",
"custom_unowned",
"dnf_config",
"logrotate_snippet",
"modified_conffile",
"modified_packaged_file",
"related_timer",
"rpm_gpg_key",
"ssh_public_key",
"system_cron",
"system_firewall",
"system_logrotate",
"system_modprobe",
"system_mounts",
"system_network",
"system_rc",
"system_security",
"system_sysctl",
"systemd_dropin",
"systemd_envfile",
"user_include",
"user_profile",
"user_shell_aliases",
"user_shell_logout",
"user_shell_rc",
"usr_local_bin_script",
"usr_local_etc_custom",
"yum_conf",
"yum_config",
"yum_repo"
],
"type": "string"
},
"src_rel": {
"minLength": 1,
"pattern": "^[^/].*",
"type": "string"
}
},
"required": [
"path",
"src_rel",
"owner",
"group",
"mode",
"reason"
],
"type": "object"
},
"ManagedLink": {
"additionalProperties": false,
"type": "object",
"properties": {
"path": {
"type": "string",
"minLength": 1,
"pattern": "^/.*"
},
"target": {
"type": "string",
"minLength": 1
},
"reason": {
"type": "string",
"enum": [
"enabled_symlink"
]
}
},
"required": [
"path",
"target",
"reason"
]
},
"ObservedVia": {
"oneOf": [
{
"additionalProperties": false,
"properties": {
"kind": {
"const": "user_installed"
}
},
"required": [
"kind"
],
"type": "object"
},
{
"additionalProperties": false,
"properties": {
"kind": {
"const": "systemd_unit"
},
"ref": {
"minLength": 1,
"type": "string"
}
},
"required": [
"kind",
"ref"
],
"type": "object"
},
{
"additionalProperties": false,
"properties": {
"kind": {
"const": "package_role"
},
"ref": {
"minLength": 1,
"type": "string"
}
},
"required": [
"kind",
"ref"
],
"type": "object"
}
]
},
"PackageInventoryEntry": {
"additionalProperties": false,
"properties": {
"arches": {
"items": {
"minLength": 1,
"type": "string"
},
"type": "array"
},
"installations": {
"items": {
"$ref": "#/$defs/InstalledPackageInstance"
},
"type": "array"
},
"observed_via": {
"items": {
"$ref": "#/$defs/ObservedVia"
},
"type": "array"
},
"roles": {
"items": {
"minLength": 1,
"type": "string"
},
"type": "array"
},
"version": {
"type": [
"string",
"null"
]
}
},
"required": [
"version",
"arches",
"installations",
"observed_via",
"roles"
],
"type": "object"
},
"PackageSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"package": {
"minLength": 1,
"type": "string"
}
},
"required": [
"package"
],
"type": "object"
}
],
"unevaluatedProperties": false
},
"RoleCommon": {
"properties": {
"excluded": {
"items": {
"$ref": "#/$defs/ExcludedFile"
},
"type": "array"
},
"managed_dirs": {
"items": {
"$ref": "#/$defs/ManagedDir"
},
"type": "array"
},
"managed_files": {
"items": {
"$ref": "#/$defs/ManagedFile"
},
"type": "array"
},
"managed_links": {
"items": {
"$ref": "#/$defs/ManagedLink"
},
"type": "array"
},
"notes": {
"items": {
"type": "string"
},
"type": "array"
},
"role_name": {
"minLength": 1,
"pattern": "^[A-Za-z0-9_]+$",
"type": "string"
}
},
"required": [
"role_name",
"managed_dirs",
"managed_files",
"excluded",
"notes"
],
"type": "object"
},
"ServiceSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"active_state": {
"type": [
"string",
"null"
]
},
"condition_result": {
"type": [
"string",
"null"
]
},
"packages": {
"items": {
"minLength": 1,
"type": "string"
},
"type": "array"
},
"role_name": {
"minLength": 1,
"pattern": "^[a-z_][a-z0-9_]*$",
"type": "string"
},
"sub_state": {
"type": [
"string",
"null"
]
},
"unit": {
"minLength": 1,
"type": "string"
},
"unit_file_state": {
"type": [
"string",
"null"
]
}
},
"required": [
"unit",
"packages",
"active_state",
"sub_state",
"unit_file_state",
"condition_result"
],
"type": "object"
}
],
"unevaluatedProperties": false
},
"UserEntry": {
"additionalProperties": false,
"properties": {
"gecos": {
"type": "string"
},
"gid": {
"minimum": 0,
"type": "integer"
},
"home": {
"type": "string"
},
"name": {
"minLength": 1,
"type": "string"
},
"primary_group": {
"minLength": 1,
"type": "string"
},
"shell": {
"type": "string"
},
"supplementary_groups": {
"items": {
"minLength": 1,
"type": "string"
},
"type": "array"
},
"uid": {
"minimum": 0,
"type": "integer"
}
},
"required": [
"name",
"uid",
"gid",
"gecos",
"home",
"shell",
"primary_group",
"supplementary_groups"
],
"type": "object"
},
"UsersSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"role_name": {
"const": "users"
},
"users": {
"items": {
"$ref": "#/$defs/UserEntry"
},
"type": "array"
}
},
"required": [
"users"
],
"type": "object"
}
],
"unevaluatedProperties": false
},
"UsrLocalCustomSnapshot": {
"allOf": [
{
"$ref": "#/$defs/RoleCommon"
},
{
"properties": {
"role_name": {
"const": "usr_local_custom"
}
},
"type": "object"
}
],
"unevaluatedProperties": false
}
},
"$id": "https://enroll.sh/schema/state.schema.json",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"additionalProperties": false,
"properties": {
"enroll": {
"additionalProperties": false,
"properties": {
"harvest_time": {
"minimum": 0,
"type": "integer"
},
"version": {
"type": "string"
}
},
"required": [
"version",
"harvest_time"
],
"type": "object"
},
"host": {
"additionalProperties": false,
"properties": {
"hostname": {
"minLength": 1,
"type": "string"
},
"os": {
"enum": [
"debian",
"redhat",
"unknown"
],
"type": "string"
},
"os_release": {
"additionalProperties": {
"type": "string"
},
"type": "object"
},
"pkg_backend": {
"enum": [
"dpkg",
"rpm"
],
"type": "string"
}
},
"required": [
"hostname",
"os",
"pkg_backend",
"os_release"
],
"type": "object"
},
"inventory": {
"additionalProperties": false,
"properties": {
"packages": {
"additionalProperties": {
"$ref": "#/$defs/PackageInventoryEntry"
},
"type": "object"
}
},
"required": [
"packages"
],
"type": "object"
},
"roles": {
"additionalProperties": false,
"properties": {
"apt_config": {
"$ref": "#/$defs/AptConfigSnapshot"
},
"dnf_config": {
"$ref": "#/$defs/DnfConfigSnapshot"
},
"etc_custom": {
"$ref": "#/$defs/EtcCustomSnapshot"
},
"extra_paths": {
"$ref": "#/$defs/ExtraPathsSnapshot"
},
"packages": {
"items": {
"$ref": "#/$defs/PackageSnapshot"
},
"type": "array"
},
"services": {
"items": {
"$ref": "#/$defs/ServiceSnapshot"
},
"type": "array"
},
"users": {
"$ref": "#/$defs/UsersSnapshot"
},
"usr_local_custom": {
"$ref": "#/$defs/UsrLocalCustomSnapshot"
}
},
"required": [
"users",
"services",
"packages",
"apt_config",
"dnf_config",
"etc_custom",
"usr_local_custom",
"extra_paths"
],
"type": "object"
}
},
"required": [
"enroll",
"host",
"inventory",
"roles"
],
"title": "Enroll harvest state.json schema (latest)",
"type": "object"
}
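
To make the `ManagedFile` constraints above concrete, here is a rough sketch that checks a single entry using only the standard library. It covers just the `required` list and the three `pattern` keywords of that definition (not `enum`, `minLength` or `additionalProperties`), so it is no substitute for a real JSON Schema validator. `check_managed_file` and the sample entries are hypothetical.

```python
import re
from typing import Any, Dict, List

# Copied from the ManagedFile definition in the schema above.
REQUIRED = ("path", "src_rel", "owner", "group", "mode", "reason")
PATTERNS = {
    "path": r"^/.*",        # absolute path on the harvested host
    "mode": r"^[0-7]{4}$",  # four octal digits, e.g. "0644"
    "src_rel": r"^[^/].*",  # relative path inside the bundle
}


def check_managed_file(entry: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (empty when none found)."""
    problems: List[str] = []
    for key in REQUIRED:
        if key not in entry:
            problems.append(f"missing required property: {key}")
    for key, pattern in PATTERNS.items():
        value = entry.get(key)
        # JSON Schema "pattern" is a search, not a full match.
        if isinstance(value, str) and not re.search(pattern, value):
            problems.append(f"{key}={value!r} does not match {pattern}")
    return problems


good = {
    "path": "/etc/ssh/sshd_config",
    "src_rel": "etc/ssh/sshd_config",
    "owner": "root",
    "group": "root",
    "mode": "0644",
    "reason": "modified_conffile",
}
bad = dict(good, mode="644", src_rel="/etc/ssh/sshd_config")

print(check_managed_file(good))
print(check_managed_file(bad))
```

The `bad` entry fails twice: a three-digit mode, and an absolute `src_rel`.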


@ -2,7 +2,7 @@ from __future__ import annotations
import os
import shutil
import subprocess # nosec
import tempfile
from pathlib import Path
from typing import Iterable, List, Optional
@ -62,7 +62,7 @@ def encrypt_file_binary(
],
capture_output=True,
check=False,
) # nosec
if res.returncode != 0:
raise SopsError(
"sops encryption failed:\n"
@ -112,7 +112,7 @@ def decrypt_file_binary_to(
],
capture_output=True,
check=False,
) # nosec
if res.returncode != 0:
raise SopsError(
"sops decryption failed:\n"


@ -33,6 +33,19 @@ def _run(cmd: list[str]) -> str:
return p.stdout
@dataclass
class TimerInfo:
name: str
fragment_path: Optional[str]
dropin_paths: List[str]
env_files: List[str]
trigger_unit: Optional[str]
active_state: Optional[str]
sub_state: Optional[str]
unit_file_state: Optional[str]
condition_result: Optional[str]
def list_enabled_services() -> List[str]:
out = _run(
[
@ -58,6 +71,31 @@ def list_enabled_services() -> List[str]:
return sorted(set(units))
def list_enabled_timers() -> List[str]:
out = _run(
[
"systemctl",
"list-unit-files",
"--type=timer",
"--state=enabled",
"--no-legend",
]
)
units: List[str] = []
for line in out.splitlines():
parts = line.split()
if not parts:
continue
unit = parts[0].strip()
if not unit.endswith(".timer"):
continue
# Skip template units like "foo@.timer"
if unit.endswith("@.timer"):
continue
units.append(unit)
return sorted(set(units))
def get_unit_info(unit: str) -> UnitInfo:
p = subprocess.run(
[
@ -117,3 +155,62 @@ def get_unit_info(unit: str) -> UnitInfo:
unit_file_state=kv.get("UnitFileState") or None,
condition_result=kv.get("ConditionResult") or None,
)
def get_timer_info(unit: str) -> TimerInfo:
p = subprocess.run(
[
"systemctl",
"show",
unit,
"-p",
"FragmentPath",
"-p",
"DropInPaths",
"-p",
"EnvironmentFiles",
"-p",
"Unit",
"-p",
"ActiveState",
"-p",
"SubState",
"-p",
"UnitFileState",
"-p",
"ConditionResult",
],
text=True,
capture_output=True,
) # nosec
if p.returncode != 0:
raise RuntimeError(f"systemctl show failed for {unit}: {p.stderr}")
kv: dict[str, str] = {}
for line in (p.stdout or "").splitlines():
if "=" in line:
k, v = line.split("=", 1)
kv[k] = v.strip()
fragment = kv.get("FragmentPath") or None
dropins = [pp for pp in (kv.get("DropInPaths", "") or "").split() if pp]
env_files: List[str] = []
for token in (kv.get("EnvironmentFiles", "") or "").split():
token = token.lstrip("-")
if token:
env_files.append(token)
trigger = kv.get("Unit") or None
return TimerInfo(
name=unit,
fragment_path=fragment,
dropin_paths=dropins,
env_files=env_files,
trigger_unit=trigger,
active_state=kv.get("ActiveState") or None,
sub_state=kv.get("SubState") or None,
unit_file_state=kv.get("UnitFileState") or None,
condition_result=kv.get("ConditionResult") or None,
)
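
The timer helpers added above reduce to two text-processing steps: filtering `systemctl list-unit-files` output, and splitting `systemctl show` output into key/value pairs. The sketch below runs both against invented sample text so it needs no systemd. The sample strings and the names `enabled_timers_from_output`, `parse_show_output` and `split_paths` are assumptions for illustration only.

```python
from typing import Dict, List

# Invented sample output; real systemctl output depends on the host.
SAMPLE_LIST_OUTPUT = """\
apt-daily.timer   enabled enabled
logrotate.timer   enabled enabled
backup@.timer     enabled enabled
apt-daily.timer   enabled enabled
"""

SAMPLE_SHOW_OUTPUT = """\
FragmentPath=/usr/lib/systemd/system/logrotate.timer
DropInPaths=/etc/systemd/system/logrotate.timer.d/override.conf
Unit=logrotate.service
ActiveState=active
UnitFileState=enabled
"""


def enabled_timers_from_output(out: str) -> List[str]:
    """Keep concrete *.timer units; drop templates such as foo@.timer."""
    units: List[str] = []
    for line in out.splitlines():
        parts = line.split()
        if not parts:
            continue
        unit = parts[0]
        if not unit.endswith(".timer") or unit.endswith("@.timer"):
            continue
        units.append(unit)
    return sorted(set(units))


def parse_show_output(out: str) -> Dict[str, str]:
    """Split KEY=VALUE lines on the first '=' only."""
    kv: Dict[str, str] = {}
    for line in out.splitlines():
        if "=" in line:
            k, v = line.split("=", 1)
            kv[k] = v.strip()
    return kv


def split_paths(value: str) -> List[str]:
    """Whitespace-split a path list, dropping the optional '-' prefix."""
    return [tok.lstrip("-") for tok in value.split() if tok.lstrip("-")]


kv = parse_show_output(SAMPLE_SHOW_OUTPUT)
print(enabled_timers_from_output(SAMPLE_LIST_OUTPUT))
print(kv.get("Unit") or None, kv.get("EnvironmentFiles") or None)
print(split_paths("-/etc/default/example /etc/example.env"))
```

A property missing from the output simply becomes `None` via the `kv.get(...) or None` idiom, which is how `get_timer_info` treats absent or empty values.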

enroll/validate.py (new file, 223 lines)

@ -0,0 +1,223 @@
from __future__ import annotations
import json
import urllib.request
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List, Optional, Set, Tuple
import jsonschema
from .diff import BundleRef, _bundle_from_input
@dataclass
class ValidationResult:
errors: List[str]
warnings: List[str]
@property
def ok(self) -> bool:
return not self.errors
def to_dict(self) -> Dict[str, Any]:
return {
"ok": self.ok,
"errors": list(self.errors),
"warnings": list(self.warnings),
}
def to_text(self) -> str:
lines: List[str] = []
if not self.errors and not self.warnings:
lines.append("OK: harvest bundle validated")
elif not self.errors and self.warnings:
lines.append(f"WARN: {len(self.warnings)} warning(s)")
else:
lines.append(f"ERROR: {len(self.errors)} validation error(s)")
if self.errors:
lines.append("")
lines.append("Errors:")
for e in self.errors:
lines.append(f"- {e}")
if self.warnings:
lines.append("")
lines.append("Warnings:")
for w in self.warnings:
lines.append(f"- {w}")
return "\n".join(lines) + "\n"
def _default_schema_path() -> Path:
# Keep the schema vendored with the codebase so enroll can validate offline.
return Path(__file__).resolve().parent / "schema" / "state.schema.json"
def _load_schema(schema: Optional[str]) -> Dict[str, Any]:
"""Load a JSON schema.
If schema is None, load the vendored schema.
If schema begins with http(s)://, fetch it.
Otherwise, treat it as a local file path.
"""
if not schema:
p = _default_schema_path()
with open(p, "r", encoding="utf-8") as f:
return json.load(f)
if schema.startswith("http://") or schema.startswith("https://"):
with urllib.request.urlopen(schema, timeout=10) as resp: # nosec
data = resp.read()
return json.loads(data.decode("utf-8"))
p = Path(schema).expanduser()
with open(p, "r", encoding="utf-8") as f:
return json.load(f)
def _json_pointer(err: jsonschema.ValidationError) -> str:
# Build a JSON pointer-ish path that is easy to read.
if err.absolute_path:
parts = [str(p) for p in err.absolute_path]
return "/" + "/".join(parts)
return "/"
def _iter_managed_files(state: Dict[str, Any]) -> List[Tuple[str, Dict[str, Any]]]:
"""Return (role_name, managed_file_dict) tuples across all roles."""
roles = state.get("roles") or {}
out: List[Tuple[str, Dict[str, Any]]] = []
# Singleton roles
for rn in [
"users",
"apt_config",
"dnf_config",
"etc_custom",
"usr_local_custom",
"extra_paths",
]:
snap = roles.get(rn) or {}
for mf in snap.get("managed_files") or []:
if isinstance(mf, dict):
out.append((rn, mf))
# Array roles
for s in roles.get("services") or []:
if not isinstance(s, dict):
continue
role_name = str(s.get("role_name") or "unknown")
for mf in s.get("managed_files") or []:
if isinstance(mf, dict):
out.append((role_name, mf))
for p in roles.get("packages") or []:
if not isinstance(p, dict):
continue
role_name = str(p.get("role_name") or "unknown")
for mf in p.get("managed_files") or []:
if isinstance(mf, dict):
out.append((role_name, mf))
return out
def validate_harvest(
harvest_input: str,
*,
sops_mode: bool = False,
schema: Optional[str] = None,
no_schema: bool = False,
) -> ValidationResult:
"""Validate an enroll harvest bundle.
Checks:
- state.json parses
- state.json validates against the schema (unless no_schema)
- every managed_file src_rel exists in artifacts/<role>/<src_rel>
"""
errors: List[str] = []
warnings: List[str] = []
bundle: BundleRef = _bundle_from_input(harvest_input, sops_mode=sops_mode)
try:
state_path = bundle.state_path
if not state_path.exists():
return ValidationResult(
errors=[f"missing state.json at {state_path}"], warnings=[]
)
try:
state = json.loads(state_path.read_text(encoding="utf-8"))
except Exception as e: # noqa: BLE001
return ValidationResult(
errors=[f"failed to parse state.json: {e!r}"], warnings=[]
)
if not no_schema:
try:
sch = _load_schema(schema)
validator = jsonschema.Draft202012Validator(sch)
for err in sorted(validator.iter_errors(state), key=str):
ptr = _json_pointer(err)
msg = err.message
errors.append(f"schema {ptr}: {msg}")
except Exception as e: # noqa: BLE001
errors.append(f"failed to load/validate schema: {e!r}")
# Artifact existence checks
artifacts_dir = bundle.dir / "artifacts"
referenced: Set[Tuple[str, str]] = set()
for role_name, mf in _iter_managed_files(state):
src_rel = str(mf.get("src_rel") or "")
if not src_rel:
errors.append(
f"managed_file missing src_rel for role {role_name} (path={mf.get('path')!r})"
)
continue
if src_rel.startswith("/") or ".." in src_rel.split("/"):
errors.append(
f"managed_file has suspicious src_rel for role {role_name}: {src_rel!r}"
)
continue
referenced.add((role_name, src_rel))
p = artifacts_dir / role_name / src_rel
if not p.exists():
errors.append(
f"missing artifact for role {role_name}: artifacts/{role_name}/{src_rel}"
)
continue
if not p.is_file():
errors.append(
f"artifact is not a file for role {role_name}: artifacts/{role_name}/{src_rel}"
)
# Warn if there are extra files in artifacts not referenced.
if artifacts_dir.exists() and artifacts_dir.is_dir():
for fp in artifacts_dir.rglob("*"):
if not fp.is_file():
continue
try:
rel = fp.relative_to(artifacts_dir)
except ValueError:
continue
parts = rel.parts
if len(parts) < 2:
continue
role_name = parts[0]
src_rel = "/".join(parts[1:])
if (role_name, src_rel) not in referenced:
warnings.append(
f"unreferenced artifact present: artifacts/{role_name}/{src_rel}"
)
return ValidationResult(errors=errors, warnings=warnings)
finally:
# Ensure any temp extraction dirs are cleaned up.
if bundle.tempdir is not None:
bundle.tempdir.cleanup()
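
The two artifact checks in `validate_harvest` (rejecting a suspicious `src_rel`, and warning about files nobody references) can be exercised on a throwaway directory. This is a minimal sketch assuming the `artifacts/<role>/<src_rel>` layout described in the docstring; `is_suspicious_src_rel` and `unreferenced_artifacts` are hypothetical helper names, not functions from enroll.

```python
import tempfile
from pathlib import Path
from typing import List, Set, Tuple


def is_suspicious_src_rel(src_rel: str) -> bool:
    """Flag absolute paths and any '..' path component."""
    return src_rel.startswith("/") or ".." in src_rel.split("/")


def unreferenced_artifacts(
    artifacts_dir: Path, referenced: Set[Tuple[str, str]]
) -> List[str]:
    """List files under artifacts/<role>/ that no managed_file points at."""
    extras: List[str] = []
    for fp in sorted(artifacts_dir.rglob("*")):
        if not fp.is_file():
            continue
        parts = fp.relative_to(artifacts_dir).parts
        if len(parts) < 2:
            continue  # a file directly under artifacts/ has no role
        role_name, src_rel = parts[0], "/".join(parts[1:])
        if (role_name, src_rel) not in referenced:
            extras.append(f"artifacts/{role_name}/{src_rel}")
    return extras


with tempfile.TemporaryDirectory() as tmp:
    artifacts = Path(tmp) / "artifacts"
    home = artifacts / "users" / "home" / "alice"
    home.mkdir(parents=True)
    (home / ".bashrc").write_text("# rc\n", encoding="utf-8")
    (artifacts / "users" / "stray.txt").write_text("x", encoding="utf-8")
    referenced = {("users", "home/alice/.bashrc")}
    extras = unreferenced_artifacts(artifacts, referenced)

print(extras)
print(is_suspicious_src_rel("../etc/passwd"), is_suspicious_src_rel("etc/foo..bar"))
```

Splitting on `/` before looking for `..` means a filename that merely contains two dots, such as `foo..bar`, is not flagged.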

enroll/version.py (new file, 32 lines)

@ -0,0 +1,32 @@
from __future__ import annotations
def get_enroll_version() -> str:
    """
    Best-effort version lookup that works when installed via:
    - poetry/pip/wheel
    - deb/rpm system packages
    Falls back to "unknown" when no installed distribution metadata is found,
    e.g. when running from an unpacked source tree.
    """
    try:
        from importlib.metadata import (
            packages_distributions,
            version,
        )
    except Exception:
        # Very old Python or unusual environment
        return "unknown"
    # Map import package -> dist(s)
    dist_names = []
    try:
        dist_names = (packages_distributions() or {}).get("enroll", []) or []
    except Exception:
        dist_names = []
    # Try mapped dists first, then a reasonable default
    for dist in [*dist_names, "enroll"]:
        try:
            return version(dist)
        except Exception:
            # No metadata for this candidate; try the next one.
            continue
    return "unknown"

poetry.lock (generated, 972 lines changed; diff suppressed because it is too large)


@ -1,17 +1,21 @@
[tool.poetry]
name = "enroll"
version = "0.1.0"
version = "0.5.0"
description = "Enroll a server's running state retrospectively into Ansible"
authors = ["Miguel Jacq <mig@mig5.net>"]
license = "GPL-3.0-or-later"
readme = "README.md"
packages = [{ include = "enroll" }]
repository = "https://git.mig5.net/mig5/enroll"
include = [
{ path = "enroll/schema/state.schema.json", format = ["sdist", "wheel"] }
]
[tool.poetry.dependencies]
python = "^3.10"
pyyaml = "^6.0.3"
paramiko = "^4.0.0"
pyyaml = "^6"
paramiko = ">=3.5"
jsonschema = "^4.23.0"
[tool.poetry.scripts]
enroll = "enroll.cli:main"


@ -42,3 +42,52 @@ for dist in ${DISTS[@]}; do
debfile=$(ls -1 dist/${release}/*.deb)
reprepro -b /home/user/git/repo includedeb "${release}" "${debfile}"
done
# RPM
sudo apt-get -y install createrepo-c rpm
BUILD_OUTPUT="${HOME}/git/enroll/dist"
KEYID="54A91143AE0AB4F7743B01FE888ED1B423A3BC99"
REPO_ROOT="${HOME}/git/repo_rpm"
REMOTE="letessier.mig5.net:/opt/repo_rpm"
DISTS=(
fedora:43
fedora:42
)
for dist in ${DISTS[@]}; do
release=$(echo ${dist} | cut -d: -f2)
REPO_RELEASE_ROOT="${REPO_ROOT}/${release}"
RPM_REPO="${REPO_RELEASE_ROOT}/rpm/x86_64"
mkdir -p "$RPM_REPO"
docker build \
--no-cache \
-f Dockerfile.rpmbuild \
-t enroll-rpm:${release} \
--progress=plain \
--build-arg BASE_IMAGE=${dist} \
.
rm -rf "$PWD/dist/rpm"/*
mkdir -p "$PWD/dist/rpm"
docker run --rm -v "$PWD":/src -v "$PWD/dist/rpm":/out enroll-rpm:${release}
sudo chown -R "${USER}" "$PWD/dist"
# Sign each built package; a glob avoids parsing `ls` output.
for file in "${BUILD_OUTPUT}/rpm/"*.rpm; do
rpmsign --addsign "$file"
done
cp "${BUILD_OUTPUT}/rpm/"*.rpm "$RPM_REPO/"
createrepo_c "$RPM_REPO"
echo "==> Signing repomd.xml..."
qubes-gpg-client --local-user "$KEYID" --detach-sign --armor "$RPM_REPO/repodata/repomd.xml" > "$RPM_REPO/repodata/repomd.xml.asc"
done
echo "==> Syncing repo to server..."
rsync -aHPvz --exclude=.git --delete "$REPO_ROOT/" "$REMOTE/"
echo "Done!"

rpm/enroll.spec (new file, 97 lines)

@ -0,0 +1,97 @@
%global upstream_version 0.5.0
Name: enroll
Version: %{upstream_version}
Release: 1%{?dist}.enroll1
Summary: Enroll a server's running state retrospectively into Ansible.
License: GPL-3.0-or-later
URL: https://git.mig5.net/mig5/enroll
Source0: %{name}-%{version}.tar.gz
BuildArch: noarch
BuildRequires: pyproject-rpm-macros
BuildRequires: python3-devel
BuildRequires: python3-poetry-core
Requires: python3-yaml
Requires: python3-paramiko
Requires: python3-jsonschema
Recommends: jinjaturtle
%description
Enroll a server's running state retrospectively into Ansible.
%prep
%autosetup -n enroll
%generate_buildrequires
%pyproject_buildrequires
%build
%pyproject_wheel
%install
%pyproject_install
%pyproject_save_files enroll
%files -f %{pyproject_files}
%license LICENSE
%doc README.md CHANGELOG.md
%{_bindir}/enroll
%changelog
* Tue May 12 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Add ssh config support where JinjaTurtle is used
* Tue Feb 17 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Add capability to handle passphrases on encrypted SSH private keys. Prompting can be forced with `--ask-key-passphrase` or automated (e.g. for CI) with `--ssh-key-passphrase env SOMEVAR`
* Fri Jan 16 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Add support for AddressFamily and ConnectTimeout in the .ssh/config when using `--remote-ssh-config`.
* Tue Jan 13 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Support `--remote-ssh-config [path-to-ssh-config]` as an argument in case extra params are required beyond `--remote-port` or `--remote-user`. Note: `--remote-host` must still be set, but it can be an 'alias' represented by the 'Host' value in the ssh config.
* Sun Jan 11 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Add interactive output when 'enroll diff --enforce' is invoking Ansible.
* Sat Jan 10 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Introduce `enroll validate` - a tool to validate a harvest against the state schema, or check for missing or orphaned obsolete artifacts in a harvest.
- Attempt to generate Jinja2 templates of systemd unit files and Postfix main.cf (now that JinjaTurtle supports it)
- Update pynacl dependency to resolve CVE-2025-69277
- Add `--exclude-path` to `enroll diff` command, so that you can ignore certain churn from the diff (stuff you still wanted to harvest as a baseline but don't care if it changes day to day)
- Add `--ignore-package-versions` to `enroll diff` command, to optionally ignore package upgrades (e.g. due to patching) from the diff.
- Add tags to the playbook for each role, to allow easier targeting of specific roles during play later.
- Add `--enforce` mode to `enroll diff`. If a diff is detected between the two harvests, and the state from the older harvest can be restored, it will manifest that state and apply it with Ansible.
Only the specific roles that had diffed will be applied (via the new tags capability)
* Mon Jan 05 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Introduce `enroll explain` - a tool to analyze and explain what's in (or not in) a harvest and why.
- Centralise the cron and logrotate stuff into their respective roles, we had a bit of duplication between roles based on harvest discovery.
- Capture other files in the user's home directory such as `.bashrc`, `.bash_aliases`, `.profile`, if these files differ from the `/etc/skel` defaults
- Ignore files that end with a tilde or - (probably backup files generated by editors or shadow file changes)
- Manage certain symlinks, e.g. for apache2/nginx sites-enabled and so on
* Sun Jan 04 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Introduce --ask-become-pass or -K to support password-required sudo on remote hosts, just like Ansible. It will also fall back to this prompt if a password is required but the arg wasn't passed in.
* Sat Jan 03 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Fix stat() of parent directory so that we set directory perms correctly on --include paths.
- Set pty for remote calls when sudo is required, to help systems with limits on sudo without pty
* Fri Jan 02 2026 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Don't accidentally add extra_paths role to usr_local_custom list, resulting in extra_paths appearing twice in manifested playbook
- Ensure directories in the tree of anything included with --include are defined in the state and manifest so we make dirs before we try to create files
* Mon Dec 29 2025 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Add version CLI arg
- Add ability to enroll RH-style systems (DNF5/DNF/RPM)
- Refactor harvest state to track package versions
* Sun Dec 28 2025 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Fix an attribution bug for certain files ending up in the wrong package/role.
* Sun Dec 28 2025 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- DRY up some code logic
- More test coverage
* Sun Dec 28 2025 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Consolidate logrotate and cron files into their main service/package roles if they exist.
- Standardise on MAX_FILES_CAP in one place
- Manage apt stuff in its own role, not in etc_custom
* Sat Dec 27 2025 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Attempt to capture more stuff from /etc that might not be attributable to a specific package. This includes common singletons and systemd timers
- Avoid duplicate apt data in package-specific roles.
* Sat Dec 27 2025 Miguel Jacq <mig@mig5.net> - %{version}-%{release}
- Initial RPM packaging for Fedora 42


@ -9,16 +9,45 @@ BUNDLE_DIR="/tmp/bundle"
ANSIBLE_DIR="/tmp/ansible"
rm -rf "${BUNDLE_DIR}" "${ANSIBLE_DIR}"
# Install something that has symlinks like apache2,
# to extend the manifests that will be linted later
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apache2
# Generate data
poetry run \
enroll single-shot \
--harvest "${BUNDLE_DIR}" \
--out "${ANSIBLE_DIR}"
builtin cd "${ANSIBLE_DIR}"
# Analyse
poetry run \
enroll explain "${BUNDLE_DIR}"
poetry run \
enroll explain "${BUNDLE_DIR}" --format json | jq
# Validate
poetry run \
enroll validate --fail-on-warnings "${BUNDLE_DIR}"
# Install/remove something, harvest again and diff the harvests
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends cowsay
poetry run \
enroll harvest --out "${BUNDLE_DIR}2"
# Validate
poetry run \
enroll validate --fail-on-warnings "${BUNDLE_DIR}2"
# Diff
poetry run \
enroll diff \
--old "${BUNDLE_DIR}" \
--new "${BUNDLE_DIR}2" \
--format json | jq
DEBIAN_FRONTEND=noninteractive apt-get remove -y --purge cowsay
# Ansible test
builtin cd "${ANSIBLE_DIR}"
# Lint
ansible-lint "${ANSIBLE_DIR}"
# Run
ansible-playbook playbook.yml -i "localhost," -c local --check --diff

tests/test___main__.py (new file, 18 lines)

@ -0,0 +1,18 @@
from __future__ import annotations
import runpy
def test_module_main_invokes_cli_main(monkeypatch):
import enroll.cli
called = {"ok": False}
def fake_main() -> None:
called["ok"] = True
monkeypatch.setattr(enroll.cli, "main", fake_main)
# Execute enroll.__main__ as if `python -m enroll`.
runpy.run_module("enroll.__main__", run_name="__main__")
assert called["ok"] is True

tests/test_accounts.py (new file, 143 lines)

@ -0,0 +1,143 @@
from __future__ import annotations
import os
from pathlib import Path
def test_parse_login_defs_parses_known_keys(tmp_path: Path):
from enroll.accounts import parse_login_defs
p = tmp_path / "login.defs"
p.write_text(
"""
# comment
UID_MIN 1000
UID_MAX 60000
SYS_UID_MIN 100
SYS_UID_MAX 999
UID_MIN not_an_int
OTHER 123
""",
encoding="utf-8",
)
vals = parse_login_defs(str(p))
assert vals["UID_MIN"] == 1000
assert vals["UID_MAX"] == 60000
assert vals["SYS_UID_MIN"] == 100
assert vals["SYS_UID_MAX"] == 999
assert "OTHER" not in vals
def test_parse_passwd_and_group_and_ssh_files(tmp_path: Path):
from enroll.accounts import find_user_ssh_files, parse_group, parse_passwd
passwd = tmp_path / "passwd"
passwd.write_text(
"\n".join(
[
"root:x:0:0:root:/root:/bin/bash",
"# comment",
"alice:x:1000:1000:Alice:/home/alice:/bin/bash",
"bob:x:1001:1000:Bob:/home/bob:/usr/sbin/nologin",
"badline",
"cathy:x:notint:1000:Cathy:/home/cathy:/bin/bash",
"",
]
),
encoding="utf-8",
)
group = tmp_path / "group"
group.write_text(
"\n".join(
[
"root:x:0:",
"users:x:1000:alice,bob",
"admins:x:1002:alice",
"badgroup:x:notint:alice",
"",
]
),
encoding="utf-8",
)
rows = parse_passwd(str(passwd))
assert ("alice", 1000, 1000, "Alice", "/home/alice", "/bin/bash") in rows
assert all(r[0] != "cathy" for r in rows) # skipped invalid UID
gid_to_name, name_to_gid, members = parse_group(str(group))
assert gid_to_name[1000] == "users"
assert name_to_gid["admins"] == 1002
assert "alice" in members["admins"]
# ssh discovery: only authorized_keys, no symlinks
home = tmp_path / "home" / "alice"
sshdir = home / ".ssh"
sshdir.mkdir(parents=True)
ak = sshdir / "authorized_keys"
ak.write_text("ssh-ed25519 AAA...", encoding="utf-8")
# a symlink should be ignored
(sshdir / "authorized_keys2").write_text("x", encoding="utf-8")
os.symlink(str(sshdir / "authorized_keys2"), str(sshdir / "authorized_keys_link"))
assert find_user_ssh_files(str(home)) == [str(ak)]
def test_collect_non_system_users(monkeypatch, tmp_path: Path):
import enroll.accounts as a
orig_parse_login_defs = a.parse_login_defs
orig_parse_passwd = a.parse_passwd
orig_parse_group = a.parse_group
# Provide controlled passwd/group/login.defs inputs via monkeypatch.
passwd = tmp_path / "passwd"
passwd.write_text(
"\n".join(
[
"root:x:0:0:root:/root:/bin/bash",
"nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin",
"alice:x:1000:1000:Alice:/home/alice:/bin/bash",
"sysuser:x:200:200:Sys:/home/sys:/bin/bash",
"bob:x:1001:1000:Bob:/home/bob:/bin/false",
"",
]
),
encoding="utf-8",
)
group = tmp_path / "group"
group.write_text(
"\n".join(
[
"users:x:1000:alice,bob",
"admins:x:1002:alice",
"",
]
),
encoding="utf-8",
)
defs = tmp_path / "login.defs"
defs.write_text("UID_MIN 1000\n", encoding="utf-8")
monkeypatch.setattr(
a, "parse_login_defs", lambda path=str(defs): orig_parse_login_defs(path)
)
monkeypatch.setattr(
a, "parse_passwd", lambda path=str(passwd): orig_parse_passwd(path)
)
monkeypatch.setattr(
a, "parse_group", lambda path=str(group): orig_parse_group(path)
)
# Use a stable fake ssh discovery.
monkeypatch.setattr(
a, "find_user_ssh_files", lambda home: [f"{home}/.ssh/authorized_keys"]
)
users = a.collect_non_system_users()
assert [u.name for u in users] == ["alice"]
u = users[0]
assert u.primary_group == "users"
assert u.supplementary_groups == ["admins"]
assert u.ssh_files == ["/home/alice/.ssh/authorized_keys"]
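
The behaviour that `test_parse_login_defs_parses_known_keys` pins down (comments ignored, unknown keys dropped, non-integer values skipped without overwriting an earlier valid value) could be met by a parser along these lines. This is a hypothetical sketch inferred only from the test's assertions; it is not the actual `enroll.accounts.parse_login_defs`, and `KNOWN_KEYS` and `parse_login_defs_text` are invented names.

```python
from typing import Dict

# Assumption: the set of keys is inferred from the test's assertions.
KNOWN_KEYS = {"UID_MIN", "UID_MAX", "SYS_UID_MIN", "SYS_UID_MAX"}

SAMPLE_LOGIN_DEFS = """
# comment
UID_MIN 1000
UID_MAX 60000
SYS_UID_MIN 100
SYS_UID_MAX 999
UID_MIN not_an_int
OTHER 123
"""


def parse_login_defs_text(text: str) -> Dict[str, int]:
    """Parse 'KEY value' lines, keeping known keys with integer values."""
    vals: Dict[str, int] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2 or parts[0] not in KNOWN_KEYS:
            continue
        try:
            vals[parts[0]] = int(parts[1])
        except ValueError:
            # Leave any earlier valid value for this key untouched.
            continue
    return vals


print(parse_login_defs_text(SAMPLE_LOGIN_DEFS))
```

The sketch takes text rather than a file path so that it runs without touching `/etc/login.defs`.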


@ -0,0 +1,33 @@
from __future__ import annotations
import os
from pathlib import Path
import pytest
def test_ensure_dir_secure_refuses_symlink(tmp_path: Path):
from enroll.cache import _ensure_dir_secure
target = tmp_path / "target"
target.mkdir()
link = tmp_path / "link"
link.symlink_to(target, target_is_directory=True)
with pytest.raises(RuntimeError):
_ensure_dir_secure(link)
def test_ensure_dir_secure_ignores_chmod_failures(tmp_path: Path, monkeypatch):
from enroll.cache import _ensure_dir_secure
d = tmp_path / "d"
def boom(_path: str, _mode: int):
raise OSError("no")
monkeypatch.setattr(os, "chmod", boom)
# Should not raise.
_ensure_dir_secure(d)
assert d.exists() and d.is_dir()
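
The two cache tests above describe a contract: refuse a symlinked directory, create the directory if needed, and tolerate a failing `chmod`. One way to satisfy that contract is sketched below. It is a hypothetical stand-in, not the real `enroll.cache._ensure_dir_secure`; in particular the `0o700` mode is an assumption, since the diff does not show which permissions enroll applies.

```python
import os
import tempfile
from pathlib import Path


def ensure_dir_secure(path: Path) -> None:
    """Create a private directory, refusing to follow a symlink."""
    if path.is_symlink():
        raise RuntimeError(f"refusing to use symlinked directory: {path}")
    path.mkdir(parents=True, exist_ok=True)
    try:
        # Assumed mode: owner-only access.
        os.chmod(path, 0o700)
    except OSError:
        # Best effort only; some filesystems do not support chmod.
        pass


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    d = base / "cache"
    ensure_dir_secure(d)
    created = d.is_dir()

    link = base / "link"
    link.symlink_to(d, target_is_directory=True)
    try:
        ensure_dir_secure(link)
        refused = False
    except RuntimeError:
        refused = True

print(created, refused)
```

Checking `is_symlink()` before `mkdir` matters: with `exist_ok=True`, `mkdir` would otherwise silently accept a link that points at an existing directory.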


@ -1,14 +1,29 @@
from __future__ import annotations
import sys
import pytest
import enroll.cli as cli
from pathlib import Path
from enroll.remote import RemoteSudoPasswordRequired
from enroll.sopsutil import SopsError
def test_cli_harvest_subcommand_calls_harvest(monkeypatch, capsys, tmp_path):
called = {}
def fake_harvest(out: str, dangerous: bool = False):
def fake_harvest(
out: str,
dangerous: bool = False,
include_paths=None,
exclude_paths=None,
**_kwargs,
):
called["out"] = out
called["dangerous"] = dangerous
called["include_paths"] = include_paths or []
called["exclude_paths"] = exclude_paths or []
return str(tmp_path / "state.json")
monkeypatch.setattr(cli, "harvest", fake_harvest)
@ -17,6 +32,8 @@ def test_cli_harvest_subcommand_calls_harvest(monkeypatch, capsys, tmp_path):
cli.main()
assert called["out"] == str(tmp_path)
assert called["dangerous"] is False
assert called["include_paths"] == []
assert called["exclude_paths"] == []
captured = capsys.readouterr()
assert str(tmp_path / "state.json") in captured.out
@@ -55,8 +72,16 @@ def test_cli_manifest_subcommand_calls_manifest(monkeypatch, tmp_path):
def test_cli_enroll_subcommand_runs_harvest_then_manifest(monkeypatch, tmp_path):
    calls = []

    def fake_harvest(bundle_dir: str, dangerous: bool = False):
        calls.append(("harvest", bundle_dir, dangerous))
    def fake_harvest(
        bundle_dir: str,
        dangerous: bool = False,
        include_paths=None,
        exclude_paths=None,
        **_kwargs,
    ):
        calls.append(
            ("harvest", bundle_dir, dangerous, include_paths or [], exclude_paths or [])
        )
        return str(tmp_path / "bundle" / "state.json")

    def fake_manifest(bundle_dir: str, out_dir: str, **kwargs):
@@ -87,7 +112,7 @@ def test_cli_enroll_subcommand_runs_harvest_then_manifest(monkeypatch, tmp_path)
    cli.main()

    assert calls == [
        ("harvest", str(tmp_path / "bundle"), False),
        ("harvest", str(tmp_path / "bundle"), False, [], []),
        ("manifest", str(tmp_path / "bundle"), str(tmp_path / "ansible"), None, "auto"),
    ]
@@ -95,9 +120,17 @@ def test_cli_enroll_subcommand_runs_harvest_then_manifest(monkeypatch, tmp_path)
def test_cli_harvest_dangerous_flag_is_forwarded(monkeypatch, tmp_path):
    called = {}

    def fake_harvest(out: str, dangerous: bool = False):
    def fake_harvest(
        out: str,
        dangerous: bool = False,
        include_paths=None,
        exclude_paths=None,
        **_kwargs,
    ):
        called["out"] = out
        called["dangerous"] = dangerous
        called["include_paths"] = include_paths or []
        called["exclude_paths"] = exclude_paths or []
        return str(tmp_path / "state.json")

    monkeypatch.setattr(cli, "harvest", fake_harvest)
@@ -107,6 +140,8 @@ def test_cli_harvest_dangerous_flag_is_forwarded(monkeypatch, tmp_path):
    cli.main()

    assert called["dangerous"] is True
    assert called["include_paths"] == []
    assert called["exclude_paths"] == []


def test_cli_harvest_remote_calls_remote_harvest_and_uses_cache_dir(
@@ -131,6 +166,9 @@ def test_cli_harvest_remote_calls_remote_harvest_and_uses_cache_dir(
        remote_user,
        dangerous,
        no_sudo,
        include_paths=None,
        exclude_paths=None,
        **_kwargs,
    ):
        called.update(
            {
@@ -140,6 +178,8 @@ def test_cli_harvest_remote_calls_remote_harvest_and_uses_cache_dir(
                "remote_user": remote_user,
                "dangerous": dangerous,
                "no_sudo": no_sudo,
                "include_paths": include_paths or [],
                "exclude_paths": exclude_paths or [],
            }
        )
        return cache_dir / "state.json"
@@ -169,6 +209,8 @@ def test_cli_harvest_remote_calls_remote_harvest_and_uses_cache_dir(
    assert called["remote_user"] == "alice"
    assert called["dangerous"] is False
    assert called["no_sudo"] is False
    assert called["include_paths"] == []
    assert called["exclude_paths"] == []


def test_cli_single_shot_remote_without_harvest_prints_state_path(
@@ -223,6 +265,113 @@ def test_cli_single_shot_remote_without_harvest_prints_state_path(
    assert ("manifest", str(cache_dir), str(ansible_dir), "example.test") in calls


def test_cli_harvest_remote_ask_become_pass_prompts_and_passes_password(
    monkeypatch, tmp_path
):
    from enroll.cache import HarvestCache
    import enroll.remote as r

    cache_dir = tmp_path / "cache"
    cache_dir.mkdir()
    called = {}

    def fake_cache_dir(*, hint=None):
        return HarvestCache(dir=cache_dir)

    def fake__remote_harvest(*, sudo_password=None, **kwargs):
        called["sudo_password"] = sudo_password
        return cache_dir / "state.json"

    monkeypatch.setattr(cli, "new_harvest_cache_dir", fake_cache_dir)
    monkeypatch.setattr(r, "_remote_harvest", fake__remote_harvest)
    monkeypatch.setattr(r.getpass, "getpass", lambda _prompt="": "pw123")
    monkeypatch.setattr(
        sys,
        "argv",
        [
            "enroll",
            "harvest",
            "--remote-host",
            "example.test",
            "--ask-become-pass",
        ],
    )

    cli.main()
    assert called["sudo_password"] == "pw123"
def test_cli_harvest_remote_password_required_fallback_prompts_and_retries(
    monkeypatch, tmp_path
):
    from enroll.cache import HarvestCache
    import enroll.remote as r

    cache_dir = tmp_path / "cache"
    cache_dir.mkdir()

    def fake_cache_dir(*, hint=None):
        return HarvestCache(dir=cache_dir)

    calls = []

    def fake__remote_harvest(*, sudo_password=None, **kwargs):
        calls.append(sudo_password)
        if sudo_password is None:
            raise r.RemoteSudoPasswordRequired("pw required")
        return cache_dir / "state.json"

    class _TTYStdin:
        def isatty(self):
            return True

    monkeypatch.setattr(cli, "new_harvest_cache_dir", fake_cache_dir)
    monkeypatch.setattr(r, "_remote_harvest", fake__remote_harvest)
    monkeypatch.setattr(r.getpass, "getpass", lambda _prompt="": "pw456")
    monkeypatch.setattr(sys, "stdin", _TTYStdin())
    monkeypatch.setattr(
        sys, "argv", ["enroll", "harvest", "--remote-host", "example.test"]
    )

    cli.main()
    assert calls == [None, "pw456"]
def test_cli_harvest_remote_password_required_noninteractive_errors(
    monkeypatch, tmp_path
):
    from enroll.cache import HarvestCache
    import enroll.remote as r

    cache_dir = tmp_path / "cache"
    cache_dir.mkdir()

    def fake_cache_dir(*, hint=None):
        return HarvestCache(dir=cache_dir)

    def fake__remote_harvest(*, sudo_password=None, **kwargs):
        raise r.RemoteSudoPasswordRequired("pw required")

    class _NoTTYStdin:
        def isatty(self):
            return False

    monkeypatch.setattr(cli, "new_harvest_cache_dir", fake_cache_dir)
    monkeypatch.setattr(r, "_remote_harvest", fake__remote_harvest)
    monkeypatch.setattr(sys, "stdin", _NoTTYStdin())
    monkeypatch.setattr(
        sys, "argv", ["enroll", "harvest", "--remote-host", "example.test"]
    )

    with pytest.raises(SystemExit) as e:
        cli.main()
    assert "--ask-become-pass" in str(e.value)


def test_cli_manifest_common_args(monkeypatch, tmp_path):
    """Ensure --fqdn and jinjaturtle mode flags are forwarded correctly."""
@@ -254,3 +403,286 @@ def test_cli_manifest_common_args(monkeypatch, tmp_path):
    cli.main()
    assert called["fqdn"] == "example.test"
    assert called["jinjaturtle"] == "off"


def test_cli_explain_passes_args_and_writes_stdout(monkeypatch, capsys, tmp_path):
    called = {}

    def fake_explain_state(
        harvest: str,
        *,
        sops_mode: bool = False,
        fmt: str = "text",
        max_examples: int = 3,
    ):
        called["harvest"] = harvest
        called["sops_mode"] = sops_mode
        called["fmt"] = fmt
        called["max_examples"] = max_examples
        return "EXPLAINED\n"

    monkeypatch.setattr(cli, "explain_state", fake_explain_state)
    monkeypatch.setattr(
        sys,
        "argv",
        [
            "enroll",
            "explain",
            "--sops",
            "--format",
            "json",
            "--max-examples",
            "7",
            str(tmp_path / "bundle" / "state.json"),
        ],
    )

    cli.main()
    out = capsys.readouterr().out
    assert out == "EXPLAINED\n"
    assert called["sops_mode"] is True
    assert called["fmt"] == "json"
    assert called["max_examples"] == 7


def test_discover_config_path_missing_config_value_returns_none(monkeypatch):
    # Covers the "--config" flag present with no value.
    monkeypatch.delenv("ENROLL_CONFIG", raising=False)
    monkeypatch.delenv("XDG_CONFIG_HOME", raising=False)
    assert cli._discover_config_path(["--config"]) is None


def test_discover_config_path_defaults_to_home_config(monkeypatch, tmp_path: Path):
    # Covers the Path.home() / ".config" fallback.
    monkeypatch.delenv("ENROLL_CONFIG", raising=False)
    monkeypatch.delenv("XDG_CONFIG_HOME", raising=False)
    monkeypatch.setattr(cli.Path, "home", lambda: tmp_path)
    monkeypatch.setattr(cli.Path, "cwd", lambda: tmp_path)

    cp = tmp_path / ".config" / "enroll" / "enroll.ini"
    cp.parent.mkdir(parents=True)
    cp.write_text("[enroll]\n", encoding="utf-8")

    assert cli._discover_config_path(["harvest"]) == cp
def test_cli_harvest_local_sops_encrypts_and_prints_path(
    monkeypatch, tmp_path: Path, capsys
):
    out_dir = tmp_path / "out"
    out_dir.mkdir()
    calls: dict[str, object] = {}

    def fake_harvest(bundle_dir: str, **kwargs):
        calls["bundle"] = bundle_dir
        # Create a minimal state.json so tooling that expects it won't break.
        Path(bundle_dir).mkdir(parents=True, exist_ok=True)
        (Path(bundle_dir) / "state.json").write_text("{}", encoding="utf-8")
        return str(Path(bundle_dir) / "state.json")

    def fake_encrypt(bundle_dir: Path, out_file: Path, fps: list[str]):
        calls["encrypt"] = (bundle_dir, out_file, fps)
        out_file.write_text("encrypted", encoding="utf-8")
        return out_file

    monkeypatch.setattr(cli, "harvest", fake_harvest)
    monkeypatch.setattr(cli, "_encrypt_harvest_dir_to_sops", fake_encrypt)
    monkeypatch.setattr(
        sys,
        "argv",
        [
            "enroll",
            "harvest",
            "--sops",
            "ABCDEF",
            "--out",
            str(out_dir),
        ],
    )

    cli.main()
    printed = capsys.readouterr().out.strip()
    assert printed.endswith("harvest.tar.gz.sops")
    assert Path(printed).exists()
    assert calls.get("encrypt")


def test_cli_harvest_remote_sops_encrypts_and_prints_path(
    monkeypatch, tmp_path: Path, capsys
):
    out_dir = tmp_path / "out"
    out_dir.mkdir()
    calls: dict[str, object] = {}

    def fake_remote_harvest(**kwargs):
        calls["remote"] = kwargs
        # Create a minimal state.json in the temp bundle.
        out = Path(kwargs["local_out_dir"]) / "state.json"
        out.write_text("{}", encoding="utf-8")
        return out

    def fake_encrypt(bundle_dir: Path, out_file: Path, fps: list[str]):
        calls["encrypt"] = (bundle_dir, out_file, fps)
        out_file.write_text("encrypted", encoding="utf-8")
        return out_file

    monkeypatch.setattr(cli, "remote_harvest", fake_remote_harvest)
    monkeypatch.setattr(cli, "_encrypt_harvest_dir_to_sops", fake_encrypt)
    monkeypatch.setattr(
        sys,
        "argv",
        [
            "enroll",
            "harvest",
            "--remote-host",
            "example.com",
            "--remote-user",
            "root",
            "--sops",
            "ABCDEF",
            "--out",
            str(out_dir),
        ],
    )

    cli.main()
    printed = capsys.readouterr().out.strip()
    assert printed.endswith("harvest.tar.gz.sops")
    assert Path(printed).exists()
    assert calls.get("remote")
    assert calls.get("encrypt")
def test_cli_harvest_remote_password_required_exits_cleanly(monkeypatch):
    def boom(**kwargs):
        raise RemoteSudoPasswordRequired("pw required")

    monkeypatch.setattr(cli, "remote_harvest", boom)
    monkeypatch.setattr(
        sys,
        "argv",
        [
            "enroll",
            "harvest",
            "--remote-host",
            "example.com",
            "--remote-user",
            "root",
        ],
    )

    with pytest.raises(SystemExit) as e:
        cli.main()
    assert "--ask-become-pass" in str(e.value)


def test_cli_runtime_error_is_wrapped_as_user_friendly_system_exit(monkeypatch):
    def boom(*args, **kwargs):
        raise RuntimeError("nope")

    monkeypatch.setattr(cli, "harvest", boom)
    monkeypatch.setattr(sys, "argv", ["enroll", "harvest", "--out", "/tmp/x"])

    with pytest.raises(SystemExit) as e:
        cli.main()
    assert str(e.value) == "error: nope"


def test_cli_sops_error_is_wrapped_as_user_friendly_system_exit(monkeypatch):
    def boom(*args, **kwargs):
        raise SopsError("sops broke")

    monkeypatch.setattr(cli, "manifest", boom)
    monkeypatch.setattr(
        sys, "argv", ["enroll", "manifest", "--harvest", "/tmp/x", "--out", "/tmp/y"]
    )

    with pytest.raises(SystemExit) as e:
        cli.main()
    assert str(e.value) == "error: sops broke"
def test_cli_diff_notifies_webhook_and_email_and_respects_exit_code(
    monkeypatch, capsys
):
    calls: dict[str, object] = {}

    def fake_compare(old, new, sops_mode=False, **kwargs):
        calls["compare"] = (old, new, sops_mode)
        return {"dummy": True}, True

    def fake_format(report, fmt="text"):
        calls.setdefault("format", []).append((report, fmt))
        return "REPORT\n"

    def fake_post(url, body, headers=None):
        calls["webhook"] = (url, body, headers)
        return 200, b"ok"

    def fake_email(**kwargs):
        calls["email"] = kwargs

    monkeypatch.setattr(cli, "compare_harvests", fake_compare)
    monkeypatch.setattr(cli, "format_report", fake_format)
    monkeypatch.setattr(cli, "post_webhook", fake_post)
    monkeypatch.setattr(cli, "send_email", fake_email)
    monkeypatch.setenv("SMTPPW", "secret")
    monkeypatch.setattr(
        sys,
        "argv",
        [
            "enroll",
            "diff",
            "--old",
            "/tmp/old",
            "--new",
            "/tmp/new",
            "--webhook",
            "https://example.invalid/h",
            "--webhook-header",
            "X-Test: ok",
            "--email-to",
            "a@example.com",
            "--smtp-password-env",
            "SMTPPW",
            "--exit-code",
        ],
    )

    with pytest.raises(SystemExit) as e:
        cli.main()
    assert e.value.code == 2
    assert calls.get("compare")
    assert calls.get("webhook")
    assert calls.get("email")
    # No report printed when exiting via --exit-code? (we still render and print).
    _ = capsys.readouterr()


def test_cli_diff_webhook_http_error_raises_system_exit(monkeypatch):
    def fake_compare(old, new, sops_mode=False, **kwargs):
        return {"dummy": True}, True

    monkeypatch.setattr(cli, "compare_harvests", fake_compare)
    monkeypatch.setattr(cli, "format_report", lambda report, fmt="text": "R\n")
    monkeypatch.setattr(cli, "post_webhook", lambda url, body, headers=None: (500, b""))
    monkeypatch.setattr(
        sys,
        "argv",
        [
            "enroll",
            "diff",
            "--old",
            "/tmp/old",
            "--new",
            "/tmp/new",
            "--webhook",
            "https://example.invalid/h",