Miguel Jacq edited this page 2025-12-17 22:38:14 -06:00

enroll harvest

Harvest system/service/package/config/user state from a Debian host into a "harvest bundle" (state.json plus harvested file artifacts).


Synopsis

enroll harvest [--out <DIR|FILE>] [--dangerous] [--sops <GPG_FPR...>] [--remote-host <HOST> [--remote-user <USER>] [--remote-port <PORT>] [--no-sudo]]

What it produces

A harvest bundle always contains:

  • state.json — structured snapshot of discovered services/packages/users/files
  • artifacts/ — file copies referenced from state.json (per-role subtrees)
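
A quick sanity check on a plain bundle can be sketched in shell. The helper name is_harvest_bundle is hypothetical and not part of enroll; it only tests for the two entries listed above and does not validate the contents of state.json.

```shell
# Hypothetical helper (not part of enroll): succeeds if DIR has the
# plain-bundle layout described above, i.e. a state.json file and an
# artifacts/ directory.
is_harvest_bundle() {
    dir=$1
    [ -f "$dir/state.json" ] && [ -d "$dir/artifacts" ]
}

# Usage sketch:
#   is_harvest_bundle /path/to/harvest-dir && echo "looks like a harvest bundle"
```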

Output formats

Plain (directory)

  • Output is a directory containing state.json and artifacts/.

SOPS (single encrypted file)

  • Output is a single SOPS-encrypted tarball: harvest.tar.gz.sops.
  • Internally, it contains the same layout (state.json, artifacts/), just bundled and encrypted.

Options

--out <path>

Where to write the harvest output.

Behavior depends on whether you're in plain or SOPS mode:

  • Plain mode (no --sops)

    • --out is a directory.
    • Required for local harvests.
    • Optional for remote harvests (see "Cache defaults" below).
  • SOPS mode (--sops ...)

    • --out may be:
      • a directory - the file harvest.tar.gz.sops is created inside it
      • a file path - that exact file is written
    • If omitted, enroll writes into a secure per-user cache dir (see below).
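
The directory-versus-file rule for SOPS mode can be illustrated with a small shell function. This is a sketch under an assumption: it treats --out as a directory when the path already exists as a directory or is written with a trailing slash. How enroll itself makes that decision is not stated here, and sops_out_path is a hypothetical name.

```shell
# Hypothetical helper: print where the encrypted bundle would land for a
# given --out value in SOPS mode.
# Assumption: "directory" means an existing directory or a path with a
# trailing slash; enroll's own test may differ.
sops_out_path() {
    out=$1
    case $out in
        */)
            printf '%s\n' "${out%/}/harvest.tar.gz.sops"
            return
            ;;
    esac
    if [ -d "$out" ]; then
        printf '%s\n' "$out/harvest.tar.gz.sops"
    else
        printf '%s\n' "$out"
    fi
}
```

For example, sops_out_path /path/to/output-dir/ prints /path/to/output-dir/harvest.tar.gz.sops, while a path that is not a directory is printed unchanged.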

--dangerous

Harvest files more aggressively.

This disables the built-in "likely secret" safety checks, including:

  • denylisted paths (e.g. /etc/shadow, /etc/ssl/private/*, /etc/ssh/ssh_host_*, /etc/letsencrypt/*)
  • heuristic content scanning for common secret patterns (PRIVATE KEY, "password=", "token", "secret", etc.)
  • some other conservative skipping logic

Use with care, especially in plaintext mode.
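
To make the two kinds of check concrete, here is a rough shell approximation. It is an illustration only: the path globs and patterns are copied from the examples in the list above, the real denylist and heuristics are likely broader, and looks_secret is a hypothetical name.

```shell
# Illustration only: approximates the "likely secret" checks using the
# example paths and patterns listed above. Not enroll's implementation.
looks_secret() {
    path=$1
    # 1. Denylisted paths (shell globs mirroring the examples).
    case $path in
        /etc/shadow|/etc/ssl/private/*|/etc/ssh/ssh_host_*|/etc/letsencrypt/*)
            return 0
            ;;
    esac
    # 2. Case-insensitive content scan, only if the file is readable.
    [ -r "$path" ] && grep -Eiq 'PRIVATE KEY|password=|token|secret' "$path"
}

# Usage sketch:
#   looks_secret /etc/myapp/app.conf && echo "would be skipped without --dangerous"
```

A scan this crude produces false positives (any file mentioning "token" matches), which is the conservative direction for a default.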

--sops <GPG_FINGERPRINT...>

Encrypt the harvest output as a SOPS-encrypted tarball.

  • Provide one or more GPG fingerprints.
  • Requires sops available on PATH.
  • Output becomes a single file: harvest.tar.gz.sops (unless you choose another filename via --out).
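
Because sops must be on PATH, a wrapper script may want to check for it before starting a harvest. A minimal sketch, where have_tool is a hypothetical helper name:

```shell
# Succeeds if the named tool resolves on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Usage sketch: stop early rather than fail partway through.
#   have_tool sops || { echo "sops not found on PATH" >&2; exit 1; }
```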

Remote harvesting

--remote-host <host>

Run the harvest on a remote host over SSH and pull the results locally.

When --remote-host is set:

  • the harvest is executed on the remote machine
  • the bundle is written locally (in --out or the cache directory)

--remote-user <user>

SSH username. Default is the local $USER.

--remote-port <port>

SSH port. Default is 22.

--no-sudo

Don't use sudo on the remote host.

This may cause a partial harvest (missing files/metadata) if the SSH user can't read everything.
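
The defaults for the remote options can be summarised in a small shell function. This is only a sketch of the documented defaults (local $USER, port 22); remote_target is a hypothetical name, and the fallback to id -un when $USER is unset is an assumption added for robustness.

```shell
# Hypothetical helper: print "user@host:port" after applying the
# documented defaults. Arguments: host [user] [port].
remote_target() {
    host=$1
    # Assumption: fall back to `id -un` if $USER is not set.
    user=${2:-${USER:-$(id -un)}}
    port=${3:-22}
    printf '%s@%s:%s\n' "$user" "$host" "$port"
}
```

For example, remote_target host.example.com alice 2222 prints alice@host.example.com:2222.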


Cache defaults (when --out is omitted)

enroll has a "secure cache" feature for harvest output, but it only applies in specific cases:

  • Remote harvest (plain mode): if --out is omitted, a cache dir is created and used.
  • Any harvest with --sops (local or remote): if --out is omitted, a cache dir is created and used.

The cache base is:

  • $XDG_CACHE_HOME/enroll/harvest/ if XDG_CACHE_HOME is set
  • otherwise ~/.local/cache/enroll/harvest/

Each run gets a timestamped directory with an unpredictable suffix, e.g. ~/.local/cache/enroll/harvest/20251218-...-<random>/.
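
To locate cached harvests from a script, the two rules above can be written as a shell function. A sketch, with one assumption: an empty XDG_CACHE_HOME is treated the same as an unset one. harvest_cache_base is a hypothetical name.

```shell
# Print the cache base following the two rules above.
# Assumption: an empty XDG_CACHE_HOME counts as "not set".
harvest_cache_base() {
    if [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME/enroll/harvest"
    else
        printf '%s\n' "$HOME/.local/cache/enroll/harvest"
    fi
}

# Usage sketch: list the per-run directories.
#   ls -1 "$(harvest_cache_base)"
```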

Note: Local plaintext harvests require --out.
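
Taken together, the rules for where output goes reduce to a three-way decision. The function below is a sketch of that decision table only; harvest_destination and its yes/no arguments are hypothetical and do not correspond to anything in enroll's interface.

```shell
# Hypothetical helper encoding the rules above.
# Arguments: remote (yes|no), sops (yes|no), out (path, may be empty).
# Prints "out" (explicit --out), "cache" (secure cache dir), or
# "error" (local plaintext harvest without --out).
harvest_destination() {
    remote=$1
    sops=$2
    out=${3:-}
    if [ -n "$out" ]; then
        echo out
    elif [ "$remote" = yes ] || [ "$sops" = yes ]; then
        echo cache
    else
        echo error
    fi
}
```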


Runtime notes / expectations

  • Root recommended: If not running as root (or if remote sudo is disabled), enroll may miss files or fail to capture correct ownership/mode metadata.
  • Symlinks/binaries/large files: Harvesting skips files that are symlinks, "binary-like", or above a size cap (unless you use --dangerous, which relaxes some checks).
  • Output is deterministic enough to diff: The bundle is designed so that comparing two harvests is meaningful (via enroll diff).
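
The skip rules for symlinks, binary-like files, and large files can be approximated in shell to predict why a file might be missing from a bundle. Everything here is an assumption made for illustration: MAX_BYTES is a made-up cap (no actual value is stated above), "binary-like" is approximated with grep -I (GNU grep behaviour), and skip_reason is a hypothetical name.

```shell
# Illustration only. MAX_BYTES is a made-up value; the real cap is not
# stated in this page.
MAX_BYTES=${MAX_BYTES:-262144}

# Print why a file would be skipped, or "keep".
skip_reason() {
    f=$1
    if [ -L "$f" ]; then
        echo symlink
        return
    fi
    # Arithmetic expansion strips any padding some wc builds emit.
    size=$(( $(wc -c < "$f") ))
    if [ "$size" -gt "$MAX_BYTES" ]; then
        echo too-large
        return
    fi
    # With -I, grep treats binary files as containing no matches.
    if [ "$size" -gt 0 ] && ! grep -Iq '' "$f"; then
        echo binary-like
        return
    fi
    echo keep
}
```

The symlink test runs first so that a dangling link is reported as a symlink and never passed to wc.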

Permutations (valid combinations)

The common flag combinations are listed below.

Local harvest, plaintext (safe)

enroll harvest --out /path/to/harvest-dir

Local harvest, plaintext (--dangerous)

enroll harvest --out /path/to/harvest-dir --dangerous

Local harvest, SOPS-encrypted (safe)

--out may be a dir:

enroll harvest --sops <FPR1> --out /path/to/output-dir
# writes /path/to/output-dir/harvest.tar.gz.sops

…or a file:

enroll harvest --sops <FPR1> --out /path/to/harvest.tar.gz.sops

If you omit --out, it writes into the per-user cache:

enroll harvest --sops <FPR1>

Local harvest, SOPS-encrypted (--dangerous)

enroll harvest --dangerous --sops <FPR1> --out /path/to/output-dir

Remote harvest, plaintext (safe)

With explicit output dir:

enroll harvest --remote-host host.example.com --out /path/to/harvest-dir

Using the cache (omit --out):

enroll harvest --remote-host host.example.com

Remote harvest, plaintext (--dangerous)

enroll harvest --remote-host host.example.com --out /path/to/harvest-dir --dangerous

Remote harvest, plaintext without sudo

enroll harvest --remote-host host.example.com --out /path/to/harvest-dir --no-sudo

Remote harvest, SOPS-encrypted (safe)

enroll harvest --remote-host host.example.com --sops <FPR1> --out /path/to/output-dir
# writes /path/to/output-dir/harvest.tar.gz.sops

Remote harvest, SOPS-encrypted (--dangerous)

enroll harvest --remote-host host.example.com --dangerous --sops <FPR1> --out /path/to/output-dir

Remote harvest, SOPS-encrypted without sudo

enroll harvest --remote-host host.example.com --no-sudo --sops <FPR1> --out /path/to/output-dir