Technical_Decomp_PathFilter
Miguel Jacq edited this page 2025-12-27 20:51:49 -06:00

enroll/pathfilter.py

CompiledPathPattern (frozen dataclass)

Purpose: a compiled representation of one include/exclude pattern.

Fields:

  • raw: original input string
  • kind: 'prefix' | 'glob' | 'regex'
  • value: normalized pattern or regex string
  • regex: compiled re.Pattern when kind == "regex", else None

Method:

matches(path: str) -> bool
  • Normalizes path to an absolute POSIX-style path: anchored at / and collapsed with normpath.
  • For regex: uses regex.search() (not match) so patterns can be unanchored.
  • For glob: uses PurePosixPath.match() to support segment-aware patterns and **.
  • For prefix: matches the exact path or any path in the subtree under the prefix directory.

Important edge behavior:
  • If regex compilation fails, regex is None and the pattern never matches (fail closed).
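
The matching rules above can be sketched roughly as follows. This is an illustrative reconstruction from the description on this page, not the code in enroll/pathfilter.py: the normalize() helper is a hypothetical name, and the details of prefix and glob handling are assumptions. The glob example sticks to single-segment wildcards.

```python
from __future__ import annotations

import posixpath
import re
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


def normalize(path: str) -> str:
    """Hypothetical helper: anchor the path at '/' and collapse '.', '..' and '//'."""
    return posixpath.normpath("/" + path.lstrip("/"))


@dataclass(frozen=True)
class CompiledPathPattern:
    raw: str                             # original input string
    kind: str                            # 'prefix' | 'glob' | 'regex'
    value: str                           # normalized pattern or regex string
    regex: Optional[re.Pattern] = None   # set only when kind == "regex"

    def matches(self, path: str) -> bool:
        p = normalize(path)
        if self.kind == "regex":
            # Fail closed: a pattern whose regex did not compile never matches.
            # search() rather than match(), so the pattern may be unanchored.
            return self.regex is not None and self.regex.search(p) is not None
        if self.kind == "glob":
            # Segment-aware matching: '*' does not cross a '/' boundary.
            return PurePosixPath(p).match(self.value)
        # prefix: the exact path, or anything in the subtree below it.
        prefix = self.value.rstrip("/")
        return p == (prefix or "/") or p.startswith(prefix + "/")


prefix_pat = CompiledPathPattern("/etc/nginx", "prefix", "/etc/nginx")
glob_pat = CompiledPathPattern("/etc/nginx/*.conf", "glob", "/etc/nginx/*.conf")
regex_pat = CompiledPathPattern(r"\.conf$", "regex", r"\.conf$", re.compile(r"\.conf$"))
broken_pat = CompiledPathPattern("(", "regex", "(", None)  # compilation failed
```

Comparing against prefix + "/" rather than the bare prefix keeps /etc/nginx from matching a sibling such as /etc/nginx-old.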

PathFilter (dataclass)

Purpose: holds user-provided --include-path and --exclude-path patterns and applies precedence rules.

Fields:

  • include: Sequence[str]
  • exclude: Sequence[str]

Internal state created in __post_init__:

  • _include: list of CompiledPathPattern built from include
  • _exclude: list of CompiledPathPattern built from exclude

Methods:

is_excluded(path: str) -> bool

Returns True if any exclude pattern matches path. Excludes always win: an excluded path is skipped even if an include pattern also matches it.

iter_include_patterns() -> List[CompiledPathPattern]

Returns the compiled include patterns (used by expand_includes()).
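
A minimal sketch of how the dataclass might be laid out, assuming the compiled lists are built in __post_init__ as described above. To stay short and self-contained it uses a prefix-only stand-in (PrefixPattern) in place of CompiledPathPattern, and compile_pattern() is a hypothetical helper name, so neither should be read as the real API.

```python
from __future__ import annotations

import posixpath
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass(frozen=True)
class PrefixPattern:
    """Stand-in for CompiledPathPattern; handles only the 'prefix' kind."""
    raw: str
    value: str

    def matches(self, path: str) -> bool:
        p = posixpath.normpath("/" + path.lstrip("/"))
        return p == self.value or p.startswith(self.value.rstrip("/") + "/")


def compile_pattern(raw: str) -> PrefixPattern:
    # Hypothetical helper: normalize the raw string into an absolute prefix.
    return PrefixPattern(raw=raw, value=posixpath.normpath("/" + raw.lstrip("/")))


@dataclass
class PathFilter:
    include: Sequence[str] = ()
    exclude: Sequence[str] = ()
    # Internal state, not accepted by __init__; filled in by __post_init__.
    _include: List[PrefixPattern] = field(init=False, repr=False, default_factory=list)
    _exclude: List[PrefixPattern] = field(init=False, repr=False, default_factory=list)

    def __post_init__(self) -> None:
        self._include = [compile_pattern(p) for p in self.include]
        self._exclude = [compile_pattern(p) for p in self.exclude]

    def is_excluded(self, path: str) -> bool:
        # True as soon as any exclude pattern matches.
        return any(p.matches(path) for p in self._exclude)

    def iter_include_patterns(self) -> List[PrefixPattern]:
        # Return a copy so callers cannot mutate the internal list.
        return list(self._include)


pf = PathFilter(include=["/home/alice/.bashrc"], exclude=["/home/alice/.ssh"])
```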

Key semantic note:
  • Include patterns do not restrict the default harvest set.
  • They only add extra files to harvest under the extra_paths role.
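
To make the precedence concrete, here is a small sketch of how a caller could combine the two rules. Everything in it is an assumption made for illustration: plan_harvest() and _under() are hypothetical names, the "default" role label is invented, only prefix patterns are handled, and the sketch assumes excludes also apply to the default harvest set. Only the extra_paths role name comes from this page.

```python
import posixpath
from typing import Dict, Iterable


def _under(path: str, prefix: str) -> bool:
    """Hypothetical prefix test: path equals prefix or lies in its subtree."""
    p = posixpath.normpath("/" + path.lstrip("/"))
    q = posixpath.normpath("/" + prefix.lstrip("/"))
    return p == q or p.startswith(q.rstrip("/") + "/")


def plan_harvest(
    default_paths: Iterable[str],
    candidates: Iterable[str],
    include: Iterable[str],
    exclude: Iterable[str],
) -> Dict[str, str]:
    """Map each harvested path to a role name."""
    include = list(include)
    exclude = list(exclude)
    roles: Dict[str, str] = {}

    # The default set is harvested whether or not an include pattern matches:
    # includes do not narrow it. Excludes still win (assumed here).
    for path in default_paths:
        if not any(_under(path, e) for e in exclude):
            roles[path] = "default"  # placeholder role name

    # Include patterns only add further files, under the extra_paths role.
    for path in candidates:
        if path in roles or any(_under(path, e) for e in exclude):
            continue
        if any(_under(path, i) for i in include):
            roles[path] = "extra_paths"
    return roles


roles = plan_harvest(
    default_paths=["/etc/ssh/sshd_config", "/etc/secret/key"],
    candidates=["/opt/app/config.yml", "/opt/app/secrets/token", "/var/tmp/x"],
    include=["/opt/app"],
    exclude=["/opt/app/secrets", "/etc/secret"],
)
```

In this example /etc/ssh/sshd_config is kept even though no include pattern mentions it, /opt/app/config.yml is added as an extra path, and both excluded subtrees are dropped.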