Add Technical_Decomp_PathFilter

Miguel Jacq 2025-12-27 20:51:49 -06:00
parent f605f4a979
commit 7a1e120365

@ -0,0 +1,53 @@
## enroll/pathfilter.py
### CompiledPathPattern (frozen dataclass)
#### Purpose: a compiled representation of one include/exclude pattern.
#### Fields:
- raw: original input string
- kind: "prefix" | "glob" | "regex"
- value: normalized pattern or regex string
- regex: compiled re.Pattern when kind == "regex", else None
#### Method:
##### matches(path: str) -> bool
- Normalizes path to an absolute POSIX-style path (anchored at /, then passed through normpath).
- For regex: uses regex.search() rather than regex.match(), so a pattern can match anywhere in the path without being anchored.
- For glob: uses PurePosixPath.match() to support segment-aware patterns and **.
- For prefix: matches the prefix path itself or any path in the subtree beneath it.
##### Important edge behavior:
- If regex compilation fails, regex is None and the pattern never matches any path (fail closed).
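
The matching rules above can be sketched roughly as follows. This is a minimal illustration, not the code in enroll/pathfilter.py: the `_normalize` helper is a hypothetical name, the exact normalization steps are an assumption, and the sketch does not try to reproduce whatever `**` handling the real module provides.

```python
from __future__ import annotations

import posixpath
import re
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


def _normalize(path: str) -> str:
    """Hypothetical helper: anchor at / and collapse '.', '..' and extra slashes."""
    # Stripping leading slashes first avoids normpath preserving a leading '//'.
    return posixpath.normpath("/" + path.strip().lstrip("/"))


@dataclass(frozen=True)
class CompiledPathPattern:
    raw: str                            # original input string
    kind: str                           # "prefix" | "glob" | "regex"
    value: str                          # normalized pattern or regex string
    regex: Optional[re.Pattern] = None  # only set when kind == "regex"

    def matches(self, path: str) -> bool:
        p = _normalize(path)
        if self.kind == "regex":
            # regex is None when compilation failed, so the pattern never matches.
            return self.regex is not None and self.regex.search(p) is not None
        if self.kind == "glob":
            # Segment-aware matching; a relative pattern matches from the right.
            return PurePosixPath(p).match(self.value)
        # prefix: the path itself, or anything in the subtree beneath it.
        prefix = self.value.rstrip("/") or "/"
        return prefix == "/" or p == prefix or p.startswith(prefix + "/")
```

Comparing against `prefix + "/"` rather than the bare prefix is what keeps `/etc/ssh` from matching a sibling such as `/etc/sshd`.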
### PathFilter (dataclass)
#### Purpose: holds user-provided --include-path and --exclude-path patterns and applies precedence rules.
#### Fields:
- include: Sequence[str]
- exclude: Sequence[str]
#### Internal state created in __post_init__:
- _include: list of CompiledPathPattern built from include
- _exclude: list of CompiledPathPattern built from exclude
#### Methods:
##### is_excluded(path: str) -> bool
Returns True if any exclude pattern matches path. Excludes always take precedence over includes.
##### iter_include_patterns() -> List[CompiledPathPattern]
Returns the compiled include patterns (used by expand_includes()).
##### Key semantic note:
- Include patterns do not restrict the default harvest set.
- They only add extra files to harvest under the extra_paths role.
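
The precedence rules can be illustrated with a small self-contained sketch. Several things here are assumptions made for brevity: every pattern is treated as a plain prefix (the real class compiles each one into a CompiledPathPattern), and `_under` and `plan_harvest` are hypothetical names that do not come from the module.

```python
import posixpath
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


def _under(prefix: str, path: str) -> bool:
    """Stand-in matcher: the prefix path itself or anything beneath it."""
    p = posixpath.normpath("/" + path.lstrip("/"))
    pre = posixpath.normpath("/" + prefix.lstrip("/"))
    return pre == "/" or p == pre or p.startswith(pre + "/")


@dataclass
class PathFilter:
    include: Sequence[str] = ()
    exclude: Sequence[str] = ()
    _include: List[str] = field(init=False, default_factory=list)
    _exclude: List[str] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        # The real class would compile each string here; this sketch keeps them raw.
        self._include = list(self.include)
        self._exclude = list(self.exclude)

    def is_excluded(self, path: str) -> bool:
        return any(_under(pat, path) for pat in self._exclude)

    def iter_include_patterns(self) -> List[str]:
        return list(self._include)


def plan_harvest(
    default_paths: Sequence[str], candidates: Sequence[str], flt: PathFilter
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: split paths into (default set, extra_paths additions)."""
    # Includes never shrink the default set; only excludes remove entries from it.
    kept = [p for p in default_paths if not flt.is_excluded(p)]
    # Includes only add paths, and an exclude still wins over a matching include.
    extra = [
        p for p in candidates
        if p not in default_paths
        and not flt.is_excluded(p)
        and any(_under(pat, p) for pat in flt.iter_include_patterns())
    ]
    return kept, extra


flt = PathFilter(include=["/opt/app"], exclude=["/opt/app/secrets", "/etc/shadow"])
kept, extra = plan_harvest(
    ["/etc/hosts", "/etc/shadow"],
    ["/opt/app/config.yml", "/opt/app/secrets/key.pem", "/var/log/syslog"],
    flt,
)
print(kept, extra)
```

In this example `/etc/hosts` stays in the default set even though it matches no include pattern, while `/opt/app/secrets/key.pem` is dropped despite falling under an included prefix.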