Compare commits


18 commits
0.1.4 ... main

Author SHA1 Message Date
8dd8c0a2be
Uncomment deb release steps
All checks were successful
CI / test (push) Successful in 41s
Lint / test (push) Successful in 30s
Trivy / test (push) Successful in 22s
2025-12-15 15:02:48 +11:00
9f9301e17e
Add Debian packages, support Ubuntu 22 via tomli
Some checks failed
CI / test (push) Successful in 46s
Trivy / test (push) Waiting to run
Lint / test (push) Has been cancelled
2025-12-15 15:01:28 +11:00
b71f41212a
README update
All checks were successful
CI / test (push) Successful in 50s
Lint / test (push) Successful in 31s
Trivy / test (push) Successful in 19s
2025-12-14 16:41:30 +11:00
57842774e3
remove venv path to filedust 2025-12-13 15:10:55 +11:00
72deb1dc1f
CI tweaks
All checks were successful
CI / test (push) Successful in 43s
Lint / test (push) Successful in 25s
Trivy / test (push) Successful in 24s
2025-12-03 18:06:32 +11:00
36682c4020
Add jinja2 dep
All checks were successful
CI / test (push) Successful in 43s
Lint / test (push) Successful in 23s
Trivy / test (push) Successful in 21s
2025-11-30 18:33:16 +11:00
40690055f3
tweak path to filedust
Some checks failed
Trivy / test (push) Waiting to run
CI / test (push) Failing after 38s
Lint / test (push) Has been cancelled
2025-11-30 18:32:32 +11:00
66eda6dae8
Bump version
Some checks failed
CI / test (push) Failing after 39s
Lint / test (push) Successful in 23s
Trivy / test (push) Has been cancelled
2025-11-30 18:31:12 +11:00
414e88b4cd
Remove note
Some checks failed
CI / test (push) Failing after 38s
Lint / test (push) Successful in 25s
Trivy / test (push) Successful in 21s
2025-11-30 18:29:08 +11:00
5bdc8b03eb
use filedust in release.sh 2025-11-30 18:28:00 +11:00
d7c71f6349
Refactor and add much more robust tests (both automated and manual) to ensure loops and things work ok
Some checks failed
CI / test (push) Failing after 45s
Lint / test (push) Successful in 26s
Trivy / test (push) Successful in 24s
2025-11-30 18:27:01 +11:00
3af628e22e
Meh
Some checks failed
Lint / test (push) Failing after 23s
CI / test (push) Successful in 45s
Trivy / test (push) Successful in 23s
2025-11-28 12:51:10 +11:00
78aed97302
Fix CLI return code
Some checks failed
CI / test (push) Failing after 38s
Lint / test (push) Failing after 23s
Trivy / test (push) Successful in 22s
2025-11-28 12:45:43 +11:00
edd1acdabd
Add notes to the README about looping config
Some checks failed
CI / test (push) Failing after 39s
Lint / test (push) Failing after 25s
Trivy / test (push) Successful in 21s
2025-11-28 12:30:26 +11:00
f66f58a7bb
Rename some methods, merge the loopable classes and just always try it 2025-11-28 12:28:46 +11:00
2db80cc6e1
Add ability to generate 'loops' in Jinja if the XML or YAML config supports it 2025-11-28 12:14:17 +11:00
4f9d1a0442
comment cleanup 2025-11-27 21:37:29 +11:00
bd3f9bf8d2
logo update 2025-11-27 21:36:56 +11:00
36 changed files with 3762 additions and 979 deletions

@@ -30,3 +30,16 @@ jobs:
         run: |
           ./tests.sh
+      # Notify if any previous step in this job failed
+      - name: Notify on failure
+        if: ${{ failure() }}
+        env:
+          WEBHOOK_URL: ${{ secrets.NODERED_WEBHOOK_URL }}
+          REPOSITORY: ${{ forgejo.repository }}
+          RUN_NUMBER: ${{ forgejo.run_number }}
+          SERVER_URL: ${{ forgejo.server_url }}
+        run: |
+          curl -X POST \
+            -H "Content-Type: application/json" \
+            -d "{\"repository\":\"$REPOSITORY\",\"run_number\":\"$RUN_NUMBER\",\"status\":\"failure\",\"url\":\"$SERVER_URL/$REPOSITORY/actions/runs/$RUN_NUMBER\"}" \
+            "$WEBHOOK_URL"
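The notify step posts a small JSON document to a webhook (a Node-RED endpoint, going by the secret name). To sanity-check the payload shape outside CI, it can be rebuilt in plain Python; the repository/run-number values below are placeholders:

```python
import json


def build_failure_payload(repository: str, run_number: str, server_url: str) -> str:
    """Mirror the JSON body the curl step above sends on failure."""
    payload = {
        "repository": repository,
        "run_number": run_number,
        "status": "failure",
        "url": f"{server_url}/{repository}/actions/runs/{run_number}",
    }
    return json.dumps(payload)


print(build_failure_payload("mig5/jinjaturtle", "42", "https://git.mig5.net"))
```

Feeding this through `curl -d @-` against a test endpoint is an easy way to verify the receiving flow before wiring it into the workflow.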

@@ -15,7 +15,7 @@ jobs:
         run: |
           apt-get update
           DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
-            black pyflakes3 python3-bandit
+            black pyflakes3 python3-bandit vulture
       - name: Run linters
         run: |
@@ -24,3 +24,18 @@ jobs:
           pyflakes3 src/*
           pyflakes3 tests/*
           bandit -s B110 -r src/
+          vulture .
+      # Notify if any previous step in this job failed
+      - name: Notify on failure
+        if: ${{ failure() }}
+        env:
+          WEBHOOK_URL: ${{ secrets.NODERED_WEBHOOK_URL }}
+          REPOSITORY: ${{ forgejo.repository }}
+          RUN_NUMBER: ${{ forgejo.run_number }}
+          SERVER_URL: ${{ forgejo.server_url }}
+        run: |
+          curl -X POST \
+            -H "Content-Type: application/json" \
+            -d "{\"repository\":\"$REPOSITORY\",\"run_number\":\"$RUN_NUMBER\",\"status\":\"failure\",\"url\":\"$SERVER_URL/$REPOSITORY/actions/runs/$RUN_NUMBER\"}" \
+            "$WEBHOOK_URL"

@@ -24,3 +24,17 @@ jobs:
       - name: Run trivy
         run: |
           trivy fs --no-progress --ignore-unfixed --format table --disable-telemetry .
+      # Notify if any previous step in this job failed
+      - name: Notify on failure
+        if: ${{ failure() }}
+        env:
+          WEBHOOK_URL: ${{ secrets.NODERED_WEBHOOK_URL }}
+          REPOSITORY: ${{ forgejo.repository }}
+          RUN_NUMBER: ${{ forgejo.run_number }}
+          SERVER_URL: ${{ forgejo.server_url }}
+        run: |
+          curl -X POST \
+            -H "Content-Type: application/json" \
+            -d "{\"repository\":\"$REPOSITORY\",\"run_number\":\"$RUN_NUMBER\",\"status\":\"failure\",\"url\":\"$SERVER_URL/$REPOSITORY/actions/runs/$RUN_NUMBER\"}" \
+            "$WEBHOOK_URL"

.gitignore vendored

@@ -7,3 +7,4 @@ dist
 *.yml
 *.j2
 *.toml
+regenerated_*

Dockerfile.debbuild Normal file

@@ -0,0 +1,85 @@
# syntax=docker/dockerfile:1
ARG BASE_IMAGE=debian:bookworm
FROM ${BASE_IMAGE}

ENV DEBIAN_FRONTEND=noninteractive

# If Ubuntu, ensure Universe is enabled.
RUN set -eux; \
    . /etc/os-release; \
    if [ "${ID:-}" = "ubuntu" ]; then \
        apt-get update; \
        apt-get install -y --no-install-recommends software-properties-common ca-certificates; \
        add-apt-repository -y universe; \
    fi; \
    if [ "${VERSION_CODENAME:-}" = "jammy" ]; then \
        apt-get update; \
        apt-get install -y --no-install-recommends python3-tomli; \
    fi

# Build deps
RUN set -eux; \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        build-essential \
        devscripts \
        debhelper \
        dh-python \
        pybuild-plugin-pyproject \
        python3-all \
        python3-poetry-core \
        python3-yaml \
        python3-defusedxml \
        python3-jinja2 \
        python3-toml \
        rsync \
        ca-certificates \
    ; \
    rm -rf /var/lib/apt/lists/*

# Build runner script
RUN set -eux; \
    cat > /usr/local/bin/build-deb <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

SRC="${SRC:-/src}"
WORKROOT="${WORKROOT:-/work}"
WORK="${WORKROOT}/src"
OUT="${OUT:-/out}"

mkdir -p "$WORK" "$OUT"

rsync -a --delete \
    --exclude '.git' \
    --exclude '.venv' \
    --exclude 'dist' \
    --exclude 'build' \
    --exclude '__pycache__' \
    --exclude '.pytest_cache' \
    --exclude '.mypy_cache' \
    "${SRC}/" "${WORK}/"

cd "${WORK}"

if [ -n "${SUITE:-}" ]; then
    export DEBEMAIL="mig@mig5.net"
    export DEBFULLNAME="Miguel Jacq"
    dch --distribution "$SUITE" --local "~${SUITE}" "CI build for $SUITE"
fi

dpkg-buildpackage -us -uc -b

shopt -s nullglob
cp -v "${WORKROOT}"/*.deb \
    "${WORKROOT}"/*.changes \
    "${WORKROOT}"/*.buildinfo \
    "${WORKROOT}"/*.dsc \
    "${WORKROOT}"/*.tar.* \
    "${OUT}/" || true

echo "Artifacts copied to ${OUT}"
EOF

RUN chmod +x /usr/local/bin/build-deb

WORKDIR /work
ENTRYPOINT ["/usr/local/bin/build-deb"]

@@ -5,8 +5,7 @@
 </div>
 
 JinjaTurtle is a command-line tool to help you generate Jinja2 templates and
-Ansible `defaults/main.yml` files from a native configuration file of a piece
-of software.
+Ansible inventory from a native configuration file of a piece of software.
 
 ## How it works
@@ -17,10 +16,10 @@ of software.
 role.
 * A Jinja2 file is generated from the file with those parameter key names
   injected as the `{{ variable }}` names.
-* A `defaults/main.yml` is generated with those key names and the *values*
-  taken from the original config file as the defaults.
+* An Ansible inventory YAML file is generated with those key names and the
+  *values* taken from the original config file as the defaults.
 
-By default, the Jinja2 template and the `defaults/main.yml` are printed to
+By default, the Jinja2 template and the Ansible inventory are printed to
 stdout. However, it is possible to output the results to new files.
 
 ## What sort of config files can it handle?
@@ -29,15 +28,28 @@ TOML, YAML, INI, JSON and XML-style config files should be okay. There are always
 going to be some edge cases in very complex files that are difficult to work
 with, though, so you may still find that you need to tweak the results.
 
-The tool does not do anything intelligent like detect common sections that
-could practically be turned into 'for' loops in Jinja. You'd have to do those
-sorts of optimisations yourself.
+For XML and YAML files, JinjaTurtle will attempt to generate 'for' loops
+and lists in the Ansible yaml if the config file looks homogenous enough to
+support it. However, if it lacks the confidence in this, it will fall back to
+using scalar-style flattened attributes.
+
+You may need or wish to tidy up the config to suit your needs.
 
 The goal here is really to *speed up* converting files into Ansible/Jinja2,
 but not necessarily to make it perfect.
 
 ## How to install it
 
+### Ubuntu/Debian apt repository
+
+```bash
+sudo mkdir -p /usr/share/keyrings
+curl -fsSL https://mig5.net/static/mig5.asc | sudo gpg --dearmor -o /usr/share/keyrings/mig5.gpg
+echo "deb [arch=amd64 signed-by=/usr/share/keyrings/mig5.gpg] https://apt.mig5.net $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/mig5.list
+sudo apt update
+sudo apt install jinjaturtle
+```
+
 ### From PyPi
 
 ```
@@ -74,7 +86,7 @@ jinjaturtle php.ini \
 ```
 usage: jinjaturtle [-h] -r ROLE_NAME [-f {json,ini,toml,yaml,xml}] [-d DEFAULTS_OUTPUT] [-t TEMPLATE_OUTPUT] config
 
-Convert a config file into an Ansible defaults file and Jinja2 template.
+Convert a config file into Ansible inventory and a Jinja2 template.
 
 positional arguments:
   config                Path to the source configuration file (TOML or INI-style).
@@ -83,7 +95,7 @@ options:
   -h, --help            show this help message and exit
   -r, --role-name ROLE_NAME
                         Ansible role name, used as variable prefix (e.g. cometbft).
-  -f, --format {ini,toml}
+  -f, --format {ini,json,toml,xml}
                         Force config format instead of auto-detecting from filename.
   -d, --defaults-output DEFAULTS_OUTPUT
                         Path to write defaults/main.yml. If omitted, defaults YAML is printed to stdout.

debian/changelog vendored Normal file

@@ -0,0 +1,11 @@
jinjaturtle (0.3.3) unstable; urgency=medium

  * Fixes for tomli on Ubuntu 22

 -- Miguel Jacq <mig@mig5.net>  Mon, 15 Dec 2025 14:00:00 +0000

jinjaturtle (0.3.2) unstable; urgency=medium

  * Initial package

 -- Miguel Jacq <mig@mig5.net>  Mon, 15 Dec 2025 12:00:00 +0000

debian/control vendored Normal file

@@ -0,0 +1,28 @@
Source: jinjaturtle
Section: admin
Priority: optional
Maintainer: Miguel Jacq <mig@mig5.net>
Rules-Requires-Root: no
Build-Depends:
debhelper-compat (= 13),
dh-python,
pybuild-plugin-pyproject,
python3-all,
python3-poetry-core,
python3-yaml,
python3-toml,
python3-defusedxml,
python3-jinja2
Standards-Version: 4.6.2
Homepage: https://git.mig5.net/mig5/jinjaturtle
Package: jinjaturtle
Architecture: all
Depends:
${misc:Depends},
${python3:Depends},
python3-yaml,
python3-toml,
python3-defusedxml,
python3-jinja2
Description: Convert config files into Ansible defaults and Jinja2 templates.

debian/rules vendored Executable file

@@ -0,0 +1,6 @@
#!/usr/bin/make -f
export PYBUILD_NAME=jinjaturtle
export PYBUILD_SYSTEM=pyproject

%:
	dh $@ --with python3 --buildsystem=pybuild

debian/source/format vendored Normal file

@@ -0,0 +1 @@
3.0 (quilt)

debian/source/options vendored Normal file

@@ -0,0 +1,6 @@
tar-ignore = ".git"
tar-ignore = ".venv"
tar-ignore = "__pycache__"
tar-ignore = ".pytest_cache"
tar-ignore = "dist"
tar-ignore = "build"

@@ -9,8 +9,6 @@
       stroke-width="4"/>
 
   <!-- Center braces block (in the middle of the shell) -->
-  <rect x="42" y="62" width="28" height="12" rx="3"
-        fill="#111827" opacity="0.1"/>
   <text x="56" y="71"
         text-anchor="middle"
         font-family="monospace"

(logo image: 1.6 KiB before, 1.5 KiB after)

poetry.lock generated (908 lines changed; diff suppressed because it is too large)

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "jinjaturtle"
-version = "0.1.4"
+version = "0.3.3"
 description = "Convert config files into Ansible defaults and Jinja2 templates."
 authors = ["Miguel Jacq <mig@mig5.net>"]
 license = "GPL-3.0-or-later"
@@ -17,13 +17,7 @@ python = "^3.10"
 PyYAML = "^6.0"
 tomli = { version = "^2.0.0", python = "<3.11" }
 defusedxml = "^0.7.1"
+jinja2 = "^3.1.6"
-
-[tool.poetry.group.dev.dependencies]
-pytest = "^7.0"
-pytest-cov = "^4.0"
-build = "^1.0"
-twine = "^5.0"
-pyproject-appimage = "^4.2"
 
 [tool.poetry.scripts]
 jinjaturtle = "jinjaturtle.cli:main"
@@ -35,3 +29,8 @@ build-backend = "poetry.core.masonry.api"
 [tool.pyproject-appimage]
 script = "jinjaturtle"
 output = "JinjaTurtle.AppImage"
+
+[tool.poetry.dev-dependencies]
+pytest = "^8"
+pytest-cov = "^5"
+pyproject-appimage = "^4.2"

@@ -2,7 +2,8 @@
 set -eo pipefail
 
-rm -rf dist
+# Clean caches etc
+filedust -y .
 
 # Publish to Pypi
 poetry build
@@ -14,3 +15,30 @@ mv JinjaTurtle.AppImage dist/
 # Sign packages
 for file in `ls -1 dist/`; do qubes-gpg-client --batch --armor --detach-sign dist/$file > dist/$file.asc; done
+
+# Deb stuff
+DISTS=(
+  debian:bookworm
+  debian:trixie
+  ubuntu:jammy
+  ubuntu:noble
+)
+
+for dist in ${DISTS[@]}; do
+  release=$(echo ${dist} | cut -d: -f2)
+  mkdir -p dist/${release}
+  docker build -f Dockerfile.debbuild -t jinjaturtle-deb:${release} \
+    --no-cache \
+    --progress=plain \
+    --build-arg BASE_IMAGE=${dist} .
+  docker run --rm \
+    -e SUITE="${release}" \
+    -v "$PWD":/src \
+    -v "$PWD/dist/${release}":/out \
+    jinjaturtle-deb:${release}
+  debfile=$(ls -1 dist/${release}/*.deb)
+  reprepro -b /home/user/git/repo includedeb "${release}" "${debfile}"
+done

@@ -7,9 +7,10 @@ from pathlib import Path
 from .core import (
     parse_config,
+    analyze_loops,
     flatten_config,
-    generate_defaults_yaml,
-    generate_template,
+    generate_ansible_yaml,
+    generate_jinja2_template,
 )
@@ -53,19 +54,34 @@ def _main(argv: list[str] | None = None) -> int:
     args = parser.parse_args(argv)
     config_path = Path(args.config)
 
-    fmt, parsed = parse_config(config_path, args.format)
-    flat_items = flatten_config(fmt, parsed)
-    defaults_yaml = generate_defaults_yaml(args.role_name, flat_items)
     config_text = config_path.read_text(encoding="utf-8")
-    template_str = generate_template(
-        fmt, parsed, args.role_name, original_text=config_text
+
+    # Parse the config
+    fmt, parsed = parse_config(config_path, args.format)
+
+    # Analyze for loops
+    loop_candidates = analyze_loops(fmt, parsed)
+
+    # Flatten config (excluding loop paths if loops are detected)
+    flat_items = flatten_config(fmt, parsed, loop_candidates)
+
+    # Generate defaults YAML (with loop collections if detected)
+    ansible_yaml = generate_ansible_yaml(args.role_name, flat_items, loop_candidates)
+
+    # Generate template (with loops if detected)
+    template_str = generate_jinja2_template(
+        fmt,
+        parsed,
+        args.role_name,
+        original_text=config_text,
+        loop_candidates=loop_candidates,
     )
 
     if args.defaults_output:
-        Path(args.defaults_output).write_text(defaults_yaml, encoding="utf-8")
+        Path(args.defaults_output).write_text(ansible_yaml, encoding="utf-8")
     else:
         print("# defaults/main.yml")
-        print(defaults_yaml, end="")
+        print(ansible_yaml, end="")
 
     if args.template_output:
         Path(args.template_output).write_text(template_str, encoding="utf-8")
@@ -80,4 +96,4 @@ def main() -> None:
     """
     Console-script entry point.
     """
-    raise SystemExit(_main(sys.argv[1:]))
+    _main(sys.argv[1:])

@@ -3,8 +3,10 @@ from __future__ import annotations
 from pathlib import Path
 from typing import Any, Iterable
 
+import datetime
+
 import yaml
 
+from .loop_analyzer import LoopAnalyzer, LoopCandidate
 from .handlers import (
     BaseHandler,
     IniHandler,
@@ -16,21 +18,24 @@ from .handlers import (
 class QuotedString(str):
-    """Marker type for strings that must be double-quoted in YAML output."""
+    """
+    Marker type for strings that must be double-quoted in YAML output.
+    """
 
     pass
 
 def _fallback_str_representer(dumper: yaml.SafeDumper, data: Any):
     """
-    Fallback for objects the dumper doesn't know about. Represent them as
-    plain strings.
+    Fallback for objects the dumper doesn't know about.
     """
     return dumper.represent_scalar("tag:yaml.org,2002:str", str(data))
 
 class _TurtleDumper(yaml.SafeDumper):
-    """Custom YAML dumper that always double-quotes QuotedString values."""
+    """
+    Custom YAML dumper that always double-quotes QuotedString values.
+    """
 
     pass
@@ -42,6 +47,7 @@ def _quoted_str_representer(dumper: yaml.SafeDumper, data: QuotedString):
 _TurtleDumper.add_representer(QuotedString, _quoted_str_representer)
 # Use our fallback for any unknown object types
 _TurtleDumper.add_representer(None, _fallback_str_representer)
+
 _HANDLERS: dict[str, BaseHandler] = {}
 _INI_HANDLER = IniHandler()
@@ -49,6 +55,7 @@ _JSON_HANDLER = JsonHandler()
 _TOML_HANDLER = TomlHandler()
 _YAML_HANDLER = YamlHandler()
 _XML_HANDLER = XmlHandler()
+
 _HANDLERS["ini"] = _INI_HANDLER
 _HANDLERS["json"] = _JSON_HANDLER
 _HANDLERS["toml"] = _TOML_HANDLER
@@ -57,17 +64,15 @@ _HANDLERS["xml"] = _XML_HANDLER
 
 def make_var_name(role_prefix: str, path: Iterable[str]) -> str:
-    """Wrapper for :meth:`BaseHandler.make_var_name`.
-
-    This keeps the public API (and tests) working while the implementation
-    lives on the BaseHandler class.
+    """
+    Wrapper for :meth:`BaseHandler.make_var_name`.
     """
     return BaseHandler.make_var_name(role_prefix, path)
 
 def detect_format(path: Path, explicit: str | None = None) -> str:
     """
-    Determine config format (toml, yaml, json, ini-ish, xml) from argument or filename.
+    Determine config format from argument or filename.
     """
     if explicit:
         return explicit
@@ -96,51 +101,87 @@ def parse_config(path: Path, fmt: str | None = None) -> tuple[str, Any]:
     if handler is None:
         raise ValueError(f"Unsupported config format: {fmt}")
     parsed = handler.parse(path)
+
+    # Make sure datetime objects are treated as strings (TOML, YAML)
+    parsed = _stringify_timestamps(parsed)
     return fmt, parsed
 
-def flatten_config(fmt: str, parsed: Any) -> list[tuple[tuple[str, ...], Any]]:
+def analyze_loops(fmt: str, parsed: Any) -> list[LoopCandidate]:
     """
-    Flatten parsed config into a list of (path_tuple, value).
+    Analyze parsed config to find loop opportunities.
+    """
+    analyzer = LoopAnalyzer()
+    candidates = analyzer.analyze(parsed, fmt)
+    # Filter by confidence threshold
+    return [c for c in candidates if c.confidence >= LoopAnalyzer.MIN_CONFIDENCE]
+
+def flatten_config(
+    fmt: str, parsed: Any, loop_candidates: list[LoopCandidate] | None = None
+) -> list[tuple[tuple[str, ...], Any]]:
+    """
+    Flatten parsed config into (path, value) pairs.
+
+    If loop_candidates is provided, paths within those loops are excluded
+    from flattening (they'll be handled via loops in the template).
     """
     handler = _HANDLERS.get(fmt)
     if handler is None:
-        # preserve previous ValueError for unsupported formats
         raise ValueError(f"Unsupported format: {fmt}")
-    return handler.flatten(parsed)
+
+    all_items = handler.flatten(parsed)
+    if not loop_candidates:
+        return all_items
+
+    # Build set of paths to exclude (anything under a loop path)
+    excluded_prefixes = {candidate.path for candidate in loop_candidates}
+
+    # Filter out items that fall under loop paths
+    filtered_items = []
+    for item_path, value in all_items:
+        # Check if this path starts with any loop path
+        is_excluded = False
+        for loop_path in excluded_prefixes:
+            if _path_starts_with(item_path, loop_path):
+                is_excluded = True
+                break
+        if not is_excluded:
+            filtered_items.append((item_path, value))
+    return filtered_items
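The exclusion logic in the new `flatten_config` boils down to a tuple-prefix test. A standalone sketch of the same filtering, using made-up config paths:

```python
def path_starts_with(path: tuple, prefix: tuple) -> bool:
    # True when `path` lies at or under `prefix` in the config tree
    return len(path) >= len(prefix) and path[: len(prefix)] == prefix


items = [
    (("server", "host"), "localhost"),
    (("peers", "0", "name"), "alpha"),
    (("peers", "1", "name"), "beta"),
]
loop_paths = {("peers",)}

# Drop every item that falls under a loop path; a loop renders it instead
kept = [
    (p, v) for p, v in items
    if not any(path_starts_with(p, lp) for lp in loop_paths)
]
print(kept)  # [(('server', 'host'), 'localhost')]
```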
-def _normalize_default_value(value: Any) -> Any:
-    """
-    Ensure that 'true' / 'false' end up as quoted strings in YAML, not booleans.
-
-    - bool -> QuotedString("true"/"false")
-    - "true"/"false" (any case) -> QuotedString(original_text)
-    - everything else -> unchanged
-    """
-    if isinstance(value, bool):
-        # YAML booleans are lower-case; we keep them as strings.
-        return QuotedString("true" if value else "false")
-    if isinstance(value, str) and value.lower() in {"true", "false"}:
-        return QuotedString(value)
-    return value
+def _path_starts_with(path: tuple[str, ...], prefix: tuple[str, ...]) -> bool:
+    """Check if path starts with prefix."""
+    if len(path) < len(prefix):
+        return False
+    return path[: len(prefix)] == prefix
 
-def generate_defaults_yaml(
+def generate_ansible_yaml(
     role_prefix: str,
     flat_items: list[tuple[tuple[str, ...], Any]],
+    loop_candidates: list[LoopCandidate] | None = None,
 ) -> str:
     """
-    Create YAML for defaults/main.yml from flattened items.
-
-    Boolean/boolean-like values ("true"/"false") are forced to be *strings*
-    and double-quoted in the resulting YAML so that Ansible does not coerce
-    them back into Python booleans.
+    Create Ansible YAML for defaults/main.yml.
     """
     defaults: dict[str, Any] = {}
 
+    # Add scalar variables
     for path, value in flat_items:
         var_name = make_var_name(role_prefix, path)
-        defaults[var_name] = _normalize_default_value(value)
+        defaults[var_name] = value  # No normalization - keep original types
+
+    # Add loop collections
+    if loop_candidates:
+        for candidate in loop_candidates:
+            var_name = make_var_name(role_prefix, candidate.path)
+            defaults[var_name] = candidate.items
 
     return yaml.dump(
         defaults,
@@ -153,21 +194,54 @@ def generate_ansible_yaml(
     )
 
-def generate_template(
+def generate_jinja2_template(
     fmt: str,
     parsed: Any,
     role_prefix: str,
     original_text: str | None = None,
+    loop_candidates: list[LoopCandidate] | None = None,
 ) -> str:
     """
     Generate a Jinja2 template for the config.
-
-    If original_text is provided, comments and blank lines are preserved by
-    patching values in-place. Otherwise we fall back to reconstructing from
-    the parsed structure (no comments). JSON of course does not support
-    comments.
     """
     handler = _HANDLERS.get(fmt)
     if handler is None:
         raise ValueError(f"Unsupported format: {fmt}")
-    return handler.generate_template(parsed, role_prefix, original_text=original_text)
+
+    # Check if handler supports loop-aware generation
+    if hasattr(handler, "generate_jinja2_template_with_loops") and loop_candidates:
+        return handler.generate_jinja2_template_with_loops(
+            parsed, role_prefix, original_text, loop_candidates
+        )
+
+    # Fallback to original scalar-only generation
+    return handler.generate_jinja2_template(
+        parsed, role_prefix, original_text=original_text
+    )
+
+def _stringify_timestamps(obj: Any) -> Any:
+    """
+    Recursively walk a parsed config and turn any datetime/date/time objects
+    into plain strings in ISO-8601 form.
+
+    This prevents Python datetime objects from leaking into YAML/Jinja, which
+    would otherwise reformat the value (e.g. replacing 'T' with a space).
+    This commonly occurs otherwise with TOML and YAML files, which sees
+    Python automatically convert those sorts of strings into datetime objects.
+    """
+    if isinstance(obj, dict):
+        return {k: _stringify_timestamps(v) for k, v in obj.items()}
+    if isinstance(obj, list):
+        return [_stringify_timestamps(v) for v in obj]
+    # TOML & YAML both use the standard datetime types
+    if isinstance(obj, datetime.datetime):
+        # Use default ISO-8601: 'YYYY-MM-DDTHH:MM:SS±HH:MM' (with 'T')
+        return obj.isoformat()
+    if isinstance(obj, (datetime.date, datetime.time)):
+        return obj.isoformat()
+    return obj
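`_stringify_timestamps` is self-contained enough to demonstrate in isolation; a minimal re-implementation of the same walk shows the ISO-8601 strings it produces:

```python
import datetime


def stringify_timestamps(obj):
    # Recurse through dicts/lists; ISO-format any datetime/date/time leaf
    if isinstance(obj, dict):
        return {k: stringify_timestamps(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [stringify_timestamps(v) for v in obj]
    if isinstance(obj, (datetime.datetime, datetime.date, datetime.time)):
        return obj.isoformat()
    return obj


cfg = {"created": datetime.datetime(2025, 11, 30, 18, 27, 1)}
print(stringify_timestamps(cfg))  # {'created': '2025-11-30T18:27:01'}
```

Note that the 'T' separator survives, which is exactly what the docstring promises.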

@@ -11,7 +11,7 @@ class BaseHandler:
     Each handler is responsible for:
     - parse(path) -> parsed object
     - flatten(parsed) -> list[(path_tuple, value)]
-    - generate_template(parsed, role_prefix, original_text=None) -> str
+    - generate_jinja2_template(parsed, role_prefix, original_text=None) -> str
     """
 
     fmt: str  # e.g. "ini", "yaml", ...
@@ -22,7 +22,7 @@ class BaseHandler:
     def flatten(self, parsed: Any) -> list[tuple[tuple[str, ...], Any]]:
         raise NotImplementedError
 
-    def generate_template(
+    def generate_jinja2_template(
         self,
         parsed: Any,
         role_prefix: str,

@@ -12,7 +12,7 @@ class IniHandler(BaseHandler):
     def parse(self, path: Path) -> configparser.ConfigParser:
         parser = configparser.ConfigParser()
-        parser.optionxform = str  # preserve key case
+        parser.optionxform = str  # noqa
         with path.open("r", encoding="utf-8") as f:
             parser.read_file(f)
         return parser
@@ -32,7 +32,7 @@ class IniHandler(BaseHandler):
             items.append(((section, key), processed))
         return items
 
-    def generate_template(
+    def generate_jinja2_template(
         self,
         parsed: Any,
         role_prefix: str,

View file

@ -1,10 +1,12 @@
from __future__ import annotations from __future__ import annotations
import json import json
import re
from pathlib import Path from pathlib import Path
from typing import Any from typing import Any
from . import DictLikeHandler from . import DictLikeHandler
from ..loop_analyzer import LoopCandidate
class JsonHandler(DictLikeHandler): class JsonHandler(DictLikeHandler):
@ -15,23 +17,44 @@ class JsonHandler(DictLikeHandler):
with path.open("r", encoding="utf-8") as f: with path.open("r", encoding="utf-8") as f:
return json.load(f) return json.load(f)
def generate_template( def generate_jinja2_template(
self, self,
parsed: Any, parsed: Any,
role_prefix: str, role_prefix: str,
original_text: str | None = None, original_text: str | None = None,
) -> str: ) -> str:
"""Original scalar-only template generation."""
if not isinstance(parsed, (dict, list)): if not isinstance(parsed, (dict, list)):
raise TypeError("JSON parser result must be a dict or list") raise TypeError("JSON parser result must be a dict or list")
# As before: ignore original_text and rebuild structurally # As before: ignore original_text and rebuild structurally
return self._generate_json_template(role_prefix, parsed) return self._generate_json_template(role_prefix, parsed)
def generate_jinja2_template_with_loops(
self,
parsed: Any,
role_prefix: str,
original_text: str | None,
loop_candidates: list[LoopCandidate],
) -> str:
"""Generate template with Jinja2 for loops where appropriate."""
if not isinstance(parsed, (dict, list)):
raise TypeError("JSON parser result must be a dict or list")
# Build loop path set for quick lookup
loop_paths = {candidate.path for candidate in loop_candidates}
return self._generate_json_template_with_loops(
role_prefix, parsed, loop_paths, loop_candidates
)
def _generate_json_template(self, role_prefix: str, data: Any) -> str: def _generate_json_template(self, role_prefix: str, data: Any) -> str:
""" """
Generate a JSON Jinja2 template from parsed JSON data. Generate a JSON Jinja2 template from parsed JSON data.
All scalar values are replaced with Jinja expressions whose names are All scalar values are replaced with Jinja expressions whose names are
derived from the path, similar to TOML/YAML. derived from the path, similar to TOML/YAML.
Uses | tojson filter to preserve types (numbers, booleans, null).
""" """
def _walk(obj: Any, path: tuple[str, ...] = ()) -> Any: def _walk(obj: Any, path: tuple[str, ...] = ()) -> Any:
@ -39,9 +62,130 @@ class JsonHandler(DictLikeHandler):
return {k: _walk(v, path + (str(k),)) for k, v in obj.items()} return {k: _walk(v, path + (str(k),)) for k, v in obj.items()}
if isinstance(obj, list): if isinstance(obj, list):
return [_walk(v, path + (str(i),)) for i, v in enumerate(obj)] return [_walk(v, path + (str(i),)) for i, v in enumerate(obj)]
# scalar # scalar - use marker that will be replaced with tojson
var_name = self.make_var_name(role_prefix, path) var_name = self.make_var_name(role_prefix, path)
return f"{{{{ {var_name} }}}}" return f"__SCALAR__{var_name}__"
templated = _walk(data) templated = _walk(data)
return json.dumps(templated, indent=2, ensure_ascii=False) + "\n" json_str = json.dumps(templated, indent=2, ensure_ascii=False)
# Replace scalar markers with Jinja expressions using tojson filter
# This preserves types (numbers stay numbers, booleans stay booleans)
json_str = re.sub(
r'"__SCALAR__([a-zA-Z_][a-zA-Z0-9_]*)__"', r"{{ \1 | tojson }}", json_str
)
return json_str + "\n"
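The marker-substitution step above can be sketched in isolation. This is a standalone illustration, not the handler's API: the dict keys and variable names (`app_port`, `app_debug`) are hypothetical.

```python
# Scalars become quoted markers first, then the quoted markers are
# rewritten into unquoted Jinja expressions, so rendered numbers and
# booleans keep their JSON types.
import json
import re

templated = {"port": "__SCALAR__app_port__", "debug": "__SCALAR__app_debug__"}
json_str = json.dumps(templated, indent=2, ensure_ascii=False)
json_str = re.sub(
    r'"__SCALAR__([a-zA-Z_][a-zA-Z0-9_]*)__"', r"{{ \1 | tojson }}", json_str
)
print(json_str)
```

Rendering `{{ app_port | tojson }}` with an integer value then emits a bare number rather than a quoted string.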
def _generate_json_template_with_loops(
self,
role_prefix: str,
data: Any,
loop_paths: set[tuple[str, ...]],
loop_candidates: list[LoopCandidate],
path: tuple[str, ...] = (),
) -> str:
"""
Generate a JSON Jinja2 template with for loops where appropriate.
"""
def _walk(obj: Any, current_path: tuple[str, ...] = ()) -> Any:
# Check if this path is a loop candidate
if current_path in loop_paths:
# Find the matching candidate
candidate = next(c for c in loop_candidates if c.path == current_path)
collection_var = self.make_var_name(role_prefix, candidate.path)
item_var = candidate.loop_var
if candidate.item_schema == "scalar":
# Simple list of scalars - use special marker that we'll replace
return f"__LOOP_SCALAR__{collection_var}__{item_var}__"
elif candidate.item_schema in ("simple_dict", "nested"):
# List of dicts - use special marker
return f"__LOOP_DICT__{collection_var}__{item_var}__"
if isinstance(obj, dict):
return {k: _walk(v, current_path + (str(k),)) for k, v in obj.items()}
if isinstance(obj, list):
# Check if this list is a loop candidate
if current_path in loop_paths:
# Already handled above
return _walk(obj, current_path)
return [_walk(v, current_path + (str(i),)) for i, v in enumerate(obj)]
# scalar - use marker to preserve type
var_name = self.make_var_name(role_prefix, current_path)
return f"__SCALAR__{var_name}__"
templated = _walk(data, path)
# Convert to JSON string
json_str = json.dumps(templated, indent=2, ensure_ascii=False)
# Replace scalar markers with Jinja expressions using tojson filter
json_str = re.sub(
r'"__SCALAR__([a-zA-Z_][a-zA-Z0-9_]*)__"', r"{{ \1 | tojson }}", json_str
)
# Post-process to replace loop markers with actual Jinja loops
for candidate in loop_candidates:
collection_var = self.make_var_name(role_prefix, candidate.path)
item_var = candidate.loop_var
if candidate.item_schema == "scalar":
# Replace scalar loop marker with Jinja for loop
marker = f'"__LOOP_SCALAR__{collection_var}__{item_var}__"'
replacement = self._generate_json_scalar_loop(
collection_var, item_var, candidate
)
json_str = json_str.replace(marker, replacement)
elif candidate.item_schema in ("simple_dict", "nested"):
# Replace dict loop marker with Jinja for loop
marker = f'"__LOOP_DICT__{collection_var}__{item_var}__"'
replacement = self._generate_json_dict_loop(
collection_var, item_var, candidate
)
json_str = json_str.replace(marker, replacement)
return json_str + "\n"
def _generate_json_scalar_loop(
self, collection_var: str, item_var: str, candidate: LoopCandidate
) -> str:
"""Generate a Jinja for loop for a scalar list in JSON."""
# Use tojson filter to properly handle strings (quotes them) and other types
# Include array brackets around the loop
return (
f"[{{% for {item_var} in {collection_var} %}}"
f"{{{{ {item_var} | tojson }}}}"
f"{{% if not loop.last %}}, {{% endif %}}"
f"{{% endfor %}}]"
)
def _generate_json_dict_loop(
self, collection_var: str, item_var: str, candidate: LoopCandidate
) -> str:
"""Generate a Jinja for loop for a dict list in JSON."""
if not candidate.items:
return "[]"
# Get first item as template
sample_item = candidate.items[0]
# Build the dict template - use tojson for all values to handle types correctly
fields = []
for key, value in sample_item.items():
if key == "_key":
continue
# Use tojson filter to properly serialize all types (strings, numbers, booleans)
fields.append(f'"{key}": {{{{ {item_var}.{key} | tojson }}}}')
dict_template = "{" + ", ".join(fields) + "}"
# Include array brackets, matching the scalar-loop case, so the
# rendered output is a valid JSON array
return (
f"[{{% for {item_var} in {collection_var} %}}"
f"{dict_template}"
f"{{% if not loop.last %}}, {{% endif %}}"
f"{{% endfor %}}]"
)
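The dict-loop helper above can be sketched standalone. The collection and loop-variable names (`app_servers`, `srv`) are illustrative, and the surrounding array brackets are included here so the rendered result is a valid JSON array:

```python
# Build a Jinja for-loop expression for a list of dicts, using the first
# item's keys as the field template and tojson to preserve value types.
sample_item = {"name": "a", "port": 1}
item_var = "srv"
fields = [f'"{k}": {{{{ {item_var}.{k} | tojson }}}}' for k in sample_item]
dict_template = "{" + ", ".join(fields) + "}"
loop = (
    f"[{{% for {item_var} in app_servers %}}"
    f"{dict_template}"
    f"{{% if not loop.last %}}, {{% endif %}}"
    f"{{% endfor %}}]"
)
print(loop)
```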


@@ -1,10 +1,15 @@
from __future__ import annotations
from pathlib import Path
from typing import Any
from . import DictLikeHandler
from ..loop_analyzer import LoopCandidate
try:
import tomllib
except Exception:
import tomli as tomllib
class TomlHandler(DictLikeHandler):
@@ -19,18 +24,37 @@ class TomlHandler(DictLikeHandler):
with path.open("rb") as f:
return tomllib.load(f)
def generate_jinja2_template(
self,
parsed: Any,
role_prefix: str,
original_text: str | None = None,
) -> str:
"""Original scalar-only template generation."""
if original_text is not None:
return self._generate_toml_template_from_text(role_prefix, original_text)
if not isinstance(parsed, dict):
raise TypeError("TOML parser result must be a dict")
return self._generate_toml_template(role_prefix, parsed)
def generate_jinja2_template_with_loops(
self,
parsed: Any,
role_prefix: str,
original_text: str | None,
loop_candidates: list[LoopCandidate],
) -> str:
"""Generate template with Jinja2 for loops where appropriate."""
if original_text is not None:
return self._generate_toml_template_with_loops_from_text(
role_prefix, original_text, loop_candidates
)
if not isinstance(parsed, dict):
raise TypeError("TOML parser result must be a dict")
return self._generate_toml_template_with_loops(
role_prefix, parsed, loop_candidates
)
def _generate_toml_template(self, role_prefix: str, data: dict[str, Any]) -> str:
"""
Generate a TOML Jinja2 template from parsed TOML dict.
@@ -45,6 +69,89 @@ class TomlHandler(DictLikeHandler):
var_name = self.make_var_name(role_prefix, path + (key,))
if isinstance(value, str):
lines.append(f'{key} = "{{{{ {var_name} }}}}"')
elif isinstance(value, bool):
# Booleans need | lower filter (Python True/False → TOML true/false)
lines.append(f"{key} = {{{{ {var_name} | lower }}}}")
else:
lines.append(f"{key} = {{{{ {var_name} }}}}")
def walk(obj: dict[str, Any], path: tuple[str, ...] = ()) -> None:
scalar_items = {k: v for k, v in obj.items() if not isinstance(v, dict)}
nested_items = {k: v for k, v in obj.items() if isinstance(v, dict)}
if path:
header = ".".join(path)
lines.append(f"[{header}]")
for key, val in scalar_items.items():
emit_kv(path, str(key), val)
if scalar_items:
lines.append("")
for key, val in nested_items.items():
walk(val, path + (str(key),))
# Root scalars (no table header)
root_scalars = {k: v for k, v in data.items() if not isinstance(v, dict)}
for key, val in root_scalars.items():
emit_kv((), str(key), val)
if root_scalars:
lines.append("")
# Tables
for key, val in data.items():
if isinstance(val, dict):
walk(val, (str(key),))
return "\n".join(lines).rstrip() + "\n"
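The `| lower` filter applied to booleans above exists because Jinja renders values through `str()`, which capitalizes Python booleans, while TOML (like JSON) only accepts lowercase literals:

```python
# What {{ var }} would emit for a boolean vs. what {{ var | lower }} emits.
rendered_default = str(True)          # "True" - invalid TOML
rendered_lowered = str(True).lower()  # "true" - valid TOML
print(rendered_default, rendered_lowered)
```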
def _generate_toml_template_with_loops(
self,
role_prefix: str,
data: dict[str, Any],
loop_candidates: list[LoopCandidate],
) -> str:
"""
Generate a TOML Jinja2 template with for loops where appropriate.
"""
lines: list[str] = []
loop_paths = {candidate.path for candidate in loop_candidates}
def emit_kv(path: tuple[str, ...], key: str, value: Any) -> None:
var_name = self.make_var_name(role_prefix, path + (key,))
if isinstance(value, str):
lines.append(f'{key} = "{{{{ {var_name} }}}}"')
elif isinstance(value, bool):
# Booleans need | lower filter (Python True/False → TOML true/false)
lines.append(f"{key} = {{{{ {var_name} | lower }}}}")
elif isinstance(value, list):
# Check if this list is a loop candidate
if path + (key,) in loop_paths:
# Find the matching candidate
candidate = next(
c for c in loop_candidates if c.path == path + (key,)
)
collection_var = self.make_var_name(role_prefix, candidate.path)
item_var = candidate.loop_var
if candidate.item_schema == "scalar":
# Scalar list loop
lines.append(
f"{key} = ["
f"{{% for {item_var} in {collection_var} %}}"
f"{{{{ {item_var} }}}}"
f"{{% if not loop.last %}}, {{% endif %}}"
f"{{% endfor %}}"
f"]"
)
elif candidate.item_schema in ("simple_dict", "nested"):
# Dict list loop - TOML array of tables
# This is complex for TOML, using simplified approach
lines.append(f"{key} = {{{{ {var_name} | tojson }}}}")
else:
# Not a loop, treat as regular variable
lines.append(f"{key} = {{{{ {var_name} }}}}")
else:
lines.append(f"{key} = {{{{ {var_name} }}}}")
@@ -173,6 +280,236 @@ class TomlHandler(DictLikeHandler):
nested_var = self.make_var_name(role_prefix, nested_path)
if isinstance(sub_val, str):
inner_bits.append(f'{sub_key} = "{{{{ {nested_var} }}}}"')
elif isinstance(sub_val, bool):
inner_bits.append(
f"{sub_key} = {{{{ {nested_var} | lower }}}}"
)
else:
inner_bits.append(f"{sub_key} = {{{{ {nested_var} }}}}")
replacement_value = "{ " + ", ".join(inner_bits) + " }"
new_content = (
before_eq + "=" + leading_ws + replacement_value + comment_part
)
out_lines.append(new_content + newline)
continue
# If parsing fails, fall through to normal handling
# Normal scalar value handling (including bools, numbers, strings)
var_name = self.make_var_name(role_prefix, path)
use_quotes = (
len(raw_value) >= 2
and raw_value[0] == raw_value[-1]
and raw_value[0] in {'"', "'"}
)
# Check if value is a boolean in the text
is_bool = raw_value.strip().lower() in ("true", "false")
if use_quotes:
quote_char = raw_value[0]
replacement_value = f"{quote_char}{{{{ {var_name} }}}}{quote_char}"
elif is_bool:
replacement_value = f"{{{{ {var_name} | lower }}}}"
else:
replacement_value = f"{{{{ {var_name} }}}}"
new_content = (
before_eq + "=" + leading_ws + replacement_value + comment_part
)
out_lines.append(new_content + newline)
return "".join(out_lines)
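The value-replacement heuristics used in the text-based path above (preserve the original quote character, lowercase booleans, pass everything else through) can be sketched as a small standalone function; the function name and the sample variable names are illustrative:

```python
def replacement_for(raw_value: str, var_name: str) -> str:
    # Quoted string: re-use the original quote character around the expression
    use_quotes = (
        len(raw_value) >= 2
        and raw_value[0] == raw_value[-1]
        and raw_value[0] in {'"', "'"}
    )
    # Boolean literal in the source text: needs | lower for valid TOML
    is_bool = raw_value.strip().lower() in ("true", "false")
    if use_quotes:
        q = raw_value[0]
        return f"{q}{{{{ {var_name} }}}}{q}"
    if is_bool:
        return f"{{{{ {var_name} | lower }}}}"
    return f"{{{{ {var_name} }}}}"

print(replacement_for('"localhost"', "app_host"))
print(replacement_for("true", "app_debug"))
print(replacement_for("8080", "app_port"))
```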
def _generate_toml_template_with_loops_from_text(
self, role_prefix: str, text: str, loop_candidates: list[LoopCandidate]
) -> str:
"""
Generate a Jinja2 template for a TOML file with loop support.
"""
loop_paths = {candidate.path for candidate in loop_candidates}
lines = text.splitlines(keepends=True)
current_table: tuple[str, ...] = ()
out_lines: list[str] = []
skip_until_next_table = (
False # Track when we're inside a looped array-of-tables
)
for raw_line in lines:
line = raw_line
stripped = line.lstrip()
# Blank or pure comment
if not stripped or stripped.startswith("#"):
# Only output if we're not skipping
if not skip_until_next_table:
out_lines.append(raw_line)
continue
# Table header: [server] or [server.tls] or [[array.of.tables]]
if stripped.startswith("[") and "]" in stripped:
header = stripped
# Check if it's array-of-tables ([[name]]) or regular table ([name])
is_array_table = header.startswith("[[") and "]]" in header
if is_array_table:
# Extract content between [[ and ]]
start = header.find("[[") + 2
end = header.find("]]", start)
inner = header[start:end].strip() if end != -1 else ""
else:
# Extract content between [ and ]
start = header.find("[") + 1
end = header.find("]", start)
inner = header[start:end].strip() if end != -1 else ""
if inner:
parts = [p.strip() for p in inner.split(".") if p.strip()]
table_path = tuple(parts)
# Check if this is an array-of-tables that's a loop candidate
if is_array_table and table_path in loop_paths:
# If we're already skipping this table, this is a subsequent occurrence
if skip_until_next_table and current_table == table_path:
# This is a duplicate [[table]] - skip it
continue
# This is the first occurrence - generate the loop
current_table = table_path
candidate = next(
c for c in loop_candidates if c.path == table_path
)
# Generate the loop header
collection_var = self.make_var_name(role_prefix, candidate.path)
item_var = candidate.loop_var
# Get sample item to build template
if candidate.items:
sample_item = candidate.items[0]
# Build loop
out_lines.append(
f"{{% for {item_var} in {collection_var} %}}\n"
)
out_lines.append(f"[[{'.'.join(table_path)}]]\n")
# Add fields from sample item
for key, value in sample_item.items():
if key == "_key":
continue
if isinstance(value, str):
out_lines.append(
f'{key} = "{{{{ {item_var}.{key} }}}}"\n'
)
else:
out_lines.append(
f"{key} = {{{{ {item_var}.{key} }}}}\n"
)
out_lines.append("{% endfor %}\n")
# Skip all content until the next different table
skip_until_next_table = True
continue
else:
# Regular table or non-loop array - reset skip flag if it's a different table
if current_table != table_path:
skip_until_next_table = False
current_table = table_path
out_lines.append(raw_line)
continue
# If we're inside a skipped array-of-tables section, skip this line
if skip_until_next_table:
continue
# Try key = value
newline = ""
content = raw_line
if content.endswith("\r\n"):
newline = "\r\n"
content = content[:-2]
elif content.endswith("\n"):
newline = content[-1]
content = content[:-1]
eq_index = content.find("=")
if eq_index == -1:
out_lines.append(raw_line)
continue
before_eq = content[:eq_index]
after_eq = content[eq_index + 1 :]
key = before_eq.strip()
if not key:
out_lines.append(raw_line)
continue
# Whitespace after '='
value_ws_len = len(after_eq) - len(after_eq.lstrip(" \t"))
leading_ws = after_eq[:value_ws_len]
value_and_comment = after_eq[value_ws_len:]
value_part, comment_part = self._split_inline_comment(
value_and_comment, {"#"}
)
raw_value = value_part.strip()
# Path for this key (table + key)
path = current_table + (key,)
# Check if this path is a loop candidate
if path in loop_paths:
candidate = next(c for c in loop_candidates if c.path == path)
collection_var = self.make_var_name(role_prefix, candidate.path)
item_var = candidate.loop_var
if candidate.item_schema == "scalar":
# Scalar list loop
replacement_value = (
f"["
f"{{% for {item_var} in {collection_var} %}}"
f"{{{{ {item_var} }}}}"
f"{{% if not loop.last %}}, {{% endif %}}"
f"{{% endfor %}}"
f"]"
)
else:
# Dict/nested loop - use tojson filter for complex arrays
replacement_value = f"{{{{ {collection_var} | tojson }}}}"
new_content = (
before_eq + "=" + leading_ws + replacement_value + comment_part
)
out_lines.append(new_content + newline)
continue
# Special case: inline table
if (
raw_value.startswith("{")
and raw_value.endswith("}")
and tomllib is not None
):
try:
# Parse the inline table as a tiny TOML document
mini_source = "table = " + raw_value + "\n"
mini_data = tomllib.loads(mini_source)["table"]
except Exception:
mini_data = None
if isinstance(mini_data, dict):
inner_bits: list[str] = []
for sub_key, sub_val in mini_data.items():
nested_path = path + (sub_key,)
nested_var = self.make_var_name(role_prefix, nested_path)
if isinstance(sub_val, str):
inner_bits.append(f'{sub_key} = "{{{{ {nested_var} }}}}"')
elif isinstance(sub_val, bool):
inner_bits.append(
f"{sub_key} = {{{{ {nested_var} | lower }}}}"
)
else:
inner_bits.append(f"{sub_key} = {{{{ {nested_var} }}}}")
replacement_value = "{ " + ", ".join(inner_bits) + " }"
@@ -191,9 +528,14 @@ class TomlHandler(DictLikeHandler):
and raw_value[0] in {'"', "'"}
)
# Check if value is a boolean in the text
is_bool = raw_value.strip().lower() in ("true", "false")
if use_quotes:
quote_char = raw_value[0]
replacement_value = f"{quote_char}{{{{ {var_name} }}}}{quote_char}"
elif is_bool:
replacement_value = f"{{{{ {var_name} | lower }}}}"
else:
replacement_value = f"{{{{ {var_name} }}}}"


@@ -5,19 +5,19 @@ from pathlib import Path
from typing import Any
import xml.etree.ElementTree as ET # nosec
from .base import BaseHandler
from ..loop_analyzer import LoopCandidate
class XmlHandler(BaseHandler):
"""
XML handler that can generate both scalar templates and loop-based templates.
"""
fmt = "xml"
def parse(self, path: Path) -> ET.Element:
text = path.read_text(encoding="utf-8")
parser = ET.XMLParser(
target=ET.TreeBuilder(insert_comments=False)
) # nosec B314
@@ -30,12 +30,13 @@ class XmlHandler(BaseHandler):
raise TypeError("XML parser result must be an Element")
return self._flatten_xml(parsed)
def generate_jinja2_template(
self,
parsed: Any,
role_prefix: str,
original_text: str | None = None,
) -> str:
"""Original scalar-only template generation."""
if original_text is not None:
return self._generate_xml_template_from_text(role_prefix, original_text)
if not isinstance(parsed, ET.Element):
@@ -43,25 +44,30 @@
xml_str = ET.tostring(parsed, encoding="unicode")
return self._generate_xml_template_from_text(role_prefix, xml_str)
def generate_jinja2_template_with_loops(
self,
parsed: Any,
role_prefix: str,
original_text: str | None,
loop_candidates: list[LoopCandidate],
) -> str:
"""Generate template with Jinja2 for loops where appropriate."""
if original_text is not None:
return self._generate_xml_template_with_loops_from_text(
role_prefix, original_text, loop_candidates
)
if not isinstance(parsed, ET.Element):
raise TypeError("XML parser result must be an Element")
xml_str = ET.tostring(parsed, encoding="unicode")
return self._generate_xml_template_with_loops_from_text(
role_prefix, xml_str, loop_candidates
)
def _flatten_xml(self, root: ET.Element) -> list[tuple[tuple[str, ...], Any]]:
"""Flatten an XML tree into (path, value) pairs."""
items: list[tuple[tuple[str, ...], Any]] = []
def walk(elem: ET.Element, path: tuple[str, ...]) -> None:
@@ -77,10 +83,8 @@ class XmlHandler(BaseHandler):
text = (elem.text or "").strip()
if text:
if not elem.attrib and not children:
items.append((path, text))
else:
items.append((path + ("value",), text))
# Repeated siblings get an index; singletons just use the tag
@@ -97,24 +101,16 @@ class XmlHandler(BaseHandler):
child_path = path + (tag,)
walk(child, child_path)
walk(root, ())
return items
def _split_xml_prolog(self, text: str) -> tuple[str, str]:
"""Split XML into (prolog, body)."""
i = 0
n = len(text)
prolog_parts: list[str] = []
while i < n:
while i < n and text[i].isspace():
prolog_parts.append(text[i])
i += 1
@@ -146,22 +142,33 @@ class XmlHandler(BaseHandler):
continue
if text[i] == "<":
break
break
return "".join(prolog_parts), text[i:]
def _apply_jinja_to_xml_tree(
self,
role_prefix: str,
root: ET.Element,
loop_candidates: list[LoopCandidate] | None = None,
) -> None:
"""
Mutate XML tree in-place, replacing values with Jinja expressions.
If loop_candidates is provided, repeated elements matching a candidate
will be replaced with a {% for %} loop.
"""
# Build a map of loop paths for quick lookup
loop_paths = {}
if loop_candidates:
for candidate in loop_candidates:
loop_paths[candidate.path] = candidate
def walk(elem: ET.Element, path: tuple[str, ...]) -> None:
# Attributes (unless this element is in a loop)
for attr_name in list(elem.attrib.keys()):
attr_path = path + (f"@{attr_name}",)
var_name = self.make_var_name(role_prefix, attr_path)
@@ -180,51 +187,273 @@ class XmlHandler(BaseHandler):
var_name = self.make_var_name(role_prefix, text_path)
elem.text = f"{{{{ {var_name} }}}}"
# Handle children - check for loops first
counts = Counter(child.tag for child in children)
index_counters: dict[str, int] = defaultdict(int)
# Check each tag to see if it's a loop candidate
processed_tags = set()
for child in children:
tag = child.tag
# Skip if we've already processed this tag as a loop
if tag in processed_tags:
continue
child_path = path + (tag,)
# Check if this is a loop candidate
if child_path in loop_paths:
# Mark this tag as processed
processed_tags.add(tag)
# Remove all children with this tag
for child_to_remove in [c for c in children if c.tag == tag]:
elem.remove(child_to_remove)
# Create a loop comment/marker
# We'll handle the actual loop generation in text processing
loop_marker = ET.Comment(f"LOOP:{tag}")
elem.append(loop_marker)
elif counts[tag] > 1:
# Multiple children but not a loop candidate - use indexed paths
idx = index_counters[tag]
index_counters[tag] += 1
indexed_path = path + (tag, str(idx))
walk(child, indexed_path)
else:
# Single child
walk(child, child_path)
walk(root, ())
def _generate_xml_template_from_text(self, role_prefix: str, text: str) -> str:
"""Generate scalar-only Jinja2 template."""
prolog, body = self._split_xml_prolog(text)
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True)) # nosec B314
parser.feed(body)
root = parser.close()
self._apply_jinja_to_xml_tree(role_prefix, root)
indent = getattr(ET, "indent", None)
if indent is not None:
indent(root, space=" ") # type: ignore[arg-type]
xml_body = ET.tostring(root, encoding="unicode")
return prolog + xml_body
def _generate_xml_template_with_loops_from_text(
self,
role_prefix: str,
text: str,
loop_candidates: list[LoopCandidate],
) -> str:
"""Generate Jinja2 template with for loops."""
prolog, body = self._split_xml_prolog(text)
# Parse with comments preserved
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True)) # nosec B314
parser.feed(body)
root = parser.close()
# Apply Jinja transformations (including loop markers)
self._apply_jinja_to_xml_tree(role_prefix, root, loop_candidates)
# Convert to string
indent = getattr(ET, "indent", None)
if indent is not None:
indent(root, space=" ") # type: ignore[arg-type]
xml_body = ET.tostring(root, encoding="unicode")
# Post-process to replace loop markers with actual Jinja loops
xml_body = self._insert_xml_loops(xml_body, role_prefix, loop_candidates, root)
return prolog + xml_body
def _insert_xml_loops(
self,
xml_str: str,
role_prefix: str,
loop_candidates: list[LoopCandidate],
root: ET.Element,
) -> str:
"""
Post-process XML string to insert Jinja2 for loops.
This replaces <!--LOOP:tagname--> markers with actual loop constructs.
"""
# Build a sample element for each loop to use as template
lines = xml_str.split("\n")
result_lines = []
for line in lines:
# Check if this line contains a loop marker
if "<!--LOOP:" in line:
# Extract tag name from marker
start = line.find("<!--LOOP:") + 9
end = line.find("-->", start)
tag_name = line[start:end].strip()
# Find matching loop candidate
candidate = None
for cand in loop_candidates:
if cand.path and cand.path[-1] == tag_name:
candidate = cand
break
if candidate:
# Get indentation from current line
indent_level = len(line) - len(line.lstrip())
indent_str = " " * indent_level
# Generate loop variable name
collection_var = self.make_var_name(role_prefix, candidate.path)
item_var = candidate.loop_var
# Create sample element with ALL possible fields from ALL items
if candidate.items:
# Merge all items to get the union of all fields
merged_dict = self._merge_dicts_for_template(candidate.items)
sample_elem = self._dict_to_xml_element(
tag_name, merged_dict, item_var
)
# Apply indentation to the sample element
ET.indent(sample_elem, space=" ")
# Convert sample to string
sample_str = ET.tostring(
sample_elem, encoding="unicode"
).strip()
# Add proper indentation to each line of the sample
sample_lines = sample_str.split("\n")
# Build loop
result_lines.append(
f"{indent_str}{{% for {item_var} in {collection_var} %}}"
)
# Add each line of the sample with proper indentation
for sample_line in sample_lines:
result_lines.append(f"{indent_str} {sample_line}")
result_lines.append(f"{indent_str}{{% endfor %}}")
else:
# Keep the marker if we can't find the candidate
result_lines.append(line)
else:
result_lines.append(line)
# Post-process to replace <!--IF:...--> and <!--ENDIF:...--> with Jinja2 conditionals
final_lines = []
for line in result_lines:
# Replace <!--IF:var.field--> with {% if var.field is defined %}
if "<!--IF:" in line:
start = line.find("<!--IF:") + 7
end = line.find("-->", start)
condition = line[start:end]
indent = len(line) - len(line.lstrip())
final_lines.append(f"{' ' * indent}{{% if {condition} is defined %}}")
# Replace <!--ENDIF:field--> with {% endif %}
elif "<!--ENDIF:" in line:
indent = len(line) - len(line.lstrip())
final_lines.append(f"{' ' * indent}{{% endif %}}")
else:
final_lines.append(line)
return "\n".join(final_lines)
def _merge_dicts_for_template(self, items: list[dict[str, Any]]) -> dict[str, Any]:
"""
Merge all dicts to get the union of all possible keys.
This is used to generate XML templates that include ALL possible child
elements, even if they only appear in some items (like OSSEC rules where
different rules have different optional elements).
Args:
items: List of dict representations of XML elements
Returns:
Merged dict with all possible keys, using first occurrence as example
"""
merged: dict[str, Any] = {}
for item in items:
for key, value in item.items():
if key not in merged:
merged[key] = value
return merged
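The union-merge above can be sketched standalone: the first occurrence of each key wins, so heterogeneous items (like OSSEC-style rules where different rules carry different optional elements) each contribute their fields once. The sample data is hypothetical:

```python
# Merge a list of dicts into one dict containing the union of all keys,
# keeping the first-seen value for each key as the example.
def merge_dicts(items: list[dict]) -> dict:
    merged: dict = {}
    for item in items:
        for key, value in item.items():
            if key not in merged:
                merged[key] = value
    return merged

rules = [{"@id": "100", "match": "sshd"}, {"@id": "101", "regex": "fail"}]
print(merge_dicts(rules))
```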
def _dict_to_xml_element(
self, tag: str, data: dict[str, Any], loop_var: str
) -> ET.Element:
"""
Convert a dict to an XML element with Jinja2 variable references.
For heterogeneous XML (like OSSEC rules), this generates conditional
Jinja2 for optional child elements.
Args:
tag: Element tag name
data: Dict representing element structure (merged from all items)
loop_var: Loop variable name to use in Jinja expressions
"""
elem = ET.Element(tag)
# Handle attributes and child elements
for key, value in data.items():
if key.startswith("@"):
# Attribute - these come from element attributes
attr_name = key[1:] # Remove @ prefix
# Use simple variable reference - attributes should always exist
elem.set(attr_name, f"{{{{ {loop_var}.{attr_name} }}}}")
elif key == "_text":
# Simple text content - use ._text accessor for dict-based items
elem.text = f"{{{{ {loop_var}._text }}}}"
elif key == "value":
# Text with attributes/children
elem.text = f"{{{{ {loop_var}.value }}}}"
elif key == "_key":
# This is the dict key (for dict collections), skip in XML
pass
elif isinstance(value, dict):
# Nested element - wrap in conditional since it might not exist in all items
# Create a conditional wrapper comment
child = ET.Element(key)
if "_text" in value:
child.text = f"{{{{ {loop_var}.{key}._text }}}}"
else:
# More complex nested structure
for sub_key, sub_val in value.items():
if not sub_key.startswith("_"):
grandchild = ET.SubElement(child, sub_key)
grandchild.text = f"{{{{ {loop_var}.{key}.{sub_key} }}}}"
# Wrap the child in a Jinja if statement (will be done via text replacement)
# For now, add a marker comment before the element
marker = ET.Comment(f"IF:{loop_var}.{key}")
elem.append(marker)
elem.append(child)
end_marker = ET.Comment(f"ENDIF:{key}")
elem.append(end_marker)
elif not isinstance(value, list):
# Simple child element (scalar value) - also wrap in conditional
marker = ET.Comment(f"IF:{loop_var}.{key}")
elem.append(marker)
child = ET.SubElement(elem, key)
child.text = f"{{{{ {loop_var}.{key} }}}}"
end_marker = ET.Comment(f"ENDIF:{key}")
elem.append(end_marker)
return elem


@@ -4,23 +4,29 @@ import yaml
from pathlib import Path
from typing import Any
from .dict import DictLikeHandler
from ..loop_analyzer import LoopCandidate
class YamlHandler(DictLikeHandler):
"""
YAML handler that can generate both scalar templates and loop-based templates.
"""
fmt = "yaml"
flatten_lists = True
def parse(self, path: Path) -> Any:
text = path.read_text(encoding="utf-8")
return yaml.safe_load(text) or {}
def generate_jinja2_template(
self,
parsed: Any,
role_prefix: str,
original_text: str | None = None,
) -> str:
"""Original scalar-only template generation."""
if original_text is not None:
return self._generate_yaml_template_from_text(role_prefix, original_text)
if not isinstance(parsed, (dict, list)):
@ -28,29 +34,41 @@ class YamlHandler(DictLikeHandler):
dumped = yaml.safe_dump(parsed, sort_keys=False) dumped = yaml.safe_dump(parsed, sort_keys=False)
return self._generate_yaml_template_from_text(role_prefix, dumped) return self._generate_yaml_template_from_text(role_prefix, dumped)
def generate_jinja2_template_with_loops(
self,
parsed: Any,
role_prefix: str,
original_text: str | None,
loop_candidates: list[LoopCandidate],
) -> str:
"""Generate template with Jinja2 for loops where appropriate."""
# Build loop path set for quick lookup
loop_paths = {candidate.path for candidate in loop_candidates}
if original_text is not None:
return self._generate_yaml_template_with_loops_from_text(
role_prefix, original_text, loop_candidates, loop_paths
)
if not isinstance(parsed, (dict, list)):
raise TypeError("YAML parser result must be a dict or list")
dumped = yaml.safe_dump(parsed, sort_keys=False)
return self._generate_yaml_template_with_loops_from_text(
role_prefix, dumped, loop_candidates, loop_paths
)
def _generate_yaml_template_from_text(
self,
role_prefix: str,
text: str,
) -> str:
-"""
-Generate a Jinja2 template for a YAML file, preserving comments and
-blank lines by patching scalar values in-place.
-This handles common "config-ish" YAML:
-- top-level and nested mappings
-- lists of scalars
-- lists of small mapping objects
-It does *not* aim to support all YAML edge cases (anchors, tags, etc.).
-"""
+"""Original scalar-only template generation (unchanged from base)."""
lines = text.splitlines(keepends=True)
out_lines: list[str] = []
-# Simple indentation-based context stack: (indent, path, kind)
-# kind is "map" or "seq".
stack: list[tuple[int, tuple[str, ...], str]] = []
-# Track index per parent path for sequences
seq_counters: dict[tuple[str, ...], int] = {}

def current_path() -> tuple[str, ...]:

@@ -60,53 +78,39 @@
stripped = raw_line.lstrip()
indent = len(raw_line) - len(stripped)
-# Blank or pure comment lines unchanged
if not stripped or stripped.startswith("#"):
out_lines.append(raw_line)
continue
-# Adjust stack based on indent
while stack and indent < stack[-1][0]:
stack.pop()
-# --- Handle mapping key lines: "key:" or "key: value"
if ":" in stripped and not stripped.lstrip().startswith("- "):
-# separate key and rest
key_part, rest = stripped.split(":", 1)
key = key_part.strip()
if not key:
out_lines.append(raw_line)
continue
-# Is this just "key:" or "key: value"?
rest_stripped = rest.lstrip(" \t")
-# Use the same inline-comment splitter to see if there's any real value
value_candidate, _ = self._split_inline_comment(rest_stripped, {"#"})
has_value = bool(value_candidate.strip())
-# Update stack/context: current mapping at this indent
-# Replace any existing mapping at same indent
if stack and stack[-1][0] == indent and stack[-1][2] == "map":
stack.pop()
path = current_path() + (key,)
stack.append((indent, path, "map"))
if not has_value:
-# Just "key:" -> collection or nested structure begins on following lines.
out_lines.append(raw_line)
continue
-# We have an inline scalar value on this same line.
-# Separate value from inline comment
value_part, comment_part = self._split_inline_comment(
rest_stripped, {"#"}
)
raw_value = value_part.strip()
var_name = self.make_var_name(role_prefix, path)
-# Keep quote-style if original was quoted
use_quotes = (
len(raw_value) >= 2
and raw_value[0] == raw_value[-1]

@@ -120,7 +124,8 @@
replacement = f"{{{{ {var_name} }}}}"
leading = rest[: len(rest) - len(rest.lstrip(" \t"))]
-new_stripped = f"{key}: {leading}{replacement}{comment_part}"
+new_rest = f"{leading}{replacement}{comment_part}"
+new_stripped = f"{key}:{new_rest}"
out_lines.append(
" " * indent
+ new_stripped

@@ -128,20 +133,14 @@
)
continue
-# --- Handle list items: "- value" or "- key: value"
if stripped.startswith("- "):
-# Determine parent path
-# If top of stack isn't sequence at this indent, push one using current path
if not stack or stack[-1][0] != indent or stack[-1][2] != "seq":
parent_path = current_path()
stack.append((indent, parent_path, "seq"))
parent_path = stack[-1][1]
-content = stripped[2:]  # after "- "
+content = stripped[2:]
-parent_path = stack[-1][1]
-content = stripped[2:]  # after "- "
-# Determine index for this parent path
index = seq_counters.get(parent_path, 0)
seq_counters[parent_path] = index + 1

@@ -151,8 +150,6 @@
raw_value = value_part.strip()
var_name = self.make_var_name(role_prefix, path)
-# If it's of the form "key: value" inside the list, we could try to
-# support that, but a simple scalar is the common case:
use_quotes = (
len(raw_value) >= 2
and raw_value[0] == raw_value[-1]

@@ -173,7 +170,282 @@
)
continue
-# Anything else (multi-line scalars, weird YAML): leave untouched
out_lines.append(raw_line)
return "".join(out_lines)
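The quote-preserving substitution used throughout both template generators can be shown in isolation. This is a hypothetical standalone helper mirroring the repeated `use_quotes` logic, not a function the project actually exposes:

```python
def replace_scalar(raw_value: str, var_name: str) -> str:
    """Sketch: substitute a YAML scalar with a Jinja2 expression,
    keeping the original quote style if the value was quoted."""
    use_quotes = (
        len(raw_value) >= 2
        and raw_value[0] == raw_value[-1]
        and raw_value[0] in {'"', "'"}
    )
    if use_quotes:
        q = raw_value[0]  # reuse the same quote character
        return f"{q}{{{{ {var_name} }}}}{q}"
    return f"{{{{ {var_name} }}}}"
```

So `"hello"` becomes `"{{ role_name }}"` while a bare `8080` becomes `{{ role_port }}`, preserving the file's original quoting conventions.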
def _generate_yaml_template_with_loops_from_text(
self,
role_prefix: str,
text: str,
loop_candidates: list[LoopCandidate],
loop_paths: set[tuple[str, ...]],
) -> str:
"""
Generate YAML template with Jinja2 for loops.
Strategy:
1. Parse YAML line-by-line maintaining context
2. When we encounter a path that's a loop candidate:
- Replace that section with a {% for %} loop
- Use the first item as template structure
3. Everything else gets scalar variable replacement
"""
lines = text.splitlines(keepends=True)
out_lines: list[str] = []
stack: list[tuple[int, tuple[str, ...], str]] = []
seq_counters: dict[tuple[str, ...], int] = {}
# Track which lines are part of loop sections (to skip them)
skip_until_indent: int | None = None
def current_path() -> tuple[str, ...]:
return stack[-1][1] if stack else ()
for raw_line in lines:
stripped = raw_line.lstrip()
indent = len(raw_line) - len(stripped)
# If we're skipping lines (inside a loop section), check if we can stop
if skip_until_indent is not None:
if (
indent <= skip_until_indent
and stripped
and not stripped.startswith("#")
):
skip_until_indent = None
else:
continue # Skip this line
# Blank or comment lines
if not stripped or stripped.startswith("#"):
out_lines.append(raw_line)
continue
# Adjust stack based on indent
while stack and indent < stack[-1][0]:
stack.pop()
# --- Handle mapping key lines: "key:" or "key: value"
if ":" in stripped and not stripped.lstrip().startswith("- "):
key_part, rest = stripped.split(":", 1)
key = key_part.strip()
if not key:
out_lines.append(raw_line)
continue
rest_stripped = rest.lstrip(" \t")
value_candidate, _ = self._split_inline_comment(rest_stripped, {"#"})
has_value = bool(value_candidate.strip())
if stack and stack[-1][0] == indent and stack[-1][2] == "map":
stack.pop()
path = current_path() + (key,)
stack.append((indent, path, "map"))
# Check if this path is a loop candidate
if path in loop_paths:
# Find the matching candidate
candidate = next(c for c in loop_candidates if c.path == path)
# Generate loop
loop_str = self._generate_yaml_loop(candidate, role_prefix, indent)
out_lines.append(loop_str)
# Skip subsequent lines that are part of this collection
skip_until_indent = indent
continue
if not has_value:
out_lines.append(raw_line)
continue
# Scalar value - replace with variable
value_part, comment_part = self._split_inline_comment(
rest_stripped, {"#"}
)
raw_value = value_part.strip()
var_name = self.make_var_name(role_prefix, path)
use_quotes = (
len(raw_value) >= 2
and raw_value[0] == raw_value[-1]
and raw_value[0] in {'"', "'"}
)
if use_quotes:
q = raw_value[0]
replacement = f"{q}{{{{ {var_name} }}}}{q}"
else:
replacement = f"{{{{ {var_name} }}}}"
leading = rest[: len(rest) - len(rest.lstrip(" \t"))]
new_rest = f"{leading}{replacement}{comment_part}"
new_stripped = f"{key}:{new_rest}"
out_lines.append(
" " * indent
+ new_stripped
+ ("\n" if raw_line.endswith("\n") else "")
)
continue
# --- Handle list items: "- value" or "- key: value"
if stripped.startswith("- "):
if not stack or stack[-1][0] != indent or stack[-1][2] != "seq":
parent_path = current_path()
stack.append((indent, parent_path, "seq"))
parent_path = stack[-1][1]
# Check if parent path is a loop candidate
if parent_path in loop_paths:
# Find the matching candidate
candidate = next(
c for c in loop_candidates if c.path == parent_path
)
# Generate loop (with indent for the '-' items)
loop_str = self._generate_yaml_loop(
candidate, role_prefix, indent, is_list=True
)
out_lines.append(loop_str)
# Skip subsequent items
skip_until_indent = indent - 1 if indent > 0 else None
continue
content = stripped[2:]
index = seq_counters.get(parent_path, 0)
seq_counters[parent_path] = index + 1
path = parent_path + (str(index),)
value_part, comment_part = self._split_inline_comment(content, {"#"})
raw_value = value_part.strip()
var_name = self.make_var_name(role_prefix, path)
use_quotes = (
len(raw_value) >= 2
and raw_value[0] == raw_value[-1]
and raw_value[0] in {'"', "'"}
)
if use_quotes:
q = raw_value[0]
replacement = f"{q}{{{{ {var_name} }}}}{q}"
else:
replacement = f"{{{{ {var_name} }}}}"
new_stripped = f"- {replacement}{comment_part}"
out_lines.append(
" " * indent
+ new_stripped
+ ("\n" if raw_line.endswith("\n") else "")
)
continue
out_lines.append(raw_line)
return "".join(out_lines)
def _generate_yaml_loop(
self,
candidate: LoopCandidate,
role_prefix: str,
indent: int,
is_list: bool = False,
) -> str:
"""
Generate a Jinja2 for loop for a YAML collection.
Args:
candidate: Loop candidate with items and metadata
role_prefix: Variable prefix
indent: Indentation level in spaces
is_list: True if this is a YAML list, False if dict
Returns:
YAML string with Jinja2 loop
"""
indent_str = " " * indent
collection_var = self.make_var_name(role_prefix, candidate.path)
item_var = candidate.loop_var
lines = []
if not is_list:
# Dict-style: key: {% for ... %}
key = candidate.path[-1] if candidate.path else "items"
lines.append(f"{indent_str}{key}:")
lines.append(f"{indent_str} {{% for {item_var} in {collection_var} -%}}")
else:
# List-style: just the loop
lines.append(f"{indent_str}{{% for {item_var} in {collection_var} -%}}")
# Generate template for item structure
if candidate.items:
sample_item = candidate.items[0]
item_indent = indent + 2 if not is_list else indent
if candidate.item_schema == "scalar":
# Simple list of scalars
if is_list:
lines.append(f"{indent_str}- {{{{ {item_var} }}}}")
else:
lines.append(f"{indent_str} - {{{{ {item_var} }}}}")
elif candidate.item_schema in ("simple_dict", "nested"):
# List of dicts or complex items - these are ALWAYS list items in YAML
item_lines = self._dict_to_yaml_lines(
sample_item, item_var, item_indent, is_list_item=True
)
lines.extend(item_lines)
# Close loop
close_indent = indent + 2 if not is_list else indent
lines.append(f"{' ' * close_indent}{{% endfor %}}")
return "\n".join(lines) + "\n"
def _dict_to_yaml_lines(
self,
data: dict[str, Any],
loop_var: str,
indent: int,
is_list_item: bool = False,
) -> list[str]:
"""
Convert a dict to YAML lines with Jinja2 variable references.
Args:
data: Dict representing item structure
loop_var: Loop variable name
indent: Base indentation level
is_list_item: True if this should start with '-'
Returns:
List of YAML lines
"""
lines = []
indent_str = " " * indent
first_key = True
for key, value in data.items():
if key == "_key":
# Special key for dict collections - output as comment or skip
continue
if first_key and is_list_item:
# First key gets the list marker
lines.append(f"{indent_str}- {key}: {{{{ {loop_var}.{key} }}}}")
first_key = False
else:
# Subsequent keys are indented
sub_indent = indent + 2 if is_list_item else indent
lines.append(f"{' ' * sub_indent}{key}: {{{{ {loop_var}.{key} }}}}")
return lines
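For the dict-style scalar-list case handled by `_generate_yaml_loop`, the emitted template has a fixed shape. This standalone sketch (illustrative names, not the handler's real signature) reproduces it:

```python
def yaml_scalar_loop(key: str, collection_var: str, item_var: str,
                     indent: int = 0) -> str:
    """Sketch: render a 'key:' mapping whose list items come from a
    Jinja2 for loop over an Ansible variable."""
    pad = " " * indent
    lines = [
        f"{pad}{key}:",
        f"{pad}  {{% for {item_var} in {collection_var} -%}}",
        f"{pad}  - {{{{ {item_var} }}}}",
        f"{pad}  {{% endfor %}}",
    ]
    return "\n".join(lines) + "\n"


out = yaml_scalar_loop("servers", "myrole_servers", "server")
```

Rendering that template with `myrole_servers: [a, b, c]` yields a plain YAML list under `servers:`, with the `-%}` whitespace control keeping blank lines out of the output.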


@@ -0,0 +1,470 @@
"""
Loop detection and analysis for intelligent Jinja2 template generation.
This module determines when config structures should use Jinja2 'for' loops
instead of flattened scalar variables.
"""
from __future__ import annotations
from collections import Counter
from typing import Any, Literal
class LoopCandidate:
"""
Represents a detected loop opportunity in the config structure.
Attributes:
path: Path to the collection (e.g. ("servers",) or ("config", "endpoints"))
loop_var: Variable name for loop items (e.g. "server", "endpoint")
items: The actual list/dict items that will be looped over
item_schema: Structure of each item ("scalar", "simple_dict", "nested")
confidence: How confident we are this should be a loop (0.0 to 1.0)
"""
def __init__(
self,
path: tuple[str, ...],
loop_var: str,
items: list[Any] | dict[str, Any],
item_schema: Literal["scalar", "simple_dict", "nested"],
confidence: float = 1.0,
):
self.path = path
self.loop_var = loop_var
self.items = items
self.item_schema = item_schema
self.confidence = confidence
def __repr__(self) -> str:
path_str = ".".join(self.path) if self.path else "<root>"
return (
f"LoopCandidate(path={path_str}, var={self.loop_var}, "
f"count={len(self.items)}, schema={self.item_schema}, "
f"confidence={self.confidence:.2f})"
)
class LoopAnalyzer:
"""
Analyzes parsed config structures to detect loop opportunities.
Strategy:
1. Detect homogeneous lists (all items same type/structure)
2. Detect dict collections where all values have similar structure
3. Assign confidence scores based on:
- Homogeneity of items
- Number of items (2+ for loops to make sense)
- Depth and complexity (too nested -> fallback to scalars)
- Structural patterns (e.g., repeated XML elements)
"""
# Configuration thresholds
MIN_ITEMS_FOR_LOOP = 2 # Need at least 2 items to justify a loop
MAX_NESTING_DEPTH = 3 # Beyond this, use scalar fallback
MIN_CONFIDENCE = 0.7 # Minimum confidence to use a loop
def __init__(self):
self.candidates: list[LoopCandidate] = []
def analyze(self, parsed: Any, fmt: str) -> list[LoopCandidate]:
"""
Analyze a parsed config structure and return loop candidates.
Args:
parsed: The parsed config (dict, list, or ET.Element for XML)
fmt: Format type ("yaml", "json", "toml", "xml", "ini")
Returns:
List of LoopCandidate objects, sorted by path depth (shallowest first)
"""
self.candidates = []
if fmt == "xml":
self._analyze_xml(parsed)
elif fmt in ("yaml", "json", "toml"):
self._analyze_dict_like(parsed, path=())
elif fmt == "ini":
# INI files are typically flat key-value, not suitable for loops
pass
# Sort by path depth (process parent structures before children)
self.candidates.sort(key=lambda c: len(c.path))
return self.candidates
def _analyze_dict_like(
self,
obj: Any,
path: tuple[str, ...],
depth: int = 0,
parent_is_list: bool = False,
) -> None:
"""Recursively analyze dict/list structures."""
# Safety: don't go too deep
if depth > self.MAX_NESTING_DEPTH:
return
if isinstance(obj, dict):
# Check if this dict's values form a homogeneous collection
if len(obj) >= self.MIN_ITEMS_FOR_LOOP:
candidate = self._check_dict_collection(obj, path)
if candidate:
self.candidates.append(candidate)
# Don't recurse into items we've marked as a loop
return
# Recurse into dict values
for key, value in obj.items():
self._analyze_dict_like(
value, path + (str(key),), depth + 1, parent_is_list=False
)
elif isinstance(obj, list):
# Don't create loop candidates for nested lists (lists inside lists)
# These are too complex for clean template generation and should fall back to scalar handling
if parent_is_list:
return
# Check if this list is homogeneous
if len(obj) >= self.MIN_ITEMS_FOR_LOOP:
candidate = self._check_list_collection(obj, path)
if candidate:
self.candidates.append(candidate)
# Don't recurse into items we've marked as a loop
return
# If not a good loop candidate, recurse into items
# Pass parent_is_list=True so nested lists won't create loop candidates
for i, item in enumerate(obj):
self._analyze_dict_like(
item, path + (str(i),), depth + 1, parent_is_list=True
)
def _check_list_collection(
self, items: list[Any], path: tuple[str, ...]
) -> LoopCandidate | None:
"""Check if a list should be a loop."""
if not items:
return None
# Analyze item types and structures
item_types = [type(item).__name__ for item in items]
type_counts = Counter(item_types)
# Must be homogeneous (all same type)
if len(type_counts) != 1:
return None
item_type = item_types[0]
# Scalar list (strings, numbers, bools)
if item_type in ("str", "int", "float", "bool", "NoneType"):
return LoopCandidate(
path=path,
loop_var=self._derive_loop_var(path, singular=True),
items=items,
item_schema="scalar",
confidence=1.0,
)
# List of dicts - check structural homogeneity
if item_type == "dict":
schema = self._analyze_dict_schema(items)
if schema == "simple_dict":
return LoopCandidate(
path=path,
loop_var=self._derive_loop_var(path, singular=True),
items=items,
item_schema="simple_dict",
confidence=0.95,
)
elif schema == "homogeneous":
return LoopCandidate(
path=path,
loop_var=self._derive_loop_var(path, singular=True),
items=items,
item_schema="simple_dict",
confidence=0.85,
)
# If too complex/heterogeneous, return None (use scalar fallback)
return None
def _check_dict_collection(
self, obj: dict[str, Any], path: tuple[str, ...]
) -> LoopCandidate | None:
"""
Check if a dict's values form a collection suitable for looping.
Example: {"server1": {...}, "server2": {...}} where all values
have the same structure.
NOTE: Currently disabled for TOML compatibility. TOML's dict-of-tables
syntax ([servers.alpha], [servers.beta]) cannot be easily converted to
loops without restructuring the entire TOML format. To maintain consistency
between Ansible YAML and Jinja2 templates, we treat these as scalars.
"""
# TODO: Re-enable this if we implement proper dict-of-tables loop generation
# For now, return None to use scalar handling
return None
# Original logic preserved below for reference:
# if not obj:
# return None
#
# values = list(obj.values())
#
# # Check type homogeneity
# value_types = [type(v).__name__ for v in values]
# type_counts = Counter(value_types)
#
# if len(type_counts) != 1:
# return None
#
# value_type = value_types[0]
#
# # Only interested in dict values for dict collections
# # (scalar-valued dicts stay as scalars)
# if value_type != "dict":
# return None
#
# # Check structural homogeneity
# schema = self._analyze_dict_schema(values)
# if schema in ("simple_dict", "homogeneous"):
# confidence = 0.9 if schema == "simple_dict" else 0.8
#
# # Convert dict to list of items with 'key' added
# items_with_keys = [{"_key": k, **v} for k, v in obj.items()]
#
# return LoopCandidate(
# path=path,
# loop_var=self._derive_loop_var(path, singular=True),
# items=items_with_keys,
# item_schema="simple_dict",
# confidence=confidence,
# )
#
# return None
def _analyze_dict_schema(
self, dicts: list[dict[str, Any]]
) -> Literal["simple_dict", "homogeneous", "heterogeneous"]:
"""
Analyze a list of dicts to determine their structural homogeneity.
Returns:
"simple_dict": All dicts have same keys, all values are scalars
"homogeneous": All dicts have same keys, may have nested structures
"heterogeneous": Dicts have different structures
"""
if not dicts:
return "heterogeneous"
# Get key sets from each dict
key_sets = [set(d.keys()) for d in dicts]
# Check if all have the same keys
first_keys = key_sets[0]
if not all(ks == first_keys for ks in key_sets):
# Allow minor variations (80% key overlap)
all_keys = set().union(*key_sets)
common_keys = set.intersection(*key_sets)
if len(common_keys) / len(all_keys) < 0.8:
return "heterogeneous"
# Check if values are all scalars
all_scalars = True
for d in dicts:
for v in d.values():
if isinstance(v, (dict, list)):
all_scalars = False
break
if not all_scalars:
break
if all_scalars:
return "simple_dict"
else:
return "homogeneous"
def _derive_loop_var(self, path: tuple[str, ...], singular: bool = True) -> str:
"""
Derive a sensible loop variable name from the path.
Examples:
("servers",) -> "server" (singular)
("config", "endpoints") -> "endpoint"
("users",) -> "user"
("databases",) -> "database"
"""
if not path:
return "item"
last_part = path[-1].lower()
if singular:
# Simple English pluralization rules (order matters - most specific first)
if last_part.endswith("sses"):
return last_part[:-2] # "classes" -> "class"
elif last_part.endswith("xes"):
return last_part[:-2] # "boxes" -> "box"
elif last_part.endswith("ches"):
return last_part[:-2] # "watches" -> "watch"
elif last_part.endswith("shes"):
return last_part[:-2] # "dishes" -> "dish"
elif last_part.endswith("ies"):
return last_part[:-3] + "y" # "entries" -> "entry"
elif last_part.endswith("oes"):
return last_part[:-2] # "tomatoes" -> "tomato"
elif last_part.endswith("ses") and not last_part.endswith("sses"):
# Only for words ending in "se": "databases" -> "database"
# But NOT for "sses" which we already handled
if len(last_part) > 3 and last_part[-4] not in "aeiou":
# "databases" -> "database" (consonant before 's')
return last_part[:-1]
else:
# "houses" -> "house", "causes" -> "cause"
return last_part[:-1]
elif last_part.endswith("s") and not last_part.endswith("ss"):
return last_part[:-1] # "servers" -> "server"
return last_part
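The suffix rules above can be exercised outside the class; this is a standalone sketch of the same rule order (most specific suffix first), simplified to a lookup table:

```python
def singularize(word: str) -> str:
    """Sketch of _derive_loop_var's suffix handling, simplified."""
    w = word.lower()
    # (suffix, chars to cut, replacement) -- checked most specific first
    rules = (
        ("sses", 2, ""),   # "classes"   -> "class"
        ("xes", 2, ""),    # "boxes"     -> "box"
        ("ches", 2, ""),   # "watches"   -> "watch"
        ("shes", 2, ""),   # "dishes"    -> "dish"
        ("ies", 3, "y"),   # "entries"   -> "entry"
        ("oes", 2, ""),    # "tomatoes"  -> "tomato"
        ("ses", 1, ""),    # "databases" -> "database"
    )
    for suffix, cut, add in rules:
        if w.endswith(suffix):
            return w[:-cut] + add
    if w.endswith("s") and not w.endswith("ss"):
        return w[:-1]      # "servers" -> "server"
    return w               # "address" and friends stay unchanged
```

Like the original, this is a heuristic for common English config vocabulary, not a full inflection library.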
def _analyze_xml(self, root: Any) -> None:
"""
Analyze XML structure for loop opportunities.
XML is particularly suited for loops when we have repeated sibling elements.
"""
import xml.etree.ElementTree as ET # nosec B405
if not isinstance(root, ET.Element):
return
self._walk_xml_element(root, path=())
def _walk_xml_element(self, elem: Any, path: tuple[str, ...]) -> None:
"""Recursively walk XML elements looking for repeated siblings."""
children = [c for c in list(elem) if isinstance(c.tag, str)]
# Count sibling elements by tag
tag_counts = Counter(child.tag for child in children)
# Find repeated tags
for tag, count in tag_counts.items():
if count >= self.MIN_ITEMS_FOR_LOOP:
# Get all elements with this tag
tagged_elements = [c for c in children if c.tag == tag]
# Check homogeneity
if self._are_xml_elements_homogeneous(tagged_elements):
# Convert to dict representation for easier handling
items = [self._xml_elem_to_dict(el) for el in tagged_elements]
# Determine schema
if all(self._is_scalar_dict(item) for item in items):
schema = "simple_dict"
confidence = 1.0
else:
schema = "nested"
confidence = 0.8
candidate = LoopCandidate(
path=path + (tag,),
loop_var=self._derive_loop_var((tag,), singular=True),
items=items,
item_schema=schema,
confidence=confidence,
)
self.candidates.append(candidate)
# Recurse into unique children (non-repeated ones will be processed normally)
for tag, count in tag_counts.items():
if count == 1:
child = next(c for c in children if c.tag == tag)
self._walk_xml_element(child, path + (tag,))
def _are_xml_elements_homogeneous(self, elements: list[Any]) -> bool:
"""Check if XML elements have similar structure."""
if not elements:
return False
# Compare attribute sets
attr_sets = [set(el.attrib.keys()) for el in elements]
first_attrs = attr_sets[0]
if not all(attrs == first_attrs for attrs in attr_sets):
# Allow some variation
all_attrs = set().union(*attr_sets)
common_attrs = set.intersection(*attr_sets) if attr_sets else set()
# Very permissive for attributes - 20% overlap is OK
if len(common_attrs) / max(len(all_attrs), 1) < 0.2:
return False
# Compare child element tags
child_tag_sets = [
set(c.tag for c in el if hasattr(c, "tag")) for el in elements
]
if child_tag_sets:
first_tags = child_tag_sets[0]
if not all(tags == first_tags for tags in child_tag_sets):
# Allow significant variation for XML - just need SOME commonality
# This is important for cases like OSSEC rules where each rule
# has different optional child elements (if_sid, url_pcre2, etc.)
all_tags = set().union(*child_tag_sets)
common_tags = (
set.intersection(*child_tag_sets) if child_tag_sets else set()
)
# Lower threshold to 20% - if they share at least 20% of tags, consider them similar
# Even if they just share 'description' or 'id' fields, that's enough
if len(common_tags) / max(len(all_tags), 1) < 0.2:
return False
return True
def _xml_elem_to_dict(self, elem: Any) -> dict[str, Any]:
"""Convert an XML element to a dict representation."""
result: dict[str, Any] = {}
# Add attributes
for attr_name, attr_val in elem.attrib.items():
result[f"@{attr_name}"] = attr_val
# Add text content
text = (elem.text or "").strip()
if text:
children = [c for c in list(elem) if hasattr(c, "tag")]
if not elem.attrib and not children:
result["_text"] = text
else:
result["value"] = text
# Add child elements
for child in elem:
if hasattr(child, "tag"):
child_dict = self._xml_elem_to_dict(child)
if child.tag in result:
# Multiple children with same tag - convert to list
if not isinstance(result[child.tag], list):
result[child.tag] = [result[child.tag]]
result[child.tag].append(child_dict)
else:
result[child.tag] = child_dict
return result
def _is_scalar_dict(self, obj: dict[str, Any]) -> bool:
"""Check if a dict contains only scalar values (no nested dicts/lists)."""
for v in obj.values():
if isinstance(v, (dict, list)):
return False
return True
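The analyzer's core list rule (homogeneous scalar lists with at least `MIN_ITEMS_FOR_LOOP` entries become loop candidates) can be sketched without the rest of the class. This is a toy reimplementation for illustration, not the `LoopAnalyzer` API:

```python
from collections import Counter


def detect_scalar_loops(parsed, path=(), min_items=2):
    """Sketch: return the paths of homogeneous scalar lists that would
    qualify as loop candidates under the analyzer's list rule."""
    found = []
    if isinstance(parsed, dict):
        for key, value in parsed.items():
            found += detect_scalar_loops(value, path + (str(key),), min_items)
    elif isinstance(parsed, list) and len(parsed) >= min_items:
        types = Counter(type(item).__name__ for item in parsed)
        # Must be homogeneous and scalar-typed to justify a loop
        if len(types) == 1 and next(iter(types)) in ("str", "int", "float", "bool"):
            found.append(path)
    return found


cfg = {"servers": ["a", "b", "c"], "port": 8080, "mixed": [1, "x"]}
```

Here only `("servers",)` qualifies: `port` is a scalar and `mixed` fails the homogeneity check, so both would fall back to flat scalar variables.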


@@ -31,4 +31,4 @@ def test_base_handler_abstract_methods_raise_not_implemented(tmp_path: Path):
handler.flatten(object())
with pytest.raises(NotImplementedError):
-handler.generate_template(parsed=object(), role_prefix="role")
+handler.generate_jinja2_template(parsed=object(), role_prefix="role")


@@ -1,10 +1,6 @@
from __future__ import annotations
-import sys
from pathlib import Path
-import pytest
from jinjaturtle import cli

SAMPLES_DIR = Path(__file__).parent / "samples"

@@ -66,20 +62,3 @@ def test_cli_writes_output_files(tmp_path, capsys):
# When writing to files, we shouldn't print the big headers
assert "# defaults/main.yml" not in captured.out
assert "# config.j2" not in captured.out
-def test_main_wrapper_exits_with_zero(monkeypatch):
-"""
-Cover the main() wrapper that raises SystemExit.
-"""
-cfg_path = SAMPLES_DIR / "tom.toml"
-monkeypatch.setattr(
-sys,
-"argv",
-["jinjaturtle", str(cfg_path), "-r", "jinjaturtle"],
-)
-with pytest.raises(SystemExit) as exc:
-cli.main()
-assert exc.value.code


@@ -10,8 +10,8 @@ from jinjaturtle.core import (
detect_format,
parse_config,
flatten_config,
-generate_defaults_yaml,
-generate_template,
+generate_ansible_yaml,
+generate_jinja2_template,
make_var_name,
)
@@ -90,9 +90,9 @@ def test_parse_config_unsupported_format(tmp_path: Path):
parse_config(cfg_path, fmt="bogus")

-def test_generate_template_type_and_format_errors():
+def test_generate_jinja2_template_type_and_format_errors():
"""
-Exercise the error branches in generate_template:
+Exercise the error branches in generate_jinja2_template:
- toml with non-dict parsed
- ini with non-ConfigParser parsed
- yaml with wrong parsed type
@@ -101,27 +101,29 @@ def test_generate_jinja2_template_type_and_format_errors():
"""
# wrong type for TOML
with pytest.raises(TypeError):
-generate_template("toml", parsed="not a dict", role_prefix="role")
+generate_jinja2_template("toml", parsed="not a dict", role_prefix="role")

# wrong type for INI
with pytest.raises(TypeError):
-generate_template("ini", parsed={"not": "a configparser"}, role_prefix="role")
+generate_jinja2_template(
+"ini", parsed={"not": "a configparser"}, role_prefix="role"
+)

# wrong type for YAML
with pytest.raises(TypeError):
-generate_template("yaml", parsed=None, role_prefix="role")
+generate_jinja2_template("yaml", parsed=None, role_prefix="role")

# wrong type for JSON
with pytest.raises(TypeError):
-generate_template("json", parsed=None, role_prefix="role")
+generate_jinja2_template("json", parsed=None, role_prefix="role")

# unsupported format, no original_text
with pytest.raises(ValueError):
-generate_template("bogusfmt", parsed=None, role_prefix="role")
+generate_jinja2_template("bogusfmt", parsed=None, role_prefix="role")

# unsupported format, with original_text
with pytest.raises(ValueError):
-generate_template(
+generate_jinja2_template(
"bogusfmt",
parsed=None,
role_prefix="role",

@@ -135,8 +137,8 @@ def test_normalize_default_value_true_false_strings():
(("section", "foo"), "true"),
(("section", "bar"), "FALSE"),
]
-defaults_yaml = generate_defaults_yaml("role", flat_items)
-data = yaml.safe_load(defaults_yaml)
+ansible_yaml = generate_ansible_yaml("role", flat_items)
+data = yaml.safe_load(ansible_yaml)
assert data["role_section_foo"] == "true"
assert data["role_section_bar"] == "FALSE"

@@ -166,18 +168,19 @@ def test_fallback_str_representer_for_unknown_type():
def test_normalize_default_value_bool_inputs_are_stringified():
"""
-Real boolean values should be turned into quoted 'true'/'false' strings
-by _normalize_default_value via generate_defaults_yaml.
+Boolean values are now preserved as booleans in YAML (not stringified).
+This supports proper type preservation for JSON and other formats.
"""
flat_items = [
(("section", "flag_true"), True),
(("section", "flag_false"), False),
]
-defaults_yaml = generate_defaults_yaml("role", flat_items)
-data = yaml.safe_load(defaults_yaml)
-assert data["role_section_flag_true"] == "true"
-assert data["role_section_flag_false"] == "false"
+ansible_yaml = generate_ansible_yaml("role", flat_items)
+data = yaml.safe_load(ansible_yaml)
+# Booleans are now preserved as booleans
+assert data["role_section_flag_true"] is True
+assert data["role_section_flag_false"] is False

def test_flatten_config_unsupported_format():
@@ -8,8 +8,8 @@ import yaml
 from jinjaturtle.core import (
     parse_config,
     flatten_config,
-    generate_defaults_yaml,
-    generate_template,
+    generate_ansible_yaml,
+    generate_jinja2_template,
 )
 from jinjaturtle.handlers.ini import IniHandler
@@ -26,8 +26,8 @@ def test_ini_php_sample_roundtrip():
     flat_items = flatten_config(fmt, parsed)
     assert flat_items, "Expected at least one flattened item from php.ini sample"
-    defaults_yaml = generate_defaults_yaml("php", flat_items)
-    defaults = yaml.safe_load(defaults_yaml)
+    ansible_yaml = generate_ansible_yaml("php", flat_items)
+    defaults = yaml.safe_load(ansible_yaml)
     # defaults should be a non-empty dict
     assert isinstance(defaults, dict)
@@ -41,7 +41,7 @@ def test_ini_php_sample_roundtrip():
     # template generation
     original_text = ini_path.read_text(encoding="utf-8")
-    template = generate_template(fmt, parsed, "php", original_text=original_text)
+    template = generate_jinja2_template(fmt, parsed, "php", original_text=original_text)
     assert "; About this file" in template
     assert isinstance(template, str)
     assert template.strip(), "Template for php.ini sample should not be empty"
@@ -53,16 +53,16 @@ def test_ini_php_sample_roundtrip():
         ), f"Variable {var_name} not referenced in INI template"
-def test_generate_template_fallback_ini():
+def test_generate_jinja2_template_fallback_ini():
     """
-    When original_text is not provided, generate_template should use the
+    When original_text is not provided, generate_jinja2_template should use the
     structural fallback path for INI configs.
     """
     parser = configparser.ConfigParser()
     # foo is quoted in the INI text to hit the "preserve quotes" branch
     parser["section"] = {"foo": '"bar"', "num": "42"}
-    tmpl_ini = generate_template("ini", parsed=parser, role_prefix="role")
+    tmpl_ini = generate_jinja2_template("ini", parsed=parser, role_prefix="role")
     assert "[section]" in tmpl_ini
     assert "role_section_foo" in tmpl_ini
     assert '"{{ role_section_foo }}"' in tmpl_ini  # came from quoted INI value
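The quote-preservation branch this fallback test exercises can be illustrated standalone. The `ini_to_template` helper below is a hypothetical sketch of the idea, not the project's actual implementation:

```python
from configparser import ConfigParser


def ini_to_template(parser: ConfigParser, prefix: str) -> str:
    # Minimal sketch of a structural INI fallback: every value becomes a
    # Jinja2 placeholder, and surrounding quotes in the raw value are kept
    # around the placeholder rather than discarded.
    lines = []
    for section in parser.sections():
        lines.append(f"[{section}]")
        for key, raw in parser.items(section):
            var = f"{prefix}_{section}_{key}"
            if raw.startswith('"') and raw.endswith('"'):
                lines.append(f'{key} = "{{{{ {var} }}}}"')  # preserve quotes
            else:
                lines.append(f"{key} = {{{{ {var} }}}}")
    return "\n".join(lines)


parser = ConfigParser()
parser["section"] = {"foo": '"bar"', "num": "42"}
template = ini_to_template(parser, "role")
print(template)
```

This prints a `[section]` header with `foo` keeping its quotes around the placeholder, which is exactly the property the assertion on the quoted value checks.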
@@ -2,14 +2,15 @@ from __future__ import annotations
 from pathlib import Path
-import json
 import pytest
 import yaml
 from jinjaturtle.core import (
     parse_config,
     flatten_config,
-    generate_defaults_yaml,
+    generate_ansible_yaml,
+    analyze_loops,
+    generate_jinja2_template,
 )
 from jinjaturtle.handlers.json import JsonHandler
@@ -23,34 +24,40 @@ def test_json_roundtrip():
     fmt, parsed = parse_config(json_path)
     assert fmt == "json"
-    flat_items = flatten_config(fmt, parsed)
-    defaults_yaml = generate_defaults_yaml("foobar", flat_items)
-    defaults = yaml.safe_load(defaults_yaml)
+    # With loop detection
+    loop_candidates = analyze_loops(fmt, parsed)
+    flat_items = flatten_config(fmt, parsed, loop_candidates)
+    ansible_yaml = generate_ansible_yaml("foobar", flat_items, loop_candidates)
+    defaults = yaml.safe_load(ansible_yaml)
-    # Defaults: nested keys and list indices
+    # Defaults: nested keys
     assert defaults["foobar_foo"] == "bar"
     assert defaults["foobar_nested_a"] == 1
-    # Bool normalized to string "true"
-    assert defaults["foobar_nested_b"] == "true"
-    assert defaults["foobar_list_0"] == 10
-    assert defaults["foobar_list_1"] == 20
-    # Template generation is done via JsonHandler.generate_template; we just
-    # make sure it produces a structure with the expected placeholders.
-    handler = JsonHandler()
-    templated = json.loads(handler.generate_template(parsed, role_prefix="foobar"))
-    assert templated["foo"] == "{{ foobar_foo }}"
-    assert "foobar_nested_a" in str(templated)
-    assert "foobar_nested_b" in str(templated)
-    assert "foobar_list_0" in str(templated)
-    assert "foobar_list_1" in str(templated)
+    # Booleans are now preserved as booleans (not stringified)
+    assert defaults["foobar_nested_b"] is True
+    # List should be a list (not flattened to scalars)
+    assert defaults["foobar_list"] == [10, 20]
+    # Template generation with loops
+    template = generate_jinja2_template("json", parsed, "foobar", None, loop_candidates)
+    # Template should use | tojson for type preservation
+    assert "{{ foobar_foo | tojson }}" in template
+    assert "{{ foobar_nested_a | tojson }}" in template
+    assert "{{ foobar_nested_b | tojson }}" in template
+    # List should use loop (not scalar indices)
+    assert "{% for" in template
+    assert "foobar_list" in template
+    # Should NOT have scalar indices
+    assert "foobar_list_0" not in template
+    assert "foobar_list_1" not in template
-def test_generate_template_json_type_error():
+def test_generate_jinja2_template_json_type_error():
     """
-    Wrong type for JSON in JsonHandler.generate_template should raise TypeError.
+    Wrong type for JSON in JsonHandler.generate_jinja2_template should raise TypeError.
     """
     handler = JsonHandler()
     with pytest.raises(TypeError):
-        handler.generate_template(parsed="not a dict", role_prefix="role")
+        handler.generate_jinja2_template(parsed="not a dict", role_prefix="role")
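The `| tojson` assertions above rely on Jinja2's built-in `tojson` filter emitting JSON literals, so booleans and numbers survive a render-then-parse roundtrip instead of degrading to strings. A minimal sketch of that behaviour:

```python
import json

from jinja2 import Environment, StrictUndefined

# tojson is a core Jinja2 filter; StrictUndefined makes missing variables fail loudly.
env = Environment(undefined=StrictUndefined)
template = env.from_string('{"flag": {{ value | tojson }}, "n": {{ n | tojson }}}')
rendered = template.render(value=True, n=42)
data = json.loads(rendered)

assert data["flag"] is True  # a real boolean, not the string "true"
assert data["n"] == 42
```

Without `tojson`, `{{ value }}` would render Python's `True` verbatim, which is not valid JSON.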
tests/test_roundtrip.py (new file, 566 lines)
@@ -0,0 +1,566 @@
"""
Roundtrip tests: config -> generated template/YAML -> regenerated config -> compare.
These tests verify that:
1. Generated Jinja2 template + Ansible YAML can reproduce the original config
2. The regenerated config is semantically equivalent (allowing whitespace differences)
3. No data loss occurs during the template generation process
This is the ultimate validation - if the roundtrip works, the templates are correct.
"""
from __future__ import annotations
import json
import yaml
from pathlib import Path
from typing import Any
from jinja2 import Environment, StrictUndefined
import pytest
from jinjaturtle.core import (
parse_config,
analyze_loops,
flatten_config,
generate_ansible_yaml,
generate_jinja2_template,
)
def render_template(template: str, variables: dict[str, Any]) -> str:
"""Render a Jinja2 template with variables."""
env = Environment(undefined=StrictUndefined)
jinja_template = env.from_string(template)
return jinja_template.render(variables)
class TestRoundtripJSON:
"""Roundtrip tests for JSON files."""
def test_foo_json_roundtrip(self):
"""Test foo.json can be perfectly regenerated from template."""
samples_dir = Path(__file__).parent / "samples"
json_file = samples_dir / "foo.json"
if not json_file.exists():
pytest.skip("foo.json not found")
# Read original
original_text = json_file.read_text()
original_data = json.loads(original_text)
# Generate template and YAML
fmt, parsed = parse_config(json_file)
loop_candidates = analyze_loops(fmt, parsed)
flat_items = flatten_config(fmt, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(fmt, parsed, "test", None, loop_candidates)
# Load variables from YAML
variables = yaml.safe_load(ansible_yaml)
# Render template
regenerated_text = render_template(template, variables)
regenerated_data = json.loads(regenerated_text)
# Compare data structures (should match exactly)
assert regenerated_data == original_data, (
f"Regenerated JSON differs from original\n"
f"Original: {json.dumps(original_data, indent=2, sort_keys=True)}\n"
f"Regenerated: {json.dumps(regenerated_data, indent=2, sort_keys=True)}"
)
def test_json_all_types_roundtrip(self):
"""Test JSON with all data types roundtrips perfectly."""
json_text = """
{
"string": "value",
"number": 42,
"float": 3.14,
"boolean": true,
"false_val": false,
"null_value": null,
"array": [1, 2, 3],
"object": {
"nested": "data"
}
}
"""
original_data = json.loads(json_text)
# Generate template and YAML
loop_candidates = analyze_loops("json", original_data)
flat_items = flatten_config("json", original_data, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", original_data, "test", None, loop_candidates
)
# Render template
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
regenerated_data = json.loads(regenerated_text)
# Should match exactly
assert regenerated_data == original_data
class TestRoundtripYAML:
"""Roundtrip tests for YAML files."""
def test_bar_yaml_roundtrip(self):
"""Test bar.yaml can be regenerated from template."""
samples_dir = Path(__file__).parent / "samples"
yaml_file = samples_dir / "bar.yaml"
if not yaml_file.exists():
pytest.skip("bar.yaml not found")
# Read original
original_text = yaml_file.read_text()
original_data = yaml.safe_load(original_text)
# Generate template and YAML
fmt, parsed = parse_config(yaml_file)
loop_candidates = analyze_loops(fmt, parsed)
flat_items = flatten_config(fmt, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
fmt, parsed, "test", original_text, loop_candidates
)
# Load variables from YAML
variables = yaml.safe_load(ansible_yaml)
# Render template
regenerated_text = render_template(template, variables)
regenerated_data = yaml.safe_load(regenerated_text)
# Compare data structures
assert regenerated_data == original_data, (
f"Regenerated YAML differs from original\n"
f"Original: {original_data}\n"
f"Regenerated: {regenerated_data}"
)
def test_yaml_with_lists_roundtrip(self):
"""Test YAML with various list structures."""
yaml_text = """
name: myapp
simple_list:
- item1
- item2
- item3
list_of_dicts:
- name: first
value: 1
- name: second
value: 2
nested:
inner_list:
- a
- b
"""
original_data = yaml.safe_load(yaml_text)
# Generate template and YAML
loop_candidates = analyze_loops("yaml", original_data)
flat_items = flatten_config("yaml", original_data, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
"yaml", original_data, "test", yaml_text, loop_candidates
)
# Render template
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
regenerated_data = yaml.safe_load(regenerated_text)
# Compare
assert regenerated_data == original_data
class TestRoundtripTOML:
"""Roundtrip tests for TOML files."""
def test_tom_toml_roundtrip(self):
"""Test tom.toml can be regenerated from template."""
samples_dir = Path(__file__).parent / "samples"
toml_file = samples_dir / "tom.toml"
if not toml_file.exists():
pytest.skip("tom.toml not found")
# Read original
original_text = toml_file.read_text()
import tomllib
original_data = tomllib.loads(original_text)
# Generate template and YAML
fmt, parsed = parse_config(toml_file)
loop_candidates = analyze_loops(fmt, parsed)
flat_items = flatten_config(fmt, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
fmt, parsed, "test", original_text, loop_candidates
)
# Load variables from YAML
variables = yaml.safe_load(ansible_yaml)
# Render template
regenerated_text = render_template(template, variables)
regenerated_data = tomllib.loads(regenerated_text)
# Compare data structures
# Note: TOML datetime objects need special handling
assert _compare_toml_data(regenerated_data, original_data), (
f"Regenerated TOML differs from original\n"
f"Original: {original_data}\n"
f"Regenerated: {regenerated_data}"
)
def test_toml_with_arrays_roundtrip(self):
"""Test TOML with inline arrays and array-of-tables."""
toml_text = """
name = "test"
ports = [8080, 8081, 8082]
[[database]]
host = "db1.example.com"
port = 5432
[[database]]
host = "db2.example.com"
port = 5433
"""
import tomllib
original_data = tomllib.loads(toml_text)
# Generate template and YAML
loop_candidates = analyze_loops("toml", original_data)
flat_items = flatten_config("toml", original_data, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
"toml", original_data, "test", toml_text, loop_candidates
)
# Render template
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
regenerated_data = tomllib.loads(regenerated_text)
# Compare
assert regenerated_data == original_data
class TestRoundtripXML:
"""Roundtrip tests for XML files."""
def test_xml_simple_roundtrip(self):
"""Test simple XML can be regenerated."""
xml_text = """<?xml version="1.0"?>
<config>
<name>test</name>
<port>8080</port>
<server>server1</server>
<server>server2</server>
<server>server3</server>
</config>
"""
import xml.etree.ElementTree as ET
original_root = ET.fromstring(xml_text)
# Generate template and YAML
fmt = "xml"
loop_candidates = analyze_loops(fmt, original_root)
flat_items = flatten_config(fmt, original_root, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
fmt, original_root, "test", xml_text, loop_candidates
)
# Render template
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
# Parse regenerated XML
regenerated_root = ET.fromstring(regenerated_text)
# Compare XML structures (ignore insignificant whitespace)
assert _xml_elements_equal(
original_root, regenerated_root, ignore_whitespace=True
), (
f"Regenerated XML differs from original\n"
f"Original: {ET.tostring(original_root, encoding='unicode')}\n"
f"Regenerated: {ET.tostring(regenerated_root, encoding='unicode')}"
)
def test_ossec_xml_roundtrip(self):
"""Test ossec.xml (complex real-world XML) roundtrip."""
samples_dir = Path(__file__).parent / "samples"
xml_file = samples_dir / "ossec.xml"
if not xml_file.exists():
pytest.skip("ossec.xml not found")
# Read original
original_text = xml_file.read_text()
import xml.etree.ElementTree as ET
original_root = ET.fromstring(original_text)
# Generate template and YAML
fmt, parsed = parse_config(xml_file)
loop_candidates = analyze_loops(fmt, parsed)
flat_items = flatten_config(fmt, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
fmt, parsed, "test", original_text, loop_candidates
)
# Load variables and render
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
# Parse regenerated
regenerated_root = ET.fromstring(regenerated_text)
# Compare - for complex XML, we compare structure not exact text
assert _xml_elements_equal(
original_root, regenerated_root, ignore_whitespace=True
)
class TestRoundtripINI:
"""Roundtrip tests for INI files."""
def test_ini_simple_roundtrip(self):
"""Test simple INI can be regenerated."""
ini_text = """[section1]
key1 = value1
key2 = value2
[section2]
key3 = value3
"""
from configparser import ConfigParser
original_config = ConfigParser()
original_config.read_string(ini_text)
# Generate template and YAML
fmt = "ini"
loop_candidates = analyze_loops(fmt, original_config)
flat_items = flatten_config(fmt, original_config, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
fmt, original_config, "test", ini_text, loop_candidates
)
# Render template
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
# Parse regenerated
regenerated_config = ConfigParser()
regenerated_config.read_string(regenerated_text)
# Compare
assert _ini_configs_equal(original_config, regenerated_config)
class TestRoundtripEdgeCases:
"""Roundtrip tests for edge cases and special scenarios."""
def test_empty_lists_roundtrip(self):
"""Test handling of empty lists."""
json_text = '{"items": []}'
original_data = json.loads(json_text)
loop_candidates = analyze_loops("json", original_data)
flat_items = flatten_config("json", original_data, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", original_data, "test", None, loop_candidates
)
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
regenerated_data = json.loads(regenerated_text)
assert regenerated_data == original_data
def test_special_characters_roundtrip(self):
"""Test handling of special characters."""
json_data = {
"quote": 'He said "hello"',
"backslash": "path\\to\\file",
"newline": "line1\nline2",
"unicode": "emoji: 🚀",
}
loop_candidates = analyze_loops("json", json_data)
flat_items = flatten_config("json", json_data, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", json_data, "test", None, loop_candidates
)
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
regenerated_data = json.loads(regenerated_text)
assert regenerated_data == json_data
def test_numeric_types_roundtrip(self):
"""Test preservation of numeric types."""
json_data = {
"int": 42,
"float": 3.14159,
"negative": -100,
"zero": 0,
"large": 9999999999,
}
loop_candidates = analyze_loops("json", json_data)
flat_items = flatten_config("json", json_data, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", json_data, "test", None, loop_candidates
)
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
regenerated_data = json.loads(regenerated_text)
assert regenerated_data == json_data
def test_boolean_preservation_roundtrip(self):
"""Test that booleans are preserved correctly."""
yaml_text = """
enabled: true
disabled: false
"""
original_data = yaml.safe_load(yaml_text)
loop_candidates = analyze_loops("yaml", original_data)
flat_items = flatten_config("yaml", original_data, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
"yaml", original_data, "test", yaml_text, loop_candidates
)
variables = yaml.safe_load(ansible_yaml)
regenerated_text = render_template(template, variables)
regenerated_data = yaml.safe_load(regenerated_text)
# Both should be actual booleans
assert regenerated_data["enabled"] is True
assert regenerated_data["disabled"] is False
# Helper functions
def _compare_toml_data(data1: Any, data2: Any) -> bool:
"""Compare TOML data, handling datetime objects."""
import datetime
if type(data1) != type(data2):
return False
if isinstance(data1, dict):
if set(data1.keys()) != set(data2.keys()):
return False
return all(_compare_toml_data(data1[k], data2[k]) for k in data1.keys())
elif isinstance(data1, list):
if len(data1) != len(data2):
return False
return all(_compare_toml_data(v1, v2) for v1, v2 in zip(data1, data2))
elif isinstance(data1, datetime.datetime):
# Compare datetime objects
return data1 == data2
else:
return data1 == data2
def _xml_elements_equal(elem1, elem2, ignore_whitespace: bool = False) -> bool:
"""Compare two XML elements for equality."""
# Compare tags
if elem1.tag != elem2.tag:
return False
# Compare attributes
if elem1.attrib != elem2.attrib:
return False
# Compare text
text1 = (elem1.text or "").strip() if ignore_whitespace else (elem1.text or "")
text2 = (elem2.text or "").strip() if ignore_whitespace else (elem2.text or "")
if text1 != text2:
return False
# Compare tail
tail1 = (elem1.tail or "").strip() if ignore_whitespace else (elem1.tail or "")
tail2 = (elem2.tail or "").strip() if ignore_whitespace else (elem2.tail or "")
if tail1 != tail2:
return False
# Compare children
children1 = list(elem1)
children2 = list(elem2)
if len(children1) != len(children2):
return False
return all(
_xml_elements_equal(c1, c2, ignore_whitespace)
for c1, c2 in zip(children1, children2)
)
def _ini_configs_equal(config1, config2) -> bool:
"""Compare two ConfigParser objects for equality."""
if set(config1.sections()) != set(config2.sections()):
return False
for section in config1.sections():
if set(config1.options(section)) != set(config2.options(section)):
return False
for option in config1.options(section):
if config1.get(section, option) != config2.get(section, option):
return False
return True
if __name__ == "__main__":
pytest.main([__file__, "-v"])
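The roundtrip flow these tests automate can be reproduced by hand for a tiny config. The template string below is a handwritten stand-in for what `generate_jinja2_template` might emit for such input, not its actual output:

```python
import json

from jinja2 import Environment, StrictUndefined

original = {"name": "myapp", "ports": [8080, 8081]}

# Handwritten template: scalars use | tojson, the list becomes a for-loop.
template_text = (
    '{"name": {{ app_name | tojson }},'
    ' "ports": [{% for p in app_ports %}{{ p | tojson }}'
    '{% if not loop.last %}, {% endif %}{% endfor %}]}'
)
# Variables as they would appear in the generated Ansible YAML.
variables = {"app_name": "myapp", "app_ports": [8080, 8081]}

env = Environment(undefined=StrictUndefined)
regenerated = json.loads(env.from_string(template_text).render(variables))

# Semantic equality, not byte equality, is the roundtrip contract.
assert regenerated == original
```

If any template variable were missing from the YAML, `StrictUndefined` would raise at render time, which is the failure mode the consistency tests below guard against.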
@@ -8,8 +8,8 @@ import yaml
 from jinjaturtle.core import (
     parse_config,
     flatten_config,
-    generate_defaults_yaml,
-    generate_template,
+    generate_ansible_yaml,
+    generate_jinja2_template,
 )
 from jinjaturtle.handlers.toml import TomlHandler
 import jinjaturtle.handlers.toml as toml_module
@@ -27,8 +27,8 @@ def test_toml_sample_roundtrip():
     flat_items = flatten_config(fmt, parsed)
     assert flat_items
-    defaults_yaml = generate_defaults_yaml("jinjaturtle", flat_items)
-    defaults = yaml.safe_load(defaults_yaml)
+    ansible_yaml = generate_ansible_yaml("jinjaturtle", flat_items)
+    defaults = yaml.safe_load(ansible_yaml)
     # defaults should be a non-empty dict
     assert isinstance(defaults, dict)
@@ -42,7 +42,7 @@ def test_toml_sample_roundtrip():
     # template generation **now with original_text**
     original_text = toml_path.read_text(encoding="utf-8")
-    template = generate_template(
+    template = generate_jinja2_template(
         fmt, parsed, "jinjaturtle", original_text=original_text
     )
     assert isinstance(template, str)
@@ -72,9 +72,9 @@ def test_parse_config_toml_missing_tomllib(monkeypatch):
     assert "tomllib/tomli is required" in str(exc.value)
-def test_generate_template_fallback_toml():
+def test_generate_jinja2_template_fallback_toml():
     """
-    When original_text is not provided, generate_template should use the
+    When original_text is not provided, generate_jinja2_template should use the
     structural fallback path for TOML configs.
     """
     parsed_toml = {
@@ -84,7 +84,7 @@ def test_generate_template_fallback_toml():
             "file": {"path": "/tmp/app.log"}
         },  # nested table to hit recursive walk
     }
-    tmpl_toml = generate_template("toml", parsed=parsed_toml, role_prefix="role")
+    tmpl_toml = generate_jinja2_template("toml", parsed=parsed_toml, role_prefix="role")
     assert "[server]" in tmpl_toml
     assert "role_server_port" in tmpl_toml
     assert "[logging]" in tmpl_toml or "[logging.file]" in tmpl_toml
@@ -10,8 +10,8 @@ import yaml
 from jinjaturtle.core import (
     parse_config,
     flatten_config,
-    generate_defaults_yaml,
-    generate_template,
+    generate_ansible_yaml,
+    generate_jinja2_template,
 )
 from jinjaturtle.handlers.xml import XmlHandler
@@ -28,8 +28,8 @@ def test_xml_roundtrip_ossec_web_rules():
     flat_items = flatten_config(fmt, parsed)
     assert flat_items, "Expected at least one flattened item from XML sample"
-    defaults_yaml = generate_defaults_yaml("ossec", flat_items)
-    defaults = yaml.safe_load(defaults_yaml)
+    ansible_yaml = generate_ansible_yaml("ossec", flat_items)
+    defaults = yaml.safe_load(ansible_yaml)
     # defaults should be a non-empty dict
     assert isinstance(defaults, dict)
@@ -55,7 +55,9 @@ def test_xml_roundtrip_ossec_web_rules():
     # Template generation (preserving comments)
     original_text = xml_path.read_text(encoding="utf-8")
-    template = generate_template(fmt, parsed, "ossec", original_text=original_text)
+    template = generate_jinja2_template(
+        fmt, parsed, "ossec", original_text=original_text
+    )
     assert isinstance(template, str)
     assert template.strip(), "Template for XML sample should not be empty"
@@ -108,13 +110,13 @@ def test_generate_xml_template_from_text_edge_cases():
     assert "role_child_1" in tmpl
-def test_generate_template_xml_type_error():
+def test_generate_jinja2_template_xml_type_error():
     """
-    Wrong type for XML in XmlHandler.generate_template should raise TypeError.
+    Wrong type for XML in XmlHandler.generate_jinja2_template should raise TypeError.
     """
     handler = XmlHandler()
     with pytest.raises(TypeError):
-        handler.generate_template(parsed="not an element", role_prefix="role")
+        handler.generate_jinja2_template(parsed="not an element", role_prefix="role")
 def test_flatten_config_xml_type_error():
@@ -125,9 +127,9 @@ def test_flatten_config_xml_type_error():
         flatten_config("xml", parsed="not-an-element")
-def test_generate_template_xml_structural_fallback():
+def test_generate_jinja2_template_xml_structural_fallback():
     """
-    When original_text is not provided for XML, generate_template should use
+    When original_text is not provided for XML, generate_jinja2_template should use
     the structural fallback path (ET.tostring + handler processing).
     """
     xml_text = textwrap.dedent(
@@ -140,7 +142,7 @@ def test_generate_template_xml_structural_fallback():
     )
     root = ET.fromstring(xml_text)
-    tmpl = generate_template("xml", parsed=root, role_prefix="role")
+    tmpl = generate_jinja2_template("xml", parsed=root, role_prefix="role")
     # Root attribute path ("@attr",) -> role_attr
     assert "role_attr" in tmpl
@@ -8,8 +8,8 @@ import yaml
 from jinjaturtle.core import (
     parse_config,
     flatten_config,
-    generate_defaults_yaml,
-    generate_template,
+    generate_ansible_yaml,
+    generate_jinja2_template,
 )
 from jinjaturtle.handlers.yaml import YamlHandler
@@ -24,8 +24,8 @@ def test_yaml_roundtrip_with_list_and_comment():
     assert fmt == "yaml"
     flat_items = flatten_config(fmt, parsed)
-    defaults_yaml = generate_defaults_yaml("foobar", flat_items)
-    defaults = yaml.safe_load(defaults_yaml)
+    ansible_yaml = generate_ansible_yaml("foobar", flat_items)
+    defaults = yaml.safe_load(ansible_yaml)
     # Defaults: keys are flattened with indices
     assert defaults["foobar_foo"] == "bar"
@@ -34,7 +34,9 @@ def test_yaml_roundtrip_with_list_and_comment():
     # Template generation (preserving comments)
     original_text = yaml_path.read_text(encoding="utf-8")
-    template = generate_template(fmt, parsed, "foobar", original_text=original_text)
+    template = generate_jinja2_template(
+        fmt, parsed, "foobar", original_text=original_text
+    )
     # Comment preserved
     assert "# Top comment" in template
@@ -86,14 +88,14 @@ def test_generate_yaml_template_from_text_edge_cases():
     assert "role_list_1" in tmpl
-def test_generate_template_yaml_structural_fallback():
+def test_generate_jinja2_template_yaml_structural_fallback():
     """
-    When original_text is not provided for YAML, generate_template should use
+    When original_text is not provided for YAML, generate_jinja2_template should use
     the structural fallback path (yaml.safe_dump + handler processing).
     """
     parsed = {"outer": {"inner": "val"}}
-    tmpl = generate_template("yaml", parsed=parsed, role_prefix="role")
+    tmpl = generate_jinja2_template("yaml", parsed=parsed, role_prefix="role")
     # We don't care about exact formatting, just that the expected variable
     # name shows up, proving we went through the structural path.
@@ -0,0 +1,558 @@
"""
Tests to ensure all Jinja2 template variables exist in the Ansible YAML.
These tests catch the bug where templates reference variables that don't exist
because the YAML has a list but the template uses scalar references (or vice versa).
"""
from __future__ import annotations
import re
from pathlib import Path
from typing import Set
import yaml
import pytest
from jinjaturtle.core import (
parse_config,
analyze_loops,
flatten_config,
generate_ansible_yaml,
generate_jinja2_template,
)
def extract_jinja_variables(template: str) -> Set[str]:
"""
Extract all Jinja2 variable names from a template that must exist in YAML.
Extracts variables from:
- {{ variable_name }}
- {{ variable.field }}
- {% for item in collection %}
Returns only the base variable names that must be defined in YAML.
Filters out loop variables (the 'item' part of 'for item in collection').
"""
variables = set()
# First, find all loop variables (these are defined by the template, not YAML)
loop_vars = set()
for_pattern = r"\{%\s*for\s+(\w+)\s+in\s+([a-zA-Z_][a-zA-Z0-9_]*)"
for match in re.finditer(for_pattern, template):
loop_var = match.group(1) # The item
collection = match.group(2) # The collection
loop_vars.add(loop_var)
variables.add(collection) # Collection must exist in YAML
# Pattern 1: {{ variable_name }} or {{ variable.field }}
# Captures the first part before any dots or filters
var_pattern = r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)"
for match in re.finditer(var_pattern, template):
var_name = match.group(1)
# Only add if it's not a loop variable
if var_name not in loop_vars:
variables.add(var_name)
return variables
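The two regexes above can be exercised standalone. A minimal sketch (the `TEMPLATE` string is a made-up example, not from the repo) showing that loop variables are excluded while their collections are kept:

```python
import re

# Hypothetical template text, just for illustration.
TEMPLATE = "{% for srv in app_servers %}{{ srv }} {{ app_name }}{% endfor %}"

loop_vars, variables = set(), set()
# Loop variables are defined by the template itself; their collections
# must exist in the Ansible YAML.
for m in re.finditer(r"\{%\s*for\s+(\w+)\s+in\s+([a-zA-Z_][a-zA-Z0-9_]*)", TEMPLATE):
    loop_vars.add(m.group(1))
    variables.add(m.group(2))
# Bare {{ var }} references, minus the loop variables.
for m in re.finditer(r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)", TEMPLATE):
    if m.group(1) not in loop_vars:
        variables.add(m.group(1))
print(sorted(variables))  # → ['app_name', 'app_servers']
```

Note that `srv` is dropped even though it appears inside `{{ }}`, because the `{% for %}` pass runs first and records it as template-local.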
def extract_yaml_variables(ansible_yaml: str) -> Set[str]:
"""
Extract all variable names from Ansible YAML.
Returns the top-level keys from the YAML document.
"""
data = yaml.safe_load(ansible_yaml)
if not isinstance(data, dict):
return set()
return set(data.keys())
class TestTemplateYamlConsistency:
"""Tests that verify template variables exist in YAML."""
def test_simple_json_consistency(self):
"""Simple JSON with scalars and lists."""
json_text = """
{
"name": "test",
"values": [1, 2, 3]
}
"""
fmt = "json"
import json
parsed = json.loads(json_text)
loop_candidates = analyze_loops(fmt, parsed)
flat_items = flatten_config(fmt, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(fmt, parsed, "app", None, loop_candidates)
yaml_vars = extract_yaml_variables(ansible_yaml)
template_vars = extract_jinja_variables(template)
# Every variable in template must exist in YAML
missing_vars = template_vars - yaml_vars
assert not missing_vars, (
f"Template references variables not in YAML: {missing_vars}\n"
f"YAML vars: {yaml_vars}\n"
f"Template vars: {template_vars}\n"
f"Template:\n{template}\n"
f"YAML:\n{ansible_yaml}"
)
def test_toml_inline_array_consistency(self):
"""TOML with inline array should use loops consistently."""
import tomllib
toml_text = """
name = "myapp"
servers = ["server1", "server2", "server3"]
"""
parsed = tomllib.loads(toml_text)
loop_candidates = analyze_loops("toml", parsed)
flat_items = flatten_config("toml", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"toml", parsed, "app", toml_text, loop_candidates
)
yaml_vars = extract_yaml_variables(ansible_yaml)
template_vars = extract_jinja_variables(template)
missing_vars = template_vars - yaml_vars
assert not missing_vars, (
f"Template references variables not in YAML: {missing_vars}\n"
f"Template:\n{template}\n"
f"YAML:\n{ansible_yaml}"
)
def test_toml_array_of_tables_consistency(self):
"""TOML with [[array.of.tables]] should use loops consistently."""
import tomllib
toml_text = """
[[database]]
host = "db1.example.com"
port = 5432
[[database]]
host = "db2.example.com"
port = 5433
"""
parsed = tomllib.loads(toml_text)
loop_candidates = analyze_loops("toml", parsed)
flat_items = flatten_config("toml", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"toml", parsed, "app", toml_text, loop_candidates
)
yaml_vars = extract_yaml_variables(ansible_yaml)
template_vars = extract_jinja_variables(template)
missing_vars = template_vars - yaml_vars
assert not missing_vars, (
f"Template references variables not in YAML: {missing_vars}\n"
f"Template:\n{template}\n"
f"YAML:\n{ansible_yaml}"
)
# Additionally verify that if YAML has a list, template uses a loop
defaults = yaml.safe_load(ansible_yaml)
for var_name, value in defaults.items():
if isinstance(value, list) and len(value) > 1:
# YAML has a list - template should use {% for %}
assert "{% for" in template, (
f"YAML has list variable '{var_name}' but template doesn't use loops\n"
f"Template:\n{template}"
)
def test_yaml_list_consistency(self):
"""YAML with lists should use loops consistently."""
yaml_text = """
name: myapp
servers:
- server1
- server2
- server3
databases:
- host: db1
port: 5432
- host: db2
port: 5433
"""
parsed = yaml.safe_load(yaml_text)
loop_candidates = analyze_loops("yaml", parsed)
flat_items = flatten_config("yaml", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"yaml", parsed, "app", yaml_text, loop_candidates
)
yaml_vars = extract_yaml_variables(ansible_yaml)
template_vars = extract_jinja_variables(template)
missing_vars = template_vars - yaml_vars
assert not missing_vars, (
f"Template references variables not in YAML: {missing_vars}\n"
f"Template:\n{template}\n"
f"YAML:\n{ansible_yaml}"
)
def test_mixed_scalars_and_loops_consistency(self):
"""Config with both scalars and loops should be consistent."""
import tomllib
toml_text = """
name = "myapp"
version = "1.0"
ports = [8080, 8081, 8082]
[database]
host = "localhost"
port = 5432
[[servers]]
name = "web1"
ip = "10.0.0.1"
[[servers]]
name = "web2"
ip = "10.0.0.2"
"""
parsed = tomllib.loads(toml_text)
loop_candidates = analyze_loops("toml", parsed)
flat_items = flatten_config("toml", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"toml", parsed, "app", toml_text, loop_candidates
)
yaml_vars = extract_yaml_variables(ansible_yaml)
template_vars = extract_jinja_variables(template)
missing_vars = template_vars - yaml_vars
assert not missing_vars, (
f"Template references variables not in YAML: {missing_vars}\n"
f"Template:\n{template}\n"
f"YAML:\n{ansible_yaml}"
)
def test_no_orphaned_scalar_references(self):
"""
When YAML has a list variable, template must NOT reference scalar indices.
This catches the bug where:
- YAML has: app_list: [1, 2, 3]
- Template incorrectly uses: {{ app_list_0 }}, {{ app_list_1 }}
"""
import json
json_text = '{"items": [1, 2, 3, 4, 5]}'
parsed = json.loads(json_text)
loop_candidates = analyze_loops("json", parsed)
flat_items = flatten_config("json", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", parsed, "app", None, loop_candidates
)
defaults = yaml.safe_load(ansible_yaml)
# Check each list variable in YAML
for var_name, value in defaults.items():
if isinstance(value, list):
# Template should NOT reference app_items_0, app_items_1, etc.
for i in range(len(value)):
scalar_ref = f"{var_name}_{i}"
assert scalar_ref not in template, (
f"Template incorrectly uses scalar reference '{scalar_ref}' "
f"when YAML has '{var_name}' as a list\n"
f"Template should use loops, not scalar indices\n"
f"Template:\n{template}"
)
def test_all_sample_files_consistency(self):
"""Test all sample files for consistency."""
samples_dir = Path(__file__).parent / "samples"
sample_files = [
("foo.json", "json"),
("bar.yaml", "yaml"),
("tom.toml", "toml"),
]
for filename, fmt in sample_files:
file_path = samples_dir / filename
if not file_path.exists():
pytest.skip(f"Sample file {filename} not found")
original_text = file_path.read_text()
fmt_detected, parsed = parse_config(file_path)
loop_candidates = analyze_loops(fmt_detected, parsed)
flat_items = flatten_config(fmt_detected, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("test", flat_items, loop_candidates)
template = generate_jinja2_template(
fmt_detected, parsed, "test", original_text, loop_candidates
)
yaml_vars = extract_yaml_variables(ansible_yaml)
template_vars = extract_jinja_variables(template)
missing_vars = template_vars - yaml_vars
assert not missing_vars, (
f"File: {filename}\n"
f"Template references variables not in YAML: {missing_vars}\n"
f"YAML vars: {yaml_vars}\n"
f"Template vars: {template_vars}\n"
f"Template:\n{template}\n"
f"YAML:\n{ansible_yaml}"
)
class TestStructuralConsistency:
"""Tests that verify structural consistency between YAML and templates."""
def test_list_in_yaml_means_loop_in_template(self):
"""When YAML has a list (len > 1), template should use {% for %}."""
import json
json_text = """
{
"scalar": "value",
"list": [1, 2, 3]
}
"""
parsed = json.loads(json_text)
loop_candidates = analyze_loops("json", parsed)
flat_items = flatten_config("json", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", parsed, "app", None, loop_candidates
)
defaults = yaml.safe_load(ansible_yaml)
# Find list variables in YAML
list_vars = [
k for k, v in defaults.items() if isinstance(v, list) and len(v) > 1
]
if list_vars:
# Template must contain for loops
assert "{% for" in template, (
f"YAML has list variables {list_vars} but template has no loops\n"
f"Template:\n{template}"
)
# Each list variable should be used in a for loop
for var_name in list_vars:
# Look for "{% for ... in var_name %}"
for_pattern = (
r"\{%\s*for\s+\w+\s+in\s+" + re.escape(var_name) + r"\s*%\}"
)
assert re.search(for_pattern, template), (
f"List variable '{var_name}' not used in a for loop\n"
f"Template:\n{template}"
)
def test_scalar_in_yaml_means_no_loop_in_template(self):
"""When YAML has scalars, template should use {{ var }}, not loops."""
import json
json_text = """
{
"name": "test",
"port": 8080,
"enabled": true
}
"""
parsed = json.loads(json_text)
loop_candidates = analyze_loops("json", parsed)
flat_items = flatten_config("json", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", parsed, "app", None, loop_candidates
)
defaults = yaml.safe_load(ansible_yaml)
# All variables are scalars - template should NOT have loops
scalar_vars = [
k for k, v in defaults.items() if not isinstance(v, (list, dict))
]
# Check that scalar vars are used directly, not in loops
for var_name in scalar_vars:
# Should appear in {{ var_name }}, not {% for ... in var_name %}
direct_ref = f"{{{{ {var_name}"
loop_ref = f"for .* in {var_name}"
assert direct_ref in template, (
f"Scalar variable '{var_name}' should be directly referenced\n"
f"Template:\n{template}"
)
assert not re.search(loop_ref, template), (
f"Scalar variable '{var_name}' incorrectly used in a loop\n"
f"Template:\n{template}"
)
def test_no_undefined_variable_errors(self):
"""
Simulate Ansible template rendering to catch undefined variables.
This is the ultimate test - actually render the template with the YAML
and verify no undefined variable errors occur.
"""
from jinja2 import Environment, StrictUndefined
import json
json_text = """
{
"name": "myapp",
"servers": ["web1", "web2"],
"database": {
"host": "localhost",
"port": 5432
}
}
"""
parsed = json.loads(json_text)
loop_candidates = analyze_loops("json", parsed)
flat_items = flatten_config("json", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", parsed, "app", None, loop_candidates
)
# Load variables from YAML
variables = yaml.safe_load(ansible_yaml)
# Try to render the template
env = Environment(undefined=StrictUndefined)
try:
jinja_template = env.from_string(template)
rendered = jinja_template.render(variables)
# Successfully rendered - this is what we want!
assert rendered, "Rendered template output is empty"
except Exception as e:
pytest.fail(
f"Template rendering failed with variables from YAML\n"
f"Error: {e}\n"
f"Template:\n{template}\n"
f"Variables:\n{ansible_yaml}"
)
class TestRegressionBugs:
"""Tests for specific bugs that were found and fixed."""
def test_toml_array_of_tables_no_scalar_refs(self):
"""
Regression test: TOML [[array]] should not generate scalar references.
Bug: Template had {{ app_database_host }} when YAML had app_database as list.
"""
import tomllib
toml_text = """
[[database]]
host = "db1"
port = 5432
[[database]]
host = "db2"
port = 5433
"""
parsed = tomllib.loads(toml_text)
loop_candidates = analyze_loops("toml", parsed)
flat_items = flatten_config("toml", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"toml", parsed, "app", toml_text, loop_candidates
)
# YAML should have app_database as a list
defaults = yaml.safe_load(ansible_yaml)
assert isinstance(
defaults.get("app_database"), list
), f"Expected app_database to be a list in YAML\n{ansible_yaml}"
# Template should NOT have app_database_host or app_database_port
assert (
"app_database_host" not in template
), f"Template incorrectly uses scalar 'app_database_host'\n{template}"
assert (
"app_database_port" not in template
), f"Template incorrectly uses scalar 'app_database_port'\n{template}"
# Template SHOULD use a loop
assert "{% for" in template, f"Template should use a loop\n{template}"
assert (
"app_database" in template
), f"Template should reference app_database\n{template}"
def test_json_array_no_index_refs(self):
"""
Regression test: JSON arrays should not generate index references.
Bug: Template had {{ app_list_0 }}, {{ app_list_1 }} when YAML had app_list as list.
"""
import json
json_text = '{"items": [1, 2, 3]}'
parsed = json.loads(json_text)
loop_candidates = analyze_loops("json", parsed)
flat_items = flatten_config("json", parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
"json", parsed, "app", None, loop_candidates
)
# YAML should have app_items as a list
defaults = yaml.safe_load(ansible_yaml)
assert isinstance(defaults.get("app_items"), list)
# Template should NOT have app_items_0, app_items_1, app_items_2
for i in range(3):
assert (
f"app_items_{i}" not in template
), f"Template incorrectly uses scalar 'app_items_{i}'\n{template}"
# Template SHOULD use a loop
assert "{% for" in template
assert "app_items" in template
if __name__ == "__main__":
pytest.main([__file__, "-v"])
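The invariant all of these tests enforce reduces to a set difference between template references and YAML keys. A stdlib-only sketch with made-up variable names:

```python
# Hypothetical extraction results for a template/YAML pair.
template_vars = {"app_name", "app_servers", "app_database"}
yaml_vars = {"app_name", "app_servers"}

# Any non-empty difference here is exactly what StrictUndefined
# rendering would blow up on.
missing = template_vars - yaml_vars
print(missing)  # → {'app_database'}
```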

utils/diff_configs.py Normal file

@@ -0,0 +1,218 @@
#!/usr/bin/env python3
"""
Side-by-side comparison of original vs regenerated config.
Usage:
./diff_configs.py tests/samples/foo.json
./diff_configs.py tests/samples/tom.toml --context 5
"""
import argparse
import sys
from pathlib import Path
import difflib
import yaml
from jinja2 import Environment, StrictUndefined
# Add repo root (the parent of utils/) to the path so jinjaturtle imports resolve
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from jinjaturtle.core import (
parse_config,
analyze_loops,
flatten_config,
generate_ansible_yaml,
generate_jinja2_template,
)
def colorize(text: str, color: str) -> str:
"""Add ANSI color codes."""
colors = {
"red": "\033[91m",
"green": "\033[92m",
"yellow": "\033[93m",
"blue": "\033[94m",
"cyan": "\033[96m",
"reset": "\033[0m",
}
return f"{colors.get(color, '')}{text}{colors['reset']}"
def side_by_side_diff(original: str, regenerated: str, width: int = 80):
"""Print side-by-side diff."""
orig_lines = original.splitlines()
regen_lines = regenerated.splitlines()
# Calculate column width
col_width = width // 2 - 3
print(
colorize("ORIGINAL".center(col_width), "blue")
+ " | "
+ colorize("REGENERATED".center(col_width), "green")
)
print("-" * col_width + "-+-" + "-" * col_width)
max_lines = max(len(orig_lines), len(regen_lines))
for i in range(max_lines):
orig_line = orig_lines[i] if i < len(orig_lines) else ""
regen_line = regen_lines[i] if i < len(regen_lines) else ""
# Truncate if too long
if len(orig_line) > col_width - 2:
orig_line = orig_line[: col_width - 5] + "..."
if len(regen_line) > col_width - 2:
regen_line = regen_line[: col_width - 5] + "..."
# Color lines if different
if orig_line != regen_line:
orig_display = colorize(orig_line.ljust(col_width), "red")
regen_display = colorize(regen_line.ljust(col_width), "green")
else:
orig_display = orig_line.ljust(col_width)
regen_display = regen_line.ljust(col_width)
print(f"{orig_display} | {regen_display}")
def unified_diff(original: str, regenerated: str, filename: str, context: int = 3):
"""Print unified diff."""
orig_lines = original.splitlines(keepends=True)
regen_lines = regenerated.splitlines(keepends=True)
diff = difflib.unified_diff(
orig_lines,
regen_lines,
fromfile=f"{filename} (original)",
tofile=f"{filename} (regenerated)",
n=context,
)
for line in diff:
if line.startswith("+++") or line.startswith("---"):
print(colorize(line.rstrip(), "blue"))
elif line.startswith("@@"):
print(colorize(line.rstrip(), "cyan"))
elif line.startswith("+"):
print(colorize(line.rstrip(), "green"))
elif line.startswith("-"):
print(colorize(line.rstrip(), "red"))
else:
print(line.rstrip())
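unified_diff above is a thin coloring wrapper around difflib.unified_diff; a minimal uncolored sketch of the underlying call, with made-up file contents:

```python
import difflib

original = 'name = "myapp"\nport = 8080\n'
regenerated = 'name = "myapp"\nport = 8081\n'

diff = list(difflib.unified_diff(
    original.splitlines(keepends=True),
    regenerated.splitlines(keepends=True),
    fromfile="original", tofile="regenerated",
    n=3,  # context lines, as exposed via --context
))
print("".join(diff), end="")
```

The changed line surfaces as a `-port = 8080` / `+port = 8081` pair, which the wrapper then colors red/green.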
def main():
parser = argparse.ArgumentParser(
description="Compare original config with regenerated version",
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument("file", type=Path, help="Config file to check")
parser.add_argument(
"--mode",
choices=["side-by-side", "unified", "both"],
default="both",
help="Comparison mode (default: both)",
)
parser.add_argument(
"--context",
type=int,
default=3,
help="Number of context lines for unified diff (default: 3)",
)
parser.add_argument(
"--width",
type=int,
default=160,
help="Terminal width for side-by-side (default: 160)",
)
args = parser.parse_args()
if not args.file.exists():
print(colorize(f"❌ File not found: {args.file}", "red"))
return 1
print(colorize(f"\n{'=' * 80}", "blue"))
print(colorize(f" Comparing: {args.file}", "blue"))
print(colorize(f"{'=' * 80}\n", "blue"))
# Read and regenerate
try:
original_text = args.file.read_text()
fmt, parsed = parse_config(args.file)
loop_candidates = analyze_loops(fmt, parsed)
flat_items = flatten_config(fmt, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml("app", flat_items, loop_candidates)
template = generate_jinja2_template(
fmt, parsed, "app", original_text, loop_candidates
)
variables = yaml.safe_load(ansible_yaml)
env = Environment(undefined=StrictUndefined)
jinja_template = env.from_string(template)
regenerated_text = jinja_template.render(variables)
# Check if identical
if original_text.strip() == regenerated_text.strip():
print(colorize("✅ Files are IDENTICAL (text comparison)\n", "green"))
else:
# Show diff
if args.mode in ("unified", "both"):
print(colorize("\n--- UNIFIED DIFF ---\n", "yellow"))
unified_diff(
original_text, regenerated_text, args.file.name, args.context
)
if args.mode in ("side-by-side", "both"):
print(colorize("\n--- SIDE-BY-SIDE COMPARISON ---\n", "yellow"))
side_by_side_diff(original_text, regenerated_text, args.width)
# Try semantic comparison
print(colorize(f"\n{'=' * 80}", "cyan"))
print(colorize(" Semantic Comparison", "cyan"))
print(colorize(f"{'=' * 80}", "cyan"))
try:
if fmt == "json":
import json
if json.loads(original_text) == json.loads(regenerated_text):
print(colorize("✅ JSON data structures are IDENTICAL", "green"))
else:
print(colorize("⚠️ JSON data structures DIFFER", "yellow"))
elif fmt == "yaml":
if yaml.safe_load(original_text) == yaml.safe_load(regenerated_text):
print(colorize("✅ YAML data structures are IDENTICAL", "green"))
else:
print(colorize("⚠️ YAML data structures DIFFER", "yellow"))
elif fmt == "toml":
try:
import tomllib
except Exception:
import tomli as tomllib
if tomllib.loads(original_text) == tomllib.loads(regenerated_text):
print(colorize("✅ TOML data structures are IDENTICAL", "green"))
else:
print(colorize("⚠️ TOML data structures DIFFER", "yellow"))
except Exception as e:
print(colorize(f" Could not compare semantically: {e}", "yellow"))
except Exception as e:
print(colorize(f"❌ ERROR: {e}", "red"))
import traceback
traceback.print_exc()
return 1
return 0
if __name__ == "__main__":
sys.exit(main())
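The tomllib/tomli fallback in the semantic comparison above mirrors the "support Ubuntu 22 via tomli" change: Python 3.11+ ships tomllib in the stdlib, while older interpreters need the tomli backport, which exposes the same loads()/load() API:

```python
try:
    import tomllib  # stdlib on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API for <= 3.10

data = tomllib.loads('host = "db1"\nport = 5432\n')
print(data)  # → {'host': 'db1', 'port': 5432}
```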

utils/regenerate.py Normal file

@@ -0,0 +1,162 @@
#!/usr/bin/env python3
"""
Regenerate config files and save all intermediate files.
Creates:
- original.{ext}
- defaults/main.yml
- templates/config.j2
- regenerated.{ext}
Usage:
./regenerate.py tests/samples/foo.json
./regenerate.py tests/samples/tom.toml --output-dir tmp/toml_test
"""
import argparse
import sys
from pathlib import Path
import yaml
from jinja2 import Environment, StrictUndefined
# Add repo root (the parent of utils/) to the path so jinjaturtle imports resolve
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
from jinjaturtle.core import (
parse_config,
analyze_loops,
flatten_config,
generate_ansible_yaml,
generate_jinja2_template,
)
def regenerate_and_save(config_file: Path, output_dir: Path, role_prefix: str = "app"):
"""
Regenerate config and save all intermediate files.
"""
output_dir.mkdir(parents=True, exist_ok=True)
# Read original
original_text = config_file.read_text()
fmt, parsed = parse_config(config_file)
# Determine extension
ext = config_file.suffix
# Save original
original_out = output_dir / f"original{ext}"
original_out.write_text(original_text)
print(f"📄 Saved: {original_out}")
# Generate Ansible files
loop_candidates = analyze_loops(fmt, parsed)
flat_items = flatten_config(fmt, parsed, loop_candidates)
ansible_yaml = generate_ansible_yaml(role_prefix, flat_items, loop_candidates)
template = generate_jinja2_template(
fmt, parsed, role_prefix, original_text, loop_candidates
)
# Save Ansible YAML
defaults_dir = output_dir / "defaults"
defaults_dir.mkdir(exist_ok=True)
defaults_file = defaults_dir / "main.yml"
defaults_file.write_text(ansible_yaml)
print(f"📄 Saved: {defaults_file}")
# Save template
templates_dir = output_dir / "templates"
templates_dir.mkdir(exist_ok=True)
template_file = templates_dir / "config.j2"
template_file.write_text(template)
print(f"📄 Saved: {template_file}")
# Render template
variables = yaml.safe_load(ansible_yaml)
env = Environment(undefined=StrictUndefined)
jinja_template = env.from_string(template)
regenerated_text = jinja_template.render(variables)
# Save regenerated
regenerated_out = output_dir / f"regenerated{ext}"
regenerated_out.write_text(regenerated_text)
print(f"📄 Saved: {regenerated_out}")
# Summary
print(f"\n✅ All files saved to: {output_dir}")
print("\n📊 Statistics:")
print(f" Format: {fmt}")
print(f" Loop candidates: {len(loop_candidates)}")
if loop_candidates:
print(" Loops detected:")
for c in loop_candidates:
print(f" - {'.'.join(c.path)}: {len(c.items)} items")
# Check if identical
if original_text.strip() == regenerated_text.strip():
print("\n✅ Original and regenerated are IDENTICAL (text comparison)")
else:
print("\n⚠️ Original and regenerated differ (possibly just whitespace/formatting)")
print(f" Run: diff {original_out} {regenerated_out}")
return output_dir
def main():
parser = argparse.ArgumentParser(
description="Regenerate config and save all intermediate files",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
%(prog)s tests/samples/foo.json
%(prog)s tests/samples/tom.toml -o tmp/toml_output
%(prog)s tests/samples/bar.yaml --role-prefix myapp
""",
)
parser.add_argument("file", type=Path, help="Config file to process")
parser.add_argument(
"-o",
"--output-dir",
type=Path,
help="Output directory (default: regenerated_<filename>)",
)
parser.add_argument(
"-r",
"--role-prefix",
default="app",
help="Ansible role prefix for variables (default: app)",
)
args = parser.parse_args()
if not args.file.exists():
print(f"❌ File not found: {args.file}")
return 1
# Determine output directory
if args.output_dir:
output_dir = args.output_dir
else:
output_dir = Path(f"regenerated_{args.file.stem}")
print(f"🔄 Regenerating: {args.file}")
print(f"📁 Output directory: {output_dir}")
print(f"🏷️ Role prefix: {args.role_prefix}\n")
try:
regenerate_and_save(args.file, output_dir, args.role_prefix)
return 0
except Exception as e:
print(f"\n❌ ERROR: {e}")
import traceback
traceback.print_exc()
return 1
if __name__ == "__main__":
sys.exit(main())