scripts: csv.py: Reverted define filtering to before expr eval

It's just too unintuitive to filter after exprs.

Note this is consistent with how exprs/mods are evaluated. Exprs/mods
can't reference other exprs/mods because csv.py is only single-pass, so
allowing defines to reference exprs/mods is surprising.

And the solution to needing these sorts of post-expr/mod references is
the same for defines: You can always chain multiple csv.py calls.

The reason defines were changed to evaluate after expr eval was that
filtering before expr eval seemed inconsistent with other result
scripts, but this is not actually the case. Other result scripts simply
don't have exprs/mods, so filtering in fold is the same as filtering
during collection. Note that even in fold, filtering is done _before_
the actual fold/sum operation.

---

Also fixed a recursive-define regression when folding.
Counterintuitively, we _don't_ want to recursively apply define filters.
If we do, the results will just end up too confusing to be useful.
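
As a rough illustration of the filtering behavior described in this message, here is a minimal sketch. It is not the code in csv.py: `filter_defines` is a hypothetical helper, results are plain dicts rather than Result objects, and the `children` key and field names are assumptions made for the example. Only the matching condition is taken from the diff.

```python
# hypothetical stand-in for the define filtering done during
# homogenize, operating on plain dicts instead of Result objects
def filter_defines(results, defines=()):
    results_ = []
    for r in results:
        # keep a result only if every define field is present and its
        # stringified value is one of the allowed values
        if not all(k in r and str(r[k]) in vs for k, vs in defines):
            continue
        # children are kept as-is, defines only apply at the top level
        results_.append(r)
    return results_

# made-up results, 'children' is an assumed key name
results = [
    {'case': 'a', 'size': 1, 'children': [{'case': 'b', 'size': 2}]},
    {'case': 'b', 'size': 3, 'children': []},
    {'size': 4},
]

# only the first result matches, its non-matching child survives
by_case = filter_defines(results, [('case', {'a'})])

# values are compared as strings
by_size = filter_defines(results, [('size', {'3', '4'})])

# 'total' stands in for a field only an expr would create, since
# filtering happens before expr eval nothing can match it
by_expr = filter_defines(results, [('total', {'5'})])
```

In this sketch a define on an expr-only field matches nothing, which is the case where chaining a second csv.py call over the first call's output would be the way to filter.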
2025-02-28 23:34:52 -06:00
parent e851c654c5
commit 2f20f53e90
9 changed files with 24 additions and 18 deletions
--- a/scripts/csv.py
+++ b/scripts/csv.py
@@ -1412,7 +1412,6 @@ def compile(fields_, results,
        fields=None,
        mods=[],
        exprs=[],
-        defines=[],
        sort=None,
        children=None,
        hot=None,
@@ -1420,10 +1419,6 @@ def compile(fields_, results,
    by = by.copy()
    fields = fields.copy()

-    # make sure define fields are included
-    for k, _ in defines:
-        if k not in by and k not in fields:
-            by.append(k)
    # make sure sort/hot fields are included
    for k, reverse in it.chain(sort or [], hot or []):
        # this defaults to typechecking sort/hot fields, which is
@@ -1562,20 +1557,32 @@ def compile(fields_, results,

 def homogenize(Result, results, *,
        enumerates=None,
+        defines=[],
        depth=1):
    # this just converts all (possibly recursive) results to our
    # result type
    results_ = []
-    for i, r in enumerate(results):
+    for r in results:
+        # filter by matching defines
+        #
+        # we do this here instead of in fold to be consistent with
+        # evaluation order of exprs/mods/etc, note this isn't really
+        # inconsistent with the other scripts, since they don't really
+        # evaluate anything
+        if not all(k in r and str(r[k]) in vs for k, vs in defines):
+            continue
+
+        # append a result
        results_.append(Result(**(
                r
                    # enumerate?
-                    | ({e: i for e in enumerates}
+                    | ({e: len(results_) for e in enumerates}
                        if enumerates is not None
                        else {})
                    # recurse?
                    | ({Result._children: homogenize(
                            Result, r[Result._children],
+                            # only filter defines at the top level!
                            enumerates=enumerates,
                            depth=depth-1)}
                        if hasattr(Result, '_children')
@@ -1661,7 +1668,7 @@ def fold(Result, results, *,
                Result._children: fold(
                        Result, getattr(r, Result._children),
                        by=by,
-                        defines=defines,
+                        # only filter defines at the top level!
                        sort=sort,
                        depth=depth-1)})
                    for r in folded]
@@ -2260,7 +2267,6 @@ def main(csv_paths, *,
            fields=fields,
            mods=mods,
            exprs=exprs,
-            defines=defines,
            sort=sort,
            children=children,
            hot=hot,
@@ -2269,12 +2275,12 @@ def main(csv_paths, *,
    # homogenize
    results = homogenize(Result, results,
            enumerates=enumerates,
+            defines=defines,
            depth=depth)

    # fold
    results = fold(Result, results,
            by=by,
-            defines=defines,
            depth=depth)

    # hotify?