scripts: csv.py: Reverted define filtering to before expr eval

It's just too unintuitive to filter after exprs.

Note this is consistent with how exprs/mods are evaluated. Exprs/mods
can't reference other exprs/mods because csv.py is only single-pass, so
allowing defines to reference exprs/mods is surprising.

And the solution to needing this sort of post-expr/mod reference is
the same for defines: You can always chain multiple csv.py calls.
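
A rough illustration of the single-pass model and the chaining workaround, as a toy Python sketch. The `single_pass` helper, its parameters, and the field names are hypothetical; they only mimic the evaluation order described above and are not csv.py's actual code.

```python
def single_pass(rows, defines=(), exprs=()):
    """Toy model of one pass: filter by defines first, then evaluate exprs.

    defines: iterable of (field, allowed_values), values compared as strings
    exprs:   iterable of (new_field, function_of_input_row)
    """
    out = []
    for row in rows:
        # defines only see fields that exist in the input row
        if not all(k in row and str(row[k]) in vs for k, vs in defines):
            continue
        # exprs are evaluated against the input row only, so they can't
        # reference each other within the same pass
        new = dict(row)
        for name, f in exprs:
            new[name] = f(row)
        out.append(new)
    return out


rows = [
    {'name': 'a', 'code': 10, 'data': 2},
    {'name': 'b', 'code': 3, 'data': 1},
]
total = ('total', lambda r: r['code'] + r['data'])

# one pass: a define on the expr-created field 'total' matches nothing,
# because 'total' doesn't exist yet when the filter runs
one_pass = single_pass(rows, defines=[('total', {'12'})], exprs=[total])

# two chained passes: the first creates 'total', the second filters on it
first = single_pass(rows, exprs=[total])
chained = single_pass(first, defines=[('total', {'12'})])
```

Here `one_pass` ends up empty, while `chained` keeps only the row whose computed `total` is 12.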

The reason defines were changed to evaluate after expr eval was because
this seemed inconsistent with other result scripts, but this is not
actually the case. Other result scripts simply don't have exprs/mods, so
filtering in fold is the same as filtering during collection. Note that
even in fold, filtering is done _before_ the actual fold/sum operation.
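
To make the equivalence concrete, here is a minimal sketch assuming a script with no exprs/mods. The `matches` and `fold_sum` helpers and the sample rows are made up for illustration; they are not taken from any of the result scripts.

```python
def matches(row, defines):
    # a row passes if every define field is present with an allowed value
    return all(k in row and str(row[k]) in vs for k, vs in defines)


def fold_sum(rows, by, field, defines=()):
    """Toy fold: filter first, then sum `field` grouped by `by`."""
    folded = {}
    for row in rows:
        if not matches(row, defines):
            continue
        folded[row[by]] = folded.get(row[by], 0) + row[field]
    return folded


rows = [
    {'file': 'a.c', 'function': 'f', 'size': 4},
    {'file': 'a.c', 'function': 'g', 'size': 6},
    {'file': 'b.c', 'function': 'f', 'size': 1},
]
defines = [('function', {'f'})]

# filter during collection, then fold without defines
collected = [r for r in rows if matches(r, defines)]
filter_at_collection = fold_sum(collected, 'file', 'size')

# filter inside fold, before the sum
filter_in_fold = fold_sum(rows, 'file', 'size', defines)
```

With nothing evaluated between collection and fold, both orders see the same rows and produce the same sums.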

---

Also fixed a recursive-define regression when folding.
Counterintuitively, we _don't_ want to recursively apply define filters.
If we do, the results will just end up too confusing to be useful.
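
A small sketch of why recursive filtering is confusing, using a hypothetical nested result layout (plain dicts with a `children` list, not csv.py's Result type):

```python
def matches(row, defines):
    return all(k in row and str(row[k]) in vs for k, vs in defines)


def filter_top_level(results, defines):
    # only top-level results are filtered, children are kept untouched
    return [r for r in results if matches(r, defines)]


def filter_recursive(results, defines):
    # the same filter is applied again at every depth
    return [
        {**r, 'children': filter_recursive(r.get('children', []), defines)}
        for r in results
        if matches(r, defines)]


results = [
    {'function': 'main', 'children': [
        {'function': 'init', 'children': []},
        {'function': 'run', 'children': []}]},
    {'function': 'init', 'children': []},
]
defines = [('function', {'main'})]

top = filter_top_level(results, defines)
rec = filter_recursive(results, defines)
```

Filtering only at the top level keeps `main` along with both of its children. Filtering recursively keeps `main` but strips its children, since none of them are named `main`.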
Christopher Haster
2025-02-28 23:34:52 -06:00
parent e851c654c5
commit 2f20f53e90
9 changed files with 24 additions and 18 deletions


@@ -1412,7 +1412,6 @@ def compile(fields_, results,
         fields=None,
         mods=[],
         exprs=[],
-        defines=[],
         sort=None,
         children=None,
         hot=None,
@@ -1420,10 +1419,6 @@ def compile(fields_, results,
     by = by.copy()
     fields = fields.copy()
-    # make sure define fields are included
-    for k, _ in defines:
-        if k not in by and k not in fields:
-            by.append(k)
     # make sure sort/hot fields are included
     for k, reverse in it.chain(sort or [], hot or []):
         # this defaults to typechecking sort/hot fields, which is
@@ -1562,20 +1557,32 @@ def compile(fields_, results,
 def homogenize(Result, results, *,
         enumerates=None,
+        defines=[],
         depth=1):
     # this just converts all (possibly recursive) results to our
     # result type
     results_ = []
-    for i, r in enumerate(results):
+    for r in results:
+        # filter by matching defines
+        #
+        # we do this here instead of in fold to be consistent with
+        # evaluation order of exprs/mods/etc, note this isn't really
+        # inconsistent with the other scripts, since they don't really
+        # evaluate anything
+        if not all(k in r and str(r[k]) in vs for k, vs in defines):
+            continue
+        # append a result
         results_.append(Result(**(
                 r
                 # enumerate?
-                | ({e: i for e in enumerates}
+                | ({e: len(results_) for e in enumerates}
                     if enumerates is not None
                     else {})
                 # recurse?
                 | ({Result._children: homogenize(
                         Result, r[Result._children],
+                        # only filter defines at the top level!
                         enumerates=enumerates,
                         depth=depth-1)}
                     if hasattr(Result, '_children')
@@ -1661,7 +1668,7 @@ def fold(Result, results, *,
                 Result._children: fold(
                         Result, getattr(r, Result._children),
                         by=by,
-                        defines=defines,
+                        # only filter defines at the top level!
                         sort=sort,
                         depth=depth-1)})
                     for r in folded]
@@ -2260,7 +2267,6 @@ def main(csv_paths, *,
             fields=fields,
             mods=mods,
             exprs=exprs,
-            defines=defines,
             sort=sort,
             children=children,
             hot=hot,
@@ -2269,12 +2275,12 @@ def main(csv_paths, *,
     # homogenize
     results = homogenize(Result, results,
             enumerates=enumerates,
+            defines=defines,
             depth=depth)
     # fold
     results = fold(Result, results,
             by=by,
-            defines=defines,
             depth=depth)
     # hotify?