Changed scripts to not infer field purposes from CSV values

Note there's a bit of subtlety here: field _types_ are still inferred,
but the intention of the fields, i.e. whether a field contains data vs
row name/other properties, must be unambiguous in the scripts.

There is still a _tiny_ bit of inference. For most scripts only one
of --by or --fields is strictly needed, since this makes the purpose of
the other fields unambiguous.
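
As a rough illustration of that remaining inference, the sketch below
derives whichever of the two lists was omitted from the CSV header. The
helper name split_fields and its exact behavior are assumptions made for
illustration, not the code in these scripts.

```python
def split_fields(fieldnames, by=None, fields=None):
    """Hypothetical helper: partition CSV columns into 'by' columns
    (row name/other properties) and 'fields' columns (data)."""
    if by is None and fields is None:
        # nothing to go on, the purpose of each column is ambiguous
        raise ValueError("need at least one of by or fields")
    if by is None:
        # only fields given: every other column names/describes the row
        by = [k for k in fieldnames if k not in fields]
    elif fields is None:
        # only by given: every other column is data
        fields = [k for k in fieldnames if k not in by]
    return list(by), list(fields)
```

Either argument alone is enough to classify every column, and passing
neither is an error rather than a guess based on the CSV values.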

The reason for this change is so the scripts are a bit more reliable,
but also because this simplifies the data parsing/inference a bit.

Oh, and this also changes field inference to use csv.DictReader's
fieldnames attribute instead of only inspecting the returned dicts. This
should also save a bit of O(n) overhead when parsing CSV files.
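
A minimal sketch of the difference, assuming a well-formed CSV with a
header row. The function names header_fields and scanned_fields are
hypothetical and only contrast the two approaches; they are not taken
from the scripts.

```python
import csv

def header_fields(f):
    # DictReader.fieldnames reads the header row on first access,
    # so no data rows need to be inspected
    reader = csv.DictReader(f)
    return list(reader.fieldnames or [])

def scanned_fields(f):
    # alternative approach: collect the keys of every returned dict,
    # which touches all n rows
    fields = []
    for row in csv.DictReader(f):
        for k in row:
            if k not in fields:
                fields.append(k)
    return fields
```

Besides avoiding the per-row scan, the header-based version still
reports the columns of a file that contains a header but no data rows.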
Christopher Haster
2023-11-04 15:24:18 -05:00
parent 2be3ff57c5
commit d0a6ef0c89
12 changed files with 187 additions and 200 deletions


@@ -315,10 +315,7 @@ def collect(obj_paths, *,
     return results
-def fold(Result, results, *,
-        by=None,
-        defines=[],
-        **_):
+def fold(Result, results, by=None, defines=[]):
     if by is None:
         by = Result._by