Commit Graph

13 Commits

Author SHA1 Message Date
Christopher Haster
55ea13b994 scripts: Reverted del to resolve shadowed builtins
I don't know how I completely missed that this doesn't actually work!

Using del _does_ work in Python's repl, but it makes sense the repl may
differ from actual function execution in this case.

The problem is Python still thinks the relevant builtin is a local
variables after deletion, raising an UnboundLocalError instead of
performing a global lookup. In theory this would work if the variable
could be made global, but since global/nonlocal statements are lifted,
Python complains with "SyntaxError: name 'list' is parameter and
global".

And that's A-Ok! Intentionally shadowing language builtins already puts
this code deep into ugly hacks territory.
2025-05-15 14:10:42 -05:00
Christopher Haster
b5c3b97ae1 scripts: Reworked dbgtag.py, added -i/--input, included hex in output
This just gives dbgtag.py a few more bells and whistles that may be
useful:

- Can now parse multiple tags from hex:

    $ ./scripts/dbgtag.py -x 71 01 01 01 12 02 02 02
    71 01 01 01    altrgt 0x101 w1 -1
    12 02 02 02    shrubdir w2 2

  Note this _does_ skip attached data, which risks some confusion but
  not skipping attached data will probably end up printing a bunch of
  garbage for most use cases:

    $ ./scripts/dbgtag.py -x 01 01 01 04 02 02 02 02 03 03 03 03
    01 01 01 04    gdelta 0x01 w1 4
    03 03 03 03    struct 0x03 w3 3

- Included hex in output. This is helpful for learning about the tag
  encoding and also helps identify tags when parsing multiple tags.

  I considered also included offsets, which might help with
  understanding attached data, but decided it would be too noisy. At
  some point you should probably jump to dbgrbyd.py anyways...

- Added -i/--input to read tags from a file. This is roughly the same as
  -x/--hex, but allows piping from other scripts:

    $ ./scripts/dbgcat.py disk -b4096 0 -n4,8 | ./scripts/dbgtag.py -i-
    80 03 00 08    magic 8

  Note this reads the entire file in before processing. We'd need to fit
  everything into RAM anyways to figure out padding.
2025-04-16 15:23:10 -05:00
Christopher Haster
9085f1fdd9 scripts: crc32c.py/parity.py: Show string with -s/--string
This matches the behavior of paths and helps figure out which string is
associated with which crc32c/parity when checksumming multiple strings:

  $ ./scripts/crc32c.py -s hi hello
  f59dd9c2  hi
  9a71bb4c  hello

It also might help clear up confusion if someone forgets to quote a
string with spaces inside it.
2025-04-16 15:23:09 -05:00
Christopher Haster
71930a5c01 scripts: Tweaked openio comment
Dang, this touched like every single script.
2025-04-16 15:23:06 -05:00
Christopher Haster
3820be180d scripts: Adopted crc32c lib when available
Jumping from a simple Python implementation to the fully hardware
accelerated crc32c library basically deletes any crc32c related
bottlenecks:

  crc32c.py disk (1MiB) w/  crc32c lib: 0m0.027s
  crc32c.py disk (1MiB) w/o crc32c lib: 0m0.844s

This uses the same try-import trick we use for inotify_simple, so we get
the speed improvement without losing portability.

---

In dbgbmap.py:

  dbgbmap.py w/  crc32c lib:             0m0.273s
  dbgbmap.py w/o crc32c lib:             0m0.697s
  dbgbmap.py w/  crc32c lib --no-ckdata: 0m0.269s
  dbgbmap.py w/o crc32c lib --no-ckdata: 0m0.490s
  dbgbmap.old.py:                        0m0.231s

The bulk of the runtime is still in Rbyd.fetch, but this is now
dominated by leb128 decoding, which makes sense. We do ~twice as many
fetches in the new dbgbmap.py in order to calculate the gcksum (which
we then ignore...).
2025-04-16 15:22:34 -05:00
Christopher Haster
262ad7c08e scripts: Simplified dbgtag.py, tweaked -x/--hex decoding
This drops the option to read tags from a disk file. I don't think I've
ever used this, and it requires quite a bit of circuitry to implement.

Also dropped -s/--string, because most tags can't be represented as
strings?

And tweaked -x/--hex flags to correctly parse spaces in arguments, so
now these are equivalent:

- ./scripts/dbgtag.py -x 00 03 00 08
- ./scripts/dbgtag.py -x "00 03 00 08"
2025-04-16 15:21:54 -05:00
Christopher Haster
313696ecf9 scripts: Fixed openio issue where some scripts didn't import os
This only failed if "-" was used as an argument (for stdin/stdout), so
the issue was pretty hard to spot.

openio is a heavily copy-pasted function, so it makes sense to just add
the import os to openio directly. Otherwise this mistake will likely
happen again in the future.
2025-03-12 21:18:51 -05:00
Christopher Haster
62cc4dbb14 scripts: Disabled local import hack on import
Moved local import hack behind if __name__ == "__main__"

These scripts aren't really intended to be used as python libraries.
Still, it's useful to import them for debugging and to get access to
their juicy internals.
2025-01-28 14:41:30 -06:00
Christopher Haster
7cfcc1af1d scripts: Renamed summary.py -> csv.py
This seems like a more fitting name now that this script has evolved
into more of a general purpose high-level CSV tool.

Unfortunately this does conflict with the standard csv module in Python,
breaking every script that imports csv (which is most of them).
Fortunately, Python is flexible enough to let us remove the current
directory before imports with a bit of an ugly hack:

  # prevent local imports
  __import__('sys').path.pop(0)

These scripts are intended to be standalone anyways, so this is probably
a good pattern to adopt.
2024-11-09 12:31:16 -06:00
Christopher Haster
007ac97bec scripts: Adopted double-indent on multiline expressions
This matches the style used in C, which is good for consistency:

  a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines))

We were already doing this for multiline control-flow statements, simply
because I'm not sure how else you could indent this without making
things really confusing:

  if a_really_long_function_name(
          double_indent_after_first_newline(
              single_indent_nested_newlines)):
      do_the_thing()

This was the only real difference style-wise between the Python code and
C code, so now both should be following roughly the same style (80 cols,
double-indent multiline exprs, prefix multiline binary ops, etc).
2024-11-06 15:31:17 -06:00
Christopher Haster
9a9b3fc161 scripts: Reverted crc32c.py to naive CRC implementation
The naive implementation is simpler, less code, and more likely to be
correct, each of these are more valuable than speed in our debug
scripts.

We're in Python anyways (no offense Python!).

Plus I think it's good to show that the underlying logic of CRCs aren't
really that complex, at least until we throw optimizations into the mix.
2024-07-31 12:17:13 -05:00
Christopher Haster
9905bd397a Extended crc32c.py to support hex sequences and strings
So now the following forms are supported:

  $ ./scripts/crc32c.py -x 41 42 43 44
  fb9f8872

  $ ./scripts/crc32c.py -s abcd
  fb9f8872

  $ echo '00: 41 42 43 44' | xxd -r | ./scripts/crc32c.py
  fb9f8872

Hopefully this will make crc32c.py more useful. It hasn't seen very much
use, though that may just be because of the difficulty marshalling data
into a format crc32c.py can operate on.

That and dbgblock.py's -x/--cksum flag covering one of the main use
cases.
2024-04-01 16:27:59 -05:00
Christopher Haster
e7bf5ad82f Added scripts/crc32c.py
This seems like a useful script to have.
2023-09-15 18:42:48 -05:00