mirror of
https://github.com/littlefs-project/littlefs.git
synced 2025-12-06 23:52:44 +00:00
scripts: Adopted simpler+faster heuristic for symbol->dwarf mapping
After tinkering around with the scripts for a bit, I've started to
realize difflib is kinda... really slow...
I don't think this is strictly difflib's fault. It's a pure python
library (proof of concept?), may be prioritizing quality over speed, and
I may be throwing too much data at it.
difflib does have quick_ratio() and real_quick_ratio() for faster
comparisons, but while looking into these for correctness, I realized
there's a simpler heuristic we can use since GCC's optimized names seem
strictly additive: Choose the name that matches with the smallest prefix
and suffix.
So comparing, say, lfsr_rbyd_lookup to __lfsr_rbyd_lookup.constprop.0:
      lfsr_rbyd_lookup
    __lfsr_rbyd_lookup.constprop.0
    |'------.-------''----.-----'
    '-------|-----.   .---'
            v     v   v
    key: (matches, 2, 12)
Note we prioritize the prefix, since it seems GCC's optimized names are
strictly suffixes. We also now fail to match if the dwarf name is not a
substring of the symbol, instead of just finding the most similar-looking
symbol.
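A minimal standalone sketch of the heuristic described above, assuming a plain list of dwarf names rather than the script's DwarfInfo entries. The function names match_key and best_match are hypothetical and do not appear in the scripts:

```python
def match_key(symbol, name):
    """Return (prefix_len, suffix_len) if name occurs in symbol, else None.

    Hypothetical helper: prefix_len is the number of characters of symbol
    before the first occurrence of name, suffix_len the number after it.
    """
    i = symbol.find(name)
    if i == -1:
        return None
    return (i, len(symbol) - (i + len(name)))


def best_match(symbol, names, default=None):
    """Pick the name leaving the smallest prefix, then the smallest suffix.

    Names that are not substrings of symbol are never matched.
    """
    candidates = [n for n in names if match_key(symbol, n) is not None]
    return min(
        candidates,
        key=lambda n: match_key(symbol, n),
        default=default)


if __name__ == "__main__":
    symbol = "__lfsr_rbyd_lookup.constprop.0"
    names = ["lfsr_rbyd_lookup", "lfsr_rbyd", "lookup"]
    print(match_key(symbol, "lfsr_rbyd_lookup"))
    print(best_match(symbol, names))
    print(best_match("lfs_mount", ["lfsr_rbyd_lookup"]))
```

Tuples compare element by element, so the prefix length takes priority and the suffix length only breaks ties.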
This results in both faster and more robust symbol->dwarf mapping:
    before: time code.py -Y: 0.393s
    after:  time code.py -Y: 0.152s
(this is WITH the fast dict lookup on exact matches!)
This also drops difflib from the scripts. So one less dependency to
worry about.
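For comparison, a rough sketch of the difflib-based scoring this commit replaces, again assuming a plain list of names. The function name difflib_best_match is hypothetical. It illustrates why the old approach was less robust: max() always returns the most similar-looking name, even when nothing actually matches.

```python
import difflib


def difflib_best_match(symbol, names):
    """Return the name with the highest SequenceMatcher ratio, or None.

    Hypothetical helper mirroring the removed lookup: every candidate gets
    a similarity score, so an unrelated name can still win.
    """
    return max(
        names,
        key=lambda name: difflib.SequenceMatcher(
            None, name, symbol, False).ratio(),
        default=None)


if __name__ == "__main__":
    symbol = "__lfsr_rbyd_lookup.constprop.0"
    print(difflib_best_match(symbol, ["lfsr_rbyd_lookup", "lfs_mount"]))
    # an unrelated name is still returned, since it is the only candidate
    print(difflib_best_match("lfs_mount", ["xyz"]))

    # quick_ratio() and real_quick_ratio() are documented as upper bounds
    # on ratio(), so they can only rule candidates out, not rank them
    sm = difflib.SequenceMatcher(None, "lfsr_rbyd_lookup", symbol, False)
    print(sm.ratio(), sm.quick_ratio(), sm.real_quick_ratio())
```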
@@ -375,8 +375,6 @@ class DwarfInfo:
             return self.entries.get(k, d)
 
         else:
-            import difflib
-
             # organize entries by name
             if not hasattr(self, '_by_name'):
                 self._by_name = {}
@@ -384,20 +382,24 @@ class DwarfInfo:
                     if entry.name is not None:
                         self._by_name[entry.name] = entry
 
-            # exact match? avoid difflib if we can for speed
+            # exact match? do a quick lookup
             if k in self._by_name:
                 return self._by_name[k]
-            # find the best matching dwarf entry with difflib
+            # find the best matching dwarf entry with a simple
+            # heuristic
             #
             # this can be different from the actual symbol because
             # of optimization passes
             else:
-                name, entry = max(
-                        self._by_name.items(),
-                        key=lambda entry: difflib.SequenceMatcher(
-                            None, entry[0], k, False).ratio(),
-                        default=(None, None))
-                return entry
+                def key(entry):
+                    i = k.find(entry.name)
+                    if i == -1:
+                        return None
+                    return (i, len(k)-(i+len(entry.name)), k)
+                return min(
+                        filter(key, self._by_name.values()),
+                        key=key,
+                        default=d)
 
     def __getitem__(self, k):
         v = self.get(k)