namespace is dict. Insert 'x' to it. def add(a): # Insert 'add' to global dict return a + x # lookup 'x' from global dict print(add(7)) # search 'print' and 'add' from global dict There are many dicts in Python program. Lookup speed is critical. Insertion speed and memory usage is very important too.
7 "Open addressing" uses another slot in the table. (Another strategy is "chaining") For example, "linear probing" algorithm uses next entry. ※Python uses more complex probing, but I use simpler way in this example. "foo" 42 "spam" "bar" 52 "ham" "baz" 58 "egg"
usage ◦ At least 1/3 of entries are empty ▪ Otherwise, "probing" can be too slow ◦ One entry uses 3 words ▪ word = 8 bytes on recent machine ◦ minimum size = 192 byte ▪ 8 (byte/word) * 3 (word/entry) * 8 (table width)
for small dict ◦ 3*8 *5 (entries) + 8 (index table) = 128 bytes ▪ It was 192 bytes in legacy implementation • Faster iteration (dense entries) • Preserve insertion order • (cons) One more indirect memory access New dict vs Legacy dict
implementation. • OrderedDict based on New dict ◦ Remove doubly linked list used for keep order ◦ About 1/2 memory usage! ◦ Faster creation and iterating. ◦ (cons) Slower .move_to_end() method
to Github. • Read devguide (https://devguide.python.org/ ) • Find easy bug on https://bugs.python.org/ and fix it. • Review other's code • Translate document on Transifex ◦ See https://docs.python.org/ja/
are interned string ◦ only pointer comparison ◦ no "hash" in entry -> more compact • Implement set like dict ◦ current set is larger than dict... • functools.lru_cache ◦ Use `od.move_to_end(key)`, instead of linked list
drop key sharing dict? ▪ key sharing dict can save more memory. • But __slots__ can be used for such cases! ▪ performance improvements in some microbench • Is it matter for real case? __slots__? ▪ Needs consensus • it's more difficult than implementation
can be permitted ◦ "skipped" keys are prohibited ◦ deletion is also prohibited • Otherwise, stop "key sharing" ◦ `self.x = None` is faster than `del self.x`