Shadowing in Python gave me an UnboundLocalError
6 points by carlana
I mentioned some of these things in a talk I gave at PyData Warsaw in 2017: “Top→Down; Left→Right”
Basically, in CPython, other than files (modules,) there are only three statements that can create scopes: class, def, and (sort of) except. In CPython, scoping is determined statically by the parser, and, as the post notes, scoping cannot generally change within a block (except in class!)
CPython provides five opcodes for loading values onto the stack:
LOAD_CONST: load a constant
LOAD_FAST: load something locally-scoped (historically from fast_locals, now from frame->localsplus)
LOAD_DEREF: load something scoped to the closure
LOAD_GLOBAL: load something globally-scoped
LOAD_NAME: search first through the local scope, then through the global scope, then through the builtins
These can be determined statically by first identifying constant values, then looking for static patterns that can create name bindings (x = ..., for x in ..., &c.,) then performing the same in closed-over scopes, then defaulting to global scope if no such binding can be found.
Consider the below:
from dis import get_instructions

# A literal is loaded with LOAD_CONST.
def f():
    x = ...
assert all(inst.opname == 'LOAD_CONST' for inst in get_instructions(f.__code__) if inst.argval is ...)

# A parameter is a local, read with LOAD_FAST.
def f(x):
    return x
assert all(inst.opname == 'LOAD_FAST' for inst in get_instructions(f.__code__) if inst.argrepr == 'x')

# A name assigned in the body is also a local.
def f():
    x = ...
    return x
assert all(inst.opname in {'LOAD_FAST', 'STORE_FAST'} for inst in get_instructions(f.__code__) if inst.argrepr == 'x')

# A name bound in an enclosing function is read with LOAD_DEREF.
def g(x):
    def f():
        return x
    return f
assert all(inst.opname in {'LOAD_DEREF'} for inst in get_instructions(g(...).__code__) if inst.argrepr == 'x')

# A name with no binding in the function or its enclosing functions defaults to global.
def f():
    return x
assert all(inst.opname in {'LOAD_GLOBAL'} for inst in get_instructions(f.__code__) if inst.argrepr == 'x')
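The same static classification can also be inspected without disassembling anything. Here is a minimal sketch using the standard-library symtable module; the snippet and the names g, f, y, z, and w are made up for illustration, and nothing in it is executed.

```python
import symtable

# Hypothetical snippet: y is a parameter of g that f closes over,
# z is assigned in g, and w is never bound anywhere.
SOURCE = '''
def g(y):
    z = 1
    def f():
        return y, w
    return f, z
'''

# Build the symbol tables from the source text alone.
top = symtable.symtable(SOURCE, '<sketch>', 'exec')
g_table = top.get_children()[0]
f_table = g_table.get_children()[0]

scopes = {
    'z is local to g': g_table.lookup('z').is_local(),
    'y is a parameter of g': g_table.lookup('y').is_parameter(),
    'y is free in f': f_table.lookup('y').is_free(),
    'w is global in f': f_table.lookup('w').is_global(),
}
for claim, holds in scopes.items():
    print(claim, holds)
```

Every entry should come out True, which matches the order of the search described above: local bindings first, then closed-over scopes, then the global default.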
This is why the following raise UnboundLocalError or NameError: the parser either cannot statically determine where x comes from, or it guesses wrong, because it can only guess from the static information it has (without the ability to dynamically run code paths)! (The use of global or nonlocal may be able to fix issues where the parser will guess incorrectly, by hinting to the parser what the correct answer is.)
from dis import get_instructions

def f():
    locals()['x'] = ...
    return x
assert all(inst.opname in {'LOAD_GLOBAL'} for inst in get_instructions(f.__code__) if inst.argrepr == 'x')
try: f()
except NameError: pass
else: assert False

x = ...,
def f():
    x += ...,
    return x
assert all(inst.opname.startswith(('LOAD_FAST', 'STORE_FAST')) for inst in get_instructions(f.__code__) if inst.argrepr == 'x')
try: f()
except UnboundLocalError: pass
else: assert False

x = ...
def f():
    if False:
        x = ...
    return x
assert all(inst.opname.startswith('LOAD_FAST') for inst in get_instructions(f.__code__) if inst.argrepr == 'x')
try: f()
except UnboundLocalError: pass
else: assert False
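As a rough sketch of the global hint mentioned above, here is an augmented assignment like the one in the second case, with the declaration added. The names counter and bump are hypothetical, and the sketch assumes it is run as module-level code.

```python
from dis import get_instructions

counter = 0  # hypothetical module-level name

def bump():
    global counter   # hint: this name lives in the module, not in bump
    counter += 1     # without the hint, this line would make counter local
    return counter

# Collect every opcode that touches the name.
ops = {inst.opname for inst in get_instructions(bump) if inst.argval == 'counter'}
result = bump()
print(sorted(ops), result)
```

With the declaration in place, the name is read and written through the global opcodes and the call succeeds instead of raising UnboundLocalError.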
The wrong guess in the second case is because the parser tries only to statically analyse the left-hand side of the x += ... line, presumably since the right-hand side could be arbitrarily dynamic! The wrong guess in the third case is quite interesting, since the if False dead branch is actually elided from the compiled bytecode! (The elision clearly happens after scope determination.) Thus, we have this oddity:
from dis import get_instructions

x = ...
def f():
    if __debug__:
        x = ...
    return x

if __debug__:
    assert all(inst.opname in {'STORE_FAST', 'LOAD_FAST'} for inst in get_instructions(f.__code__) if inst.argrepr == 'x')
    assert f() is ...
else:
    assert all(inst.opname.startswith('LOAD_FAST') for inst in get_instructions(f.__code__) if inst.argrepr == 'x')
    try: f()
    except UnboundLocalError: pass
    else: assert False
We will see starkly different behaviour depending on whether we are running with or without optimisations (i.e., python vs python -O)!
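The difference can also be reproduced inside a single interpreter session. This is a minimal sketch that relies on the optimize argument of the built-in compile(); the helper name run and the string it returns on failure are made up for illustration.

```python
# Same shape as the example above, compiled from a string so that the
# optimisation level can be chosen per compilation.
SOURCE = '''
x = ...
def f():
    if __debug__:
        x = 1
    return x
'''

def run(optimize):
    """Compile SOURCE at the given optimisation level and call f()."""
    namespace = {}
    exec(compile(SOURCE, '<sketch>', 'exec', optimize=optimize), namespace)
    try:
        return namespace['f']()
    except UnboundLocalError:
        return 'UnboundLocalError'

result_debug = run(0)       # like plain python: the branch is kept
result_optimised = run(1)   # like python -O: the branch is elided
print(result_debug, result_optimised)
```

In both compilations x stays local to f, because scope determination still sees the assignment; only the optimised one loses the code that would have bound it.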
Most people assume that LOAD_NAME is how all variable access works in Python, but that’s simply not the case. It used to be quite easy to generate this when from … import * was allowed in function bodies. Clearly, in this case, we can’t statically determine whether the name is available in the local or global scope without knowing the contents of the module (which cannot easily be determined statically!)
# Python 2 only: Python 3 rejects this with a SyntaxError.
def f():
    from module import *
    return x
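For anyone who wants to confirm that this form is gone, here is a small sketch that only compiles the source and never runs it. The choice of the os module is arbitrary.

```python
# A star-import inside a function body, kept as a string so that this
# file itself still compiles under Python 3.
FUNCTION_SOURCE = '''
def f():
    from os import *
    return x
'''

try:
    compile(FUNCTION_SOURCE, '<sketch>', 'exec')
except SyntaxError:
    star_import_rejected = True
else:
    star_import_rejected = False

print(star_import_rejected)
```

The rejection happens at compile time, so the compiler never has to guess where x would have come from.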
The easiest way to generate a LOAD_NAME now is in a class body, which leads us to this oddity (and the only case that I am aware of where a variable can belong to multiple scopes within the same block.)
from dis import get_instructions

x = ...
def f():
    class T:
        x = x
    return T
assert all(inst.opname in {'LOAD_NAME', 'STORE_NAME'} for inst in get_instructions(f.__code__.co_consts[1]) if inst.argrepr == 'x')
obj = f()()
assert obj.x is ...
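To make the contrast with function bodies concrete, here is a small sketch with made-up names (label, T, f), meant to be run as module-level code. The same read-then-assign pattern works in a class body and fails in a function.

```python
label = 'global'  # hypothetical module-level name

class T:
    before = label    # not yet bound in the class body, so the global is found
    label = 'class'   # binds the name in the class namespace
    after = label     # the same name now resolves to the class binding

def f():
    before = label    # label is local for the whole function body
    label = 'local'
    return before

try:
    f()
except UnboundLocalError as exc:
    function_error = type(exc).__name__

print(T.before, T.after, function_error)
```

Inside the class body the name resolves to a different scope before and after the assignment, while the function body commits to a single scope up front.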
If we look at the bytecode of a try/except we can see a DELETE_FAST removing the captured exception value.
from dis import get_instructions

def f():
    try: pass
    except Exception as e: pass
    return e
assert any(inst.opname in {'DELETE_FAST'} for inst in get_instructions(f.__code__) if inst.argrepr == 'e')
try: f()
except UnboundLocalError: pass
else: assert False
I suppose we could say that this is a scope: without an additional name binding that creates another reference to e, we cannot access it later.
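A short sketch of both sides of that: an extra binding keeps the exception reachable, while an earlier binding of the same name is lost once the handler has run. The function names are made up for illustration.

```python
def keep_reference():
    try:
        raise ValueError('boom')
    except ValueError as e:
        saved = e      # a second name survives the implicit deletion of e
    return saved

def lose_earlier_binding():
    e = 'before'
    try:
        raise ValueError('boom')
    except ValueError as e:
        pass
    return e           # e was deleted when the handler finished

kept = keep_reference()

try:
    lose_earlier_binding()
except UnboundLocalError:
    lost = True
else:
    lost = False

print(repr(kept), lost)
```

Note that the second function only loses its earlier value of e when the handler actually runs; the deletion is an ordinary runtime step at the end of the except block.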
I believe in the talk referenced above, I mention that the CPython parser is quite simplistic. It’s very easy to overthink how Python works (e.g., to assume that def and class are “definitions” separate from executable code, that mechanisms like “hoisting” might exist, &c.)
While CPython is gaining more optimisations by the way, I think it’s still currently true that the CPython parser/compiler does only three moderately interesting things:
super() argument provision
It’s really quite a simple execution model, which I think has contributed a lot to the success of the language!
I spent the weekend writing Javascript code. Javascript is a fine language; it’s a tool that allows us to get a lot of useful work done.
However, it always surprises me when I see complaints about Python that are not levelled equally against Javascript. Compared to Python, Javascript is an extraördinarily syntactically irregular language with a fairly complex runtime model.
(Of course, I may have unusual opinions on this. I believe that many of the major faults with Javascript code stem from how people actually write code in Javascript and how the language supports these ways of writing code. Such faults are not so trivially solvable by tools like Typescript, which seems moderately useful while potentially being very distracting…)
This is a pet peeve of mine about Python. I wish it had been fixed when they moved to Python 3. Now := is wasted as an expression, so it’s even less fixable than before.
Here is a minimal example showing the same problem:
>>> def outer():
...     x = 1
...     def inner():
...         print(x)
...         x = 2
...     inner()
...
>>> outer()
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    outer()
    ~~~~~^^
  File "<python-input-0>", line 6, in outer
    inner()
    ~~~~~^^
  File "<python-input-0>", line 4, in inner
    print(x)
          ^
UnboundLocalError: cannot access local variable 'x' where it is not associated with a value
Doing for x in [1, 2, 3] would have the same problem because the for is an implicit =.
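For completeness, here is a sketch of the for variant next to the nonlocal hint that came up earlier in the thread. The function names are made up for illustration.

```python
def outer_with_for():
    x = 1
    def inner():
        seen = x               # raises: the loop below makes x local to inner
        for x in [1, 2, 3]:
            pass
        return seen
    return inner()

def outer_with_nonlocal():
    x = 1
    def inner():
        nonlocal x             # hint: x belongs to outer's scope
        seen = x
        x = 2
        return seen
    first = inner()
    return first, x

try:
    outer_with_for()
except UnboundLocalError:
    for_loop_fails = True
else:
    for_loop_fails = False

fixed = outer_with_nonlocal()
print(for_loop_fails, fixed)
```

With nonlocal, the read sees outer's value and the assignment rebinds outer's variable rather than creating a new local.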