Python's new assignment operator
Matthew Wilkes on 2021-02-13
Python is getting a new assignment operator. Unlike the walrus operator introduced in Python 3.8, this one is spelled the same way as regular assignment. Its party trick, however, is that it assigns to the right-hand side of the expression, not the left.
Okay, okay, it's not a first-class assignment operator, but it is assignment using =, as part of the new pattern matching syntax.
Pattern matching
Pattern matching is described in PEP 622 (since superseded by PEPs 634, 635, and 636), which describes a switch statement like you've never seen before. I wrote some C# for the first time last week, and I enjoyed simplifying logic with a switch statement, but this pattern matching feature goes above and beyond. One of the amazing things it can do is seemingly reach inside objects to do matching, as shown in Listing 1, taken from the PEP docs.
Listing 1. Example derived from PEP documentation

    from dataclasses import dataclass

    @dataclass
    class Point:
        x: int
        y: int

    point = Point(1, 2)

    match point:
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point(x, y):
            print(f"X={x}, Y={y}")
        case _:
            raise ValueError("Not a point")

This code would print X=1, Y=2. Nice. It's like a fancy case statement that can reach inside objects. My immediate thought upon seeing this code a few weeks ago was "what about side effects?" What if I start calling functions in this superswitch statement? Thankfully, the documentation is clear: that's not allowed. If you try something like Listing 2, you immediately get a TypeError.
Listing 2. Trying to get dynamic with matches doesn't work

    from dataclasses import dataclass

    import requests  # third-party HTTP library; never actually called below

    @dataclass
    class Point:
        x: int
        y: int

    point = Point(1, 2)

    def landmarkpoint():
        coordinate = requests.get("http://example.com/api/landmarks/avebury/coordinate").json()
        return Point(coordinate['x'], coordinate['y'])

    match point:
        # TypeError: landmarkpoint is a function, not a class, so it can't be a pattern
        case landmarkpoint():
            print("Avebury standing stones")
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point(x, y):
            print(f"X={x}, Y={y}")
        case _:
            raise ValueError("Not a point")

It's clear that this isn't built with eval under the hood, so these Point objects in the case lines aren't being instantiated. This is good, and what you'd expect from the name pattern matching. The case lines define patterns to match against, which are checked in turn.
How it works
The magic happens in the class attribute Point.__match_args__, which the dataclass decorator generates for you. If you're not using data classes, you'll need to provide your own before positional patterns will work. Point's one is the tuple ('x', 'y'), meaning the way of matching a Point positionally is to compare the x and y attributes, in that order.
This is where we get to the backwards assignment operator. We can infer that case Point(0, 0) effectively decomposes to isinstance(point, Point) and (point.x, point.y) == (0, 0), but some of these lines contain unbound variables. case Point(0, y) is not comparing (point.x, point.y) == (0, y), because the variable y does not exist yet. It's doing something that's a mixture of tuple comparison and extended tuple unpacking: it does both comparison and assignment in one go.
The upshot of this is that the line case Point(0, y): assigns the variable y. With no equals sign. Fine, we've got a few places where variables are assigned without an equals sign (like as y), or with a complex use of one (like x, y, z = 1, 2, 3), but it will take some mental remodelling to understand this as assignment.
Bringing in =
While a bit strange, I wouldn't go so far as to call this a new assignment operator that works backwards. The case lines above look like constructors, right? Can we call them with keyword arguments? Turns out we can:
Listing 3: Listing 1 rewritten using named arguments

    from dataclasses import dataclass

    @dataclass
    class Point:
        x: int
        y: int

    point = Point(1, 2)

    match point:
        case Point(x=0, y=0):
            print("Origin")
        case Point(x=0, y=y):
            print(f"Y={y}")
        case Point(x=x, y=0):
            print(f"X={x}")
        case Point(x=x, y=y):
            print(f"X={x}, Y={y}")
        case _:
            raise ValueError("Not a point")

This works just the same as Listing 1, but with the names of the point internals visible. Now we've got an equals sign, so it's a bit more normal, right? No. There's still matching and backfilling of variables here. We're using the equals sign not as an assignment operator, but in what looks like keyword-argument syntax, and the right-hand side is still an unbound variable. The problem becomes clearer if we change the variable names:
Listing 4: Changing the variable names

    from dataclasses import dataclass

    @dataclass
    class Point:
        x: int
        y: int

    point = Point(1, 2)

    match point:
        case Point(x=0, y=0):
            print("Origin")
        case Point(x=0, y=foo):
            print(f"Y={foo}")
        case Point(x=bar, y=0):
            print(f"X={bar}")
        case Point(x=bar, y=foo):
            print(f"X={bar}, Y={foo}")
        case _:
            raise ValueError("Not a point")

The line case Point(x=bar, y=foo): that is matching here does an assignment equivalent to bar, foo = point.x, point.y. This is the only place in Python that I can think of where the sequence a=b assigns the value referenced by a to the variable b, rather than the other way around.
Conclusion
Pattern matching is a very powerful new feature for Python, one which has the potential to make some really complex logic easier to express. However, it's also very clever. I've got a background in the Zope community, which grew up around an open source project that did lots of clever things, and this feels very familiar to me. Cleverness and unintuitive behaviour often go hand-in-hand, and I fear that will be the case here.
I've spent the last few weeks thinking how I'd teach this feature, and I keep coming back to the feeling that I'd declare it to be black magic. It does amazing things, but the more I look at it, the more it bothers me. I'm not excited about this feature landing.