Welcome to the second edition of “Station Wagon Full of Tapes”. This is a deep dive into how we can achieve structural subtyping in Python using Protocol definitions from. In this post there will be excerpts from code attached as images, so make sure the images are set to be displayed.
How do we infer if a certain type is compatible with a different type while building a new feature or an endpoint? For me it used to be centered around building classes. Class type attributes and inheritance between different types by creating a hierarchical view on the feature holistically.
This habit was mostly formed around the idea of object-oriented programming. Starting with C++, and later on with professional experience around C#, representing type relationships with objects and their attributes became a second nature. In general, as it has been the core of the software development architectures for this long. I still find that object-oriented structures are very easy to comprehend and scalable.
But, I have a new favorite way to represent type relationships and the love was injected when I started working with Go.
For those who are not familiar with Go, and its relationship with object-oriented programming, the answer is, yes and no. Here’s the excerpt from the documentation;
Yes and no. Although Go has types and methods and allows an object-oriented style of programming, there is no type hierarchy. The concept of “interface” in Go provides a different approach that we believe is easy to use and in some ways more general. There are also ways to embed types in other types to provide something analogous—but not identical—to subclassing. Moreover, methods in Go are more general than in C++ or Java: they can be defined for any sort of data, even built-in types such as plain, “unboxed” integers. They are not restricted to structs (classes). Also, the lack of a type hierarchy makes “objects” in Go feel much more lightweight than in languages such as C++ or Java.
Interfaces in Go
My relationship with Go started at Docker. Containers are built with Go, and so are container orchestration libraries such as Kubernetes. It has since become the standard language for many new additions to the cloud native space.
I have really struggled when I first started working with Go. Years (literally) of approaching each feature by representing it with an object-oriented model, it wasn’t an easy transition to move away from inheritance to represent hierarchies.
Once it clicks, it sticks. It clicked for me when I was implementing “.zip” file support for Docker CLI. (#1895) I basically had to represent a new structure that is supposed to be extended off of the base “Reader” library, and have a limitation around how many bytes should be read per run. This change also included a fork of the base definition for a “LimitedReader” type as the base did not error out properly if it exceeded the size or reached EOF.
During implementation, I had to understand what was io.Reader?
That’s it. “Reader” is an interface that has a function that will take in a byte array and read. So how can one make sure their type can be passed in to the functions that require a “Reader”?
Very simple and straight-forward. No need to dance around variables, imports, inheritance. Let Go handle the type relationships.
Once comfortable, the approach to represent type relationships with interfaces feels like a super-power. This was an addicting way to approach new features.
Representing a relationship between two different types by not focusing on the type attributes and strict inheritance hierarchies, but solely focusing on the features of those types (its functions) is what “structural subtyping” is.
We now know how powerful this way of structuring can be for architecturing new APIs. Although there are libraries at Dropbox written in Go, I now primarily work in Python. But, I wasn’t ready to say goodbye to “interfaces”.
That’s where Mypy comes into the picture.
Python 🤝 Types
Python has types? Well, sort of.
Mypy is a static type checker, and is a must especially when the code base is massive. It doesn’t bring a massive overhead, as it is not a compile time checker, but more of a linter style type checker.
By annotating the code with type definitions, it is really easy to achieve a productive Python development experience, making it feel more secure.
Mypy has a typing definition called Protocol. The idea of assuming certain functions exist in certain types and writing modules according to that is not new for Python developers and it is referred to as duck-typing. What protocols introduce is the type checking security, and some hand-holding for developers when they are using types that are protocol based.
Here’s the same example I’ve shared above but in Python with Mypy;
Now any function that needs a type argument that can “read” they can use this definition, and any object that has defined a “read” function regardless of its class type definition or inheritance pattern can be used.
As Python continues to implement more of these type hints into the language itself with PEP 484, PEP 545 and so forth, we are getting closer to a future where Python keeps its flexibility and nimble nature with proper type support. I am very excited for that future.
Thank you for reading “Station Wagon Full of Tapes”.