Welcome to another Tech Talk Tuesday! This week, we feature Raymond Hettinger, a core Python developer. I’ve seen a number of Hettinger’s talks, and most of them are quite good. Hettinger has good technical knowledge, and he’s also quite humorous, in a dry, clever, quirky way.
I decided to cover this particular talk because Hettinger talks about how our cognition concretely plays a role in how we code. This is one of the most under-represented topics in software, but it’s an important one. Writing software is a mental exercise, so knowing the limitations of our cognition and having solid strategies for this mental exercise are imperative.
Additionally, I like the talk because it covers generic problem-solving and coding strategies that are useful in any language, and even applicable outside of software development.
For the zealots, here’s the talk:
Summary:
Hettinger claims up front that certain strategies are imperative while programming, and he goes over each in depth, with examples. Unfortunately, he only gets through the bolded items before he runs out of time:
- **Chunking and Aliasing**
- **Solve related, simpler problems**
- **Incremental Development**
- **Build classes independently and let inheritance discover itself**
- **Repeat tasks manually till patterns emerge, then move to a function**
- Consider OOP as a graph traversal problem (won’t cover this since it felt rushed)
- Separate ETL from analysis. Separate analysis from presentation.
- Verify type, verify size, view subset of data, and test a subset
- Humans should never gaze upon unsorted data
- Sets and dict groupings are primary tools for data analysis
Let’s see what Hettinger has to say on these matters by going through his strategies and examples, and I’ll share some quick thoughts of my own on the matter.
Chunking and Aliasing:
Hettinger starts by talking about some of the limitations of our brains, including Miller’s Law, sometimes known as the “Seven Plus or Minus Two” rule. It turns out that our brains can really only hold 7 +/- 2 items in short-term memory. It is simply a fundamental limitation of our minds, backed by a wealth of empirical evidence. Fortunately, psychologists have already taught us how to work around this limit, and it’s called chunking. Chunking is where we group information into “chunks” and remember those chunks instead of each individual element of a collection. Phone numbers are a common example of natural “chunking”. Aliasing, on the other hand, is a more intuitive term: it’s where we reference a piece of information we’ve already learned.
Hettinger starts by describing how the `random` module can output a uniform value in the range (0, 1) with `random()`, and that we can stretch that range by multiplying and shift it with addition. However, the statement `50 + random() * 200` (i.e., the range is now 50-250) is much harder on your cognition than a statement like `uniform(50, 250)`, even though they output the same thing. This is a form of chunking, where we group a set of items into one item (in this case, there are simply fewer symbols for your mind to parse, and it’s a nice abstraction). It might not seem like a lot, but having to decode complex lines of code really wears down your endurance throughout the day.
Hettinger also shows how aliasing is useful with the `randrange` function in `random`. For us Python programmers, we already know what `range` does. For the non-Python folks, `range` generates integers given parameters `start`, `stop`, and `step`. For example, `list(range(10, 20, 2))` outputs `[10, 12, 14, 16, 18]`. `randrange` aliases the concept of `range`, and gives us a random value from the output of `range`.
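To make the chunking and aliasing concrete, here’s a quick sketch of my own (not code from the talk):

```python
import random

# Chunking: both lines produce a uniform value in [50, 250), but
# uniform(50, 250) is one mental "chunk" instead of four symbols to parse.
x = 50 + random.random() * 200
y = random.uniform(50, 250)

# Aliasing: randrange(10, 20, 2) picks a random element from the same
# values that range(10, 20, 2) generates: 10, 12, 14, 16, or 18.
print(list(range(10, 20, 2)))                 # [10, 12, 14, 16, 18]
print(random.randrange(10, 20, 2))            # e.g. 14
print(random.choice(list(range(10, 20, 2))))  # equivalent, but noisier
```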
In Python, it’s easy to get carried away with one-liners, especially complicated and/or nested list comprehensions. Sometimes I find it helpful to extract a complicated one-liner into a function whose name describes what it actually does, to lessen the cognitive load on my future self and other readers of my code.
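For instance, here’s a made-up before/after of my own (the data and names are hypothetical, not from the talk):

```python
# Hypothetical data: account name -> list of (description, amount) pairs.
accounts = {
    "alice": [("coffee", 4.50), ("book", 12.00)],
    "bob": [("lunch", 9.25)],
}

# Before: one line, but roughly a dozen symbols to hold in your head at once.
totals = {name: sum(amount for _, amount in txns) for name, txns in accounts.items()}

# After: the function name carries the meaning, so readers can "chunk" the call.
def total_per_account(accounts):
    """Map each account name to the sum of its transaction amounts."""
    return {name: sum(amount for _, amount in txns)
            for name, txns in accounts.items()}

totals = total_per_account(accounts)  # {'alice': 16.5, 'bob': 9.25}
```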
As Raymond says, when talking about a complex/long one-liner that’s full of about 10 symbols (remember the 7 +/- 2 rule…):
This gives me 10 random outcomes, which is something I know right in the moment I write this code. The problem is, the number of brain registers this uses is 10. This is no longer a decryption effort. This is a puzzle. At the moment you put it together, you fully understand it. But if this is embedded in bigger code, every time you hit this line, you’re going to have to pick apart “what does this thing do”.
Additionally, I wish we would use the concept of aliasing across disciplines and technology stacks. I’ve been using GraphQL a lot recently, and I have to say…I think it kind of sucks. We already have really powerful query languages that can support exactly what GraphQL is trying to accomplish. GraphQL essentially takes a predicate and asks for the values you’d like returned…just like standard SQL. But many of us could easily alias standard SQL, since a lot of folks already know how to use it. Why GraphQL decided to re-invent this wheel and make life hard with a largely equivalent feature set to basic SQL is beyond me. If it had used a SQL-like syntax to accomplish its goals, I (and many others) could have aliased the SQL concepts we already know to lessen the learning curve. Note that CQL from Cassandra is a SQL-like dialect that is much easier to learn if you already know SQL, so some people are on the ball here. Go Cassandra devs, woo!
Solve related, simpler problems and do incremental development:
Hettinger demonstrates solving a related problem and incremental development by working through a tree traversal problem. At first, Hettinger only shows how to count items in a tree, instead of finding their paths. He also starts with simple, non-nested flat structures, and then builds up to nested, more difficult structures through recursion, reusing code that he has already tested and knows is working. After the counting solution is done, it’s pretty clear how to translate it into finding paths to items in a generic tree.
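Here’s a minimal sketch of that progression, in my own code rather than Hettinger’s exact example: first count matching items in a nested structure, then reuse the same recursive skeleton to find paths:

```python
def count(target, tree):
    """Count occurrences of target anywhere in a nested list structure."""
    if not isinstance(tree, list):            # leaf: the flat, simple case
        return 1 if tree == target else 0
    return sum(count(target, node) for node in tree)  # recurse on nesting

def find_paths(target, tree, path=()):
    """Yield the index path to every occurrence of target.

    Same recursive shape as count(); only the bookkeeping changed.
    """
    if not isinstance(tree, list):
        if tree == target:
            yield path
        return
    for i, node in enumerate(tree):
        yield from find_paths(target, node, path + (i,))

tree = ["a", ["b", ["a", "c"]], "a"]
print(count("a", tree))             # 3
print(list(find_paths("a", tree)))  # [(0,), (1, 1, 0), (2,)]
```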
This is a pretty classic problem-solving technique, but it’s kind of shocking how many people ignore it (or maybe don’t know about it or don’t think to use it). It is best used when the simpler problem we formulate is one that any solution to the difficult problem would solve as well: if, by magic, we somehow had a solution to the hard problem, the simpler problem would be solved by it, too. Then, when we solve the simpler problem, we are concretely one step closer to solving the harder one (since we would have had to solve it anyway). This is how to easily combine solving simpler problems with incremental development.
The power of this strategy is both practical and psychological:
- We can test our changes in small iterations as we go, and always maintain a working state
- Even if we can’t solve the hard problem, we have at least generated some valuable code that solves some problem
- Oftentimes, the path to solving a hard problem is building and combining smaller components that work together to solve it. This helps us follow the 7 +/- 2 rule, since abstractions get “chunked” into bins and we don’t have to reason about everything all at once
- It uses the psychology of “small wins” to help us keep going and not lose motivation.
I see StackOverflow posts all the time that go like: “I need to find <insert some arbitrary goal> such that <insert some complicated conditional>. I tried to do it, but I couldn’t. Here’s my code. What’s my bug?” I then see hundreds of lines of complete nonsense that don’t seem to accomplish much. It usually becomes clear quite quickly that the author wrote down all of the code at once, didn’t test it at all, and then was confused when it didn’t work. The surefire way to know this is what happened is to run the code on an example test case and find that it crashes on line 5 of 105. Clearly, the original author never ran their own code! This means they didn’t develop it incrementally as they went, and never attempted to solve just one facet of the problem (or a simpler problem) before moving on to the next facet.
As Hettinger says:
And they showed me a big pile of nested for loops and ifs that they had worked on and they said “It doesn’t work!”. And I didn’t read the code carefully, and I said “I believe you.”
To be honest, if you practice TDD, you almost never have these problems. When I try to help someone who has a bunch of nested, complex code they clearly sat down and tried to write in one big shebang, I usually just start over; I do not typically try to massage their broken code into something working. I’ll write a little, simple test, show how to write code that satisfies that simple test, and then move on to more complicated tests/cases. Then I post the answer. This is typically pretty well received, because the number of lines tends to drop drastically, and it’s also clear why and how the solution works. Here’s an example of helping someone on SO where I developed incrementally, and broke the problem down into solving a simpler problem.
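As a tiny illustration of that workflow (a hypothetical problem of my own, not the one from that SO answer): each assertion below was written first, then just enough code to make it pass before the next was added:

```python
def longest_run(items):
    """Length of the longest run of equal adjacent items."""
    best = current = 0
    prev = object()          # sentinel that equals nothing in items
    for item in items:
        current = current + 1 if item == prev else 1
        best = max(best, current)
        prev = item
    return best

# Tests added one at a time, each passing before the next was written:
assert longest_run([]) == 0                  # first: the trivial case
assert longest_run([7]) == 1                 # then: a single item
assert longest_run([1, 1, 2]) == 2           # then: a simple run
assert longest_run([1, 2, 2, 2, 3, 3]) == 3  # finally: the general case
```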
Hettinger ends this section with a bit of humor:
This is a methodology that scales very well to hard problems. And it’s how I pick apart really good problems, and when you’re doing interview problems, one of the things that they’re looking for is not just a solution in the interview, they’re looking for how you think about the problems. Do you take the problem and decompose it into smaller problems? Do you start thinking incrementally? That said, there’s a step that you should do in the real world you should never do in the interview. What is that? Check the Python Package Index to see if someone’s already solved the problem. All good practitioners say “I’ll go to the Python Package Index, pip install, it’s done!” And then the interviewer is like “Uhhh, that’s cheating.” And you’re like “maybe I should be your boss.”
Build classes independently and let inheritance discover itself:
Hettinger then pivots to showing how to do some emergent design.
I had studied object oriented modeling techniques, and I learned the predecessors of UML, and made myself nice inheritance diagrams, that related to the entity relationship diagrams I had learned earlier in life. I learned to make a plan! Are you impressed? Why don’t professional chess players think more than 10 moves in advance? … Too many combinations, combinatorial explosion. The idea is that each move in the game updates the state and teaches you something new about the path that you didn’t know before. And so you’re walking through the fog, and each step in the fog, you can see further. And there’s an important lesson in this. A lot of our problems, real world problems, aren’t tic tac toe problems where you can see to the end. They are chess problems where you can’t. So, the idea of fully planning all of this in advance, a waterfall model, sometimes works when the solution is well-known. Otherwise, we prefer more agile methods of letting the inheritance discover itself.
Hettinger then shows a simple example of some validation classes that have common parts that would have been hard to plan for in advance.
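Here’s a rough sketch of the idea with hypothetical validators of my own (Hettinger’s examples differ): write the classes independently first, then let the shared skeleton become the base class as a pure refactoring of working code:

```python
# First pass: two validators written independently, no base class planned.
class IntValidator:
    def validate(self, value):
        if not isinstance(value, int):
            raise TypeError(f"expected int, got {value!r}")
        return value

class NonEmptyStrValidator:
    def validate(self, value):
        if not (isinstance(value, str) and value):
            raise TypeError(f"expected non-empty str, got {value!r}")
        return value

# Only now is the common shape obvious; the inheritance "discovers itself"
# as a refactoring of code that already works:
class Validator:
    expected = "valid value"

    def check(self, value):
        raise NotImplementedError

    def validate(self, value):
        if not self.check(value):
            raise TypeError(f"expected {self.expected}, got {value!r}")
        return value

class IntValidator(Validator):
    expected = "int"
    def check(self, value):
        return isinstance(value, int)

class NonEmptyStrValidator(Validator):
    expected = "non-empty str"
    def check(self, value):
        return isinstance(value, str) and bool(value)
```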
What I think Hettinger left out is the consequences of not working in this fashion. In particular, when we plan an inheritance hierarchy up front, we tend to build superfluous features and abstractions (unused methods, for instance). It can also make code feel really cumbersome, almost like being painted into a corner (“I can’t do X because the base class doesn’t support it”). It’s really the worst of both worlds: you get things you don’t need, and you don’t have the things you do need. The reason this happens is that we simply can’t predict the future, and those who think they can are typically mistaken. Best not to pretend our brains are good at predicting or inferring the future (they are not); just react and let the design form as we go.
The downside to working this way, though, is that the first subclass or two will be more manual, and some refactoring will be required. That’s a very small price to pay for avoiding poor design. After you’ve got two concrete examples, extending to a third, fourth, or fifth subclass is typically trivial plug-and-play (since you’ve already got a decent, working abstraction).
Repeat tasks manually till patterns emerge, then move to a function:
This technique is clearly a generalization of the previous one: it doesn’t have to be classes, it can be functions, too. The idea here is that designing up front is just really difficult, while designing after we have many concrete use cases actually working in our codebase is trivial. From a psychological perspective, I’ll take trivial over difficult any day of the week.
The example is converting a CSV to XML. As Hettinger puts it:
I find that even amongst programmers with medium experience that have been doing this for several years, they struggle with this. Why? Because their strategy is not awesome. … I can’t count the number of people who immediately make their life worse because the first word they type is def.
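In that spirit, here’s a sketch of the CSV-to-XML progression (my own code, file, and column names, not Hettinger’s): transform a row or two by hand first, and only reach for def once the pattern has emerged:

```python
import csv
from xml.sax.saxutils import escape

# Manual first passes: writing a couple of rows by hand, e.g.
#   <row><name>Alice</name><age>34</age></row>
# makes the repeated tag-wrapping pattern obvious.

# Only then does the pattern move into a function:
def row_to_xml(fieldnames, row):
    cells = "".join(f"<{n}>{escape(row[n])}</{n}>" for n in fieldnames)
    return f"<row>{cells}</row>"

with open("people.csv", newline="") as f:  # hypothetical file with header: name,age
    reader = csv.DictReader(f)
    print("<rows>")
    for row in reader:
        print("  " + row_to_xml(reader.fieldnames, row))
    print("</rows>")
```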
Conclusions:
No generic strategy is always applicable. Never apply one blindly or dogmatically. However, these are great strategies to start with, and they’re worth reaching for if you’re stuck on a problem.
Somewhat interestingly, I find that TDD encourages these strategies. Obviously TDD is incremental development, but TDD also really encourages solving simpler but related problems and emergent design. I’d even argue it encourages chunking and aliasing as well, as the last step in the TDD cycle is to refactor. Of course, TDD doesn’t name any of these strategies explicitly other than incremental development, but by following TDD these things usually just occur.
Thanks for the awesome talk, Raymond! Hope to see another one of your talks soon.
Great post, mwm214!
I’m a little confused about strategy #5, “Repeat tasks manually till patterns emerge, then move to a function”. This seems like the opposite of chunking/aliasing to me...