Often enough, I end up writing a small chunk of code to keep a set of running counts (for example, counting the occurrences of words in a text file).
Here’s what that code usually looks like:
    if x not in d:
        d[x] = 0
    d[x] += 1
Now, I’ve read in a number of places that this is un-Pythonic. Perhaps the more appropriate way to write this would be:
    try:
        d[x] += 1
    except KeyError:
        d[x] = 1
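To make the comparison concrete, here is a minimal sketch that wraps each idiom in a function and checks that both produce the same counts. The function names (`count_with_if`, `count_with_try`) and the sample sentence are made up for illustration; they are not from the benchmark code.

```python
def count_with_if(words):
    """Count occurrences using a membership check before incrementing."""
    d = {}
    for x in words:
        if x not in d:
            d[x] = 0
        d[x] += 1
    return d


def count_with_try(words):
    """Count occurrences by incrementing first and handling the KeyError."""
    d = {}
    for x in words:
        try:
            d[x] += 1
        except KeyError:
            # First time we see this key: start its count at 1.
            d[x] = 1
    return d


# Hypothetical sample input, just to exercise both functions.
words = "the quick brown fox jumps over the lazy dog the end".split()
if_counts = count_with_if(words)
try_counts = count_with_try(words)

print(if_counts == try_counts)  # both idioms should agree
print(if_counts["the"])
```

Both versions build the same dictionary; the only difference is where the work happens when a key is missing.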
The common argument for this is that you should expect to succeed, and handle the failure case only if it actually occurs. Fair enough. The eternal question for me, though, is: which way is faster? They’re both about the same effort to type, and (for me) about the same mental effort to understand.
For the tests, I wanted to look at four cases: using if with a miss, using try-except with a miss, using if with a hit, and using try-except with a hit. For each test, I performed the specified operation 10,000 times, and repeated each test 2,000 times. For more details, have a look at the code.
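For readers who want to try this themselves, the four cases could be set up roughly as follows with the standard library's `timeit` module. This is a sketch under my own assumptions, not the original benchmark code: the function names, the use of integer keys, and the default of 5 repeats (instead of 2,000, to keep the run short) are all illustrative choices.

```python
import timeit

N = 10_000  # operations per test, matching the figure quoted above


def if_miss():
    # Every key is new, so the "not in" check always succeeds.
    d = {}
    for x in range(N):
        if x not in d:
            d[x] = 0
        d[x] += 1


def try_miss():
    # Every key is new, so every increment raises KeyError.
    d = {}
    for x in range(N):
        try:
            d[x] += 1
        except KeyError:
            d[x] = 1


def if_hit():
    # The key already exists, so the "not in" check is pure overhead.
    d = {0: 0}
    for _ in range(N):
        if 0 not in d:
            d[0] = 0
        d[0] += 1


def try_hit():
    # The key already exists, so no exception is ever raised.
    d = {0: 0}
    for _ in range(N):
        try:
            d[0] += 1
        except KeyError:
            d[0] = 1


def run(repeats=5):
    """Return the best wall-clock time in seconds for each case."""
    cases = {
        "if/miss": if_miss,
        "try/miss": try_miss,
        "if/hit": if_hit,
        "try/hit": try_hit,
    }
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeats))
        for name, fn in cases.items()
    }


if __name__ == "__main__":
    for name, secs in run().items():
        print(f"{name:10s} {secs:.6f} s")
```

Taking the minimum over repeats is a common way to reduce noise from other processes. Absolute numbers will depend on your machine and Python version, so only the relative ordering is worth comparing.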
The results were pretty much what I expected:
So what does this say? In the success case, where you’re incrementing the counter on a key you’ve already seen, you do incur a penalty for performing the “not in” check first. In the failure case, though, the overhead of handling the exception slows things down much more dramatically.
Like most problems in computer science, there’s no correct answer here, only a set of tradeoffs. If you are likely to have a lot of lookup misses, using the “if … not in …” paradigm will give you better performance, but if you’re likely to have a lot of lookup hits, then the “try … except” paradigm will serve you better.
As always, your mileage may vary, and the best answer for determining what is slow and fast in YOUR application is profiling.