Friday, April 20, 2018

except BaseException or except Exception

I've recently become the 'python expert' in a small company that had a bunch of people who were writing python code, but didn't really have anyone that had studied python. Our code had a lot of cases of this:

try:
    <probably not foolish thing>
except:
    <deal with exceptions>

I've been dutifully correcting it to, and teaching the others to correct it to, this:

try:
    <probably not foolish thing>
except Exception as err:
    <deal with err>

But today I was code reviewing a change where a colleague had applied an automated linter, and one of those instances had been replaced by

try:
    <probably not foolish thing>
except BaseException as err:
    <deal with err>

I had a feeling this was wrong. But I didn't want to just go with a feeling on it and tried to google what circumstances I should use which in. I failed, so I asked twitter.

Essentially, Exception is a subclass of BaseException. So are several things that you probably really do want to kill your program (the one most noted in the twitter replies was ctrl + c / keyboard interrupt). So, except BaseException does what a bare except did, which is why the linter suggested it. But except Exception is almost certainly what you meant.


From the python docs

BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StopAsyncIteration
      +-- ArithmeticError
      |    +-- FloatingPointError
      |    +-- OverflowError
      |    +-- ZeroDivisionError
      +-- AssertionError
      +-- AttributeError
      +-- BufferError
      +-- EOFError
      +-- ImportError
      |    +-- ModuleNotFoundError
      +-- LookupError
      |    +-- IndexError
      |    +-- KeyError
      +-- MemoryError
      +-- NameError
      |    +-- UnboundLocalError
      +-- OSError
      |    +-- BlockingIOError
      |    +-- ChildProcessError
      |    +-- ConnectionError
      |    |    +-- BrokenPipeError
      |    |    +-- ConnectionAbortedError
      |    |    +-- ConnectionRefusedError
      |    |    +-- ConnectionResetError
      |    +-- FileExistsError
      |    +-- FileNotFoundError
      |    +-- InterruptedError
      |    +-- IsADirectoryError
      |    +-- NotADirectoryError
      |    +-- PermissionError
      |    +-- ProcessLookupError
      |    +-- TimeoutError
      +-- ReferenceError
      +-- RuntimeError
      |    +-- NotImplementedError
      |    +-- RecursionError
      +-- SyntaxError
      |    +-- IndentationError
      |         +-- TabError
      +-- SystemError
      +-- TypeError
      +-- ValueError
      |    +-- UnicodeError
      |         +-- UnicodeDecodeError
      |         +-- UnicodeEncodeError
      |         +-- UnicodeTranslateError
      +-- Warning
           +-- DeprecationWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning
           +-- FutureWarning
           +-- ImportWarning
           +-- UnicodeWarning
           +-- BytesWarning
           +-- ResourceWarning

Saturday, January 13, 2018

When json.dumps(json.loads(x)) != x

I hit a minor bug in some code where it seemed like .readlines() was stripping lines and .writelines() wasn't putting them back in. I started adding them back manually, but it irked me.

Following up on a twitter conversation about it, I saw that it wasn't actually readlines and writelines that was the problem. When I read the data, I was doing a json.loads() to turn the data from a string into a dictionary, and then later using json.dumps() to turn that (now modified) dictionary back into a string. See the problem yet?


code demonstrating that loading and then dumping a json string with new lines removes them

json.loads turns the string, however messy, into a nice, neat python object. But json.dumps creates the neatest possible json form of that object. So any extra white space, including those new lines I expected, gets stripped out in the process.

I'm currently still of the opinion that readlines should strip out those extra lines, and writelines should put them in. If you're counting the new line character as the way you know to go on to a new thing, they how can it also be part of that first thing? Shouldn't it behave more like .split('\n')? But I'm open to convincing on this.

Monday, November 27, 2017

My first real open source project!

Okay, I might be overselling it there in the title. But I have created an open source project, that has a real, publicly available web app and has even had a contribution (some spelling corrections) from another human being. It's a small victory, but I'll take it!

Wednesday, October 25, 2017

Today in things I didn't know about Python - __call__

If you define a __call__ method in your class it will get called when you call an instance of that class. I didn't know that. That is all.

In [336]: class foo:
    def __init__(self):
        print "in init"
    def __call__(self):
        print "in call"

In [337]: bar = foo()
in init

In [338]: bar()
in call

Monday, May 22, 2017

Today in things I didn't know in Python - sets

I'm slowly and thoroughly reading through the Python documentation. From the amount I'm learning in the process, I clearly should have done this some time ago.

Today I'm reading about sets. I knew a few things about sets:
- Sets are mutable, unordered, collections of unique objects
- You can declare an empty set with 'set()'
- You can declare a set with things in it like this: '{1, 2, 2}'
- You can get the difference, union, intersection, etc. like this:
 - my_set.thing_I_want(other_set)
- How to create a set comprehension

But there were things I didn't know about sets:
- You don't have to use the .operation() version, you can use operators! Some of which are more intuitive to me than others.

Operation Equivalent Result
s.issubset(t) s <= t test whether every element in s is in t
s.issuperset(t) s >= t test whether every element in t is in s
s.union(t) s | t new set with elements from both s and t
s.intersection(t) s & t new set with elements common to s and t
s.difference(t) s - t new set with elements in s but not in t
s.symmetric_difference(t) s ^ t new set with elements in either s or t but not both
s.update(t) s |= t return set s with elements added from t
s.intersection_update(t) s &= t return set s keeping only elements also found in t
s.difference_update(t) s -= t return set s after removing elements found in t
s.symmetric_difference_update(t) s ^= t return set s with elements from s or t but not both


- There is such a thing as immutable set (it's another class)*
- If you do use the operation_name function version for union, intersection, difference, or symmetric_difference, you don't need to cast the second thing to a set (it just needs to be iterable)


*Edited to add that I've realised it's depreciated and replaced by frozenset since 2.6

Monday, April 24, 2017

Can you use a tuple as a dictionary key? Well, that depends.

Today I was asked if you can use a tuple as a dictionary key. I wasn't sure, and looked into it. And the answer seems to be, 'it depends'.

Can you ever use a tuple as a dictionary key? Yes:


In [35]: my_tup = (1, 2)

In [36]: my_dict = {my_tup: 1}

In [37]: my_dict
Out[37]: {(1, 2): 1}



Can you ALWAYS use a tuple as a dictionary key? Nope:

In [38]: my_other_tup = ([1, 2], 2)

In [39]: my_dict = {my_other_tup: 2}
------------------------------------------------------------
TypeError                 Traceback (most recent call last)
<ipython-input...> in <module>()
----> 1 my_dict = {my_other_tup: 2}

TypeError: unhashable type: 'list'


You can use a tuple as a dictionary key, but only if it's hashable all the way down.

Thursday, March 30, 2017

Don't put nullable fields in your MySQL unique keys

MySQL will allow you to include a nullable field in a unique key. I heartilly recommend that you don't do it.

You may already be aware that NULL isn't equal to anything. That's why we always have 'IS NULL' instead of '= NULL' in our where clauses. When they say not equal to anything, they really mean not equal to anything. Not even itself.

So let's say you have a table that looks a bit like this:

always1 always2 always3 sometimes1 sometimes2
a b c d e
f g h NULL NULL
i value1 value2 NULL NULL

and you have a unique key on always2, always3, and sometimes1

You build some logic around the fairly reasonable idea 'we'll try to insert, and if we get a duplicate key error, we'll update instead'.

And then you try to insert
'always2 = "value1", always3 = "value2", sometimes1 = NULL, always1 = m, sometimes2 = n'
expecting this to fail because of a duplicate key error and update the values for always1 and sometimes2 for the third row above. But it won't. Because NULL isn't equal to itself. So you'll end up with a table like this:

always1 always2 always3 sometimes1 sometimes2
a b c d e
f g h NULL NULL
i value1 value2 NULL NULL
m value1 value2 NULL n


Beware!