Summing Lists of Tuples
A coworker came to me with a simple python problem and asked if I could find a better solution. The problem is as follows.
The Problem
Given a list of 2 element tuples (suchas x and y coordinates), how do I sum the tuples so that they only sum the respective elements (sum(x), sum(y)).
Example:
1 | [(1,2), (3,4), (5,6), (7,8)] => (16, 20) |
For the following solutions, we will give that
1 | tuple_list = [(1,2), (3,4), (5,6), (7,8)] |
and for performance testing:
1 | long_tuple_list = [[(1,2), (3,4), ..., (999, 1000)]] |
All code and tests will be run in python 3.6.
Beginner Solution
So the Python101 type solution would look a little something like this:
1 | def sum_tuples(tuples): |
Our results from %timeit
with jupyter notebook as as follows:
1 | > %timeit sum_tuples(long_tuple_list) |
Proposed Solution
The solution my coworker came up with went something like
1 | def sum_tuples(tuples): |
This is pretty nice, its a one liner that makes good use of builtin zip
and uses,
sum
. This performs nicely due to the list comprehension:
1 | %timeit sum_tuples(long_tuple_list) |
However there is an unnecessary list casting in here since we want to return a tuple, so we can refactor this to:
1 | def sum_tuples(tuples): |
Remove the square brackets to cast directly to tuple results in the same speed but saves us two characters and an operation.
Map Solution
After falling in love with functional programming, my first gut reaction to this
was to use map
:
1 | def sum_tuples(tuples): |
This is probably my favorite. One liner, all builtin keyword methods. It replaces
the comprehension with map
and performs a hair faster too:
1 | %timeit sum_tuples(long_tuple_list) |
Reduce Solution
In functional programming, this type of problem where you go from many to one is
often handled with a reducer. Since python3, reduce
is no longer a builtin keyword,
but it is still available via functools
:
The initial reduce with a lambda looks like:
1 | from functools import reduce |
So to me, this looks weird. I don’t like the lambda
anonymous function, we are
calling map
anyways inside, and the whole thing is a bit hard to read. Also
due to the anonymous lambda
, the thing is slow:
1 | %timeit sum_tuples(long_tuple_list) |
But lets not count reduce dead yet, we can clean up that anonymous function with a named one:
1 | from functools import reduce |
This to me looks like the best functional approach as far as signal to noise. It
defines a function on how to add two tuples together, then uses reduce
to reduce
the list to a single result. The code reads exactly as it does. However since
reduce
is no longer first class, it’s performance is pretty similar to the above
results.
Numpy
To be complete, we could use numpy to do matrix addition to get our answer. *Note,
numpy
is huge. According to pypi, its between 4MB and 12MB depending on your
operating system. This is the equvalent of killing a fly with a sledge hammer.
1 | from numpy import array |
Numpy array
s do matrix math, so adding them will result the behavior that we want.
However, the cast to array
comes with a cost:
1 | %timeit sum_tuples(long_tuple_list) |
OOP Solution
Again for completeness, we can look at an OOP solution to this problem.
1 | class Mtuple(tuple): |
So here we cast all of our tuples to a new class, Mtuple
, which defines a __add__
function. This allows us to override the behavior of +
or sum
on these objects.
We then can simply sum
the list of Mtuple
, given we give sum
a starting value of
Mtuple((0,0)))
. However we run into the same problem with casting as we did with
numpy:
1 | %timeit sum_tuples(long_tuple_list) |
Conclusion
While I like how clean reduce
looks, due to it no longer being “pythonic”, I would
have to say the best way to solve this problem is either using a comprension or
using map
. Personally I think map(sum, zip(*tuple_list))
is the best because
it is lazily evaluated and can be further composed.