Windowing an iterable with itertools
Like most Python developers, I make heavy use of Python’s iterator protocol. It’s simple and it’s memory-efficient. An iterator hands out the items of an iterable one at a time, each time “next” is called, which means you cannot look at the upcoming value without advancing the iterator and consuming that value. So what if we want to peek at the next value without advancing? The recipe below solves this with a wrapper class, built on itertools.tee, that adds two methods: peek and prev.
from itertools import tee

class Iterator(object):
    """Intended to be used inside a while loop"""

    def __init__(self, iterable):
        # Two independent copies: _a trails one step behind _b.
        self._a, self._b = tee(iter(iterable), 2)
        self._previous = None
        # Prime the look-ahead; None stands in for "nothing left".
        self._peeked = next(self._b, None)

    def __iter__(self):
        return self

    def __next__(self):
        # Raises StopIteration once the underlying data is exhausted.
        self._previous = next(self._a)
        self._current = self._peeked
        self._peeked = next(self._b, None)
        return self._current

    next = __next__  # Python 2 spelling of the same method

    def prev(self):
        return self._previous

    def peek(self):
        return self._peeked
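For readers who haven't used itertools.tee, here is a minimal sketch of the mechanism the class relies on: tee splits one iterator into independent copies, and advancing one copy does not advance the other. The names lagging and leading are illustrative only, and the sketch uses the built-in next(), which works on Python 2.6+ and Python 3.

```python
from itertools import tee

# Split one iterator into two independent copies (names are illustrative).
lagging, leading = tee(iter([1, 2, 3]), 2)

# Advance only the second copy, as the constructor does to prime "peeked".
first_peek = next(leading)

# The first copy has not moved: it still starts from the first item.
current = next(lagging)
upcoming = next(leading)

print(first_peek, current, upcoming)  # 1 1 2
```

Since one copy only ever runs a single step ahead of the other, tee has very little to buffer between them.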
Notice that we only need two copies of the original iterable, not three. Initially the “current” value is undefined and the “peeked” value is the first item of the iterable. Each call to “next” makes the old “peeked” value the new “current” one and advances “peeked” by one item; once the data runs out, “peeked” becomes None. The “previous” value starts at None and holds the most recently consumed item, so reading it just before calling “next” gives the item one step behind the one about to be returned. Together, prev, next and peek give us a sliding window over the data. For example,
I = Iterator([1, 2, 3, 4])
# Tuple elements are evaluated left to right, so prev() runs before next().
fcn = lambda itr: (itr.prev(), itr.next(), itr.peek())
L = []
while True:
    try:
        L.append(fcn(I))
    except StopIteration:
        break
print('L: %s' % L)
# L: [(None, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, None)]
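The wrapper can also be handed straight to a for loop, where peek is handy for spotting the last item. Below is a minimal sketch, assuming Python 3 (where the method is spelled __next__). The class name PeekIterator and the sample data are made up for this illustration, and the class is a trimmed copy of the recipe above.

```python
from itertools import tee


class PeekIterator:
    """Trimmed Python 3 copy of the recipe (hypothetical name)."""

    def __init__(self, iterable):
        self._a, self._b = tee(iter(iterable), 2)
        self._previous = None
        self._peeked = next(self._b, None)  # None marks "nothing left"

    def __iter__(self):
        return self

    def __next__(self):
        self._previous = next(self._a)  # StopIteration ends the loop
        current = self._peeked
        self._peeked = next(self._b, None)
        return current

    def peek(self):
        return self._peeked


words = PeekIterator(["alpha", "beta", "gamma"])
parts = []
for word in words:
    # peek() shows the upcoming item without consuming it.
    is_last = words.peek() is None
    parts.append(word if is_last else word + ",")

print(" ".join(parts))  # alpha, beta, gamma
```

Note that this end-of-data test only works when the data itself never contains None.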
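Using Iterator inside a ‘for’ loop works just like you’d expect, since it follows the normal iterator protocol. But what if we want the “previous” and “peeked” values inside a ‘for’ loop? There, “next” is called for us before the loop body runs, so by the time we can call prev it already returns the current item. The recipe can be modified so that each iteration returns all three values: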
class Window(object):
    """Intended to be used with a for loop"""

    def __init__(self, iterable):
        self._a, self._b = tee(iter(iterable), 2)
        self._previous = None
        self._peeked = next(self._b, None)

    def __iter__(self):
        return self

    def __next__(self):
        # Remember the old "previous" before it is overwritten.
        _prev = self._previous
        self._previous = next(self._a)
        self._current = self._peeked
        self._peeked = next(self._b, None)
        return _prev, self._current, self._peeked

    next = __next__  # Python 2 spelling of the same method

    def prev(self):
        return self._previous

    def peek(self):
        return self._peeked
Then,
W = Window([1, 2, 3, 4])
L = []
for prev, current, peeked in W:
    L.append((prev, current, peeked))
print('L: %s' % L)
# L: [(None, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, None)]
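For comparison, the same triples can be produced by a plain generator function without tee. This is an alternative sketch, not the recipe above: the name window and the use of None as padding at both ends are assumptions chosen to match the output shown, and it targets Python 3.

```python
def window(iterable):
    """Yield (previous, current, upcoming) triples; None pads both ends."""
    it = iter(iterable)
    previous = None
    try:
        current = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    for upcoming in it:
        yield previous, current, upcoming
        previous, current = current, upcoming
    # The last item has no successor.
    yield previous, current, None


triples = list(window([1, 2, 3, 4]))
print(triples)  # [(None, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, None)]
```

The class-based version is still the one to reach for when peek and prev need to be callable between iterations.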
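The coolest thing, though, is that any iterable can be used with Window and Iterator, because both call iter() on whatever they are given. This includes generators, which are still consumed lazily, one item of look-ahead at a time. One caveat: None doubles as the “nothing there” marker, so a None in the data looks the same as running off either end.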
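To make the generator claim concrete, here is a minimal sketch, assuming Python 3, that feeds a generator to a trimmed copy of Window. The generator name countdown and the log list are made up for the illustration. The log shows that items are pulled from the generator only one step ahead of the loop.

```python
from itertools import tee


class Window:
    """Trimmed Python 3 copy of the Window recipe."""

    def __init__(self, iterable):
        self._a, self._b = tee(iter(iterable), 2)
        self._previous = None
        self._peeked = next(self._b, None)

    def __iter__(self):
        return self

    def __next__(self):
        prev = self._previous
        self._previous = next(self._a)
        current = self._peeked
        self._peeked = next(self._b, None)
        return prev, current, self._peeked


log = []  # records when the generator actually produces each value


def countdown(n):
    """Hypothetical generator: yields n, n-1, ..., 1."""
    while n > 0:
        log.append(n)
        yield n
        n -= 1


w = Window(countdown(3))
produced_at_start = list(log)   # only the primed look-ahead so far
first = next(w)
produced_after_one = list(log)  # one more item fetched for the new peek
rest = list(w)

print(produced_at_start, first, produced_after_one, rest)
# [3] (None, 3, 2) [3, 2] [(3, 2, 1), (2, 1, None)]
```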
