Windowing an iterable with itertools

Like most Python developers, I make heavy use of Python’s iterator protocol.  It’s easy and it’s efficient.  An iterator hands out an iterable’s items one at a time, each time “next” is called, which means you cannot look at the upcoming value without advancing the iterator (and thus consuming the data).  So what if we want to peek at the next value without advancing?  The recipe below solves this with a wrapper class that adds two methods, peek and prev.  (The code is written for Python 2.)

from itertools import tee

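class Iterator(object):
    """Intended to be used inside a while loop"""
    def __init__(self, iterable):
        # tee() yields two independent iterators; _b runs one item ahead of _a.
        self._a, self._b = tee(iter(iterable), 2)
        self._previous = None
        # Default to None so that an empty iterable does not raise here.
        self._peeked   = next(self._b, None)

    def __iter__(self):
        return self

    def next(self):
        # Raises StopIteration once the data is exhausted.
        self._previous = next(self._a)
        self._current  = self._peeked
        try:
            self._peeked = next(self._b)
        except StopIteration:
            self._peeked = None
        return self._current

    def prev(self):
        """Return the value most recently returned by next()."""
        return self._previous

    def peek(self):
        """Return the value the following next() call will return."""
        return self._peeked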

Notice that we only need two, not three, copies of the original iterable.  Initially the “current” value is undefined and the “peeked” value is the first item of the iterable.  As we consume the data, the “current” becomes the “peeked” while the “peeked” advances by one.  The “previous” value starts at None and holds whatever “next” returned last, so prev must be called before next to get the item that precedes the new current one.  Together, prev, next and peek give us a sliding window over the data.  For example,

I = Iterator([1, 2, 3, 4])
fcn = lambda itr: (itr.prev(), itr.next(), itr.peek())

L = []
while True:
    try:
        L.append(fcn(I))
    except StopIteration:
        break

print 'L:', L

>>> L: [(None, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, None)]
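
The listing above is Python 2.  For readers on Python 3, here is a minimal sketch of the same idea.  It assumes nothing beyond the standard library: the protocol method is spelled __next__ there, and the built-in next() replaces the .next() calls.  The names PeekingIterator and collect are mine and not part of the original recipe.

```python
from itertools import tee


class PeekingIterator:
    """Python 3 sketch of the Iterator recipe (hypothetical name)."""

    def __init__(self, iterable):
        # Two independent iterators over the same data; _b runs one ahead.
        self._a, self._b = tee(iterable, 2)
        self._previous = None
        # next() with a default avoids StopIteration on an empty iterable.
        self._peeked = next(self._b, None)

    def __iter__(self):
        return self

    def __next__(self):
        # Raises StopIteration when the data is exhausted.
        self._previous = next(self._a)
        current = self._peeked
        self._peeked = next(self._b, None)
        return current

    def prev(self):
        return self._previous

    def peek(self):
        return self._peeked


def collect(iterable):
    """Gather (prev, current, peek) triples, calling prev() before next()."""
    it = PeekingIterator(iterable)
    triples = []
    while True:
        before = it.prev()
        try:
            current = next(it)
        except StopIteration:
            break
        triples.append((before, current, it.peek()))
    return triples


print(collect([1, 2, 3, 4]))
# [(None, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, None)]
```

Reading prev() into a variable before calling next() makes the ordering explicit, instead of relying on the left-to-right evaluation of a tuple.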

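Using Iterator inside a ‘for’ loop works just like you’d expect, since it follows the normal iterator protocol.  But the loop calls “next” for us, so by the time the loop body runs, prev already returns the current item.  What if we want the “previous” and “peeked” values inside a ‘for’ loop as well?  The recipe can be modified so that “next” returns all three values: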

class Window(object):
    """Intended to be used with a for loop"""
    def __init__(self, iterable):
        # tee() yields two independent iterators; _b runs one item ahead of _a.
        self._a, self._b = tee(iter(iterable), 2)
        self._previous = None
        # Default to None so that an empty iterable does not raise here.
        self._peeked   = next(self._b, None)

    def __iter__(self):
        return self

    def next(self):
        # Remember the old "previous" before it is overwritten below.
        _prev = self._previous
        # Raises StopIteration once the data is exhausted.
        self._previous = next(self._a)
        self._current  = self._peeked
        try:
            self._peeked = next(self._b)
        except StopIteration:
            self._peeked = None
        return _prev, self._current, self._peeked

    def prev(self): return self._previous

    def peek(self): return self._peeked

Then,

W = Window([1, 2, 3, 4])
L = []

for prev, current, peeked in W:
    L.append((prev, current, peeked))

print 'L:', L

>>> L: [(None, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, None)]

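The coolest thing, though, is that any iterable can be used with Window and Iterator, and that includes generators.  One caveat: None marks a missing neighbor, so if the data itself can contain None, a missing neighbor and a real None look the same.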
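
As a hedged alternative for Python 3, the same (previous, current, next) triples can also come from a plain generator function, with no tee at all.  This is not the recipe above but a different technique: it reads one item ahead by hand.  The names window, fill, squares and MISSING are my own.  The fill argument lets the caller pick a marker other than None for a missing neighbor, which helps when the data itself may contain None.

```python
def window(iterable, fill=None):
    """Yield (previous, current, upcoming) for each item of iterable.

    `fill` stands in for the missing neighbor at either end.
    """
    it = iter(iterable)
    try:
        current = next(it)
    except StopIteration:
        return  # empty input: yield nothing
    previous = fill
    for upcoming in it:
        yield previous, current, upcoming
        previous, current = current, upcoming
    # Last item: nothing follows it.
    yield previous, current, fill


def squares(n):
    """A small generator used as lazy input."""
    for i in range(1, n + 1):
        yield i * i


print(list(window(squares(4))))
# [(None, 1, 4), (1, 4, 9), (4, 9, 16), (9, 16, None)]

MISSING = object()  # unique marker, cannot collide with real data
print([nxt is MISSING for _, _, nxt in window([None, None], fill=MISSING)])
# [False, True]
```

The second call shows the point of the marker: the first item has a real None after it, while the last item has nothing after it, and the two cases stay distinguishable.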
