The Iterator Protocol

Laziness

An iterable is anything that you can iterate over. In other words, anything you can loop over with a for loop.

Generators are iterables, but they’re a little weird as iterables go.

Generators are lazy. As we loop over generators, they don’t compute the next item in the loop until we ask them to.

We can see this by manually looping over a generator using the built-in next function:

>>> def count(n=0):
...     print("start")
...     while True:
...         yield n
...         n += 1
...         print("loop")
...     print("end")
...
>>> c = count()
>>> next(c)
start
0
>>> next(c)
loop
1
>>> next(c)
loop
2
>>> next(c)
loop
3
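
Because generators are lazy, we can even take a few values from a generator that never ends. Here is a minimal sketch of that idea. It assumes a quiet variant of count named count_up (a name made up for this example) and uses islice from the standard library's itertools module, which is not otherwise covered here:

```python
from itertools import islice


def count_up(n=0):
    """Yield n, n + 1, n + 2, ... forever (count without the print calls)."""
    while True:
        yield n
        n += 1


# islice asks the generator for only five values; later values are never computed.
first_five = list(islice(count_up(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Calling list(count_up()) directly would never finish, since the generator has no end.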

Single-use

Notice that each time we call next(c), our generator picks up where it left off. Let’s try that on a different generator:

>>> def generatorify(iterable):
...     for i in iterable:
...         yield i
...
>>> g = generatorify([1, 2, 3, 4])
>>> g
<generator object generatorify at 0x7f57a0436360>
>>> next(g)
1
>>> next(g)
2
>>> for x in g:
...     print(x)
...
3
4

Generators keep track of where they were so that when we loop over them they’ll always start up where they left off.

What happens if we loop over a generator that’s all done?

>>> for x in g:
...     print(x)
...

It doesn’t give us anything else. Let’s try calling next on it:

>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Raising a StopIteration error is the generator’s way of telling us that we’ve used it all up.

Generators are single-use. By that I don’t mean we can only loop over them once: I mean that they don’t have a reset button. Once they’ve run all the way through, we can’t start them over.
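
One way to see the missing reset button is to consume the same generator twice. This is a small sketch that reuses the generatorify function from above; the variable names are only for illustration:

```python
def generatorify(iterable):
    for item in iterable:
        yield item


g = generatorify([1, 2, 3, 4])
first_pass = list(g)   # consumes every item
second_pass = list(g)  # the generator is already exhausted
print(first_pass)   # [1, 2, 3, 4]
print(second_pass)  # []

# To loop again we need a brand new generator object.
fresh = list(generatorify([1, 2, 3, 4]))
print(fresh)  # [1, 2, 3, 4]
```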

Iterators

So we can use the built-in next function to loop over a generator.

What would happen if we used next on a list?

>>> numbers = [1, 2, 3, 4, 5]
>>> next(numbers)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator

We get an error. Python is telling us that we cannot loop over our list with next because our list is not an iterator, even though it is an iterable.

Let’s review what we know at the moment:

  1. next only works on iterators

  2. next does not work on lists because they are not iterators

  3. next does work on generators so they must be iterators

We can create an iterator for our list using the built-in iter function:

>>> iter(numbers)
<list_iterator object at 0x7f57a03d66d8>
>>> i = iter(numbers)
>>> next(i)
1
>>> next(i)
2
>>> next(i)
3

This iterator object acts just like our generator did. It even keeps track of its place:

>>> for x in i:
...     print(x)
...
4
5

If we want to loop over the list again we can get a new iterator:

>>> i = iter(numbers)
>>> next(i)
1

What happens when we get an iterator from a generator?

>>> g = generatorify(numbers)
>>> g
<generator object generatorify at 0x7f57a0436360>
>>> iter(g)
<generator object generatorify at 0x7f57a0436360>

We get the generator right back!

>>> i = iter(g)
>>> i is g
True

Asking a generator for an iterator of itself gives us back a reference to itself!

Iterator Protocol

All this iterator and generator stuff is a little confusing out of context.

To really understand what’s going on, we need to talk about the iterator protocol.

The iterator protocol:

  1. An iterable is anything that you can get an iterator from using iter

  2. An iterator is an iterable that you can loop over using next

A few caveats:

  1. An iterator is “exhausted” (completed) if calling next raises a StopIteration exception

  2. When you use iter on an iterator, you’ll get the same iterator back

  3. Not all iterators can be exhausted (they can keep giving next values forever if they want)

So those generators we were working with are iterators. And all iterators are also iterables.

This means that the generatorify function we made earlier is sort of just a re-implementation of the iter function:

>>> generatorify(numbers)
<generator object generatorify at 0x7f57a04363b8>
>>> iter(numbers)
<list_iterator object at 0x7f57a03c4278>
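
The protocol rules above can also be followed by hand, without a generator. The sketch below assumes a made-up Countdown class (it is not part of the exercises). It relies on the fact that next calls an object's __next__ method and iter calls its __iter__ method:

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # Asking an iterator for an iterator gives back the same object.
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal that the iterator is exhausted.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


c = Countdown(3)
same = iter(c) is c
first = next(c)
rest = list(c)
print(same)   # True
print(first)  # 3
print(rest)   # [2, 1]
```

Once rest has been built, c is exhausted, so another next(c) call raises StopIteration.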

How for loops work

The iterator protocol exists so that for loops work and so that you can make your own custom iterables that work with for loops.

You don’t often need to understand the depths of the iterator protocol outside of knowing that it’s the way iteration works under the hood.

With this new information, let’s try to re-implement a for loop using a while loop.

Take this function:

def print_each(iterable):
    for item in iterable:
        print(item)

And re-implement it using only a while loop.

You can make sure your function works like this (sets are unordered, so your output order may differ):

>>> print_each({1, 2, 3})
1
2
3
>>> print_each({"a", "b", "c"})
a
c
b

The answer:

def print_each(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted: stop the loop
        else:
            print(item)

You can see that the while loop will go on forever unless the iterator we got from the input iterable eventually runs out and raises a StopIteration exception.

Iteration

>>> numbers = [1, 2, 3, 4, 5]

What are some different ways we can try to iterate over something?

We can use a for loop:

>>> for n in numbers:
...     print(n)
...
1
2
3
4
5

We can use a comprehension:

>>> [n ** 2 for n in numbers]
[1, 4, 9, 16, 25]

What about multiple assignment?

>>> a, b, c, d, e = numbers
>>> a
1
>>> b
2
>>> c
3
>>> d
4
>>> e
5

What about iterable unpacking when calling a function?

>>> my_range = [1, 6]
>>> range(*my_range)
range(1, 6)
>>> list(range(*my_range))
[1, 2, 3, 4, 5]
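
All of these forms of iteration go through the same protocol, so they work on generators as well as lists. A short sketch, assuming a made-up squares generator function; sum and star-assignment are extra examples not shown above:

```python
def squares(numbers):
    """Lazily yield the square of each number."""
    for n in numbers:
        yield n ** 2


a, b, c = squares([1, 2, 3])          # multiple assignment
total = sum(squares([1, 2, 3]))       # a built-in that accepts any iterable
first, *rest = squares([1, 2, 3, 4])  # star-assignment collects the remainder
print(*squares([1, 2, 3]))            # unpacking into a function call: 1 4 9

print(a, b, c)  # 1 4 9
print(total)    # 14
print(first)    # 1
print(rest)     # [4, 9, 16]
```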

Iterables

So to review, an iterable is anything that you can loop over. We can think of iterables as anything from which we can create an iterator by using the iter() built-in function.

Likewise, an iterator is an iterable that will work with the built-in next() function. A generator is an easy way to implement __iter__.

What if you want to make your own iterables? You’d need your custom object to work with iter somehow.

But how does iter actually work?

>>> numbers = [1, 2, 3, 4]
>>> iter(numbers)
<list_iterator object at 0x7f57a03b6a90>
>>> numbers.__iter__()
<list_iterator object at 0x7f57a03b19b0>

The built-in iter function works by calling __iter__ on the object it’s given. So if your object has an __iter__ method that returns an iterator, it will be an iterable!

We’ll talk more about that later.
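
As a small preview, here is a sketch of a custom iterable whose __iter__ method is written as a generator. The Playlist class and its songs attribute are invented for this example:

```python
class Playlist:
    """Hypothetical iterable: it holds songs but is not itself an iterator."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __iter__(self):
        # A generator method: every call returns a fresh iterator.
        for song in self.songs:
            yield song


playlist = Playlist(["intro", "verse", "outro"])
first_pass = list(playlist)
second_pass = list(playlist)  # works again because iter() makes a new generator
print(first_pass)   # ['intro', 'verse', 'outro']
print(second_pass)  # ['intro', 'verse', 'outro']
```

Unlike a generator, a Playlist can be looped over as many times as we like. Calling next(playlist) directly raises a TypeError, because a Playlist is an iterable but not an iterator.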

iter and next

There’s a little more to iter and next that we aren’t going to go into.

For example, next accepts an optional second argument that is returned as a default when the iterator is exhausted, instead of raising a StopIteration exception.

You can look up documentation on those functions or pass them to the help function to find out more.
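
As a quick taste, this sketch shows the default argument to next, plus the two-argument form of iter, which keeps calling a function until it returns a sentinel value. The io.StringIO object here is just a stand-in for a file:

```python
import io

numbers = [1, 2]
i = iter(numbers)
# The third call returns the default instead of raising StopIteration.
results = [next(i, "done") for _ in range(3)]
print(results)  # [1, 2, 'done']

# iter(callable, sentinel): call stream.readline until it returns "".
stream = io.StringIO("a\nb\n")
lines = list(iter(stream.readline, ""))
print(lines)  # ['a\n', 'b\n']
```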

Iterator Exercises

These exercises are all in the iterators.py file in the exercises directory. Edit the file to add the functions or fix the error(s) in the existing function(s). To run the test: from the exercises folder, type python test.py <function_name>, like this:

$ python test.py first

First

Edit the function first so that it returns the first item in any iterable:

>>> from iterators import first
>>> first(iter([1, 2]))
1
>>> first([1, 2])
1

Is Iterator

Edit the function is_iterator so that it accepts an iterable and returns True if the given iterable is an iterator.

Example:

>>> from iterators import is_iterator
>>> is_iterator(iter([]))
True
>>> is_iterator([1, 2])
False
>>> i = iter([1, 2])
>>> is_iterator(i)
True
>>> list(i)
[1, 2]
>>> def gen(): yield 4
...
>>> is_iterator(gen())
True

Point

Make a Point class that stores 3-dimensional coordinates. Your Point class should work with multiple assignment, like this:

>>> p = Point(2, 3, 6)
>>> x, y, z = p
>>> x
2
>>> y
3
>>> z
6

All Same

Edit the function all_same so that it accepts an iterable and returns True if all items in the iterable are equal to each other.

Example:

>>> from iterators import all_same
>>> all_same(n % 2 for n in [3, 5, 7, 8])
False
>>> all_same(n % 2 for n in [3, 5, 7, 9])
True

Your function should work with any iterable and any items that can be compared (including unhashable ones). It should return as soon as an unequal value is found.

minmax

Edit the function minmax to accept an iterable and return the minimum and maximum values of that iterable.

Example:

>>> from iterators import minmax
>>> minmax(n**2 for n in [9, 5, 2, 8])
(4, 81)

Your minmax function should accept any iterable.

Note

This function should not copy every item in the supplied iterable into a new list. Process the items one by one so that your function won’t have any memory concerns with extremely long/large iterables.

Random Number

Make an inexhaustible iterator object RandomNumberGenerator that returns random integers between two numbers (inclusive).

Example:

>>> number_generator = RandomNumberGenerator(4, 8)
>>> next(number_generator)
4
>>> next(number_generator)
7
>>> next(number_generator)
8
>>> iter(number_generator) is number_generator
True

Dictionary Changes

  1. Create an empty dictionary

  2. Get an iterator for the dictionary

  3. Add an item to the dictionary

  4. Try to get the next item out of the iterator

What happened?

List Changes

  1. Create a list with two items

  2. Get an iterator from the list

  3. Get the next item from the iterator

  4. Insert an item at the beginning of the list

  5. Get the next item from the iterator

What happened?
