The Iterator Protocol¶
Laziness¶
An iterable is anything that you can iterate over. In other words, anything you can loop over with a for loop.
Generators are iterables, but they’re a little weird as iterables go.
Generators are lazy. As we loop over generators, they don’t compute the next item in the loop until we ask them to.
We can see this by manually looping over a generator using the built-in next function:
>>> def count(n=0):
...     print("start")
...     while True:
...         yield n
...         n += 1
...         print("loop")
...     print("end")
...
>>> c = count()
>>> next(c)
start
0
>>> next(c)
loop
1
>>> next(c)
loop
2
>>> next(c)
loop
3
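Laziness is what makes an endless generator like this one usable at all. As a minimal sketch (the print calls are left out, and the use of itertools.islice is an illustration rather than something this tutorial relies on), we can take just the first few items without the generator ever running further than we ask:

```python
from itertools import islice


def count(n=0):
    """Yield n, n + 1, n + 2, ... forever."""
    while True:
        yield n
        n += 1


# islice asks the generator for only five items, so the infinite
# while loop inside count() is paused after the fifth yield.
first_five = list(islice(count(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Calling list(count()) directly would never finish, because list keeps asking for the next item until the generator is exhausted.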
Single-use¶
Notice that each time we call next(c), our generator picks up where it left off. Let’s try that on a different generator:
>>> def generatorify(iterable):
...     for i in iterable:
...         yield i
...
>>> g = generatorify([1, 2, 3, 4])
>>> g
<generator object generatorify at 0x7f57a0436360>
>>> next(g)
1
>>> next(g)
2
>>> for x in g:
...     print(x)
...
3
4
Generators keep track of where they were so that when we loop over them they’ll always start up where they left off.
What happens if we loop over a generator that’s all done?
>>> for x in g:
...     print(x)
...
It doesn’t give us anything else. Let’s try calling next on it:
>>> next(g)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Raising a StopIteration exception is the generator’s way of telling us that we’ve used it all up.
Generators are single-use. By that I don’t mean we can only loop over them once: I mean that they don’t have a reset button. Once they’ve run all the way through, we can’t start them over.
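A short sketch of what "no reset button" looks like in practice, reusing the generatorify function from above (the variable names are only for illustration): consuming the same generator object twice gives nothing the second time, and the only way to start over is to call the generator function again.

```python
def generatorify(iterable):
    for item in iterable:
        yield item


numbers = [1, 2, 3, 4]
g = generatorify(numbers)

first_pass = list(g)   # consumes every item
second_pass = list(g)  # the same generator is already exhausted

# Calling the generator function again builds a brand new generator.
fresh_pass = list(generatorify(numbers))

print(first_pass)   # [1, 2, 3, 4]
print(second_pass)  # []
print(fresh_pass)   # [1, 2, 3, 4]
```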
Iterators¶
So we can use the built-in next function to loop over a generator.
What would happen if we used next on a list?
>>> numbers = [1, 2, 3, 4, 5]
>>> next(numbers)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
We get an error. Python is telling us that we cannot loop over our list with next because our list is not an iterator, even though it is an iterable.
Let’s review what we know at the moment:
next only works on iterators
next does not work on lists because they are not iterators
next does work on generators so they must be iterators
We can create an iterator for our list using the built-in iter function:
>>> iter(numbers)
<list_iterator object at 0x7f57a03d66d8>
>>> i = iter(numbers)
>>> next(i)
1
>>> next(i)
2
>>> next(i)
3
This iterator object acts just like our generator did. It even keeps track of its place:
>>> for x in i:
...     print(x)
...
4
5
If we want to loop over the list again we can get a new iterator:
>>> i = iter(numbers)
>>> next(i)
1
What happens when we get an iterator from a generator?
>>> g = generatorify(numbers)
>>> g
<generator object generatorify at 0x7f57a0436360>
>>> iter(g)
<generator object generatorify at 0x7f57a0436360>
We get the generator right back!
>>> i = iter(g)
>>> i is g
True
Asking a generator for an iterator of itself gives us back a reference to itself!
Iterator Protocol¶
All this iterator and generator stuff is a little confusing out of context.
To really understand what’s going on, we need to talk about the iterator protocol.
The iterator protocol:
An iterable is anything that you can get an iterator from using iter
An iterator is an iterable that you can loop over using next
A couple of caveats:
An iterator is “exhausted” (completed) if calling next raises a StopIteration exception
When you use iter on an iterator, you’ll get the same iterator back
Not all iterators can be exhausted (they can keep giving next values forever if they want)
So those generators we were working with are iterators. And all iterators are also iterables.
This means that the generatorify function we made earlier is sort of just a re-implementation of the iter function:
>>> generatorify(numbers)
<generator object generatorify at 0x7f57a04363b8>
>>> iter(numbers)
<list_iterator object at 0x7f57a03c4278>
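To make the protocol concrete, here is a minimal sketch of an iterator written by hand instead of with a generator. The Countdown class is hypothetical and exists only for illustration. It relies on two facts about the built-ins: iter calls the object's __iter__ method and next calls its __next__ method.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself when asked for an iterator.
        return self

    def __next__(self):
        # Raising StopIteration signals that the iterator is exhausted.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


c = Countdown(3)
same_object = iter(c) is c  # True
values = list(c)            # [3, 2, 1]
leftover = list(c)          # [] because c is now exhausted
```

This hand-written version behaves like the generators above: it returns itself from iter, keeps track of its place, and cannot be reset once it has run out.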
How for loops work¶
The iterator protocol exists so that for loops work and so that you can make your own custom iterables that work with for loops.
You don’t often need to understand the depths of the iterator protocol outside of knowing that it’s the way iteration works under the hood.
With this new information, let’s try to re-implement a for loop using a while loop.
Take this function:
def print_each(iterable):
    for item in iterable:
        print(item)
And re-implement it using only a while loop.
You can make sure your function works like this:
>>> print_each({1, 2, 3})
1
2
3
>>> print_each({"a", "b", "c"})
a
c
b
The answer:
def print_each(iterable):
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # Iterator exhausted: stop the loop
        else:
            print(item)
You can see that the while loop will go on forever unless the iterator we got from the input iterable has an end of its own, resulting in the StopIteration exception.
Iteration¶
>>> numbers = [1, 2, 3, 4, 5]
What are some different ways we can try to iterate over something?
We can use a for loop:
>>> for n in numbers:
...     print(n)
...
1
2
3
4
5
We can use a comprehension:
>>> [n ** 2 for n in numbers]
[1, 4, 9, 16, 25]
What about multiple assignment?
>>> a, b, c, d, e = numbers
>>> a
1
>>> b
2
>>> c
3
>>> d
4
>>> e
5
What about iterable unpacking when calling a function?
>>> my_range = [1, 6]
>>> range(*my_range)
range(1, 6)
>>> list(range(*my_range))
[1, 2, 3, 4, 5]
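None of these forms are specific to lists. As a rough sketch (the squares and add functions are made up for this example), the same comprehension, multiple assignment, and argument unpacking all work on a generator, because each of them goes through the iterator protocol:

```python
def squares(numbers):
    """Yield the square of each number, one at a time."""
    for n in numbers:
        yield n ** 2


def add(x, y, z):
    return x + y + z


# Comprehension over a generator
doubled = [n * 2 for n in squares([1, 2, 3])]

# Multiple assignment from a generator
a, b, c = squares([1, 2, 3])

# Iterable unpacking in a function call
total = add(*squares([1, 2, 3]))

print(doubled)  # [2, 8, 18]
print(a, b, c)  # 1 4 9
print(total)    # 14
```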
Iterables¶
So to review, an iterable is anything that you can loop over. We can think of iterables as anything from which we can create an iterator by using the iter() built-in function.
Likewise, an iterator is an iterable that will work with the built-in next() function. A generator is an easy way to implement __iter__.
What if you want to make your own iterables? You’d need your custom object to work with iter somehow.
But how does iter actually work?
>>> numbers = [1, 2, 3, 4]
>>> iter(numbers)
<list_iterator object at 0x7f57a03b6a90>
>>> numbers.__iter__()
<list_iterator object at 0x7f57a03b19b0>
The built-in iter function works by calling __iter__ on the object it’s given. So if your object has an __iter__ method that returns an iterator, it will be an iterable!
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"
    def __iter__(self):
        yield self.x
        yield self.y
Instances of that class can be looped over and unpacked:
>>> x, y = Point(1, 2)
>>> x, y
(1, 2)
>>> list(Point(1, 2))
[1, 2]
Note that the __iter__ method above is a generator function.
A generator is the easiest way to make an iterator.
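One consequence worth spelling out, sketched here with a trimmed-down copy of the Point class: a Point is an iterable but not an iterator. Each call to iter runs the generator function again and hands back a fresh generator, so the same point can be looped over more than once, while calling next on the point itself fails.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __iter__(self):
        # A generator function: each call returns a new generator.
        yield self.x
        yield self.y


p = Point(1, 2)

first = list(p)   # [1, 2]
second = list(p)  # [1, 2] again, since each loop gets a new iterator

# Point has no __next__ method, so it is not an iterator itself.
try:
    next(p)
except TypeError:
    is_iterator = False
else:
    is_iterator = True

print(first, second, is_iterator)  # [1, 2] [1, 2] False
```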
iter and next¶
There’s a little more to iter and next that we aren’t going to go into.
For example, next accepts an optional second argument that will be returned as a default for empty iterators (it is returned instead of raising a StopIteration exception).
You can look up documentation on those functions or pass them to the help function to find out more.
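As a quick sketch of those extras: the first part below shows the default argument to next described above. The second part shows the two-argument form of iter, which the text does not cover; it calls a function repeatedly until the function returns a sentinel value. The io.StringIO object and the chunk size of 3 are just stand-ins chosen for this example.

```python
import io

# next with a default: no StopIteration is raised for an empty iterator.
empty = iter([])
result = next(empty, "nothing left")
print(result)  # nothing left

# iter(callable, sentinel): keep calling the function until it
# returns the sentinel (here, an empty string at end of input).
stream = io.StringIO("abcdefgh")
chunks = list(iter(lambda: stream.read(3), ""))
print(chunks)  # ['abc', 'def', 'gh']
```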
Iterator Exercises¶
These exercises are all in the iterators.py file in the exercises directory. Edit the file to add the functions or fix the error(s) in the existing function(s). To run the test: from the exercises folder, type python test.py <function_name>, like this:
$ python test.py first
First¶
Edit the function first so that it returns the first item in any iterable:
>>> from iterators import first
>>> first(iter([1, 2]))
1
>>> first([1, 2])
1
Is Iterator¶
Edit the function is_iterator so that it accepts an iterable and returns True if the given iterable is an iterator.
Example:
>>> from iterators import is_iterator
>>> is_iterator(iter([]))
True
>>> is_iterator([1, 2])
False
>>> i = iter([1, 2])
>>> is_iterator(i)
True
>>> list(i)
[1, 2]
>>> def gen(): yield 4
...
>>> is_iterator(gen())
True
Point¶
Make a Point class that stores 3-dimensional coordinates.
Your Point class should work with multiple assignment, like this:
>>> p = Point(2, 3, 6)
>>> x, y, z = p
>>> x
2
>>> y
3
>>> z
6
All Same¶
Edit the function all_same so that it accepts an iterable and returns True if all items in the iterable are equal to each other.
Example:
>>> from iterators import all_same
>>> all_same(n % 2 for n in [3, 5, 7, 8])
False
>>> all_same(n % 2 for n in [3, 5, 7, 9])
True
Your function should work with any iterable and any items that can be compared (including unhashable ones). It should return as soon as an unequal value is found.
minmax¶
Edit the function minmax to accept an iterable and return the minimum and maximum values of that iterable.
Example:
>>> from iterators import minmax
>>> minmax(n**2 for n in [9, 5, 2, 8])
(4, 81)
Your minmax function should accept any iterable.
Note
This function should not copy every item in the supplied iterable into a new list. Process the items one by one so that your function won’t have any memory concerns with extremely long/large iterables.
Random Number¶
Make an inexhaustible iterator object RandomNumberGenerator that returns random integers between two numbers (inclusive).
Example:
>>> number_generator = RandomNumberGenerator(4, 8)
>>> next(number_generator)
4
>>> next(number_generator)
7
>>> next(number_generator)
8
>>> iter(number_generator) is number_generator
True
Dictionary Changes¶
Create an empty dictionary
Get an iterator for the dictionary
Add an item to the dictionary
Try to get the next item out of the iterator
What happened?
List Changes¶
Create a list with two items
Get an iterator from the list
Get the next item from the iterator
Insert an item at the beginning of the list
Get the next item from the iterator
What happened?