Understanding Python Generators

Posted by Shaifali's Blog on June 20, 2017

A few days back I gave a talk at the BangPypers meetup (the Bengaluru Python group) about generators. The focus was to discuss what generators are, how they work in Python, and why they work the way they do. This was the first Python meetup I had ever attended in Bengaluru, so I was more excited than nervous about speaking. :P

Here I am sharing the first half of the content I presented in the talk; keep reading if you want to understand what generators are. The slides are available here.

Warm-Up Exercise

I started with a warm-up exercise, where I wrote some easy-peasy code. Here is the simple function mycount:

def mycount():
    n = 0
    numbers = [] # 1 get rid of
    while n < 10:
        numbers.append(n) # 3 yield
        n += 1
    return numbers # 2 rid of it

for num in mycount():
    print("Got number", num)

# OUTPUT

The mycount function simply returns a list of the numbers from 0 to 9. I then called that function and printed each number in the list it returned. So here we have a function that returns a list and then something (the for loop) that consumes that list; the function returns something we can loop over. Pretty simple so far?

Now I will tweak the mycount function by adding one more print statement inside the loop.

def mycount():
    n = 0
    numbers = [] # 1 get rid of
    while n < 10:
        # print statement added
        print("In loop", n)
        numbers.append(n) # 3 yield
        n += 1
    return numbers # 2 rid of it

for num in mycount():
    print("Got number", num)

# OUTPUT

It prints all the In loop lines first and then all the Got number lines. What's happening here is that we build a list inside the mycount function, and then immediately after we have built it, we iterate over that list and throw it away. We never look at it again. Whenever you see this kind of scenario in your code, where a function returns a list, you can turn that function into a generator, as long as no other code relies on the result being a list.

Writing generator function

To make the mycount function above return a generator instead of a list, three steps are needed:
* get rid of the empty list assignment, at line 3 in the code above.
* get rid of the return statement, at line 9 in the code above.
* change the append into a yield, at line 7 in the code above.

yield is a special keyword in Python that basically means produce/generate. It always goes along with generators: wherever there is a yield there is a generator involved, and wherever there is a generator there is probably a yield somewhere behind it.
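A quick way to check that yield is what makes the difference is to ask Python directly. The sketch below uses the standard-library inspect module; the function names plain and gen are made up for illustration.

```python
import inspect

def plain():
    # Ordinary function: the body runs as soon as it is called
    return [0, 1, 2]

def gen():
    # The presence of yield makes this a generator function
    yield 0
    yield 1
    yield 2

is_plain_gen = inspect.isgeneratorfunction(plain)
is_gen_gen = inspect.isgeneratorfunction(gen)

# Calling a generator function does not run its body;
# it only builds a generator object.
g = gen()
is_generator_object = inspect.isgenerator(g)

# Consuming the generator object finally runs the body
values = list(g)
```

Here is_plain_gen is False, is_gen_gen is True, and values ends up as [0, 1, 2].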

Generator’s code

def mycount():
    n = 0
    while n < 10:
        print("In loop", n)
        yield n
        n += 1

for num in mycount():
    print("Got number", num)

# OUTPUT

Notice the output: “In loop 0, Got number 0, In loop 1, Got number 1…” and so on. What happens this time is that calling mycount() only creates a generator object; the function body starts executing when the for loop asks for the first value. Once a yield is encountered, the function pauses and hands the value after yield (if any) to the caller. When the loop asks for the next value, the function resumes right after the yield, runs until it yields again, and pauses again. This continues until the function returns.
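To watch the pausing and resuming step by step, we can drive the generator by hand with next() instead of a for loop. This is only a sketch: mycount_logged and the events list are hypothetical names, and the list records what happens instead of printing it.

```python
events = []  # records what happens, in order

def mycount_logged(limit=3):
    n = 0
    while n < limit:
        events.append(("in loop", n))
        yield n          # pause here and hand n to the caller
        n += 1           # execution resumes from this line

gen = mycount_logged()

# Nothing has run yet: the body starts only on the first next()
ran_before_first_next = len(events) > 0

first = next(gen)        # runs up to the first yield
events.append(("got", first))

second = next(gen)       # resumes, goes around the loop, pauses again
events.append(("got", second))
```

After this, events holds ("in loop", 0), ("got", 0), ("in loop", 1), ("got", 1), which is the same interleaving the for loop produced.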

This is a little complex, so to understand it let's start with iterables.

Iterables

You know that we can loop over many, many things in Python. For example, we can loop over a list:

l = [1,2,3]
for i in l:
    print(i)

It will iterate one element at a time. Similarly, we can loop over a string, a dict, or a set. We are able to loop over them because all of them are iterables. Not only these: open file objects (including the file-like objects that wrap open sockets) are iterables as well.

So what's so special about iterables?

It's just that when we pass them to the standard built-in iter() function, they return iterators. An iterator is the thing that actually drives a loop. So just keep in mind that an iterable is something that returns an iterator, and iterators are very powerful: they let us step through the values one by one.
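A small sketch of iter() in action, using a made-up list called numbers plus a string and a dict:

```python
numbers = [1, 2, 3]

it = iter(numbers)           # ask the iterable for an iterator
a = next(it)                 # first value
b = next(it)                 # second value

# Each call to iter() on a list gives a fresh, independent iterator
other = iter(numbers)
c = next(other)              # starts from the beginning again

# An iterator is itself iterable: iter() on it returns the same object
same = iter(it) is it

# Strings and dicts work the same way (a dict iterates over its keys)
first_char = next(iter("abc"))
first_key = next(iter({"x": 10}))
```

Here a, b and c are 1, 2 and 1, same is True, first_char is 'a' and first_key is 'x'.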

We loop by means of an iterator. It's a value factory: every time we ask it for “the next” value (in the next iteration), it knows how to compute it, and that's what it gives back to us. Every iterator has a __next__() method (spelled next() in Python 2), which the built-in next() function calls; it returns the next value of the iterator, and if there are no more values it raises a StopIteration exception. And that's how the for loop works in Python (under the hood).
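Roughly speaking, a for loop can be spelled out by hand with iter(), next() and StopIteration. The helper manual_for below is a hypothetical name, and this is a simplified picture of what the interpreter does, not its actual implementation.

```python
def manual_for(iterable):
    """Collect items the way a for loop would visit them."""
    collected = []
    iterator = iter(iterable)        # step 1: get an iterator
    while True:
        try:
            item = next(iterator)    # step 2: ask for the next value
        except StopIteration:        # step 3: no more values, leave the loop
            break
        collected.append(item)       # this stands in for the loop body
    return collected

result = manual_for([1, 2, 3])
```

result is [1, 2, 3], the same items in the same order that `for item in [1, 2, 3]` would visit.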

Generators are Iterators

But why were we talking about iterators? We are supposed to be talking about generators! Well, because generators are a special type of iterator. They simplify the creation of iterators. And since a generator is an iterator, it can be looped over. That is why, in our warm-up exercise, we were able to iterate over a generator object.
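To see what "simplify" means here, compare a hand-written iterator class with the equivalent generator function. Both CountUpTo and count_up_to are illustrative names, not something from the talk.

```python
from collections.abc import Iterator

class CountUpTo:
    """Hand-written iterator: we must supply __iter__ and __next__."""
    def __init__(self, limit):
        self.limit = limit
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.n >= self.limit:
            raise StopIteration
        value = self.n
        self.n += 1
        return value

def count_up_to(limit):
    """Generator version: Python supplies __iter__ and __next__ for us."""
    n = 0
    while n < limit:
        yield n
        n += 1

from_class = list(CountUpTo(3))
from_generator = list(count_up_to(3))
generator_is_iterator = isinstance(count_up_to(3), Iterator)
```

Both versions produce [0, 1, 2], and generator_is_iterator is True, but the generator needs far less code.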

yield is the keyword used to create a generator function; any function that has the yield keyword inside it is a generator function. Generators are also known as lazy factories, because they don't hold all their values in memory: whenever we ask for a value, they compute it and simply give it back to the calling environment. They don't keep values computed in advance.
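Laziness is easiest to appreciate with a sequence that could never fit in a list. The sketch below assumes a made-up endless generator called naturals and uses itertools.islice from the standard library to take only a few values from it.

```python
from itertools import islice

def naturals():
    """Endless generator: impossible to build as a list up front."""
    n = 0
    while True:
        yield n
        n += 1

# Only the five requested values are ever computed
first_five = list(islice(naturals(), 5))

# Values are produced on demand and not stored by the generator
gen = naturals()
a = next(gen)
b = next(gen)   # the earlier value is gone unless the caller kept it
```

first_five is [0, 1, 2, 3, 4], and a and b are 0 and 1. The while True loop never runs away, because the body only advances when someone asks for the next value.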

So our function from the warm-up exercise, mycount(), is now a generator function that returns a generator object (a lazy factory). But that's not the only way of creating a generator object. We can also create a generator with a generator expression. It's very similar to a list comprehension, but to create a generator object we simply use round brackets instead of square brackets.
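Side by side, the two forms look like this (the variable names are just for illustration):

```python
squares_list = [n * n for n in range(5)]   # list comprehension: built immediately
squares_gen = (n * n for n in range(5))    # generator expression: nothing computed yet

list_type = type(squares_list).__name__
gen_type = type(squares_gen).__name__

# Values come out one at a time, on demand
first = next(squares_gen)
rest = list(squares_gen)

# Once exhausted, a generator has nothing more to give
leftover = list(squares_gen)

# When a generator expression is the only argument to a function,
# the extra pair of round brackets can be dropped
total = sum(n * n for n in range(5))
```

list_type is 'list' and gen_type is 'generator'; first is 0, rest is [1, 4, 9, 16], leftover is an empty list, and total is 30.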