You use generators by iterating over them, either explicitly with 'for' or implicitly by passing them to any function or construct that iterates. You can think of generators as returning multiple items, as if they returned a list, but instead of returning them all at once they return them one by one, and the generator function is paused until the next item is requested.

Generators are good for calculating large sets of results (in particular calculations involving loops themselves) where you don't know if you are going to need all results, or where you don't want to allocate the memory for all results at the same time. They also suit situations where the generator uses another generator, or consumes some other resource, and it's more convenient if that happens as late as possible.

Another use for generators (that is really the same) is to replace callbacks with iteration. In some situations you want a function to do a lot of work and occasionally report back to the caller. Traditionally you'd use a callback function for this: you pass the callback to the work-function, and it periodically calls it. With the generator approach, the work-function (now a generator) knows nothing about the callback and merely yields whenever it wants to report something. The caller, instead of writing a separate callback and passing it to the work-function, does all the reporting work in a little 'for' loop around the generator.

For example, say you wrote a 'filesystem search' program. You could perform the search in its entirety, collect the results and then display them one at a time. All of the results would have to be collected before you showed the first, and all of them would be in memory at the same time. Or you could display the results while you find them, which would be more memory efficient and much friendlier towards the user. The latter could be done by passing the result-printing function to the filesystem-search function, or by making the search function a generator and iterating over the result.

If you want to see an example of the latter two approaches, see os.path.walk() (the old filesystem-walking function with callback, removed in Python 3) and os.walk() (the new filesystem-walking generator). Of course, if you really wanted to collect all results in a list, the generator approach is trivial to convert to the big-list approach:

    big_list = list(the_generator)

Let's say you have 100 million domains in your MySQL table, and you would like to update the Alexa rank for each domain.

The first thing you need is to select your domain names from the database. Let's say your table name is domains and the column name is domain. If you use SELECT domain FROM domains, it's going to return 100 million rows, which is going to consume a lot of memory. So you decide to run the program in batches. In the first batch you query the first 1000 rows, check the Alexa rank for each domain and update the database row. In the second batch you work on the next 1000 rows, in the third batch on rows 2001 to 3000, and so on.

Now we need a generator function which generates our batches. Here is the signature of our generator function:

    def ResultGenerator(cursor, batchsize=1000):

The function keeps yielding the results. If it used the keyword return instead of yield, the whole function would end once it reached return. If a function uses the keyword yield, then it's a generator. Now you can run the query and iterate over ResultGenerator(cursor):

    db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="domains")
    cursor = db.cursor()
    cursor.execute("SELECT domain FROM domains")

Someone who doesn't know generators may not know about yield either, so here is the explanation that cleared up my own doubts.

The return statement is where all the local variables are destroyed and the resulting value is given back (returned) to the caller. Should the same function be called some time later, it will get a fresh new set of variables.

But what if the local variables aren't thrown away when we exit a function? This implies that we can resume the function where we left off. This is where the concept of generators comes in: after a yield statement, the function resumes where it left off. That's the difference between the return and yield statements in Python.

The yield statement is what makes a function a generator function. So generators are a simple and powerful tool for creating iterators. They are written like regular functions, but they use the yield statement whenever they want to return data.
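To make the batched database example concrete, here is a minimal sketch of what a body for ResultGenerator could look like. The post only preserves the function's signature, so the body below is an assumption: it relies on the DB-API `fetchmany()` method to pull rows in batches and yields them one at a time. An in-memory SQLite table with made-up domain names stands in for the MySQL table so the example runs without a server.

```python
import sqlite3


def ResultGenerator(cursor, batchsize=1000):
    """Yield rows one at a time while fetching them from the cursor in batches.

    Assumed body: the original post only shows the signature.
    """
    while True:
        # fetchmany() is part of the DB-API and returns at most `batchsize` rows.
        results = cursor.fetchmany(batchsize)
        if not results:
            break  # no rows left, so the generator finishes
        for result in results:
            yield result  # pause here until the caller asks for the next row


# Stand-in data (hypothetical): an in-memory SQLite table instead of MySQL.
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE domains (domain TEXT)")
cursor.executemany(
    "INSERT INTO domains (domain) VALUES (?)",
    [("example%d.com" % i,) for i in range(2500)],
)

cursor.execute("SELECT domain FROM domains")
seen = []
for (domain,) in ResultGenerator(cursor, batchsize=1000):
    seen.append(domain)  # a real program would look up and store the rank here

db.close()
print(len(seen))  # prints 2500
```

The caller sees a plain 'for' loop over single rows, while the batching stays hidden inside the generator. Whether this actually bounds memory use depends on the database driver: some drivers buffer the whole result set on the client unless a server-side cursor is used.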