Multiprocessing A For Loop In Python
I have a program that currently takes a very long time to run since it processes a large number of files. I was hoping to be able to run the pr
Solution 1:
I suppose the quickest / simplest way to get there is to use a multiprocessing pool and let it run over an iterable (of your files)... A minimal example with a fixed number of workers and a little extra output to observe the behavior would be:
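import datetime
import time
from multiprocessing import Pool

def long_running_task(filename):
    # Stand-in for the real per-file work.
    time.sleep(1)
    print(f"{datetime.datetime.now()} finished: {filename}")

if __name__ == "__main__":
    # The guard is needed on platforms that spawn workers (e.g. Windows).
    filenames = range(15)
    with Pool(10) as mp_pool:
        mp_pool.map(long_running_task, filenames)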
This creates a pool of 10 worker processes and calls long_running_task once for each item in filenames (here just the integers 0..14 as a stand-in). Each worker picks up more work as soon as it finishes its current task, and map only returns once every item has been processed.
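If the task returns something instead of printing it, map hands the return values back as a list in the same order as the inputs. A rough sketch of that, where describe and run_with_map are hypothetical names made up for illustration and not part of the original answer:

```python
from multiprocessing import Pool


def describe(filename):
    # Hypothetical stand-in for real per-file work:
    # it returns a value instead of printing.
    return f"processed {filename}"


def run_with_map(func, items, workers=4):
    # Pool.map blocks until every item has been processed and
    # returns the results in the same order as the inputs.
    with Pool(workers) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on
    # platforms that start workers by importing the main module.
    print(run_with_map(describe, ["a.txt", "b.txt", "c.txt"]))
```

The worker function has to be defined at module level so it can be pickled and sent to the worker processes.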
Alternatively, if you wanted to iterate over the inputs yourself, you could do something like:
with Pool(10) as mp_pool:
    for fn in range(15):
        mp_pool.apply_async(long_running_task, (fn,))
    mp_pool.close()
    mp_pool.join()
This passes fn as the first positional argument to each long_running_task call. Once all the work has been submitted, we need to close the pool so it stops accepting new requests, and then join it to wait for any outstanding jobs to finish.
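One thing to watch for with apply_async: each call returns an AsyncResult handle, and an exception raised inside a worker only shows up when you call get() on that handle. If you want the return values, or want failures to be visible, something along these lines should work. The names square and run_with_apply_async are hypothetical and only here for illustration:

```python
from multiprocessing import Pool


def square(n):
    # Hypothetical stand-in for the real task.
    return n * n


def run_with_apply_async(func, items, workers=4):
    with Pool(workers) as pool:
        # Each apply_async call returns an AsyncResult handle immediately.
        pending = [pool.apply_async(func, (item,)) for item in items]
        # get() blocks until that task is done and re-raises any exception
        # the worker hit, so failures are not silently dropped.
        return [handle.get() for handle in pending]


if __name__ == "__main__":
    print(run_with_apply_async(square, range(5)))
```

Because every get() call waits for its own task, there is no need for a separate close() and join() here: by the time the list is built, all submitted jobs have finished.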