Multiprocessing A For Loop In Python
I have a program that currently takes a very long time to run since it processes a large number of files. I was hoping to be able to run the pr…
Solution 1:
I suppose the quickest / simplest way to get there is to use a multiprocessing pool and let it map over an iterable (of your files). A minimal example with a fixed number of workers and a little extra output to observe the behavior would be:
import datetime
import time
from multiprocessing import Pool


def long_running_task(filename):
    # stand-in for the real per-file work
    time.sleep(1)
    print(f"{datetime.datetime.now()} finished: {filename}")


if __name__ == "__main__":  # guard needed where the "spawn" start method is used (e.g. Windows, macOS)
    filenames = range(15)
    with Pool(10) as mp_pool:
        mp_pool.map(long_running_task, filenames)
This creates a pool of 10 workers and calls long_running_task with each item from filenames (here just the ints 0..14 as a stand-in). Items are handed out as tasks finish and workers become available.
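If each task also computes a value you want back, Pool.map collects the return values in the same order as the inputs. A minimal sketch of that, where count_lines and the file names are hypothetical stand-ins for your own per-file work:

from multiprocessing import Pool


def count_lines(filename):
    # hypothetical per-file work that returns a value for each input
    with open(filename) as fh:
        return sum(1 for _ in fh)


if __name__ == "__main__":
    files = ["a.txt", "b.txt", "c.txt"]  # placeholder paths
    with Pool(10) as mp_pool:
        line_counts = mp_pool.map(count_lines, files)  # results come back in input order
    print(dict(zip(files, line_counts)))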
Alternatively, if you wanted to iterate over the inputs yourself, you could do something like:
with Pool(10) as mp_pool:
    for fn in range(15):
        mp_pool.apply_async(long_running_task, (fn,))
    mp_pool.close()  # stop accepting new tasks
    mp_pool.join()   # wait for the submitted tasks to finish
This passes fn as the first positional argument to each long_running_task call. Once all the work has been submitted, we close the pool so it stops accepting new requests and join it to wait for any outstanding jobs to finish.
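One detail worth knowing about apply_async, beyond what the snippet above shows: a task's return value, and any exception it raised in the worker, only surface when you call .get() on the AsyncResult it returns, so it is often worth keeping those handles around. A sketch reusing long_running_task from the example above:

with Pool(10) as mp_pool:
    # keep the AsyncResult handles so results/exceptions can be retrieved later
    results = [mp_pool.apply_async(long_running_task, (fn,)) for fn in range(15)]
    mp_pool.close()
    mp_pool.join()
    for res in results:
        res.get()  # re-raises here if the worker raised an exception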