
Python's Multiprocessing: Speed Up A For-loop For Several Sets Of Parameters, "apply" Vs. "apply_async"

I would like to integrate a system of differential equations using a lot of different parameter combinations and store the variables’ final values that belong to a certain set of

Solution 1:

The fact that your apply_async version is 289 times faster than the for loop is a little suspicious! And right now, you're guaranteed to get the results in the order they were submitted, even if that isn't what you want for maximum parallelism.

apply_async only starts a task; it doesn't wait until the task has completed. Calling .get() on the returned object does that. So this:

tic = time.time()    
resultsAsync = [pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]})) for combi in range(numComb)]    
toc = time.time()

isn't really a fair measurement; you've started all the tasks, but they haven't necessarily completed yet.
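The gap between submitting and finishing can be seen in a small sketch. It is written for Python 3 and makes two assumptions that are not in the original code: slow_task is a hypothetical stand-in for runMyODE that just sleeps, and multiprocessing.pool.ThreadPool is used in place of Pool (it exposes the same apply_async/.get() interface) so that the snippet runs anywhere without pickling concerns.

```python
import time
from multiprocessing.pool import ThreadPool

def slow_task(x):
    """Hypothetical stand-in for runMyODE: sleeps briefly, then returns."""
    time.sleep(0.2)
    return x * x

def time_submission_and_completion(n_tasks=4):
    with ThreadPool(processes=n_tasks) as pool:
        tic = time.time()
        pending = [pool.apply_async(slow_task, args=(i,)) for i in range(n_tasks)]
        submit_time = time.time() - tic        # only the cost of queueing the tasks
        values = [p.get() for p in pending]    # blocks until each task has finished
        total_time = time.time() - tic         # includes the actual work
    return submit_time, total_time, values

submit_time, total_time, values = time_submission_and_completion()
print("submission only:", submit_time)
print("including get():", total_time)
```

The first number only measures how long it takes to queue the work, which is what the timing in the question captured.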

On the other hand, once you .get() the results, you know that the task has completed and that you have the answer; so doing this

for sol in range(numComb):
    print resultsAsync[sol].get()[2,-1] #print final value of z

means that you are sure to have the results in order (because you're going through the ApplyResult objects in order and calling .get() on each one); but you might want the results as soon as they're ready, rather than blocking on the tasks one at a time. That means you'd need to label the results with their parameters one way or another.
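One way to do that labelling, sketched here with imap_unordered (which the original code does not use), is to have each task return its parameter together with its result. This is a Python 3 sketch under assumptions: labelled_task is a hypothetical stand-in for runMyODE, and ThreadPool replaces Pool only to keep the example self-contained.

```python
import time
from multiprocessing.pool import ThreadPool

def labelled_task(param):
    """Hypothetical stand-in for runMyODE: returns (param, result)."""
    time.sleep(0.01 * param)
    return param, param ** 2

params = [3, 1, 2]
with ThreadPool(processes=3) as pool:
    # Yields each (param, result) pair as soon as its task has finished,
    # in completion order rather than submission order.
    finished = list(pool.imap_unordered(labelled_task, params))

# The label makes the arrival order irrelevant.
by_param = dict(finished)
print(by_param)
```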

You can use callbacks to save the results once the tasks are done, and return the parameters along with the results, to allow completely asynchronous returns:

def runMyODE(yn,tvec,allpara):
    return allpara['para'],transpose(odeint(myODE, yn, tvec, args=(allpara,)))

asyncResults = []

def saveResult(result):
    asyncResults.append((result[0], result[1][2,-1]))

tic = time.time()
for combi in range(numComb):
    pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]}), callback=saveResult)
pool.close()
pool.join()
toc = time.time()

print 'Using apply_async took ', toc-tic, 'seconds!'
for res in asyncResults:
    print res[0], res[1]

Gives you a more reasonable time; the results are still almost always in order because the tasks take very similar amounts of time:

Using apply took  0.0847041606903 seconds!
[ 6.02763376  5.44883183] 41.7597176061
[ 4.37587211  8.91773001] 48.0603437545
[ 7.91725038  5.2889492 ] 38.7413413879
[ 0.71036058  0.871293  ] 25.6022231983
[ 7.78156751  8.70012148] 46.4843604574
[ 4.61479362  7.80529176] 46.3495273394
[ 1.43353287  9.44668917] 50.9073202011
[ 2.64555612  7.74233689] 48.2603508573
[ 0.187898    6.17635497] 50.0502618731
[ 9.43748079  6.81820299] 41.7948313502
Using apply_async took  0.0259671211243 seconds!
[ 4.37587211  8.91773001] 48.0603437545
[ 0.71036058  0.871293  ] 25.6022231983
[ 6.02763376  5.44883183] 41.7597176061
[ 7.91725038  5.2889492 ] 38.7413413879
[ 7.78156751  8.70012148] 46.4843604574
[ 4.61479362  7.80529176] 46.3495273394
[ 1.43353287  9.44668917] 50.9073202011
[ 2.64555612  7.74233689] 48.2603508573
[ 0.187898    6.17635497] 50.0502618731
[ 9.43748079  6.81820299] 41.7948313502
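The callback pattern can also be tried without the ODE solver. The following is a minimal Python 3 sketch, not the original program: fake_solve is a hypothetical stand-in for runMyODE, and ThreadPool is swapped in for Pool (same apply_async, close and join methods) so that it runs as a plain script.

```python
import time
from multiprocessing.pool import ThreadPool

def fake_solve(param):
    """Hypothetical stand-in for runMyODE: returns the parameter with its result."""
    time.sleep(0.05)
    return param, param * 10

collected = []

def save_result(result):
    # Called in the submitting process (on a helper thread) as each task finishes.
    param, value = result
    collected.append((param, value))

pool = ThreadPool(processes=3)
for p in [1, 2, 3, 4, 5]:
    pool.apply_async(fake_solve, args=(p,), callback=save_result)
pool.close()   # no more tasks will be submitted
pool.join()    # wait until every task and its callback has run

# Arrival order may vary, so sort by the parameter label before printing.
for param, value in sorted(collected):
    print(param, value)
```

Because each entry carries its parameter, nothing depends on the order in which the tasks happen to finish.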

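Note that rather than looping over apply_async, you could also use map_async. The worker then has to be a named function defined at module level before the pool is created (a lambda cannot be pickled and sent to the worker processes), and the callback receives the complete list of results in a single call rather than one result at a time. Assuming INIT, tval and PARA are globals that the workers can see:

def runCombi(combi):  # module-level wrapper, so it can be pickled
    return runMyODE(INIT[combi,:], tval, {'para': PARA[combi,:]})

pool.map_async(runCombi, range(numComb), callback=lambda res: asyncResults.extend((p, s[2,-1]) for p, s in res))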
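Two details of map_async are easy to trip over, and a small Python 3 sketch can check both. The names here (square, callback_args) are hypothetical, and ThreadPool stands in for Pool only to show how the callback is invoked; the pickling check uses the pickle module directly, since pickling is what a process-based Pool has to do with the function it is given.

```python
import pickle
from multiprocessing.pool import ThreadPool

def square(x):
    return x * x

# A process-based Pool must pickle the function it sends to its workers.
# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x * x)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False

callback_args = []
with ThreadPool(processes=2) as pool:
    async_result = pool.map_async(square, range(5), callback=callback_args.append)
    async_result.get()   # block until the whole map has finished

print("lambda picklable:", lambda_picklable)
print("callback was called", len(callback_args), "time(s) with", callback_args)
```

The callback runs once, with the full list of results in input order, so a per-result callback such as saveResult would need to loop over that list.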
