Multiprocessing Ignores "__setstate__"
Solution 1:
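The multiprocessing module can start a process in one of three ways: spawn, fork, or forkserver. By default on Linux, it forks (macOS has defaulted to spawn since Python 3.8, and Python 3.14 changes the default on other POSIX systems to forkserver). With fork, the child inherits a copy of the parent's memory, so there's no need to pickle anything that's already loaded into RAM at the moment the new process is born, and __getstate__ and __setstate__ are never called.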
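To see which start method your own interpreter uses, something like the following should work. It is only a small sketch: describe_start_methods is a helper name made up for this example, while get_start_method and get_all_start_methods are the standard multiprocessing calls.

```python
import multiprocessing


def describe_start_methods():
    """Return the default start method and the ones this platform supports.

    Hypothetical helper, named only for this illustration.
    """
    default = multiprocessing.get_start_method()
    available = multiprocessing.get_all_start_methods()
    return default, available


if __name__ == '__main__':
    default, available = describe_start_methods()
    print('default start method:', default)
    print('available start methods:', available)
```

If the default printed here is fork, arguments passed to Process are inherited rather than pickled.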
If you need more direct control over how the new process is started, change the start method to spawn. To do this, create a context
ctx = multiprocessing.get_context('spawn')
and replace all calls to multiprocessing.foo() with calls to ctx.foo(). When you do this, every new process is born as a fresh Python instance; everything that gets sent into it is sent via pickle instead of being inherited from the parent's memory.
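As a rough sketch of the context approach, assuming a class with custom pickling hooks like the one in the question: the names Tricky, report and run_with are illustrative only, and the 30-second timeout is an arbitrary safety margin so a failed child cannot hang the parent.

```python
import multiprocessing
import pickle


class Tricky:
    def __init__(self, x):
        self.data = x

    def __getstate__(self):
        # Runs in the parent when the object is pickled.
        return self.data

    def __setstate__(self, state):
        # Runs wherever the object is unpickled (the child, under spawn).
        self.data = 10


def report(obj, q):
    q.put(obj.data)


def run_with(method):
    """Start report() with the given start method; return what the child saw."""
    ctx = multiprocessing.get_context(method)
    q = ctx.Queue()                      # ctx.Queue instead of multiprocessing.Queue
    p = ctx.Process(target=report, args=(Tricky(5), q))
    p.start()
    result = q.get(timeout=30)           # read before join so the queue can drain
    p.join()
    return result


if __name__ == '__main__':
    # Under spawn the argument travels through pickle, so __setstate__ runs.
    print('child saw:', run_with('spawn'))
    # A plain pickle round trip in this process, for comparison.
    print('round trip:', pickle.loads(pickle.dumps(Tricky(5))).data)
```

Because the context is a local object, this leaves the global default start method untouched, which seems the safer choice when other code in the same program relies on fork.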
Solution 2:
Reminder: when you're using multiprocessing, you need to start the process inside an if __name__ == '__main__': clause (see the programming guidelines in the multiprocessing documentation):
import pickle
import multiprocessing


class Tricky:
    def __init__(self, x):
        self.data = x

    def __setstate__(self, d):
        # Called when the object is unpickled.
        print('setstate happening')
        self.data = 10

    def __getstate__(self):
        # Called when the object is pickled; print before returning,
        # otherwise the message is unreachable.
        print('getstate happening')
        return self.data


def report(ar, q):
    q.put(ar.data)


if __name__ == '__main__':
    ar = Tricky(5)
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=report, args=(ar, q))
    print('now starting process')
    p.start()
    print('now joining process')
    p.join()
    print('now getting results from queue')
    print(q.get())
    print('now getting pickle dumps')
    print(pickle.loads(pickle.dumps(ar)).data)
On Windows, I see:
now starting process
now joining process
setstate happening
now getting results from queue
10
now getting pickle dumps
setstate happening
10
On Ubuntu, I see:
now starting process
now joining process
now getting results from queue
5
now getting pickle dumps
getstate happening
setstate happening
10
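I suppose this should answer your question. multiprocessing invokes the __setstate__ method on Windows but not on Linux: Windows spawns the child, so the arguments are pickled and then unpickled there, while Linux forks it, so the child simply inherits ar with data still equal to 5. And on Linux, when you round-trip the object with pickle.loads(pickle.dumps(ar)), __getstate__ is called first (by dumps), then __setstate__ (by loads). It's interesting to see how the multiprocessing module behaves differently on different platforms.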
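The order of the two hooks can be checked without any processes at all. This is just a small sketch; the calls list and the variable names are made up for the illustration.

```python
import pickle

calls = []  # records the order in which the pickling hooks fire


class Tricky:
    def __init__(self, x):
        self.data = x

    def __getstate__(self):
        calls.append('getstate')
        return self.data

    def __setstate__(self, state):
        calls.append('setstate')
        self.data = 10


payload = pickle.dumps(Tricky(5))   # serializing calls __getstate__
after_dumps = list(calls)           # snapshot before unpickling
clone = pickle.loads(payload)       # deserializing calls __setstate__

print(after_dumps)   # hooks fired by dumps alone
print(calls)         # hooks fired after the full round trip
print(clone.data)
```

Anything that moves the object through pickle, including a spawned process, goes through the same two steps; a forked process skips both.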