Dask Distributed Getting Futures After Client Closed
Is there any way to prevent dask/distributed from cancelling queued and executing futures when the client is closed? I want to use a Jupyter notebook to kick off some very long-running jobs and retrieve the results later.
Solution 1:
You can use the "publish" mechanism to keep references to some data around in the scheduler for later retrieval in another client. Two forms exist which do the same thing with different syntax:
>>> client.publish_dataset(mydata=f)
Here f is a future, a list of futures, or a dask collection (a dataframe, etc.).
In another session:
>>> client.list_datasets()
['mydata']
>>> client.get_dataset('mydata')
<same thing as f>
The alternative, and perhaps simpler, syntax looks like:
>>> client.datasets['mydata'] = f
>>> list(client.datasets)
['mydata']
>>> client.datasets['mydata']
<same thing as f>
To remove the reference and allow the data to be cleared from the cluster (if no other client needs it), use client.unpublish_dataset('mydata') or del client.datasets['mydata'].