Is There A Way To Run CPython On A Different Thread Without Risking A Crash?
Solution 1:
Urllib uses CPython deep down in the socket module, so the threads that are being created just add up and do nothing, because Python's GIL prevents two CPython commands from being executed in different threads at the same time.
Wrong, though it is a common misconception. CPython can and does release the GIL for I/O operations (look at all the Py_BEGIN_ALLOW_THREADS calls in socketmodule.c). While one thread waits for I/O to complete, other threads can do some work. If urllib calls are the bottleneck in your script, then threads may be one of the acceptable solutions.
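A minimal sketch (in modern Python 3 syntax, not taken from the answer above) of why this matters: a blocking wait releases the GIL, so ten threads that each block for 0.2 seconds finish together in roughly 0.2 seconds rather than 2 seconds. Here time.sleep stands in for a blocking socket read; both release the GIL while waiting.

```python
import time
from threading import Thread

def wait_a_bit():
    time.sleep(0.2)  # the GIL is released for the duration of the wait

threads = [Thread(target=wait_a_bit) for _ in range(10)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# Ten 0.2 s waits overlap instead of running back to back,
# so the total wall-clock time stays close to a single wait.
print(elapsed < 1.0)
```

If the waits could not overlap (as with pure CPU-bound Python code under the GIL), the elapsed time would be close to the sum of the individual waits instead.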
I am running Windows XP with Python 2.5, so I can't use the multiprocess module.
You could install Python 2.6 or newer, or, if you must stay on Python 2.5, you could install the multiprocessing backport separately.
I created my own custom timeit-like module, and the above takes around 0.5-2 seconds, which is horrible for what my program does.
The performance of urllib2.urlopen('http://example.com...').read() depends mostly on outside factors such as DNS, network latency/bandwidth, and the performance of the example.com server itself.
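Rather than a custom timeit-like module, the standard timeit module can time a callable directly. In this sketch, fetch() is a hypothetical placeholder for the urlopen call, since the real cost would be network-bound and outside the code's control:

```python
import timeit

def fetch():
    # placeholder for urllib2.urlopen('http://example.com').read();
    # any callable can be timed the same way
    sum(range(1000))

# run fetch() 100 times and report the total elapsed time in seconds
elapsed = timeit.timeit(fetch, number=100)
print(elapsed)
```

Measuring the call this way separates your script's overhead from the external factors listed above.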
Here's an example script which uses both threading and urllib2:
import urllib2
from Queue import Queue
from threading import Thread

def check(queue):
    """Check /n url."""
    opener = urllib2.build_opener()  # if you use install_opener in other threads
    for n in iter(queue.get, None):
        try:
            data = opener.open('http://localhost:8888/%d' % (n,)).read()
        except IOError, e:
            print("error /%d reason %s" % (n, e))
        else:
            "check data here"

def main():
    nurls, nthreads = 10000, 10

    # spawn threads
    queue = Queue()
    threads = [Thread(target=check, args=(queue,)) for _ in xrange(nthreads)]
    for t in threads:
        t.daemon = True  # die if program exits
        t.start()

    # provide some work
    for n in xrange(nurls):
        queue.put_nowait(n)

    # signal the end
    for _ in threads:
        queue.put(None)

    # wait for completion
    for t in threads:
        t.join()

if __name__ == "__main__":
    main()
To convert it to a multiprocessing script just use different imports and your program will use multiple processes:
from multiprocessing import Queue
from multiprocessing import Process as Thread
# the rest of the script is the same
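For readers on modern Python, the same queue-with-sentinel worker pattern looks like this (a self-contained sketch, not the answer's code: Queue moved to the queue module, and the per-item work is simplified to squaring a number so it runs without a local web server):

```python
from queue import Queue
from threading import Thread

def worker(queue, results):
    # iter(queue.get, None) keeps pulling items until the None sentinel
    for n in iter(queue.get, None):
        results.append(n * n)  # stand-in for opener.open(...).read()

def main():
    nitems, nthreads = 100, 4
    queue, results = Queue(), []
    threads = [Thread(target=worker, args=(queue, results)) for _ in range(nthreads)]
    for t in threads:
        t.daemon = True  # die if program exits
        t.start()
    for n in range(nitems):  # provide some work
        queue.put_nowait(n)
    for _ in threads:        # one sentinel per worker signals the end
        queue.put(None)
    for t in threads:        # wait for completion
        t.join()
    return sorted(results)

if __name__ == "__main__":
    print(main() == [n * n for n in range(100)])
```

The sentinel-per-worker trick is what lets every thread's iter(queue.get, None) loop terminate cleanly before join() is called.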
Solution 2:
If you want multithreading, Jython could be an option, as it doesn't have a GIL.
I concur with @Jan-Philip and @Piotr. What are you using urllib for?