Everyone has heard about threads, but how do you get started with threads in Python? Today I hope I can explain it. First of all, threads are basically execution flows. Each of these execution flows runs independently of the others, so if you want to do a lot of disk IO (or any other time-consuming task that mostly waits) and some computing at the same time, threads are for you.
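Concerning Python, there's one thing you have to keep in mind. In CPython, the standard interpreter, threads are real operating system threads, but a global interpreter lock (the GIL) lets only one of them execute Python code at any moment. So when you have 3 threads, the interpreter switches between them in a scheduled manner, and pure-Python computation does not get faster by adding threads. A thread that is waiting, for example on disk IO or inside time.sleep(), releases the lock, so the other threads can run in the meantime. So, Python threads behave like normal threads, with that one restriction.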
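Because only one thread runs Python code at any moment, a rough way to see the effect is to time the same pure-computation job run one after another and then in threads. This is only a sketch, assuming a standard CPython build; the names count_down, time_sequential and time_threaded are made up for illustration, and it uses Thread's target argument instead of a subclass to stay short.

```python
import threading
import time


def count_down(n):
    """Pure-Python busy loop: CPU-bound, no IO and no sleeping."""
    while n > 0:
        n -= 1


def time_sequential(n, repeats):
    """Run count_down(n) `repeats` times, one after another."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats):
    """Run count_down(n) in `repeats` threads started together."""
    workers = [threading.Thread(target=count_down, args=(n,))
               for _ in range(repeats)]
    start = time.perf_counter()
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return time.perf_counter() - start


if __name__ == '__main__':
    n = 2000000
    print('Sequential: %f s' % time_sequential(n, 3))
    print('Threaded:   %f s' % time_threaded(n, 3))
```

On a typical build the two numbers should come out in the same ballpark, since the threads take turns instead of computing side by side. The exact figures depend on your machine and Python version.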
Let's look at an example:
from threading import Thread
import time
import random

threads = 10  # Number of threads to launch


class MyThread(Thread):
    def __init__(self, tid):
        Thread.__init__(self)
        self.TID = tid  # Thread id
        self.status = None  # Set by run(), read after join()

    def run(self):
        print('Thread [%d] starts' % self.TID)
        time.sleep(random.random())  # Some long call between 0 and 1 seconds
        print('Thread [%d] ends' % self.TID)
        self.status = random.randrange(10)


if __name__ == '__main__':
    s = time.time()
    threadBag = []
    for tid in range(threads):
        t = MyThread(tid)
        t.start()
        threadBag.append(t)
    for thread in threadBag:
        thread.join()
        print('Thread [%d] returned %d' % (thread.TID, thread.status))
    print('Total time = %f' % (time.time() - s))
This is a simple first example of using threads in Python. Here each thread is an instance of a class derived from threading.Thread; MyThread is that class. The code the thread executes lives in run(self), which the base class invokes in the new thread once you call start(). Put all of your thread's code in that method; the thread ends when it returns. The other method to override is __init__. If you want to pass some data to the thread, this is the place: call Thread.__init__ first, then save the data in instance variables and enjoy! To return a value from the thread, store it in an attribute such as self.status and read it after join() has returned; join() itself returns nothing.
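The same pattern, reduced to the essentials, might look like the sketch below: data goes in through __init__, and the result is read from an attribute once join() has returned (join() itself returns None). SquareThread, number and result are hypothetical names chosen just for this illustration.

```python
from threading import Thread


class SquareThread(Thread):
    def __init__(self, number):
        Thread.__init__(self)   # Always initialise the base class first
        self.number = number    # Input data handed to the thread
        self.result = None      # Filled in by run()

    def run(self):
        # Everything in here executes in the new thread
        self.result = self.number * self.number


workers = [SquareThread(n) for n in range(5)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()               # Wait until run() has finished

results = [worker.result for worker in workers]
print(results)                  # [0, 1, 4, 9, 16]
```

Reading result before join() returns would be a race: the attribute may still be None at that point.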
The example simply creates some threads, launches them with start(), and then joins them to recover their exit status. It also measures the time the whole script needs. The output (which varies from run to run, because I use the random module) is:
Thread [0] starts
Thread [1] starts
Thread [2] starts
Thread [3] starts
Thread [4] starts
Thread [5] starts
Thread [6] starts
Thread [7] starts
Thread [8] starts
Thread [9] starts
Thread [0] ends
Thread [0] returned 2
Thread [3] ends
Thread [5] ends
Thread [1] ends
Thread [1] returned 7
Thread [6] ends
Thread [9] ends
Thread [4] ends
Thread [7] ends
Thread [2] ends
Thread [2] returned 2
Thread [3] returned 2
Thread [4] returned 5
Thread [5] returned 3
Thread [6] returned 2
Thread [7] returned 5
Thread [8] ends
Thread [8] returned 2
Thread [9] returned 6
Total time = 0.861000
You only need to look at the 'ends' lines, because the threads are created and joined in sequential loops. You can see that each thread needs a different amount of time to finish. But the magic here is that the total time of the whole script is, more or less, the maximum of the random sleep times rather than their sum. The waits overlap, so the threads really do run concurrently!
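To check the "maximum, not sum" claim without the randomness, here is a small sketch with fixed sleep times. The delays list and the wait function are invented for this illustration, and the measured time will vary a little between machines.

```python
import threading
import time

delays = [0.1, 0.2, 0.3]  # Fixed sleep times in seconds (illustrative values)


def wait(seconds):
    """Stand-in for a long call that only waits, like slow IO."""
    time.sleep(seconds)


start = time.perf_counter()
workers = [threading.Thread(target=wait, args=(d,)) for d in delays]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
elapsed = time.perf_counter() - start

print('Sum of delays = %f' % sum(delays))
print('Max of delays = %f' % max(delays))
print('Elapsed       = %f' % elapsed)
```

The elapsed time should land just above the largest delay and well below the sum, which is the same behaviour the random example shows.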
For more information about threads, visit this.
That's all for now. Comments welcome!