spacepy.toolbox.thread_job

spacepy.toolbox.thread_job(job_size, thread_count, target, *args, **kwargs)

Split a job into subjobs and run a thread for each

Each thread spawned will call L{target} to handle a slice of the job.

This is only useful if a job:
  1. Can be split into completely independent subjobs

  2. Relies heavily on code that releases the Python GIL while it runs, e.g. numpy or ctypes code

  3. Does not return a value. Either pass in a list/array to hold the result, or see L{thread_map}
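The pattern behind these three conditions can be sketched with the standard threading module alone. This is not SpacePy's implementation: the helper name run_slices and its even-split rule are assumptions made for illustration. Plain lists keep the sketch dependency-free, so it shows only the slicing and the "pass in a container for the result" pattern; no speedup should be expected, because pure-Python work holds the GIL (condition 2).

```python
import threading


def run_slices(job_size, thread_count, target, *args):
    """Hypothetical stand-in: split job_size into contiguous slices, one thread each."""
    base, extra = divmod(job_size, thread_count)
    threads = []
    start = 0
    for i in range(thread_count):
        # The first `extra` slices take one more element so every index is covered.
        count = base + (1 if i < extra else 0)
        threads.append(threading.Thread(target=target, args=args + (start, count)))
        start += count
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def square(in_list, out_list, start, count):
    # Each thread writes only to its own indices, so the subjobs are independent.
    # Nothing is returned; the result lands in the container passed by the caller.
    for i in range(start, start + count):
        out_list[i] = in_list[i] ** 2


data = list(range(10))
result = [None] * len(data)
run_slices(len(data), 3, square, data, result)
```

Because the target's return value is never collected, the output list has to be allocated before the call, as in condition 3.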

Parameters:
job_size : int
    Total size of the job. Often this is an array size.

thread_count : int
    Number of threads to spawn. If 0 or None, will spawn as many threads as there are cores available on the system. (Each hyperthreading core counts as 2.) Generally this is the Right Thing to do. If negative, will spawn abs(thread_count) threads, but will run them sequentially rather than in parallel; useful for debugging.

target : callable
    Python callable (generally a function, may also be an imported ctypes function) to run in each thread. The last two positional arguments passed in will be a “start” and a “subjob size,” respectively; frequently this will be the start index and the number of elements to process in an array.

args : sequence
    Arguments to pass to L{target}. If L{target} is an instance method, self must be explicitly passed in. start and subjob_size will be appended.

kwargs : dict
    Keyword arguments to pass to L{target}.
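The argument order described above can be made concrete with a small sequential stand-in. This is a sketch, not SpacePy code: call_in_slices and the Scaler class are hypothetical names, and passing the instance as the first positional argument to Scaler.apply is one reading of "self must be explicitly passed in".

```python
def call_in_slices(job_size, n_slices, target, *args, **kwargs):
    """Hypothetical sequential stand-in that mimics the documented call convention."""
    base, extra = divmod(job_size, n_slices)
    start = 0
    for i in range(n_slices):
        count = base + (1 if i < extra else 0)
        # start and subjob size are appended after the caller's positional args;
        # keyword arguments are forwarded unchanged.
        target(*(args + (start, count)), **kwargs)
        start += count


class Scaler:
    def __init__(self, factor):
        self.factor = factor

    def apply(self, values, out, start, count, offset=0):
        # start and count arrive as the last two positional arguments.
        for i in range(start, start + count):
            out[i] = values[i] * self.factor + offset


scaler = Scaler(3)
values = [1, 2, 3, 4, 5]
out = [None] * len(values)
# The instance is passed explicitly, ahead of the other positional arguments.
call_in_slices(len(values), 2, Scaler.apply, scaler, values, out, offset=1)
```

With this layout the target's signature reads: caller-supplied positional arguments, then start, then subjob size, then any keyword arguments.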

Examples

Squaring 100 million numbers:

>>> import numpy
>>> import spacepy.toolbox as tb
>>> numpy.random.seed(8675301)
>>> a = numpy.random.randint(0, 100, [100000000])
>>> b = numpy.empty([100000000], dtype='int64')
>>> def targ(in_array, out_array, start, count):
...     out_array[start:start + count] = in_array[start:start + count] ** 2
>>> tb.thread_job(len(a), 0, targ, a, b)
>>> print(b[0:5])
[2704 7225  196 1521   36]

In this example:
  • A target function is defined, which will be called in each thread. It is usually necessary to define a simple “wrapper” function like this to provide the correct call signature.

  • The target function receives inputs C{in_array} and C{out_array}, which are not touched directly by C{thread_job} but are passed through in the call. In this case, C{a} gets passed as C{in_array} and C{b} as C{out_array}.

  • The target function also receives the start and number of elements it needs to process. For each thread where the target is called, these numbers are different.