This chapter presents different techniques for parallelizing the code of your VisTrails modules.
VisTrails is single-threaded: the modules are executed one after the other, and while you are free to use threads in your module’s compute() method, you should not interact with VisTrails from other threads.
Note that because of restrictions in the CPython interpreter, this type of parallelization might not improve performance: the interpreter has a global lock that prevents two threads from executing Python code at the same time. If you are not already doing so, consider using packages such as NumPy, which provide efficient numerical functions implemented in C (and parallelizable).
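As an illustration of the threading pattern described above, here is a minimal sketch using only the standard library. The helper names run_in_threads and slow_double are hypothetical and not part of VisTrails. The worker threads only compute values, and the results are collected in the calling thread, which is the only place where your module should talk to VisTrails.

```python
import threading
import time


def run_in_threads(func, items):
    """Apply func to each item in its own thread; return results in order."""
    results = [None] * len(items)

    def worker(index, item):
        # Workers only compute and store a value; they never touch VisTrails.
        results[index] = func(item)

    threads = [threading.Thread(target=worker, args=(i, item))
               for i, item in enumerate(items)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait here, in the calling thread
    return results


def slow_double(value):
    time.sleep(0.1)  # stands in for blocking I/O, which releases the lock
    return value * 2


if __name__ == "__main__":
    # Inside a module, this call would sit in compute(); only after it
    # returns would the calling thread pass the results on to VisTrails.
    print(run_in_threads(slow_double, [1, 2, 3]))
```

Threads help here because the simulated work spends its time waiting rather than executing Python code. This sketch does not propagate exceptions raised in a worker; a failed item is simply left as None.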
The multiprocessing package, introduced in Python 2.6, can be used in VisTrails. This is generally the preferred way of performing multiple computational tasks in parallel in Python, and it makes effective use of multiple cores; please refer to the official documentation for more details.
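A minimal sketch of this approach follows, again using only the standard library. The function names (sum_of_squares, split_into_chunks, parallel_sum_of_squares) are hypothetical, and the sketch assumes the worker function is defined at the top level of a module that the worker processes can import, since multiprocessing sends functions to workers by name.

```python
import multiprocessing


def sum_of_squares(chunk):
    """CPU-bound work run in a worker process (must be a top-level function)."""
    return sum(x * x for x in chunk)


def split_into_chunks(values, n_chunks):
    """Split a list into at most n_chunks contiguous, roughly equal slices."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum_of_squares(values, processes=2):
    """Sum the squares of values, spreading the work over several processes."""
    pool = multiprocessing.Pool(processes=processes)
    try:
        # Each chunk is pickled, sent to a worker, and reduced there.
        partial_sums = pool.map(sum_of_squares,
                                split_into_chunks(values, processes))
    finally:
        pool.close()
        pool.join()
    return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000))))
```

In a module, the call to parallel_sum_of_squares would be made from compute(), with the result handed to VisTrails only after the pool has finished. Creating a pool has a noticeable startup cost, so this is worthwhile mainly when each chunk represents a substantial amount of computation.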
You can access IPython clusters through the Parallel Flow package, via the provided API. Simply declare it in your package’s dependencies, and import vistrails.packages.parallelflow.api. This module provides the following:
Gives you a view on the engines of the cluster, which you can use to submit tasks. This is currently equivalent to calling get_client()[:].
You should probably use load_balanced_view() instead.