UsersGuideVisTrailsPackages

From VistrailsWiki
Revision as of 17:09, 31 January 2007 by Cscheid (talk | contribs)
Jump to navigation Jump to search

Introduction

VisTrails provides infrastructure for user-defined functionality to be incorporated into the main program. Specifically, users can incorporate their own visualization and simulation codes into pipelines by defining custom modules. These modules are bundled in what we call packages. A VisTrails package is simply a collection of Python classes -- each of these classes will represent a new module -- created by the user that respects a certain convention. Here's a simplified example of a very simple user-defined module:

class Divide(Module):
    def compute(self):
        arg1 = self.getInputFromPort("arg1")
        arg2 = self.getInputFromPort("arg2")
        if arg2 == 0.0:
            raise ModuleError(self, "Division by zero")
        self.setResult("result", arg1 / arg2)

registry.addModule(Divide)
registry.addInputPort(Divide, "arg1", (basic.Float, 'dividend'))
registry.addInputPort(Divide, "arg2", (basic.Float, 'divisor'))
registry.addOutputPort(Divide, "result", (basic.Float, 'quotient'))

New VisTrails modules must subclass from Module, the base class that defines basic functionality. The only required override is the compute() method, which performs the actual module computation. Input and output is specified through ports, which currently have to be explicitly registered with VisTrails. However, this is straightforward, and done through method calls to the module registry. A complete documented example of a (slightly) more complicated module is available here.

Dealing with side effects

In an ideal world, each module would be referentially transparently. In other words, a module's outputs should be completely determined by its inputs. This is important for provenance purposes - if modules have implicit dependencies, it is not possible to be certain that when the process is reexecuted, the same results will be generated.

However, it is clear that certain modules are inherently side-effectful (reading/writing files, network, etc). For the common case of temporary files, VisTrails provides a convenience layer that removes part of the burden of managing temporary files.