FAQ


Also check our Known Issues page for troubleshooting.


Running workflows

How can I run a workflow using the command line?

(Updated for version 1.2) Call vistrails using the following options:

python vistrails.py -b path_to_vistrails_file:pipeline

where pipeline can be a version tag name or a version id.

NOTE: If you downloaded the MacOS X bundle, you can run vistrails from the command line via the following commands in the Terminal. Change the current directory to wherever VisTrails was installed (often /Applications), and then type:

Vistrails.app/Contents/MacOS/vistrails [<cmd_line_options>]

Running a Specific Workflow in Batch Mode

Using the command line, we'd like to execute a workflow multiple times, with slightly different parameters, and create a series of output files. Is this possible?

(Updated for version 1.2) We can change parameters that have an alias through the command line.

For example, the offscreen pipeline in offscreen.vt always creates a file called image.png. If you want to generate it with a different filename:

python vistrails.py -b ../examples/offscreen.vt:offscreen -a"filename=other.png"

filename in the example above is the alias name assigned to the parameter in the value method inside the String module. When running a pipeline from the command line, VisTrails will try to start the spreadsheet automatically if the pipeline requires it. For example, this other execution will also start the spreadsheet (note how the $ characters are escaped when running in bash):

python vistrails.py -b ../examples/head.vt:aliases -a"isovalue=30\$&\$diffuse_color=0.8,0.4,0.2"

You can also execute more than one pipeline on the command line:

python vistrails.py -b ../examples/head.vt:aliases ../examples/spx.vt:spx -a"isovalue=30"

Use the -a parameter only once, regardless of the number of pipelines.

Running a Workflow with Specific Parameters
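To run the same workflow several times with different alias values, the command lines shown above can be generated by a small script. The following is a minimal sketch, assuming the -b and -a options behave as described in this section; the helper name build_command and the output file names are made up for illustration. Because the arguments are built as a list rather than typed into a shell, the $&$ separator between aliases needs no escaping.

```python
def build_command(vistrail_file, version, aliases):
    """Build the argument list for one batch run.

    aliases is a dict mapping alias names to values.
    """
    # Several alias assignments are joined with "$&$", as in the
    # head.vt example above.
    alias_arg = "$&$".join("%s=%s" % (name, value)
                           for name, value in sorted(aliases.items()))
    return ["python", "vistrails.py", "-b",
            "%s:%s" % (vistrail_file, version),
            "-a" + alias_arg]


# One command per output file (hypothetical file names).
commands = [build_command("../examples/offscreen.vt", "offscreen",
                          {"filename": "image_%d.png" % i})
            for i in range(3)]

for command in commands:
    print(" ".join(command))
    # To actually run it from the vistrails directory:
    # import subprocess; subprocess.call(command)
```

Each generated command corresponds to one invocation of the offscreen example with a different filename alias.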

I can load a vistrail, and the version tree shows up fine. However, no pipelines appear when I click on a version. What gives?

The most likely reason is that the vistrail uses a package that is not registered with VisTrails. You need to identify the needed package and add it to your .vistrails/startup.py. A single line like the following should be enough:

addPackage('enter_package_name_here')

Some packages might need more information. For example:

addPackage('afront', executable_path='/path/to/afront')

Refer to the package documentation for details. One inconvenience is that there is currently no automated way to identify which package is missing. We're working on this feature for future releases.

I have a workflow that reads a file and then does some processing. The first time it runs, it executes correctly. But on subsequent runs, nothing happens.

VisTrails caches by default, so after a workflow is executed, if none of its parameters change, it won't be executed again.

If a workflow reads a file using the basic module File, VisTrails does check whether the file was modified since the last run. It does so by keeping a signature that is based on the modification time of the file. And if the file was modified, the File module and all downstream modules (the ones which depend on File) will be executed.
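The idea can be illustrated with a small stand-alone sketch. This is not the VisTrails caching code; the function names are hypothetical and only show how a signature built from the parameters and the file's modification time decides whether a step is recomputed.

```python
import os
import tempfile

_cache = {}


def file_signature(path, parameters):
    """Combine the parameters with the file's modification time."""
    return (path, os.path.getmtime(path), tuple(sorted(parameters.items())))


def run_cached(path, parameters, compute):
    """Call compute(path, parameters) only when the signature is new."""
    signature = file_signature(path, parameters)
    if signature not in _cache:
        _cache[signature] = compute(path, parameters)
    return _cache[signature]


calls = []


def count_lines(path, parameters):
    calls.append(path)  # record every real execution
    with open(path) as f:
        return len(f.readlines())


# Demonstration with a temporary file.
handle, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(handle, "w") as f:
    f.write("a\nb\n")

first = run_cached(path, {}, count_lines)    # computed
second = run_cached(path, {}, count_lines)   # served from the cache

# Simulate a later modification of the file.
with open(path, "a") as f:
    f.write("c\n")
os.utime(path, (1, 1))  # force a different modification time
third = run_cached(path, {}, count_lines)    # recomputed
os.remove(path)
```

In this sketch count_lines runs only twice: once for the original file and once after the modification time changes.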


Note: If you would like your input and output data to be versioned, you can use the Persistence package.

If you do not want VisTrails to cache executions, you can turn off caching: go to Edit -> Preferences and, in the General Configuration tab, change Cache execution results to Never.

Workflow Execution

Can VisTrails execute workflows in parallel?

The VisTrails server can only execute pipelines in parallel if there's more than one instance of VisTrails running. The command

self.rpcserver = ThreadedXMLRPCServer((self.temp_xml_rpc_options.server, self.temp_xml_rpc_options.port))

starts a multithreaded version of the XML-RPC server, so it will create a thread for each request received by the server. The problem is that Qt/PyQt only allows GUI objects to be created in the main thread, not in these additional threads. To overcome this limitation, the multithreaded version can instantiate other single-threaded versions of VisTrails and put them in a queue, so that workflow executions and other GUI-related requests, such as generating workflow graphs and history trees, can be forwarded to this queue, with each instance taking turns answering requests. If the results are in the cache, the multithreaded version answers the requests directly.
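The queueing scheme can be pictured with the following sketch. It is not the code in application_server.py; the class and function names are hypothetical, and plain Python callables stand in for the single-threaded VisTrails instances.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


class InstancePool(object):
    """Hands each request to the next idle instance."""

    def __init__(self, instances):
        self._idle = queue.Queue()
        for instance in instances:
            self._idle.put(instance)

    def handle(self, request):
        # Block until an instance is free, use it, then return it to
        # the back of the queue so that the instances take turns.
        instance = self._idle.get()
        try:
            return instance(request)
        finally:
            self._idle.put(instance)


def make_instance(name):
    """Create a stand-in for one single-threaded instance."""
    def execute(request):
        return "%s executed %s" % (name, request)
    return execute


pool = InstancePool([make_instance("instance-1"), make_instance("instance-2")])

results = []


def worker(request):
    results.append(pool.handle(request))


# Four request threads share the two instances.
threads = [threading.Thread(target=worker, args=("workflow-%d" % i,))
           for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The request threads never touch an instance directly; they only wait for the pool, which mirrors the way GUI-related work is kept out of the request threads.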

Note that this infrastructure works on Linux only. To make this work on Windows, you have to create a script similar to start_vistrails_xvfb.sh (located in the scripts folder) where you can send the number of other instances via command-line options to VisTrails. The command line options are:

python vistrails_server.py -T <ADDRESS> -R <PORT> -O<NUMBER_OF_OTHER_VISTRAILS_INSTANCES> [-M]&

If you want the main vistrails instance to be multithreaded, use the -M at the end.

After creating this script, update the function start_other_instances in vistrails/gui/application_server.py (lines 1007-1023) and set the script variable to point to your script. You may also have to change the arguments sent to your script (line 1016; for example, you don't need to set a virtual display). You will also need to change the path to the stop_vistrails_server.py script (line 1026) according to your installation path.

Executing Workflows in Parallel

When a workflow is executed, what do the colors mean?

- lilac: module was not executed

- yellow: module is currently being executed

- green: module was successfully executed

- orange: module was cached

- red: the execution of the module failed

Workflow Execution

Workflow execution hangs on Windows

This can happen if you are using "quick edit mode" in the console and have print statements in your code. Standard output can then get blocked by the console. Pressing space in the console resumes the execution. To avoid this problem, either disable "quick edit mode", or avoid print statements in your code.
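One way to avoid print statements is to send messages to a log file with Python's standard logging module instead. This is a minimal sketch; the logger name and the log file location are arbitrary choices, not VisTrails conventions.

```python
import logging
import os
import tempfile

# Hypothetical log file location.
log_path = os.path.join(tempfile.gettempdir(), "my_workflow.log")

logger = logging.getLogger("my_workflow")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep these messages away from console handlers

handler = logging.FileHandler(log_path, mode="w")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

# Use this wherever a print statement would have been used.
logger.info("processing started")
```

Since nothing is written to standard output, the console cannot block the execution.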

VisTrails does not install missing system packages

If VisTrails does not try to install missing system packages, it may be because it cannot determine your system type. In that case you can run this (in Python) to determine your system type:

   import platform
   platform.linux_distribution()

And add this system name to gui/bundles/utils.py by, e.g., modifying the _guess_ubuntu method (if your system is apt-based):

   def _guess_ubuntu():
       return platform.linux_distribution()[0]=='Ubuntu' or \
              platform.linux_distribution()[0]=='YourSystemName'
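Note that platform.linux_distribution() was removed from the standard library in Python 3.8. If it is not available in your interpreter, the distribution name can usually be read from /etc/os-release instead. The sketch below is only an assumption about how such a check could look; parse_os_release and distribution_name are hypothetical helpers, not part of VisTrails.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


def distribution_name(path="/etc/os-release"):
    """Return the NAME field, or an empty string if the file is missing."""
    try:
        with open(path) as f:
            return parse_os_release(f.read()).get("NAME", "")
    except IOError:
        return ""


print(distribution_name())
```

The returned name could then be compared against 'Ubuntu' or your own system name in the same way as in _guess_ubuntu.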

Cannot update subworkflows after upgrading packages or vistrails version

When a package used by a subworkflow is upgraded, any subworkflows that use it will be automatically upgraded. An upgraded subworkflow may then lose the ability to be updated to a newer local subworkflow. In this case the subworkflow needs to be updated by hand by removing it from the pipeline and dragging it in again from the module palette. This may get fixed in a future release.

Building workflows

Is there a way to give each widget a "display name" in addition to the module name at the center of the widget?

Yes, a "display name" can be assigned to a module by selecting the triangle in its top right corner to open a popup menu and selecting the Set Module Label... menu item. You will then be prompted to enter the "display name".

Changing Module Labels

Is there a way to re-center the picture-in-picture (PiP) view?

Yes. If you click on the PiP window to bring it into focus, you can press Ctrl-R (or Command-R on Mac) to re-center it.

Vistrails Interaction

How do I search for a literal "?" (question mark) in the search box in the Property panel?

Since we allow regular expressions in our search box, question marks are treated as meta-characters. Thus, searching for "?" returns everything and "abc?" will return everything containing "abc". You need to use "\?" instead to search for "?". So the search for "??" would be "\?\?".

Textual Queries
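The escaping rule can be tried out with Python's re module, used here only as a stand-in for the search box (the assumption being that both treat an escaped question mark as a literal character). The helper name is hypothetical.

```python
import re


def escape_question_marks(text):
    """Put a backslash in front of every literal question mark."""
    return text.replace("?", "\\?")


pattern = escape_question_marks("??")
print(pattern)  # \?\?

# The escaped pattern matches only text with two literal question marks.
print(re.search(pattern, "really??") is not None)  # True
print(re.search(pattern, "really?") is not None)   # False
```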

Saving a vistrail fails when Running VisTrails on Windows inside a Virtual Machine

After installing Windows in a Virtual Machine, the path to zip.exe may be missing, and you may see this error when trying to save a vistrail:

   WindowsError: [Error 2] The system cannot find the file specified: '***/vt.zip'

Then you need to add the path to zip.exe, which is included in the binary distribution of VisTrails, to your PATH variable.

Using VisTrails as a server

What is the VisTrails server-mode?

Using the VisTrails server mode, it is possible to execute workflows and control VisTrails through another application. For example, the CrowdLabs Web portal (http://www.crowdlabs.org) accesses a VisTrails server to execute workflows and to retrieve and display vistrail trees and workflows.

Using VisTrails as a Server

How do I execute workflows and control VisTrails through another application?

The server is accessed through XML-RPC calls. In the current VisTrails release, we include a set of PHP scripts that can talk to a VisTrails server instance. They are in the "extensions/http" folder. The files are reasonably well documented. It should also not be difficult to create Python scripts to access the server (just use the xmlrpclib module).
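A Python client could start from a sketch like the one below. The host and port are placeholders, and the helper names are made up. The methods a VisTrails server actually exposes are not listed here; system.listMethods is the standard XML-RPC introspection call and only works if the server has introspection enabled.

```python
try:
    import xmlrpclib                     # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib    # Python 3


def server_url(host, port):
    """Build the URL of the XML-RPC endpoint."""
    return "http://%s:%d" % (host, port)


def connect(host, port):
    """Return a proxy object; nothing is sent until a method is called."""
    return xmlrpclib.ServerProxy(server_url(host, port))


# Example (requires a running server; host and port are placeholders):
# proxy = connect("localhost", 8081)
# print(proxy.system.listMethods())
```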

Note that the VisTrails server requires the provenance and workflows to be in a database. More detailed instructions on how to setup the server and the database are available here:

http://www.crowdlabs.org/site_media/static/dev_docs/vistrails_server_setup.html

http://www.crowdlabs.org/site_media/static/dev_docs/vistrails_database_setup.html

If what you want is just to execute a series of workflows in batch mode, a simpler solution would be to use the VisTrails client in batch mode. Chapter 12 of the user's guide contains detailed information and examples on that.

Running VisTrails in Batch Mode

VisTrails server executes a workflow but generates a blank image and generates the error message cannot get access to X server

You will need to check whether the display the server is trying to use is a valid display (by default it uses display 0). On Linux, the command w will list the logged-in users and the display associated with them (FROM column).

Note that the VisTrails server requires the machine to be running X.

cannot get access to X server

Running VisTrails in server or batch mode requires a connection to an X server.

No additional setup is required if you run VisTrails on a terminal because you are already logged in to X. To make it work in other scenarios, you need to run the python command through Xvfb or make sure you can run cgi scripts that access the GUI.

If you can run Xvfb, you can use the following script, where you need to configure the first four variables according to your system: http://www.vistrails.org/images/Run_vistrails_batch_xvfb.script.sh.txt

(To run the script, rename the file and remove the ".txt")


You should also modify your cgi script to invoke the bash script instead of vistrails directly. The bash script will accept the virtual display, the vistrail file and workflow tag as input arguments.

Another possibility, if your workflow does not require the GUI, is to use VisTrails as a regular Python module, which does not require the GUI or an X server to run. This functionality is available in the nightly builds and will be included in VisTrails 2.0 beta to be released soon. There is an example of how to use this feature in our FAQ: http://www.vistrails.org/index.php/FAQ#Using_VisTrails_as_a_Python_module

Problems starting VisTrails

Setup was unable to create the directory "N:\.vistrails"

When VisTrails is installing, it tries to create the .vistrails folder in the user's %HOMEPATH% directory. In some Windows installations, network accounts are set to a directory that the user does not have write access to. Consequently, the installation will fail. To get around this problem, you can use the "-S <directory>" flag when starting VisTrails. This option allows you to put the .vistrails directory wherever you wish. You could also write a short script that automatically invokes VisTrails with the "-S" flag pointing to a directory that makes sense for your network. If you are unable to install VisTrails, you can run the installer after setting a new home path from the command line like this:

set HOMEPATH=\My\New\Home\
set HOMEDRIVE=C:
vistrails-setup-2.0.1-xxx.exe

Using VisTrails as a Python module

Can I use VisTrails as a Python module without installing PyQt?

Yes! We have improved the ability to use VisTrails from other software, and have eliminated most GUI (PyQt) dependencies in the core part of the code. Thus, you can now work with workflow versions and provenance information in a standard Python shell. Note that packages that directly rely on the GUI, like the VisTrails Spreadsheet, will still require PyQt to be installed.

How do I open and execute workflows in a standard python shell?

Here is a simple example that shows how you can open and execute a workflow from a Python script:

>>> import vistrails as vt
>>>
>>> vistrail = vt.load_vistrail('simplemath.vt')
>>> vistrail.select_latest_version()
>>> result = vistrail.execute(in_a=2, in_b=4)
>>> result.output_port('out_plus')
6.0

A more complete example is available in the VisTrails distribution as examples/api/ipython-notebook.ipynb
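Building on the calls shown above, a workflow can be executed repeatedly with different inputs from an ordinary loop. The sketch below assumes the simplemath.vt example and its in_a, in_b and out_plus ports; the helper name sweep_in_a is made up for illustration.

```python
def sweep_in_a(vistrail, values, in_b=4):
    """Execute the selected version once per value of in_a."""
    outputs = []
    for value in values:
        result = vistrail.execute(in_a=value, in_b=in_b)
        outputs.append(result.output_port('out_plus'))
    return outputs


# Usage, assuming simplemath.vt from the example above:
# import vistrails as vt
# vistrail = vt.load_vistrail('simplemath.vt')
# vistrail.select_latest_version()
# print(sweep_in_a(vistrail, [1, 2, 3]))
```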

Control Flow

Note: using map

When using 'map', the module (or subworkflow) used as the function port in the Map module MUST be a function, i.e., it can only define one output port.

The Map Operator

Spreadsheet

How can I save an image from the spreadsheet?

With the focus on a spreadsheet cell, select the camera icon on the toolbar to take a snapshot. The system will prompt you for the location and file name where it should be saved. The other icons can be used for saving multiple images that can be used for generating an animation on demand. A whole sheet can also be saved by selecting Export (either from the menu or from the toolbar).

Saving a Spreadsheet Image

Is it possible to save the complete state of the spreadsheet?

Saving a Spreadsheet

Can I view multiple sheets at the same time?

Yes. Each sheet on the spreadsheet can be displayed as a dock widget separated from the main spreadsheet window by dragging its tab name out of the tab bar at the bottom of the spreadsheet.

Multiple Spreadsheets

Then, how can I put back a separated sheet?

A sheet can be docked back to the main window by dragging it back to the tab bar or by double-clicking on its title bar.

Multiple Spreadsheets

How can I order sheets on the spreadsheet?

This can be done by dragging the sheet name in the tab bar at the bottom and dropping it in the right place.

Multiple Spreadsheets

Can I control where a cell will be placed on the spreadsheet window?

By default, an unoccupied cell on the active sheet will be chosen to display the result. However, you can specify exactly in the pipeline where a spreadsheet cell will be placed by using CellLocation and SheetReference. CellLocation specifies the location (row and column) of a cell when connecting to a spreadsheet cell (VTKCell, ImageViewerCell, ...). Similarly, a SheetReference module (when connecting to a CellLocation) will specify which sheet the cell will be put on given its name, minimum row size and minimum column size. There is an example of this in examples/vtk.xml (select the version below Double Renderer).

Sending Output to the Spreadsheet

How do I output results to the spreadsheet?

By inspecting the VisTrails Spreadsheet package (in the list of packages, to the left of the pipeline builder), you can see there are built-in cells for different kinds of data, e.g., RichTextCell to display HTML and plain text. You (the user) can also define new cell types to display application-specific data. For example, we have developed VtkCell, MplFigureCell, and OpenGLCell. It is possible to display pretty much anything on the Spreadsheet!

Sending Output to the Spreadsheet

Examples of writing cell modules can be found in:

- RichTextCell: packages/spreadsheet/widgets/richtext/richtext.py

- VTK: packages/vtk/vtkcell.py

Here is the summary of some requirements on a cell widget:

(1) It must be a Qt widget. It should inherit from spreadsheet_cell.QCellWidget in the spreadsheet package. Although any Qt Widget would work, certain features such as animation will not be available (without rewriting it).

(2) It must re-implement the updateContents() function to take a set of inputs (usually coming from input ports of a wrapper Module) and display on the cells. VisTrails uses this function to update/reuse cells on the spreadsheet when new data comes in.

(3) It needs a wrapper VisTrails Module (inherited from basic_widgets.SpreadsheetCell of the spreadsheet package). Inside the compute() method of this module, it may call self.display(CellWidgetType, (inputs)) to trigger the display event on the spreadsheet.

Advanced Cell Options

How do I control the default number of cells in the spreadsheet?

You can configure the rowCount and colCount using the preferences dialog. Just go to the Module Packages tab, select spreadsheet in the "Enabled packages" list and press the Configure button. A list of all the configuration options for the spreadsheet will then show up.

Custom Layout Options

Is it possible to launch a web browser from the vistrails spreadsheet? We would like to output several urls from a parameter sweep and then have the option to click on each one to view the resulting page. I can view the page within the spreadsheet, but it is really too crowded.

Currently, there isn't a widget that provides exactly this functionality, but I can think of a few solutions that may work for you:

(1) You can use parameter exploration to generate multiple sheets so you might have an exploration that opens each page in a new sheet. Use the third column/dimension in the exploration interface to have a parameter span sheets.

(2) The spreadsheet is extensible so you can write a custom spreadsheet cell widget that has a button or label with the desired link (a QLabel with openExternalLinks set to True, for example).

(3) You can tweak the existing RichTextCell by adding the line "self.browser.setOpenExternalLinks(True)" at line 63 of the source file "vistrails/packages/spreadsheet/widgets/richtext/richtext.py". Then, if your workflow creates a file with html markup text like "<a href="http://www.vistrails.org/">VisTrails</a>" connected to a RichTextCell, clicking on the rendered link in the cell will open it in a web browser. You need to add the aforementioned line to the source to let Qt know that you want the link opened externally; by default, it will just issue an event that isn't processed.

Launching a Web Browser

Integrating your software into VisTrails

How can I integrate my own program into VisTrails?

The easiest way is to create a package. Writing a package is often very simple; please refer to this section of the users' guide.

You can also dynamically generate modules. For an example see:

Generating Modules Dynamically

In particular, see the new_module call which uses python's type() function to generate new classes dynamically.
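The underlying mechanism is plain Python. The sketch below shows how type() builds a class at run time; the Module class here is a stand-in defined only for this example, not the VisTrails base class, and make_module_class is a hypothetical helper rather than the new_module function itself.

```python
class Module(object):
    """Stand-in for a module base class (not the real VisTrails Module)."""

    def compute(self):
        raise NotImplementedError


def make_module_class(name, compute_function):
    # type(name, bases, namespace) creates a new class object.
    return type(name, (Module,), {"compute": compute_function})


def compute_hello(self):
    return "hello from %s" % type(self).__name__


HelloModule = make_module_class("HelloModule", compute_hello)
print(HelloModule().compute())  # hello from HelloModule
```

The generated class behaves like one written with a class statement, so it can be subclassed or registered in the same way.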

How do I add a port that is not visible on the module (when it appears on the design canvas)?

This can be accomplished via the "optional" argument. This is the fourth argument of add_input_port (add_output_port) or can be specified as a kwarg. In your example, this would look like:

reg.add_input_port(MyModule, "MyPort", (core.modules.basic_modules.String, 'MyPort Name'), True)

or with kwargs

reg.add_input_port(MyModule, "MyPort", (core.modules.basic_modules.String, 'MyPort Name'),\
                   optional=True)

or

_input_ports = [('MyPort', '(core.modules.basic_modules.String)', {"optional": True})]

Configuring Ports

How do modules deal with multiple inputs in a same port?

(And should that even be allowed?)

For compatibility reasons, we do need to allow multiple connections to an input port. However, most package developers should never have to use this, and so we do our best to hide it. The default behavior for getting inputs from a port, then, is to always return a single input.

If your module needs multiple inputs connected to a single port, use the 'forceGetInputListFromPort' method. It will return a list of all the data items coming through the port. The spreadsheet package uses this feature, so look there for usage examples (vistrails/packages/spreadsheet/basic_widgets.py).

Configuring Ports

Are there mechanisms for attaching widgets to different modules/parameters?

Right now, we have a mechanism for putting a specific widget for an input port. For example, if a port is SetColor(red, green, blue), we can put a color wheel widget there. Or we can also replace the SetFileName port with a File Widget. However, this is not per parameter (only per port). We are currently working on this problem.

Can I organize my package so it appears hierarchical in the module palette?

Yes. Use the namespace keyword argument when adding the module to the registry. For example,

registry.add_module(MyModule, namespace='MyNamespace')

Configuring Modules - Hierarchy and Visibility

Can I nest namespaces?

Yes. Use the '|' character to separate the levels of the hierarchy. For example,

registry.add_module(MyModule, namespace='ParentNamespace|ChildNamespace')

Configuring Modules - Hierarchy and Visibility

Are there shortcuts for registry initialization?

Yes. If you define _modules as a list of classes in the __init__.py file of your package, VisTrails will attempt to load all classes specified as modules. You can provide add_module options as keyword arguments by specifying a tuple (class, kwargs) in the list. For example:

_modules = [MyModule1, (MyModule2, {'namespace': 'MyNamespace'})]

In addition, you need to identify the ports of your modules as a field in your class by defining _input_ports and _output_ports lists. Here, the items in each list must be tuples of the form (portName, portSignature, optional=False, sort_key=-1). For example:

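from core.modules.vistrails_module import Module
from core.modules.basic_modules import Integer, String

class MyModule(Module):
    # Each port is (portName, portSignature[, optional[, sort_key]]).
    _input_ports = [('firstInput', String), ('secondInput', Integer, True)]
    _output_ports = [('firstOutput', String), ('secondOutput', String)]

    def compute(self):
        pass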

Customizing Modules and Ports

Can I define ports to be of types that I do not import into my package?

Yes. You can pass an identifier string as the portSignature instead. The port_signature string is defined by:

<module_string> := <package_identifier>:[<namespace>|]<module_name>,
<port_signature> := (<module_string>*)

For example,

registry.add_input_port(MyModule, 'myInputPort', '(edu.utah.sci.vistrails.basic:String)')

or

 _input_ports = [('myInputPort', '(edu.utah.sci.vistrails.basic:String)')]

Configuring Ports - Port Types
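To make the grammar concrete, here is a small stand-alone parser for such signature strings. It is an illustration only, not the parser used by VisTrails, and it assumes that several module strings are separated by commas and that a nested namespace uses '|' as described elsewhere in this FAQ. The function name is hypothetical.

```python
def parse_port_signature(signature):
    """Split '(pkg:Namespace|Name,pkg:Name)' into
    (package, namespace, name) tuples."""
    signature = signature.strip()
    if not (signature.startswith("(") and signature.endswith(")")):
        raise ValueError("port signature must be enclosed in parentheses")
    modules = []
    for module_string in signature[1:-1].split(","):
        module_string = module_string.strip()
        if not module_string:
            continue
        package, _, rest = module_string.partition(":")
        if not rest:
            raise ValueError("missing ':' in %r" % module_string)
        # Everything before the last '|' is the (possibly nested) namespace.
        namespace, _, name = rest.rpartition("|")
        modules.append((package, namespace, name))
    return modules


print(parse_port_signature('(edu.utah.sci.vistrails.basic:String)'))
```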

What do I need to change in my package to make it reloadable (new in v1.4.2)?

See Creating Reloadable Packages for an explanation.

Can I add default values or labels for parameters?

Yes. Versions 1.4 and greater support these features. See Configuring Ports - Default Values and Labels for more details.

How can I access the default values for a parameter?

The default values are stored in PortSpec.defaults for each port.

I want to write a module to load HDF data whose output (e.g., data, string) varies according to the input I give it. Is it possible to do this in VisTrails, and if yes, how can I do that? Ideally, I would like to avoid having to change the connection of my output every time I change the input.

There are a few ways to tackle this, and each has its own benefits and pitfalls. Firstly, module connections do respect class hierarchies as we're familiar with in object-oriented languages. For instance, a module can output a Constant, of which String, Float, Integer, etc. are specializations. In this way, you can have a subclass of something like HDFData be passed out of the module and the connections will be established regardless of the sub-type. This is a bit dangerous though. Modules downstream of such a class may not really know how to operate on certain types derived from the super-class. Extreme care must be taken both when creating the modules and when connecting them to prevent things like this from happening.

A second method that I employ in several different packages is the idea of a container class. For instance, the NumSciPy package uses a relatively generic container "Numpy Array" to encapsulate the data. Of course, these encapsulating objects can store dictionaries that other modules can easily access and understand how to operate on. Although this method is slightly more work, the stricter typing of ports is beneficial, particularly when interfacing with other packages that may depend on strongly typed constants (for example).

Varying Output According to the Input
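The container idea can be sketched in a few lines of plain Python. The class below is hypothetical and is not the NumSciPy container; it only shows how a single port type can carry different kinds of data together with a dictionary that downstream code can inspect.

```python
class DataContainer(object):
    """Hypothetical container passed between modules instead of raw data."""

    def __init__(self, data, **metadata):
        self.data = data
        self.metadata = dict(metadata)

    def kind(self):
        return self.metadata.get("kind", "unknown")


def describe(container):
    # A downstream module can branch on the metadata while the port
    # type stays the same.
    if container.kind() == "string":
        return "text of length %d" % len(container.data)
    if container.kind() == "array":
        return "array with %d values" % len(container.data)
    return "unsupported data"


print(describe(DataContainer("abc", kind="string")))
print(describe(DataContainer([1.0, 2.0], kind="array")))
```

With this approach the output port is always typed as the container, so connections do not need to change when the input changes.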

I need to determine, at run-time, whether or not a "child" module is attached to the output port of a "parent" module. (I do not specifically need to know which child; just if there is one).

The outputPorts dictionary of the base Module stores this information. Thus, you should be able to check

("myPortName" in self.outputPorts)

on the parent module to check if there are any downstream connections from the port "myPortName". This might be used, for example, to only set results for output ports that will be used. ***Note***, however, that the caching algorithm assumes that all outputs are set so adding a new connection to a previously unconnected output port will not work as desired if that module is cached. For this reason, I would currently recommend making such a module not cacheable. Another possibility is overriding the update() method to check the outputPorts and set the upToDate flag if they are not equal. In a single, limited test, this seemed to work, but be warned that it is not fully tested. Here is an example:

class TestModule(Module):
    _output_ports = [('a1', '(edu.utah.sci.vistrails.basic:String)'),
                     ('a2', '(edu.utah.sci.vistrails.basic:String)')]
    def __init__(self):
        Module.__init__(self)
        self._cached_output_ports = set()
    
    def update(self):
        if len(set(self.outputPorts) - self._cached_output_ports) > 0:
            self.upToDate = False
        Module.update(self)
    
    def compute(self):
        if "a1" in self.outputPorts:
            self.setResult("a1", "test")
        if "a2" in self.outputPorts:
            self.setResult("a2", "test2")
        self._cached_output_ports = set(self.outputPorts)

Configuring Ports - Determining Whether or Not a Module is Attached to an Output Port

How can I make a module not display in the modules list?

You should set the abstract parameter to True when adding the module to the registry. Using the original syntax, this looks like:

def initialize():
    reg = core.modules.module_registry.get_module_registry()
    reg.add_module(InvisibleModule, abstract=True)
    # ...

With the _modules list shortcut (for more details, see the FAQ entry on shortcuts for registry initialization), you include it in a kwargs dict as part of a module tuple:

_modules = [AnotherModule, (InvisibleModule, {'abstract': True})]

There is also a 'hide_descriptor' parameter that prevents the module from appearing in the module palette without declaring it to be abstract.

The technical difference between the two is that 'abstract' will not add the item to the module palette while 'hide_descriptor' does add the item but immediately hides it. If the module should never be instantiated in a workflow, declare it abstract. If you don't want users to be able to add the module to a pipeline, but you have code that may add it programmatically, declare it with hide_descriptor=True.

Configuring Modules - Hierarchy and Visibility

How do I document individual ports?

To access port documentation, users can right-click on the port in the port list and choose the corresponding menu item. To provide this documentation, you should define the provide_input_port_documentation and/or the provide_output_port_documentation class methods. Note that these methods take the class and the port name as arguments. For example,

class MyModule(Module):
    _input_ports = [('test', '(edu.utah.sci.vistrails.basic:String)'),
                    ('test2', '(edu.utah.sci.vistrails.basic:String)')]
    port_docs = {'test': 'Some documentation',
                 'test2': 'More documentation'}
    @classmethod
    def provide_input_port_documentation(cls, port_name):
        return cls.port_docs[port_name]

How do I access modules from other packages?

Currently, it is best to access modules from the registry. First, make sure that any dependencies on another package are specified in the package_dependencies method in __init__.py. To create a module from another package as an output, you can generate it from the registry. For example,

from core.modules.module_registry import get_module_registry
from core.modules.vistrails_module import Module

class ReturnFigManager(Module):
    _output_ports = [('figManager',
                      '(edu.utah.sci.vistrails.matplotlib:MplFigureManager)')]

    def compute(self):
        reg = get_module_registry()
        wrapper = reg.get_descriptor_by_name(
            "edu.utah.sci.vistrails.matplotlib",
            "MplFigureManager").module()
        wrapper.figManager = "blah"
        self.setResult('figManager', wrapper)

You can also subclass classes obtained from the registry. For example,

MplFigureManager = get_module_registry().get_descriptor_by_name(
    "edu.utah.sci.vistrails.matplotlib", 
    "MplFigureManager").module
class MplFigureManagerSubclass(MplFigureManager):
    pass


How do I create a custom module configuration widget?

See Module Configuration Example for a full example and notes about doing this.

Can I make a PythonSource module cacheable?

Yes. If you have a module that you are planning to re-use in a workflow, we recommend making a packaged module (these are cacheable by default). However, you can make a PythonSource module (not cacheable by default) cacheable using the line

self.is_cacheable = lambda *args, **kwargs: True

in the source of the PythonSource module.

The Console

Where should I go to find out what I can call from the console and how to import it?

We have tried to make some methods more accessible in the console via an api. You can import the api via from vistrails import api in the console and see the available methods with dir(api). To open a vistrail:

from vistrails import api
api.open_vistrail_from_file('/Applications/VisTrails/examples/terminator.vt')

To execute a version of a workflow, you currently have to go through the controller:

api.select_version('Histogram')
api.get_current_controller().execute_current_workflow()

Currently, only a subset of VisTrails functionality is directly available from the api. However, since VisTrails is written in python, you can dig down starting with the VistrailsApplication or controller object to expose most of our internal methods. If you have suggestions for calls to be added to the api, please let us know.

One other feature that is still in progress is the ability to construct workflows via the console. For example:

vtk = load_package('edu.utah.sci.vistrails.vtk')
vtk.vtkDataSetReader() # adds a vtkDataSetReader module to the pipeline
# click on the new module
a = selected_modules()[0] # get the one currently selected module
a.SetFile('/vistrails/examples/data/head120.vtk') # sets the SetFile parameter for the data set reader
b = vtk.vtkContourFilter() # adds a vtkContourFilter module to the pipeline and saves to var b
b.SetInputConnection0(a.GetOutputPort0()) # connects a's GetOutputPort0 port to b's SetInputConnection0

Finding Methods Via the Command Line

Persistence Package

How do I use the output of one workflow as the input for another using the persistence package?

You need to configure the persistence modules using the module's configuration dialog. After adding a PersistentOutputFile to the workflow, click on the triangle in the upper-right corner of the PersistentOutputFile, and select "Edit Configuration" from the menu that appears. In this dialog, select "Create New Reference" and give the reference a name (and any space-delimited tags). Upon running that workflow, the data will be written to the persistent store. In the second workflow where you wish to use that file, add a PersistentInputFile and go to its configuration dialog in the same manner as with the output file. In that dialog, select "Use Existing Reference" and select the data that you just added in the first workflow in the list of files below. Now, when you run that workflow, it will grab the data from the persistent store.

Here is an example: offscreen_persistent.vt. Run the "persistent offscreen" workflow first, and then run the "display persistent output" to use the output of the first workflow as the input for the second.

VTK

Given a VTK visualization, how can I generate a webpage from it?

Check out the html pipeline in offscreen.xml.

I'm trying to use VTK, but there doesn't seem to be any output. What is wrong?

To use VTK in VisTrails, you need to connect the renderer modules in a slightly different way. Instead of using the standard RenderWindow/RenderWindowInteractor infrastructure, you simply connect the renderer to a VTKCell. The examples directory in the distribution has several VTK examples that illustrate this.

I am trying to add a module to the workflow via Python, but how can I access vtk modules?

Here's an example:

import api

vtvtk = 'edu.utah.sci.vistrails.vtk'

module = api.add_module(0, 0, vtvtk, 'vtkContourFilter', )


The third argument in add_module is the package identifier. You can find this in the "Module Packages" panel of the Preferences; just click on the package you're interested in and it will appear in the information on the right.

matplotlib

I'm experiencing a problem with Latex labels and the matplotlib that comes with VisTrails 1.5. The script below entered to the interpreter that comes with VT is sufficient to reproduce it.

  import matplotlib.pyplot as plt
  plt.plot([1,2,3],[1,2,3])
  plt.xlabel("$foo$")

Remove your ~/.matplotlib folder and restart VisTrails.

rpy

Package rpy fails with "module object has no attribute RVector"

The rpy package needs to be updated to support a newer rpy version. In "packages/rpy/init.py", replace all instances of "objects.RVector" with "objects.Vector", or use this file.
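
The replacement can be done in any text editor. As a rough sketch, the helper below performs the same substitution on a file whose path you supply and keeps a .bak copy of the original. The function name replace_rvector and the backup behavior are assumptions made for this example, not part of VisTrails.

```python
import io
import shutil


def replace_rvector(path):
    """Replace "objects.RVector" with "objects.Vector" in the given file.

    Returns the number of replacements made. A backup copy is written to
    path + '.bak' before the file is modified.
    """
    with io.open(path, 'r', encoding='utf-8') as f:
        text = f.read()
    count = text.count('objects.RVector')
    if count:
        shutil.copyfile(path, path + '.bak')  # keep the original around
        with io.open(path, 'w', encoding='utf-8') as f:
            f.write(text.replace('objects.RVector', 'objects.Vector'))
    return count
```

For example, calling replace_rvector('packages/rpy/init.py') from the VisTrails source directory would apply the change described above; a return value of 0 means the file was left untouched.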

JobSubmission

The JobSubmission package depends on the stable version of BatchQ. Download https://github.com/troelsfr/BatchQ/archive/stable.zip and copy the "BatchQ-stable/batchq" directory to your local site-packages folder. Copying it to the "vistrails/packages/JobSubmission" folder should also work. See batchq/contrib/vistrails for examples.

VisTrails Development

I would like to build VisTrails from source. Are there instructions on how to do this?

Yes! Take a look at Installing VisTrails from source.

Accessing Provenance Information

How do I access the information in the execution log?

The code responsible for storing execution information is located in the "core/log" directories, and the code that generates much of that information is in "core/interpreter/cached.py". Modules can add execution-specific annotations to provenance via annotate() calls during execution, but much of the data (like timing and errors) is captured by the LogController and CachedInterpreter (the execution engine) objects. To analyze the log from a vistrail (.vt) file, you might have something like the following:

 import core.log.log
 import db.services.io
 def run(fname):
     # open the .vt bundle specified by the filename "fname"
     bundle = db.services.io.open_vistrail_bundle_from_zip_xml(fname)[0]
     # get the log filename
     log_fname = bundle.vistrail.db_log_filename
     if log_fname is not None:
         # open the log
         log = db.services.io.open_log_from_xml(log_fname, True)
         # convert the log from a db object
         core.log.log.Log.convert(log)
         for workflow_exec in log.workflow_execs:
             print 'workflow version:', workflow_exec.parent_version
             print 'time started:', workflow_exec.ts_start
             print 'time ended:', workflow_exec.ts_end
             print 'modules executed:', [i.module_id
                                         for i in workflow_exec.item_execs]
 if __name__ == '__main__':
     run("some_vistrail.vt")

You should be able to see what information is available by looking at the "core/log" classes. Accessing the Execution Log

VisTrails Binaries

Is there a Mac OS X 10.6+ x64 binary of VisTrails version 1.7 available?

We don't have a 64-bit Mac binary for the v1.7 release because, at the time, we didn't have 64-bit versions of the libraries shipped in the 1.7 binary.

However, it is possible to update a 64-bit binary (or any other binary) with a source release of VisTrails, including the 1.7 sources or the nightly builds.

Assuming you have the 1.7 sources in /vistrails1.7 and the 64-bit binary in /Applications/VisTrails1.7, run the following commands:

  cp /vistrails1.7/vistrails/vistrails.py  /Applications/VisTrails1.7/VisTrails.app/Contents/Resources
  cp -r /vistrails1.7/vistrails/api /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/core /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/db /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/gui /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/vistrails/packages /Applications/VisTrails1.7/VisTrails.app/Contents/Resources/lib/python2.7/
  cp -r /vistrails1.7/examples /Applications/VisTrails1.7/
  cp -r /vistrails1.7/extensions /Applications/VisTrails1.7/
  cp -r /vistrails1.7/scripts /Applications/VisTrails1.7/
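
If you prefer to script the update, the same copies can be sketched in Python. This is only an illustration: the function name update_bundle is hypothetical, the script assumes a separate Python 3.8 or newer interpreter (for the dirs_exist_ok option), and it expects the Resources directory of the bundle to exist already.

```python
import os
import shutil


def update_bundle(src_root, app_root):
    """Copy VisTrails sources from src_root into the bundle under app_root."""
    resources = os.path.join(app_root, 'VisTrails.app', 'Contents', 'Resources')
    lib = os.path.join(resources, 'lib', 'python2.7')

    # vistrails.py goes directly into Resources
    shutil.copy(os.path.join(src_root, 'vistrails', 'vistrails.py'), resources)

    # the source packages go into the bundled python2.7 library directory
    for name in ('api', 'core', 'db', 'gui', 'packages'):
        shutil.copytree(os.path.join(src_root, 'vistrails', name),
                        os.path.join(lib, name), dirs_exist_ok=True)

    # examples, extensions and scripts sit next to the application bundle
    for name in ('examples', 'extensions', 'scripts'):
        shutil.copytree(os.path.join(src_root, name),
                        os.path.join(app_root, name), dirs_exist_ok=True)
```

With the paths used above, the call would be update_bundle('/vistrails1.7', '/Applications/VisTrails1.7'). Existing files with the same names are overwritten, as with the cp commands.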

AVG Antivirus falsely reports a virus in the VisTrails 2.0.2 32-bit Windows installer

The problematic file is vistrails/Python27/Lib/site-packages/_mysql.pyd (see https://www.virustotal.com/en/file/d8aabd921b5eba8aabcce936ce3b92e3d1de43eb44c43d921ca1b9ab91d7fd81/analysis/1366640335/). This is most likely a false positive and can be ignored.

VisTrails fails after upgrading to OS X 10.9

Reinstalling XQuartz should solve the problem.