Difference between revisions of "Users Guide"

From VistrailsWiki
Tag: Manual revert
(29 intermediate revisions by 10 users not shown)
If you are using, or planning to use, VisTrails, please subscribe to the vistrails users mailing list. Details on how to do that are available [[Downloads|here]].
If you are using, or planning to use, VisTrails, please subscribe to the [[MailingLists|vistrails users mailing list]].

Up-to-date user's guides in online and pdf versions are available here:

{| class="wikitable" style="margin-left: 20px;"
|v2.2 (Current Release)
|[https://vistrails.org/usersguide/v2.2/html html]
|[https://vistrails.org/usersguide/v2.2/html/VisTrails.pdf pdf]
|-
|dev (Master Branch)
|[https://vistrails.org/usersguide/dev/html html]
|[https://vistrails.org/usersguide/dev/html/VisTrails.pdf pdf]
|}

Older Releases:
{| class="wikitable" style="margin-left: 20px;"
|v2.1
|[https://vistrails.org/usersguide/v2.1/html html]
|[https://vistrails.org/usersguide/v2.1/html/VisTrails.pdf pdf]
|-
|v2.0
|[https://vistrails.org/usersguide/v2.0/html html]
|[https://vistrails.org/usersguide/v2.0/html/VisTrails.pdf pdf]
|-
|v1.7
|[https://vistrails.org/usersguide/v1.7/html html]
|[https://vistrails.org/usersguide/v1.7/html/VisTrails.pdf pdf]
|}

== Getting_Started ==
VisTrails is available on Windows XP, Mac OS X, and Linux. These
versions all have the same functionality and only differ in user
interface as noted throughout this document.
There are different download options, available
[http://www.vistrails.org/index.php/Downloads here]. It is substantially
easier to start with a binary version, and this is encouraged for first-time
users. If you decided on a source version (maybe because a binary version
for your architecture is not available at this time), please follow the
instructions on building the software from source available
[http://www.vistrails.org/index.php/Building_From_Source here].
Starting up the binary version is system dependent. On Windows XP and Mac OS X, it requires clicking on the application icon. To start the source version on any system, change directory to "src/vistrails/trunk/vistrails/", where the "vistrails.py" file is located. You can then start VisTrails with the following command: "python vistrails.py -l".
Depending on a number of factors, it can take a few seconds for the system to start up. You will see a splash screen while that happens. On the console, you will see some messages that show the packages being loaded. On my Mac OS X system, I get the following:
  Initializing  vtk
  Initializing  pythonCalc
  Initializing  spreadsheet
  Loading Spreadsheet widgets...
    ==> Successfully import <Basic Widgets>
    ==> Successfully import <Image Viewer>
    ==> Successfully import <VTK Viewer>
    ==> Successfully import <HTML Viewer>
    ==> Successfully import <SVG Widgets>
Also, I get two separate windows, the VisTrails Builder:
[[Image:VisTrails_Builder.png | 750px]]
And the VisTrails Spreadsheet:
[[Image:VisTrails_Spreasheet.png | 750px]]
You are now ready to load a vistrail inside the system. Go to the Builder, and under "File",
there will be an "Open" option. After clicking it, you will be given a list of files, and you can
load any of the vistrails there. For instance, if you load the "vtk_book_3rd_p189.xml", your
screen will look like this:
[[Image:VisTrails_Builder_with_vtk_book_3rd_p189.xml.png | 750px]]
Each "oval" corresponds to a different workflow. If you click on "final", you can "execute" that workflow
by clicking on the execute workflow icon: [[Image:execute_workflow_icon.png | 50px]].
More details on interacting with the components of VisTrails are available below.
== VisTrails Builder ==
You can create and edit dataflows (workflows) using
the Vistrail Builder user interface.
The dataflow specifications are saved in
a repository that can be either local or remote. For now, we
only discuss local repository here, which are "xml" files by
A key feature of VisTrails is the support for
full provenance of the exploration process. For this,
we introduced the notion of a visual trail, or a
vistrail.  A vistrail captures the
evolution of a dataflow---all steps followed to construct a set of
workflows. It represents several versions of a dataflow (which
differ in their specifications), their relationships, and their
instances (which differ in the parameters used in each particular
execution). VisTrails uses a change-based model to capture provenance.
As the scientist makes modifications to a particular dataflow, the
provenance mechanism records those changes. 
Instead of storing a set of related dataflows, VisTrails stores the
operations or changes that are applied to the dataflows, e.g., the
addition of a module, the modification of a parameter, etc.
This representation is both simple and compact---it uses substantially
less space than the alternative of storing multiple versions of
a dataflow. In addition, it enables the construction of an intuitive
interface that allows scientists to both understand and interact with
the history of the dataflow through these changes.
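
The change-based model described above can be illustrated with a small standalone sketch. It is not the VisTrails implementation: the class names, the three operation names, and the use of 0 for the empty root workflow are all assumptions made for this example. It only shows how storing changes, rather than whole dataflows, still lets any version be rebuilt.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One recorded change; `parent` is the version it was applied to."""
    version: int
    parent: int   # 0 denotes the empty root workflow (an assumption)
    op: str       # "add_module", "delete_module" or "set_parameter"
    args: tuple


@dataclass
class Vistrail:
    """Hypothetical container that stores only the changes, not the workflows."""
    actions: dict = field(default_factory=dict)   # version id -> Action

    def record(self, parent, op, *args):
        """Store a change applied to `parent` and return the new version id."""
        version = len(self.actions) + 1
        self.actions[version] = Action(version, parent, op, args)
        return version

    def materialize(self, version):
        """Rebuild a workflow by replaying the changes from the root."""
        chain = []
        while version != 0:
            action = self.actions[version]
            chain.append(action)
            version = action.parent
        workflow = {}   # module name -> {parameter name: value}
        for action in reversed(chain):
            if action.op == "add_module":
                workflow[action.args[0]] = {}
            elif action.op == "delete_module":
                del workflow[action.args[0]]
            elif action.op == "set_parameter":
                module, name, value = action.args
                workflow[module][name] = value
        return workflow


trail = Vistrail()
v1 = trail.record(0, "add_module", "vtkCylinder")
v2 = trail.record(v1, "set_parameter", "vtkCylinder", "SetRadius", 0.5)
v3 = trail.record(v1, "set_parameter", "vtkCylinder", "SetRadius", 2.0)  # branch from v1
print(trail.materialize(v2))
print(trail.materialize(v3))
```

Versions v2 and v3 share the stored action for v1, so branching costs one extra action rather than a full copy of the workflow.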
A tree-based view allows a scientist to return to a previous version
in an intuitive way; to undo bad changes; to compare different
dataflows; and be reminded of the actions that led to a particular
The Builder is roughly divided into three main regions. On the left is
a list of the modules that can be used for building workflows.
The middle is the main workflow interaction area, which can be
toggled to display an instance of a workflow, or the vistrail, which
corresponds to the collection of a number of different instances. On
the right, there is a context-sensitive menu that can be used for
operating on the properties of what is being shown in the middle.
== VisTrails Modules ==
Modules are the basic building blocks of workflows. In general, each
module has a number of inputs, a number of outputs, and a set of
parameters that can be configured by the user. A workflow is built by
putting a collection of modules together, to achieve a desirable
The list of modules available depends on the packages that were loaded
by the user. They appear on the left side of the builder, and when the
builder is in "pipeline" mode, modules can be dragged from the left,
and placed into the workflow. Once modules have been moved, their
inputs and outputs can be connected, as long as they have appropriate
To change the parameters of a module, first click on the
module. All the methods that can set module
parameters will then appear on the panel on the right. Once you have
determined the method that you want to use, drag the
method to the "Properties" panel, which is directly below the
"Methods" panel. At this point, you can select the text edit boxes in
the panel and type in a value.  The labels to the left of each text
edit box indicate the parameter input type (double: a number with a
decimal point, or int: a whole number) and the name of the
parameter. When a module is changed, a new instance of the workflow
with the changed parameters is added to the vistrail. (If PIP is
turned on, you will see the change immediately in the version tree.)
=== Using the VisTrails Version Tree ===
As you make changes to the modules of a workflow, the instances
are automatically added to the vistrail.  This allows you to go back
to a previous version (higher up in the tree), and use a different set
of parameters to modify the data without losing any of the changes you
have already made.
It is possible to build a very large number of different versions quite quickly with
the system. To help the user, the system provides a number of
different ways to choose which workflows to show in the version tree. The default
is to show only "named" workflows and the ones that are at the leaves.
All other ones are collapsed by default.
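
As a rough illustration of that default, the following standalone sketch selects the versions that would stay visible: those with a tag and those without children. The function name and the dictionary layout (version id mapped to parent id, with 0 as the root) are assumptions for this example, not part of VisTrails.

```python
def visible_versions(parents, tags):
    """Return the versions shown by default: tagged ("named") ones and leaves.

    parents maps version id -> parent id (0 is the root);
    tags maps version id -> name, for the versions that have one.
    """
    has_child = set(parents.values())
    return sorted(v for v in parents if v in tags or v not in has_child)


# A small tree: 1 -> 2 -> {3, 4}, and 4 -> 5. Only version 2 is named.
parents = {1: 0, 2: 1, 3: 2, 4: 2, 5: 4}
tags = {2: "modules"}
print(visible_versions(parents, tags))  # [2, 3, 5]
```

Versions 1 and 4 are neither named nor leaves, so they are the ones that would be collapsed.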
To name a workflow, you need to select it while in Version Tree mode. When that is
done, the panel on the right will have a "Version Tag" text box that can be used
for naming the workflow.
Type in a name
for the workflow and select the Change button to the right.  This will
place that name on the selected workflow in the version tree.
=== Working With Modules ===
Modules are
connected with lines that represent the dataflow connections between them.
Modules can be connected or disconnected, and added or deleted from a
To see how this works, we will change the original data from the
vtkQuadric module to a vtkCylinder module in the workflow labeled "final"
in the vtk_book_3rd_p189.xml vistrail.
==== Creating a New Module ====
Since there are hundreds of modules, the easiest way to find the module we
want is to search for it. On the left panel, where all the modules are listed, there
is a "Search" text box on top. As you type "vtkcyli", the system will automatically
filter the modules, and at some point you will be able to see the vtkCylinder
module. Alternatively, you could find the module by looking through
the modules one by one. You can now drag the module to the pipeline.
==== Creating a Dataflow From Modules ====
To change the data source from vtkQuadric to
vtkCylinder, you need to replace the output of the first with that of the
second.  Notice that the line connecting each of the modules starts
and ends in a small box at the top or bottom of the modules.
To disconnect the vtkQuadric output from
vtkSampleFunction, you can select the connection line, and
delete it.
To connect the vtkCylinder output to the
vtkSampleFunction input, place the cursor over the small box in
the lower right corner of the vtkCylinder module, click and
hold down the mouse button.  Drag the
cursor away from vtkCylinder and a line will appear.  Drag the
end of the line to the leftmost small input box in the upper left
corner of the vtkSampleFunction module and release the mouse
button.  The line now connects vtkCylinder and vtkSampleFunction.
To check that you were successful, just execute
the pipeline, and the result should appear on the
The data in the cell
shows a series of cylindrical shapes.
The input ports of the module will only accept connections from
correct output ports.  Dropping a connection on a module will cause it
to snap to the nearest appropriate port.  However, when a module
accepts multiple ports of the same type, care must be taken as to how
the connection is made.  The easiest way to ensure proper connectivity
is to begin the connection at the module with multiple ports of the
same type and drag it to the appropriate endpoint.  To determine the
exact port to begin at, simply hover the mouse cursor over the port to
query and a small note will be displayed with information about the
port in question.
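
The snapping behaviour described above can be sketched in a few lines of standalone Python. This is an illustration under stated assumptions, not VisTrails code: the Port class, the port names, and the coordinates are hypothetical, and type compatibility is simplified to an exact match on the type name.

```python
import math
from dataclasses import dataclass


@dataclass
class Port:
    name: str        # hypothetical port name
    type_name: str   # simplified: compatibility is an exact type-name match
    x: float
    y: float


def snap_to_port(output_port, input_ports, drop_x, drop_y):
    """Pick the type-compatible input port closest to the drop position."""
    candidates = [p for p in input_ports if p.type_name == output_port.type_name]
    if not candidates:
        return None  # no port on this module accepts the connection
    return min(candidates, key=lambda p: math.hypot(p.x - drop_x, p.y - drop_y))


source = Port("value", "String", 0.0, 50.0)
inputs = [
    Port("first", "String", 0.0, 0.0),
    Port("second", "String", 10.0, 0.0),
    Port("third", "Float", 20.0, 0.0),
]
print(snap_to_port(source, inputs, 9.0, 0.0).name)   # second
print(snap_to_port(source, inputs, 19.0, 0.0).name)  # second (the Float port is skipped)
```

The second call shows why care is needed with several ports of the same type: the drop position alone decides which of the two String ports receives the connection.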
==== Accessing Module Parameters ====
You will notice that when you select the vtkCylinder component
in the visualization panel on the left, there are no parameters to
adjust in the lower panel on the right.  To prevent clutter, only the
parameters that have been modified by the user are displayed.
To modify a parameter from its default setting,
click on the word
SetRadius and continue to hold down
the mouse button. 
SetRadius becomes highlighted.  With the
mouse button still held down, drag the cursor to the "Properties" panel directly
below.  A parameter text edit box is shown for
SetRadius.  You can enter a new radius size for the
vtkCylinder component.
== VisTrails Spreadsheet ==
== Advanced Topics ==
=== [[UsersGuideVisTrailsPackages | VisTrails Packages]] ===
== Examples ==
=== Working with Web Services and HTML ===
In this example, we will build a very simple pipeline that invokes a web service and publishes its results on a html page.
We will build a page displaying the exchange currency rates of three countries.
In order to do that, first you have to enable the webServices package in your <code>startup.py</code> file by adding the following lines:
<code>wsdlList</code> is a list of URLs, each pointing to a WSDL (Web Services Description Language) document that describes a Web service. If you know other URLs and want to use them in VisTrails, you can add them to this list.
The webService package will generate a module for each published method in a web service. In our example, we will be interested in the <code>getRate</code> method of CurrencyExchangeService.
;Creating a new VisTrail
Start VisTrails. On the Builder Window, click on the '''New''' button at the top left of the toolbar to create a new VisTrail.
The Builder Window will now look like this:
;Adding Modules to the Pipeline
At this point, we must begin to add modules to our pipeline.
In the '''Search''' field of the '''Modules Panel''' (the rightmost pane of the Builder Window), type in ''getRate''. You will notice that a module under the WebService branch will be selected. Now you can add it by clicking and dragging it (on Mac, also hold the '''alt''' key) over the Pipeline View area, represented by the darker grey canvas on the Builder Window. This module acquires the exchange rate between two countries. We need to add more modules that provide input for the getRate module and other modules that will receive its output. In the same way described above, add the following modules (the number in parentheses is the number of modules you should add):
* String (3)
* getRate (1)
* PythonSource (1)
* RichTextCell (1)
After adding these modules, your builder should be similar to the one shown below.
The modules described above were added to the pipeline view, but remain unconnected. This is a good point for us to save our work. First we will name this pipeline ''modules''. To do that, we need to switch to the version tree view by selecting the '''Version Tree''' tab in the top pane on the right of the builder window. In the '''Version Tag''' field, type in ''modules'' and click on '''Change'''. Your version tree should now look similar to this:
Now we save our work by clicking on the '''Save''' button (the third from left to right on the Builder Toolbar) or pressing Ctrl+S (Command + S on Mac). Give a name to your file, such as <code>tutorial_webservice_html.xml</code>. Then we can go back to the pipeline view by selecting the '''Pipeline''' tab.
;Connecting Modules in the Pipeline
Now that we have all the modules necessary to process our data, we must
connect them properly to fully form our processing pipeline. Each module
box has a set of input ports, located in the upper-left hand corner of the box,
and a set of output ports, located in its lower-right hand corner.  In order to connect two modules
together, click-and-drag the appropriate output box contained in the module
to the module using it as its input. For modules that have more than one input/output,
you can see the type of each individual port when hovering the mouse (as a tooltip). VisTrails will also snap a connection to matching ports.
So, for example, by clicking and dragging
the rightmost output port of the String module to one of the input ports of getRate module,
a connection will be made between the two modules. This connection is
indicated by a solid black line. Now we must continue
connecting our pipeline. Add connections between the following modules:
* The output port 2 of the leftmost '''String''' to the input port 1 of the leftmost '''getRate'''
* The output port 2 of the middle '''String''' to the input port 2 of the leftmost '''getRate'''
* The output port 2 of the middle '''String''' to the input port 1 of the rightmost '''getRate'''
* The output port 2 of the rightmost '''String''' to the input port 2 of the rightmost '''getRate'''
Your pipeline should look like this:
Let's give the name ''first connections'' to this pipeline. Repeat the steps we performed above to change a version tag.
;Module Customization and Parameterization
You may have noticed that the PythonSource module does not have any input or output ports. This module is designed to contain any piece of Python code, so as pipeline builders we must define the input and output ports and supply the code that manipulates the inputs and generates the outputs. We are going to use this module to create our html page.
First, double-click on the PythonSource module. A new window called '''PythonSource Configuration''' will be shown.
Add two input ports, <code>value1</code> and <code>value2</code>, both of type <code>Float</code> (the same as the output type of the getRate module).
Add an output port, <code>htmlFile</code>, of type <code>File</code>.
Now type the following code in the text area:
  import datetime
  value1 = self.getInputFromPort("value1")
  value2 = self.getInputFromPort("value2")
  text = '<nowiki><HTML><TITLE>Exchange Currency Rate Table </TITLE><BODY BGCOLOR="#FFFFFF"></nowiki>'
  text += '<nowiki><TABLE WIDTH="70%" BORDER="1" BGCOLOR="#FFFFFF" CELLPADDING="4"> </nowiki>'
  text += '<nowiki><TR><TD>USA</TD><TD>Brazil</TD><TD><B></nowiki>'
  text += str(value1)
  text += '<nowiki></B></TD></TR><TR><TD>Brazil</TD><TD>China</TD><TD><B></nowiki>'
  text += str(value2)
  text += '<nowiki></B></TD></TR></nowiki>'
  text += '<nowiki><TR><TD colspan="3" align="right">Last updated on: </nowiki>'
  text += datetime.datetime.now().strftime("%m-%d-%Y %I:%M:%S")
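  text += '<nowiki></TD></TR></TABLE></BODY></HTML></nowiki>'
  output = self.interpreter.filePool.create_file()
  my_file = open(str(output.name), 'w')
  # Write the page and close the file so that its content is complete on disk.
  my_file.write(text)
  my_file.close()
  # Assumption: setResult is the output-port counterpart of getInputFromPort.
  self.setResult("htmlFile", output)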
The PythonSource Configuration should look similar to the one below.
Explaining the Python language is outside the scope of this example, but the important lines for PythonSource are the following:
  value1 = self.getInputFromPort("value1")
  value2 = self.getInputFromPort("value2")
They are used to get data from input ports and to send data to output ports. You should use the same name you typed in the fields on the top of this window.
Click on the Ok button. This is another good point to save our work.
Now our PythonSource module is ready to connect to both '''getRate''' and to '''RichTextCell''' modules. Add the following connections in the same way described above:
* The output port of the leftmost '''getRate''' to the input port 1 of '''PythonSource'''
* The output port of the rightmost '''getRate''' to the input port 2 of '''PythonSource'''
* The output port of '''PythonSource''' to the input port 1 of '''RichTextCell'''
At this point, we have a connected pipeline, but we still need to set a few parameters in order for getRate to work properly. '''getRate''' receives two strings (country names) and returns a float (the rate). We will set the country names in the '''String''' modules.
Parameters for the pipeline are
set on a ''per-module basis''. So, in order to tune a parameter, select the module
containing the parameter to change by left clicking on it. You will notice it is
highlighted in yellow when it is selected. Now, once the appropriate module
is selected, click on the tab in the rightmost pane labelled '''Methods'''.
Once the '''Methods''' pane is active, a list of all methods supported by
this module is displayed. To change a parameter in this module, select the
method governing the parameter to be changed, in this case the value method, and drag it to the '''Properties''' pane (right below). This module
should have the parameter of type String set to the value ''usa''.
Change the value methods of the other String modules to '''brazil''' and '''china''', respectively.
At this point, your Builder Window should look similar to the one shown
Now that the modules are connected and have a working set of parameters, the pipeline is ready to be visualized. Pressing the '''Execute Current pipeline'''
button ([[Image:Execute_workflow_icon.png]]) will send the current pipeline with the set parameters to the VisTrails
Spreadsheet, with the following result:
You may also want to download the [http://www.vistrails.org/images/Tutorial_webservice_html.xml vistrails file] corresponding to this example.
If you want to change the country, England instead of China for example, you have to re-set the parameter in the String module, and the hardcoded name in the PythonSource, which is not desirable. As an exercise, receive the names of the countries also as input ports and replace the hardcoded names.
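
One possible starting point for this exercise is sketched below as standalone Python, so that it can be tried outside VisTrails. The function name, its parameters, and the rates in the usage lines are placeholders chosen for illustration. Inside PythonSource, the country names would instead come from additional String input ports read with <code>self.getInputFromPort</code>, in the same way as <code>value1</code> and <code>value2</code>.

```python
import datetime


def build_rate_table(pairs, now=None):
    """Build the HTML page from (from_country, to_country, rate) tuples."""
    now = now or datetime.datetime.now()
    text = '<HTML><TITLE>Exchange Currency Rate Table</TITLE><BODY BGCOLOR="#FFFFFF">'
    text += '<TABLE WIDTH="70%" BORDER="1" BGCOLOR="#FFFFFF" CELLPADDING="4">'
    for source, target, rate in pairs:
        # The country names are parameters here instead of hardcoded strings.
        text += '<TR><TD>%s</TD><TD>%s</TD><TD><B>%s</B></TD></TR>' % (source, target, rate)
    text += '<TR><TD colspan="3" align="right">Last updated on: '
    text += now.strftime("%m-%d-%Y %I:%M:%S")
    text += '</TD></TR></TABLE></BODY></HTML>'
    return text


# Placeholder rates for illustration only; in the pipeline they come from getRate.
html = build_rate_table([("usa", "brazil", 1.5), ("brazil", "england", 0.25)])
print(html)
```

With this structure, changing China to England only requires changing the value of one String module, because the same string feeds both getRate and the table.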
== Summary of Technical Terms ==

Latest revision as of 21:01, 3 May 2022

If you are using, or planning to use, VisTrails, please subscribe to the vistrails users mailing list.

Up-to-date user's guides in online and pdf versions are available here:

v2.2 (Current Release) html pdf
dev (Master Branch) html pdf

Older Releases:

v2.1 html pdf
v2.0 html pdf
v1.7 html pdf