Difference between revisions of "Main Page"
(→News)
(Update to reflect the new VisTrailsJL)
(19 intermediate revisions by 6 users not shown)
= VisTrails =
'''VisTrails''' is an open-source scientific workflow and provenance management system developed at the [https://vida.engineering.nyu.edu/ VIDA Center] at New York University. It supports computational science by capturing and managing the complete history of the exploratory process: the workflows, their executions, and the results they produce.
VisTrails is actively developed again. The new version, '''[[VisTrailsJL]]''', is a complete reimplementation in [https://julialang.org/ Julia] that brings modern performance, notebook-based workflow authoring, and native compatibility with existing <code>.vt</code> files. See the [https://github.com/VIDA-NYU/VisTrailsJL GitHub repository] to get started.
== What's New ==
After a hiatus since 2018, VisTrails is back. '''VisTrailsJL''' (v2.2) is a ground-up reimplementation in Julia that preserves what made the original system valuable (comprehensive provenance, visual workflow management, and support for real scientific use cases) while modernizing the foundation:
* '''Julia reimplementation''': Julia's JIT compilation brings performance suitable for demanding scientific workflows, and its rich ecosystem (DataFrames.jl, DifferentialEquations.jl, Plots.jl) is a natural fit.
* '''Notebook-based workflow authoring''': Workflows can now be defined directly in Jupyter notebooks using simple <code>#|</code> directives, with no GUI required.
* '''Full <code>.vt</code> compatibility''': Existing workflows created with the Python version can be loaded, replayed, and visualized without modification.
* '''Git-native version control''': Standard git replaces the custom versioning infrastructure for workflow history.
* '''Python interoperability''': Existing Python modules and libraries remain accessible via PyCall.jl.
The original Python codebase (v2.2) is preserved in the repository for reference and compatibility testing.
; Quick links
: [https://github.com/VIDA-NYU/VisTrailsJL GitHub (VisTrailsJL)] | [[Documentation]] | [[Publications, Tutorials and Presentations]] | [[MailingLists|Mailing Lists]]
== Core Features ==
=== Provenance and Workflow History ===
A defining feature of VisTrails is its '''comprehensive provenance infrastructure'''. Unlike systems that track only the current state of a workflow, VisTrails maintains the full history of every step taken during an exploratory analysis: what was tried, what was changed, and what results each version produced. This enables users to:
* Navigate and compare workflow versions in an intuitive tree interface
* Undo changes without losing intermediate results
* Visually diff two workflows and their outputs side by side
* Reproduce any prior result exactly, long after it was first computed
Provenance information is stored as XML or in a relational database (Python version), or managed via standard git (Julia version).
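One way to picture the history model described above is a tree in which each version records a single change relative to its parent, so that any earlier state can be rebuilt by replaying changes from the root. The following Python sketch illustrates that idea only. <code>VersionTree</code>, <code>commit</code>, and <code>materialize</code> are hypothetical names, not part of the VisTrails or VisTrailsJL API, and the storage formats mentioned above are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class VersionTree:
    """Hypothetical change-based history: one action and one parent per version."""
    parents: Dict[int, Optional[int]] = field(default_factory=dict)
    actions: Dict[int, Tuple] = field(default_factory=dict)
    next_id: int = 1

    def commit(self, parent: Optional[int], action: Tuple) -> int:
        """Record an action as a new child of `parent` and return its version id."""
        vid = self.next_id
        self.next_id += 1
        self.parents[vid] = parent
        self.actions[vid] = action
        return vid

    def materialize(self, vid: int) -> Dict[str, Dict[str, object]]:
        """Rebuild the workflow at `vid` by replaying actions from the root."""
        path = []
        cur: Optional[int] = vid
        while cur is not None:
            path.append(cur)
            cur = self.parents[cur]
        workflow: Dict[str, Dict[str, object]] = {}
        for v in reversed(path):
            kind, module, *rest = self.actions[v]
            if kind == "add":
                workflow[module] = {}
            elif kind == "delete":
                workflow.pop(module, None)
            elif kind == "set":
                param, value = rest
                workflow[module][param] = value
        return workflow


tree = VersionTree()
v1 = tree.commit(None, ("add", "reader"))
v2 = tree.commit(v1, ("set", "reader", "path", "data.csv"))
v3 = tree.commit(v2, ("add", "plot"))
v4 = tree.commit(v2, ("add", "histogram"))  # branches from v2; v3 is untouched

print(tree.materialize(v3))
print(tree.materialize(v4))
```

Because nothing is overwritten, going back to <code>v2</code> and trying a different module creates a sibling branch rather than destroying <code>v3</code>, which is the property the list above relies on.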
=== Building and Running Workflows ===
VisTrails supports workflows expressed as '''dataflows''', with support for functional loops and conditional branching. Workflows can be run interactively through the GUI or in batch mode via a server. The system is designed to connect loosely coupled resources: specialized libraries, web services, and grid computing infrastructure.
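As a rough illustration of what executing a dataflow involves, the Python sketch below runs each module only after the modules it depends on have produced their outputs. It is a minimal model under stated assumptions: <code>run_dataflow</code> and the module names are hypothetical, loops and conditional branching are not modeled, and nothing here reflects the actual VisTrails interpreter.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dataflow(modules, connections):
    """Execute modules in dependency order.

    modules: name -> function taking a dict of upstream outputs keyed by module name
    connections: iterable of (source, target) edges
    """
    deps = {name: set() for name in modules}
    for source, target in connections:
        deps[target].add(source)

    outputs = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        inputs = {src: outputs[src] for src in deps[name]}
        outputs[name] = modules[name](inputs)
    return outputs


modules = {
    "a": lambda inputs: 48,
    "b": lambda inputs: 18,
    "gcd": lambda inputs: math.gcd(inputs["a"], inputs["b"]),
}
connections = [("a", "gcd"), ("b", "gcd")]

results = run_dataflow(modules, connections)
print(results["gcd"])
```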
In VisTrailsJL, workflows can also be defined declaratively in Jupyter notebooks:
<pre>
#| workflow: my_analysis
#| module-id: input
#| module-type: basic:Integer
#| params:
#| - value: 42
#| module-id: process
#| module-type: mypackage:Transform
#| inputs:
#| - value: input.value
#| execute
</pre>
Packages and modules are easy to add. The <code>JuliaSource</code> and <code>PythonSource</code> module types allow custom code to be embedded directly in a workflow without creating a full package.
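To make the <code>#|</code> directive format shown earlier in this section more concrete, the Python sketch below pulls directive lines out of a notebook cell and splits each into a key and an optional value. This is an assumption about how such lines could be read, not the VisTrailsJL parser: <code>parse_directives</code> is a hypothetical helper, and it flattens list items (the <code>- </code> prefix) instead of rebuilding any nesting.

```python
def parse_directives(cell_source):
    """Collect `#|` lines from a cell as (key, value) pairs.

    Bare keys such as `execute` and section keys such as `params:` get value None.
    """
    directives = []
    for line in cell_source.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#|"):
            continue  # ordinary code or comment line
        body = stripped[2:].strip()
        if body.startswith("- "):
            body = body[2:]  # drop the list-item marker (nesting is not tracked)
        # Split on the first colon only, so values like `basic:Integer` stay whole.
        key, sep, value = body.partition(":")
        value = value.strip()
        directives.append((key.strip(), value if sep and value else None))
    return directives


cell = """\
#| workflow: my_analysis
#| module-id: input
#| module-type: basic:Integer
#| params:
#| - value: 42
x = 1
#| execute
"""

for key, value in parse_directives(cell):
    print(key, value)
```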
=== Publishing Reproducible Results ===
VisTrails 2.0 introduced support for embedding reproducible results directly in LaTeX/PDF documents via a companion LaTeX package. A figure in a compiled PDF becomes active: clicking it invokes VisTrails and re-executes the workflow that produced it on any machine with the software installed.
<pre>
\usepackage{vistrails}
\begin{figure}
\begin{center}
\subfigure[a=0.9]{\vistrail[filename=alps.vt, version=2, pdf]{width=8cm}}
\caption{Clicking this figure retrieves and re-runs the workflow that produced it.}
\end{center}
\end{figure}
</pre>
=== Querying and Refining Workflows ===
Users can construct expressive queries over a collection of workflows using the same interface used to build them. An '''analogy mechanism''' allows complex modifications to be applied to one workflow by example from another, without manually editing workflow specifications. This is useful when a family of related analyses needs to evolve together.
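A heavily simplified way to think about querying a workflow collection is to describe a small pattern of connected module types and return every workflow that contains it. The Python sketch below does only that: it checks typed edges rather than performing full subgraph matching, it does not model the analogy mechanism, and the data layout and the names <code>matches</code> and <code>query</code> are hypothetical.

```python
def matches(workflow, pattern):
    """True if every (source_type, target_type) edge in `pattern` occurs in `workflow`."""
    types = workflow["types"]
    typed_edges = {(types[src], types[dst]) for src, dst in workflow["connections"]}
    return pattern <= typed_edges


def query(collection, pattern):
    """Return the names of all workflows in `collection` that contain `pattern`."""
    return sorted(name for name, wf in collection.items() if matches(wf, pattern))


collection = {
    "analysis_a": {
        "types": {"r": "Reader", "f": "Filter", "p": "Plot"},
        "connections": {("r", "f"), ("f", "p")},
    },
    "analysis_b": {
        "types": {"r": "Reader", "p": "Plot"},
        "connections": {("r", "p")},
    },
}

print(query(collection, {("Filter", "Plot")}))
print(query(collection, {("Reader", "Plot")}))
```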
=== Visualizing and Comparing Results ===
VisTrails provides a '''spreadsheet view''' for comparing the results of multiple workflows or multiple parameterizations of the same workflow side by side. The visual diff interface highlights structural differences between two workflow versions. Workflows and their version trees can be rendered as SVG (VisTrailsJL) or displayed on large-format display walls.
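The structural differences that a visual diff highlights can be thought of as set differences over modules and connections. The Python sketch below computes such a diff for two workflow versions represented as plain dictionaries. The representation and the name <code>diff_workflows</code> are assumptions made for illustration and do not describe how VisTrails computes or renders its diffs.

```python
def diff_workflows(old, new):
    """Report modules and connections added, removed, or changed between two versions.

    Each workflow is {"modules": {id: {"type": str, "params": dict}},
                      "connections": set of (source_id, target_id)}.
    """
    old_m, new_m = old["modules"], new["modules"]
    shared = old_m.keys() & new_m.keys()
    return {
        "added_modules": sorted(new_m.keys() - old_m.keys()),
        "removed_modules": sorted(old_m.keys() - new_m.keys()),
        "changed_modules": sorted(k for k in shared if old_m[k] != new_m[k]),
        "added_connections": sorted(new["connections"] - old["connections"]),
        "removed_connections": sorted(old["connections"] - new["connections"]),
    }


version_1 = {
    "modules": {
        "input": {"type": "basic:Integer", "params": {"value": 42}},
        "process": {"type": "mypackage:Transform", "params": {}},
    },
    "connections": {("input", "process")},
}
version_2 = {
    "modules": {
        "input": {"type": "basic:Integer", "params": {"value": 7}},
        "process": {"type": "mypackage:Transform", "params": {}},
        "plot": {"type": "mypackage:Plot", "params": {}},
    },
    "connections": {("input", "process"), ("process", "plot")},
}

changes = diff_workflows(version_1, version_2)
for kind, items in changes.items():
    print(kind, items)
```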
== Getting Started ==
=== VisTrailsJL (Julia, current) ===
<pre>
# Clone the repository
git clone https://github.com/VIDA-NYU/VisTrailsJL.git
cd VisTrailsJL/julia
# Install dependencies
julia --project=. -e 'using Pkg; Pkg.instantiate()'
# Load and render an existing workflow
julia --project=. -e '
using VisTrailsJL
vt = load_vistrail("../examples/gcd.vt")
workflow = get_pipeline(vt)
render_pipeline_svg(workflow, "workflow.svg")
'
</pre>
See the [https://github.com/VIDA-NYU/VisTrailsJL/blob/v2.2/julia/QUICKSTART.md Quickstart Guide] for a full walkthrough.
=== Python VisTrails (legacy reference) ===
The original Python version (v2.2, requires Python 2 / PyQt4) is preserved in the repository for reference and for loading existing <code>.vt</code> files in legacy environments.
<pre>
# GUI mode
python vistrails/run.py
# Batch mode
python vistrails/run.py --batch [options]
</pre>
== Projects Using VisTrails ==
VisTrails has supported real scientific workflows across a wide range of domains. The following projects reflect the breadth of communities that have relied on the system.
{| class="wikitable"
|-
! USGS Habitat Modeling
! NASA Climate Data Analysis
! DOE CDAT
|-
| [[Image:usgs.png|200px|left]]
| [[Image:nasa.png|200px|left]]
| [[Image:cdat.png|200px|left]]
|}
{| class="wikitable"
|-
! ALPS Simulations
! NSF STC CMOP
! NSF CDI Wildfire
|-
| [[Image:alps-shot.png|200px|left]]
| [[Image:cmop-ss.png|200px|left]]
| [[Image:wildfire.png|200px|center]]
|}
{| class="wikitable"
|-
! NSF DataONE-EVA
|-
| [[Image:eva.png|200px|left]]
|}
[https://vistrails.org/index.php/Projects_using_VisTrails See all projects using VisTrails]
== VisTrails in Teaching ==
VisTrails has been used as a teaching tool in courses on Scientific Visualization and Digital Media. Its provenance infrastructure makes it particularly effective in educational settings, where capturing and comparing student workflows provides rich feedback for instructors and learners alike.
Our [http://www.cs.utah.edu/~juliana/pub/vistrails-teaching-eurographics2010.pdf paper] describing a provenance-rich teaching methodology received the '''Best Paper Award''' at Eurographics 2010 Education.
[[Vistrails and Teaching|More on VisTrails and Teaching]]
== System Documentation ==
* [[Documentation|Documentation overview]]
* [https://github.com/VIDA-NYU/VisTrailsJL/blob/v2.2/julia/README.md VisTrailsJL README]
* [https://github.com/VIDA-NYU/VisTrailsJL/blob/v2.2/julia/QUICKSTART.md Quickstart Guide]
* [https://github.com/VIDA-NYU/VisTrailsJL/blob/v2.2/julia/docs/IMPLEMENTATION_STATUS.md Implementation Status]
* [[FAQ]]
* [[Users_Guide|Python User's Guide (legacy)]]
To report bugs or request features, please use the [https://github.com/VIDA-NYU/VisTrailsJL/issues issue tracker].
For questions not covered by the documentation, post to the [https://vistrails.org/index.php/MailingLists mailing list].
== Citing VisTrails ==
If you use VisTrails or VisTrailsJL in your research, please cite the relevant work:
'''Original VisTrails system:'''
<pre>
@inproceedings{vistrails2006,
  title = {VisTrails: visualization meets data management},
  author = {Callahan, Steven P and Freire, Juliana and Scheidegger,
            Carlos E and Silva, Cl{\'a}udio T and Vo, Huy T},
  booktitle = {Proceedings of the 2006 ACM SIGMOD International Conference
               on Management of Data},
  pages = {745--747},
  year = {2006},
  doi = {10.1145/1142473.1142574}
}
</pre>
'''VisTrailsJL (Julia reimplementation):'''
<pre>
@software{vistrailsjl2025,
  title = {VisTrailsJL: A Julia Implementation of VisTrails},
  author = {Silva, Claudio T},
  year = {2025},
  url = {https://github.com/VIDA-NYU/VisTrailsJL}
}
</pre>
[[Publications, Tutorials and Presentations|Full publication list]]
== People ==
[[People]]
== Sponsors ==
This work has been supported in part by the National Science Foundation under grants
[http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0905385 IIS-0905385],
[http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0844572 IIS-0844572],
[http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0746500 IIS CAREER-0746500],
[http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0751152 CNS-0751152],
[http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0513692 IIS-0513692],
[http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0401498 CCF-0401498],
and others; by the Department of Energy under the SciDAC program (SDM, VACET, and UV-CDAT);
and by IBM Faculty Awards (2005–2008) and a University of Utah Seed Grant.
== Related ==
[[BirdVis]] |
[http://www.crowdlabs.org CrowdLabs] |
[[RepeatabilityCentral]] |
[[ProvenanceAnalytics]] |
[[Provenance: potpourri]]
Latest revision as of 22:15, 23 April 2026