Development/2012


Dec 19, 2012

Updates

  • [DK] USGS Work:
    • Modules that use package settings
    • Modules that do not update when input ports are not set
  • [DK] Reproducibility Workshop
  • [TE] Using a vistrail to test VisTrails (the approach adopted by UV-CDAT)
  • [TE] Parallel VisTrails (Remote PBS)
    • Trying to add PBS support to BatchQ
    • Have a working PBS server for testing
  • [BB] UV-CDAT Bug status
    • 2 critical bugs and 5 minor bugs/enhancements remaining

Items to Discuss

  • iPython-style faceted output:
    • iPython has a very nice feature where output can have multiple facets, meaning the output in a command-line environment is different from that in a GUI environment
    • Basically, an output is actually a dictionary where the keys are mime-types and each value is the corresponding output for that type
    • A similar approach would be very useful for cases where we wish to run a VTK workflow that on the GUI appears in the spreadsheet cell but on the command-line generates a file, and in crowdLabs uses Wendel's WebGL solution
    • That way, a user writes one workflow that runs everywhere
    • However, I think the output dictionary should map mime-type -> function, since we don't want to produce all the outputs if we only need one at a time
    • Could also have modules that look up the output types and could be used for configuring output types.
    • Should use the same cell names for output
  • Merge refactor-add-vistrails-prefix branch?
    • Want to discuss with Rémi
  • Python on menubar for UV-CDAT
    • [ES] Only way to change this is to change this in a plist directory of the python being run
    • [ES] Sent this to Charles at some point, will forward to Ben
  • Global variables in UV-CDAT
    • Can have multiple projects open at once, each should have its own set of variables
    • Possibility of copying variable from one project to another?
  • NASA work
    • use batchq and pbs, batchq uses lsf instead of pbs
    • batchq uses suspended state
    • execute workflow and run in the background
  • Thumbnails and mime type issue
    • python2.7 issue?
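
For the faceted-output item above, the mime-type -> function idea can be sketched as follows. This is only an illustration under stated assumptions, not VisTrails or IPython code: the class name `FacetedOutput`, its methods, and the two example renderers are hypothetical. Each facet is stored as a zero-argument function, so only the representation that is actually requested gets computed.

```python
class FacetedOutput(object):
    """Hypothetical container mapping mime-type -> function producing that output."""

    def __init__(self):
        self._facets = {}   # mime-type -> zero-argument callable
        self._cache = {}    # mime-type -> value that was already computed

    def add_facet(self, mime_type, producer):
        self._facets[mime_type] = producer

    def mime_types(self):
        """List the available facets, e.g. for a module that configures output types."""
        return sorted(self._facets)

    def get(self, preferred):
        """Return (mime_type, value) for the first preferred type that is available."""
        for mime_type in preferred:
            if mime_type in self._facets:
                if mime_type not in self._cache:
                    # Computed lazily: the other facets are never produced.
                    self._cache[mime_type] = self._facets[mime_type]()
                return mime_type, self._cache[mime_type]
        raise KeyError("no facet available for %r" % (preferred,))


calls = []  # records which renderers actually ran


def render_text():
    calls.append("text/plain")
    return "isosurface: 3 contours"


def render_png():
    calls.append("image/png")
    return b"\x89PNG..."  # placeholder bytes, not a real image


output = FacetedOutput()
output.add_facet("text/plain", render_text)
output.add_facet("image/png", render_png)

# A command-line run asks for text first, so the PNG renderer is never called.
chosen, value = output.get(["text/plain", "image/png"])
```

A GUI or crowdLabs front end would pass a different preference list to `get`, which is what would let one workflow run everywhere.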

Dec 12, 2012

Updates

  • [TE] Finished Vistrail Variable support for Parameter explorations
  • [TE] Finished Improving interface for uploading to crowdlabs
  • [TE] Talking with Eduardo and Troyer about adding support for PBS to BatchQ
  • [BB] UVCDAT Bugs
    • format pipeline automatically
    • multi-variable drag'n'drop
    • plot and variable drag'n'drop behavior

Items to Discuss

  • [BB] UVCDAT 'Python' in menu bar on mac

Dec 05, 2012

Updates

  • [FC] New version of PROV exporter, now generating a PROV-XML from a VisTrails workflow
  • [TE] Working on Vistrail Variable support for Parameter explorations
  • [TE] Working on improving interface for uploading to crowdlabs

Items to Discuss

  • VisTrails: parallel and remote execution (Tommy will lead the discussion)
    • A way of running multiple workflows in parallel in vistrails
    • Add support for cluster frameworks: BatchQ/Hadoop/MapReduce/Remote PBS
  • Idea for parallel execution: use/extend the persistent sub-system so that "make-like dependencies" can be tracked. The spreadsheet should probably be avoided in the middle layers of such parallel executions.
  • VisTrails and usability: http://www.wolfram.com/mathematica/new-in-9 (Claudio will lead the discussion)
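
The "make-like dependencies" idea in the parallel execution item above could be prototyped roughly like this. It is a sketch that assumes an acyclic dependency graph and integer timestamps; the function name `stale_targets` and the file names are made up, and it does not use the persistence package's real API.

```python
def stale_targets(deps, mtime):
    """Return the set of targets that must be recomputed, make-style.

    deps:  target -> list of inputs it depends on (assumed acyclic)
    mtime: name -> last-modified time; a missing entry means "never built"
    """
    stale = set()
    visited = set()

    def visit(name):
        if name in visited:
            return name in stale
        visited.add(name)
        for dep in deps.get(name, []):
            dep_stale = visit(dep)
            # Rebuild if an input is itself stale or is newer than this target.
            if dep_stale or mtime.get(dep, 0) > mtime.get(name, -1):
                stale.add(name)
        if name not in mtime:
            stale.add(name)
        return name in stale

    for target in deps:
        visit(target)
    return stale


deps = {
    "mesh.vtk": ["raw.dat"],
    "plot.png": ["mesh.vtk"],
    "stats.csv": ["raw.dat"],
}
mtime = {"raw.dat": 10, "mesh.vtk": 5, "plot.png": 20, "stats.csv": 15}

to_rebuild = stale_targets(deps, mtime)
```

Targets that are not stale can be skipped, and stale targets with no dependency on each other are candidates for running in parallel.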

Nov 28, 2012

Updates

  • [BB] UV-CDAT bug fixes
    • Redirected python output to a session log, and added command line flag to change the default log file or redirect back to std.out
    • Initial work on setting up vistrails as a git submodule of a new uvcdat-vistrails git repo
  • [ES] Package Installation
    • easy-install branch should work on all platforms (not tested on Linux).
    • currently working on making it platform aware (running easy_install on Windows vs. on Mac, for example)
  • [TE] Working on VisTrails bug fixes
    • Which ones are most urgent?
  • [DK] DataONE package, a bit of UV-CDAT

Items to Discuss

  • [BB] ParaView releasing new version 3.98
  • [BB] Ideas for VisTrails "DAT" - generalization of UV-CDAT

Nov 21, 2012

Updates

  • [FC] ParallelFlow -- working on creating SSH engines through VisTrails
    • I'm using vgchead for the tests, because there is no zmq (a required module) available for webdb1 (as it runs an outdated version of SUSE)
  • [TE] SubWorkflows now work on crowdlabs
  • [ES] Package Installation
    • branch easy-install works on a Mac 10.7 and above using a binary built with python 2.7.2 (it's in tests area in sourceforge)
      • the problem is that py2app does not support setuptools and we have to use easy_install available on the system. We require python 2.7 and it is included by default on Mac only starting on version 10.7. If you install python 2.7 on 10.6 it will also work on that system.
    • after I commit my changes it will also work on Windows (it will use easy_install available on the binary)
  • [DK] spreadsheet analogy bug, group caching issue, switch to uuid
  • [BB] UV-CDAT bug fixes
    • Fixed calculator command line related bug
    • Added a simple cdms file cache to prevent files from loading/downloading twice, along with a GUI widget to selectively delete cached cdms files
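
A cache of the kind described in the last item might look roughly like the sketch below. The class name `FileCache`, the `fetch` callback, and the example URL are assumptions made for illustration; the actual UV-CDAT code is not shown in these notes.

```python
import os
import tempfile


class FileCache(object):
    """Hypothetical cache keyed by the source URL or path of a file."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = cache_dir
        self.fetch = fetch      # callable(source, destination_path)
        self.entries = {}       # source -> cached local path
        self._counter = 0       # keeps cached file names unique

    def get(self, source):
        """Return a local path for source, fetching it only on the first request."""
        if source not in self.entries:
            self._counter += 1
            name = "%04d_%s" % (self._counter, os.path.basename(source))
            local_path = os.path.join(self.cache_dir, name)
            self.fetch(source, local_path)
            self.entries[source] = local_path
        return self.entries[source]

    def delete(self, source):
        """Drop one cached file, as a GUI widget might do for a selected entry."""
        local_path = self.entries.pop(source)
        if os.path.exists(local_path):
            os.remove(local_path)


fetch_log = []


def fake_fetch(source, destination):
    # Stand-in for a real download or file open; records each call.
    fetch_log.append(source)
    with open(destination, "w") as handle:
        handle.write("data from " + source)


cache = FileCache(tempfile.mkdtemp(), fake_fetch)
first = cache.get("http://example.org/clt.nc")
second = cache.get("http://example.org/clt.nc")   # served from the cache
cache.delete("http://example.org/clt.nc")
third = cache.get("http://example.org/clt.nc")    # fetched again after deletion
```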

Items to Discuss

  • [JF] Upgrades
    • Developer usability---make it easier to find and define the upgrades
    • Improve error messages
    • Also, want to get to the point where we detect changes in packages and alert developers & users to changes that could be important for provenance
  • [FC] ParallelFlow -- IPython engines must be started remotely, and we also need to copy a configuration file from localhost (controller) to the machine where the ssh engines are located, before the execution starts
    • paramiko is a nice Python module that connects to a remote machine, being able to execute commands and transfer files
    • Is it ok to use this module, or a solution that does not use an external module is preferred?
    • Talk about this next meeting, also see Joel Daniel's work on pushing data over ssh?
  • [ES] Package Installation
    • Just have Snow Leopard users install python 2.7
    • Dealing with scripts and/or executables installed by easy_install
      • Where to put them? The location needs to be added to PATH
      • Put this in the user manual, do not add to PATH by default
    • [DK] what about frameworks/libraries?
      • how would you deal with links to dynamic libraries?
    • still need to work on package repository on vistrails.org
      • support indirection so that we don't have to host the packages
      • also put architectures supported in this repository
  • [JF] Do we have configuration for environment variables in VisTrails?
  • [BB] Moving uvcdat branch to its own github repo

Nov 14, 2012

Updates

  • [DK] Refactored to be vistrails. imports
    • rope (Python refactoring tool) is awesome
  • [DK] Package installation
    • met with Tommy and Emanuele on Thursday
    • Emanuele is working on detailing the process for installing to ~/.vistrails/Python directory
    • [ES] I made some progress but also found some issues that I would like to discuss
      • When running VisTrails with the same interpreter used by easy_install, when enabling a package that supports easy_install, VisTrails will install it and enable it at runtime. This is using the same infrastructure to install python packages on linux systems but should work on any platform:
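  # py_import comes from core.bundles (vistrails.core.bundles after the prefix refactoring)
  from core.bundles import py_import
  rpy_dict = {'easy_install': 'rpy2',
              'linux-ubuntu': 'python-rpy2'}
  rpy2 = py_import('rpy2', rpy_dict)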
  • [DK] DataONE Package
    • GitHub
    • Now supports uploading data to DataONE
  • [TE] JobSubmission
    • Eduardo would like to have synchronous job submission (wait until finished).
  • [TE] DB SubWorkflows is working. Will test on vis-7 then add to crowdlabs.
  • [BB] Added two new command line flags to uvcdat
    • -P --noDebugPopups prevents dialog popups from happening when debug/error messages occur
    • -T TIME --time=TIME runs the gui for the specified TIME in seconds and then quits, useful for basic testing
    • Should these be added to the vistrails master as well?
  • [BB] Calculator commands related bug in uvcdat
    • Likely needs to be largely refactored to prevent further bugs with these
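
The two new flags listed above could be declared as in the sketch below. Only the flag names and their meanings come from the notes; the use of `argparse` and the `build_parser` helper are assumptions, and the real uvcdat code may parse its options differently.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="uvcdat")
    parser.add_argument(
        "-P", "--noDebugPopups", action="store_true",
        help="do not show dialog popups for debug/error messages")
    parser.add_argument(
        "-T", "--time", type=float, metavar="TIME", default=None,
        help="run the GUI for TIME seconds, then quit")
    return parser


# Example invocation for a basic smoke test: no popups, quit after 30 seconds.
args = build_parser().parse_args(["-P", "--time=30"])
```

In a Qt application the timed exit could then be scheduled with something like `QTimer.singleShot`, which is left out here so the sketch runs without a GUI.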

Items to Discuss

  • UV-CDAT Repo?
    • Charles is moving the LLNL repo to GitHub
    • Wants to know if we want to move our UV-CDAT code too
    • We should probably not refactor this now, but we could move the branches?
    • Any reason not to move the branches as is?
  • Issues with calculated variables in UV-CDAT
    • open file from calculator example
    • Ben will ask Charles about information to CDMS variable
  • VisTrails Package Installation
    • It seems that we can't start an interpreter different from the one we are running. Does anybody know about this?
      • Because of the issue above, I couldn't execute system's easy_install from the VisTrails bundle
      • My solution would be to ship easy_install in the VisTrails bundle and use it when installing extra packages
    • Need to test from the binary and from Windows
    • Interrupted system call error
      • R_HOME = tmp.readlines() IOError: [Errno 4] Interrupted system call
      • ok when you enable from preferences
  • Refactoring
    • branch where everything uses vistrails. imports
    • e.g. "import core.vistrail.module" becomes "import vistrails.core.vistrail.module"
    • Makes installing into a system python much more straightforward
      • doesn't require PYTHONPATH tweaks
      • doesn't overlap with other libraries as easily
  • [vistrails-users]
    • I would like to ask if it is possible to build stand alone applications based on vistrails?
      • Emanuele will answer, we can, just not sure what example is best here
    • To try and avoid reinventing the wheel, I am asking if anyone has developed, or knows of a development, of a "text" module. By this I mean, a module that is somewhat equivalent to a String module, but capable of allowing the input of multi-line text in a very simple editor. At this time, plain text is more than adequate (i.e. for handling configuration files, or multi-line test data), but the module should have the capacity to write out the data strings to file.
      • Tommy will answer, String type is not the problem, just want a subclass of String that can have the multi-line entry as its widget. Also suggest having a WriteStringToFile module that writes to temp file (FileSink persists as normal). Could submit a pull request to github.
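
For the Refactoring item above, the import rewrite can be illustrated with a small regular-expression pass. The notes say the actual refactoring was done with rope, so this is only a sketch of the transformation itself; the list of top-level package names in `TOP_LEVEL` is an assumption.

```python
import re

# Top-level packages assumed to move under the "vistrails" prefix.
TOP_LEVEL = ("core", "gui", "db", "packages")

_PATTERN = re.compile(
    r"^([ \t]*)(import|from)[ \t]+(%s)(?=[.\s]|$)" % "|".join(TOP_LEVEL),
    re.MULTILINE)


def add_vistrails_prefix(source):
    """Rewrite 'import core.x' and 'from core.x import y' to use 'vistrails.'."""
    return _PATTERN.sub(r"\1\2 vistrails.\3", source)


before = (
    "import core.vistrail.module\n"
    "from gui.application import get_vistrails_application\n"
    "import corelib\n"
    "import os\n"
)
after = add_vistrails_prefix(before)
```

A plain text substitution like this misses dynamic imports and names inside strings, which is one reason a refactoring tool is the safer choice for the real change.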

Nov 7, 2012

Updates

  • [FC] ParallelFlow
    • ParallelFlow can execute a module (or a subworkflow) for different parameters, in different IPython engines (embarrassingly parallel execution).
    • It is assumed that, in the machines where the engines are started, we have both VisTrails and IPython. Also, it is assumed that VisTrails is in PYTHONPATH.
    • If IPython engines are local, this means that different threads are created for each engine - these threads are automatically created by IPython. This is particularly useful for multicore machines.
    • IPython engines can also be created in a remote machine using SSH - this functionality still needs to be tested though.
    • Modules can now define serialize and deserialize methods in case they need to be retrieved from the engines (e.g.: 'self' output port). In this case, each module needs to define its own methods. For instance, VTK modules need to implement these methods using VTK's serialization routines.
  • [FC] PROV exporter
    • First version of the PROV exporter in VisTrails - this exporter will be used by the DataONE ProvWG.
  • [TE] JobSubmission
    • Eduardo working with Troels to add features to BatchQ
  • [TE] Add SubWorkflow support to DB
    • Almost done, need to fix some issues with upgrades.
  • [BB] More UV-CDAT bug fixing
  • [DK] UV-CDAT bugs, DataONE additions, Inc report/updates
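
The ParallelFlow pattern described above (one module run per parameter, with serialize and deserialize methods to bring a 'self' output back) can be sketched with the standard library. Threads from `concurrent.futures` stand in for IPython engines here, and `SquareModule`, `run_on_engine`, and `parallel_map` are hypothetical names; the real package's method signatures are not given in these notes.

```python
import json
from concurrent.futures import ThreadPoolExecutor


class SquareModule(object):
    """Toy module whose 'self' output must travel back from an engine."""

    def __init__(self, value):
        self.value = value
        self.result = None

    def compute(self):
        self.result = self.value ** 2

    def serialize(self):
        # Each module decides how its own state is encoded.
        return json.dumps({"value": self.value, "result": self.result})

    @classmethod
    def deserialize(cls, payload):
        data = json.loads(payload)
        module = cls(data["value"])
        module.result = data["result"]
        return module


def run_on_engine(value):
    """What one engine would do: build, run, and ship back the serialized module."""
    module = SquareModule(value)
    module.compute()
    return module.serialize()


def parallel_map(values, engines=4):
    # pool.map returns results in the same order as the input values.
    with ThreadPoolExecutor(max_workers=engines) as pool:
        payloads = list(pool.map(run_on_engine, values))
    return [SquareModule.deserialize(payload) for payload in payloads]


modules = parallel_map([1, 2, 3, 4])
results = [module.result for module in modules]
```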

Items to Discuss

  • [DK] VisTrails packages
    • Emanuele recently asked the py2app developers about making it easier to install other packages (and their dependencies) into VisTrails
    • Reply is here: http://mail.python.org/pipermail/pythonmac-sig/2012-October/023744.html
    • I like the idea of having system-wide and local user vistrails directories for added packages
    • This needs to be integrated with a much easier installation process. Ideally, this looks something like:
      • User goes to package manager and clicks "Install Package".
      • We have a vistrails.org-based list of available packages that appears as a list.
      • User selects a package and VisTrails determines any VisTrails, python, or other dependencies.
      • We use setuptools or something like it to install the package and its dependencies into the local directories
    • Ideally, this also works for system-based python installs; this probably means separate python 2.x directories for VisTrails packages.
    • Also, for "import vistrails" style use of VisTrails, it would be nice to be able to specify which packages will be used instead of relying on the ~/.vistrails/startup.xml settings
    • Potential issues include packages that use external libraries. Can these also be installed into the package directories?
    • Developer support: how does a developer build an installer for a more involved package (e.g. SAHM, ALPS)?
    • [JF] VTK and such will be difficult.
    • [DK] user-hosted packages
    • how does this work for packages?
    • Dave, Tommy, and Emanuele will meet on this

Oct 31, 2012

  • [DK] scripting support (UV-CDAT)
    • dependencies?
    • allow scripting to automatically load dependencies
    • GUI dependencies here:
    • .vistrails clashes between scripting and GUI
  • [JF] organizing workshop on reproducibility
    • need to understand what is out there (actually how they work and fit into big picture)
    • others will need to help with this
    • proposals are out
  • [TE] Added git access as: git://vistrails.org/git/vistrails.git
  • [TE] Subworkflows on crowdlabs not working
    • they are not stored in the database
    • need to have a unique identifier
    • could also copy to the .vistrails directory, possible name clashes
    • in the long-term, nice to have vistrails access to database
  • [BB] Added new uvcdat-next and uvcdat-master branches
    • mirror the LLNL uvcdat development structure
    • allows us to test new features in next before they go into master
  • [BB] packages were failing on the UV-CDAT side
    • shouldn't try and import package when package requirements are not met
    • .uvcdat directory or a sub-directory of .vistrails (as a project-specific directory)
    • [CS] don't do anything right now, need a stable release
    • [CS] need a UV-CDAT workshop
  • [BB] most of the remaining bugs are not crashing things, just annoyances
  • [CS] get Fabio started on DAT
  • [FC] ParallelFlow package
    • execution in the engines not always working
    • starting engines using ssh, could also use webdb machines
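
The point above about not importing a package whose requirements are not met could be handled with a check like the one below. This is a generic sketch using `importlib`, not the VisTrails package manager; `missing_requirements` and `load_package` are hypothetical helpers, and requirements are assumed to be top-level module names.

```python
import importlib
import importlib.util


def missing_requirements(module_names):
    """Return the required top-level modules that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def load_package(name, requirements):
    """Import a package only if all of its requirements are available.

    Returns (module, []) on success or (None, missing_names) otherwise,
    so the caller can report the problem instead of failing at import time.
    """
    missing = missing_requirements(requirements)
    if missing:
        return None, missing
    return importlib.import_module(name), []


skipped, missing = load_package("json", ["os", "no_such_module_xyz"])
loaded, none_missing = load_package("json", ["os"])
```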

Oct 24, 2012

  • [ES] Update: Package distribution
    • Installing extra python packages to be used in the binary
    • easy_install configuration that installs in a custom folder in user's home directory
    • worked for R, rpy
    • possible issues with compiled things
    • should work for all platforms
  • [ES] New problem with rpy package (other than RVector)
    • globalenv variable has changed--capitalization
  • [ES] SourceForge VisTrails Project upgrade
  • [FC] Parallel Flow
    • Local controller and local engines are now started using VisTrails GUI (under Packages menu)
    • Trying now SSH engines (IPython also supports MPI and PBS)
    • Local controller needs to be stopped when VisTrails closes (in case user forgets)
      • if controller is not stopped, will have issues upon restarting because controllers are still there
      • cannot stop controller outside of vistrails
    • Outer subworkflows are transformed into groups -- for inner subworkflows, needs to be recursive
      • problem is that the xml files that define the subworkflows cannot be found on remote machines
      • [DK] look for code that finds the subworkflows
    • have an example, can test with Emanuele's new package idea
    • have test workflow where module takes 10 seconds, executing 4 times with ipython also takes 10 seconds!
    • execution logs from engines are being inserted into provenance
    • machines are also being added for all engines
  • [TE] posted issue on github with temporary directories, added configuration option
  • [TE] crowdlabs executable workflows, external ALPS server is not accessible
    • script to check external error
    • string object not callable error
    • one Titan workflow made non-executable
  • [TE] check email addresses for github for the fixes/closes shorthand
  • [BB] working on UV-CDAT issues
    • checking on mountain lion build
    • expects variables to be cdms variables
    • how to handle all variables (python-based)
  • [BB] Visit: no callbacks on the c++ side coming through, issue with linking to the correct global variables

Oct 17, 2012

  • [FC] paraflow
    • New commit
    • No more startup code
    • Add start/stop menu options
  • [FC] Remoting package
    • Our solution covers this
  • [BB] uvcdat calculator commands
    • Need to talk to Emanuele about making this more robust, and what capabilities it should have.
  • [BB] visit plugin
    • Need to find how to access the methods used to implement the "isWorking" function originally created specially for the python VisTrails plugin
    • Need to see if threading is still needed for the C++ version

Oct 10, 2012

  • Update the People page: http://vistrails.org/index.php/People
    • Also publications are likely out of date
  • Mailing List
    • Huy will check for anyone who signed up on the old vistrails-users list at SCI after we moved the list
    • Then we can write Nick to shutdown the SCI list
  • Remoting package:
  • Parallel Execution
    • Startup code in engines is possible (it executes a script before the initialization of the engines), but still not working. There was an issue before with it, that was apparently solved, but some other people still complained - waiting for an answer on the mailing list
    • Currently changing schema to add the notion of parallel execution
    • Provenance capture:
      • make a new parallel execution entry, or
      • annotate that module was executed in parallel, can annotate each module execution with machine information
      • what about trying to determine which modules were tied to the same parallel execution
    • Still going to look into serialization of some output values (e.g.: self)
    • IPython shipped with VisTrails?
      • How big is iPython? 10-20 MBs
  • Package distribution
    • Ideally, users could download supplemental installers to add packages into binaries
    • Hard to know which pieces belong to each package
    • py2app selects only modules that we are using, hard to separate out particular packages
    • could we tell py2app to install a full python distribution?
  • [DK] bugs, UV-CDAT, NASA meeting, helping Fernando
  • [BB] working on UV-CDAT
  • [ES] working on builds
  • [HV] finishing parallel execution in parallel, not locking GUI when doing execution
  • [TE] looking into making crowdlabs examples work, gridfields and ALPS

Oct 3, 2012

  • UV-CDAT development
    • status of bug fixing (Ben)
      • to do: msg with summary
    • release date (do we know?)
    • code re-org plans (David)
      • git submodule for vistrails core code
      • refactoring some code where we have added features (e.g. in CDMSPipelineHelper)
      • list issues
      • planning for what Fabio, Jorge, and Feng
    • proposal for interface improvements (Claudio)
      • can we bring in other analysis here?
      • have a start on improving matplotlib support (could better integrate this into VisTrails)
    • discussion of interface improvements for supporting Jorge's workflows
  • VisTrails and IPython
    • Singleton pattern in VisTrails: should it be really changed? Is there a way to destroy the global state?
    • If output port is an object (e.g.: self), serialize it and send it back?
    • How to include the execution (performed in the IPython engines) in the log?
    • How should engines be started?
  • website (www.vistrails.org)
    • The header was updated and the source files added to the master branch
    • Need to decide about the sponsors (currently they have a section on the main page).
  • 2.0.1 release
    • Matthias fixed the ALPS package and the release should be out today
  • Including packages with dependencies
    • how to best support this for developers and users
    • ideally, users have automated or double-click installers
    • developers need to be able to match and compile against the existing libraries inside the binaries

Sep 26, 2012

  • github
    • add README to github
    • fixes #NN for issues? [DK] didn't seem to close the issue when I used this
  • website (www.vistrails.org)
    • do we have access to the original Illustrator file for the header?
    • documentation link should go to the current release sphinx documentation now? (even if not, the documentation page should point to the web version)
  • [DK] MissingPackage errors: made to ignore module_id so we do not get duplicates
  • [FC] Parallel execution
    • Using iPython and serializing workflow pieces and sending the xml to the engines
    • Execution happens in workflow batch mode
    • vtl is a wrapper for a vt file or a workflow, documentation
  • [ES] 2.0.1 release
    • almost done, checking ALPS package on Windows
    • need to test Mac version
    • also should test SAHM package
    • documenting the build and installation instructions
  • Still need to have machines for other Mac OS-es (10.6, 10.8)
    • signing the application: need to create certificates on developer
  • [TE] BuildBot bugs when executing multiple instances at the same time
    • disable some of the timing tests (e.g. for memoization)
    • could configure branches to use different .vistrails directories
  • [TE] Fixed the single-instance checks
  • [TE] testing for crowdlabs workflows
    • Titan package not working right now -- check with Wendel?, is this in VTK now?
    • fixes for the crowdlabs web page (description overflowing, workflow names truncated)
  • [BB]: working on the UV-CDAT colormap code: adding the module in code...

Sep 19, 2012

  • No known items?

Sep 12, 2012

  • Improvements to connection with CrowdLabs (see Dave's message)
    • clean up interface
    • currently using 2.0 on crowdLabs
    • check into whether master can be translated back to 2.0
  • Update on CrowdLabs + WebGL
    • need some better documentation here
    • examples in the user's manual would also be useful
    • ALPS image resolution is not good, might be using the uploaded thumbnail
    • seems not to be working right now
  • Update on iPython and multi-threaded VisTrails
    • each workflow would run on a different thread
    • trying to serialize Module object
    • example is executing Map concurrently
    • pickling errors...
    • can we push port values across only---do not need to pickle module
  • Feedback from Jorge on his experience with VisTrails (and the remote job submission package)
    • need some documentation on using the package
    • how are the UV-CDAT instances going to be running on the cluster?
    • want to push a workflow and run on a cluster
    • sync up with Fernando and Huy's work
    • would be good to have linked views support
    • add initial support from Jorge's UV-CDAT work
  • FIXED! mac_update_bin.sh is not working -- it does not delete the existing directories from lib/python_XY
  • CLTools:
    • Add reload button in the wizard window
      • does the new module reload? or do I have to delete the module and add a new one?
      • try to use the upload logic here
    • Add a debug mode that shows how the command will be invoked -- this should help the users understand how CLTools work and to get a wrapper working more quickly
      • want to see what the command line would look like with the settings
    • We should have more examples that cover additional features of command line tools, e.g., the use of additional parameters such as a flag to specify an integer value. It would also be useful to show what the module will look like in VisTrails --- this will help users understand, e.g., what the "Visible" flag means.
    • It looks like the ENV var is set for the CLTools package. But what if different modules need different environments?
    • In the documentation, there is a mention of a file "suffix". Is this related to the exchange with Rémi on June 14, 2012 4:14:32 PM EDT?
  • Open Tickets:
    • 577: grayed out when publishing until focus change
    • 578: two instances of VisTrails
  • Testing
  • [ES]: Mac builds are now working
    • pytables support has been removed (not working on Lion)
    • already added to the tests
  • Colin asked for new binaries to fix conflict with osgeo
    • wants to distribute for his users

Aug 29, 2012

  • Mailing lists?
    • should we keep the sci.utah.edu addresses?
  • Unicode issues [Rémi]
    • sqlite filename
    • configuration files
    • and filenames in VisTrails
  • WebGL and crowdLabs [Wendel]
    • Lis was able to add visualization
    • Google bot tries to access some pages and gets sent an error
    • fix the transformation
    • version issues, Wendel not getting the errors when opening versions from newer versions
    • make work for any workflows
  • Reproducibility [Fernando]
  • iPython [Fernando and Huy]
  • crowdLabs bugs [Tommy]
  • test machine access [Tommy]
    • working on
  • Bugtracker on github
    • how to specify branches in bugtracker
    • want to add bugs for java branch
    • cannot fork own repository
  • which version of VTK does VisTrails currently support?
    • we should make it clear on the website which versions work (or what limitations exist in which versions)?
    • tracking this for provenance

Aug 22, 2012

  • Automated workflow layout -- done by Lauro -- Dave integrated
  • Serialization (pickling) Issues with IPython
    • take advantage of the multiple engines in iPython
    • schedule modules to run in parallel
    • Dependencies in the Module class
    • Objects that stay at library level
    • VTK is problematic because data is in C++ (cannot send data)
  • fix UV-CDAT bugs
    • separate meeting for this?
    • merge in latest v2.0
  • [RR] Java Updates
    • add README file on git with link to documentation
    • README is shown on main github page so this is a better link back to the documentation
    • Should we maintain both the Java and PyQt interfaces?
      • problem of maintaining both
      • jpype limitations (reflection, introspection), maintenance
  • [DK] DataONE Package
    • able to load DataONE data in a VisTrails workflow
  • Add Jorge's work to news page
  • [ES] Build machine is almost setup
    • gridfields support not there
    • move to Windows build
  • [JP] Feedback from Oak Ridge:
    • Mashups do not work in UV-CDAT, vcs or DV3D
    • More next week and demo
    • using scikit-learn
  • Website maintenance:
    • Update header
    • Add page for contributors, contributing organizations
  • [WS] WebGL and crowdLabs
    • update next week
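
The pickling problem in the serialization item above, and the suggestion from the Sep 12 notes to push only port values across, can be reproduced in a few lines. `ToyModule` is a stand-in written for this sketch and is unrelated to the real VisTrails Module class; the lock simply plays the role of any state that cannot be pickled.

```python
import pickle
import threading


class ToyModule(object):
    """Toy module that holds state which cannot be pickled."""

    def __init__(self):
        self.lock = threading.Lock()          # locks are not picklable
        self.outputs = {"value": [1, 2, 3]}   # plain port values are


module = ToyModule()

# Pickling the whole module fails because of the lock it carries.
try:
    pickle.dumps(module)
    module_pickled = True
except (TypeError, pickle.PicklingError):
    module_pickled = False

# Pickling only the port values works and is all the receiver needs.
payload = pickle.dumps(module.outputs)
restored = pickle.loads(payload)
```

This does not help for data that lives on the C++ side, such as VTK objects, which would still need their own serialization routines.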

Aug 15, 2012

  • Big Data (Huy will join the call)
    • Big Data and VisTrails (let me know if you cannot access the file)
    • Big Data Pipeline (let me know if you cannot access the file)
    • IPython seems to be promising
    • streaming and parallel execution
    • simple example: parameter exploration
    • ipython multithreading (python interpreter pool)
    • threading helps on single machine
    • for cluster, want to have "execution engine" component
    • how do packages work here?
  • Burrito + VisTrails
    • Unpacking experiment: data files inside or outside .vt? (currently, only creating xml)
    • Libs and hardcoded input files: how the user can specify that? A configuration file?
  • Java code in our repository [Rémi]
    • add java files and a Makefile to compile the jar
    • need to figure out how to make sure this happens for the release (build script)
  • Java version of spreadsheet
    • working modulo a few issues
  • Testing framework?
    • Tommy prefers BuildBot
    • Question is what mileage we get from adopting CTest? (UV-CDAT?)
  • Move bug tracking to github?
    • for now, keep separate from our trac and see the utility

Aug 8, 2012

  • Java [Rémi]
    • working on spreadsheet--can be used with real packages
    • writing exclusively in jython, would not work via jpype
    • keep GUI in java, cell widgets in Java, too
    • network visualization using prefuse, could port to regular version of vistrails
  • Possible analogies issue in spreadsheet
  • Show in spreadsheet which version a cell corresponds to
    • show a label in upper-right corner (tag or "tag + N")
    • parameter values for parameter exploration
  • Burrito [Fernando]
    • directory names
    • packaging into vistrails
  • Caching issue
    • Dave will write an email summarizing the issue here
  • Big data
    • Wait for Huy
  • crowdLabs integration
    • Tommy is working with Wendel on this
  • Testing machine
  • setting up github
    • Tommy is going to do this

Aug 1, 2012

  • testing [Tommy]
    • Switched from CTest to Buildbot
    • The buildbot is up and running with a few slaves:
    • can set up schedulers to run tests upon git checkins
    • need windows license: both 32-bit and 64-bit?
    • virtual machine with snow leopard
    • virtualbox for virtual machines
    • expand test coverage
    • looking to test gui (mouse clicks, etc.)
    • Qt has a testing framework?
  • VisTrails on Mac OS X 10.8 (Mountain Lion)
    • possible issue with X11 dependencies
    • also, how long should we continue to build for 10.5 (Leopard)
  • Bug #615: https://www.vistrails.org/ticket/615
    • need new version of PyQt?
  • [Rémi] working on Java spreadsheet
    • have to write JTable that supports functionality like in Qt
    • have basic UI in Jython, possible to use jpype
  • [Fernando] burrito and dtrace
    • translate burrito script from systemtap to dtrace
    • dtrace's dscript doesn't have control flow structures (only if-then-else, no for loops)
  • Paris report [Emanuele]
    • feature request: which parameters values are used in a cell
    • overlay these parameters onto
    • almost like vtk to build package based on xml files from galaxy

July 25, 2012

  • testing [Tommy]
  • abstraction bug: possibly triggered by upgrades
  • [Fernando] burrito
    • using DTrace on Mac, translating calls from SystemTap
    • reading file, python programs read 64 files for a simple program
    • need all files that are dependent on the program
    • scripting virtual machines?
  • setting up virtual machines
    • Fernando and Tommy coordinate
  • Rémi
    • still working on spreadsheet -- need to talk to Huy
    • issue with parameters updating: sometimes need to click on the version again
  • Big Data and VisTrails
    • meeting next week?

July 18, 2012

  • Big data & vistrails
  • Publishing bug
    • hangs right after the spreadsheet comes up
    • Ctrl-C on command-line causes process to finish and
  • Java [Rémi]
    • working on spreadsheet with jpype, cannot subclass java classes in python
  • burrito: lower barrier to adoption for VisTrails
    • can build workflow that uses information from burrito
    • system tap: gets all process information, dtrace for mac os
  • new machine for testing and releases
    • virtual machines for linux distributions are setup
    • create windows or mac?
  • vis-7 machine
    • upgraded to latest suse linux
    • problem updating the crowdlabs machine, poly sysadmins working on
  • github
    • Tommy sent email on this
    • goal is to have a public repository
    • function to notify main developers if there is something to add
  • WebGL [Wendel]
    • fixed issue with upgrades
    • workflow does not execute, still looking into this

July 11, 2012

  • Big data [Fernando, Huy]
    • change execution engine to allow streaming of data
    • probably need to change the language of the execution engine to C++
  • Burrito [Fernando]
    • if you have a program running and have process id, stores in db information about files being written, input
    • mongoDB and python
    • linux-based (possibly only fedora?)
  • Jython and Jpype [Rémi]
    • Some Java functions (reflection) are not accessible and needed
    • Walks jar file to see which modules can be created; writes these to a cache file
    • Package needs to load this file and auto-generate VisTrails modules
    • Cannot build cache file in python right now, only jython
  • Publishing bug [Tommy]
    • Get two versions when running
    • Two cases: when running VisTrails and not running VisTrails?
    • Either executes or it hangs
  • WebGL [Wendel, Tommy]
    • vis-7 issue: VTK 5.10
    • after uploading, get id already used message?
    • queue from xmlrpc gets full and doesn't receive any more commands
    • Possible issue with upgrades (call change_selected_version instead of getPipeline)
    • operating system on crowdlabs and vis-7 is getting old, need to upgrade these?
  • Installing binaries or into standard python
  • Testing
    • build machine has arrived
    • how to run multiple tests
    • need to have some type of protocol
    • create tests for new functionality
    • also establish governance
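
The jar-walking step described under "Jython and Jpype" above could look roughly like the sketch below. It is only an illustration: the function names and the JSON cache layout are assumptions, not the format used by the Java branch, and it lists class names only. Method signatures would still need Java reflection, which is the part the notes say is not yet available from plain Python.

```python
import json
import zipfile


def list_jar_classes(jar_path):
    """Return fully qualified names of top-level classes found in a jar."""
    names = []
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            # Skip non-class entries and inner/anonymous classes (Outer$Inner).
            if not entry.endswith(".class") or "$" in entry:
                continue
            name = entry[:-len(".class")].replace("/", ".")
            # package-info and module-info are descriptors, not usable classes.
            if name.split(".")[-1] in ("package-info", "module-info"):
                continue
            names.append(name)
    return sorted(names)


def write_class_cache(jar_path, cache_path):
    """Write the class list to a JSON cache file (hypothetical layout)."""
    classes = list_jar_classes(jar_path)
    with open(cache_path, "w") as f:
        json.dump({"jar": jar_path, "classes": classes}, f, indent=2)
    return classes


def load_class_cache(cache_path):
    """Read back the class names a package could turn into modules."""
    with open(cache_path) as f:
        return json.load(f)["classes"]
```

A package would call load_class_cache at load time and generate one module per listed class, so the jar only has to be walked once.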

July 5, 2012

  • Java development [Rémi]
    • Weka package can be run in both jython and python via jpype
    • Jython error when changing parameters (changes don't seem to happen, and trying this a second time causes db_id exception)
    • Instructions to run Jython version?
    • Is it possible to write a java spreadsheet that works in the python version?
  • Parameter Exploration Changes [Tommy]
    • Added apply to other versions
    • Drag and drop to apply exploration to another version
    • Similar methods to apply analogies and mashups? [Add to trac]
    • Also save analogies
    • Apply to other versions works by module id so it won't
    • Reference by tag or by unique id (resilient with unique ids)
    • Visibility controlled by when there are tags
  • Move to github and governance
    • github: can create organization
    • others can fork repository, we can decide what to include or not
  • Survey users -- what are people using VisTrails for?
    • look for old survey?
    • what information are people willing to share? how to encourage?
  • VisTrails users' and developers' day
  • Status on crowd labs (Tommy will contact Wendel)
  • Big data: Huy, Fernando, Tommy, Juliana need to discuss

June 27, 2012

  • Annotations for general data identification in provenance?
  • IPAW and DataONE Report
  • Parameter Exploration Interface [Tommy]:
    • save all parameter explorations for each version
    • adding more metadata to the panel (user, date, exploration name)
  • LaTeX Issue
    • Emanuele cannot reproduce the issue Juliana encountered
    • Used CLTools with multiple configurations
    • Tommy fixed some process blocking bug, perhaps this fixed it?
  • Port Documentation update from Pasteur folks
    • Always in center
  • Rémi working on Weka package (try to use jpype here?)
  • Possible issue with core code in java branch --- validating pipeline and/or adding functions?
  • exporting VisTrails trace for DataONE working group

June 20, 2012

  • Move to github and governance
  • Survey users -- what are people using VisTrails for?
  • VisTrails users' and developers' day
  • Status on crowd labs (Tommy will contact Wendel)
  • Big data: Huy, Fernando, Tommy, Juliana need to discuss
  • Parameter exploration: apply to different versions

June 13, 2012

  • Tommy:
    • Add support for parameter exploration on the command line---provide the user the ability to invoke a named parameter exploration
    • Add support for parameter exploration from the API: this will be for users that need to customize the exploration
  • Rémi:
    • Ironing out some remaining bugs on the Weka wrapping
  • Emanuele:
    • Working on UVCDAT improvements: ability to replace a variable
    • Need to look into the issue of guiding users on what variables are needed to run a pipeline
    • Will look into the latex package issue, where the workflow is not executed (maybe an issue with CLTools or Persistence Package)
  • David:
    • global variables: can be used in the same workflow in multiple places; similar to having multiple aliases that are synched
    • aliases: only set for one parameter
    • Should we combine aliases and global vars?
    • Currently we cannot do parameter exploration over a global variable

June 6, 2012

  • Tommy: Almost finished with the parameter exploration updates
    • Parameter explorations show up in project list now
    • Can use functions that have not been set in parameter explorations
    • Also working on serialization of parameter explorations to their own schema
  • Wendel
    • Changed combo box style for HTML5 medleys
    • Using XMLRPC
    • Linux python 2.7.1 issue with cElementTree?
  • Emanuele: API 1.7 vs. 2.0 returning execution results?
  • Fernando: git-annex demo
    • looks to satisfy the main requirements from Matthias
    • Windows haskell issues...
  • Rémi
    • automatically wrapping Weka package
    • building Weka workflows
    • use Fernando's simple workflows to test this package
    • jars: configure button not available before loading a package? check this?
  • Dave
    • updates for USGS, others
    • basemap example
    • can we package basemap without full resolution maps?

May 23, 2012

  • Review remaining bugs
  • Issue with multiple instances
  • Wendel: almost done with mashups in HTML5, trying to make work on different browsers
    • Emanuele will send Wendel examples for web and desktop
  • Juliana's issues:
    • analogies are only displayed if we change the focus out of VisTrails to another application and back
    • in publish window, the snippet appears grayed even if the vistrail is saved -- but if the focus is changed to the main window and back, it works
    • it seems there are some issues with focus in general...
    • can't run 2 instances of vistrails: Juliana managed to break Fernando's setup too
    • need better viewer for persistence package: as is, it is not possible to look at values---tedious to change the length of the fields
    • persistence: should allow labels in module config--can be hard to identify a file in the persistence manager
    • latex extension: clicking on a figure on the PDF file (on acrobat) gives an error (is this related to the change to relative paths?)
    • CLTools: should support other types, e.g., integer
    • CLTools: sometimes, in the module list, under CLTools, a module called CLTools is displayed

May 16, 2012

  • Review remaining bugs
    • Default values for parameters not showing
  • CLTools: allow it to be invoked from within VisTrails (Tommy will look into this)
    • Emanuele created a command that makes it easier to invoke it on MacOS
  • Rémi will demo the Java version of VisTrails and lead a discussion on design issues
  • Investigate connection to iRods
  • Mashups on iPad (ask Wendel to look into this)
    • Need to translate the XML spec for the mashups into HTML5 (should be similar to what Wendel did for the automatically generated mashups he had)


May 11, 2012

  • Fix for issue with caching group modules
    • looks to be working ok
  • Workspace fixes
    • seem to be working ok
  • QPixmap:scaled error?
  • detaching panels issue? Cannot reproduce
  • Ticket #551: Make Preferences non-modal?
  • Ticket #539
  • Ticket #540: Tommy will check
  • Ticket #541
  • Ticket #517
  • Ticket #523: Dave will check
  • Ticket #532
  • Ticket #533: Emanuele

May 2, 2012

  • Performance issues with workspace reconstruction
    • should be fixed, need to close ticket
  • Streamlining build process
    • Set up virtual machines?
    • Do we need to buy a dedicated server for this?
      • [ES] It could be a desktop machine, with windows and mac installed. I don't think it needs to be a dedicated machine, but we would like it to be available when a binary is built. I think it could be a mac with windows installed as a virtual machine. A 64-bit Windows has to be installed. I believe Lion allows multiple users logged in the graphical system at the same time, so if it is a powerful machine, it can be used as the development machine for another user. Also, we need to make sure the vnc port is open to the outside so a remote user can connect to it.
    • Running multiple Mac versions?
      • [ES] I am not sure if a binary built on Lion will run on Snow Leopard. The contrary is true, so it would be fine if we could run Snow Leopard as a virtual machine. It seems that Parallels only allows running the server version of Snow Leopard as a virtual machine. If we want to create a virtual machine of Snow Leopard, we need to do some hacking (some users were able to do this).
  • File management
    • Does git-annex work with cygwin?
    • Fernando contacted author of this code, he pointed to the web page on windows requirements, thinks symbolic links are the issue
  • Preferences Dialog is modal and does not allow switch to error messages.
    • Make preferences non-modal?
    • May already be working non-modal...
    • Move "Module Packages" from Preferences to "Tools" window?
  • Package identifier checks
    • move the specific code for SUDS packages checks to the package itself
    • add hooks to allow packages to identify which identifiers they can load
  • Enumerations
    • mashups already has a widget to select from a list of existing constant values (general, not just a combo box)
  • Auto connect
    • highlight possible ports to give the user the idea which ports are possible
    • Check names of ports to try to make these connections
    • Issue: when there are multiple inputs that match multiple outputs (which connection to create)
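
A minimal sketch of the auto-connect idea above, assuming ports are described only by a name and a type string. The Port tuple and suggest_connections are hypothetical names for illustration, not VisTrails API. It connects ports when exactly one candidate exists, uses port names to break ties, and reports the remaining cases as ambiguous so the interface could highlight them instead of guessing.

```python
from collections import namedtuple

# Hypothetical port description: a name plus a type identifier.
Port = namedtuple("Port", ["name", "type"])


def suggest_connections(outputs, inputs):
    """Pair output ports with input ports of the same type.

    Returns (connections, ambiguous). `connections` holds
    (output name, input name) pairs with a single candidate after the
    name tie-break; `ambiguous` maps each remaining input name to the
    names of all outputs that could feed it.
    """
    connections, ambiguous = [], {}
    for inp in inputs:
        candidates = [o for o in outputs if o.type == inp.type]
        # Prefer a name match when several outputs share the type.
        by_name = [o for o in candidates if o.name.lower() == inp.name.lower()]
        if len(by_name) == 1:
            candidates = by_name
        if len(candidates) == 1:
            connections.append((candidates[0].name, inp.name))
        elif candidates:
            ambiguous[inp.name] = [o.name for o in candidates]
    return connections, ambiguous
```

Inputs with no type-compatible output are simply left out of both results; the ambiguous entries correspond to the open issue noted above, where several outputs match and the user has to choose.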

April 25, 2012

  • Persistent Package
    • File management: discuss git-annex
      • Issue with Windows support
      • May work through cygwin, but written in Haskell which may lead to issues of integration in the Windows environment
    • Make inputs of persistent modules read-only, to maintain consistency (only configure them through the configuration widget)?
    • Reproducibility of workflows that have persistent inputs: get the version id from the log, put it in PersistentRef and execute the workflow?
  • Parameter exploration
    • Do we want to add named parameter explorations and multiple explorations per version? (Tommy)

April 20, 2012

  • File management:
    • USGS suggestion: use just files in the file system; more concerned about performance
    • Git Annex may provide a good solution: http://git-annex.branchable.com/ "Fernando will look into days"
  • Wendel: update on Elsevier/Sciverse/Crowdlabs
    • crowdlabs now uses the proper identification for workflows
    • Assumption: author has already uploaded the vt to crowdlabs, and the paper refers to the workflow id, then our app would load the associated mashup and allow the user to manipulate the mashup on the browser
    • Bug: random message from VisTrails--interrupt system call "Wendel will ask Emanuele for help"
    • Wendel will add all vtk_examples to crowdlabs
  • Dave: update on additional improvements (e.g., global variables)
    • Moved code from pipeline to controller
    • The interaction needs to be improved so that if a var is removed, when the user clicks on a workflow, the var is automatically resolved (deleted from the workflow)
    • Update Huong's example to use global variables
  • Discuss changes to persistence package (see email exchange)
    • Also more from USGS on persistence and large files
  • Parallel execution
    • python parallelism: modules that serialize their output can be executed on different cores
    • allow modules to say they are detachable--they have to implement methods for reading the input and serializing the output
    • ask Matthias for Troels' code--USGS could test it!
  • Possible issue in Windows version where you need to press space bar in the console before workflows execute? (USGS)
    • We need to have a Windows tester (Tommy can test this)
  • Defaults are not showing in the new ports panel?
  • Change parameter exploration to allow all parameters to be shown
    • Make unset parameters accessible from a Parameter Exploration view (Tommy)
  • Enumerations (Pasteur group)
    • Expand portspec to have subitems to better support this?
  • Annoying bug where selecting a version selects the text box instead of the version ellipse

April 11, 2012

  • Wendel gave update on Crowdlabs/WebGL
  • Update on persistence package and the need to support a global repository
    • Juliana followed up with Matthias (see email exchange)
  • Schema updates going smoothly
  • Dave made several improvements to module attributes to handle different 'symbols' for the ports, ability to mark ports as optional, visible, etc
    • discussed usability issues, including the recording of which ports are marked as visible

April 4, 2012

  • Fernando will look into the persistent file management
  • Updates to schema
    • Add mashups to schema (to better support crowdlabs)
    • Other changes? best to make all needed changes at the same time
  • Bug-fix release
  • Crowdlabs
    • Get rid of flash

Mar 21, 2012

  • Need to remind people at Poly about the meeting
  • Tommy will send an email the day before our meetings
  • Integer slider for Matthias (need to translate)
  • Schema changes

Mar 14, 2012

  • Mashups in the database
    • Matthias's error: is there an upgrade here?
    • Mashups are not currently being saved in the database, Emanuele is fixing this
  • Schema changes:
    • Mashup changes
    • Ticket on trac about schema changes
    • Global variables: promote this to specified schema, not just annotations
    • Port cardinality: single input or multiple input guidance
    • What else should be updated in the schema?
  • ClimatePipes
    • Demo

Mar 7, 2012

  • Proposal to change the interpreter to support suspended execution (Tommy will look into this and then coordinate with Troels)
    • any module should be allowed to have its execution suspended --- we will have a new execution state
    • modules that are declared by the developer as suspendable will have a different shape (oval?)
    • during execution, suspended modules are shown in a different color (orange?)
    • VisTrails should execute all other modules that do not depend on the suspended modules
    • the suspension of a module should generate a log entry that states the module is suspended
    • Question: how often should VisTrails poll the suspended process? Should we have a 'resume' button in the interface to allow the user to also control this manually?
  • Bug with updates (Dave will look into this)
  • export to stable menu does not work, and we need an export to XML (Emanuele will add these to trac)
  • Crowdlabs and upgrades: Tommy and Wendel will work on this
  • Emanuele is implementing support to export a single workflow. Caveat: the workflow will be detached from its original, and it will not be possible to 'update' it on Crowdlabs (this will be documented in the manual)
  • Oracle interface: add dialog error msg when 'reproduce' is not possible; differentiate between execution and 'reproduce' (use italics for reproduce, and keep the existing colors)



  • Matthias says: Troels has now found the time to finish his Python background job submission package, which can run calculations in the background locally, on remote Unix machines, and on remote clusters and supercomputers. We are integrating it into ALPS now and next want to do a VisTrails package for it. That should not be too hard, but we'll need some changes in VisTrails to be able to implement it well. If a background job starts we want to stop the execution of the downstream part of the workflow and can do that by raising an exception - but this exception should be treated differently to others. In particular we don't want to have the execution log record it as a failed execution, but maybe as "suspended" or "incomplete" or similar. One might also use another color than red to mark the status of the module and the downstream modules. Finally there might still be other branches of the workflow that could be executed and one does not have to stop all execution.
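
The suspension behaviour proposed above can be sketched as follows. This is not the VisTrails interpreter: ModuleSuspended, execute, and the state names are assumptions chosen for illustration. The point is that a dedicated exception marks a module as suspended rather than failed, modules downstream of it are not run, and independent branches still execute.

```python
class ModuleSuspended(Exception):
    """Raised by a module whose work continues in a background job."""


def execute(modules, upstream, compute):
    """Run modules in the given (topologically sorted) order.

    modules  -- module names, upstream modules listed first
    upstream -- dict: module name -> list of modules it depends on
    compute  -- dict: module name -> zero-argument callable
    Returns a dict: module name -> 'ok', 'suspended', 'skipped' or 'failed'.
    """
    state = {}
    for name in modules:
        deps = upstream.get(name, [])
        # Do not run anything downstream of a module that has not completed.
        if any(state[d] != "ok" for d in deps):
            state[name] = "skipped"
            continue
        try:
            compute[name]()
            state[name] = "ok"
        except ModuleSuspended:
            # Recorded as its own state, not as a failed execution.
            state[name] = "suspended"
        except Exception:
            state[name] = "failed"
    return state
```

The returned states are what a log entry or module color (for example orange for suspended) could be derived from; polling and a manual 'resume' button would re-run execute once the background job reports completion.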

Feb 21, 2012

  • Wendel: 3D crowdlabs -- found some issues that have been fixed
    • TODO: create a function to add a package and reload a package
    • TODO: add a function to automatically modify a workflow to produce WebGL results
  • Tommy: Crowdlabs
    • allow users to upload packages -- similar to safe workflows
  • Fernando: VisTrails/Oracle
    • TODO: save db state ID in the log, associated with a workflow execution
    • Oracle package supports all functionality of the MySQL package
    • already supports 'reproduce'
  • Emanuele:
    • Updated VisTrails doc to include info about server setup
    • working on the release
    • hard to use VisTrails as a server because of the dependence on X server; when one does not need the GUI, it would be nice to transform the workflow so that it could run in batch, for example, use VTK offscreen support
    • caveat: requires VTK to be compiled with offscreen support
    • it is not clear if this would work for matplotlib, which also requires X server
    • http://stackoverflow.com/questions/4931376/generating-matplotlib-graphs-without-a-running-x-server

Feb 14, 2012

To Do:

  • new branch in crowdlabs for Wendel
  • update doc
  • Oracle package
  • release 2.0 beta
  • Future tasks: packaging, testing
  • Update on crowdlabs (Wendel and Tommy)
    • Wendel had problems with black images being returned by the VisTrails server, we suspect the issue is related to the virtual frame buffer---he will investigate
    • We need a script to migrate old crowdlabs information into the new format
    • There is a new GIT repo for crowdlabs, currently Emanuele, Tommy, Juliana and Wendel are on the notification list
    • Tommy is almost done with the update, need to clean up documentation
  • VisTrails + Oracle Total Recall (Fernando)
    • Overview of total recall
    • Having the ability to execute and reproduce
    • To reproduce go through the log; only works for read-only queries
  • UVCDAT
    • Done, except for icons
  • Tighter integration with scripting
    • require the modules to support serialization into Python
    • not clear if we can use the compute method as is; python serialization is simpler than compute
    • need to better investigate this -- ask Dave about his implementation for the VTK scripts
    • in UVCDAT there is translation only from pipeline to script, not the other way around

Jan 31, 2012

  • Climate Tools and Provenance
  • Server:
    • Save workflows to the server database
    • Keep the parameter changes
    • Add annotation that says when a version is WebGL enabled
    • Just send actions that are the changes that are being made in WebGL version


Jan 24, 2012

  • TE: crowdLabs is running, able to upload a file from VisTrails
    • still debugging, found transaction error when writing to database maybe order
  • defaults:
    • problem is that we currently are not showing default values for modules in the port panel
    • don't know if we should clutter the interface with all defaults, should we save them as provenance?
    • set all defaults when a user drags in a module, this way we have provenance
    • hide defaults in the interface--if user clicks on a port, show it, but don't automatically expand it
  • error handling: return values and exceptions
    • need to track down bug from Colin's workflow

Jan 20, 2012

  • Updating crowdlabs install
  • calling VisTrails from command-line using "python vistrails.py"
  • want to change this to use the VisTrails server
  • create a new page, option to view visualization in 3D
  • send GET request to Wendel's 3D stuff
  • running on git version of VisTrails (so the code is up-to-date)
  • server currently has an issue with the change to core_no_gui code
  • crowdlabs, one change to have "View Vistrail in 3D" link
  • one new parameter to settings (ip address)
  • Tommy has crowdlabs code running with new versions of Django
  • Wendel's code is on the vgc git repo
  • Branch a new version for crowdlabs

January 10, 2012

  • need to support mandatory vs. optional attributes
  • find the status of remote execution
  • Dave will reply to question from QGIS developer
  • Emanuele has sent information to LLNL
  • To do for VisTrails
    • test server version for CrowdLabs
    • revisit CrowdLabs--issue: Django has evolved, many libraries no longer exist