Development

From VistrailsWiki
Revision as of 21:34, 22 January 2014 by Remi (talk | contribs) (→‎Jan 29, 2014: str-format-module)

2014

Jan 29, 2014

Updates

Items to Discuss

  • [RR] str-format-module: merge in? Having to open the configuration window and click on a button is awkward

Jan 22, 2014

Updates

  • [RR] Branches to be merged in: reload-disabled-package (#714), ungroup-keep-disconnected-ports_/hybrid
  • [TE] VisTrails 2.1.1 released
  • [DK] matplotlib, uuid work

Items to Discuss

  • [TE] Automatic Looping
    • Using ListOf module as input will trigger the looping
      • Runtime type checking because we don't have a ListOfT class for each type.
      • [DK] USGS interested in working with this
      • Add a merge module to make it clear where the iteration starts and ends
      • Have different type of ports
      • Remi to talk to Huy about streaming ideas
      • can we detect the iteration and draw those modules differently
      • decided that some fold/merge module would be useful, even if we don't always need to use it explicitly
  • [RR] Ticking in matplotlib
    • formatters, locators, tick scale
  • [RR] Sort tickets? There are issues for the 2.0 milestone and issues that affect released versions. Tickets assigned to 2.1 are here.
  • [DK] To contact USGS about the new persistence package and get feedback
  • [DK] Merge weather and subway examples to use new tabledata package
  • Look into pandas http://pandas.pydata.org/
    • could be helpful to support some of the table operations
    • in the future, we may want to support it for data analysis and stats (but for this, additional dependencies are required)
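
The automatic-looping idea from the [TE] item above can be sketched as follows. This is a minimal illustration only: a plain Python function stands in for a module's compute step, and the names `auto_loop` and `element_type` are hypothetical, not VisTrails API. Element types are checked at runtime, since the notes say there is no ListOfT class for each type.

```python
def auto_loop(compute, value, element_type):
    """Apply compute once, or once per element when value is a list.

    Hypothetical helper, not VisTrails API: element types are checked at
    runtime because there is no dedicated list class per element type.
    """
    if isinstance(value, list):
        for index, item in enumerate(value):
            if not isinstance(item, element_type):
                raise TypeError(
                    "element %d is %s, expected %s"
                    % (index, type(item).__name__, element_type.__name__))
        # A list input triggers the loop; results are collected in order.
        return [compute(item) for item in value]
    if not isinstance(value, element_type):
        raise TypeError("expected %s" % element_type.__name__)
    return compute(value)


def double(x):
    """Example compute step used to exercise auto_loop."""
    return 2 * x
```

Calling `auto_loop(double, [1, 2, 3], int)` loops over the list, while `auto_loop(double, 3, int)` runs the step once. An explicit fold/merge step, as discussed, would consume the collected list.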

Jan 15, 2014

Updates

  • [RR] improves-logging:
    • Pass exceptions to it directly for single line, or traceback.format_exc() for traceback
    • WARNING level now printed in the console by default, -V 1: INFO (log() calls), -V 2: DEBUG (debug() calls)
    • Messages view now shows whatever is selected, no matter the console log level
    • Python warnings get captured
    • Use warnings.warn(..., category=VistrailsWarning|VistrailsDeprecation) to warn once
    • Backport to v2.1? (so deprecation warnings actually show up)
      • -> Yes, include in v2.1.1
  • [TE] Hadoop package documented here: http://www.vistrails.org/index.php/Hadoop_Package
  • [DK] UV-CDAT, LSU
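
The "warn once" behaviour mentioned in the improves-logging item can be sketched with the standard `warnings` module. The two classes below are stand-ins defined only for this example; their real base classes in VisTrails are not stated in these notes and are assumed here.

```python
import warnings


class VistrailsWarning(UserWarning):
    """Stand-in for the category named in the notes (assumed base class)."""


class VistrailsDeprecation(VistrailsWarning):
    """Stand-in deprecation category (assumed hierarchy)."""


def old_call():
    # stacklevel=2 attributes the warning to the caller's line, so the
    # "once per location" rule applies per call site.
    warnings.warn("old_call() is deprecated", category=VistrailsDeprecation,
                  stacklevel=2)


def count_warnings(calls):
    """Call old_call() repeatedly from one line and count what is reported."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("default")  # report once per location
        for _ in range(calls):
            old_call()
    return len(caught)
```

With the "default" filter action, repeated calls from the same line are reported a single time, which is why a deprecated call inside a loop does not flood the console.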

Items to Discuss

  • [RR] input-module-no-subclass: input ports of type 'Module' shouldn't accept any output? These are used for 'self'->controlflow module connections...
    • controlflow modules have a function port that accepts a module
    • check for modules that might use Module for something else
    • suggestion is to use Variant for any type being supported (StandardOutput, List, others)
    • warns for now (doesn't actually disallow the connection until next version)
    • -> David wants to look at this more
  • [RR] ungroup-keep-disconnected-ports: alternative ungrouping, keeps pipeline disconnected and preserves InputPort/OutputPort modules
    • ideally, want to edit in place, could try to emulate the edit subworkflow code here, just have to save things back to the original vistrail
    • -> hybrid approach, materialize the unconnected ports
  • Bugfix release? Would solve issues with server and parameter exploration. (also, logging improvement?)
    • [TE] Also fixes to workspace
  • [DK] font issue for OS X 10.9
  • Look at Java support in Java?
    • can we leverage Rémi's past work here? on Weka
    • pyjnius: cannot subclass a Java class in Python, should be able to create and call out to java classes
    • issue with Jython is the interface
    • also had java-based spreadsheet
  • also look at moving VTK to a matplotlib style generation

Jan 8, 2014

Updates

  • [TE] Hadoop package: Added -combine option for Cs9223_Mapreduce_Assignment
    • tricky to set up the package still, need documentation to understand how to assemble workflows from scratch
  • [RR] Logging improvements underway (branch 'improves-logging')
    • Fixes some logging gotchas
    • Better exception printing (pass them directly to critical|warning|log)
    • WARNING is now the default level for the console (only CRITICAL displays a popup)
  • [RR] Alternative Mac .app (example dmg)
    • Standard build of Python (currently as a Mac .framework)
    • Have to relink libraries with install_name_tool
    • Seems to work
    • Advantages: standard, full build of Python; pip will work
    • Last problem: if Qt is installed, it gets loaded and conflicts with bundled Qt (I missed a dylib link)
  • [DK] Work with USGS, refactoring version trees

Items to Discuss

  • Colin's feedback on loops is that it has taken some time to wrap his head around what is going on
    • what does Taverna do? automatically determines when a collection is being input and does iteration over the input and determines what output looks like
    • look at what this would take, figure out the best way to attack this (Automatic loops)
  • Colin also asked about use of persistence in another module

Jan 3, 2014

Updates

  • [TE] VisTrails 2.1 released
  • [DK] UUID branch, USGS work, upgrade recursion
    • van Wijk on tree vis, and Buneman on XML Updates

Items to Discuss

2013

Dec 25, 2013

  • No meeting, holiday break

Dec 18, 2013

Updates

  • [DK] USGS module-suspended branch, still concern about usability for loops
  • [TE] Added mac binary testing to buildbot
    • Working on windows script
  • [BB] Started work on separating uvcdat gui from vistrails (use git submodule)

Items to Discuss

  • 2.1 release date (2013)
    • ALPS integration
    • Would like to release in 2013
  • [DK] Updating signatures during execution? (e.g. in loops)
  • [DK] merge rename-api?
  • [DK] log schema in master
    • 1.0.4 schema on master
  • [DK] upgrade logic (where to recurse)
    • remap method:
  • [BB] uvcdat packages
    • starting to separate uvcdat from vistrails.git, uv-cdat packages
    • try to generalize the package list to support more package locations
  • [BB] Disabling provenance
    • maybe focus more on scripting (more a need to have scripting work)
  • [BB] VTK 6
    • see migration guide

Dec 11, 2013

Updates

  • [TE] Merged job-info into master
    • Updated PBS and hadoop to use job-info
  • [RR] dont-use-modules-as-data: works, awkward in a limited number of cases (two classes instead of one), brought up some issues (listed on #804)

Items to Discuss

  • [DK] Job submission in 2.1
    • two modules are executing before the workflow stops (in "Stop on Error" mode) #806
    • job info in 2.1.
  • [RR] remove-socket: change is there, improves single-instance handling, but still race condition
    • creates a file (a socket) on which new instances communicate with the existing one
    • if VT crashes, the file stays behind even though no one is connected so next startup will fail
    • check process id?
  • interaction provenance: need API, how to best capture this
    • concern about the interaction handlers
  • module upgrades: [DK] to work on
  • mashup upgrades: [TE] to look at to determine next step
  • webServices is deprecated: we should get rid of this
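
The "check process id?" suggestion for the stale single-instance file can be sketched as below. This is an assumption-laden illustration, not the remove-socket branch: it uses a plain pid file instead of a socket, the function names are hypothetical, and `os.kill(pid, 0)` is only a safe existence check on POSIX systems.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this id exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def claim_instance_file(path):
    """Return True if this process may become the single instance."""
    try:
        with open(path) as handle:
            owner = int(handle.read().strip())
    except (OSError, ValueError):
        owner = None  # missing or unreadable file: nobody owns it
    if owner is not None and owner != os.getpid() and pid_is_alive(owner):
        return False  # a live instance already owns the file
    # Stale or absent: record our own pid. This check-then-write is not
    # atomic, so it narrows the race condition rather than removing it.
    with open(path, "w") as handle:
        handle.write(str(os.getpid()))
    return True
```

A file left behind by a crashed instance names a dead process, so the next startup can take it over instead of failing.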

Dec 6, 2013

Updates

  • [TE] VisTrails 2.1 ready - waiting for updated ALPS version

Items to Discuss

  • [DK] UV-CDAT Meeting Update
  • [RR] Pending examples on new-matplotlib-pkg branch
  • [RR] Module types (after discussion on #798)
    • We use Modules to represent both computations and types
    • Sometimes, objects of the type are instances; sometimes not (Integer and List vs File and vtkActor)
    • Passing down Module objects is dangerous (keeps references to moduleexecution, upstream modules, logger, even interpreter)
    • Keeping references that we don't need, references to pipeline, etc., memory concerns
    • some modules (Map, If) may need executing infrastructure
    • note that anything accessed downstream should be a "data field" not a core Module field
    • Remi to try to make change (test here)
  • [TE] What to include in next vistrails version (the multiprocessing update?)
    • multiprocessing?
    • unicode support?
    • uuid identifiers?
    • Improvements to startup, configuration, parameters?
    • start working on merges, may use other branches first, but we need to try to start integrating
  • Last few changes for 2.1?
    • XSL module? (#292)
    • Better handling for disconnected single-instance socket (WIP, #793)
  • py2app?
    • create app bundle using virtualenv?
    • disk space is about the same
    • try on build machine (10.7)

Nov 27, 2013

Updates

Items to Discuss

  • [TE] Fixing bugs for 2.1
    • Synch release with ALPS 2.2

Nov 20, 2013

Updates

  • [RR] Branches to be merged in: tuple-fixes (#224), for-module (#282), richtextcell-rtf-support (#789, needs pyth lib), remove-lockmethod (#791)

Items to Discuss

  • [TE] Fixing bugs for 2.1
  • [DK] Ordered connections, scripting ideas
  • [DK] Enforce connection cardinality?
  • [RR] Documentation/API effort? (improve-api, parameter-configuration-api, rename-api)
    • [DK] look into merging master
  • [BB] Speeding up vtk init
    • Store auto-generated results in a file in the vistrails user directory?
    • [DK] No, run the generated spec once and store it until VTK version is updated
      • Auto-generate python files for the vtk package, which then loads like any other package
    • [BB] Profile vtk load and unload
  • [RR] Parallel multithreading
    • base functionality is there
    • working to limit data movement, persistent file tie-in
    • the JobSubmission package works to run later jobs
    • partition workflow to take into account dependencies, right now, just push modules out when there are extra engines

Nov 13, 2013

Updates

  • [RR] Branches to be merged in: improves-module-doc (#426), remove-persistence_exp (#703)
  • [DK] Scripting support (parsing python source to guess ports), translate python functions to VT modules

Items to Discuss

  • [RR] Delete old branches? Development Branches#To be deleted
    • Yes, this is fine, commits are preserved (just not named any longer)
  • [RR] matplotlib figure/plot modules problem (#396)
    • We are using matplotlib backwards
    • try to defer code execution until we hit MplFigure (just save code in a buffer)
  • [BB] Add uvcdat branches to github
    • Question about whether to make vistrails a submodule, move the repository to github?
    • Do we create a separate repository here?
    • [BB] to talk to Claudio about this
  • [TE] What to include in 2.1
    • Fixing all tickets would delay the release. Some of these are present in 2.0 so it would still be a good release.
    • Suggest making feature releases (2.2, 2.3, ...) for the working branches that are not ready.
    • (updated with (status) as of 11/20 -- RR) (11/26 -- TE)
    • 224(fixed) tuple objects - move to 2.x?
    • 248(postpone) multiple 1D transfer function - priority?
    • 277(fixed) & 722(fixed) pipeline validation bug - priority?
    • 282(fixed) fold flexibility
    • 396 matplotlib figures (see above)
    • 490(postponed) add label via api
    • 581(reprod?) latex: clicking figure in pdf - reproducible?
    • 652(postponed) helper method for function upgrades - move to 2.x?
    • 670(fixed) Traceback in configuration window - easy fix?
    • 699(postponed enh.) parameter-configuration-api ready for 2.1?
    • 709(fixed) & 712(postponed) abstraction bugs - move milestone to 2.2? (use uuids) - no, these are more simple ones
    • 710(postponed) Module configuration widgets should be default-aware - fix for 2.1?
    • 736 update documentation
    • 754(question) usersguide controlflow - fix Conditional module
    • 755(postponed) new persistence package
    • 762(tbm) import sys package
    • 771(fixed) create schema if missing?
    • Also see Development_Branches

Nov 6, 2013

Updates

  • [RR] Branches to be merged in: disable-thumbnail-test-old-vtk (#764), cacheable-controlflow (#778)
  • [RR] Added a level to the LoopExec structure (LoopExec/LoopIteration), doesn't work, need help (#774)
  • [TE] job-info done - ready for integration with other branches
  • [RR] to be closed: #757 #683

Items to Discuss

  • [RR] Data movement with multithreaded-interpreter
    • Right now, single module (opt. Group) sent with inputs, outputs & provenance sent back
    • Input/output serialized with pickle
    • Doesn't work with e.g. File (should allow modules to customize?), might be very slow (send multiple modules to minimize data movement?)
    • [JF] ReproZip does some rewriting to change filenames
    • Implement the simplest solution.
    • For now, look at fixing this for simple File modules
  • [TE] Update ticket milestones and sort branches by milestone
  • [TE] 2.1 release? Specify what we want to put in the release
    • Windows 8 build machine crash
    • [TE] Go over open questions, distill what has been fixed
  • [BB] UV-CDAT Animation issues
  • [BB] ClimatePipes UI work, minor fixes and features
  • Try to decide features, release date for 2.1 for next week. Look at tickets, try to determine multi-threaded and command-line/documentation progress.
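
One way a File-like value could customize its own serialization, as raised in the data-movement item above, is sketched here. `FileRef` is a hypothetical stand-in and not the VisTrails File module; the sketch assumes the simplest approach of shipping the file contents inside the pickle, which will be slow for large files.

```python
import os
import pickle
import tempfile


class FileRef(object):
    """Hypothetical stand-in for a File value passed between modules."""

    def __init__(self, name):
        self.name = name

    def __getstate__(self):
        # A bare path is meaningless on another machine, so ship the bytes.
        with open(self.name, "rb") as handle:
            return {"suffix": os.path.splitext(self.name)[1],
                    "data": handle.read()}

    def __setstate__(self, state):
        # Recreate the file locally and point at the new path.
        fd, self.name = tempfile.mkstemp(suffix=state["suffix"])
        with os.fdopen(fd, "wb") as handle:
            handle.write(state["data"])


def round_trip(ref):
    """Simulate sending a value to a worker and receiving it back."""
    return pickle.loads(pickle.dumps(ref))
```

After `round_trip`, the copy points at a freshly created file with the same contents, so downstream code that opens `copy.name` keeps working on the receiving side.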

Oct 30, 2013

Updates

  • [RR] Rewrite of the logger (#774) nearly completed (loops haven't been tested extensively).
    • It looks like the test suite doesn't really cover that
    • Then multithreaded-interpreter can be stabilized
    • job-info?
  • [DK] Further work on configuration/preferences
    • Make it easy to add preference settings to GUI
  • [TE] job-info
    • Fixed issues with job identifiers
      • Can not use module signatures
    • Working on examples

Items to Discuss

  • [RR] Some tests [1] [2] fail on Fedora (both v2.1 and master) but pass everywhere else, does anybody know why? (added issue 775)
  • [TE] Signatures are not always unique---can have two different groups, do not depend on upstream?
    • Groups that are duplicated in the same pipeline have the same signature, execute only once? issue 765
    • Does looping corrupt the cache? (e.g. with Map) (it does, added issue 777)
    • Extra ModuleSuspended?
    • Rémi showed some bizarre behavior with the same subworkflow outside a group and one in a group
    • Two options to make this uniform:
      • Always compute multiple copies when they appear in the same pipeline
      • Issue here is that the cache only contains one entry despite there being multiple copies of the same subworkflow
      • Always use the cache (build it incrementally)
      • Problem here is that if we have something which isn't quite stateless (e.g. vtk), having no copy might cause only one copy of the output
  • [DK] Cache replacement policy?
    • Ben may have done something with this for UV-CDAT
    • Issue was determining the size of objects in Python
  • [TE] Different brightness of images produced for VTK examples on Debian?
    • Different versions of VTK? 5.4.2 (bundled with Debian 6)
    • Added issue 764
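
The signature discussion above can be illustrated with a toy cache. This is a sketch under stated assumptions, not the VisTrails signature scheme: the `signature` and `execute` functions are hypothetical, and the hash covers only a module type name, its parameters, and its upstream signatures.

```python
import hashlib


def signature(module_type, parameters, upstream_signatures=()):
    """Hash a module's type, parameters, and upstream signatures."""
    digest = hashlib.sha1()
    digest.update(module_type.encode("utf-8"))
    for name, value in sorted(parameters.items()):
        digest.update(("%s=%r" % (name, value)).encode("utf-8"))
    for upstream in sorted(upstream_signatures):
        digest.update(upstream.encode("utf-8"))
    return digest.hexdigest()


cache = {}


def execute(module_type, parameters, compute, upstream_signatures=()):
    """Run compute() unless an identical signature is already cached."""
    key = signature(module_type, parameters, upstream_signatures)
    if key not in cache:
        cache[key] = compute()
    return cache[key]
```

Two copies of the same subworkflow hash to the same key, so the second copy reuses the single cache entry. That is the "always use the cache" option; for modules that are not quite stateless, both copies would then share one output object.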

Oct 24, 2013

Updates

  • [TE] job-info
    • Core functionality complete
    • TODO: Testing, console mode, integration with batchq/hadoop/pbs
  • [RR] Branches to be merged in: install_package_requirements, non-english-locale
    • Yes
    • non-english-locale - strange files in db

Items to Discuss

  • [TE] OSX 10.9 does not work with VisTrails beta
    • Has anybody upgraded yet?
  • [TE] Do control flow modules get unique signatures?
    • Logs have extra "completed" instances of modules used in a Map.
  • [TE] Debian thumbnail difference

Oct 16, 2013

Updates

  • [RR] Branches to be merged in: install_packages_requirements (issue 694)
  • [DK] Rewriting startup and configuration
    • unify command-line options that match configuration options
    • reduce number of single-character flags (e.g. "-U")
    • push startup into db to enable translation for future versions
  • [TE] New test system for Minimum Requirements running Debian 6 (Squeeze)
  • [TE] job-info: Work in progress, core functionality implemented

Items to Discuss

  • [RR] Support for spawned VisTrails instances (multithreaded branches, parallelflow)
    • used to use local configuration which is problematic
    • just use configuration that always add packages
    • Sometimes need to re-enable the spreadsheet so that certain modules (e.g. VTKCell) are available.
    • [TE] Perhaps this could support the VisTrails server too?
      • Server needs to be looked at because it seems to be out of date
  • [TE] Update crowdlabs for 2.1? At least start testing
  • [TE] New test system for Minimum Requirements running Debian 6 (Squeeze)
  • [DK] Command-line style
    • subcommands? e.g. "vistrails batch", "vistrails paramexp" [TE] "vistrails job"
      • [TE] makes sense to have different commands
    • support multiple vistrails? multiple versions? (e.g. vistrails /path/to/example.vt -v tag -v 12, /path/to/example.vt:12)
    • package options (move spreadsheetDumpPDF, fixedSpreadsheetSize but allow flags (e.g. -P spreadsheet.outputType=pdf))
    • GUI Configuration also needs to change/improve: "Always", "Never", "For this session"?
      • Always show the configuration options, possibly flag the ones changed with an icon
    • Other changes? Rewrite Startup Usage
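
The subcommand and `path:version` ideas from the command-line item can be sketched with `argparse`. Everything here is hypothetical: the parser layout, the `parse_target` helper, and the repeatable `-P` flag shape are illustrations of the proposal, not the VisTrails command line.

```python
import argparse


def parse_target(text):
    """Split 'path[:version]' into (path, version or None)."""
    path, sep, version = text.rpartition(":")
    # Only treat the suffix as a version when it follows a .vt path; this
    # keeps a bare "example.vt" (and Windows drive letters) intact.
    if sep and path.endswith(".vt"):
        return path, version
    return text, None


def build_parser():
    """Build a parser with one subcommand per execution mode."""
    parser = argparse.ArgumentParser(prog="vistrails")
    # -P package.option=value may be repeated (hypothetical flag shape).
    parser.add_argument("-P", dest="package_options", action="append",
                        default=[], metavar="PKG.OPTION=VALUE")
    commands = parser.add_subparsers(dest="command")
    for name in ("batch", "paramexp", "job"):
        sub = commands.add_parser(name)
        sub.add_argument("targets", nargs="*", type=parse_target)
    return parser
```

For example, `build_parser().parse_args(["batch", "/path/to/example.vt:12"])` yields the command `batch` and one target split into its path and version.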

Oct 9, 2013

Updates

  • [DK] Rewrite startup code
    • use db to better structure startup.xml
    • default, persisted, current configuration
  • [TE] job-info
    • Persist jobs to disk
    • Cache finished jobs to disk
    • Run from command-line
    • Integrate with BatchQ/PBS/Hadoop/parallelflow

Items to Discuss

  • [BB] VTK Startup Times
    • can we run this beforehand?
  • [TE] Feedback on WritingCommitMessages?
    • Approved.
  • [TE] Minimum Requirements
    • Add build machine with minimum requirements?
    • What Requirements?
      • Python 2.6
      • Qt 4.6
      • 5 years for others?
  • [TE] Status of PostgreSQL support (Ricardo@vistrails-users)
  • [RR] Matthias wants the archive to have choosable filenames (can be done but makes collisions possible)
    • two separate features: persistent caching and archival of final results
    • key part is probably the transformation between the two: want to make this easy to go from an exploratory workflow to one that works with the archive

Oct 2, 2013

Updates

  • [DK] rename-api branch updated with documentation
    • includes name changes to match underscore style
  • [DK] module configuration widgets
  • [TE] Examples in usersguide are now clickable
  • [TE] Added registration of file types on Linux

Items to Discuss

  • Parallel Design for UV-CDAT
  • New persistence package
    • add UUIDs as identifiers for vistrails/actions so that each workflow could have a unique id
  • [DK] When/if to merge rename-api?
    • Should be ok to merge into 2.1?
  • [TE] Show pipeline or history view when opening a vistrail?
    • Either use configuration setting, or default to history view if vistrail contains tagged workflows.
    • Default to the history view for saved files but make a configuration setting
  • [TE] New commit message syntax
    • #ticket/bugfix/feature 777 (on third line)?
    • why not Ticket: Bugfix: Feature: on last line, like Signed-off-by lines? or could be on any line really
    • Go with last line and don't start with hash as this may conflict with comments (which start with a hash, too)
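
The agreed commit-message convention (a `Ticket:`, `Bugfix:`, or `Feature:` entry on the last line, without a leading hash) could be checked with a small parser. The function name and the exact accepted keys below are assumptions made for illustration.

```python
TRAILER_KEYS = ("Ticket", "Bugfix", "Feature")


def parse_trailer(message):
    """Return (key, value) from the last line of a commit message, or None."""
    lines = [line for line in message.splitlines() if line.strip()]
    if not lines:
        return None
    key, sep, value = lines[-1].partition(":")
    if sep and key.strip() in TRAILER_KEYS:
        return key.strip(), value.strip()
    return None
```

A message ending in `Bugfix: 777` parses to `("Bugfix", "777")`, while a message that only mentions `#777` in its first line yields `None`.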

Sep 25, 2013

Updates

  • [RR] I forgot to mention Development Branches
    • [TE] Thanks, let's continue to use this and keep it up to date.
  • [DK] Rework Developer API and Documentation
    • Documentation: API and Writing VisTrails Packages
      • Reorganized the "Writing VisTrails Packages" chapter; I could never find the text I wanted: Updated Chapter
      • Still work to be done
      • Use sphinx autodoc to generate real API documentation: API Docs
    • Rework the whole Module API but maintain old calls for backward compatibility:
      • setResult -> set_output
      • getInputFromPort -> get_input
      • drop "FromPort" on other similar calls
    • Add vistrails.core.modules.config classes ModuleSettings, InputPort, OutputPort, ConfigWidgets to give kwarg capabilities
  • [TE] VisTrails v2.1beta2 Released
  • [TE] Linking to examples from usersguide
    • Uses vtl files on server to point to local examples
    • Works with both html/pdf
    • Cannot use crowdlabs because it cannot link to mashups and parameter explorations
    • .vtl file extension needs to be set up for this to work
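
Keeping the old Module calls alive next to the renamed ones could look like the sketch below. The class is a toy stand-in, not the real VisTrails Module, and emitting a `DeprecationWarning` from the old names is an assumption; the notes only say that the old calls are maintained for backward compatibility.

```python
import warnings


class Module(object):
    """Toy stand-in for the VisTrails Module class (not the real one)."""

    def __init__(self):
        self.inputs = {}
        self.outputs = {}

    def get_input(self, port):
        return self.inputs[port]

    def set_output(self, port, value):
        self.outputs[port] = value

    # Old names stay available and forward to the new ones.
    def getInputFromPort(self, port):
        warnings.warn("getInputFromPort is deprecated, use get_input",
                      DeprecationWarning, stacklevel=2)
        return self.get_input(port)

    def setResult(self, port, value):
        warnings.warn("setResult is deprecated, use set_output",
                      DeprecationWarning, stacklevel=2)
        self.set_output(port, value)
```

Existing packages that call `setResult` or `getInputFromPort` keep working unchanged, and new code can use the underscore-style names directly.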

Items to Discuss

  • [BB] Aashish and Utkarsh from Kitware may be interested in using VisTrails in another project of theirs, and they had a few questions
    • How hard would it be to export a workflow as a python script to run independently of vistrails?
      • easiest if there is a 1-1 mapping from lines of source code to module creation/parameter steps
      • would be nice to know what the goal of the work is (is this VTK or ParaView related)?
    • Is it possible to extract just the workflow part of Vistrails as a lightweight api with or without provenance?
      • probably can be done, but currently there is no "core.workflow" project so action and module objects are both currently in "core.vistrail".
      • would be nice to separate the workflow pieces but this is currently not done, if it happens, would like to include in VisTrails
    • Allow folks from Kitware to push code to the VisTrails repo (which after code review ends up in master).
      • Sure.
  • [RR] core.system rewrite: merge it in? (#743)
    • executable_is_in_path() returns True or False, only returns True if executable is on the PATH (previously: returned executable name if on the PATH or in top directory)
    • get_executable_path() looks for executable in PATH and top directory and returns full path (previously: didn't look in top directory, didn't return file with extension)
    • Removes executable_is_in_pythonpath() (purpose unknown) and example()
    • Trying to remove temporary_directory(): remaining calls look like bugs (#744)
    • Can test branches on buildbot for other branches using the web API
  • [RR] Meeting with Matthias: rework the persistence package
    • Summary of meeting
    • [DK] Dropping git is fine, but one of the reasons for the choice was the ability to pull/push files to different machines.
    • [DK] Do the key-value indexes work for searching for time?
    • [DK] Are there existing flat object stores (with hashes) that we can use?
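
The new behaviour described for the core.system functions can be approximated with `shutil.which`. This is a sketch of the described semantics only, not the code from the branch; in particular, passing the top directory as an explicit `top_directory` argument is an assumption.

```python
import os
import shutil


def executable_is_in_path(name):
    """Return True only if name is found on the PATH."""
    return shutil.which(name) is not None


def get_executable_path(name, top_directory):
    """Return the full path to name, searching PATH then top_directory."""
    parts = [part for part in (os.environ.get("PATH", ""), top_directory)
             if part]
    # shutil.which returns None when nothing matches; on Windows it also
    # tries the extensions listed in PATHEXT.
    return shutil.which(name, path=os.pathsep.join(parts))
```

The first function is a pure yes/no check against the PATH, while the second also looks in the top directory and returns the full path.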

Sep 17, 2013

Updates

  • [TE] build notifications
    • If you break the build you get notified
    • New mailing list vistrails-build@vistrails.org
    • Will send digests to vistrails-dev once tested

Items to Discuss

  • [TE] New bugs found by Matthias
    • FileSink overwrite port FIXED
    • Git username missing FIXED
    • Persistent directories
      • The main issue was fixed, empty directories possibly still an issue
    • Release new beta once these fixed?
      • Yes.
  • [RR] Logger problem with multithreading: where to store parent executions?
    • link between Group module in main workflow and modules inside the group
    • create a structure to track this information
    • make the modules themselves dictate the change, tell the logger when they have sub-executions
  • [TE] adding/improving examples
    • KEGG web service has moved to REST
      • Should we look into adding a REST package?
    • Add more controlflow examples for new modules and add to usersguide
    • Subworkflow, Group, Mashup, Parameter Exploration, aliases, vistrail variables - in usersguide, should we add examples?
    • Packages without examples:
      • ImageMagick
      • qgis
      • rpy
      • sql
      • tabledata
      • vtlcreator

Sep 11, 2013

Updates

  • [RR] Reworking logger because of issues and incompatibility with multithreaded-interpreter
  • [TE] Updating Job Monitor to support serialized jobs
  • [DK] Module/PortSpec APIs.

Items to Discuss

  • [TE] Issues raised by Matthias
    • IPython console does not receive console output FIXED
    • Map module broken FIXED
    • Persistent directories broken FIXED
    • Variant type checking FIXED by making it optional
    • Subworkflow upgrades FIXED?
    • Dulwich not included in Windows binaries FIXED in next release
    • Removing files in persistence fails:
      • We're rewriting history which causes issues with identifying versions by commit hashes
      • Delete is disabled
    • It seems to be working for him now...
  • [DK] Improve testing / code management
    • Are the VMs still running the test suite? Are we seeing errors? Can we get notifications?
    • Should we adopt a master/next/branches strategy like UV-CDAT uses?
    • Strategies for ensuring all files are added to commits?
  • [RR] VisTrailsWiki:About has an (outdated?) description of VisTrails but there is nothing on the main page
  • [TE] New binaries for beta being completed

Sep 5, 2013

Updates

  • [TE] Hadoop
    • Asynchronous hadoop streaming now works
      • Hadoop jobs detach using ModuleSuspended and persist when restarting vistrails
      • Uses a custom updateUpstream to execute Machine while skipping other modules if job is already running
      • Need to add widget for monitoring job output
      • Fetching job result files is not cached so it happens every time
      • Demo?

Items to Discuss

  • Multiple instances of VisTrails
    • There were some errors in the way messages were received and the error code propagated, should be fixed
    • One place where we exit() instead of returning code, want to make sure the correct shutdown code is used
    • Try to take the same path as the "error path" which shuts things down
  • [RR] Requirement installation from packages (issue 694): init.py is imported before the package_requirements() check
    • agree that we should change this, new packages should do py_import in package_requirements call
    • update release notes and manual to highlight this ("as of version 2.1...")
  • [RR] Remove "creating new figure" code from MplFigureCellWidget for issue 672
    • [DK] Should be able to remove this, left over from trials to move
  • [RR] Remove persistence_exp package? (or at least, code?) (issue 703)
    • [DK] Should be able to get rid of, should be all in the current "persistence" package
    • Update GUI view for persistence to allow more meaningful searches
  • [TE] Change app logic so that exit codes can be propagated correctly?
  • [DK] Fix defaults to be more universally known in the VisTrails code (see tickets 708 and 710)
  • [DK] Defaulted ports in Module.inputPorts versus ModuleRegistry.module_destination_ports
    • Module.inputPorts stores only connected or set ports (e.g. the user has done something to that port)
    • Iterating over Module.inputPorts will thus not include defaulted ports
    • This is good for packages like VTK where we do not want to call all of the methods that default these values since they were already defaulted by the constructor
    • This is not ideal for package developers who expect to see all of the ports that have values set in self.inputPorts
    • Compromise might be to have a new method in Module like getInputPortNamesWithValues()?
  • [DK] TODO: history of groups/subworkflows and start conversation on design
    • [RR]: really want to be able to edit groups in place
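
The proposed compromise for defaulted ports might look like the following. The class is a toy stand-in rather than the real VisTrails Module, and the `port_defaults` table is a hypothetical substitute for what the module registry would provide.

```python
class Module(object):
    """Toy stand-in, not the real VisTrails Module class."""

    # Declared ports and their defaults (None means no default).
    port_defaults = {"radius": 1.0, "label": None}

    def __init__(self):
        # Only ports the user connected or set, as described above.
        self.inputPorts = {}

    def getInputPortNamesWithValues(self):
        """Names of ports that were set explicitly or have a default."""
        names = set(self.inputPorts)
        names.update(name for name, default in self.port_defaults.items()
                     if default is not None)
        return sorted(names)
```

Iterating over `inputPorts` still skips defaulted ports, which suits packages like VTK, while the extra method gives package developers the full list of ports that carry a value.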

Aug 28, 2013

Updates

  • [TE] Hadoop: Making hadoop package use batchq for communication over ssh
    • first version of package needed to connect across firewall
    • need to make hadoop work, then reorganize
  • [BB] UV-CDAT: Get past the last few bugs
    • Release target is Sept. 1, or 7-10
  • [RR] Finishing up the threading stuff
  • [DK] Parameter widgets

Items to Discuss

  • [TE] 2.1beta binaries
    • Upgrading identifiers was broken. Dave, is this fixed? Should we update the binaries?
      • Think tickets 698, 699 could be fixed for this
    • Keep lung.vt although we cannot run it?
      • nice example of a branching tree
      • could try to use other data for this?
      • should we continue to distribute this?
    • Other issues?
  • Remi: nested modules
    • tie each module with a configuration
    • putting group information into the pipeline_view widget
  • Parameter config widgets
  • While module not in 2.0 yet, have a branch but need other bug fixes to make it work
    • incorporate this into 2.0 branch, but don't make separate release for this

Aug 21, 2013

Updates

  • [TE] Hadoop: cannot run client from outside poly
  • [TE] PE and mashup module id upgrades now fixed
  • [TE] Currently building 2.1 beta binaries
  • [RR] Execution target selection UI done for the multithreaded branch; needs to port IPython code.

Items to Discuss

Aug 14, 2013

Updates

  • [RR] New log schema on multithreaded-interpreter (see Multithreaded-interpreter#Changes)
  • [TE] Upgraded hadoop package to latest version, will try to integrate with the poly cluster
  • [TE] Need to fix PE and mashup module id upgrades
  • [DK] Output modules and PE upgrades

Items to Discuss

  • Branches to be merged: dont_stop_on_first_error, optimize-module (While module) (v2.0? see all changes to the pkg)
  • OutputModules questions: OutputModules
  • Package configurations with multiple instances of VisTrails (e.g. with iPython worker instances)
    • read-only mode that doesn't change configuration?
  • Module remap upgrades: how to enable them immediately
    • what type of remaps do mashups need? functions/parameters?

Aug 7, 2013

Updates

  • [TE] VisTrails 2.0.3 released
  • [DK] Much improved db tests
  • [DK] Many bugfixes to translations

Items to Discuss

  • [DK] More general parameter exploration?
    • Unify some of the mashup constructions with parameter exploration elements?
  • [DK] More general module outputs

Jul 31, 2013

Updates

Items to Discuss

  • [DK] Where to put code to store bundles; should this be versioned?
    • [TE] Yes, this makes sense, can address issues like mashups which didn't appear in earlier versions
  • [DK] Caching: vtDV3D has its own persistent module store so that underlying VTK modules do not need to be reconstructed each time
    • this means that the underlying VTK pipeline need not be reconstructed when a parameter on one of the files is changed
    • does this actually speed things up? if so, can we generalize this style of caching where structure is the same and updating parameters on the structure can be done more quickly than reconstructing a new (sub-)pipeline
  • [DK] UV-CDAT Developer Documentation
    • document how to create a new package in UV-CDAT (simplest example to begin, pull in pieces of DV3D to showcase more interesting pieces)
    • document pipeline_helper
  • [DK] Coordinated spreadsheet cells (e.g. selection in one cell updates another cell)
    • Jorge had a Coordinator module, VTK has some other coordination, DV3D may have some?
  • [BB] UV-CDAT
    • still looking at memory issues, using pysizer didn't seem to be picking up everything

  • [TE] Matthias has fixes for ALPS so we can build binaries

Jul 24, 2013

Updates

  • [DK] sqlalchemy changes working, need more tests

Items to Discuss

  • [EV] NASA work
    • familiar representation, convert between workflows specified already (e.g. using pyxb like java xml binding)
    • NED Client: everything is Java, can run on local machine or remotely
    • has workflow editor, tasks specified in hierarchical lists
    • each task can have an executable and task dependencies, specified in tabular view
    • different types of tasks (including loops, system, java)
    • scientists have shied away from using workflow tool directly
    • some scientists comfortable working at low level, running qsub directly
    • [JF] Look at UV-CDAT-style layer?
    • [EV] still need a way to control copying at the file level
    • workflow front-and-center versus configuration front-and-center
    • [JF] VisTrails treats changes to parameters the same way as changes to the structure of the workflow
    • configuration tree with lots of parameters that can be changed
    • only four workflow variables? (user, cwd, ...)
    • current scripts could be broken into smaller components
    • global variables on a per-workflow basis
    • creates running instance view showing completed/running tasks
    • can examine logs for each task
    • can suspend, resume, or abort a workflow
    • build coarser modules?
  • [BB] UV-CDAT bugs
  • [DK] versioning bundles? what is the common interface for all versions?
    • common locator and bundle -> specific version serializes/deserializes?
    • if so, bundle must be extensible
    • translation between versions happens at bundle granularity?
  • [RR] referencing modules in Groups

Jul 18, 2013

Updates

  • [TE] Win32 version of ALPS still does not work
  • [DK] sql-alchemy, use-uuid branches

Items to Discuss

  • [RR] Branches to merge? dont_stop_on_first_error, cltools
    • [TE] to test the cltools changes?
  • [TE] Update on NASA work
    • has script that he wants to send to PBS through VisTrails
    • issue comes up with PBS Pro (different versions of PBS)
    • restore missing or failed jobs
  • [JF] Microsoft using workflows in cloud
    • data movement is a first-class operation, moving data and computation around
    • also helps with resiliency
    • vision seems to take things beyond crowdLabs-style
  • [RR] Recording provenance in multiple machines (machine-wf-exec)
    • Duplicate entries (part of module runs on local machine, part on remote), this is ok, but want to know these are linked
    • Need more information in log about how execution occurred
    • Push ip info into machine?
    • have other information (name and/or MAC address)? privacy issues?
    • allow users to configure machine name as a preference?
  • [JF] have a wiki page for new features and bulleted list of items, things that changed, etc. so that we make sure to document these later
  • [RR] parallel execution: choose where to execute each module, happens at runtime, want to allow user to control this
    • where to store this information?
    • store separately, but write each configuration to the log upon execution
  • [BB] UVCDAT bug status
    • 5 or 6 bugs left
    • still looking at memory issues in UV-CDAT, seems to be a Python issue
      • read more on this? PySizer, Dowser
    • grower bug needs significant redesign (need two outputs)
  • [DK] uuid/bundle status

Jul 11, 2013

Updates

  • [TE] 2.0.3 release - Still waiting for reply from Matthias
  • [RR] tabledata merged in
  • [RR] Variant type checking #600
  • [DK] UUID

Items to Discuss

  • Demos of parallel flow and job submission
  • UVCDAT bugs [3]
  • Unique identifiers and configuration management
    • looking at use-uuid branch again, updating to master
  • [RR] Continuing on errors (dont_stop_on_first_error)
    • Need to make it clear that there was an error (both in GUI and via API)
    • able to get to results from each module via objects returned from execute
  • Pausing workflows
    • Interact using the progress dialog to cancel/pause
    • multithreading doesn't fix everything because of UI drawing from packages (e.g. matplotlib)
  • v2.1 release
    • put together binaries, test
    • anything left to integrate?
    • hold off on UUID changes, multithreading
    • waiting on Matthias's changes for 2.0.3 release
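
The "continuing on errors" item above can be illustrated with a small sketch. This is not the code on the dont_stop_on_first_error branch: the `run_pipeline` function, the dict-based pipeline description, and the three returned collections are hypothetical names chosen for illustration. A failed module only blocks the modules downstream of it, and the caller gets per-module results plus an explicit record of what failed.

```python
def run_pipeline(modules, upstream):
    """Run every module whose inputs succeeded; keep going after a failure.

    modules:  dict name -> callable taking a list of upstream results
    upstream: dict name -> list of upstream module names
    (both structures are hypothetical, for illustration only)
    """
    results, errors, skipped = {}, {}, set()
    remaining = list(modules)
    while remaining:
        progressed = False
        for name in list(remaining):
            deps = upstream.get(name, [])
            if any(d in errors or d in skipped for d in deps):
                # An upstream module failed, so this one cannot run.
                skipped.add(name)
            elif all(d in results for d in deps):
                try:
                    results[name] = modules[name]([results[d] for d in deps])
                except Exception as exc:
                    errors[name] = exc
            else:
                continue  # some dependency has not run yet
            remaining.remove(name)
            progressed = True
        if not progressed:
            raise ValueError("cycle or unknown dependency in pipeline")
    return results, errors, skipped
```

Returning `errors` and `skipped` separately is one way to make it clear, both in a GUI and through an API, that an execution was only partially successful.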

Jul 5, 2013

Updates

  • [TE] Implemented mashup animations
  • [TE] Eduardo at NASA is testing BatchQ-PBS.
  • [TE] Waiting to hear from Matthias about fixing ALPS for v2.0.3 release.
  • [TE] Testing BatchQ-PBS on the NYU HPC cluster.

Items to Discuss

  • Eduardo's work
    • Installing VisTrails (can download binary and then replace source with latest source code)
    • Workflow that allows people to create sandboxes or tap into existing runs in supercomputing environment
    • Scientists have specific models that use various parallel techniques that use different codes and libraries, may have to grab source code and compile
    • Important to be able to execute things from scratch (compile, set up environment, and kick off run), all of this has a dependency graph
    • Running means everything is staged and just using qsub
    • Monitoring using qstat to see where you are
    • Have to know if/when it makes it to the queue, which files were produced, model output
    • machines outside security boundaries still have two-factor authentication
    • HPC environment won't allow connections over 12 or 24 hours
    • Running scripts needs to be supported
    • VisTrails needs to be able to track the scripts and progress in workflow based on outcome of scripts
    • Not every failure means the next module cannot execute, could continue in other directions
    • [JF] Suggest looking at ModuleSuspended state and CLTools & controlflow packages
  • Terrence's work
    • Distributed computing, running processes on different machines
    • VT lacked multithreading at module level, adds layer so don't have to patch
    • implications of order
    • [TvZ] intention to use multithreading in the core of VisTrails?
      • [JF] Yes
    • can run modules on different machines, VisTrails just orchestrates remote processes
    • [JF] Have available at subworkflow level, would have to have something running that understands the subworkflow scheme
    • [TvZ] machines have "crippled" versions of VT that permit this sort of execution, crippled meaning that there is only enough to run the module, makes calls back to the remote machines
    • Not a solution for everyone (manage the distributed computing environment)
    • Use tagging for different workflow versions
    • Modules will store configuration
  • Provenance in multithreaded execution?
    • Fernando wrote for parallelflow package
    • Have some support for sending python code to remote machines

June 26, 2013

Updates

  • [DK] SQLAlchemy changes added, basic write/read works
    • need to check things like updates
    • mysql is one of few DBMS that supports multiple selects at the same time, need to special case *_many_* calls
  • [DK] Fixed bug in PortSpecItem serialization (new expandAction flag)
  • [TE] Working on animation support for mashups, similar to that in crowdlabs.
  • [BB] UV-CDAT: working to keep variables in cache (load variable now creates a pipeline)

Items to Discuss

  • [DK] Output types:
    • Discussed before, but want to allow modules that produce output like SpreadsheetCell or StandardOutput to support a get_output() call that takes a mime-type and returns None or an output that matches the type.
    • Could support, for example, inserts of textual values in LaTeX/wiki documents
    • Also allow InputModule to ensure that we can do data lineage-style queries?
  • Package flag on load_configuration should be removed
  • Parallel execution: write an interface so that you could choose on a per-module basis which type of execution it can support
    • parallelflow versus multiprocessing (xml serialization of pipeline versus pickling objects)
    • advantage of xml is that large intermediate objects don't need to be pickled
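
The get_output() idea from the "Output types" item above might look like the following sketch. Only the `get_output` name and its "mime-type in, output or None out" contract come from the notes; `OutputModule`, `register_output`, and `StandardOutputSketch` are hypothetical names, and this is not an existing VisTrails API.

```python
import html


class OutputModule:
    """Hypothetical base class for modules that produce displayable output."""

    def __init__(self):
        # mime-type -> zero-argument callable producing that representation
        self._renderers = {}

    def register_output(self, mime_type, renderer):
        self._renderers[mime_type] = renderer

    def get_output(self, mime_type):
        """Return the output for mime_type, or None if it is unsupported."""
        renderer = self._renderers.get(mime_type)
        return renderer() if renderer is not None else None


class StandardOutputSketch(OutputModule):
    """Toy module that can present one value as plain text or as HTML."""

    def __init__(self, value):
        super().__init__()
        self.register_output("text/plain", lambda: str(value))
        self.register_output(
            "text/html", lambda: "<pre>%s</pre>" % html.escape(str(value))
        )
```

A LaTeX or wiki exporter could then ask each output module for "text/plain" and fall back to an image type when it gets None.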

June 19, 2013

Updates

  • [DK] Bundles work
    • basic architecture: bundle contains a set of VisTrails entities, storage-specific serializers provide support for reading/writing the bundle
    • serializers can register sub-serializers for each entity, sub-serializers could be added at runtime (e.g. for packages that wish to save to a vt bundle)
  • [TE] VisTrails 1.0.3 binaries almost ready
    • Troyer is looking into a problem with ALPS on win64

Items to Discuss

  • [DK] Move DB support to something like SQLAlchemy?
    • can just use core SQL expression work for now
    • makes cross-platform DB access much easier
  • [DK] DB Updates would allow move of temporary vistrails to sqlite3 database
    • should make updates much quicker for auto-save
    • could consider updating database during runs == improved provenance
  • [JF] Why not just use XML fragments?
    • Actions are monotonic
    • [DK] Survey of the "Web" seems to show that people don't like fragments
    • [JF] Shouldn't be an issue
  • Tagging
    • Could we have a save/save as for tags (the tag moves down unless someone "tags as")?
  • [DK] UUID identifiers?
    • Switch identifiers at same time?
    • Allows easy merges
  • [TE] Status of import_rewrite branch
    • replace import statement to add "vistrails." prefix
    • kind of broken in master right now
    • new branch that has fixes for the
  • [RR] Parallel execution
    • controlflow issues
    • change in update_upstream
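
For the import_rewrite item above, adding a "vistrails." prefix to import statements can be sketched as a text rewrite. This is only an illustration under stated assumptions: the branch may work differently (the notes elsewhere mention an `__import__` override), the `OLD_TOP_LEVEL` list is a guess at the old top-level package names, and `add_vistrails_prefix` is a hypothetical helper.

```python
import re

# Assumed set of old top-level package names; the real list may differ.
OLD_TOP_LEVEL = ("core", "gui", "db", "packages")

# Match "import core..." or "from core... import ..." at the start of a line.
_IMPORT_RE = re.compile(
    r"^([ \t]*(?:from|import)[ \t]+)(%s)(?=[.\s]|$)" % "|".join(OLD_TOP_LEVEL),
    re.MULTILINE,
)


def add_vistrails_prefix(source):
    """Rewrite old-style imports in `source` to use the vistrails. prefix."""
    return _IMPORT_RE.sub(r"\1vistrails.\2", source)
```

A rewrite like this changes the name bound by a plain `import core` (it becomes `import vistrails.core`), so code that later refers to `core.something` would still break. That is one reason a text rewrite alone is not a full answer for PythonSource code.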

June 12, 2013

Updates

  • [TE] Initial wrapping for Madagascar
  • [RR] Fix imports for old-style packages
  • [DK] Bundles
    • Can load old vt files using the new interface, saving with manifest works
  • [RR] Python library installation
    • Can use pip (defaults to distrib package manager)
    • Works on Windows

Items to Discuss

  • [TE] old-style imports
    • Did you test NumSciPy?
    • Any ideas for PythonSource upgrades? Terminator "isosurface script" uses "core." imports.
    • A few ideas:
    • save the temporary package created in the look at call
    • still an issue with python dependencies (better to have a graph than a list), otherwise when packages have same python dependencies, one package "owns" them, and the other doesn't have
    • How to deal with PythonSource (__import__ comes from string exec)
  • [RR, DK] Package states
    • [RR] noticed that the code the preferences dialog uses to look at a package's dependencies, etc. uses a temporary object
    • Means that any imports in __init__.py are not captured since the package was imported as temporary
    • [DK] suggests that we have three states for packages: available, loaded, and enabled so that the look_at dependencies are captured but we can differentiate between packages that are loaded and those that are not
  • [DK] New package development (dynamic or generator-based?)
    • Noticed when looking at Madagascar wrapper
    • Difference between VTK and new matplotlib, VTK builds the wrapper from underlying code each time VT starts while new matplotlib builds the wrapper from underlying code only at build time
    • Likely to have better stability and consistency with generator-based
    • Easier to incorporate new code in the dynamic environment
    • Generator-based is more indirect
  • [DK] Port types
    • Consistent style for referring to port types?
    • Have old class-type method which doesn't work well for userpackages
    • String-based method may be harder to track errors?
  • [RR] Multithreading attempt
    • what are the issues with multithreading
    • uses a task system to track things
    • for backgrounding thread, using future library
    • since Python has GIL, can call setResult from different threads without worrying about memory
    • old modules still run the same way
    • can we get workflows running on a separate thread so the interface doesn't hang
  • [TE] Survey
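
The "backgrounding thread" point in the multithreading item above can be sketched with futures. The notes do not say which futures library the attempt uses, so this sketch assumes the standard `concurrent.futures` module (Python 3); `run_in_background` and its `on_done(result, error)` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_background(executor, compute, on_done):
    """Run compute() on a worker thread and report via on_done(result, error)."""

    def _finished(future):
        error = future.exception()
        on_done(None if error is not None else future.result(), error)

    future = executor.submit(compute)
    # The callback normally runs on the worker thread, not the caller's thread.
    future.add_done_callback(_finished)
    return future
```

Because the callback normally runs on the worker thread, a GUI would still have to hand the result over to its own thread before touching any widgets, which matches the concern about UI drawing raised in these notes.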

June 5, 2013

Updates

  • [TE] Added IPython Console
    • Requires installing IPython from source
    • Uses an updated PyQt interface
  • [DK] Bundle/locator updates, bug fixes

Items to Discuss

  • Workshop discussion
    • IPython console: required some pyqt changes
    • Packaging
    • Componentizing VisTrails
    • Wrapping Madagascar (estimate the amount of effort)
  • Pending feature branches:
    • import-rewrite
      • Importing NumSciPy now works, but not unloading
      • probably because global __import__ override
      • potential issue with __init__ and temporary package?
    • automatic-conversion
      • e.g. Float -> Int
    • multiline-strings
      • Already merged but there was a textedit parameter view issue (It works on Linux) [TE]
    • tabledata-pkg
      • merge the old tabdata package in here
    • apt
      • a lot of (bad) things can happen in apt, but under normal operation, things work
      • use pip or easy_install for python packages when possible?
    • new-pyqt-api
      • May break third-party packages
      • deals with QString, QVariant
      • Requires latest version from git
      • Connect to running python (without separate kernel)
  • Scripting support
    • Python script <-> workflow
  • [TE] v2.0.3 Release
    • MySQLdb was missing
    • Autoremove vistrails-single-instance file
    • Fixed latex examples [ES]
      • crowdLabs needs to be updated for full support
    • Fixes for JobView, save_vistrail_to_db, scipy.weave
    • More?
  • User survey
    • Claudio suggested talking with power-users like Tom Maxwell

May 28, 2013

Updates

  • [RR] 'tabledata' package: numpy arrays, CSV, data conversion
  • [DK] GMapCell, TableCell, bundles

Items to Discuss

  • [TE] Testing for db backend
    • Currently no automatic tests
    • Use sqlite, local/remote mysql
  • Data conversion modules
    • IBM WebSphere to convert image types
    • useful in scientific applications
    • Converter base module class
  • New user survey?
    • Look at old PDF, add new/old questions to a Google Doc
  • [RR] new __import__ override
  • [RR] Python 3: tricky (circular imports)
    • Probably easier to keep

May 22, 2013

Updates

  • [TE] branches are merged, but there is some work to be done
  • [TE] Images in thumbnails are now sorted by cell location

Items to Discuss

  • [TE] Newline support for Strings
    • We can use the Configuration Button to open a multi-line editor
    • Do we want an optional small multi-line editor in the ports window as well?
  • [TE] Unicode support
    • Derek@CSIR is interested in using unicode symbols in labels, module names, and code.
    • In a matplotlib source this works: xlabel(u'\u2211')
    • We have a branch where saving to unicode works
    • Full unicode support would require converting all strings to unicode u""-strings and making sure everything still works
    • Supporting Python 3 is another option, problem is many packages don't support Python 3 yet.
  • [TE] API tests fail on Linux, see here.

May 15, 2013

Updates

  • [DK,RR] Locator changes (Rémi also added unicode support)
  • [DK] Persistence fix
  • [TE] Demoed CrowdLabs at the UV-CDAT meeting
    • They want to create a web interface for UV-CDAT and may be interested in using CrowdLabs

Items to Discuss

  • Feature branches to be merged in master?
    • new-matplotlib-pkg (includes pkg-id-changes)
    • locator-filenames
    • spreadsheet-resizing, tempfiles, tests
  • [DK] Mac binary with scipy.weave causes doesn't have nice Mac binary update, mac_site.py embedded in site-packages.zip
    • look into easy_install issues, would help solve some of the missing package issues
  • Persistence bugfixes:
    • how to tell different versions (add a prompt to save a comment when a file changes)
  • [DK] Bundle updates:
    • Create DirectoryBundle, ZipBundle, DBBundle
    • Add manifest file to Directory/Zip bundles?
    • Serializer base class, register serializer with bundle based on serialization type and object(s) serialized
      • For example, a LogDBSerializer loads/saves the log to the database and would be called for DBBundle
      • Packages subclass from Serializer and define how they read/write data (given base properties)
    • XML Fragments (appending data)?
    • Support different types of bundles (e.g. register a Local/PersistentFileSerializer to add files used in a workflow to make a self-contained bundle).
    • crowdLabs support?
    • checkboxes in the UI so that things can scale (allow users to select what is exported)
  • [TE] NumSciPy and "vistrails." import issues
    • Update NumSciPy?
    • create run.py at top level to get rid of old import path?
    • How good is the support for old packages?
    • what about compute for PythonSource?
      • could wrap __import__
      • use upgrades
      • add prompt and warning before doing these upgrades
  • matplotlib date plotting not working, shows number
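
The serializer registration described in the "Bundle updates" item above could be sketched as follows. The `Serializer` and `Bundle` names come from the notes, but every method, the `JSONLogSerializer` class, and the dictionary layout are assumptions made for illustration, not the actual design.

```python
import json


class Serializer:
    """Base class; subclasses define how one kind of object is written and read."""

    def dump(self, obj):
        raise NotImplementedError

    def load(self, data):
        raise NotImplementedError


class JSONLogSerializer(Serializer):
    """Toy serializer that stores a log (a list of strings) as JSON text."""

    def dump(self, obj):
        return json.dumps(obj)

    def load(self, data):
        return json.loads(data)


class Bundle:
    def __init__(self):
        self.objects = {}        # object type -> object held by the bundle
        self._serializers = {}   # (serialization type, object type) -> Serializer

    def register_serializer(self, serialization_type, object_type, serializer):
        self._serializers[(serialization_type, object_type)] = serializer

    def dump(self, serialization_type):
        """Serialize each object that has a serializer for this type."""
        out = {}
        for object_type, obj in self.objects.items():
            serializer = self._serializers.get((serialization_type, object_type))
            if serializer is not None:
                out[object_type] = serializer.dump(obj)
        return out
```

Keying the registry on both the serialization type and the object type is what would let a package add its own serializer for a new kind of object without touching the bundle class.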

May 9, 2013

Updates

  • [TE] Mac binary: added support for scipy.weave
  • [DK] Added better support for untitled locators
  • [DK] More API Changes
  • [DK] controller's current_pipeline_view is actually the pipeline view now (not the scene)

Items to Discuss

  • [TE] Canceling VisTrails execution [DEMO]
    • Implemented progress dialog similar to parameter exploration
    • Shows progress by counting the modules
    • Shows the executing module name
    • Cancel works by catching pipeline scene update signals
    • Users can enable module to be canceled by implementing progress reporting
  • [DK] Threading for pipeline execution?
    • Issue with Qt windows on different threads?
    • In the middle solution where users could choose to run non-GUI workflows on a separate thread?
  • [TE] UVCDAT Web meeting Monday
    • Present crowdlabs/webgl
  • [RR] Spreadsheet ideas from discussions with Dean
    • Operations on multiple cells
    • Select multiple cells and then perform same operation on all of them
    • Selecting L-shaped set of cells (textual entry A1:A3, D1:D2)
    • Formula-type bar in spreadsheet (like Excel)
    • Provenance issues here
    • Synchronization between cells (look at common properties)
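
The textual entry for selecting cells mentioned above (A1:A3, D1:D2) could be parsed along these lines. This is a sketch only; `parse_selection` is a hypothetical helper and the 1-based (column, row) tuples are an arbitrary choice.

```python
import re

_CELL_RE = re.compile(r"^([A-Z]+)([1-9][0-9]*)$")


def _parse_cell(text):
    """Turn a reference such as 'D2' into 1-based (column, row) numbers."""
    match = _CELL_RE.match(text.strip().upper())
    if match is None:
        raise ValueError("bad cell reference: %r" % text)
    letters, row = match.groups()
    col = 0
    for ch in letters:
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return col, int(row)


def parse_selection(text):
    """Expand a comma-separated list of cells and ranges into a set of cells."""
    cells = set()
    for part in text.split(","):
        start, _, end = part.partition(":")
        c1, r1 = _parse_cell(start)
        c2, r2 = _parse_cell(end or start)
        for col in range(min(c1, c2), max(c1, c2) + 1):
            for row in range(min(r1, r2), max(r1, r2) + 1):
                cells.add((col, row))
    return cells
```

An operation applied to multiple cells would then simply iterate over the returned set.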

May 2, 2013

Updates

  • [DK] Identifier switch
  • [DK] matplotlib migration
  • [TE] VisTrails 2.0.2 released - 100 downloads already
  • [RR] has new utilities for constructing tests for packages

Items to Discuss

  • [TE] VisTrails 2.1 beta ready to go
    • would like to include the changes on pkg-id-changes and new-matplotlib-pkg
    • should put the 'old_identifiers' in the schema
    • add tests for new features
    • what about reading 2.1 vistrails in 2.0?
    • what happens when you load a version created using a future version of the package in a older version of VisTrails?
  • [DK] core/gui coding: want to keep most logic in core
    • want to be able to use packages like job monitor, etc. from core
  • [DK] api/application logic
    • currently have been moving logic from vistrails_window into api/application, does this make sense?
    • should application call api or vice versa?
  • [FC] Trying to get Win8 installed
    • Sent an email to Anup about this
    • Also OSX 10.8?

April 24, 2013

Updates

  • [RR] New version of 'parallelflow' module (IPython)
    • Now supports remote engines through ipython profiles
    • No ssh support
    • We could look into PBS support
  • [TE] 2.0.2 binaries ready
  • [DK] API

Items to Discuss

  • [TE] Support more build machines?
    • OSX 10.6? Emanuele might be able to build using her machine
    • OSX 10.7? The build machine version (The current build is falsely named 10.6)
    • OSX 10.8? Need a new build machine
      • Need to ask Emanuele how other versions can be built on the build machine.
    • Windows 8? May not need a separate binary but should be good to test on
      • Ask Fernando to set up a virtual machine on the build machine
  • [TE] Changes in 2.1
    • Renamed vistrails.py to run.py to avoid conflicts
      • Need to make sure documentation is updated
    • Renamed the "packages" directory containing contributions to "contrib" to avoid confusion
  • [DK] API/App Changes
    • Dave will commit changes that make the api clearer
  • [TE] I suggested creating a new devel branch and a new workflow for committing changes. I will look into this some more.

April 17, 2013

Updates

  • [TE] Done with checking schema conversion and creating tests
  • [TE] Started generating new v2.0 binaries
  • [RR] DAT: data provenance viewer implemented

Items to Discuss

  • [JF] The LaTeX extension issues
    • uses the weather example
    • [ES] related to CLTools package?
  • [TE] No known v2.0 bugs that need to be fixed
  • [TE] What is left to do for v2.1beta release?
    • there are some bugs
  • [DK] need to look at merging the new matplotlib package
    • need to add MplScript module
  • cannot run parallelflow on multiple machines?
    • we should set features before releasing the v2.0 version
  • have root-level API?
    • move the logic to core/gui as needed, and have the root-level import depending on mode
    • need to separate concerns as necessary
  • try to make the switch to the org.vistrails prefix
    • keep the old identifiers and have both point to the same package?
  • three types of subworkflow-like pieces: mashups, subworkflows, DAT
    • maybe more implementation-level differences
    • can we combine the infrastructure for these?
    • should we try to use a similar API?
    • for visual compression, more of a conceptual change rather than create new group/ungroup each time
  • synchronizing aliases and vistrail variables
    • really want to have some unifying solution here
  • including GDAL, etc. in binaries?

April 10, 2013

Updates

  • [DK] Fixed the matplotlib cell resizing issue in current and new mpl packages.

Items to Discuss

  • v2.0 Release:
    • [DK] matplotlib cell size issue should be fixed
    • Any reported bugs that are not fixed?
    • add v1.0.3 -> v1.0.2 translation in the v2.0 release
  • v2.1 Beta:
    • What are the new features we should include?
      • auto-connection
      • visual port feedback (i.e. parameter set)
      • different port shapes
      • vistrails.-prefix
      • new matplotlib package?
      • mashup integration?
      • parameter exploration serialization
      • new schema for port spec
      • parallel flow, multiprocessing work?
      • what else?
      • add matplotlib examples inside the examples directory
      • add tests for the new features
    • What needs to be checked before a beta is released?
      • want to make sure users that create vistrails in the beta can use them later
      • need to freeze the schema---are we happy with what is there?
      • need to check the schema v1.0.3 <-> v1.0.2 translations: are they robust?
        • [TE] I am checking by using the vistrail files on vis-7, but the update_db script needs to be updated.
        • v2.0 should support importing the v1.0.3 schema
      • add tests for things like schema
    • matplotlib package
      • have Sunitha add some documentation about how to use the mpl package
      • still need legend updates, patch support, colorbars, script support
    • DAT VTK issue with flickering

April 3, 2013

Updates

  • [BB] Working on UV-CDAT bugs, release mid-April
  • [RR] Working on the VTKCell Overlay

Items to Discuss

  • [DK] Better DOT_VISTRAILS management. Why not have "modes" which correspond to certain package sets?
    • USGS users have some requirements here (sometimes run with SAHM, other times without)
    • It's really annoying to have to update the enabled packages each time
    • That way, if I am interested in running without the spreadsheet, etc., I can just run in a given mode and when I want the spreadsheet and VTK, etc. back, I just switch back
    • basically, just segment out the packages part of startup.xml and make multiples of these
    • also add command-line flag
    • [TE] maybe only specify whether some packages are enabled or disabled?
    • [TE] maybe just point to a different user-packages directory
    • what about just making command-line flags to enable/disable packages
  • [DK] Latest VTK version
    • Has anyone tested how the latest version of VTK runs with VisTrails?
    • Huy ran into an error about "no concrete implementation..."
    • [TE] May just need to ignore the class causing this issue
    • [ES] Seems like the 5.10 master also had this issue before the release but perhaps they removed the features
  • [DK] Installing packages/python dependencies?
    • [ES] can install distutils into the Mac binary?
    • Windows has a full python install so this isn't as much of a problem.
    • Just have a menu-item in VisTrails that sets the right configuration variables for the installer (point to the setup.py that the user wants to install)
    • Want to install to a VisTrails library directory that lives outside of the binary.
    • How to deal with python dependencies for compiled pieces?
  • [TE] Ryan Danks on vistrails-users: I've been trying to get Vistrails to work in parallel both by using the server instructions found in the documentation as well as by using the standard Python multiprocessing.Pool objects within my module. Neither of these approaches seem to work. Has anybody successfully run their Vistrails code in parallel? If so, how?
    • Tommy to check with Fernando on whether ParallelFlow satisfies these requirements
    • Is ParallelFlow ready for release?

March 27, 2013

Updates

  • [BB] Online/Incremental layout
    • Added option to prevent any gaps in layers between connected modules. This in addition to preserving horizontal order seems to produce nice layouts from version to version.
  • [RR] working on DAT: multiple variables per port working
    • Also added variable number of inputs on List module
    • Also did some changes on the controlflow package
    • Also made some changes to the VT GUI as suggested by Juliana last week to the PythonSource and List modules
    • Also updated DAT to use matplotlib plots and add conversion typecasts if variable types don't match

Items to Discuss

  • [JF] Update to package API documentation
    • Need to write documentation with this in mind
  • [DK] Colin (USGS) asked about installing packages into VisTrails's python distribution. What is the status of the easy_install work; are there other solutions?
    • need to talk to Emanuele/Tommy about this
  • [DK] UV-CDAT Scripting. Is there anyone who can take this on? Charles is planning a meeting for early April to show what he would like to see for this.
    • Part of this is just designing meaningful modules (the wrapping side of things)
    • The other side is what can be done to make the scripting interface more like a normal python script
    • Currently, the scripting interface just builds a workflow. This is somewhat unintuitive if you expect that things act like a normal python shell...
    • Do we support partial execution as modules are added?
    • What to do if a call is not wrapped?
    • Perhaps a good starting point is the VTK example scripts. I converted these to workflows by tracking all the calls. Why don't we see if the scripting interface can convert these on-the-fly?
    • Also, I want to keep in mind numerical ops and numpy. Typing expressions is often much easier but we don't have a great way to represent this in a workflow. What should be done?
    • python has a built-in abstract syntax tree (ast) for itself that is user-facing. Could we use this to better translate scripts to workflows?
  • [BB] Add auto layout to api calls?
    • Possible flag for this so that the user can choose whether things are laid out or not
    • Also try to use this for the auto-snap feature in the GUI, BUT need to be careful to animate this as users will not be able to follow what is going on otherwise.
    • the layout code now runs without any GUI dependencies but can still use QFontMetrics (like for the Cmd+L call)
  • VirtualBox issues on the testing machine? Emanuele is trying to fix, may need to contact Fernando
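
The suggestion above to use Python's built-in `ast` module for translating scripts to workflows can be sketched for the simplest case. This toy only understands top-level lines of the form `x = f(a, b)`; `script_to_workflow` and the (modules, connections) result are hypothetical and are not part of the VisTrails scripting interface.

```python
import ast


def _dotted_name(node):
    """Render a call target such as vtk.vtkReader as a dotted string."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return _dotted_name(node.value) + "." + node.attr
    return "<expression>"


def script_to_workflow(source):
    """Return (modules, connections) for a script of simple `x = f(a, b)` lines."""
    modules = []        # called function name for each module, in script order
    connections = []    # (producer module index, consumer module index)
    producer = {}       # variable name -> index of the module that assigned it
    for node in ast.parse(source).body:
        if not (isinstance(node, ast.Assign) and isinstance(node.value, ast.Call)):
            continue  # anything else would need its own handling
        call = node.value
        index = len(modules)
        modules.append(_dotted_name(call.func))
        for arg in call.args:
            # A variable produced by an earlier call becomes a connection.
            if isinstance(arg, ast.Name) and arg.id in producer:
                connections.append((producer[arg.id], index))
        for target in node.targets:
            if isinstance(target, ast.Name):
                producer[target.id] = index
    return modules, connections
```

Numerical expressions such as `a + 2 * b` are skipped by this sketch, which reflects the open question in the notes about how to represent them in a workflow.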

March 20, 2013

Updates

  • [TE] Save Camera (and similar) now works for unsaved and "saved as" vistrails using the new ensureController method
  • [TE] DB Mashups merged into master
    • working to upgrade the older mashups
  • [TE] Added JobSubmission entry to FAQ
  • [ES] Planning to have the binary ready for testing on Friday
  • [DK] Work on matplotlib
  • [DK] Help answer Charles's UV-CDAT questions

Items to Discuss

  • [RR] Control flow modules:
    • Lists vs multiple connections
    • Using 'self' ports on ungrouped modules
    • allow each port to take a list or a single value, a la Taverna
    • try to write a new module to dynamically create a set of connections from a list
    • do this in the interpreter? (need to modify the executing pipeline, not the specification)
  • [RR] Issue with "self" port connected (emailed Fernando)
  • [DK] Similarly, should we try to add some method for ordering connections?
    • Could display the order using a number overlay when a user moves the mouse over the port
    • Then, selecting connection, could move up/down like illustrator...
    • Probably would require an additional attribute to be stored in the workflow
    • Rémi suggests mirroring Tuple here, (set number and type)
  • [DK] Dynamic ports behind the scenes?
    • Problem: Would like to have both stronger container types (e.g. List<String>) as well as flexible types (e.g. Float|List)
    • Idea: Don't change anything about the current registry, but allow this at the interface layer and dynamically add the port to the module/registry when needed?
    • For flexible types, could create myportFloat or myportList ports depending on the connection and then map these back to the same port before compute call...
    • For container types, could only add a port type to the registry when it is actually used so we don't have to materialize everything beforehand.
  • [JF] PythonSource/Tuple port types
    • cannot type to search for port types -- have to type fast
    • use a completer in the text field for this?
    • drag and drop from module list?
  • [BB] Online layout and gui-free layout
    • Am close to a working solution that just uses the old layout but optionally preserves order of specified modules
    • Having an issue with saving the positions, and strange issue with newly dragged modules vs modules loaded from file
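
The "dynamic ports behind the scenes" idea above, where a flexible port such as Float|List is backed by hidden myportFloat or myportList ports that are mapped back before the compute call, could be sketched like this. Both helper functions and the `flexible_ports` dictionary are hypothetical; nothing here reflects the actual registry.

```python
def concrete_port_name(port, type_name, flexible_ports):
    """Name of the hidden port created when `type_name` is connected to `port`."""
    if type_name not in flexible_ports[port]:
        raise TypeError("%s does not accept %s" % (port, type_name))
    return port + type_name


def collapse_inputs(inputs, flexible_ports):
    """Map hidden ports such as myportFloat back to myport before compute()."""
    reverse = {
        port + type_name: port
        for port, type_names in flexible_ports.items()
        for type_name in type_names
    }
    return {reverse.get(name, name): value for name, value in inputs.items()}
```

With this split, the registry would only ever see ordinary single-type ports, and the flexible type would exist at the interface layer alone.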

March 13, 2013

Updates

  • Juliana's visit to Brookhaven National Labs
    • interested in climate data analysis on observational data (also light source data)
    • need to have execution on parallel machines, not able to find batchq package
    • batchQ seems to be pure python and should not be hard to install, need to specify which version to install, add to FAQ
    • UV-CDAT Sandy demo: Sandy YouTube video
  • [TE] Saving mashups to DB works, but there is still a bug with tagging a mashup, the interface is redrawn while tagging and a new mashup version is created.
  • [TE] Working on getting spreadsheet "to version" method to work with unsaved vistrails
    • Added an "ensureController" method for loading a view from a controller object
    • Getting the spreadsheet to work with controllers is difficult because it uses the locator everywhere and still needs the locator for things like saving the spreadsheet.
  • [DK] More progress on matplotlib updates
  • [RR] Fixed the bugs on DAT, trying to bring in more examples from VTK

Items to Discuss

  • It seems the ALPS package is not included in the Linux binary distribution. Is this a bug or is it omitted by design?
    • [DK] Is it included in other distributions? I thought ALPS provided its own installers that update VisTrails with the necessary support for ALPS.
    • Ask Emanuele and/or Matthias about this to see what the current setup is
  • We used to have a scripting window in VisTrails and also in an early version of UV-CDAT. What is the status of the support for shell-like commands in VisTrails?
    • [DK] core.api has some support for commands like this. See the FAQ
  • [DK] How to best deal with importing Modules from other packages? For example, if I want to subclass a Module in matplotlib, I can use the registry (ugly, see FAQ), or I could try to import directly from vistrails.packages (e.g. from vistrails.packages.pylab import MplFigureManager). While the second method is nicer, it is also not very robust because:
  1. it does not enforce package dependencies. I can import from pylab even if that package has not been imported.
  2. It does not work great for a package that might be in packages or $DOTVISTRAILS/userpackages depending on how it was installed (e.g. ALPS). A developer would have to check both places.
    • One option is to make any import of vistrails.packages use the registry method but hide it from the user. Not sure how this works for non-modules, however.
    • Another option is to have vistrails.packages import all of $DOTVISTRAILS/userpackages which means a developer has to use the "from vistrails.packages import" syntax to access items.
    • Other ideas?
    • Just limit the way an import occurs (only the "from vistrails.packages import <package_name>" syntax)
  • [TE] Should we add BatchQ to VisTrails?
    • Users are having problems because only the stable branch of BatchQ works with JobSubmission
    • Could we use git submodule?
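
For the package import item above, the option of routing vistrails.packages through the registry while hiding it from the developer might look roughly like the sketch below. PackageNamespace and the plain dict standing in for the registry are hypothetical names for illustration, not existing VisTrails APIs.

```python
class PackageNamespace(object):
    """Hypothetical stand-in for vistrails.packages: resolves attribute
    access through a registry instead of the file system."""

    def __init__(self, registry):
        # registry: dict mapping package name -> loaded package object.
        self._registry = registry

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._registry[name]
        except KeyError:
            raise ImportError("package %r is not enabled" % name)


# Assumed registry contents, for illustration only.
registry = {"pylab": object()}
packages = PackageNamespace(registry)

pylab = packages.pylab      # works: the package is registered
try:
    packages.alps           # fails: not registered, wherever it is installed
except ImportError as exc:
    error = str(exc)
```

Because the lookup goes through the registry, it fails for packages that are not enabled and does not care whether a package lives in packages or in $DOTVISTRAILS/userpackages. It does not answer the open question about non-module items.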

March 06, 2013

Updates

  • [RR] DAT can now execute operations on variables
  • [ES] Build Machine: Windows 32-bit setup is ready, but the 3D Texture Mapping doesn't work on the VM. I will test if the problem will be present in the binary.
  • [TE] Saving mashups to DB now works
    • Will add more tests and then merge into master
    • After that update crowdlabs to use it

Items to Discuss

  • [ES] Build machine:
    • Do we have to include the patched version of SUDS in the binaries? (I noticed the example that required it was removed from the repository)
      • It was decided to keep it
    • Do we have to support old packages (for example, NumSciPy)?
      • It was decided to use the prebuilt binaries for now
  • [TE] Adding locators to unsaved vistrails
    • This will fix view switching issues with spreadsheet and vtk
    • We should be able to add a MemoryLocator class to unsaved vistrails
    • It was decided to store a translation from controller/view to locator
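
A minimal sketch of the controller-to-locator translation decided above, assuming a MemoryLocator that only carries a unique placeholder name. Both classes and all method names here are hypothetical; the actual locator interface in VisTrails is not shown in these notes.

```python
import itertools


class MemoryLocator(object):
    """Hypothetical locator for a vistrail that has not been saved yet."""
    _counter = itertools.count(1)

    def __init__(self):
        # A process-unique name so views can tell unsaved vistrails apart.
        self.name = "untitled-%d" % next(MemoryLocator._counter)


class LocatorTable(object):
    """Keeps the controller -> locator translation in one place."""

    def __init__(self):
        self._by_controller = {}

    def locator_for(self, controller):
        # Key on id() so controllers need not be hashable. A real
        # implementation would also drop entries when a controller closes.
        key = id(controller)
        if key not in self._by_controller:
            self._by_controller[key] = MemoryLocator()
        return self._by_controller[key]

    def set_locator(self, controller, locator):
        # Called once the vistrail is saved and gets a real locator.
        self._by_controller[id(controller)] = locator
```

With a table like this, the spreadsheet and vtk views could ask for a locator uniformly, whether or not the vistrail has been saved.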

Feb 27, 2013

Updates

  • [TE] Working on mashup bug fixes and DB support

Items to Discuss

  • [TE] Animation support for Matplotlib cells?
  • [TE] Will look at adding locator for unsaved vistrails

Feb 20, 2013

Updates

  • [TE] Added PBS support to BatchQ master branch
    • Troels working on a few performance items
  • [TE] Also working on adding mashup
  • [BB] Looked into some existing graph libraries for layout (NetworkX, igraph, boost graph tools, ...)
  • [DK] Checked in WIP on the new matplotlib package (include README about how things are generated)
    • [DK] Need to clean up the specs for this package
    • [DK] Figure out what the new student can help with
  • [RR] Determining how to type data

Items to Discuss

  • need some type of dataset standard for use with the DAT
  • what about list of (floats, strings, any type of VisTrails object)
  • require some additional wrapping for the DAT right now
  • also think about numpy/scipy operations

Feb 13, 2013

Updates

  • [TE] Job Monitor Improvements (New DEMO?)
    • need some documentation
    • Troels is working on some features for multiple jobs (100+) because each job creates a new connection
    • also going to change a bunch in the implementation
    • some projects need PBS submission capabilities before they can use VisTrails
  • [DK] matplotlib work improving (a few things still missing but working, mostly automated)

Items to Discuss

  • [DK] vistrails prefix and numpy issue that Remi found
  • [DK] online pipeline layout code
  • [JF] shape for File module
  • [JF] Sunitha to work on examples for matplotlib

Feb 06, 2013

Updates

  • [FC] Working on the package for SciCumulus
    • Widget to query run-time provenance from the cloud (almost done)
    • Use ModuleSuspended in the execution (to be done)
  • [TE] BatchQ-PBS
    • Finished a Job Monitor Widget for VisTrails (DEMO?)
    • Need to add provenance gathering
  • [DK] Fixed minor issue with upgrades (USGS)
  • [DK] Working on matplotlib bindings

Items to Discuss

  • DAT project
    • Basic functionality is there, some small example packages
    • How do we handle multiple projects?
    • How do we support multiple 'layered' plots per cell? (UV-CDAT needs this)
    • [BB] Configuration widget
      • show ports pane for each module in plot subworkflow
    • [CS] Automatic labeling of the pipelines
    • [CS] Also try to auto-generate some of the code that is used to specify ports in the DAT
    • Talk to Jerry and Tom about what they like about UV-CDAT, what can be improved as feedback for the DAT (talk about refactor)

Jan 30, 2013

Updates

  • [DK] Fix for older packages to run using new vistrails. prefix
  • [DK] Finished dulwich translation of persistence package
  • [BB] working on DAT
  • [FC] working on integration between VisTrails and SciCumulus
  • [TE] BatchQ and PBS, module suspended state
  • [TE] way of monitoring jobs
  • [TE] mashups on crowdlabs in database, Emanuele will help on this

Items to Discuss

  • [DK] Need to update documentation for vistrails. prefix
    • check vtl files here
  • [DK] Ticket #649

Jan 23, 2013

Updates

  • DataONE Updates
    • obsoleted tags
  • [TE] First version of BatchQ-PBS ready. Waiting for feedback.
    • forked on github, no changes to the VisTrails package
  • [DK] Fixed Python dependency determination
  • [DK] Added a branch with UUIDs as identifiers
  • [DK] Working on using dulwich (git implementation in Python) for persistence package
  • [FC] Updates in PROV exporter
  • [FC] Start integration between SciCumulus and VisTrails
  • [RR] Multiple projects in DAT?
    • Some way to tie cells back to the project is needed
    • Global/local mismatch is an issue here (if I drag a variable from one project and combine it with a variable from another project, which project does the new vis/data belong to?)

Items to Discuss

  • [JF] Update on package David worked on for USGS (control/suspend execution, support execution on different computational environments) -- msg data 12/18/2012
    • methods to fork off processes and to condor
    • need to track the settings for condor/asynchronous runs in provenance
    • more general modules for this?
    • abstraction mechanism for subworkflows to split up workflows (mixed execution models)
    • better indication of when a module has been started but is not completed.
    • BatchQ shows "suspended" state for this, stops executing this branch.
  • [TE] Specifying python dependencies and py_import at the same time is confusing. We could probably call py_import from the package loader if the needed information was specified in the dependencies.
    • merge these two things for packages (otherwise have issues with specifying new dependencies and trying to import them)
  • [TE] Will the new import structure and UUIDs go into 2.1?
    • Synchronization issue
  • [DK] Nix ideas?
    • we should at least record some hash of the wrappers/packages in the provenance
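
As a rough illustration of recording a hash of a package in the provenance, the sketch below digests every file under a package directory. The function name package_digest and the choice of SHA-1 are assumptions made for this example; where the digest would be stored in the log is left open.

```python
import hashlib
import os


def package_digest(package_dir):
    """Return a SHA-1 hex digest over the relative paths and contents of
    every file under package_dir, visited in sorted order."""
    digest = hashlib.sha1()
    for root, dirs, files in os.walk(package_dir):
        dirs.sort()  # make the traversal order deterministic
        for filename in sorted(files):
            path = os.path.join(root, filename)
            # Use a platform-independent relative path in the hash.
            rel = os.path.relpath(path, package_dir).replace(os.sep, "/")
            digest.update(rel.encode("utf-8"))
            digest.update(b"\0")
            with open(path, "rb") as f:
                digest.update(f.read())
    return digest.hexdigest()
```

A real version would probably skip generated files such as .pyc, since those change without the package source changing.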

Jan 16, 2013

Updates

  • [TE] WebGL port issue: Wendel added a fallback mode (use pre-rendered data) and is now helping me get it to work using port 80 through Apache.
  • [DK] DataONE package
  • Rémi's progress on the DAT
    • working on VisTrails code to modify the spreadsheet
    • GUI is usable except that the interface doesn't execute VisTrails pipelines
    • Has a document available in Google Docs documenting progress
  • [BB] All critical open bugs for UVCDAT are complete, one minor bug and some enhancements left

Items to Discuss

  • WebGL on crowdLabs
    • try to send cached images through port 80 if necessary
    • need to inform users why an issue comes up (e.g. port is closed, using cached image)
  • Remove the sci mailing lists finally?
  • Building VisTrails on a NASA supercomputer?
  • Web interface aka ClimatePipes

Jan 9, 2013

Updates

  • Emanuele resumed setting up the build machine for Windows.
  • [TE] crowdlabs now uses rendered image in vistrail tab if available
  • [TE] Discussing WebGL port issue with Wendel
    • One possibility is to use port 80 on another server and use a proxy
  • [TE] Working with Troels on adding PBS support to BatchQ
  • [BB] 5 Bugs remaining with UVCDAT
  • [DK] DataONE work

Items to Discuss

  • Uploading Mashups to crowdLabs from VisTrails
    • For this to work, we need to complete relational support for Mashups (see ticket #611). The code is in the mashups branch and needs to be updated with the current master (it was branched from the old schema-v1_0_3 branch and has not been updated since then).
    • Need someone to work on this for Matthias
    • Not just relational support here, want to improve the schema
    • Could just dump XML as a blob into the database, but we're trying to make things work better
  • Cleaning up older files (see last week)
  • UV-CDAT Bug #69:
    • The variables are global; cdat has a global dictionary
    • thus closing a project has no effect on the dictionary
    • if you have two projects that reference the same variable name, you run into issues
    • discuss this with Charles
  • UV-CDAT undo button:
    • should be ok, just remember to update cell version
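
The variable-name collision described under UV-CDAT Bug #69 comes from keying a single global dictionary by name alone. One possible direction, sketched here with a hypothetical ProjectVariables class (not cdat or UV-CDAT code), is to key variables by project as well, so that closing a project removes only its own entries.

```python
class ProjectVariables(object):
    """Hypothetical per-project variable store keyed by (project, name)."""

    def __init__(self):
        self._vars = {}

    def set(self, project, name, value):
        self._vars[(project, name)] = value

    def get(self, project, name):
        return self._vars[(project, name)]

    def close_project(self, project):
        # Drop only the closing project's variables.
        for key in [k for k in self._vars if k[0] == project]:
            del self._vars[key]


store = ProjectVariables()
store.set("project_a", "tas", [1.0, 2.0])
store.set("project_b", "tas", [9.0])   # same name, no clash
store.close_project("project_a")       # project_b's "tas" survives
```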

Jan 2, 2013

Updates

Items to Discuss

  • Rémi went through the code to find modules that are no longer used
    • core.modules.module_configure and related have been moved to gui, can we remove the core redirects? Problem is that this could break older packages
      • need to update documentation and add to FAQ
    • should we continue to maintain third-party packages (those in /packages not /vistrails/packages) or should these be moved?
      • do we distribute the /packages directory with the source tarball?
      • we don't want to be responsible for keeping these packages up-to-date, but at the same time, we shouldn't distribute packages that do not work with our releases...
  • crowdLabs issues?
    • Trying to make ALPS server work with VisTrails
    • Thumbnails: make these higher resolution
    • Issue with ports and WebGL?
      • need an error message if this is a port issue
      • should be solutions/workarounds to use "standard" ports with the WebGL data

Older meetings