Difference between revisions of "Development"

From VistrailsWiki

Revision as of 15:35, 13 August 2014

2014

August 13, 2014

Updates

  • [TE] Parameter Exploration bugfixes
    • "Create Version" spreadsheet button fixed
    • Missing date attribute in xml caused errors
    • Issue opening vtl links

Items to Discuss

  • [RR] Pending branches:
    • pipeline-validation (#886)
      • fix-open-workflow (loading a workflow keeps the locator, so saving will overwrite)
    • logger-propagation (#892)
    • hide-missing-dependencies (#895)
  • [TE] New bugfix release? - Parameter Explorations and wrong usersguide in Mac binary
  • [TE] FileOpen event on Mac (#889)
    • We can pass argv to QApplication
  • [TE] 1.0.3 schema differences (#894)
    • We can restore schema 1.0.3 on master from branch v1.2

August 6, 2014

Updates

  • [DK] bug fixes, improve-vt-bundle, uvcdat

Items to Discuss

  • [LB] LSU, instrument building, tomography, interferometry system
  • [TC] image registration/processing, infrastructure at BNL
  • [ED] working on data analysis at BNL, hoping to leverage VT for data provenance, user activity
  • [LB] tomopy and VT integration
  • [TC] how to integrate tomography software
  • [LB] goal to communicate with others, data security for crowdLabs?
  • [LB] improve parameter exploration
    • [JF] different ways: whole workflows, loops inside workflows, multiple threads, implicit looping
    • [TE] add new examples of saved parameter explorations, different versions
  • [RR] Bug #886 ("from_root" parameter for change_selected_version())
  • [RR] API redesign, example: notebook, target: notebook
    • [DK] looks good, nicer string rep, improve access to and use of results is important
  • Jeffrey Guenther - list handling

July 30, 2014

Updates

  • [DK] UV-CDAT animation, sql-alchemy branch update

Items to Discuss

  • [TE] parameters in workflow - demo
    • show functions on mouse_over
    • move set functions to the top
    • inline parameter widgets
      • show/hide all
      • toggle in ports list to show/hide these widgets
  • API design wiki page
    • vistrails.api versus vistrails.core.api
    • API is currently stateful, would like to change
    • API calls using module names, parameter names instead of ids
    • tighter integration with IPython (what were the ideas from last year's meeting with the IPython devs?)
  • argparse mutually_exclusive_group bug
  • provenance viewer: make panel wider
  • parameter views: combo box with different parameter viewing types
  • change parameter not updating GUI
  • installing python packages into binary
  • modules that dynamically configure their ports

July 23, 2014

Updates

  • [DK] output-modules merge, uvcdat, improve-vt-bundle

Items to Discuss

  • API design wiki page
  • [TE] output-modules feedback
    • Can we have 'Output' module that uses input value to determine output?
      • [DK] Unsure what this means, we could have more than one input port, seems like the mode of output is independent of what the value is
      • [RR] Inferring from Python types would take a lot of work; explicit is easier to understand
      • Usability vs. issues for developers
    • Hide all but value port for outputmodules
      • Unsure here
    • Make output mode be an enumeration rather than text entry (actually need to allow own entry)
    • Should vtkRendererOutput replace VTKCell?
      • [DK] Need to see if display views are involved, interaction handler, etc.
    • Use module name as base name?
      • Also naming for parameters on command-line here
      • if you name module "flower", default base name is "flower"
    • Also consider possibly only generating one output based on this type of naming
    • Global imageFile format setting not working?
      • Put on trac
    • Created WebGL output for vtkRenderer (As empty package)
    • Matplotlib WebAgg and MBAgg backends better for web
    • Clash between mode identifiers
      • Register this on trac
    • [RR] how to load a specific mode if it isn't loaded? How do we find it?
  • [TE] other ideas
    • See other workflow systems: http://imgur.com/a/xY37B
    • Add icons to modules (graphical representation)
    • Add module by double-clicking in the pipeline view and writing module name
    • Edit parameters directly on the module
      • First step: move set parameters to top of list (or make more visible)
      • Second step: mine to find commonly used parameters
    • Better use of hovering to provide module information
      • Show parameters on right
    • [JF] Inspect workflows in the same window
  • [DK] matplotlib feedback from Thomas Caswell
    • figure top-down?
    • see Tom's github fork of matplotlib of pyplot changes
    • declarative syntax (MEP 25), goal for 1.5
    • create artists in modules
  • [RR] export-as-script
    • It's mainly about renaming variables, to connect modules and avoid collisions
    • Currently gets code for script using package-provided to_python_script methods
    • Using the redbaron library for this (under early development)
    • Need to handle the different types of input corresponding to different calls (get_input, get_input_list, force_get_input, ...)
  • [BNL] Adding new palette tool
    • search box with many fields to format query, send back dictionaries
  • [BNL] Use numpy docstring format? yes
  • [BNL] PEP whitespace? yes
  • [BNL] python3 compatibility? not a big priority right now
  • [BNL] PyQt new style connections: yes
  • [BNL] Access to vistrails.org repository? yes

July 16, 2014

Updates

  • [TE] VisTrails 2.1.3 released
  • [DK] UV-CDAT

Items to Discuss

  • Integrate patches from github
    • we push from vistrails.org to github
  1. git fetch git@github.com:VisTrails/VisTrails.git +refs/pull/17/head:refs/heads/pull17
  2. git checkout master
  3. git merge pull17
  4. git push origin master
    • Synchrotron needs tools for computational pipelines
  • Discussion on scripting support
    • we are going to have to support a set of restricted operations
    • readability versus parsing
    • blocks vs. functions
  • layered API:
    • scripting api is base, interactive api has convenience methods that use state
    • design document on API, wiki page?
  • Other requirements for scripting
    • higher-level operations to find results linked to certain provenance
  • vt bundles
    • should we look into improve-vt-bundle and sql-alchemy branches?
    • design this (configurable bundles)
    • have some hooks to add files to the vt bundle already
  • move-dist-directory branch
    • conflicts with python build directories
    • does this affect build machines?

July 11, 2014

Updates

Items to Discuss

  • Claudio wants to make it easy to edit PythonSource without VisTrails
    • [RR] Dropping url-encoding would allow you to edit them from XML? But they are also quoted because used in <parameter val="..."/>
  • Scripting support
    • Export workflow as standalone script:
      • Modules could provide their own logic, or we can make something up from compute() ast in some cases
      • Also, add annotations to script to make it easy to reconstruct (and module IDs, so we can update)
    • Read back script as workflow
      • General case is hard
      • Building PythonSources from blocks might be doable
      • First step: simply reading some parameters from some modules in annotated script (i.e. has module ids)
  • IPython notebook integration
    • Wait on output-modules branch; then we can show visualizations in notebook
  • API needs work
    • API calls need to return things (to put them in notebook)
    • Make initialization easier (shouldn't need to create application, etc)
    • Currently stateful! (app.new_vistrail(), ..., api.save_vistrails(f) instead of vt=Vistrail(), ..., vt.save(f))
    • Should probably be in vistrails package directly
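
The stateful-versus-object contrast noted above (app.new_vistrail() ... api.save_vistrails(f) versus vt=Vistrail() ... vt.save(f)) can be sketched in a few lines. This is a toy illustration only: the `Vistrail` class, `add_module`, and `save` below are hypothetical stand-ins written for these notes, not the real VisTrails API, and the JSON file format is an assumption made so the example runs on its own.

```python
import json
import os
import tempfile


class Vistrail(object):
    """Hypothetical stand-in for a vistrail handle (not the real VisTrails class)."""

    def __init__(self):
        self.modules = []

    def add_module(self, name, **parameters):
        # Refer to modules and parameters by name rather than by id.
        self.modules.append({"name": name, "parameters": parameters})
        # Returning a value lets a notebook display the result of the call.
        return self.modules[-1]

    def save(self, filename):
        # All state lives on the object, so no global "current vistrail" is needed.
        with open(filename, "w") as f:
            json.dump(self.modules, f)
        return filename


# Two independent vistrails can coexist because nothing is stored globally.
vt_a = Vistrail()
vt_b = Vistrail()
vt_a.add_module("String", value="hello")

path = os.path.join(tempfile.mkdtemp(), "example.json")
vt_a.save(path)
```

Because every call returns something and no application object has to be created first, a design along these lines would also address the other two API points raised in the same list.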

July 9, 2014

Updates

  • [TE] Added logging of parameter explorations

Items to Discuss

  • [TE] Crash when adding String module on Mac #760
    • Updated binary to Qt 4.8.6
    • Release 2.1 with Qt 4.8.6 to fix String module crash
  • [TE] Table cell - Use HTML cell and D3?
    • Interactivity for free
    • Can it work over multiple cells?
    • Will work directly on web and crowdlabs
  • [Claudio] Scripting support
    • VisTrails is general but does not interact well with other systems
    • What is not working, how can we make it easier?
    • Need ability to work without the GUI
      • Construct workflows using utility functions that do not use the VisTrails GUI
    • Convert python script to/from vistrails workflow
      • Show PythonSource contents in script
      • Start with workflows that do not show anything on the spreadsheet
    • TODO:
      • Design a better mechanism to integrate VisTrails with other tools, starting with Python
        • Provide the ability to save a workflow in a format that can be edited using a text editor -- for example, save a workflow and use an editor to change the contents of the Python Source boxes in the workflow
      • Provide functionality similar to what was implemented in UV-CDAT, where workflows can be used as black-boxes from Python. See slide 41 in http://uvcdat.llnl.gov/media/pdf/VisTrails_UV-CDAT_course.pdf
      • For which applications would scripting support be useful?
      • Look at max and grasshopper3d
      • Claudio wants to test how parallelization works in VisTrails
      • How would scripting support work?
      • Meet on Friday

July 2, 2014

Updates

  • [TE] List handling improvements done
    • Now List ports can connect to anything
    • Only exception is List<->Variant

Items to Discuss

  • [DK] List/Variant bug: #871
    • [TE] Fixed
    • [TE] Why is Map output a Variant?
    • looks like this was done because Fold produces a Variant (makes sense since fold can be a sum), but Map should always be a list
    • this will allow depth 1 input ports to connect
  • [RR] added List.head as depth 1 port
    • add documentation for this new functionality
  • [DK] QPipelineScene.current_pipeline, controller refs #869
  • [DK] PythonSource crash? #760
    • update Qt for newer binaries?
    • pyside support?
    • Also, see #737
  • [TE] Log Parameter Exploration? #870
    • Very simple to enable, but we should annotate cell info
  • [DK] tabledata pandas support?
    • investigate on best way to make this available (data types, scripts, modules)?
  • [JF] SIGMOD: python is becoming lingua franca for data analysis, munging

June 25, 2014

Updates

  • [DK] master merged into output-modules

Items to Discuss

  • [DK] Global configuration settings for output-modules when using inheritance:
    • Example: Allow different file.suffix='.dat', imageFile.suffix='.png'?
    • For an individual output module, we only have one setting since imageFile derives from file, but how should this work if we have a global setting for moduleA=vtkRendererOutput and moduleB=FileOutput?
    • decided that overriding for specific subclasses makes sense
  • [TE] automatic-looping
    • get_input now merges all connections when depth>0
    • Typechecking now respects configuration flags - fix for float/int reverted
    • Issues with allowing List<->"Type of depth 1" conversions
      • Treat list as Variant of depth 1
      • List->Variant need to be treated differently otherwise it would loop over the variant
    • wouldn't be able to do implicit looping over List/Variant because we don't know the list depth of incoming list objects?
  • [RR] matplotlib issue: log doesn't reflect the actual execution
    • can we update logging provenance here?
    • other packages (VTK) have similar issues since the renderer actually causes some of the upstream internal modules to do computations.

June 18, 2014

Updates

  • [TE] Chris is using BatchQ

Items to Discuss

  • [TE] automatic-loops-changes
    • Mixing 0 and 1 depth types now works
    • Removed use of moduleInfo
    • transfer_attrs copies specs and control parameters
    • Use get_input_list to support multiple connections
  • [RR] matplotlib issues
    • pushed branch to create a function that does this
    • mpld3 -- do they use same matplotlib model where figure must exist?
  • [DK] list of depth 1, can we pass a list here
    • exception of List input port type?
    • will this work? allows only
  • [TE] numpy types in matplotlib?
    • e.g. different float types
    • workaround for numpy for List types already
    • what about adding other types later? how would a package developer do this?
    • Converter modules could work here but add more modules? hide the extra modules?
    • Unify type checking that occurs over connections (controlled by settings) and Loop/depth type checking method (now in vistrails.core.modules.vistrails_module)
  • [RR] depth 1 port going to List head port
  • [TE] variant for (float, integer) tuple
  • [RR] multiple connections into a depth-1 port only use one of the values
    • should we have a self.get_input_list(flatten=True) option?
  • [DK] output-modules: should work on so we can merge
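
One way to read the flatten=True idea raised above: when several connections feed a depth-1 port, each connection delivers a list, and flattening would concatenate them instead of using only one. The sketch below is standalone Python; `merge_connections` and its arguments are hypothetical names chosen for illustration and say nothing about how `get_input_list` is actually implemented.

```python
def merge_connections(connection_values, flatten=False):
    """Combine the values arriving on several connections to one port.

    connection_values has one entry per connection; for a depth-1
    port each entry is itself a list.
    """
    if not flatten:
        # One entry per connection, nested lists preserved.
        return list(connection_values)
    merged = []
    for value in connection_values:
        if isinstance(value, list):
            merged.extend(value)   # depth-1 value: splice its items in
        else:
            merged.append(value)   # depth-0 value: keep as a single item
    return merged


nested = merge_connections([[1, 2], [3], 4])
flat = merge_connections([[1, 2], [3], 4], flatten=True)
```

Here `nested` keeps one entry per connection, while `flat` is a single depth-1 list, which also covers the case of mixing depth-0 and depth-1 values on the same port.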

June 11, 2014

Updates

  • [TE] RemoteQ jobs can now skip upstream
  • [TE] Fixed zipfile bug corrupting vt files
  • [TE] Fixed ungrouping caused by
  • [TE] Google maps binary available
    • Updated for zipfile bug but not for ungrouping bug yet

Items to Discuss

  • [TE] File as parameter is confusing
    • signature changes with mtime
    • Make File non-constant and use Path instead
  • [TE] Talked with Chris at LSU
    • PBS Jobs 20 min to a few hours
    • VisTrails does most of what they need
      • Suggest we focus on bug-fixing and documentation
    • Want interactivity
      • Fetch and visualize files while PBS Job is running
    • May use Hadoop in the future

June 4, 2014

Updates

  • [TE] Added skip upstream for RemoteQ jobs
  • [DK] tabledata/gmaps updates

Items to Discuss

  • [TE] Test run.py #863
    • Changes in run.py are currently not detected by test suite
    • Other changes in configuration also don't get tested
    • What about importing unittest2 in 2.6?
  • [DK] userpackages/packages shadowing #864
    • add the directory
    • being able to see both the package that exists in userpackages and packages
    • checking loading/unloading
  • [DK] need to improve upgrade utility methods
  • [RR] Removing zip/unzip executables #862
    • check in the old commit logs and/or trac to figure out what the old problem was
  • [RR] Rolling out conda packages: anyone has experience with Anaconda?

May 28, 2014

Updates

  • [DK] rewrite-startup close to stable, fixed issue with going back and forth from v2.1 (trying to resolve package config settings issue), means output-modules is close, too

Items to Discuss

  • [DK] extra_info on modules: is it really module-specific info?
  • [DK] programmatic package configuration settings without direct imports?
  • [TE] job-cache
    • New control parameter for caching module outputs to disk
    • checks for Constant modules, so any Constant module can be cached
    • used for any module to save state of, not necessarily jobs
    • is_serializable flag?
    • will this be package-specific?
    • have an is_pickeable decorator and allow more complex is_serializable interface code
  • [TE] LSU feedback (Jinghua and Chris)
    • Have used PBS with VisTrails (by running commands over SSH)
    • Want to use distributed workflows with VisTrails (more than one machine). Could use IPython?
    • Using MOAB (meta-scheduler); a module would be useful
    • Interested in moving files back and forth
    • Have experience with SAGA (Could likely make use of it)
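
The job-cache items above (caching module outputs to disk, an is_serializable flag, a decorator to mark picklable modules) could look roughly like the following. This is a sketch under assumptions: the `picklable` decorator, the `DiskCache` class, and the signature string are all hypothetical and exist only in this example, and a real implementation would hook into the interpreter rather than wrap `compute` by hand.

```python
import hashlib
import os
import pickle
import tempfile


def picklable(cls):
    """Hypothetical class decorator marking a module's outputs as safe to pickle."""
    cls.is_serializable = True
    return cls


class DiskCache(object):
    """Store outputs on disk, keyed by a signature string."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, signature):
        digest = hashlib.sha1(signature.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".pkl")

    def get_or_compute(self, module, signature, compute):
        # Modules without the flag are always recomputed.
        if not getattr(module, "is_serializable", False):
            return compute()
        path = self._path(signature)
        if os.path.exists(path):
            with open(path, "rb") as f:
                return pickle.load(f)
        result = compute()
        with open(path, "wb") as f:
            pickle.dump(result, f)
        return result


@picklable
class Squares(object):
    calls = 0  # counts how often compute() really runs

    def compute(self):
        Squares.calls += 1
        return [i * i for i in range(5)]


cache = DiskCache(tempfile.mkdtemp())
module = Squares()
first = cache.get_or_compute(module, "Squares(n=5)", module.compute)
second = cache.get_or_compute(module, "Squares(n=5)", module.compute)
```

The second call is served from the pickle file, so `compute` runs only once. Keeping the flag on the class leaves room for the "more complex is_serializable interface" mentioned above, for example a method that inspects the actual output.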

May 21, 2014

Updates

  • [DK] output-modules, MplFigureToFile
  • [TE] VisTrails 2.1.2 Released (100 downloads so far)

Items to Discuss

  • [DK] persistence across machines (USGS use case)
  • [TE] Merge automatic-loops-streaming into master?
    • Implicit looping
    • Streaming
    • ControlParameters
    • New schema version 1.0.4
    • Possible conflict with multithreaded-interpreter?
  • control parameters can also be used for other things like caching?
  • [TE] job-cache
    • Make module jobs independent of workflow
    • Ability to cache jobs for all modules with constant outputs. Set in module/PythonSource/ControlParameter
    • Needs branches: controlflow-fake-signatures, jobs-use-signature_/connect-folded-module
  • [TE] [vistrails-users] Michele: VisTrails on remote machine
  • [TE] SAGA-Python: A Light-Weight Access Layer for Distributed Computing Infrastructure
    • Alternative to support distributed computing in VisTrails?
    • Support for most job schedulers
    • Could replace BatchQ/SSH/PBS/LSF/HTTP modules
    • No Hadoop support
    • Open Grid standard
    • Compatibility with VisTrails/ModuleSuspended?
  • [RR] Debian vistrails package

May 14, 2014

Updates

  • CrowdLabs mashups done
  • v2.1.2 release ready
  • [DK] work on rewrite-startup, make sure preferences work in new style; output-modules

Items to Discuss

  • vagrant for reproducibility
    • simpler interface for exploring the experiments
    • [RR] generating a VisTrails workflow from reprozip was planned, but not integration of the tracer in VisTrails, to trace VisTrails workflows -- it probably makes sense. See https://github.com/remram44/reprozip-ptrace/issues/8

May 7, 2014

Updates

  • CrowdLabs mashups
    • Uploading works
    • HTML looping works
    • Flash removed (actually hidden)
  • [DK] output-modules
    • Running in batch mode switches to file mode
    • Allows parameters to be specified on command-line
    • Work on GUI for parameters

Items to Discuss

  • VTK 6
  • [DK] output-modules: target single module parameters?
    • [RR] use labels to name specific modules, useful in command-line
  • scripting support

Apr 30, 2014

Updates

  • Does spreadsheet resizing work correctly? Is it too slow? (#833)
    • looks like on downsize the image is smaller than it should be and then flickers to fill space (upsize is ok)
  • [TE] Mashups on Crowdlabs
    • HTML mashups are now rendered in the crowdlabs template
    • Added embedding html mashups (using jsonp)
    • Fixed uploading mashups with a vistrail
    • Investigating issue with saving mashups to DB
  • [DK] rewrite-startup, output-modules
    • working on command-line and global parameters

Items to Discuss

  • 2.1.2 release
    • some major bugs a
  • VTK 6
  • iPython integration
    • how to migrate from scripts to workflows
  • Introducing vistrails as a normal python package
    • check on anaconda support for Qt, VTK support?
  • output-modules
    • make sure that filenames can be set
    • output directory
    • interpreter should return a dictionary with python objects produced (what is input to the output modules?)
    • unify OutputPort and OutputModule? Right now, act differently, but function in similar ways
  • directions: scripting support, large-scale data analysis, reproducibility

Apr 23, 2014

Updates

  • debugger
    • have to enable the configuration option, starts python debugger in console
    • have similar flag for test suite
  • Branches to be merged in: new-url-package
    • API calls don't go through upgrades currently, can we support this so we don't break scripts?
    • added HTTPS certificate verification
    • new dependencies for binaries?
  • dont-use-modules-as-data?
    • branch is ready?
    • changes a lot of packages,

Items to Discuss

  • output-specific modules (e.g. vtkInteractionHandler)
    • only need to be enabled when spreadsheet is active, but if we create a workflow with such a module, it would be nice to still be able to run that workflow in batch mode
    • not verify entire workflow?
    • load module logic on-demand so we don't need dependencies?
  • reload packages
    • interactions with spreadsheet, stale controller objects
    • spreadsheet may need to get information about pipeline when copying cells (get from controller?)
  • [TE] server mode
    • Changed to non-interactive
    • Pass debug messages back?
    • Auto-load packages security?
      • startup.xml file gets corrupted?
      • using same xml file for multiple instances
    • Locale errors
      • crowdlabs defaults to english, but vistrails server does not.
      • set LANGUAGE will fix it
      • should we set this environment variable in VisTrails
      • decided to choose English for now, can reevaluate when unicode branch is ready
    • Default to HTML5 interface?
      • Flash interface as backup
      • look at animations for mashups in HTML5
  • module descriptor weakrefs
  • VTK6
  • continue-dialog branch
    • changes SpreadsheetCell modules
    • merge into 2.1 for the LSU folks
  • output modules: where should the output module live? only one package?

Apr 16, 2014

Updates

  • Branches to be merged in: readfile-module, save-module-moves (#853)
  • CrowdLabs needs update to VisTrails 2.1 with schema 1.0.3
  • [DK] OutputModules (FileOutput)

Items to Discuss

  • LSU meetings
    • Pausing workflow/confirmation module (#854)
      • Checking an intermediate result before carrying on with the workflow
      • Their problems: Qt event loop, sinks ordering
    • CLTools problem
    • crowdLabs usage? (they have 100+GB files)
      • Parameter exploration
    • Running mathematica on a cluster (JobSubmission, RemoteQ?)
    • Exporting VTK visualization (vti/vtp?) for Kitware's KiwiViewer
    • Wrapping SNARK09 (C++ code) as VisTrails modules
  • eo4vistrails (Derek, Terence)
    • Add FTP module?

Apr 9, 2014

Updates

  • Branches to be merged in: export-cells, reset-cell-sizes-button (-> v2.1; icon?), export-versiontree-dot & custom-colors, input-module-no-subclass
  • [DK] UV-CDAT release, OutputModules (see below)

Items to Discuss

  • In-place spreadsheet updates: #847
    • need to be able to toggle replacement mode on and off
    • Colin has a working solution, is there a need to do this globally?
  • Autosave for existing vistrails not working #849
  • Other tickets from USGS
    • Reason for pipeline scroll area being 100 times greater than pipeline?
    • File Selection Dialog #846
    • Copy/paste with retained connections #851
    • Auto-connect ports when moving modules #852
      • Also think about fixing some of the port-snapping for selecting a port on the same module when there are multiples of the same type
  • Unicode support
    • question on mailing list
    • [RR] dates are something else, should be able to put on v2.1
    • fix for LC on master, port to v2.1?
    • should try to persist dates in numeric formats, but GUI level can be language-specific.
  • VTK 6 support #739
  • [TE] Analogies does not delete annotations #848
    • Code never worked as intended
  • [TE] Add eo4vistrails features to vistrails? https://github.com/ict4eo/eo4vistrails
    • Look at these and try to determine what may be appropriate
    • data types are probably most important, doesn't look like anything there
  • [DK] OutputModules

Apr 2, 2014

Updates

  • [TE] Added annotations and control parameters to visual diff
    • three tabs
    • TODO: analogies
      • just work on making sure the actions execute, don't worry about the matching
  • [TE] Build documentation using buildbot
    • makefile is run directly so no need to patch (no merges)
    • the nightly build still generates the pages, and the cron job now pulls the built pages
    • cron job is set to pull built pages after nightly build
  • [TE] Updated code coverage tester (currently 33%)
    • coverage is done manually, it is not done nightly
    • takes 15 minutes to run on vis-7
  • [DK] Output modules

Items to Discuss

  • Unicode support
    • question on mailing list
    • fix for LC on master, port to v2.1?
  • Output modules
    • outputs, modes, mode configurations
    • spreadsheet as a package
    • output mode configuration dependencies
  • VTK 6 support #739
    • trying to complete output modules before new VTK package

Mar 26, 2014

Updates

  • Branches to be merged in: application-no-default-argv, export-cells, testsuite-module-imports, reset-cell-sizes-button (icon?), export-versiontree-dot, custom-colors
  • [TE] Added branch control-parameters (used by list handling)
    • For setting hashed control options on modules (How to combine lists, while looping, caching)
    • Missing: Include in diff/merges/analogies?

Items to Discuss

  • iPython
    • script to workflow bridge
    • access to VisTrails commands in iPython
    • changes to the API
  • Multi-faceted output
  • [RR] Documentation buildbot
    • [TE] currently here
    • problem is that we don't get a message when the documentation failed to build
    • what about copying the output from a build machine to vistrails.org?
  • [RR] Last(?) migration issue: #837
    • [DK] to look at
  • VTK 6 support #739
  • spreadsheet export button in UV-CDAT
  • Develop menu?

Mar 19, 2014

Updates

  • [DK] UV-CDAT #407 fix, version tree rewrites

Items to Discuss

  • [TE] Cleaning up test suite log
    • Deprecated core.modules.constant_configuration imports
      • check whether the deprecations for vistrails.core.vistrails_module.VistrailsModule camel-case are similar
    • KeyError (Ticket #837), one of [DK]'s commits
      • [DK] to look at
    • Skip modules not to be imported (or fix/remove)
    • Python invalid drawable (Probably from mashups)
    • Missing version 0.8 of package org.vistrails.vistrails.tests.upgrade (bug?)
  • [TE] Add annotations to signature? (hash-annotations)
    • for the streaming/automatic looping for new values on the modules
    • could be any module
    • label doesn't affect caching
    • prefix, flag involved, add fields to module about looping
    • schema change? problem is that we don't want to have to recreate the module each time
    • backward compatibility
    • new type of "control parameter"
  • VTK 6?
  • UV-CDAT #408: [RR] has worked on this
    • support adding more formats to export
    • decided to have single menu item, and put all available formats in the dialog
  • Spreadsheet resizing bug
    • if you resized the sheet, would reset to equal-size cells
    • spreadsheet-resizing branch changed this to allow non-equal-size cells on resize
    • has some math issues because doing this dynamically is difficult
    • button to resize to equal-size on branch
    • updating the documentation
    • [TE] can we store the locations more precisely? (problem might be if a header column is also resized)
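
The hash-annotations discussion above separates values that should invalidate the cache (a new kind of "control parameter") from annotations such as a label, which should not. A minimal sketch of that split follows. The function `module_signature`, its arguments, and the `loop_type` key are hypothetical and chosen only for illustration; the real VisTrails hasher works on its own data structures.

```python
import hashlib


def module_signature(name, parameters, control_parameters=None, annotations=None):
    """Hash the inputs that should invalidate a cached result.

    annotations is accepted but deliberately ignored, so changing a
    label does not change the signature.
    """
    hasher = hashlib.sha1()
    hasher.update(name.encode("utf-8"))
    # Sort keys so the signature does not depend on dictionary order.
    for key in sorted(parameters):
        hasher.update(("p:%s=%r" % (key, parameters[key])).encode("utf-8"))
    for key in sorted(control_parameters or {}):
        hasher.update(("c:%s=%r" % (key, control_parameters[key])).encode("utf-8"))
    return hasher.hexdigest()


base = module_signature("Map", {"n": 3})
relabeled = module_signature("Map", {"n": 3}, annotations={"label": "my map"})
looped = module_signature("Map", {"n": 3}, control_parameters={"loop_type": "pairwise"})
```

With this split, `relabeled` equals `base` (the label does not affect caching), while `looped` differs because a control parameter changed.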

Mar 13, 2014

Updates

  • [RR] Package dependency ordering issue (#829); fixed(?), merge?
    • [DK] Yes, merge
  • [RR] Will merge in tabledata additions
  • [RR] New SQL package, using tabledata and sqlalchemy. Should be fully backward compatible(?)
    • Check if Fernando has anything here from the Oracle workflow and database provenance
  • [RR] Searching in vistrail and regex (#373, #779); fixed?
    • Error was hidden (entering '?' searched for '.*?.*' which was a valid regex but still didn't do what was expected)
  • [TE] IPython console now shows <STANDARD OUTPUT> before printing from stdout/stderr
  • [TE] Vistrail variables fixed
    • Deleting a module with a variable did not remove the variable from the workflow
    • Copying modules with variables now works
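
The search problem noted above (entering '?' searched for '.*?.*') is easy to reproduce with Python's `re` module. The sketch below assumes the query was wrapped in ".*" on both sides, which is consistent with the pattern quoted in the notes. The helper names and sample strings are made up for the example and are not taken from the VisTrails code.

```python
import re


def naive_pattern(user_text):
    # Wrapping raw input in ".*" turns "?" into the lazy quantifier ".*?".
    return ".*" + user_text + ".*"


def literal_pattern(user_text):
    # re.escape makes "?" match a literal question mark.
    return ".*" + re.escape(user_text) + ".*"


names = ["final version", "does this work?"]

naive_hits = [n for n in names if re.match(naive_pattern("?"), n)]
literal_hits = [n for n in names if re.match(literal_pattern("?"), n)]
```

The naive pattern is valid, so no error is raised, but it matches every name. The escaped pattern matches only the name that contains a question mark.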

Items to Discuss

  • [RR] Do not run modules upstream of submitted job
    • jobs-use-signatures: Uses the subpipeline signature to identify jobs
    • works: if job is submitted, check it without running upstream (we don't need the parameters this time(?))
    • doesn't work in Map
    • jobs-use-signatures_/connect-folded-module
      • create a connection from folded module to InputList upstream
      • looped module can run upstream if it needs, else we don't
      • problem: the Map still needs the list, for length and type-checking
    • Could we rearchitect the looping behavior? Basically, just copy the pipelines and run them with each entry from the loop
      • Provenance could expand
  • [RR] module-upgrade-recursion status? (not yet merged)
    • [DK] to look at and merge (should be done)
  • [DK] UV-CDAT Issues: #407 and #408
  • [RR] Thumbnail comparison bug: different references are selected on master and dont-use-modules-as-data! (pipeline results are identical) Problem with upgrade code?
    • using different reference images
    • [TE] to look at
    • look at the console_mode, how it gets back the upgraded version id
    • filed as #831
  • [TE] Vistrail variables
    • Normally only one hidden module exists for each vistrail variable, but pasting a module with a vistrail variable will create a new hidden module. This does not break anything, but should be taken into account when modifying the code.
    • When a vistrail variable module is created, an input named "value" is created. A bug with default values gave it the value "None". This is fixed. To make validation of existing vistrail variables work we need to check if the value is "None", but this would not allow "None" as a string.
    • [DK] to look into this, we cannot have "None" there.
  • [JF] NSF Big Data proposals (due in June)
  • uuid status
    • need to check tests and features outside of normal execution

Mar 5, 2014

Updates

Items to Discuss

  • [RR] New SQL package, using tabledata and sqlalchemy
    • No upgrade possible? (output changed from List (of rows) to Table)
      • Should we make it a new package?
      • Will try to make backwards compatible
  • [TE] Change pinning to always show pinned panels? (#828)
    • Yes, it is a better behavior
  • [RR] Package dependency ordering issue (#829)
    • Will try to fix
  • [RR] Merging multithreaded-interpreter and JobManager
    • Need examples of JobSubmission
  • [TE] Merging automatic-loops-streaming
    • Try to merge with multithreaded-interpreter
  • Wiki menu looks strange in Chrome
    • [TE] Will try to fix

Feb 26, 2014

Updates

  • Branch to be merged in: hidden-port-icon (v2.1)
  • [RR] tabledata-merge-tabdata: added tests, fixed issues (todo: json)
  • [TE] Added package RemoteQ (BatchQ/PBS/Hadoop)
    • new Mac binary that includes everything here
  • [DK] gmaps, climate proposal

Items to Discuss

  • Climate proposal
    • recommendation and best practices
    • automatically segmenting the version tree, better modes of interaction
  • [RR] tabledata
    • selections, projections, joins
    • writing out table as csv?
    • much improved bugs
    • [DK] should merge aggregate module int
    • [RR] working on different json representations
  • [TE] Hadoop configuration
  • Data cleaning/repo:
    • always need the vistrails engine to run things
    • scripting interface

Feb 19, 2014

Updates

  • [TE] automatic-loops-streaming works
  • [TE] Hadoop package:
    • Merged BatchQ-PBS and hadoop into RemoteQ
  • [DK] improvements to tabledata, gmaps
  • [RR] java-pkg is done

Items to Discuss

  • [TE] automatic-loops-streaming
    • Add documentation
    • Perhaps a matplotlib example where the plot updates as more data streams
    • Do we want to add additional controlflow features?
      • Implemented while module, but this needs to be improved because of annotations
      • annotations don't invalidate cache
      • could include annotations in hasher for cache updates
      • how does if module work?
    • [RR] pairwise for three ports?
      • the list modules do this, but we may need an interface for this
    • http://dev.mygrid.org.uk/wiki/display/taverna/List+handlings
  • [TE] Hadoop package:
    • Best way to separate task and configuration parameters?
    • URI is different for different clusters
    • use package configuration variables
    • move from checkbox to allow more general configuration?
    • use configuration file for this?
  • [RR] working to merge multithreading code with job-monitor, look at streaming
  • [RR] java-pkg
    • works for some classes (call constructors, setters, getters)
    • cannot call any methods from VisTrails, don't want to mutate
    • can call these methods from your own modules, also insert code in Python into these packages
    • may add in an interface for loading jar files and wrapping them
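
The suggestion above to include annotations in the hasher can be sketched as follows. This is only an illustration, assuming the cache is keyed on a per-module signature; `module_signature` and its arguments are hypothetical names, not the VisTrails hasher API.

```python
import hashlib


def module_signature(name, parameters, annotations=None):
    """Return a hex digest identifying a module configuration for caching.

    All three arguments are hypothetical stand-ins, not VisTrails structures.
    """
    hasher = hashlib.sha1()
    hasher.update(("M:%s;" % name).encode("utf-8"))
    # Sort so that dictionary ordering does not change the signature.
    for key, value in sorted(parameters.items()):
        hasher.update(("P:%s=%s;" % (key, value)).encode("utf-8"))
    # Folding annotations in means that editing an annotation yields a new
    # signature, so a cache keyed on it no longer returns the stale result.
    for key, value in sorted((annotations or {}).items()):
        hasher.update(("A:%s=%s;" % (key, value)).encode("utf-8"))
    return hasher.hexdigest()


without = module_signature("While", {"max_iterations": 10})
state_1 = module_signature("While", {"max_iterations": 10}, {"state": "1"})
state_2 = module_signature("While", {"max_iterations": 10}, {"state": "2"})
```

Here the three signatures all differ, while repeating a call with the same arguments gives the same digest, which is the property a cache lookup would rely on.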

Feb 11, 2014

Updates

  • [TE] automatic-loops-streaming - partially works
    • Using a python generator for each streaming module
    • TODO: Streaming Sinks
      • last module collecting values
    • Logging/Feedback performance
    • Demo?
  • [DK] vtk-new-package
  • [RR] java package: conflicts with dlls and jpype, going to work

Items to Discuss

  • [TE] Hadoop package:
    • a few installation issues, vistrails-hadoop pkg on git but was not accessible to public
    • same package for local cluster and aws? Yes, but need to change two lines, path to configuration files on server
    • could config files be set in a module instead or in the preferences?
    • directories are hardcoded in hadoop package, need to be parameters
    • anything that can change in different installations should be parameters
    • can be hard to figure out what does change
  • [DK] merge tabledata-merge-tabdata, matplotlib-add-helpers?
    • examples are not very useful, no example with multiple plotline modules?
    • Update matplotlib examples to better show off capabilities
  • [DK] package loading speed, kwargs
    • kwargs can be significantly slower
  • java issues:
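
As an illustration of the "python generator for each streaming module" approach noted in the updates above, here is a minimal sketch in plain Python. The function names are hypothetical and this is not the automatic-loops-streaming implementation; it only shows how a chain of generators lets the last step (the streaming sink) collect values without any step holding the full list.

```python
def source(n):
    """Yield values one at a time instead of building a full list."""
    for i in range(n):
        yield i


def square(stream):
    """Intermediate streaming step: consumes one item, emits one item."""
    for value in stream:
        yield value * value


def sum_sink(stream):
    """Streaming sink: the last step, collecting values as they arrive."""
    total = 0
    for value in stream:
        total += value
    return total


# Nothing is computed until the sink starts pulling from the chain.
result = sum_sink(square(source(5)))
```

Because the sink drives the iteration, order is preserved and only one element is in flight at a time.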

Feb 5, 2014

Updates

  • [TE] automatic-loops
    • PortSpec-based version now works
      • Connections show how modules will be iterated
    • Streaming is tricky
  • [DK] Examples, merging tabledata

Items to Discuss

  • mailing list question
    • no response yet
  • UV-CDAT: warnings when downsizing spreadsheet
    • problem: warnings will pop up a lot
    • keep offscreen cache of (x,y) locations in spreadsheet?
    • cache pipeline instead of full rendered pipeline
  • [TE] streaming with branches becomes interesting
    • only want to update upstream branches once
    • Taverna has streaming-style (pipelining)
  • PythonSource deleting port?
  • [RR] Java: possibly considering the matplotlib-style port creation versus java-style ports

Jan 29, 2014

Updates

  • [TE] automatic-loops - Groups and Multi-level looping now works
  • [DK] matplotlib-add-helpers: Rework the semi-automatic generators
  • [RR] java-pkg: bringing in old code, spreadsheet still works, probably going for dynamic generation of packages
    • looking at jpype versus pyjnius

Items to Discuss

  • [RR] str-format-module: in master. Merge in v2.1? Having to open the configuration window and click on a button is awkward
    • [DK] watch out for dynamically created format strings, could cause inconsistency at compute time if the format string doesn't match the ports
    • Go ahead to merge since similar problem with PythonSource
    • What about using number of connections to enforce this? (set to 0 for config-style ports?)
  • Streaming in VisTrails
    • Q: how is parallelism working in VisTrails? A: both threading and multiprocessing, but separate branch
    • for streaming, order of input and output must be preserved
    • module can generate a number of pieces, a module that collects needs to know when the stream ends
    • need to preserve order of map operations and
    • goal is to reduce the storage costs, want to make sure that we don't keep entire data in memory, don't generate the entire list at beginning
    • hyperflow needs to know when a stream starts and when it ends: start and end token, hyperflow wraps data in a token
    • Huy advises getting rid of the list and using tokens instead?
    • interpreter can call compute-upstream and just tell interpreter to go upstream
    • VTK streaming interpreter: might be in svn
    • Remi: just be able to pass an object on the connection that knows the shape of the data and how to get next element (basically an iterator)
      • where do we call get-next? module in the collector calls
    • main advantage of streaming is being able to do aggregation during the compute (otherwise, we're doing map-reduce)
    • the iterator encapsulates the calls to the module's compute
    • issue with logging? already rewritten for parallel, should work here
    • Examples: triangle_area, image histograms
    • Slowdown caused by lists? colors? logging?
    • Schema changes: need to record depth, what type of semantics for combining port values
    • Also probably means that we don't pass just raw values, need to wrap with depth and iterator
    • Issues Huy raised:
      • issue with multithreading and streaming both enabled
      • triangle with modules: need to be careful about order with items coming from two different modules (e.g. BuildTable in Rémi's example)
  • UV-CDAT
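
On the str-format-module concern raised above (a format string that does not match the ports), one possible check is to extract the field names from the format string and compare them with the port names before compute time. The sketch below uses only the standard library; `format_fields` and `check_ports` are hypothetical helpers, not part of the str-format-module branch.

```python
import re
from string import Formatter


def format_fields(format_string):
    """Return the set of top-level field names in a str.format string.

    Nested fields inside format specs are not inspected.
    """
    fields = set()
    for _text, field_name, _spec, _conv in Formatter().parse(format_string):
        if field_name is None:
            continue  # trailing literal text, no replacement field
        # "a.b" and "a[0]" both refer to the argument "a".
        fields.add(re.split(r"[.\[]", field_name, maxsplit=1)[0])
    return fields


def check_ports(format_string, port_names):
    """Return (fields with no port, ports with no field), both sorted."""
    used = format_fields(format_string)
    ports = set(port_names)
    return sorted(used - ports), sorted(ports - used)


missing, unused = check_ports("{a} and {b}", ["a", "c"])
```

With these inputs, `missing` lists the field `b` that has no port and `unused` lists the port `c` that the string never references.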

Jan 22, 2014

Updates

  • [RR] Branches to be merged in: reload-disabled-package (#714), ungroup-keep-disconnected-ports_/hybrid
  • [TE] VisTrails 2.1.1 released
  • [DK] matplotlib, uuid work

Items to Discuss

  • [TE] Automatic Looping
    • Using ListOf module as input will trigger the looping
      • Runtime type checking because we don't have a ListOfT class for each type.
      • [DK] USGS interested in working with this
      • Add a merge module to make it clear where the iteration starts and ends
      • Have different type of ports
      • Remi to talk to Huy about streaming ideas
      • can we detect the iteration and draw those modules differently
      • decided that some fold/merge module would be useful, even if we don't always need to use it explicitly
  • [RR] Ticking in matplotlib
    • formatters, locators, tick scale
  • [RR] Sort tickets? There are issues for the 2.0 milestone and issues that affect released versions. Tickets assigned to 2.1 are here.
  • [DK] To contact USGS about the new persistence package and get feedback
  • [DK] Merge weather and subway examples to use new tabledata package
  • Look into pandas http://pandas.pydata.org/
    • could be helpful to support some of the table operations
    • in the future, we may want to support it for data analysis and stats (but for this, additional dependencies are required)
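
The automatic looping idea discussed above (a list input triggers iteration, with element types checked at run time, followed by an explicit fold/merge step) might look roughly like this in plain Python. The names `run_with_auto_loop` and `fold` are hypothetical and this is a sketch of the idea, not the automatic-loops branch.

```python
def run_with_auto_loop(compute, value, element_type):
    """Call compute once per element when value is a list, else once.

    The element type is checked at run time because a plain list does not
    carry a static element type.
    """
    items = value if isinstance(value, list) else [value]
    for item in items:
        if not isinstance(item, element_type):
            raise TypeError(
                "expected %s, got %r" % (element_type.__name__, item))
    if isinstance(value, list):
        return [compute(item) for item in value]
    return compute(value)


def fold(function, values, initial):
    """Explicit merge step that marks where the iteration ends."""
    result = initial
    for value in values:
        result = function(result, value)
    return result


doubled = run_with_auto_loop(lambda x: 2 * x, [1, 2, 3], int)
total = fold(lambda acc, x: acc + x, doubled, 0)
single = run_with_auto_loop(lambda x: 2 * x, 4, int)
```

A list input produces a list of results, a single value is computed once, and the fold reduces the looped results back to one value.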

Jan 15, 2014

Updates

  • [RR] improves-logging:
    • Pass exceptions to it directly for single line, or traceback.format_exc() for traceback
    • WARNING level now printed in the console by default, -V 1: INFO (log() calls), -V 2: DEBUG (debug() calls)
    • Messages view now shows whatever is selected, no matter the console log level
    • Python warnings get captured
    • Use warnings.warn(..., category=VistrailsWarning|VistrailsDeprecation) to warn once
    • Backport to v2.1? (so deprecation warnings actually show up)
      • -> Yes, include in v2.1.1
  • [TE] Hadoop package documented here: http://www.vistrails.org/index.php/Hadoop_Package
  • [DK] UV-CDAT, LSU
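
The improves-logging conventions listed above can be illustrated with the standard library alone. In this sketch `VistrailsWarning` is a stand-in class defined locally (the real class lives in VisTrails and is not imported here), and `console_level`, `describe`, and `old_api` are hypothetical helpers.

```python
import logging
import traceback
import warnings


class VistrailsWarning(UserWarning):
    """Local stand-in for the warning category named in the notes."""


def console_level(verbosity):
    """Map the -V value to a console level: WARNING by default."""
    return {0: logging.WARNING, 1: logging.INFO}.get(verbosity, logging.DEBUG)


def describe(exc, with_traceback=False):
    """One line from the exception itself, or the full traceback text.

    Must be called inside the except block when with_traceback is True.
    """
    if with_traceback:
        return traceback.format_exc()
    return "%s: %s" % (type(exc).__name__, exc)


def old_api():
    warnings.warn("old_api() is deprecated", category=VistrailsWarning)


try:
    int("not a number")
except ValueError as exc:
    one_line = describe(exc)
    full = describe(exc, with_traceback=True)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("default")  # report each warning location once
    old_api()
    old_api()  # same location and message, so not reported again
```

The "default" filter is what gives the warn-once behaviour: the second call to `old_api` comes from the same location with the same message, so only one warning is recorded.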

Items to Discuss

  • [RR] input-module-no-subclass: input ports of type 'Module' shouldn't accept any output? These are used for 'self'->controlflow module connections...
    • controlflow modules have a function port that accepts a module
    • check for modules that might use Module for something else
    • suggestion is to use Variant for any type being supported (StandardOutput, List, others)
    • warns for now (doesn't actually disallow the connection until next version)
    • -> David wants to look at this more
  • [RR] ungroup-keep-disconnected-ports: alternative ungrouping, keeps pipeline disconnected and preserves InputPort/OutputPort modules
    • ideally, want to edit in place, could try to emulate the edit subworkflow code here, just have to save things back to the original vistrail
    • -> hybrid approach, materialize the unconnected ports
  • Bugfix release? Would solve issues with server and parameter exploration. (also, logging improvement?)
    • [TE] Also fixes to workspace
  • [DK] font issue for OS X 10.9
  • Look at Java support in VisTrails?
    • can we leverage Rémi's past work here? on Weka
    • pyjnius: cannot subclass a Java class in Python, should be able to create and call out to java classes
    • issue with Jython is the interface
    • also had java-based spreadsheet
  • also look at moving VTK to a matplotlib style generation

Jan 8, 2014

Updates

  • [TE] Hadoop package: Added -combine option for Cs9223_Mapreduce_Assignment
    • tricky to set up the package still, need documentation to understand how to assemble workflows from scratch
  • [RR] Logging improvements underway (branch 'improves-logging')
    • Fixes some logging gotchas
    • Better exception printing (pass them directly to critical|warning|log)
    • WARNING is now the default level for the console (only CRITICAL displays a popup)
  • [RR] Alternative Mac .app (example dmg)
    • Standard build of Python (currently as a Mac .framework)
    • Have to relink libraries with install_name_tool
    • Seems to work
    • Advantages: standard, full build of Python; pip will work
    • Last problem: if Qt is installed, it gets loaded and conflicts with bundled Qt (I missed a dylib link)
  • [DK] Work with USGS, refactoring version trees

Items to Discuss

  • Colin's feedback on loops is that it has taken some time to wrap his head around what is going on
    • what does Taverna do? automatically determines when a collection is being input and does iteration over the input and determines what output looks like
    • look at what this would take, figure out the best way to attack this (Automatic loops)
  • Colin also asked about use of persistence in another module

Jan 3, 2014

Updates

  • [TE] VisTrails 2.1 released
  • [DK] UUID branch, USGS work, upgrade recursion
    • van Wijk on tree vis, and Buneman on XML Updates

Items to Discuss

Older meetings