Development

From VistrailsWiki

See also the Github wiki

2015

February 25, 2015

Updates

Items to Discuss

  • [TE] vtkviewcell for infovis support, can we unify with VTKCell?
    • need to test this
  • [TE] vtk wrapping
    • Mostly finished
    • VTK 5.10 produces incorrect results with the old wrapping
      • Old wrapper is based mostly on VTK 4
      • Most vtk_examples affected
      • should be able to upgrade from SetInput to SetInputData (need to drop GetOutput and replace it with self ports)
      • can we change vtkInstance to just return self and not wrap things?
    • terminator example not working under 5.8?
  • How does VTK wrapping fit into general wrapping framework?
  • [RR] new persistence package

February 18, 2015

Updates

  • [RR] New VisTrails API and IPython integration (#24)

Items to Discuss

  • [TE] VTK wrapping
    • Benchmarking vtk package
      • Old: 24.7 seconds
      • New: 10.5 seconds (except on the first run, which adds 8 seconds)
        • The parsing that calls is_abstract (which tries to instantiate all VTK classes) is now only run the first time.
        • get_items_from_sigstring takes 2 seconds, maybe we can use a lookup dict for already computed sigstrings?
    • Now using a general python function wrapper
      • VTK classes are wrapped into Python functions that do not depend on VisTrails
      • VTK functions can be executed without VisTrails
      • The spec maps functions into VisTrails modules, but can also describe wrapping
      • The general Python function wrapper supports:
        • kwarg inputs
        • single, list, dict outputs
        • callback for progress reporting
        • temporary file generator for using FilePool
        • optional output generation
      • Creating specs:
          • Create spec by hand
          • Auto-create spec outline (TODO) and manually finish it
          • Dynamically create spec (VTK)
          • Implement documentation wrappers (can use the scikit-learn wrapper to wrap numpydoc) (TODO)
          • Classes make bad functions: they need to be wrapped in new functions before they can be wrapped. This is different for each package.
            • Classes are hard, as in VTK and matplotlib. scikit-learn still does not wrap classes
          • Spec diffing and patching could be done using code from matplotlib.
    • Still needs upgrades from old VTK package
      • Is it possible to dynamically wrap functions, e.g., you see a SetFunc and just remove the 'Set' prefix? Or do you need to create a complete mapping?
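
The general function wrapper described above (kwarg inputs; single, list, or dict outputs; a progress callback) could look roughly like the sketch below. This is only an illustration in modern Python 3, not the code from the VTK branch: `wrap_function`, `output_names`, and `progress` are hypothetical names, and the rule used to tell a single output from several outputs is an assumption.

```python
from typing import Any, Callable, Dict, Optional, Sequence


def wrap_function(func: Callable[..., Any],
                  output_names: Sequence[str]) -> Callable[..., Dict[str, Any]]:
    """Wrap ``func`` so it takes keyword inputs and returns named outputs.

    ``output_names`` is a hypothetical stand-in for the output part of a spec.
    """
    def wrapped(progress: Optional[Callable[[float], None]] = None,
                **inputs: Any) -> Dict[str, Any]:
        if progress is not None:
            progress(0.0)  # report start
        result = func(**inputs)
        if isinstance(result, dict):
            # dict output: pick the declared outputs by name
            outputs = {name: result[name] for name in output_names}
        elif len(output_names) == 1:
            # single output (assumption: one declared name means one value)
            outputs = {output_names[0]: result}
        else:
            # list/tuple output: pair values with declared names in order
            outputs = dict(zip(output_names, result))
        if progress is not None:
            progress(1.0)  # report completion
        return outputs
    return wrapped


def divide(a, b):
    """Plain function with no dependency on any workflow system."""
    return a // b, a % b


wrapped_divide = wrap_function(divide, ["quotient", "remainder"])
```

Calling `wrapped_divide(a=7, b=2)` returns a dictionary keyed by output name, which a spec could then map onto output ports. Temporary file generation and optional outputs are left out of this sketch.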

February 11, 2015

Updates

  • Update from Friday's meeting
    • discussed VisTrails internals
    • discussed wrapping
      • xml discussion, hard to modify because tied to db code
      • TE has made it possible to add the schema-defined attributes to the intermediate representation
      • higher-level operations on the port specs
    • make sure the simple case works
      • [JF] take a simple package with documentation and figure out what the base case for wrapping is

Items to Discuss

  • [TE] VTK wrapping
    • Dynamic loading works
    • Reading XML is fast enough, but deserializing the data is slow
    • Working on patterns for patching
    • Matplotlib has many advanced patterns like argument ordering, nested arguments, alternateSpecs, output types.
      • Having all this in a general wrapper might confuse users?
    • [RR] Delay module (except for identifiers) until you need it---e.g. don't deal with port specs, etc. until necessary
  • Scripting Support #950
    • [RR] Issue with getting code from modules
    • Design a simple solution
    • [JF] Couldn't you use modules as black boxes without conversion, just to call into modules/subworkflows easily from e.g. IPython?
      • [RR] This is a job for the API, and a very separate use case. See #24
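
The idea of delaying a module until it is needed (keeping only identifiers up front) can be sketched as follows. This is an assumption-laden illustration in Python 3, not VisTrails registry code: `LazyRegistry`, `make_module_class`, and the `built` list are hypothetical names introduced here.

```python
from typing import Callable, Dict, List


class LazyRegistry:
    """Keep identifiers eagerly; build the class only on first access."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], type]] = {}
        self._classes: Dict[str, type] = {}

    def register(self, identifier: str, factory: Callable[[], type]) -> None:
        # Cheap: store the factory only, no port specs are processed here.
        self._factories[identifier] = factory

    def identifiers(self) -> List[str]:
        return sorted(self._factories)

    def get(self, identifier: str) -> type:
        # Expensive work happens here, once per identifier.
        if identifier not in self._classes:
            self._classes[identifier] = self._factories[identifier]()
        return self._classes[identifier]


built: List[str] = []  # records which classes were actually constructed


def make_module_class(name: str) -> Callable[[], type]:
    def factory() -> type:
        built.append(name)
        return type(name, (object,), {"input_ports": [], "output_ports": []})
    return factory


registry = LazyRegistry()
for name in ("vtkSphereSource", "vtkConeSource"):
    registry.register(name, make_module_class(name))
```

After registration, `registry.identifiers()` lists both names while `built` is still empty; only `registry.get("vtkConeSource")` triggers construction of that one class.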

February 4, 2015

Updates

Items to Discuss

  • Wrapping
    • Format to use? Currently XML (like current matplotlib)
      • JSON and YAML have simple "to python dictionary" methods
      • But don't stream
      • YAML a lot easier for humans
    • [DK] vtk-new-package also changes parameter names, creates enumerations
      • intermediate schema needs to be extensible
      • packages will want to store their own specific info for compute() method generation
      • also might have specs-altering info, like matplotlib's alternateSpec
    • representation to code: the registry already has a schema for some aspects
    • [RR] We might want to see if Module subclasses can be created lazily
      • no need to create all the classes just to register them in the registry and never actually use most of them
      • future effort
  • [RR] Where should VisTrails packages live?
    • tej installs as 'vistrailspkg.tej', TE installed it as 'userpackages.tej'
    • Currently, standard packages are 'vistrails.packages.', user packages are 'userpackages.' and packages loaded through pkg_resources might be anything
    • [RR] Use 'vistrailspkg.' everywhere?
    • Long-term effort to simplify package distribution/installation (and have VisTrails get them automatically?)
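
To make the package-location question concrete, here is a minimal sketch of resolving a package name against a list of candidate prefixes. It is not how VisTrails locates packages; `find_package` and `DEFAULT_PREFIXES` are hypothetical, the prefix order is an assumption, and the code uses `importlib.util.find_spec` from modern Python 3.

```python
import importlib.util
from typing import Optional, Sequence

# Prefixes mentioned in the discussion; the order here is an assumption.
DEFAULT_PREFIXES = ("vistrailspkg.", "vistrails.packages.", "userpackages.")


def find_package(codepath: str,
                 prefixes: Sequence[str] = DEFAULT_PREFIXES) -> Optional[str]:
    """Return the first importable ``prefix + codepath``, or None."""
    for prefix in prefixes:
        full_name = prefix + codepath
        try:
            spec = importlib.util.find_spec(full_name)
        except (ImportError, ValueError):
            # The parent package is missing or not a package.
            spec = None
        if spec is not None:
            return full_name
    return None
```

With a single agreed prefix such as 'vistrailspkg.', the loop collapses to one lookup, which is the simplification the question above is aiming at.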

January 28, 2015

Updates

  • T. Caswell to come visit on Fri 6 to discuss wrapping work

Items to Discuss

  • [TE] New VTK wrapping
    • Current code by DK seems a good deal faster
    • Generates XML that can be patched/tweaked, generates Python code from it, VisTrails only loads generated Python code
    • RR would rather have VisTrails load intermediate representation (= XML) directly, wants to make sure this is not slower
    • The goal is to turn the intermediate step into something generic that would be used for every wrapped package (vtk, numpy, matplotlib, sklearn, java) instead of each having its own
    • TC has his own code at github:VTTools, which parses numpy docstrings and generates modules; it doesn't yet handle classes or persist anything
  • Web crawler
    • Right now, TE starts jobs for "start crawler", "stop crawler", "install classifier"
    • RR would rather have the crawling be a job as far as VisTrails and tej are concerned
    • The whole thing would be one pipeline: load examples, train classifier, start crawler [check for job, kill previous one, upload model, start processes], get snapshot, visualize
      • Need some support in tej and job submission system: long-running jobs, stop a job (wait for it to finish?), restart a job even though results are cached

January 21, 2015

Updates

Items to Discuss

  • [RR] Unified wrapping method discussion #991
    • TE to work on reusable method with intermediate representation, starting with VTK
  • [RR] Examples for scikit-learn: JF has an old example using Weka with parameter exploration (not currently in source tree)
    • AM's examples are enough
  • [AM] scikit-learn package is done, merge it in? #955
    • RR will merge
  • [RR] What should copyright headers say? #994
    • Let's keep everything in there: Utah/Poly/NYU

January 14, 2015

Updates

  • [TE] Working on classifier
  • [RR] Scripting integration, work in progress

Items to Discuss

  • [RR] Unified wrapping method discussion (#991)
    • Let's talk next week, [AM] and [DK] are not here

January 7, 2015

Updates

Items to Discuss

  • make sure that we address critical issues, questions, and pending review branches in a timely manner
  • scripting support
    • [RR] no issues if we want to just keep annotations in the generated code to allow the link back to a workflow
    • [RR] can translate from workflow to script, working on script to workflow
    • will work for parameter value changes, structural changes require changes to the annotations
    • need to publish best practices here
    • would be cool to do looping in scripts (easier interface than with workflows)
  • notebook support (convert from notebook to workflow)
    • RR will sync with FC on this
  • Issue with console in built-from-scratch
    • [TC] IPython rearranged some of the completion stuff in 2.2 and 2.3
    • binary has an old version of IPython (1.0.0), should we update?
  • [TC] automated wrapping of numpy and scipy
    • discovered a bunch of malformed documentation in numpy and scipy
    • has github repo for vistrails tools
    • example modules wrap a bunch of R stuff (not baked in, just how things are)
    • will be pushing wrapping logic up
    • port names forbidden (window and domain)
    • have an import hook to get from yaml directly to VisTrails Modules
    • should work for any python modules with well-formed numpy docstrings.
  • [Action] should make it clear in the documentation that Constant now means serializable, not that the value doesn't change (e.g. List)
  • [TC] might be interesting to try to build components of matplotlib and accumulate in figure (long-term project, but thinking about how this might work)
  • [TE] build and build scripts
    • completely automatic, buildbot
    • need to set the build machines for the environment we want for the binary
    • would virtualenv work here?
    • [TC] anaconda can pin versions, potential path to test different configurations
    • Q: upload nightly binary builds? A: makes sense, make sure they are well-labeled
  • sourceforge stats: e.g. http://sourceforge.net/projects/vistrails/files/vistrails/nightly/vistrails-src-nightly.tar.gz/stats/timeline?dates=2014-01-01+to+2015-01-07
  • package issues (see Remi's message)
  • [TE] Scope of tej
    • Support single ssh commands?
    • Queue can be used as a remote machine (crawler is using queue.call*)
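
As a rough illustration of the numpy/scipy wrapping item above, the sketch below pulls parameter names and type strings out of the "Parameters" section of a numpydoc-style docstring, which is the raw material for input ports. It is not TC's code: `parse_parameters` and the sample `smooth` function are hypothetical, and the parser is deliberately simplistic (well-formed docstrings only, no handling of malformed sections).

```python
import inspect
import re
from typing import List, Tuple

PARAM_LINE = re.compile(r"^(\w+)\s*:\s*(.+?)\s*$")  # "name : type" at column 0
UNDERLINE = re.compile(r"^-{3,}\s*$")               # numpydoc section underline


def parse_parameters(docstring: str) -> List[Tuple[str, str]]:
    """Return (name, type string) pairs from the Parameters section."""
    lines = inspect.cleandoc(docstring).splitlines()
    params = []
    in_section = False
    for i, line in enumerate(lines):
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if UNDERLINE.match(following):
            # This line is a section title; track whether it is Parameters.
            in_section = line.strip() == "Parameters"
            continue
        if not in_section or UNDERLINE.match(line):
            continue
        match = PARAM_LINE.match(line)  # indented description lines never match
        if match:
            params.append((match.group(1), match.group(2)))
    return params


def smooth(values, width=3):
    """Smooth a sequence with a moving average.

    Parameters
    ----------
    values : array_like
        Input samples.
    width : int, optional
        Size of the averaging window.

    Returns
    -------
    out : ndarray
        Smoothed samples.
    """
    return values
```

Here `parse_parameters(smooth.__doc__)` yields one entry per documented parameter; a real wrapper would also need to map type strings to port types and handle the malformed docstrings mentioned above.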

Older meetings