Difference between revisions of "Provenance challenge"

From VistrailsWiki
This page describes an implementation for answering the queries of the second provenance challenge.


The goal of this project is to create an API capable of querying different kinds of databases containing provenance data. The main focus will be on provenance generated by scientific workflows.


== primitives ==
The API will deal with the basic primitives describing workflow executions.


--[[User:Tommy|Tommy]] 08:51, 12 April 2007 (MDT)
 
node types:
 
dataitem
module
moduleInstance
moduleExecution
workflow
workflowExecution
inputPort
outputPort
 
 
Relations:
 
Relation Input Output
----------------------------------------------
exists all boolean
equals all boolean
annotations all dict of key/value pairs
 
getInputPortForData dataItem inputPort
getOutputPortForData dataItem outputPort
getDataFromInputPort inputPort dataItem
getDataFromOutputPort outputPort dataItem
 
hasInputPort moduleInstance inputPort
inputPortOf inputPort moduleInstance
hasOutputPort moduleInstance outputPort
outputPortOf outputPort moduleInstance
 
 
outputOf dataItem moduleExecution
inputOf dataItem moduleExecution
hasOutput moduleExecution dataItem
hasInput moduleExecution dataItem
 
startTime moduleExecution time
endTime moduleExecution time
startTime workflowExecution time
endTime workflowExecution time
 
executionOf    moduleExecution moduleInstance
executionOf workflowExecution workflowInstance
 
hasExecution    moduleInstance moduleExecution
hasExecution workflowInstance workflowExecution
 
executions      workflowExecution moduleExecution
executedIn moduleExecution workflowExecution
 
inWorkflow moduleInstance workflow
hasModule workflow moduleInstance
 
connectedTo inputPort outputPort
connectedTo outputPort inputPort
 
runsModule moduleInstance module
hasInstance module moduleInstance
 
 
derived relations: (might be native)
 
derivedFrom   dataItem dataItem
derivedData   dataItem dataItem
previousModuleExecution   moduleExecution moduleExecution
 
 
 
transitive relations:
 
 
datatype relation
--------------------------------
upstreams:
 
dataitem derivedFrom - .outputOf()[forall].hasInput()
moduleInstance prevModuleInstance - .hasInputPort()[forall].connectedTo().outputPortOf()
moduleExecution prevModuleExecution - .hasInput()[forall].outputOf()
 
downstreams:
 
dataitem derivedData - .inputOf()[forall].hasOutput()
moduleInstance nextModuleInstance - .hasOutputPort()[forall].connectedTo().inputPortOf()
moduleExecution nextModuleExecution - .hasOutput()[forall].inputOf()
 
 
 
--[[User:Tommy|Tommy]] 09:05, 12 April 2007 (MDT)

Revision as of 15:05, 12 April 2007

Second provenance challenge design overview

This page describes an implementation for answering the queries of the second provenance challenge.

The goal of this project is to create an API capable of querying different kinds of databases containing provenance data. The main focus will be on provenance generated by scientific workflows.

primitives

The API will deal with the basic primitives describing workflow executions.


node types:

dataitem
module
moduleInstance
moduleExecution
workflow
workflowExecution
inputPort
outputPort


Relations:

Relation                Input               Output
------------------------------------------------------------------
exists                  all                 boolean
equals                  all                 boolean
annotations             all                 dict of key/value pairs

getInputPortForData     dataItem            inputPort
getOutputPortForData    dataItem            outputPort
getDataFromInputPort    inputPort           dataItem
getDataFromOutputPort   outputPort          dataItem

hasInputPort            moduleInstance      inputPort
inputPortOf             inputPort           moduleInstance
hasOutputPort           moduleInstance      outputPort
outputPortOf            outputPort          moduleInstance

outputOf                dataItem            moduleExecution
inputOf                 dataItem            moduleExecution
hasOutput               moduleExecution     dataItem
hasInput                moduleExecution     dataItem

startTime               moduleExecution     time
endTime                 moduleExecution     time
startTime               workflowExecution   time
endTime                 workflowExecution   time

executionOf             moduleExecution     moduleInstance
executionOf             workflowExecution   workflowInstance

hasExecution            moduleInstance      moduleExecution
hasExecution            workflowInstance    workflowExecution

executions              workflowExecution   moduleExecution
executedIn              moduleExecution     workflowExecution

inWorkflow              moduleInstance      workflow
hasModule               workflow            moduleInstance

connectedTo             inputPort           outputPort
connectedTo             outputPort          inputPort

runsModule              moduleInstance      module
hasInstance             module              moduleInstance
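Most relations in the table come in inverse pairs (for example hasInput and inputOf). The following is a minimal sketch of how such pairs could be kept consistent in memory. The relation names are taken from the table; the RelationStore class, its methods, and the INVERSES mapping are hypothetical and only illustrate the idea, they are not part of the actual API.

```python
# Inverse pairs read off the relation table; the storage layout is an assumption.
INVERSES = {
    "hasInput": "inputOf",
    "hasOutput": "outputOf",
    "hasInputPort": "inputPortOf",
    "hasOutputPort": "outputPortOf",
    "hasExecution": "executionOf",
    "hasModule": "inWorkflow",
    "hasInstance": "runsModule",
    "executions": "executedIn",
    "connectedTo": "connectedTo",  # listed in both directions in the table
}


class RelationStore:
    """Hypothetical store: relation name -> {source node -> set of target nodes}."""

    def __init__(self):
        self._facts = {}

    def _record(self, relation, source, target):
        self._facts.setdefault(relation, {}).setdefault(source, set()).add(target)

    def add(self, relation, source, target):
        # Record the fact and, when the table lists an inverse, the reverse fact too.
        self._record(relation, source, target)
        inverse = INVERSES.get(relation)
        if inverse is not None:
            self._record(inverse, target, source)

    def query(self, relation, source):
        # Return a copy so callers cannot mutate the stored sets.
        return set(self._facts.get(relation, {}).get(source, set()))


store = RelationStore()
store.add("hasInput", "exec1", "dataA")
store.add("hasOutput", "exec1", "dataB")
```

With this layout, asking for inputOf on dataA returns exec1 without a separate insert for the inverse direction.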


derived relations: (might be native)

derivedFrom               dataItem          dataItem
derivedData               dataItem          dataItem
previousModuleExecution   moduleExecution   moduleExecution


transitive relations:


datatype relation


upstreams:

dataitem          derivedFrom - .outputOf()[forall].hasInput()
moduleInstance    prevModuleInstance - .hasInputPort()[forall].connectedTo().outputPortOf()
moduleExecution   prevModuleExecution - .hasInput()[forall].outputOf()
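As an illustration of the upstream definition of derivedFrom, the sketch below composes outputOf and hasInput for one step and then repeats that step to get the transitive closure. It assumes that [forall] means "apply the next relation to every element of the previous result"; that reading, the dict-based representation, and the function names are assumptions made for this example.

```python
def derived_from(item, output_of, has_input):
    """One upstream step: the inputs of every execution that produced `item`."""
    result = set()
    for execution in output_of.get(item, ()):
        result |= has_input.get(execution, set())
    return result


def upstream(item, output_of, has_input):
    """Transitive closure of derived_from, computed with an explicit worklist."""
    seen = set()
    todo = [item]
    while todo:
        current = todo.pop()
        for parent in derived_from(current, output_of, has_input):
            if parent not in seen:  # guards against revisiting shared ancestors
                seen.add(parent)
                todo.append(parent)
    return seen


# Hypothetical sample data: a -> e1 -> b -> e2 -> c
output_of = {"b": {"e1"}, "c": {"e2"}}   # dataItem -> moduleExecution
has_input = {"e1": {"a"}, "e2": {"b"}}   # moduleExecution -> dataItem

one_step = derived_from("c", output_of, has_input)
all_ancestors = upstream("c", output_of, has_input)
```

The downstream relation derivedData can be obtained the same way by swapping in inputOf and hasOutput.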

downstreams:

dataitem          derivedData - .inputOf()[forall].hasOutput()
moduleInstance    nextModuleInstance - .hasOutputPort()[forall].connectedTo().inputPortOf()
moduleExecution   nextModuleExecution - .hasOutput()[forall].inputOf()
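The moduleInstance definitions chain three relations through the ports. A generic helper that applies a sequence of relations to a set of nodes covers both directions. This is only a sketch: the follow function and the sample workflow are hypothetical, and each relation is assumed to be available as a mapping from a node to a set of nodes.

```python
def follow(start, *relations):
    """Apply each relation in turn to every node in the current result set."""
    current = {start}
    for relation in relations:
        following = set()
        for node in current:
            following |= relation.get(node, set())
        current = following
    return current


# Hypothetical workflow: m1 feeds both m2 and m3.
has_output_port = {"m1": {"m1.out"}}
has_input_port = {"m2": {"m2.in"}, "m3": {"m3.in"}}
output_port_of = {"m1.out": {"m1"}}
input_port_of = {"m2.in": {"m2"}, "m3.in": {"m3"}}
connected_to = {
    "m1.out": {"m2.in", "m3.in"},
    "m2.in": {"m1.out"},
    "m3.in": {"m1.out"},
}

# nextModuleInstance: .hasOutputPort()[forall].connectedTo().inputPortOf()
next_instances = follow("m1", has_output_port, connected_to, input_port_of)
# prevModuleInstance: .hasInputPort()[forall].connectedTo().outputPortOf()
prev_instances = follow("m2", has_input_port, connected_to, output_port_of)
```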


--Tommy 09:05, 12 April 2007 (MDT)