Difference between revisions of "Assignment 1 - Data Exploration"

From VistrailsWiki
Line 16: Line 16:
* Make sure your pipelines are portable, i.e., I should be able to run them on my own machine. For example, you should avoid using files stored in your local file system.


[Image:http://vgc.poly.edu/~juliana/courses/BigData2014/Assignments/1-DataAnalysis/vistrails-screenshot.png|480px|right]
[[Image:vtbd2014-ss.png|480px|right]]
 
http://vgc.poly.edu/~juliana/courses/BigData2014/Assignments/1-DataAnalysis/vistrails-screenshot.png


[[Image:Cosmology_example.png|480px|right]]

Revision as of 06:32, 9 February 2014

Assignment Description

During our lab, we explored MTA data about subway fares. For your assignment, you will further explore this data set and try to find at least 4 interesting facts/observations. Use your creativity!

You will use VisTrails for this assignment, and you can start from the example we used in the lab. You can find more information about VisTrails in the Users' Guide.

You are also encouraged to use the Web as a resource to learn more about the packages you will use (e.g., matplotlib) and to find additional data that might be interesting to integrate with the fare data.

You can exchange ideas with your classmates, but the work you submit should be your own. Copying is not allowed.

Submission Instructions

You will submit the vt file containing the trail of your analysis to NYU Classes. Some guidelines you should follow:

  • The pipelines that correspond to the interesting facts you discover should be tagged using the following convention: Fact <number>. For example, Fact 1, Fact 2, etc. You can set the tag on the left pane in the History view (see screenshot below).
  • You should add notes to these pipelines explaining your findings. The notes field is located below the tag.
  • Make sure your pipelines are portable, i.e., I should be able to run them on my own machine. For example, you should avoid using files stored in your local file system.
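
One way to keep a pipeline portable is to read input data from a URL instead of a path that exists only on your machine. The snippet below is a minimal sketch of that idea in plain Python, not a VisTrails module: the helper names `is_remote_source` and `fetch_to_temp` are hypothetical, and any URL you pass in is assumed to be publicly reachable by whoever runs the pipeline.

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse


def is_remote_source(source):
    """Return True if source is an http(s) URL rather than a local path."""
    return urlparse(source).scheme in ("http", "https")


def fetch_to_temp(url):
    """Download url into the system temp directory and return the local path.

    Hypothetical helper: it refuses anything that is not an http(s) URL,
    so a machine-specific path cannot slip into the pipeline unnoticed.
    """
    if not is_remote_source(url):
        raise ValueError("expected an http(s) URL, got: %r" % (url,))
    # Fall back to a generic name when the URL path has no file component.
    filename = os.path.basename(urlparse(url).path) or "download.dat"
    target = os.path.join(tempfile.gettempdir(), filename)
    urllib.request.urlretrieve(url, target)
    return target
```

Calling `fetch_to_temp` with a local path such as `/home/me/fares.csv` raises a `ValueError`, which catches a non-portable input before the pipeline is submitted.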

vistrails-screenshot.png

Cosmology example.png