https://www.vistrails.org//index.php?title=Lab_notes_02/06/14&feed=atom&action=historyLab notes 02/06/14 - Revision history2024-03-28T16:30:36ZRevision history for this page on the wikiMediaWiki 1.36.2https://www.vistrails.org//index.php?title=Lab_notes_02/06/14&diff=6991&oldid=prevJuliana: /* The Problem: Analyzing MTA Fare Data */2014-02-25T19:16:28Z<p><span dir="auto"><span class="autocomment">The Problem: Analyzing MTA Fare Data</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 19:16, 25 February 2014</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l13">Line 13:</td>
<td colspan="2" class="diff-lineno">Line 13:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The MTA data can be downloaded from http://www.mta.info/developers/fare.html. <br></div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The MTA data can be downloaded from http://www.mta.info/developers/fare.html. <br></div></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>Here is a description of the different <del style="font-weight: bold; text-decoration: none;">fields in the data set</del>: http://www.mta.info/developers/resources/nyct/fares/fare_type_description.txt</div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>Here is a description of the different <ins style="font-weight: bold; text-decoration: none;">fare types</ins>: http://www.mta.info/developers/resources/nyct/fares/fare_type_description.txt</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=== Installation ===</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=== Installation ===</div></td></tr>
</table>Julianahttps://www.vistrails.org//index.php?title=Lab_notes_02/06/14&diff=6990&oldid=prevJuliana: /* The Problem: Analyzing MTA Fare Data */2014-02-25T19:15:42Z<p><span dir="auto"><span class="autocomment">The Problem: Analyzing MTA Fare Data</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 19:15, 25 February 2014</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l12">Line 12:</td>
<td colspan="2" class="diff-lineno">Line 12:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The Wall Street Journal published a story in 2011 that examined MetroCard usage as the cost of fares changed. The original work was created by Albert Sun and Andrew Grossman and published at http://graphicsweb.wsj.com/documents/MTAFARES1108/ on October 17, 2011. To do this, they used the publicly available fare data from the Metropolitan Transportation Authority (MTA). Their results were an interesting snapshot of usage patterns in the six months before and after the fare change. Because this data is made available on a weekly basis, it is possible to analyze more recent data as it becomes available. In addition, we can restrict views to specific lines or compare different ranges of time. As we will see, by using VisTrails, it becomes much easier to do these types of explorations.</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The Wall Street Journal published a story in 2011 that examined MetroCard usage as the cost of fares changed. The original work was created by Albert Sun and Andrew Grossman and published at http://graphicsweb.wsj.com/documents/MTAFARES1108/ on October 17, 2011. To do this, they used the publicly available fare data from the Metropolitan Transportation Authority (MTA). Their results were an interesting snapshot of usage patterns in the six months before and after the fare change. Because this data is made available on a weekly basis, it is possible to analyze more recent data as it becomes available. In addition, we can restrict views to specific lines or compare different ranges of time. 
As we will see, by using VisTrails, it becomes much easier to do these types of explorations.</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>The MTA data can be downloaded from http://www.mta.info/developers/fare.html. <<del style="font-weight: bold; text-decoration: none;">/</del>br></div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>The MTA data can be downloaded from http://www.mta.info/developers/fare.html. <br></div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Here is a description of the different fields in the data set: http://www.mta.info/developers/resources/nyct/fares/fare_type_description.txt</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Here is a description of the different fields in the data set: http://www.mta.info/developers/resources/nyct/fares/fare_type_description.txt</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
</table>Julianahttps://www.vistrails.org//index.php?title=Lab_notes_02/06/14&diff=6989&oldid=prevJuliana: /* The Problem: Analyzing MTA Fare Data */2014-02-25T19:15:32Z<p><span dir="auto"><span class="autocomment">The Problem: Analyzing MTA Fare Data</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 19:15, 25 February 2014</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l11">Line 11:</td>
<td colspan="2" class="diff-lineno">Line 11:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The Wall Street Journal published a story in 2011 that examined MetroCard usage as the cost of fares changed. The original work was created by Albert Sun and Andrew Grossman and published at http://graphicsweb.wsj.com/documents/MTAFARES1108/ on October 17, 2011. To do this, they used the publicly available fare data from the Metropolitan Transportation Authority (MTA). Their results were an interesting snapshot of usage patterns in the six months before and after the fare change. Because this data is made available on a weekly basis, it is possible to analyze more recent data as it becomes available. In addition, we can restrict views to specific lines or compare different ranges of time. As we will see, by using VisTrails, it becomes much easier to do these types of explorations.</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The Wall Street Journal published a story in 2011 that examined MetroCard usage as the cost of fares changed. The original work was created by Albert Sun and Andrew Grossman and published at http://graphicsweb.wsj.com/documents/MTAFARES1108/ on October 17, 2011. To do this, they used the publicly available fare data from the Metropolitan Transportation Authority (MTA). Their results were an interesting snapshot of usage patterns in the six months before and after the fare change. Because this data is made available on a weekly basis, it is possible to analyze more recent data as it becomes available. In addition, we can restrict views to specific lines or compare different ranges of time. 
As we will see, by using VisTrails, it becomes much easier to do these types of explorations.</div></td></tr>
<tr><td colspan="2"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">The MTA data can be downloaded from http://www.mta.info/developers/fare.html. </br></ins></div></td></tr>
<tr><td colspan="2"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">Here is a description of the different fields in the data set: http://www.mta.info/developers/resources/nyct/fares/fare_type_description.txt</ins></div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=== Installation ===</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=== Installation ===</div></td></tr>
</table>Julianahttps://www.vistrails.org//index.php?title=Lab_notes_02/06/14&diff=6797&oldid=prevJuliana at 17:22, 10 February 20142014-02-10T17:22:14Z<p></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 17:22, 10 February 2014</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l34">Line 34:</td>
<td colspan="2" class="diff-lineno">Line 34:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>To install the packages, copy <del style="font-weight: bold; text-decoration: none;">them </del>to ~/.vistrails/userpackage and unzip. On MacOS, the .vistrails directory is located in ~<your_username> -- use a Terminal window to access this directory. On Windows, the .vistrails folder is in the users %HOMEPATH% directory. Now, you can enable these packages in VisTrails:</div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>To install the packages, <ins style="font-weight: bold; text-decoration: none;">use the terminal window and </ins>copy <ins style="font-weight: bold; text-decoration: none;">the zip files </ins>to ~/.vistrails/userpackage and unzip. On MacOS, the .vistrails directory is located in ~<your_username> -- use a Terminal window to access this directory. On Windows, the .vistrails folder is in the users %HOMEPATH% directory. Now, you can enable these packages in VisTrails:</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>1. Start VisTrails</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>1. Start VisTrails</div></td></tr>
</table>Julianahttps://www.vistrails.org//index.php?title=Lab_notes_02/06/14&diff=6788&oldid=prevJuliana: /* Provenance and Reproducibility */2014-02-09T05:57:52Z<p><span dir="auto"><span class="autocomment">Provenance and Reproducibility</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 05:57, 9 February 2014</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l5">Line 5:</td>
<td colspan="2" class="diff-lineno">Line 5:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Today, we will use [http://www.vistrails.org VisTrails], an open source data analysis and visualization system that systematically captures provenance as a user explores data using computational processes. We will discuss the benefits of provenance, in particular, the ability to reproduce results and re-use knowledge.</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Today, we will use [http://www.vistrails.org VisTrails], an open source data analysis and visualization system that systematically captures provenance as a user explores data using computational processes. We will discuss the benefits of provenance, in particular, the ability to reproduce results and re-use knowledge.</div></td></tr>
<tr><td colspan="2"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">For more information about VisTrails, see the [http://www.vistrails.org/usersguide/v2.1/html/ Users' Manual], you can start with an overview of the system at http://www.vistrails.org/usersguide/v2.1/html/getting_started.html and for a step-by-step example on how to create a pipeline, see http://www.vistrails.org/usersguide/v2.1/html/creating.html.</ins></div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=== The Problem: Analyzing MTA Fare Data ===</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>=== The Problem: Analyzing MTA Fare Data ===</div></td></tr>
<tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l32">Line 32:</td>
<td colspan="2" class="diff-lineno">Line 34:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>To install the packages, copy them to ~/.vistrails/userpackage and unzip. On MacOS, the .vistrails directory is located in ~<your_username>. On Windows, the .vistrails folder is in the users %HOMEPATH% directory. Now, you can enable these packages in VisTrails:</div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>To install the packages, copy them to ~/.vistrails/userpackage and unzip. On MacOS, the .vistrails directory is located in ~<your_username> <ins style="font-weight: bold; text-decoration: none;">-- use a Terminal window to access this directory</ins>. On Windows, the .vistrails folder is in the users %HOMEPATH% directory. Now, you can enable these packages in VisTrails:</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br/></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>1. Start VisTrails</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>1. Start VisTrails</div></td></tr>
</table>Julianahttps://www.vistrails.org//index.php?title=Lab_notes_02/06/14&diff=6787&oldid=prevJuliana: Created page with '== Provenance and Reproducibility == Data exploration is inherently a trial-and-error process -- as well formulate and test hypothesis, we often need to follow many different li…'2014-02-09T05:52:49Z<p>Created page with '== Provenance and Reproducibility == Data exploration is inherently a trial-and-error process -- as well formulate and test hypothesis, we often need to follow many different li…'</p>
<p><b>New page</b></p><div>== Provenance and Reproducibility ==<br />
<br />
Data exploration is inherently a trial-and-error process -- as we formulate and test hypotheses, we often need to follow many different lines of reasoning, use different tools, and explore multiple combinations of parameter values. It is not uncommon to arrive at an interesting result and not remember the exact path that took you there.<br />
Therefore, it is important to maintain a detailed ''provenance'' record of the steps followed and of the data and parameter values used. This is particularly important for Big Data, where complex processes and data are used.<br />
<br />
Today, we will use [http://www.vistrails.org VisTrails], an open source data analysis and visualization system that systematically captures provenance as a user explores data using computational processes. We will discuss the benefits of provenance, in particular, the ability to reproduce results and re-use knowledge.<br />
<br />
=== The Problem: Analyzing MTA Fare Data ===<br />
<br />
The Wall Street Journal published a story in 2011 that examined MetroCard usage as the cost of fares changed. The original work was created by Albert Sun and Andrew Grossman and published at http://graphicsweb.wsj.com/documents/MTAFARES1108/ on October 17, 2011. For their analysis, they used the publicly available fare data from the Metropolitan Transportation Authority (MTA). Their results were an interesting snapshot of usage patterns in the six months before and after the fare change. Because this data is published weekly, it is possible to analyze more recent data as it becomes available. In addition, we can restrict views to specific lines or compare different ranges of time. As we will see, VisTrails makes these types of explorations much easier.<br />
<br />
=== Installation ===<br />
<br />
You will need to install VisTrails 2.1.1 to run this example. <br />
<br />
You can download the system from http://www.vistrails.org/index.php/Downloads.<br />
Select the link that matches your operating system.<br />
<br />
You will need 3 packages to run the example:<br />
* http://vgc.poly.edu/~dakoop/mta_example/HTTP_new.zip<br />
* http://vgc.poly.edu/~dakoop/mta_example/tabledata_new.zip<br />
* http://vgc.poly.edu/~dakoop/mta_example/gmaps.zip<br />
<br />
<br />
Before you install these packages:<br />
<br />
1. Start VisTrails<br />
<br />
2. Go to Preferences -> Module Packages and disable the HTTP and tabledata packages<br />
<br />
3. Quit the system<br />
<br />
<br />
To install the packages, copy the zip files to ~/.vistrails/userpackage and unzip them there. On MacOS, the .vistrails directory is located in ~<your_username>. On Windows, the .vistrails folder is in the user's %HOMEPATH% directory. Now, you can enable these packages in VisTrails:<br />
<br />
1. Start VisTrails<br />
<br />
2. Go to Preferences -> Module Packages and enable the HTTP_new, tabledata_new, and gmaps packages (in this order!)<br />
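<br />
The copy-and-unzip step that comes before enabling the packages can also be scripted. The following is a minimal sketch, not part of VisTrails: the helper name install_packages is hypothetical, and it assumes the three zip files have already been downloaded and that VisTrails is not running while they are extracted.<br />

```python
import zipfile
from pathlib import Path


def install_packages(zip_paths, userpackage_dir):
    """Extract each downloaded package archive into the package directory.

    install_packages is a hypothetical helper for illustration only;
    userpackage_dir is whatever directory the instructions above name.
    """
    dest = Path(userpackage_dir)
    # Create the target directory if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    installed = []
    for zip_path in zip_paths:
        # Unzip the archive in place, as the manual step does.
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(dest)
        # Record the archive name without the .zip extension.
        installed.append(Path(zip_path).stem)
    return installed
```

For example, with the archives in the current directory, one might call install_packages(["HTTP_new.zip", "tabledata_new.zip", "gmaps.zip"], Path.home() / ".vistrails" / "userpackage"). The directory name is taken from the text above, so check it against your own installation. The packages still have to be enabled from Preferences -> Module Packages afterwards.<br />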
<br />
Here's the example we will use: http://vgc.poly.edu/~dakoop/mta_example/mta.vt<br />
After you download the file, load it into VisTrails.<br />
<br />
''Now, you are ready to go!''<br />
<br />
=== Acknowledgments ===<br />
This example was provided by [http://vgc.poly.edu/~dakoop/ Dr. David Koop].</div>Juliana