Course: Big Data Analysis

This schedule is tentative and subject to change

Make sure to check my.poly.edu for course announcements

News

Project description

Week 1: Monday Sept. 10th - Course Overview

Required Reading

Additional References

Week 2: Monday Sept. 17th - Map-Reduce

Required Reading

Additional References

Week 3: Monday Sept. 24th - Databases and Big Data

Related Topics

Required Reading

Additional Readings

Week 4: Monday Oct. 1st - Statistics is easy - Invited Speaker: Dennis Shasha

Required Reading

Homework Assignment

Due October 9th: BigDataHW1

Week 5: Monday Oct. 8th - Finding Similar Items

Required Reading

Homework Assignment

Due October 15th at noon. Your assignment is at http://www.newgradiance.com/services. See http://vgc.poly.edu/~juliana/courses/cs9223 for instructions on how to access this service.

Week 6: Wednesday Oct. 17th - Invited Speaker: Torsten Suel

Note this class will be held on Wednesday!

Readings

Week 7: Monday Oct. 22nd - Invited lecture by and Lauro Lins

Readings

The Value of Visualization. IEEE Visualization 2005. Jarke J. van Wijk. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.1138

Visualization Analysis and Design: Principles, Methods, and Practice. Tamara Munzner (Book Draft 2 from Sep. 2012). http://www.cs.ubc.ca/~tmm/courses/533-11/book/vispmp-draft.pdf

Week 8: Monday Oct. 29th - Class canceled due to storm

Week 9: Monday Nov. 5th - Data infrastructure and information integration

Readings

  • HBase: The Definitive Guide. Random Access to Your Planet-Size Data. http://shop.oreilly.com/product/0636920014348.do
  • HBase book, Chapter 8 (Architecture), for information about transactional processing, notably the write-ahead log, and how consistency is maintained.

Week 10: Monday Nov. 12th - Frequent Itemsets

Readings

  • Mining of Massive Datasets, Chapter 4

Additional Reading

Week 11: Monday Nov. 19th - Algorithms on MapReduce: text processing

Readings

  • Data-Intensive Text Processing with MapReduce, Chapter 4

Week 12: Monday Nov. 26th - Graph Algorithms and Phase-I project presentations

Readings

  • Data-Intensive Text Processing with MapReduce, Chapter 5
  • Pregel: A System for Large-Scale Graph Processing. Google.

Week 13: Monday Dec. 3rd - Clustering


Readings

Week 14: Monday Dec. 10th - EM algorithms for text processing

Readings

  • Data-Intensive Text Processing with MapReduce, Chapter 6


Week 15: Monday Dec. 17th - Phase-II project presentations

Further Readings

Other topics

Provenance

Juliana Freire and Claudio Silva. In Computing in Science and Engineering 14(4): 18-25, 2012.

Juliana Freire, David Koop, Emanuele Santos, and Claudio T. Silva. In IEEE Computing in Science & Engineering, 2008.