Course: Big Data 2014


DS-GA 1004/CSCI-GA 2568 Big Data: Tentative Schedule -- subject to change

  • Lecture: Mondays, 7:10pm-9:00pm at Cantor, room 101. Note new location!
    • Cantor Film Center (CANTR), 36 E 8th St, New York, NY 10003
  • Lab: Thursdays, 7:10pm-8:00pm at CIWW, room 109. Always bring your laptop.
    • Warren Weaver Hall (CIWW), 251 Mercer St, New York, NY 10012

News

  • The final exam will take place on May 12th.
  • We will have our last class on May 19th.
  • 4/21/2014: There are two new quizzes on Gradiance. They are due on 2014-04-28 at 23:59 PST.
  • Starting on Feb 10th, our class will meet at a new location: Cantor 101
  • We will have lab on Thu at CIWW, room 109. Bring your laptop!

Background (4 weeks)

Week 1 -- Jan 27: Course Overview; the evolution of Data Management


Week 2 -- Feb 3: Introduction to Databases

Week 3 -- Feb 10: Overview: Relational Model and SQL

  • Feb 13: Lab: Canceled -- University closed due to snow


Week 3.1 -- Feb 17: Holiday

Week 4 -- Feb 24: Overview: Advanced SQL and Query Optimization

Big Data Foundations and Infrastructure (2 weeks)

Week 5 -- Mar 3: Cloud Computing, MapReduce and Hadoop

  • Required reading:
    • Data-Intensive Text Processing with MapReduce, Chapters 1 and 2
    • Mining of Massive Datasets (2nd Edition), Chapter 2, Sections 2.1 and 2.2 (Large-Scale File Systems and Map-Reduce).
  • Homework Assignment -- Your first quiz is available on Gradiance. It is due on March 17th at 5pm.

Week 6 -- Mar 10: Algorithm Design for MapReduce

  • Required reading:
    • Data-Intensive Text Processing with MapReduce, Chapters 1 and 2
    • Mining of Massive Datasets (2nd Edition), Chapter 2.


Machine Learning and Big Data (3 weeks)

Week 7 -- Mar 24: Hashing and AllReduce

  • Invited lecture by John Langford

Week 8 -- Mar 31: Bandits

  • Invited lecture by John Langford

Week 9 -- Apr 7: Large Scale Machine Learning in the Real World

  • Invited lecture by Leon Bottou

Big Data Foundations and Infrastructure -- cont. (2 weeks)

Week 10 -- Apr 14: Parallel Databases vs. MapReduce, Query Processing on MapReduce, and High-level Languages


Big Data Algorithms and Techniques (3 weeks)

Week 11 -- Apr 21: Data Management for Big Data (cont.) and Association Rules

  • Homework Assignment -- Your quiz is available on Gradiance. It is due on April 28th.

Week 12 -- Apr 28: Finding similar items: Invited lecture by Dr. Harish Doraiswami

Week 13 -- May 5: Graph Analysis and Exam Review

Week 14 -- May 12: Final Exam

Week 15 -- May 19: Large-Scale Visualization -- Invited lecture by Dr. Lauro Lins (AT&T Research)

  • Reading:
    • The Value of Visualization, Jarke van Wijk: http://www.win.tue.nl/~vanwijk/vov.pdf
    • Tamara Munzner's book (draft 2), available online: http://www.cs.ubc.ca/~tmm/courses/533/book/
    • Nanocubes paper: http://nanocubes.net and http://nanocubes.net/assets/pdf/nanocubes_paper_preprint.pdf