Difference between revisions of "Course: Big Data 2014"

Line 70: Line 70:
* Homework assignment: [[Assignment 2 - Data Exploration using SQL]]


= Big Data Foundations and Infrastructure (4 weeks) =
= Big Data Foundations and Infrastructure (2 weeks) =


== Week 5 -- Mar 3: Cloud computing, Map Reduce and  Hadoop ==
Line 94: Line 94:
** Mining of Massive Datasets (2nd Edition), Chapter 2.


== Week X1 -- Y1: Data Management for Big Data, No-SQL and NewSQL Systems ==
== Week X2 -- Y2: Query Processing on Mapreduce and High-level Languages ==


= Machine Learning and Big Data  (3 weeks) =
Line 123: Line 120:
** http://cilvr.cs.nyu.edu/diglib/lsml/lecture11-ads-bottou.pdf


= Big Data Algorithms and Techniques (6 weeks) =
= Big Data Foundations and Infrastructure -- cont. (2 weeks) =
 
== Week 10 -- April 14: Data Management for Big Data, No-SQL and NewSQL Systems ==


== Week X3 -- Y3: Map Reduce Algorithm Design ==
== Week 11 -- April 21: Query Processing on Mapreduce and High-level Languages ==


== Week 10 -- Apr 14: Finding similar items and information integration ==


== Week 11 -- Apr 21: Graph Analysis ==
= Big Data Algorithms and Techniques (3 weeks) =


== Week 12 -- Apr 28: Frequent Itemset Mining ==
== Week 12 -- Apr 28: Finding similar items and information integration ==


== Week 13 -- May 5: Interactive Analysis and Visualization of Big Data ==
== Week 13 -- May 5: Graph Analysis ==


== Week 14 -- May 12: Machine Learning for Big Data ==
== Week 14 -- May 12: Frequent Itemset Mining ==




== Week 15 -- May 19: Final Exam ==

Revision as of 01:03, 14 April 2014

DS-GA 1004/CSCI-GA 2568 Big Data: Tentative Schedule -- subject to change

  • Lecture: Mondays, 7:10pm-9:00pm at Cantor, room 101. Note new location!
    • Cantor Film Center (CANTR), 36 E 8th St, New York, NY 10003
  • Lab: Thursdays, 7:10pm-8:00pm at CIWW, room 109. Always bring your laptop.
    • Warren Weaver Hall (CIWW), 251 Mercer St, New York, NY 10012

News

  • Starting on Feb 10th, our class will meet at a new location: Cantor 101
  • We will have lab on Thu at CIWW, room 109. Bring your laptop!

Background (4 weeks)

Week 1 -- Jan 27: Course Overview; the evolution of Data Management


Week 2 -- Feb 3: Introduction to Databases

Week 3 -- Feb 10: Overview: Relational Model and SQL

  • Feb 13: Lab: Canceled -- University closed due to snow


Week 3.1 -- Feb 17: Holiday

Week 4 -- Feb 24: Overview: Advanced SQL and Query Optimization

Big Data Foundations and Infrastructure (2 weeks)

Week 5 -- Mar 3: Cloud Computing, MapReduce and Hadoop

  • Required reading:
    • Data-Intensive Text Processing with MapReduce, Chapters 1 and 2
    • Mining of Massive Datasets (2nd Edition), Chapter 2, Sections 2.1 and 2.2 (Large-Scale File Systems and Map-Reduce).
  • Homework assignment: Your first quiz is available on Gradiance. It is due on March 17th at 5pm.

Week 6 -- Mar 10: Algorithm Design for MapReduce

  • Required reading:
    • Data-Intensive Text Processing with MapReduce, Chapters 1 and 2
    • Mining of Massive Datasets (2nd Edition), Chapter 2.


Machine Learning and Big Data (3 weeks)

Week 7 -- Mar 24: Hashing and AllReduce

  • Invited lecture by John Langford

Week 8 -- Mar 31: Bandits

  • Invited lecture by John Langford

Week 9 -- Apr 7: Large Scale Machine Learning in the Real World

  • Invited lecture by Leon Bottou

Big Data Foundations and Infrastructure -- cont. (2 weeks)

Week 10 -- Apr 14: Data Management for Big Data, NoSQL and NewSQL Systems

Week 11 -- Apr 21: Query Processing on MapReduce and High-level Languages

Big Data Algorithms and Techniques (3 weeks)

Week 12 -- Apr 28: Finding similar items and information integration

Week 13 -- May 5: Graph Analysis

Week 14 -- May 12: Frequent Itemset Mining

Week 15 -- May 19: Final Exam