Big Data Lab notes 02/19/14

From VistrailsWiki
Jump to navigation Jump to search

Today, we will work on SQL queries.

Before the class, please download mySQL from from http://dev.mysql.com/downloads. There are many different versions, select MySQL Community Server.

Installing and starting mySQL

mySQL has an extensive documentation, including installation instructions for different OSes, for example, MacOS: https://dev.mysql.com/doc/refman/5.7/en/macosx-installation.html and for Windows: http://www.mysql.com/why-mysql/windows/

I run mySQL on MacOS, so here's what I did after I installed the system to 'start' the database:

    shell> cd /usr/local/mysql
    shell> sudo ./bin/mysqld_safe
    (ENTER YOUR PASSWORD, IF NECESSARY)
    (PRESS CONTROL-Z)
    shell> bg
    (PRESS CONTROL-D OR ENTER "EXIT" TO EXIT THE SHELL)

You can also use the shortcut command: mysqlup

To connect to the MySQL server, on a terminal, invoke `/usr/local/mysql/bin/mysql.

You can add aliases to your shell's resource file to make it easier to access commonly used programs such as `mysql' and `mysqladmin' from the command line. I have added the following 2 lines to my .bashrc:

    alias mysql=/usr/local/mysql/bin/mysql
    alias mysqladmin=/usr/local/mysql/bin/mysqladmin

You can also modified my PATH environment variable to include /usr/local/mysql/bin:

export PATH=$PATH:./:/usr/local/mysql/bin:/usr/local/mysql/bin:/usr/local/hadoop-1.2.1/bin:/usr/local/pig-0.12.0/bin

Our running example

We will work with three tables. Here are the table definitions:

   create table Sailors(
       sid int PRIMARY KEY,
       sname varchar(30),
       rating int,
       age int);
   create table Reserves(
       sid int,
       bid int,
       day date,
       PRIMARY KEY (sid, bid, day));
   create table Boats(
       bid int PRIMARY KEY,
       bname char(20),
       color char(10));

You can download the sample file with these statements as well as some sample data from http://vgc.poly.edu/~juliana/courses/BigData2014/Lectures/Lab-SQL/sailors-mysql.sql

SQL Cheat Sheet

SELECT AcctNo, Amount FROM ATMWithdrawal WHERE Amount < 50;

A SQL query is evaluated as follows:

  • First the FROM clause tells us the input tables
  • Second, the WHERE clause is evaluated for all possible combinations from the input tables
  • Third, the SELECT clause tells us which attributes to keep in the query answer