Robert W. Weeks bob.weeks@cox.net, http://members.cox.net/bob.weeks
SQL is a fourth generation language designed for working with
relational databases. SQL usually contains several dozen
commands. The select command is the most important for most
users. Its purpose is to retrieve data.
The basic form of the SQL select command:
Select columnlist from tablelist where conditions group by grouplist order by orderlist
Columnlist: The list of columns to appear in the output. To retrieve more than one column, separate the column names with commas. * (the asterisk) used as a column name means retrieve all columns. Columns may include calculated values (eg: datein - dateout, salary * 1.1) and columns calculated with aggregating functions (eg: avg(salary), sum(fee)). The order in which columns are listed affects only the order in which they are listed in the output.
Tablelist: The list of tables that participate in the query. Separate multiple table names with commas. Note that if a column from a table appears in the condition, it needs to appear in the Tablelist even though it may not appear in the Columnlist. This is common with join conditions. Exception: Subqueries are a separate query. The order in which tables are listed doesn't matter.
Conditions: Expressions that serve to limit the number of rows returned from a table, or to join rows from two or more tables. Examples:
| where name = "John Doe" | Only rows where the name column contains "John Doe" will be included in the output. Character data is enclosed in quotation marks or apostrophes. |
| where pprice > 15.95 | Only rows where the pprice column contains a value greater than 15.95 will be included in the output. Numeric data isn't surrounded with quotation marks. |
| where pprice >= 15.95 | Only rows where the pprice column contains a value greater than or equal to 15.95 will be included in the output. |
| where pdate > 03/01/95 | Only rows where the pdate column is greater than (after) March 1, 1995 will be included. |
| Where pdate >= 3/1/95 and pdate < 4/1/95 | Only rows where the pdate column is greater than or equal to March 1, 1995 and at the same time the pdate column is less than April 1, 1995 will be included. |
| Where color = "Red" or color = "Blue" | Only rows where the color column contains Red or the color column contains Blue will be included. This produces output that contains both red and blue cars, for example. |
| where catalog.id = tape.id | An example of a join condition. (The notation catalog.id means the id column from the catalog table.) In this case, SQL will match rows from the catalog table with rows from the tape table that make this statement true. When a match is found, SQL creates a row in the output for each match. For example, the catalog table may have one row where the id column contains 1. The tape table may have three rows where the id column contains 1. SQL will produce three rows in the output, as there are three cases that match this condition. |
| where member.member in (select member from rental) | A subquery. In this example, SQL first performs the subquery (in parenthesis). This subquery produces a temporary list (or table) that contains all the values found in the member column of the rental table. Then, SQL uses the in operator to compare member.member with the temporary list. If member.member is found in the list, the expression is true. SQL can also use not in. |
Conditions may be combined together with or, and, and not. Example: where color = "Red" and pdate >= 03/01/95 and pdate < 04/01/95.
Grouplist: If specified, SQL groups the rows in the output by columns in this list. Grouping is most often used with aggregating functions to produce summaries of data. SQL produces one row for each unique value found in the grouping column, and the function is calculated for all rows that belong to the group.
Orderlist: If specified, SQL sorts the output rows by the columns in this list. Example: order by state, city
Aggregating (summary) functions: Sum, count, min, max, and avg are the most common. These function are applied to a column as in select sum(salary), which sums the values in the salary column. Aggregating functions may be used alone, as in this example:
Select sum(salary) from employee
Select sum(salary) from employee where department = "Engineering"
Select count(*) from employee
The first example sums the salary column from the employee table, for all rows in the table. The second example sums the salary column from the employee table, but only for those rows where the department column contains the value Engineering. Presumably, this will return a smaller value than the first example. Count is used with * as the argument, which means to count the rows.
Aggregating functions are often used with the group by part of the select command. This example:
select department, sum(salary) from employee group by department
produces an output table with two columns: a department name, and the sum of the salary column for all the rows in that department:
Department Sum of Salary -------------- --------------- Engineering 350,000 Marketing 125,000 Production 568,349
MEMBERID NAME ADDRESS DATEJOINED
1 John Doe 123 N. Main 02/05/94
2 Jane Wilson 100 N. Broadway 03/10/94
3 Bob Smith 100 E. Elm 03/15/94
4 Sue Williams 100 W. Oak 04/10/94
TAPEID CATALOGID SUPPLIERID PURCHASEDA PURCHASEPR
1 1 1 01/05/94 15.57
2 1 1 01/05/94 15.57
3 1 2 01/15/94 14.95
4 2 2 01/05/94 15.95
5 2 2 01/05/94 15.95
6 2 1 02/01/94 15.95
7 3 2 02/01/94 15.95
8 4 2 02/01/94 15.95
9 4 1 02/01/94 15.95
10 5 1 02/01/94 24.58
11 6 2 02/01/94 4.58
12 7 3 02/05/94 15.45
13 8 2 02/05/94 21.57
14 9 1 02/05/94 14.95
15 10 1 02/05/94 15.95
SUPPLIERID NAME ADDRESS
1 ABC Tape Co. 100 N. Main
2 Acme Tape Co. 100 N. Broadway
3 Tapes 'R' Us 100 E. Pine
CHARGECODE CHARGE A 3.00 B 2.00 C 1.15
CATALOGID TITLE RATING CHARGECODE
1 Forrest Gump PG-13 A
2 Batman Forever PG-13 A
3 The Lion King G A
4 Apollo 13 PG-13 A
5 Wyatt Earp R B
6 Tombstone R B
7 City Slickers PG C
8 101 Dalmations G C
9 Dumb and Dumber PG-13 C
10 Casablanca G C
MEMBERID TAPEID DATERENTED DATEDUE DATERETURN RENTALFEE
1 3 04/01/94 04/05/94 04/03/94 3.00
1 8 04/10/94 04/14/94 04/12/94 3.00
1 10 04/10/94 04/14/94 04/12/94 2.00
2 3 04/25/94 04/29/94 04/28/94 3.00
1 7 05/01/94 05/05/94 05/02/94 3.00
2 7 05/03/94 05/07/94 05/10/94 3.00
1 4 05/15/94 05/19/94 05/18/94 3.00
1 7 05/15/94 05/19/94 05/18/94 3.00
1 9 05/15/94 05/19/94 05/18/94 3.00
1 2 06/01/94 06/05/94 06/04/94 3.00
2 3 06/01/94 06/05/94 3.00
3 13 06/01/94 06/05/94 06/06/94 1.00
1 6 06/01/94 06/05/94 3.00
List the members' names.
select name from member NAME John Doe Jane Wilson Bob Smith Sue Williams
This is about the simplest SQL query possible, retrieving all rows for a single column from a table. This query retrieves all rows from the member table.
List all the members' names and the date they joined.
select name, datejoined from member NAME DATEJOINED John Doe 02/05/94 Jane Wilson 03/10/94 Bob Smith 03/15/94 Sue Williams 04/10/94
Separate more than one column name with commas.
List all the members.
Select *
from member
MEMBERID NAME ADDRESS DATEJOINED
1 John Doe 123 N. Main 02/05/94
2 Jane Wilson 100 N. Broadway 03/10/94
3 Bob Smith 100 E. Elm 03/15/94
4 Sue Williams 100 W. Oak 04/10/94
We used * to as a shortcut for all columns from the member table.
List all the members, sorted by name.
select * from member order by name MEMBER NAME ADDRESS DATEJOINED 3 Bob Smith 100 E. Elm 03/15/94 2 Jane Wilson 100 N. Broadway 03/10/94 1 John Doe 123 N. Main 02/05/94 4 Sue Williams 100 W. Oak 04/10/94
Without using order by, SQL presents the rows in no particular order. The order of rows means nothing to a relational database. Humans, of course, may find it convenient to have rows shown in some order.
On what date did member number 3 join?
select datejoined from member where memberid = 3 DATEJOINED 03/15/94
The question asks for the date only, so that is the only column requested. Generally it's best to output only those columns needed to answer the question; extra columns may contribute to information overload. The where part of the statement serves to determine which rows in the member table appear in the output. In this case, only rows for which the MemberID column contains the value 3 will be in the output. Since MemberID is the primary key of the member table, it's good that we had just one row of output! (What would it mean if there were two or more rows of output?)
List the names of the members who joined the club in February of 1994.
select name from member where datejoined >= 02/01/94 and datejoined < 03/01/94 NAME John Doe
Note the use of two date conditions joined by and to select dates within a certain date range.
Members often ask for G-rated movies for family viewing. List the titles of the tapes with a G rating.
select title from catalog where rating = "G" TITLE The Lion King 101 Dalmations Casablanca
Surround character data, such as a movie rating, with quotation marks or apostrophes (different implementations of SQL vary). Don't use them with numeric data. Notice that if we select rows based on the value in a column (the rating column in this example), it does not mean that we must include the column in the output. What is the meaning of the word tape in this context?
List the titles of the tapes that are rated either G or PG-13.
select title from catalog where rating = "G" or rating = "PG-13" TITLE Forrest Gump Batman Forever The Lion King Apollo 13 101 Dalmations Dumb and Dumber Casablanca
List the tapes which are currently rented out.
select * from rental
where datereturn is null
MEMBERID TAPEID DATERENTED DATEDUE DATERETURN RENTALFEE
2 3 06/01/94 06/05/94 3.00
1 6 06/01/94 06/05/94 3.00
Is null selects rows where the column datereturn is empty (has not had a value entered), meaning that the member hasn't yet returned the tape. What is the meaning of the word tape in this context?
How many tapes do we have?
select count(*) from tape COUNT 15
Count(*) counts the number of rows. What is the meaning of a row in the tape table?
How much have we spent on tapes?
select sum(purchasepr) from tape
SUM OF PURCHASEPR
238.87
This shows the use of a summary operator or aggregating function to calculate a statistic for a column of data. Be sure to distinguish summing from counting. Count counts the number of rows, while sum adds values in a column. Very often a row in a table corresponds to some real-world thing. For example, a row in the tape table corresponds to a video tape on the shelf of the store. If we want to know how many tapes we have, count the number of rows in the tape table.
What is the average price paid for a tape?
select avg(purchasepr)
from tape
AVERAGE OF PURCHASEPR
15.92
This shows the use of a summary operator or aggregating function to calculate a statistic for a column of data.
How many members are there?
select count(*) from member COUNT 4
What is the meaning of a row in the member table? What does it mean to count the rows?
How many tapes have been rented? What was the total rental revenue?
select count(*), sum(rentalfee) from rental COUNT SUM OF RENTALFEE 13 36.00
How many tapes were rented in the month of April 1994? What was the total rental revenue?
select count(*), sum(rentalfee) from rental where daterented >= 4/1/94 and daterented < 5/1/94 COUNT SUM OF RENTALFEE 4 11.00
Not a very busy month. But no matter how many tapes were rented, this SQL query will tell us how many and for how much.
List tapes. Show the tape ID, purchase date, and movie title. Sort the output by the purchase date.
select tape.tapeid, tape.purchaseda, catalog.title
from tape, catalog
where tape.catalogid = catalog.catalogid
order by purchaseda
TAPEID PURCHASEDA TITLE
1 01/05/94 Forrest Gump
2 01/05/94 Forrest Gump
4 01/05/94 Batman Forever
5 01/05/94 Batman Forever
3 01/15/94 Forrest Gump
(Not all results shown.) Because this query draws data from more than one table, we must tell SQL how to join rows in the tables. The join condition where tape.catalogid = catalog.catalogid does that. This tells SQL to join rows that contain common catalog ID values. Without the join condition, this query fails to produce the desired result.
List members' names along with names of movies they have rented. Sort by member name, then movie title, then date rented. Show date rented and date returned.
select member.name, catalog.title, rental.daterented, rental.datereturn from member, catalog, rental, tape where member.memberid = rental.memberid and rental.tapeid = tape.tapeid and tape.catalogid = catalog.catalogid order by member.name, catalog.title, daterented NAME TITLE DATERENTED DATERETURN Bob Smith 101 Dalmations 06/01/94 06/06/94 Jane Wilson Forrest Gump 04/25/94 04/28/94 Jane Wilson Forrest Gump 06/01/94 / / Jane Wilson The Lion King 05/03/94 05/10/94 John Doe Apollo 13 04/10/94 04/12/94 John Doe Apollo 13 05/15/94 05/18/94 John Doe Batman Forever 05/15/94 05/18/94 John Doe Batman Forever 06/01/94 / / John Doe Forrest Gump 04/01/94 04/03/94 John Doe Forrest Gump 06/01/94 06/04/94 John Doe The Lion King 05/01/94 05/02/94 John Doe The Lion King 05/15/94 05/18/94 John Doe Wyatt Earp 04/10/94 04/12/94
Note the use of the tablename.columnname notation,
which is necessary as some of the tables have columns with the
same name. Here our where condition provided the linkage
between the member and rental tables (common memberid
column), the linkage between the rental and tape tables (common tapeid
column), and the linkage between the tape and catalog tables
(common catalogid column). How do we know that these
tables need to be linked in this manner? One way is by examining
the entity-relationship diagram for this database.
How many tapes were rented by John Doe?
select count(*) from rental, member where rental.memberid = member.memberid and name = "John Doe" COUNT 9
Is it valid to refer to members by their names as does this query?
Our supplier Acme Tape Co. has called us to say that they have discovered mice in their warehouse, and that the tapes we purchased during January 1994 have been infected with the deadly hantavirus. We need to contact the members who have rented these tapes.
select member.name, member.address from member, rental, tape, supplier where member.memberid = rental.memberid and rental.tapeid = tape.tapeid and tape.supplierid = supplier.supplierid and supplier.name = "Acme Tape Co." and tape.purchaseda >= 1/1/94 and tape.purchaseda < 2/1/94 NAME ADDRESS John Doe 123 N. Main Jane Wilson 100 N. Broadway Jane Wilson 100 N. Broadway John Doe 123 N. Main
If we add the word distinct after the word select, SQL will eliminate the duplicates. It would be helpful if this database included members' telephone numbers.
List our tapes along with the number of copies we have of each tape.
select catalogid, count(*)
from tape
group by catalogid
CATALOGID COUNT
1 3
2 3
3 1
4 2
5 1
6 1
7 1
8 1
9 1
10 1
Group by tells SQL to create one row of output for each unique value found in the group by column (catalogid in this case). Then, calculate the summary operator for each row. The summary operator in this example counts the number of rows. Each row corresponds to a copy of a tape. But what does catalogid mean? What is the movie title for catalogid number 1?
List our tapes along with the number of copies we have of each tape. This time show the movie titles.
select catalog.title, count(*) from catalog, tape where catalog.catalogid = tape.catalogid group by catalog.title TITLE COUNT Forrest Gump 3 Batman Forever 3 The Lion King 1 Apollo 13 2 Wyatt Earp 1 Tombstone 1 City Slickers 1 101 Dalmations 1 Dumb and Dumber 1 Casablanca 1
In this case, we join the tape table with the catalog table so that we can display movie titles rather than catalog numbers
Count the number of times that each of our tapes has been rented.
select tape.tapeid, catalog.title, count(*)
from catalog, tape, rental
where catalog.catalogid = tape.catalogid
and tape.tapeid = rental.tapeid
group by rental.tapeid
TAPEID TITLE COUNT
2 Forrest Gump 1
3 Forrest Gump 3
4 Batman Forever 1
6 Batman Forever 1
7 The Lion King 3
8 Apollo 13 1
9 Apollo 13 1
10 Wyatt Earp 1
13 101 Dalmations 1
Count the number of times that each of our movie titles has been rented.
select catalog.catalogid, catalog.title, count(*)
from catalog, tape, rental
where catalog.catalogid = tape.catalogid
and tape.tapeid = rental.tapeid
group by catalog.catalogid
CATALOGID TITLE COUNT
1 Forrest Gump 4
2 Batman Forever 2
3 The Lion King 3
4 Apollo 13 2
5 Wyatt Earp 1
8 101 Dalmations 1
What is the average price paid for a tape from each supplier? How many tapes have we purchased from each supplier?
select supplier.name, count(*), avg(tape.purchasepr) from supplier, tape where supplier.supplierid = tape.supplierid group by supplier.name NAME COUNT AVERAGE OF PURCHASEPR ABC Tape Co. 7 16.93 Acme Tape Co. 7 14.99 Tapes 'R' Us 1 15.45
The group by tells SQL to create one row of output for each supplier name. The summary statistics are calculated for each row.
List the tapes which are currently rented for each member and the due date.
select member.name, catalog.title, rental.daterented, rental.daterented + 4 from rental, tape, catalog, member where empty(rental.datereturn) and rental.tapeid = tape.tapeid and tape.catalogid = catalog.catalogid and rental.memberid = member.memberid order by member.name NAME TITLE DATERENTED EXP_4 Jane Wilson Forrest Gump 06/01/94 06/05/94 John Doe Batman Forever 06/01/94 06/05/94
Are there any members who have not yet rented a tape?
select * from member
where memberid not in
(select distinct memberid from rental)
MEMBERID NAME ADDRESS DATEJOINED
4 Sue Williams 100 W. Oak 04/10/94
This query works in this way: The "inner" or nested select (select distinct memberid from rental) selects those members who have rented a tape. That's because the meaning of a row in the rental table is that a certain member rented a certain tape. Given this result, that is, the members who have rented a tape, all we have to do is select the members from the member table who don't appear in this list of members who have rented a tape. The clause where memberid not in selects those rows from the member table where the memberid isn't in the list of members who have rented tapes.
Produce a list of G-rated movies, but include only those movies that a customer (say customer number 2) has not yet rented.
select * from catalog
where rating = "G"
and catalogid not in
(Select catalogid from tape, rental
where rental.memberid = 2
and tape.tapeid = rental.tapeid)
CATALOGID TITLE RATING CHARGECODE
8 101 Dalmations G C
10 Casablanca G C
The nested select part of this command generates a list of the catalogid values of tapes that member number 2 has rented. The not in operator selects rows from the catalog table where the catalogid is not in this list.
Members who have rented Forrest Gump have also rented what other movies?.
Select title from catalog where catalogid in (select distinct catalog.catalogid from catalog, tape, rental where tape.catalogid = catalog.catalogid and tape.tapeid = rental.tapeid and rental.memberid in (select distinct member.memberid from member, tape, rental, catalog where catalog.title = "Forrest Gump" and tape.catalogid = catalog.catalogid and tape.tapeid = rental.tapeid and rental.memberid = member.memberid)) TITLE Forrest Gump Batman Forever The Lion King Apollo 13 Wyatt Earp