SQL: The Complete Reference

Overview

SQL: The Complete Reference provides an in-depth discussion of SQL fundamentals, modern SQL products, and SQLs role in trends such as data warehousing, "thinclient" architectures, and Internet-based e-commerce. This book is your one-stop resource for all you need to know about SQL.
Read More Show Less
... See more details below
Available through our Marketplace sellers.
Other sellers (Other Format)
  • All (34) from $1.99   
  • New (4) from $13.98   
  • Used (30) from $1.99   
Close
Sort by
Page 1 of 1
Showing All
Note: Marketplace items are not eligible for any BN.com coupons and promotions
$13.98
Seller since Sun Jan 01 01:01:01 EST 2006

Feedback rating:

(986)

Condition:

New — never opened or used in original packaging.

Like New — packaging may have been opened. A "Like New" item is suitable to give as a gift.

Very Good — may have minor signs of wear on packaging but item works perfectly and has no damage.

Good — item is in good condition but packaging may have signs of shelf wear/aging or torn packaging. All specific defects should be noted in the Comments section associated with each item.

Acceptable — item is in working order but may show signs of wear such as scratches or torn packaging. All specific defects should be noted in the Comments section associated with each item.

Used — An item that has been opened and may show signs of wear. All specific defects should be noted in the Comments section associated with each item.

Refurbished — A used item that has been renewed or updated and verified to be in proper working condition. Not necessarily completed by the original manufacturer.

New
1999-03-01 Paperback New New Item. Item delivered via UPS in 7-9 business days. Tracking available by request Ships from US. Please allow 1-3 weeks for delivery outside US.

Ships from: Appleton, WI

Usually ships in 1-2 business days

  • Canadian
  • International
  • Standard, 48 States
  • Standard (AK, HI)
  • Express, 48 States
  • Express (AK, HI)
$39.99
Seller since Tue Jan 01 01:01:01 EST 2008

Feedback rating:

(171)

Condition: New
0072118458 BRAND NEW NEVER USED IN STOCK 125,000+ HAPPY CUSTOMERS SHIP EVERY DAY WITH FREE TRACKING NUMBER

Ships from: fallbrook, CA

Usually ships in 1-2 business days

  • Standard, 48 States
  • Standard (AK, HI)
$60.00
Seller since Tue Oct 07 09:37:03 EDT 2014

Feedback rating:

(184)

Condition: New
Brand new.

Ships from: acton, MA

Usually ships in 1-2 business days

  • Standard, 48 States
  • Standard (AK, HI)
$60.00
Seller since Tue Oct 07 09:37:03 EDT 2014

Feedback rating:

(184)

Condition: New
Brand new.

Ships from: acton, MA

Usually ships in 1-2 business days

  • Standard, 48 States
  • Standard (AK, HI)
Page 1 of 1
Showing All
Close
Sort by
Sending request ...

Overview

SQL: The Complete Reference provides an in-depth discussion of SQL fundamentals, modern SQL products, and SQLs role in trends such as data warehousing, "thinclient" architectures, and Internet-based e-commerce. This book is your one-stop resource for all you need to know about SQL.
Read More Show Less

Editorial Reviews

From The Critics
A beginners' guide to performing SQL database queries, creating and updating a database, and programming with SQL. The CD-ROM contains evaluation versions of Microsoft SQL Server 2000, IBM DB2, and MySQL. The second edition adds two chapters on the role of SQL in application server architectures and the integration of SQL with XML. Annotation c. Book News, Inc., Portland, OR
Read More Show Less

Product Details

  • ISBN-13: 9780072118452
  • Publisher: McGraw-Hill Companies, The
  • Publication date: 3/1/1999
  • Series: Complete Reference Series
  • Edition description: Older Edition
  • Pages: 1008
  • Product dimensions: 7.39 (w) x 9.12 (h) x 2.37 (d)

Meet the Author


James R. Groff is currently CEO of Times Ten Performance Software, developer of an ultra-high-performance main-memory SQL database for communications and Internet applications.

Paul N. Weinberg is Vice President of A2i, Inc., developer of a database-driven, cross media catalog publishing system that supports printed and electronic output from a single data source.

Together, they were the cofounders of Netowork Innovations Corporation, an early developer of SQL-based networking software that linked personal computers to corporate databases.

Read More Show Less

Table of Contents

Acknowledgments
Introduction
1 Introduction 3
2 A Quick Tour of SQL 15
3 SQL in Perspective 25
4 Relational Databases 51
5 SQL Basics 71
6 Simple Queries 95
7 Multitable Queries (Joins) 139
8 Summary Queries 187
9 Subqueries and Query Expressions 217
10 Database Updates 269
11 Data Integrity 291
12 Transaction Processing 327
13 Creating a Database 365
14 Views 409
15 SQL Security 433
16 The System Catalog 459
17 Embedded SQL 491
18 Dynamic SQL 547
19 SQL APIs 599
20 Database Processing and Stored Procedures 701
21 SQL and Data Warehousing 759
22 SQL and Application Servers 775
23 SQL Networking and Distributed Databases 795
24 SQL and Objects 835
25 SQL and XML 875
26 The Future of SQL 913
A The Sample Database 933
B Database Vendor Profiles 941
C SQL Syntax Reference 961
D SQL Call-Level Interface 971
E SQL Information Schema Standard 991
F CD-ROM Installation Guide 1013
Index 1025
Read More Show Less

First Chapter


Chapter Six: Simple Queries

In many ways, queries are the heart of the SQL language. The SELECT statement, which is used to express SQL queries, is the most powerful and complex of the SQL statements. Despite the many options afforded by the SELECT statement, it"s possible to start simply and then work up to more complex queries. This chapter discusses the simplest SQL queries-those that retrieve data from a single table in the database.

The SELECT Statement

The SELECT statement retrieves data from a database and returns it to you in the form of query results. You have already seen many examples of the SELECT statement in the quick tour presented in Chapter 2. Here are several more sample queries that retrieve information about sales offices:

List the sales offices with their targets and actual sales.

SELECT CITY, TARGET, SALES
FROM OFFICES

CITY TARGET SALES
------------ ------------ ------------
Denver $300,000.00 $186,042.00
New York $575,000.00 $692,637.00
Chicago $800,000.00 $735,042.00
Atlanta $350,000.00 $367,911.00
Los Angeles $725,000.00 $835,915.00

List the Eastern region sales offices with their targets and sales.

SELECT CITY, TARGET, SALES
FROM OFFICES
WHERE REGION = "Eastern"

CITY TARGET SALES
------------ ------------ ------------
New York $575,000.00 $692,637.00
Chicago $800,000.00 $735,042.00
Atlanta $350,000.00 $367,911.00

List Eastern region sales offices whose sales exceed their targets, sorted in alphabetical order by city.

SELECT CITY, TARGET, SALES
FROM OFFICES
WHERE REGION = "Eastern"
ORDER BY CITY

CITY TARGET SALES
------------ ------------ ------------
Atlanta $350,000.00 $367,911.00
New York $575,000.00 $692,637.00

What are the average target and sales for Eastern region offices?

SELECT AVG(TARGET), AVG(SALES)
FROM OFFICES
WHERE REGION = "Eastern"

AVG(TARGET) AVG(SALES)
------------ ------------
$575,000.00 $598,530.00

For simple queries, the English language request and the SQL SELECT statement are very similar. When the requests become more complex, more features of the SELECT statement must be used to specify the query precisely.

Figure 6-1 shows the full form of the SELECT statement, which consists of six clauses. The SELECT and FROM clauses of the statement are required. The remaining four clauses are optional. You include them in a SELECT statement only when you want to use the functions they provide. The following list summarizes the function of each clause:

  • The SELECT clause lists the data items to be retrieved by the SELECT statement. The items may be columns from the database, or columns to be calculated by SQL as it performs the query. The SELECT clause is described in later sections of this chapter.
  • The FROM clause lists the tables that contain the data to be retrieved by the query. Queries that draw their data from a single table are described in this chapter. More complex queries that combine data from two or more tables are discussed in Chapter 7.
  • The WHERE clause tells SQL to include only certain rows of data in the query results. A search condition is used to specify the desired rows. The basic uses of the WHERE clause are described later in this chapter. Those that involve subqueries are discussed in Chapter 9.
  • The GROUP BY clause specifies a summary query. Instead of producing one row of query results for each row of data in the database, a summary query groups together similar rows and then produces one summary row of query results for each group. Summary queries are described in Chapter 8.
  • The HAVING clause tells SQL to include only certain groups produced by the GROUP BY clause in the query results. Like the WHERE clause, it uses a search condition to specify the desired groups. The HAVING clause is described in Chapter 8.
  • The ORDER BY clause sorts the query results based on the data in one or more columns. If it is omitted, the query results are not sorted. The ORDER BY clause is described later in this chapter.

The SELECT Clause

The SELECT clause that begins each SELECT statement specifies the data items to be retrieved by the query. The items are usually specified by a select list, a list of select items separated by commas. Each select item in the list generates a single column of query results, in left-to-right order. A select item can be:

  • a column name, identifying a column from the table(s) named in the FROM clause. When a column name appears as a select item, SQL simply takes the value of that column from each row of the database table and places it in the corresponding row of query results.
  • a constant, specifying that the same constant value is to appear in every row of the query results.
  • a SQL expression, indicating that SQL must calculate the value to be placed into the query results, in the style specified by the expression.

Each type of select item is described later in this chapter.

The FROM Clause

The FROM clause consists of the keyword FROM, followed by a list of table specifications separated by commas. Each table specification identifies a table containing data to be retrieved by the query. These tables are called the source tables of the query (and of the SELECT statement) because they are the source of all of the data in the query results. All of the queries in this chapter have a single source table, and every FROM clause contains a single table name.

Query Results

The result of a SQL query is always a table of data, just like the tables in the database. If you type a SELECT statement using interactive SQL, the DBMS displays the query results in tabular form on your computer screen. If a program sends a query to the DBMS using programmatic SQL, the table of query results is returned to the program. In either case, the query results always have the same tabular, row/column format as the actual tables in the database, as shown in Figure 6-2. Usually the query results will be a table with several columns and several rows. For example, this query produces a table of three columns (because it asks for three items of data) and ten rows (because there are ten salespeople):

List the names, offices, and hire dates of all salespeople.

SELECT NAME, REP_OFFICE, HIRE_DATE
FROM SALESREPS

NAME REP_OFFICE HIRE_DATE
-------------- ----------- ----------
Bill Adams 13 12-FEB-88
Mary Jones 11 12-OCT-89
Sue Smith 21 10-DEC-86
Sam Clark 11 14-JUN-88
Bob Smith 12 19-MAY-87
Dan Roberts 12 20-OCT-86
Tom Snyder NULL 13-JAN-90
Larry Fitch 21 12-OCT-89
Paul Cruz 12 01-MAR-87
Nancy Angelli 22 14-NOV-88

In contrast, the following query produces a single row because only one salesperson has the requested employee number. Even though this single row of query results looks less "tabular" than the multi-row results, SQL still considers it to be a table of three columns and one row.

What are the name, quota, and sales of employee number 107?

SELECT NAME, QUOTA, SALES
FROM SALESREPS
WHERE EMPL_NUM = 107

NAME QUOTA SALES
-------------- ------------ ------------
Nancy Angelli $300,000.00 $186,042.00

In some cases the query results can be a single value, as in the following example:

What are the average sales of our salespeople?

SELECT AVG(SALES)
FROM SALESREPS

AVG(SALES)
------------
$289,353.20

These query results are still a table, although it"s a very small one consisting of one column and one row.

Finally, it"s possible for a query to produce zero rows of query results, as in this example:

List the name and hire date of anyone with sales over $500,000.

SELECT NAME, HIRE_DATE
FROM SALESREPS
WHERE SALES > 500000.00

NAME HIRE_DATE
------------ ----------

Even in this situation, the query results are still a table. This one is an empty table with two columns and zero rows.

Note that SQL"s support for missing data extends to query results as well. If a data item in the database has a NULL value, the NULL value appears in the query results when the data item is retrieved. For example, the SALESREPS table contains NULL values in its QUOTA and MANAGER columns. The following query returns these NULL values in the second and third columns of query results:

List the salespeople, their quotas, and their managers.

SELECT NAME, QUOTA, MANAGER
FROM SALESREPS

NAME QUOTA MANAGER
-------------- ------------ --------
Bill Adams $350,000.00 104
Mary Jones $300,000.00 106
Sue Smith $350,000.00 108
Sam Clark $275,000.00 NULL
Bob Smith $200,000.00 106
Dan Roberts $300,000.00 104
Tom Snyder NULL 101
Larry Fitch $350,000.00 106
Paul Cruz $275,000.00 104
Nancy Angelli $300,000.00 108

The fact that a SQL query always produces a table of data is very important. It means that the query results can be stored back into the database as a table. It means that the results of two similar queries can be combined to form a larger table of query results. Finally, it means that the query results can themselves be the target of further queries. A relational database"s tabular structure thus has a very synergistic relationship with the relational query facilities of SQL. Tables can be queried, and queries produce tables.

Simple Queries

The simplest SQL queries request columns of data from a single table in the database. For example, this query requests three columns from the OFFICES table:

List the location, region, and sales of each sales office.

SELECT CITY, REGION, SALES
FROM OFFICES

CITY REGION SALES
------------ -------- ------------
Denver Western $186,042.00
New York Eastern $692,637.00
Chicago Eastern $735,042.00
Atlanta Eastern $367,911.00
Los Angeles Western $835,915.00

The SELECT statement for simple queries like this one includes only the two required clauses. The SELECT clause names the requested columns; the FROM clause names the table that contains them.

Conceptually, SQL processes the query by going through the table named in the FROM clause, one row at a time, as shown in Figure 6-3. For each row, SQL takes the values of the columns requested in the select list and produces a single row of query results. The query results thus contain one row of data for each row in the table.

Calculated Columns

In addition to columns whose values come directly from the database, a SQL query can include calculated columns whose values are calculated from the stored data values. To request a calculated column, you specify a SQL expression in the select list. As discussed in Chapter 5, SQL expressions can involve addition, subtraction, multiplication, and division. You can also use parentheses to build more complex expressions. Of course the columns referenced in an arithmetic expression must have a numeric type. If you try to add, subtract, multiply, or divide columns containing text data, SQL will report an error.

This query shows a simple calculated column:

List the city, region, and amount over/under target for each office.

SELECT CITY, REGION, (SALES - TARGET)
FROM OFFICES

CITY REGION (SALES-TARGET)
------------ -------- ---------------
Denver Western -$113,958.00
New York Eastern $117,637.00
Chicago Eastern -$64,958.00
Atlanta Eastern $17,911.00
Los Angeles Western $110,915.00

To process the query, SQL goes through the offices, generating one row of query results for each row of the OFFICES table, as shown in Figure 6-4. The first two columns of query results come directly from the OFFICES table. The third column of query results is calculated, row-by-row, using the data values from the current row of the OFFICES table.

Here are other examples of queries that use calculated columns:

Show the value of the inventory for each product.

SELECT MFR_ID, PRODUCT_ID, DESCRIPTION, (QTY_ON_HAND * PRICE)
FROM PRODUCTS

MFR_ID PRODUCT_ID DESCRIPTION (QTY_ON_HAND*PRICE)
------ ----------- --------------- --------------------
REI 2A45C Ratchet Link $16,590.00
ACI 4100Y Widget Remover $68,750.00
QSA XK47 Reducer $13,490.00
BIC 41672 Plate $0.00
IMM 779C 900-lb Brace $16,875.00
ACI 41003 Size 3 Widget $22,149.00
ACI 41004 Size 4 Widget $16,263.00
BIC 41003 Handle $1,956.00

Show me the result if I raised each salesperson"s quota by 3 percent of their year-to-date sales.

SELECT NAME, QUOTA, (QUOTA + (.03 * SALES))
FROM SALESREPS

NAME QUOTA (QUOTA+(.03*SALES))
-------------- ------------ --------------------
Bill Adams $350,000.00 $361,037.33
Mary Jones $300,000.00 $311,781.75
Sue Smith $350,000.00 $364,221.50
Sam Clark $275,000.00 $283,997.36
Bob Smith $200,000.00 $204,277.82
Dan Roberts $300,000.00 $309,170.19
Tom Snyder NULL NULL
Larry Fitch $350,000.00 $360,855.95
Paul Cruz $275,000.00 $283,603.25
Nancy Angelli $300,000.00 $305,581.26

As mentioned in Chapter 5, many SQL products provide additional arithmetic operations, character string operations, and built-in functions that can be used in SQL expressions. These can appear in select list expressions, as in this DB2 example:

List the name and month and year of hire for each salesperson.

SELECT NAME, MONTH(HIRE_DATE), YEAR(HIRE_DATE)
FROM SALESREPS

SQL constants can also be used by themselves as items in a select list. This can be useful for producing query results that are easier to read and interpret, as in this example:

List the sales for each city.

SELECT CITY, "has sales of", SALES
FROM OFFICES

CITY HAS SALES OF SALES
------------ ------------- ------------
Denver has sales of $186,042.00
New York has sales of $692,637.00
Chicago has sales of $735,042.00
Atlanta has sales of $367,911.00
Los Angeles has sales of $835,915.00

The query results appear to consist of a separate "sentence" for each office, but they"re really a table of three columns. The first and third columns contain values from the OFFICES table. The second column always contains the same 12-character text string. This distinction is subtle when the query results are displayed on a screen, but it is crucial in programmatic SQL, when the results are being retrieved into a program and used for calculations.

Selecting All Columns (SELECT *)

Sometimes it"s convenient to display the contents of all the columns of a table. This can be particularly useful when you first encounter a new database and want to get a quick understanding of its structure and the data it contains. As a convenience, SQL lets you use an asterisk (*) in place of the select list as an abbreviation for "all columns":

Show me all the data in the OFFICES table.

SELECT *
FROM OFFICES

OFFICE CITY REGION MGR TARGET SALES
------- ------------ -------- ---- ------------ ------------

The query results contain all six columns of the OFFICES table, in the same left-to-right order as in the table itself.

The ANSI/ISO SQL standard specifies that a SELECT statement can have either an all-column selection or a select list, but not both, as shown in Figure 6-1. However, many SQL implementations treat the asterisk (*) as just another element of the select list. Thus the query:

SELECT *, (SALES - TARGET)
FROM OFFICES

is legal in most commercial SQL dialects (for example in DB2, Oracle, and SQL Server), but it is not permitted by the ANSI/ISO standard.

The all-columns selection is most appropriate when you are using interactive SQL casually. It should be avoided in programmatic SQL, because changes in the database structure can cause a program to fail. For example, suppose the OFFICES table were dropped from the database and then re-created with its columns rearranged and a new seventh column added. SQL automatically takes care of the database-related details of such changes, but it cannot modify your application program for you. If your program expects a SELECT * FROM OFFICES query to return six columns of query results with certain data types, it will almost certainly stop working when the columns are rearranged and a new one is added.

These difficulties can be avoided if you write the program to request the columns it needs by name. For example, the following query produces the same results as SELECT * FROM OFFICES. It is also immune to changes in the database structure, so long as the named columns continue to exist in the OFFICES table:

SELECT OFFICE, CITY, REGION, MGR, TARGET, SALES
FROM OFFICES

Duplicate Rows (DISTINCT)

If a query includes the primary key of a table in its select list, then every row of query results will be unique (because the primary key has a different value in each row). If the primary key is not included in the query results, duplicate rows can occur. For example, suppose you made this request:

List the employee numbers of all sales office managers.

SELECT MGR
FROM OFFICES

MGR
----
108
106
104
105
108

The query results have five rows (one for each office), but two of them are exact duplicates of one another. Why? Because Larry Fitch manages both the Los Angeles and Denver offices, and his employee number (108) appears in both rows of the OFFICES table. These query results are probably not exactly what you had in mind. If there are four different managers, you might have expected only four employee numbers in the query results.

You can eliminate duplicate rows of query results by inserting the keyword DISTINCT in the SELECT statement just before the select list. Here is a version of the previous query that produces the results you want:

List the employee numbers of all sales office managers.

SELECT DISTINCT MGR
FROM OFFICES

MGR
----
104
105
106
108

Conceptually, SQL carries out this query by first generating a full set of query results (five rows) and then eliminating rows that are exact duplicates of one another to form the final query results. The DISTINCT keyword can be specified regardless of the contents of the SELECT list (with certain restrictions for summary queries, as described in Chapter 8).

If the DISTINCT keyword is omitted, SQL does not eliminate duplicate rows. You can also specify the keyword ALL to explicitly indicate that duplicate rows are to be retained, but it is unnecessary since this is the default behavior.

Row Selection (WHERE Clause)

SQL queries that retrieve all rows of a table are useful for database browsing and reports, but for little else. Usually you"ll want to select only some of the rows in a table and include only these rows in the query results. The WHERE clause is used to specify the rows you want to retrieve. Here are some examples of simple queries that use the WHERE clause:

Show me the offices where sales exceed target.

SELECT CITY, SALES, TARGET
FROM OFFICES
WHERE SALES > TARGET

CITY SALES TARGET
------------ ------------ ------------
New York $692,637.00 $575,000.00
Atlanta $367,911.00 $350,000.00
Los Angeles $835,915.00 $725,000.00

Show me the name, sales, and quota of employee number 105.

SELECT NAME, SALES, QUOTA
FROM SALESREPS
WHERE EMPL_NUM = 105

NAME SALES QUOTA
----------- ------------ ------------
Bill Adams $367,911.00 $350,000.00

Show me the employees managed by Bob Smith (employee 104).

SELECT NAME, SALES
FROM SALESREPS
WHERE MANAGER = 104

NAME SALES
------------ ------------
Bill Adams $367,911.00
Dan Roberts $305,673.00
Paul Cruz $286,775.00

The WHERE clause consists of the keyword WHERE followed by a search condition that specifies the rows to be retrieved. In the previous query, for example, the search condition is MANAGER = 104. Figure 6-5 shows how the WHERE clause works. Conceptually, SQL goes through each row of the SALESREPS table, one-by-one, and applies the search condition to the row. When a column name appears in the search condition (such as the MANAGER column in this example), SQL uses the value of the column in the current row. For each row, the search condition can produce one of three results:

  • If the search condition is TRUE, the row is included in the query results. For example, the row for Bill Adams has the correct MANAGER value and is included.
  • If the search condition is FALSE, the row is excluded from the query results. For example, the row for Sue Smith has the wrong MANAGER value and is excluded.
  • If the search condition has a NULL (unknown) value, the row is excluded from the query results. For example, the row for Sam Clark has a NULL value for the MANAGER column and is excluded.

Figure 6-6 shows another way to think about the role of the search condition in the WHERE clause. Basically, the search condition acts as a filter for rows of the table. Rows that satisfy the search condition pass through the filter and become part of the query results. Rows that do not satisfy the search condition are trapped by the filter and excluded from the query results.

Search Conditions

SQL offers a rich set of search conditions that allow you to specify many different kinds of queries efficiently and naturally. Five basic search conditions (called predicates in the ANSI/ISO standard) are summarized here and are described in the sections that follow:

  • Comparison test Compares the value of one expression to the value of another expression. Use this test to select offices in the Eastern region, or salespeople whose sales are above their quotas.
  • Range test Tests whether the value of an expression falls within a specified range of values. Use this test to find salespeople whose sales are between $100,000 and $500,000.
  • Set membership test Checks whether the value of an expression matches one of a set of values. Use this test to select offices located in New York, Chicago, or Los Angeles.
  • Pattern matching test Checks whether the value of a column containing string data matches a specified pattern. Use this test to select customers whose names start with the letter "E".
  • Null value test Checks whether a column has a NULL (unknown) value. Use this test to find the salespeople who have not yet been assigned to a manager.

Comparison Test (=, <>, <, <=, >, >=)

The most common search condition used in a SQL query is a comparison test. In a comparison test, SQL computes and compares the values of two SQL expressions for each row of data. The expressions can be as simple as a column name or a constant, or they can be more complex arithmetic expressions. SQL offers six different ways of comparing the two expressions, as shown in Figure 6-7. Here are some examples of typical comparison tests:

Find salespeople hired before 1988.

SELECT NAME
FROM SALESREPS
WHERE HIRE_DATE < "01-JAN-88"

NAME
------------
Sue Smith
Bob Smith
Dan Roberts
Paul Cruz

List the offices whose sales fall below 80 percent of target.

SELECT CITY, SALES, TARGET
FROM OFFICES
WHERE SALES < (.8 * TARGET)

CITY SALES TARGET
------- ------------ ------------
Denver $186,042.00 $300,000.00

List the offices not managed by employee number 108.

SELECT CITY, MGR
FROM OFFICES
WHERE MGR <> 108

CITY MGR
--------- ----
New York 106
Chicago 104
Atlanta 105

As shown in Figure 6-7, the inequality comparison test is written as "A < > B" according to the ANSI/ISO SQL specification. Several SQL implementations use alternate notations, such as "A != B" (used by SQL Server) and "AØ =B" (used by DB2 and SQL/DS). In some cases, these are alternative forms; in others, they are the only acceptable form of the inequality test.

When SQL compares the values of the two expressions in the comparison test, three results can occur:

  • If the comparison is true, the test yields a TRUE result.
  • If the comparison is false, the test yields a FALSE result.
  • If either of the two expressions produces a NULL value, the comparison yields a NULL result.

Single-row Retrieval

The most common comparison test is one that checks whether a column"s value is equal to some constant. When the column is a primary key, the test isolates a single row of the table, producing a single row of query results, as in this example:

Retrieve the name and credit limit of customer number 2107.

SELECT COMPANY, CREDIT_LIMIT
FROM CUSTOMERS
WHERE CUST_NUM = 2107

COMPANY CREDIT_LIMIT
------------------ -------------
Ace International $35,000.00

This type of query is the foundation of forms-based database retrieval programs. The user enters a customer number into the form, and the program uses the number to construct and execute a query. It then displays the retrieved data in the form.

Note that the SQL statements for retrieving a specific customer by number, as in this example, and retrieving all customers with a certain characteristic (such as those with credit limits over $25,000) both have exactly the same form. These two types of queries (retrieval by primary key and retrieval based on a search of the data) would be very different operations in a nonrelational database. This uniformity of approach makes SQL much simpler to learn and use than earlier query languages.

NULL Value Considerations

The behavior of NULL values in comparison tests can reveal some "obviously true" notions about SQL queries to be, in fact, not necessarily true. For example, it would seem that the results of these two queries:

List salespeople who are over quota.

SELECT NAME
FROM SALESREPS
WHERE SALES > QUOTA

NAME
------------
Bill Adams
Mary Jones
Sue Smith
Sam Clark
Dan Roberts
Larry Fitch
Paul Cruz

List salespeople who are under or at quota.

SELECT NAME
FROM SALESREPS
WHERE SALES < = QUOTA

NAME
--------------
Bob Smith
Nancy Angelli

would include every row of the SALESREPS table, but the queries produce seven and two rows, respectively, for a total of nine rows, while there are ten rows in the SALESREPS table. Tom Snyder"s row has a NULL value in the QUOTA column because he has not yet been assigned a quota. This row is not listed by either query; it "vanishes" in the comparison test.

As this example shows, you need to think about NULL value handling when you specify a search condition. In SQL"s three-valued logic, a search condition can yield a TRUE, FALSE, or NULL result. Only rows where the search condition yields a TRUE result are included in the query results.

Range Test (BETWEEN)

SQL provides a different form of search condition with the range test (BETWEEN) shown in Figure 6-8. The range test checks whether a data value lies between two specified values. It involves three SQL expressions. The first expression defines the value to be tested; the second and third expressions define the low and high ends of the range to be checked. The data types of the three expressions must be comparable.

This example shows a typical range test:

Find orders placed in the last quarter of 1989.

SELECT ORDER_NUM, ORDER_DATE, MFR, PRODUCT, AMOUNT
FROM ORDERS
WHERE ORDER_DATE BETWEEN "01-OCT-89" AND "31-DEC-89"

ORDER_NUM ORDER_DATE MFR PRODUCT AMOUNT
--------- ----------- ---- -------- -----------

The BETWEEN test includes the endpoints of the range, so orders placed on October 1 or December 31 are included in the query results. Here is another example of a range test:

Find the orders that fall into various amount ranges.

SELECT ORDER_NUM, AMOUNT
FROM ORDERS
WHERE AMOUNT BETWEEN 20000.00 AND 29999.99

ORDER_NUM AMOUNT
---------- -----------

SELECT ORDER_NUM, AMOUNT
FROM ORDERS
WHERE AMOUNT BETWEEN 30000.00 AND 39999.99

ORDER_NUM AMOUNT
---------- -----------

SELECT ORDER_NUM, AMOUNT
FROM ORDERS
WHERE AMOUNT BETWEEN 40000.00 AND 49999.99

ORDER_NUM AMOUNT
---------- -----------

The negated version of the range test (NOT BETWEEN) checks for values that fall outside the range, as in this example:

List salespeople whose sales are not between 80 percent and 120 percent of quota.

SELECT NAME, SALES, QUOTA
FROM SALESREPS
WHERE SALES NOT BETWEEN (.8 * QUOTA) AND (1.2 * QUOTA)

NAME SALES QUOTA
-------------- ------------ ------------
Mary Jones $392,725.00 $300,000.00
Sue Smith $474,050.00 $350,000.00
Bob Smith $142,594.00 $200,000.00
Nancy Angelli $186,042.00 $300,000.00

The test expression specified in the BETWEEN test can be any valid SQL expression, but in practice it"s usually just a column name, as in the previous examples.

The ANSI/ISO standard defines relatively complex rules for the handling of NULL values in the BETWEEN test:

  • If the test expression produces a NULL value, or if both expressions defining the range produce NULL values, then the BETWEEN test returns a NULL result.
  • If the expression defining the lower end of the range produces a NULL value, then the BETWEEN test returns FALSE if the test value is greater than the upper bound, and NULL otherwise.
  • If the expression defining the upper end of the range produces a NULL value, then the BETWEEN test returns FALSE if the test value is less than the lower bound, and NULL otherwise.

Before relying on this behavior, it"s a good idea to experiment with your DBMS.

It"s worth noting that the BETWEEN test doesn"t really add to the expressive power of SQL, because it can be expressed as two comparison tests. The range test:

A BETWEEN B AND C

is completely equivalent to:

(A >= B) AND (A < = C)

However, the BETWEEN test is a simpler way to express a search condition when you"re thinking of it in terms of a range of values.

Set Membership Test (IN)

Another common search condition is the set membership test (IN), shown in Figure 6-9. It tests whether a data value matches one of a list of target values. Here are several queries that use the set membership test:

List the salespeople who work in New York, Atlanta, or Denver.

SELECT NAME, QUOTA, SALES
FROM SALESREPS
WHERE REP_OFFICE IN (11, 13, 22)

NAME QUOTA SALES
-------------- ------------ ------------
Bill Adams $350,000.00 $367,911.00
Mary Jones $300,000.00 $392,725.00
Sam Clark $275,000.00 $299,912.00
Nancy Angelli $300,000.00 $186,042.00

Find all orders placed on a Thursday in January 1990.

SELECT ORDER_NUM, ORDER_DATE, AMOUNT
FROM ORDERS
WHERE ORDER_DATE IN ("04-JAN-90", "11-JAN-90", "18-JAN-90", "25-JAN-90")

ORDER_NUM ORDER_DATE AMOUNT
---------- ----------- ----------

Find all orders placed with four specific salespeople.

SELECT ORDER_NUM, REP, AMOUNT
FROM ORDERS
WHERE REP IN (107, 109, 101, 103)

ORDER_NUM REP AMOUNT
---------- ---- -----------

You can check if the data value does not match any of the target values by using the NOT IN form of the set membership test. The test expression in an IN test can be any SQL expression, but it"s usually just a column name, as in the preceding examples. If the test expression produces a NULL value, the IN test returns NULL. All of the items in the list of target values must have the same data type, and that type must be comparable to the data type of the test expression.

Like the BETWEEN test, the IN test doesn"t add to the expressive power of SQL, because the search condition:

X IN (A, B, C)

is completely equivalent to:

(X = A) OR (X = B) OR (X = C)

However, the IN test offers a much more efficient way of expressing the search condition, especially if the set contains more than a few values. The ANSI/ISO SQL standard doesn"t specify a maximum limit to the number of items that can appear in the value list, and most commercial implementations do not state an explicit upper limit either. For portability reasons, it"s generally a good idea to avoid lists with only a single item, such as this one:

CITY IN ("New York")

and replace them with a simple comparison test:

CITY = "New York"

Pattern Matching Test (LIKE)

A simple comparison test can be used to retrieve rows where the contents of a text column match some particular text. For example, this query retrieves a row of the CUSTOMERS table by name:

Show the credit limit for Smithson Corp.

SELECT COMPANY, CREDIT_LIMIT
FROM CUSTOMERS
WHERE COMPANY = "Smithson Corp."

However, you might easily forget whether the company"s name was "Smith," "Smithson," or "Smithsonian." SQL"s pattern matching test can be used to retrieve the data based on a partial match of the customer"s name.

The pattern matching test (LIKE), shown in Figure 6-10, checks to see whether the data value in a column matches a specified pattern. The pattern is a string that may include one or more wildcard characters. These characters are interpreted in a special way.

Wildcard Characters

The percent sign (%) wildcard character matches any sequence of zero or more characters. Here"s a modified version of the previous query that uses the percent sign for pattern matching:

SELECT COMPANY, CREDIT_LIMIT
FROM CUSTOMERS
WHERE COMPANY LIKE "Smith% Corp."

The LIKE keyword tells SQL to compare the NAME column to the pattern "Smith% Corp." Any of the following names would match the pattern:

Smith Corp. Smithson Corp. Smithsen Corp. Smithsonian Corp.

but these names would not:

SmithCorp Smithson Inc.

The underscore (_) wildcard character matches any single character. If you are sure that the company"s name is either "Smithson" or "Smithsen," for example, you can use this query:

SELECT COMPANY, CREDIT_LIMIT
FROM CUSTOMERS
WHERE COMPANY LIKE "Smiths_n Corp."

In this case, any of these names will match the pattern:

Smithson Corp. Smithsen Corp. Smithsun Corp.

but these names will not:

Smithsoon Corp. Smithsn Corp.

Wildcard characters can appear anywhere in the pattern string, and several wildcard characters can be within a single string. This query allows either the "Smithson" or "Smithsen" spelling and will also accept "Corp.", "Inc.", or any other ending on the company name:

SELECT COMPANY, CREDIT_LIMIT
FROM CUSTOMERS
WHERE COMPANY LIKE "Smiths_n %"

You can locate strings that do not match a pattern by using the NOT LIKE form of the pattern matching test. The LIKE test must be applied to a column with a string data type. If the data value in the column is NULL, the LIKE test returns a NULL result.

If you have used computers through a command-line interface (such as the Unix shell), you"ve probably seen string pattern matching before. Frequently, the asterisk (*) is used instead of SQL"s percent sign (%), and the question mark (?) is used instead of SQL"s underscore (_), but the pattern matching capabilities themselves are similar in most situations where a computer application offers the ability to match selected parts of a word or text.

Escape Characters *

One of the problems with string pattern matching is how to match the wildcard characters themselves as literal characters. To test for the presence of a percent sign character in a column of text data, for example, you can"t simply include the percent sign in the pattern because SQL will treat it as a wildcard. With some popular SQL products, you cannot literally match the two wildcard characters. This usually doesn"t pose serious problems, because the wildcard characters don"t frequently appear in names, product numbers, and other text data of the sort that is usually stored in a database.

The ANSI/ISO SQL standard does specify a way to literally match wildcard characters, using a special escape character. When the escape character appears in the pattern, the character immediately following it is treated as a literal character rather than a wildcard character. (The latter character is said to be escaped.) The escaped character can be either of the two wildcard characters, or the escape character itself, which has now taken on a special meaning within the pattern.

The escape character is specified as a one-character constant string in the ESCAPE clause of the search condition, as shown in Figure 6-10. Here is an example using a dollar sign ($) as the escape character:

Find products whose product IDs start with the four letters "A%BC".

SELECT ORDER_NUM, PRODUCT
FROM ORDERS
WHERE PRODUCT LIKE "A$%BC%" ESCAPE "$"

The first percent sign in the pattern, which follows an escape character, is treated as a literal percent sign; the second functions as a wildcard.

The use of escape characters is very common in pattern matching applications, which is why the ANSI/ISO standard specified it. However, it was not a part of the early SQL implementations and has been slowly adopted. To insure portability, the ESCAPE clause should be avoided.

Null Value Test (IS NULL)

NULL values create a three-valued logic for SQL search conditions. For any given row, the result of a search condition may be TRUE or FALSE, or it may be NULL because one of the columns used in evaluating the search condition contains a NULL value. Sometimes it"s useful to check explicitly for NULL values in a search condition and handle them directly. SQL provides a special NULL value test (IS NULL), shown in Figure 6-11, to handle this task.

This query uses the NULL value test to find the salesperson in the sample database who has not yet been assigned to an office:

Find the salesperson not yet assigned to an office.

SELECT NAME
FROM SALESREPS
WHERE REP_OFFICE IS NULL

NAME
-----------
Tom Snyder

The negated form of the NULL value test (IS NOT NULL) finds rows that do not contain a NULL value:

List the salespeople who have been assigned to an office.

SELECT NAME
FROM SALESREPS
WHERE REP_OFFICE IS NOT NULL

NAME
--------------
Bill Adams
Mary Jones
Sue Smith
Sam Clark
Bob Smith
Dan Roberts
Larry Fitch
Paul Cruz
Nancy Angelli

Unlike the previously described search conditions, the NULL value test cannot yield a NULL result. It is always either TRUE or FALSE.

It may seem strange that you can"t just test for a NULL value using a simple comparison search condition, such as this:

SELECT NAME
FROM SALESREPS
WHERE REP_OFFICE = NULL

The NULL keyword can"t be used here because it isn"t really a value; it"s just a signal that the value is unknown. Even if the comparison test:

REP_OFFICE = NULL

were legal, the rules for handling NULL values in comparisons would cause it to behave differently from what you might expect. When SQL encountered a row where the REP_OFFICE column was NULL, the search condition would test:

NULL = NULL

Is the result TRUE or FALSE? Because the values on both sides of the equal sign are unknown, SQL can"t tell, so the rules of SQL logic say that the search condition itself must yield a NULL result. Because the search condition doesn"t produce a true result, the row is excluded from the query results-precisely the opposite of what you wanted to happen! As a result of the way SQL handles NULLs in comparisons, you must explicitly use the NULL value test to check for NULL values.Compound Search Conditions (AND, OR, and NOT)

The simple search conditions described in the preceding sections return a value of TRUE, FALSE, or NULL when applied to a row of data. Using the rules of logic, you can combine these simple SQL search conditions to form more complex ones, as shown in Figure 6-12. Note that the search conditions combined with AND, OR, and NOT may themselves be compound search conditions.

The keyword OR is used to combine two search conditions when one or the other (or both) must be true:

Find salespeople who are under quota or with sales under $300,000.

SELECT NAME, QUOTA, SALES
FROM SALESREPS
WHERE SALES < QUOTA

NAME QUOTA SALES
-------------- ------------ ------------
Sam Clark $275,000.00 $299,912.00
Bob Smith $200,000.00 $142,594.00
Tom Snyder NULL $75,985.00
Paul Cruz $275,000.00 $286,775.00
Nancy Angelli $300,000.00 $186,042.00

You can also use the keyword AND to combine two search conditions that must both be true:

Find salespeople who are under quota and with sales under $300,000.

SELECT NAME, QUOTA, SALES
FROM SALESREPS
WHERE SALES < QUOTA

NAME QUOTA SALES
-------------- ------------ ------------
Bob Smith $200,000.00 $142,594.00
Nancy Angelli $300,000.00 $186,042.00

Finally, you can use the keyword NOT to select rows where a search condition is false:

Find all salespeople who are under quota, but whose sales are not under $150,000.

SELECT NAME, QUOTA, SALES
FROM SALESREPS
WHERE SALES < QUOTA

NAME QUOTA SALES
-------------- ------------ ------------
Nancy Angelli $300,000.00 $186,042.00

Using the logical AND, OR, and NOT keywords and parentheses to group the search criteria, you can build very complex search criteria, such as the one in this query:

Find all salespeople who either: (a) work in Denver, New York, or Chicago; or (b) have no manager and were hired since June 1988; or (c) are over quota, but have sales of $600,000 or less.

SELECT NAME
FROM SALESREPS
WHERE (REP_OFFICE IN (22, 11, 12))

Exactly why you might want to see this particular list of names is a mystery, but the example does illustrate a reasonably complex query.

As with simple search conditions, NULL values influence the outcome of compound search conditions, and the results are subtle. In particular, the result of (NULL OR TRUE) is TRUE, not NULL as you might expect. Tables 6-1, 6-2, and 6-3 specify truth tables for AND, OR, and NOT, respectively, and show the impact of NULL values.

AND

TRUE

FALSE

NULL

TRUE

TRUE

FALSE

NULL

FALSE

FALSE

FALSE

FALSE

NULL

NULL

FALSE

NULL

Table 1: AND Truth Table

OR

TRUE

FALSE

NULL

TRUE

TRUE

TRUE

TRUE

FALSE

TRUE

FALSE

NULL

NULL

TRUE

NULL

NULL

Table 2: OR Truth Table

NOT

TRUE

FALSE

NULL

 

FALSE

TRUE

NULL

Table 3: NOT Truth Table

When more than two search conditions are combined with AND, OR, and NOT, the ANSI/ISO standard specifies that NOT has the highest precedence, followed by AND and then OR. To ensure portability, it"s always a good idea to use parentheses and remove any possible ambiguity.

The SQL2 standard adds another logical search condition, the IS test, to the logic provided by AND, OR, and NOT. Figure 6-13 shows the syntax of the IS test, which checks to see whether the logical value of an expression or comparison test is TRUE, FALSE, or UNKNOWN (NULL).

For example, the IS test:

((SALES - QUOTA) > 10000.00) IS UNKNOWN

can be used to find rows where the comparison cannot be done because either SALES or QUOTA has a NULL value. Similarly, the IS test:

((SALES - QUOTA) > 10000.00) IS FALSE

will select rows where SALES are not significantly above QUOTA. As this example shows, the IS test doesn"t really add to the expressive power of SQL, since the test could just as easily have been written:

NOT ((SALES - QUOTA) > 10000.00)

For maximum portability, it"s a good idea to avoid the tests and write the expressions using only AND, OR, and NOT. It"s not always possible to avoid the IS UNKNOWN form of the test.

Sorting Query Results (ORDER BY Clause)

Like the rows of a table in the database, the rows of query results are not arranged in any particular order. You can ask SQL to sort the results of a query by including the ORDER BY clause in the SELECT statement. The ORDER BY clause, shown in Figure 6-14, consists of the keywords ORDER BY, followed by a list of sort specifications separated by commas. For example, the results of this query are sorted on two columns, REGION and CITY:

Show the sales for each office, sorted in alphabetical order by region, and within each region by city.

SELECT CITY, REGION, SALES
FROM OFFICES
ORDER BY REGION, CITY

CITY REGION SALES
------------ -------- ------------
Atlanta Eastern $367,911.00
Chicago Eastern $735,042.00
New York Eastern $692,637.00
Denver Western $186,042.00
Los Angeles Western $835,915.00

The first sort specification (REGION) is the major sort key; those that follow (CITY, in this case) are progressively more minor sort keys, used as "tie breakers" when two rows of query results have the same values for the more major keys. Using the ORDER BY clause, you can request sorting in an ascending or descending sequence, and you can sort on any item in the select list of the query.

By default, SQL sorts data in ascending sequence. To request sorting in descending sequence, the keyword DESC is included in the sort specification, as in this example:

List the offices, sorted in descending order by sales, so that the offices with the largest sales appear first.

SELECT CITY, REGION, SALES
FROM OFFICES
ORDER BY SALES DESC

CITY REGION SALES
------------ -------- ------------
Los Angeles Western $835,915.00
Chicago Eastern $735,042.00
New York Eastern $692,637.00
Atlanta Eastern $367,911.00
Denver Western $186,042.00

As indicated in Figure 6-14, you can also use the keyword ASC to specify an ascending sort, but because that"s the default sorting sequence, the keyword is usually omitted.

If the column of query results to be used for sorting is a calculated column, it has no column name to be used in a sort specification. In this case, you must specify a column number instead of a column name, as in this example:

List the offices, sorted in descending order by sales performance, so that the offices with the best performance appear first.

SELECT CITY, REGION, (SALES - TARGET)
FROM OFFICES
ORDER BY 3 DESC

CITY REGION (SALES-TARGET)
------------ -------- ---------------
New York Eastern $117,637.00
Los Angeles Western $110,915.00
Atlanta Eastern $17,911.00
Chicago Eastern -$64,958.00
Denver Western -$113,958.00

These query results are sorted on the third column, which is the calculated difference between the SALES and TARGET for each office. By combining column numbers, column names, ascending sorts, and descending sorts, you can specify quite complex sorting of the query results, as in the following final example:

List the offices, sorted in alphabetical order by region, and within each region in descending order by sales performance.

SELECT CITY, REGION, (SALES - TARGET)
FROM OFFICES
ORDER BY REGION ASC, 3 DESC

CITY REGION (SALES-TARGET)
------------ -------- ---------------
New York Eastern $117,637.00
Atlanta Eastern $17,911.00
Chicago Eastern -$64,958.00
Los Angeles Western $110,915.00
Denver Western -$113,958.00

The SQL2 standard allows you to control the sorting order used by the DBMS for each sort key. This can be important when working with international character sets or to insure portability between ASCII and EBCDIC character set systems. However, this area of the SQL2 specification is quite complex, and in practice many SQL implementations either ignore sorting sequence issues or use their own proprietary scheme for user control of the sorting sequence.

Rules for Single-table Query Processing

Single-table queries are generally simple, and it"s usually easy to understand the meaning of a query just by reading the SELECT statement. As queries become more complex, however, it"s important to have a more precise "definition" of the query results that will be produced by a given SELECT statement. The following steps describe the procedure for generating the results of a SQL query that includes the clauses described in this chapter.

As the next steps show, the query results produced by a SELECT statement are specified by applying each of its clauses, one-by-one. The FROM clause is applied first (selecting the table containing data to be retrieved). The WHERE clause is applied next (selecting specific rows from the table). The SELECT clause is applied next (generating the specific columns of query results and eliminating duplicate rows, if requested). Finally, the ORDER BY clause is applied to sort the query results.

To generate the query results for a select statement follow these steps:

  1. Start with the table named in the FROM clause.
  2. If there is a WHERE clause, apply its search condition to each row of the table, retaining those rows for which the search condition is TRUE, and discarding those rows for which it is FALSE or NULL.
  3. For each remaining row, calculate the value of each item in the select list to produce a single row of query results. For each column reference, use the value of the column in the current row.
  4. If SELECTED DISTINCT is specified, eliminate any duplicate rows of query results that were produced.
  5. If there is an ORDER BY clause, sort the query results as specified.

The rows generated by this procedure comprise the query results.

These "rules" for SQL query processing will be expanded several times in the next three chapters to include the remaining clauses of the SELECT statement.

Combining Query Results (UNION) *

Occasionally, it"s convenient to combine the results of two or more queries into a single table of query results. SQL supports this capability through the UNION feature of the SELECT statement. Figure 6-15 illustrates how the UNION operation can be used to satisfy the following request:

List all the products where the price of the product exceeds $2,000 or where more than $30,000 of the product has been ordered in a single order.

The first part of the request can be satisfied with the top query in the figure:

List all the products whose price exceeds $2,000.

SELECT MFR_ID, PRODUCT_ID
FROM PRODUCTS
WHERE PRICE > 2000.00

MFR_ID PRODUCT_ID
------- -----------
ACI 4100Y
REI 2A44L
ACI 4100Z
REI 2A44R

Similarly, the second part of the request can be satisfied with the bottom query in the figure:

List all the products where more than $30,000 of the product has been ordered in a single order.

SELECT DISTINCT MFR, PRODUCT
FROM ORDERS
WHERE AMOUNT > 30000.00

MFR PRODUCT
---- --------
IMM 775C
REI 2A44L
REI 2A44R

As shown in the figure, the UNION operation produces a single table of query results that combines the rows of the top query results with the rows of the bottom query results. The SELECT statement that specifies the UNION operation looks like this:

List all the products where the price of the product exceeds $2,000 or where more than $30,000 of the product has been ordered in a single order.

SELECT MFR_ID, PRODUCT_ID
FROM PRODUCTS
WHERE PRICE > 2000.00
UNION
SELECT DISTINCT MFR, PRODUCT
FROM ORDERS
WHERE AMOUNT > 30000.00

ACI 4100Y
ACI 4100Z
IMM 775C
REI 2A44L
REI 2A44R

There are severe restrictions on the tables that can be combined by a UNION operation:

  • The two tables must contain the same number of columns.
  • The data type of each column in the first table must be the same as the data type of the corresponding column in the second table.
  • Neither of the two tables can be sorted with the ORDER BY clause. However, the combined query results can be sorted, as described in the following section.

Note that the column names of the two queries combined by a UNION do not have to be identical. In the preceding example, the first table of query results has columns named MFR_ID and PRODUCT_ID, while the second table of query results has columns named MFR and PRODUCT. Because the columns in the two tables can have different names, the columns of query results produced by the UNION operation are unnamed.

The ANSI/ISO SQL standard specifies a further restriction on a SELECT statement that participates in a UNION. It permits only column names or an "all columns" specification (SELECT *) in the select list and prohibits expressions in the select list. Most commercial SQL implementations relax this restriction and permit simple expressions in the select list. However, many SQL implementations do not allow the SELECT statements to include the GROUP BY or HAVING clauses, and some do not allow column functions in the select list (prohibiting summary queries as described in Chapter 8). In fact, some SQL implementations (including SQL Server) do not support the UNION operation at all.

Unions and Duplicate Rows *

Because the UNION operation combines the rows from two sets of query results, it would tend to produce query results containing duplicate rows. For example, in the query of Figure 6-15, product REI-2A44L sells for $4,500.00, so it appears in the top set of query results. There is also an order for $31,500.00 worth of this product in the ORDERS table, so it also appears in the bottom set of query results. By default, the UNION operation eliminates duplicate rows as part of its processing. Thus, the combined set of query results contains only one row for product REI-2A44L.

If you want to retain duplicate rows in a UNION operation, you can specify the ALL keyword immediately following the word "UNION". This form of the query produces two duplicate rows for product REI-2A44L:

List all the products where the price of the product exceeds $2,000 or where more than $30,000 of the product has been ordered in a single order.

SELECT MFR_ID, PRODUCT_ID
FROM PRODUCTS
WHERE PRICE > 2000.00
UNION ALL
SELECT DISTINCT MFR, PRODUCT
FROM ORDERS
WHERE AMOUNT > 30000.00

ACI 4100Y
REI 2A44L
ACI 4100Z
REI 2A44R
IMM 775C
REI 2A44L
REI 2A44R

Note that the default duplicate row handling for the UNION operation and for the simple SELECT statement is exactly opposite. For the SELECT statement, SELECT ALL (duplicates retained) is the default. To eliminate duplicate rows, you must explicitly specify SELECT DISTINCT. For the UNION operation, UNION (duplicates eliminated) is the default. To retain duplicate rows, you must explicitly specify UNION ALL.

Database experts have criticized the handling of duplicate rows in SQL and point to this inconsistency as an example of the problems. The reason for the inconsistency is that the SQL defaults were chosen to produce the correct behavior most of the time:

  • In practice, most simple SELECT statements do not produce duplicate rows, so the default is no duplicate elimination.
  • In practice, most UNION operations would produce unwanted duplicate rows, so the default is duplicate elimination.

Eliminating duplicate rows from query results is a very time-consuming process, especially if the query result s contain a large number of rows. If you know, based on the individual queries involved, that a UNION operation cannot produce duplicate rows, you should specifically use the UNION ALL operation because the query will execute much more quickly.

Unions and Sorting *

The ORDER BY clause cannot appear in either of the two SELECT statements combined by a UNION operation. It wouldn"t make much sense to sort the two sets of query results anyway, because they are fed directly into the UNION operation and are never visible to the user. However, the combined set of query results produced by the UNION operation can be sorted by specifying an ORDER BY clause after the second SELECT statement. Since the columns produced by the UNION operation are not named, the ORDER BY clause must specify the columns by column number.

Here is the same products query as that shown in Figure 6-15, with the query results sorted by manufacturer and product number:

List all the products where the price of the product exceeds $2,000 or where more than $30,000 of the product has been ordered in a single order, sorted by manufacturer and product number.

SELECT MFR_ID, PRODUCT_ID
FROM PRODUCTS
WHERE PRICE > 2000.00
UNION
SELECT DISTINCT MFR, PRODUCT
FROM ORDERS
WHERE AMOUNT > 30000.00
ORDER BY 1, 2

ACI 4100Y
ACI 4100Z
IMM 775C
REI 2A44L
REI 2A44R

Multiple UNIONs *

The UNION operation can be used repeatedly to combine three or more sets of query results, as shown in Figure 6-16. The union of Table B and Table C in the figure produces a single, combined table. This table is then combined with Table A in another UNION operation. The query in the figure is written this way:

SELECT *
FROM A
UNION (SELECT *

Bill
Mary
George
Fred
Sue
Julia
Harry

The parentheses in the query indicate which UNION should be performed first. In fact, if all of the UNIONs in the statement eliminate duplicate rows, or if all of them retain duplicate rows, the order in which they are performed is unimportant. These three expressions are completely equivalent:

A UNION (B UNION C)

A UNION B) UNION C

(A UNION C) UNION B

and produce seven rows of query results. Similarly, the following three expressions are completely equivalent and produce twelve rows of query results, because the duplicates are retained:

A UNION ALL (B UNION ALL C)

(A UNION ALL B) UNION ALL C

(A UNION ALL C) UNION ALL B

However, if the unions involve a mixture of UNION and UNION ALL, the order of evaluation matters. If this expression:

A UNION ALL B UNION C

is interpreted as:

A UNION ALL (B UNION C)

then it produces ten rows of query results (six from the inner UNION, plus four rows from Table A). However, if it is interpreted as:

(A UNION ALL B) UNION C

then it produces only four rows, because the outer UNION eliminates all duplicate rows. For this reason, it"s always a good idea to use parentheses in UNIONs of three or more tables to specify the order of evaluation intended.

Summary

This chapter is the first of four chapters about SQL queries. It described the following query features:

  • The SELECT statement is used to express a SQL query. Every SELECT statement produces a table of query results containing one or more columns and zero or more rows.
  • The FROM clause specifies the table(s) containing the data to be retrieved by a query.
  • The SELECT clause specifies the column(s) of data to be included in the query results, which can be columns of data from the database, or calculated columns.
  • The WHERE clause selects the rows to be included in the query results by applying a search condition to rows of the database.
  • A search condition can select rows by comparing values, by checking a value against a range or set of values, by matching a string pattern, and by checking for NULL values.
  • Simple search conditions can be combined with AND, OR, and NOT to form more complex search conditions.
  • The ORDER BY clause specifies that the query results should be sorted in ascending or descending order, based on the values of one or more columns.
  • The UNION operation can be used within a SELECT statement to combine two or more sets of query results into a single set.
Read More Show Less

Customer Reviews

Be the first to write a review
( 0 )
Rating Distribution

5 Star

(0)

4 Star

(0)

3 Star

(0)

2 Star

(0)

1 Star

(0)
Sort by: Showing 1 Customer Reviews
  • Anonymous

    Posted Thu Aug 16 00:00:00 EDT 2001

    A complete must!

    This book is a necessity for those who use SQL. It contains complete coverage on every aspect. What even suprised me was it's introduction with Oracle. I started a project with Oracle and now see myself as a novice.

    Was this review helpful? Yes  No   Report this review
Sort by: Showing 1 Customer Reviews

If you find inappropriate content, please report it to Barnes & Noble
Why is this product inappropriate?
Comments (optional)