3.6. Querying with SQL SELECT
We begin this section by covering the basics of using the SELECT statement. We then introduce the WHERE clause for selecting data that matches a condition. The section concludes with an introduction to the more advanced features of SELECT statements.3.6.1. Basic Querying
The SELECT statement is used to query a database and for all output operations in SQL. Consider an example query:This outputs the values of the attributes surname and firstname from all rows, or records, in the customer table. Assuming we previously inserted four rows when we created the winestore database, the output from the MySQL command interpreter is:SELECT surname, firstname FROM customer;
Any attributes of a table may be listed in a SELECT statement by separating each with a comma. If all attributes are required, the shortcut of an asterisk character (*) can be used. Consider the statement:+-----------+-----------+ | surname | firstname | +-----------+-----------+ | Marzalla | Dimitria | | LaTrobe | Anthony | | Fong | Nicholas | | Stribling | James | +-----------+-----------+ 4 rows in set (0.04 sec)
This outputs all the data from the table region:SELECT * FROM region;
SELECT statements can also output data that isn't from a database. Consider the following example:+-----------+---------------------+-------------+------+ | region_id | region_name | description | map | +-----------+---------------------+-------------+------+ | 1 | Goulburn Valley | NULL | NULL | | 2 | Rutherglen | NULL | NULL | | 3 | Coonawarra | NULL | NULL | | 4 | Upper Hunter Valley | NULL | NULL | +-----------+---------------------+-------------+------+ 4 rows in set (0.07 sec)
This example runs a function that displays the current time:SELECT curtime( );
The SELECT statement can even be used as a simple calculator, using the mathematical functions described in the later section Section 3.9:+-----------+ | curtime( ) | +-----------+ | 08:41:50 | +-----------+ 1 row in set (0.02 sec)
This outputs:SELECT log(100)*4*pi( );
+-----------------+ | log(100)*4*pi( ) | +-----------------+ | 57.870275 | +-----------------+ 1 row in set (0.19 sec)
3.6.2. WHERE Clauses
A WHERE clause is used as part of most SELECT queries; it limits retrieval to those rows that match a condition.Consider this grape-growing region table containing the details of nine regions:
It is possible to select only a few rows with a SELECT statement by adding a WHERE clause. For example, to show only the first three regions, you can issue the following statement:SELECT * from region; +-----------+---------------------+-------------+------+ | region_id | region_name | description | map | +-----------+---------------------+-------------+------+ | 1 | Goulburn Valley | NULL | NULL | | 2 | Rutherglen | NULL | NULL | | 3 | Coonawarra | NULL | NULL | | 4 | Upper Hunter Valley | NULL | NULL | | 5 | Lower Hunter Valley | NULL | NULL | | 6 | Barossa Valley | NULL | NULL | | 7 | Riverland | NULL | NULL | | 8 | Margaret River | NULL | NULL | | 9 | Swan Valley | NULL | NULL | +-----------+---------------------+-------------+------+ 9 rows in set (0.00 sec)
This outputs all attributes for only the first three region rows:SELECT * FROM region WHERE region_id<=3;
You can combine the attribute and row restrictions and select only the region_name and region_id attributes for the first three regions:+-----------+------------------+-------------+------+ | region_id | region_name | description | map | +-----------+------------------+-------------+------+ | 1 | Goulburn Valley | NULL | NULL | | 2 | Rutherglen | NULL | NULL | | 3 | Coonawarra | NULL | NULL | +-----------+------------------+-------------+------+ 3 rows in set (0.00 sec)
More complex WHERE clauses use the Boolean operators AND and OR, as well as the functions described later in Section 3.9. The Boolean operators AND and OR have the same function as the PHP && and ||operators introduced in Chapter 2.SELECT region_id, region_name FROM region WHERE region_id <= 3; +-----------+------------------+ | region_id | region_name | +-----------+------------------+ | 1 | Goulburn Valley | | 2 | Rutherglen | | 3 | Coonawarra | +-----------+------------------+ 3 rows in set (0.00 sec)
Consider an example query that uses the Boolean operators:
This retrieves rows that match both criteria, that is, those customers with a surname Marzalla and a firstname Dimitria.SELECT * FROM customer WHERE surname='Marzalla' AND firstname='Dimitria';
Consider a more complex example:
This finds rows with either the surname Marzalla and a firstname beginning with M, or customers with the email address john@lucaston.com. The OR operator isn't exclusive, so an answer can have anemail of john@lucaston.com, a surname of Marzalla, and a firstname beginning with M. This query, when run on the winestore database, returns:SELECT cust_id FROM customer WHERE (surname='Marzalla' AND firstname LIKE 'M%') OR email='john@lucaston.com';
SELECT queries are often sophisticated and a long WHERE clause may include many AND and OR operators. More complex examples of queries are shown in the later section Section 3.7.+---------+ | cust_id | +---------+ | 440 | | 493 | +---------+ 2 rows in set (0.01 sec)
The WHERE clause is also a common component of UPDATE and DELETE statements, and we have shown simple examples of using WHERE with these earlier in this chapter. Consider another example of anUPDATE with a WHERE clause:
In this case, for wines that are made by the winery with winery_id=299, the winery_id is changed to winery_id=298.UPDATE wine SET winery_id = 298 WHERE winery_id = 299;
The WHERE clause can be used similarly in a DELETE. Consider an example:
This removes only selected rows based on a condition; here the wines made by the winery with winery_id=299 are deleted.DELETE FROM wine WHERE winery_id = 299;
3.6.3. Sorting and Grouping Output
We will now discuss techniques to manage the order and grouping of the output.3.6.3.1. ORDER BY
The ORDER BY clause sorts the data after the query has been evaluated. Consider an example:This query finds all customers who have a title Mr and live in Portsea. It then presents the results sorted alphabetically by ascending surname:SELECT surname, firstname FROM customer WHERE title='Mr' AND city = 'Portsea' ORDER by surname;
Sorting can be on multiple attributes. For example:+-----------+-----------+ | surname | firstname | +-----------+-----------+ | Dalion | Anthony | | Galti | Jim | | Keisling | Mark | | Leramonth | James | | Mellili | Derryn | | Mockridge | James | | Nancarral | Joshua | | Ritterman | James | +-----------+-----------+ 8 rows in set (0.01 sec)
This presents a list of customers in areas with zipcode='3000' or zipcode='3001', sorted first by ascending surname, then (for those customers with the same surname) by firstname, and (for those customers with the same surname and first name), by initial. So, for example, the output may be:SELECT surname, firstname, initial FROM customer WHERE zipcode='3001' OR zipcode='3000' ORDER BY surname, firstname, initial;
By default, the ORDER BY clause sorts in ascending order, or ASC. To sort in reverse or descending order, DESC can be used. Consider an example:+-----------+-----------+---------+ | surname | firstname | initial | +-----------+-----------+---------+ | Keisling | Belinda | C | | Leramonth | Hugh | D | | Leramonth | Joshua | H | | Leramonth | Joshua | R | | Young | Bob | A | +-----------+-----------+---------+ 5 rows in set (0.11 sec)
SELECT * FROM customer WHERE city='Melbourne' ORDER BY surname DESC;
3.6.3.2. GROUP BY
The GROUP BY clause is different from ORDER BY because it doesn't sort the data for output. Instead, it sorts the data early in the query process, for the purpose of grouping or aggregation. An example shows the difference:This query outputs a sorted list of cities and, for each city, the COUNT of the number of customers who live in that city. The effect of COUNT(*) is to count the number of rows per group. In this example, it doesn't matter what is counted; COUNT(surname) has exactly the same result.SELECT city, COUNT(*) FROM customer GROUP BY city;
Here are the first few lines output by the query:
The query aggregates or groups all the rows for each city into sets, and the COUNT(*) operation counts the number in each set. So, for example, there are 14 customers who live in Alexandra.+--------------+----------+ | city | COUNT(*) | +--------------+----------+ | Alexandra | 14 | | Armidale | 7 | | Athlone | 9 | | Bauple | 6 | | Belmont | 11 | | Bentley | 10 | | Berala | 9 | | Broadmeadows | 11 |
The GROUP BY clause can find different properties of the aggregated rows. Here's an example:
This query first groups the rows by city and then shows the maximum salary in each city. The first few rows of the output are as follows:SELECT city, MAX(salary) FROM customer GROUP BY city;
+-----------+-------------+ | city | MAX(salary) | +-----------+-------------+ | Alexandra | 109000 | | Armidale | 75000 | | Athlone | 84000 | | Bauple | 33000 |
TIP: The GROUP BY clause should be used only when the query is designed to find a characteristic of a group of rows, not the details of individual rows.There are several functions that can be used in aggregation with the GROUP BY clause. Five particularly useful functions are:
- AVG( )
- Finds the average value of a numeric attribute in a set
- MIN( )
- Finds a minimum value of a string or numeric attribute in a set
- MAX( )
- Finds a maximum value of a string or numeric attribute in a set
- SUM( )
- Finds the sum total of a numeric attribute
- COUNT( )
- Counts the number of rows in a set
3.6.3.3. HAVING
The HAVING clause permits conditional aggregation of data into groups. For example, consider the following query:The query groups rows by city, but only for cities that have more than 10 resident customers. For those groups, the city, count( ) of customers, and maximum salary of a customer in that city is output. Cities with less than 10 customers are omitted from the result set. The first few rows of the output are as follows:SELECT city, count(*), max(salary) FROM customer GROUP BY city HAVING count(*) > 10;
The HAVING clause must contain an attribute or expression from the SELECT clause.+--------------+----------+-------------+ | city | count(*) | max(salary) | +--------------+----------+-------------+ | Alexandra | 14 | 109000 | | Belmont | 11 | 71000 | | Broadmeadows | 11 | 51000 | | Doveton | 13 | 77000 | | Eleker | 11 | 97000 | | Gray | 12 | 77000 |
TIP: The HAVING clause is used exclusively with the GROUP BY clause. It is slow and should never be used instead of a WHERE clause.
3.6.3.4. DISTINCT
The DISTINCT operator presents only one example of each row from a query. Consider an example:This shows one example of each different customer surname in the customer table. This example has exactly the same effect as:SELECT DISTINCT surname FROM customer;
The DISTINCT clause is usually slow to run, much like the GROUP BY clause. We discuss how indexes and query optimization can speed queries later in this chapter.SELECT surname FROM customer GROUP BY surname;
3.6.4. Limiting Result Sets in MySQL
An additional operator is available in MySQL that limits the size of the result sets. For example, the following query returns only the first five rows from the wine table:This saves query evaluation time and reduces the size of the result set that must be buffered by the DBMS. The LIMIT operator is MySQL-specific.SELECT * FROM wine LIMIT 5;
Categories:
PHP