从表中查询数据时,可能会收到重复的行记录。为了删除这些重复行,可以在SELECT语句中使用DISTINCT子句。
DISTINCT子句的语法如下:
SELECT DISTINCT columns FROM table_name WHERE where_conditions;
下面来看看一个使用DISTINCT子句从employees表中选择员工的唯一姓氏(lastName)的简单示例。
首先,使用SELECT语句从employees表中查询员工的姓氏(lastName),如下所示:
SELECT lastname FROM employees ORDER BY lastname;
执行上面查询语句,得到以下结果 -
mysql> SELECT lastname FROM employees ORDER BY lastname; +-----------+ | lastname | +-----------+ | Bondur | | Bondur | | Bott | | Bow | | Castillo | | Firrelli | | Firrelli | | Fixter | | Gerard | | Hernandez | | Jennings | | Jones | | Kato | | King | | Marsh | | Murphy | | Nishi | | Patterson | | Patterson | | Patterson | | Thompson | | Tseng | | Vanauf | +-----------+ 23 rows in set
可看到上面结果中,有好些结果是重复的,比如:Bondur,Firrelli等,那如何做到相同的结果只显示一个呢?要删除重复的姓氏,请将DISTINCT子句添加到SELECT语句中,如下所示:
SELECT DISTINCT lastname FROM employees ORDER BY lastname;
执行上面查询,得到以下输出结果 -
mysql> SELECT DISTINCT lastname FROM employees ORDER BY lastname; +-----------+ | lastname | +-----------+ | Bondur | | Bott | | Bow | | Castillo | | Firrelli | | Fixter | | Gerard | | Hernandez | | Jennings | | Jones | | Kato | | King | | Marsh | | Murphy | | Nishi | | Patterson | | Thompson | | Tseng | | Vanauf | +-----------+ 19 rows in set
当使用DISTINCT子句时,重复的姓氏(lastname)在结果集中被消除。
如果列具有NULL值,并且对该列使用DISTINCT子句,MySQL将保留一个NULL值,并删除其它的NULL值,因为DISTINCT子句将所有NULL值视为相同的值。
例如,在customers表中,有很多行的州(state)列是NULL值。 当使用DISTINCT子句来查询客户所在的州时,我们将看到唯一的州和NULL值,如下查询所示:
SELECT DISTINCT state FROM customers;
执行上面查询语句后,输出结果如下 -
mysql> SELECT DISTINCT state FROM customers; +---------------+ | state | +---------------+ | NULL | | NV | | Victoria | | CA | | NY | | PA | | CT | | MA | | Osaka | | BC | | Qubec | | Isle of Wight | | NSW | | NJ | | Queensland | | Co. Cork | | Pretoria | | NH | | Tokyo | +---------------+ 19 rows in set
可以使用具有多个列的DISTINCT子句。 在这种情况下,MySQL使用所有列的组合来确定结果集中行的唯一性。
例如,要从customers表中获取城市(city)和州(state)的唯一组合,可以使用以下查询:
SELECT DISTINCT state, city FROM customers WHERE state IS NOT NULL ORDER BY state , city;
SQL
执行上面查询,得到以下结果 -
mysql> SELECT DISTINCT state, city FROM customers WHERE state IS NOT NULL ORDER BY state ,city; +---------------+----------------+ | state | city | +---------------+----------------+ | BC | Tsawassen | | BC | Vancouver | | CA | Brisbane | | CA | Burbank | | CA | Burlingame | | CA | Glendale | | CA | Los Angeles | | CA | Pasadena | | CA | San Diego | | CA | San Francisco | | CA | San Jose | | CA | San Rafael | | Co. Cork | Cork | | CT | Bridgewater | | CT | Glendale | | CT | New Haven | | Isle of Wight | Cowes | | MA | Boston | | MA | Brickhaven | | MA | Cambridge | | MA | New Bedford | | NH | Nashua | | NJ | Newark | | NSW | Chatswood | | NSW | North Sydney | | NV | Las Vegas | | NY | NYC | | NY | White Plains | | Osaka | Kita-ku | | PA | Allentown | | PA | Philadelphia | | Pretoria | Hatfield | | Qubec | Montral | | Queensland | South Brisbane | | Tokyo | Minato-ku | | Victoria | Glen Waverly | | Victoria | Melbourne | +---------------+----------------+ 37 rows in set
Shell
没有DISTINCT子句,将查询获得州(state)和城市(city)的重复组合如下:
SELECT state, city FROM customers WHERE state IS NOT NULL ORDER BY state , city;
SQL
执行上面查询,得到以下结果 -
如果在SELECT语句中使用GROUP BY子句,而不使用聚合函数,则GROUP BY子句的行为与DISTINCT子句类似。
以下语句使用GROUP BY子句来选择customers表中客户的唯一state列的值。
SELECT state FROM customers GROUP BY state;
SQL
执行上面查询,得到以下结果 -
mysql> SELECT state FROM customers GROUP BY state; +---------------+ | state | +---------------+ | NULL | | BC | | CA | | Co. Cork | | CT | | Isle of Wight | | MA | | NH | | NJ | | NSW | | NV | | NY | | Osaka | | PA | | Pretoria | | Qubec | | Queensland | | Tokyo | | Victoria | +---------------+ 19 rows in set
Shell
可以通过使用DISTINCT子句来实现类似的结果:
mysql> SELECT DISTINCT state FROM customers; +---------------+ | state | +---------------+ | NULL | | NV | | Victoria | | CA | | NY | | PA | | CT | | MA | | Osaka | | BC | | Qubec | | Isle of Wight | | NSW | | NJ | | Queensland | | Co. Cork | | Pretoria | | NH | | Tokyo | +---------------+ 19 rows in set
SQL
一般而言,DISTINCT子句是GROUP BY子句的特殊情况。 DISTINCT子句和GROUP BY子句之间的区别是GROUP BY子句可对结果集进行排序,而DISTINCT子句不进行排序。
如果将ORDER BY子句添加到使用DISTINCT子句的语句中,则结果集将被排序,并且与使用GROUP BY子句的语句返回的结果集相同。
SELECT DISTINCT state FROM customers ORDER BY state;
执行上面查询,得到以下结果 -
mysql> SELECT DISTINCT state FROM customers ORDER BY state; +---------------+ | state | +---------------+ | NULL | | BC | | CA | | Co. Cork | | CT | | Isle of Wight | | MA | | NH | | NJ | | NSW | | NV | | NY | | Osaka | | PA | | Pretoria | | Qubec | | Queensland | | Tokyo | | Victoria | +---------------+ 19 rows in set
可以使用具有聚合函数(例如SUM,AVG和COUNT)的DISTINCT子句中,在MySQL将聚合函数应用于结果集之前删除重复的行。
例如,要计算美国客户的唯一state列的值,可以使用以下查询:
SELECT COUNT(DISTINCT state) FROM customers WHERE country = 'USA';
执行上面查询,得到以下结果 -
mysql> SELECT COUNT(DISTINCT state) FROM customers WHERE country = 'USA'; +-----------------------+ | COUNT(DISTINCT state) | +-----------------------+ | 8 | +-----------------------+ 1 row in set
如果要将DISTINCT子句与LIMIT子句一起使用,MySQL会在查找LIMIT子句中指定的唯一行数时立即停止搜索。
以下查询customers表中的前3个非空(NOT NULL)唯一state列的值。
mysql> SELECT DISTINCT state FROM customers WHERE state IS NOT NULL LIMIT 3; +----------+ | state | +----------+ | NV | | Victoria | | CA | +----------+ 3 rows in set
在本教程中,我们学习了使用MySQL DISTINCT子句的各种方法,例如消除重复行和计数非NULL值。