show create table sbtest\G *************************** 1. row *************************** Table: sbtest Create Table: CREATE TABLE `sbtest` ( `aid` bigint(20) unsigned NOT NULL auto_increment, `id` int(10) unsigned NOT NULL default '0', `k` int(10) unsigned NOT NULL default '0', `c` char(120) NOT NULL default '', `pad` char(60) NOT NULL default '', PRIMARY KEY (`aid`), KEY `k` (`k`), KEY `id` (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=latin1 show index from sbtest; +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | --phpfensi.com +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | sbtest | 0 | PRIMARY | 1 | aid | A | 1000099 | NULL | NULL | | BTREE | | | sbtest | 1 | k | 1 | k | A | 18 | NULL | NULL | | BTREE | | | sbtest | 1 | id | 1 | id | A | 1000099 | NULL | NULL | | BTREE | | +--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 填充了 100萬條 記錄.
1、直接 count(*)
explain SELECT COUNT(*) FROM sbtest; +----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+ | 1 | SIMPLE | sbtest | index | NULL | PRIMARY | 8 | NULL | 1000099 | Using index | +----+-------------+--------+-------+---------------+---------+---------+------+---------+-------------+ SELECT COUNT(*) FROM sbtest; +----------+ | COUNT(*) | +----------+ | 1000000 | +----------+ 1 row in set (1.42 sec) 可以看到,如果不加任何條件,那么優化器優先采用 primary key 來進行掃描.
2、count(*) 使用 primary key 字段做條件.
explain SELECT COUNT(*) FROM sbtest WHERE aid>=0; +----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+ | 1 | SIMPLE | sbtest | range | PRIMARY | PRIMARY | 8 | NULL | 485600 | Using where; Using index | +----+-------------+--------+-------+---------------+---------+---------+------+--------+--------------------------+ SELECT COUNT(*) FROM sbtest WHERE aid>=0; +----------+ | COUNT(*) | +----------+ | 1000000 | +----------+ 1 row in set (1.39 sec) 可以看到,盡管優化器認為只需要掃描 485600 條記錄(其實是索引),比剛才少多了,但其實仍然要做全表(索引)掃描,因此耗時和第一種相當.
3、count(*) 使用 secondary index 字段做條件
explain SELECT COUNT(*) FROM sbtest WHERE id>=0; +----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+ | 1 | SIMPLE | sbtest | range | id | id | 4 | NULL | 500049 | Using where; Using index | +----+-------------+--------+-------+---------------+------+---------+------+--------+--------------------------+ SELECT COUNT(*) FROM sbtest WHERE id>=0; +----------+ | COUNT(*) | +----------+ | 1000000 | +----------+ 1 row in set (0.43 sec) 可以看到,采用這種方式查詢會非常快,有人也許會問了,會不會是因為 id 字段的長度比 aid 字段的長度來的小,導致它掃描起來比較快呢?先不著急下結論,咱們來看看下面的測試例子.
二、sbtest1 表上的測試
show create table sbtest1\G *************************** 1. row *************************** Table: sbtest1 Create Table: CREATE TABLE `sbtest1` ( `aid` int(10) unsigned NOT NULL AUTO_INCREMENT, `id` bigint(20) unsigned NOT NULL DEFAULT '0', `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) NOT NULL DEFAULT '', `pad` char(60) NOT NULL DEFAULT '', PRIMARY KEY (`aid`), KEY `k` (`k`), KEY `id` (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=latin1 show index from sbtest1; +---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | sbtest1 | 0 | PRIMARY | 1 | aid | A | 1000099 | NULL | NULL | | BTREE | | | sbtest1 | 1 | k | 1 | k | A | 18 | NULL | NULL | | BTREE | | | sbtest1 | 1 | id | 1 | id | A | 1000099 | NULL | NULL | | BTREE | | +---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 這個表里,把 aid 和 id 的字段長度調換了一下,也填充了 1000萬條記錄.
1、直接 count(*).
explain SELECT COUNT(*) FROM sbtest1; +----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+ | 1 | SIMPLE | sbtest1 | index | NULL | PRIMARY | 4 | NULL | 1000099 | Using index | +----+-------------+---------+-------+---------------+---------+---------+------+---------+-------------+ SELECT COUNT(*) FROM sbtest1; +----------+ | COUNT(*) | +----------+ | 1000000 | +----------+ 1 row in set (1.42 sec) 可以看到,如果不加任何條件,那么優化器優先采用 primary key 來進行掃描.
2、count(*) 使用 primary key 字段做條件.
explain SELECT COUNT(*) FROM sbtest1 WHERE aid>=0; +----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+ | 1 | SIMPLE | sbtest1 | range | PRIMARY | PRIMARY | 4 | NULL | 316200 | Using where; Using index | +----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+ 1 row in set (0.00 sec) SELECT COUNT(*) FROM sbtest1 WHERE aid>=0; +----------+ | COUNT(*) | +----------+ | 1000000 | +----------+ 1 row in set (1.42 sec) 可以看到,盡管優化器認為只需要掃描 485600 條記錄(其實是索引),比剛才少多了,但其實仍然要做全表(索引)掃描,因此耗時和第一種相當.
3、count(*) 使用 secondary index 字段做條件.
explain SELECT COUNT(*) FROM sbtest1 WHERE id>=0; +----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ | 1 | SIMPLE | sbtest1 | range | id | id | 8 | NULL | 500049 | Using where; Using index | +----+-------------+---------+-------+---------------+------+---------+------+--------+--------------------------+ 1 row in set (0.00 sec) SELECT COUNT(*) FROM sbtest1 WHERE id>=0; +----------+ | COUNT(*) | +----------+ | 1000000 | +----------+ 1 row in set (0.45 sec) 可以看到,采用這種方式查詢會非常快,上面的所有測試,均在 mysql 5.1.24 環境下通過,并且每次查詢前都重啟了 mysqld.
可以看到,把 aid 和 id 的長度調換之后,采用 secondary index 查詢仍然是要比用 primary key 查詢來的快很多。看來主要不是字段長度引起的索引掃描快慢,而是采用 primary key 以及 secondary index 引起的區別,那么,為什么用 secondary index 掃描反而比 primary key 掃描來的要快呢?我們就需要了解innodb的 clustered index 和secondary index 之間的區別了.
innodb 的 clustered index 是把 primary key 以及 row data 保存在一起的,而 secondary index 則是單獨存放,然后有個指針指向 primary key,因此,需要進行 count(*) 統計表記錄總數時,利用 secondary index 掃描起來,顯然更快,而primary key則主要在掃描索引,同時要返回結果記錄時的作用較大,例如:
in the example table, the secondary index is inserted into in a perfect order! That is very unusual. Normally the secondary index would be fragmented, causing random disk I/O, and the scan would be slower than in the primary index. I am changing this to a feature request: keep 'clustering ratio' statistics on a secondary index and do the scan there if the order is almost the same as in the primary index. I doubt this feature will ever be implemented, though.