本文主要是對MariaDB中的EXPLAIN部分進行翻譯學習,以期能瞭解並掌握EXPLAIN的使用。

文中使用到的數據庫來自 Try to grab recruitment data from lagouGitHub

Syntax

EXPLAIN tbl_name

EXPLAIN [EXTENDED | PARTITIONS]
  {SELECT select_options | UPDATE update_options | DELETE delete_options}

使用命令help explain

MariaDB [(none)]> help explain
Name: 'EXPLAIN'
Description:
Syntax:
EXPLAIN [explain_type] SELECT select_options

explain_type:
    EXTENDED
  | PARTITIONS

Or:

EXPLAIN tbl_name

可以看到,命令help explain顯示的Syntax主要針對SELECT,而官網的Syntax則涵蓋SELECT, DELETE, UPDATE

Description

EXPLAIN語句既可用作DESCRIBE的同義詞,也可作爲一種獲取MariaDB是如何執行SELECT(從MariaDB 10.0.5開始支持UPDATEDELETE)的信息的方式。

MariaDB [(none)]> explain lagou.city;
+-------------+------------------+------+-----+-------------------+----------------+
| Field       | Type             | Null | Key | Default           | Extra          |
+-------------+------------------+------+-----+-------------------+----------------+
| id          | int(10) unsigned | NO   | PRI | NULL              | auto_increment |
| name        | char(20)         | NO   | MUL | NULL              |                |
| create_time | timestamp        | NO   |     | CURRENT_TIMESTAMP |                |
+-------------+------------------+------+-----+-------------------+----------------+
3 rows in set (0.00 sec)

MariaDB [(none)]> desc lagou.city;
+-------------+------------------+------+-----+-------------------+----------------+
| Field       | Type             | Null | Key | Default           | Extra          |
+-------------+------------------+------+-----+-------------------+----------------+
| id          | int(10) unsigned | NO   | PRI | NULL              | auto_increment |
| name        | char(20)         | NO   | MUL | NULL              |                |
| create_time | timestamp        | NO   |     | CURRENT_TIMESTAMP |                |
+-------------+------------------+------+-----+-------------------+----------------+
3 rows in set (0.00 sec)

MariaDB [(none)]> show columns from lagou.city;
+-------------+------------------+------+-----+-------------------+----------------+
| Field       | Type             | Null | Key | Default           | Extra          |
+-------------+------------------+------+-----+-------------------+----------------+
| id          | int(10) unsigned | NO   | PRI | NULL              | auto_increment |
| name        | char(20)         | NO   | MUL | NULL              |                |
| create_time | timestamp        | NO   |     | CURRENT_TIMESTAMP |                |
+-------------+------------------+------+-----+-------------------+----------------+
3 rows in set (0.00 sec)

MariaDB [(none)]>
  • 當在SELECT語句(MariaDB 10.0.5開始支持UPDATEDELETE)之前使用關鍵詞EXPLAIN,MariaDB顯示來自優化器(optimizer)的關於查詢執行計劃(query execution plan)的信息。即MariaDB解釋它將如何處理(process)SELECT, UPDATE, DELETE,包括關於表是如何組合(join),以何順序(in which order)的信息。 EXPLAIN EXTENDED可用於提供額外信息
MariaDB [(none)]> explain select count(*) from lagou.city;
+------+-------------+-------+-------+---------------+--------------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key          | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+--------------+---------+------+------+-------------+
|    1 | SIMPLE      | city  | index | NULL          | indcity_name | 60      | NULL |    5 | Using index |
+------+-------------+-------+-------+---------------+--------------+---------+------+------+-------------+
1 row in set (0.00 sec)

MariaDB [(none)]> explain extended select count(*) from lagou.city;
+------+-------------+-------+-------+---------------+--------------+---------+------+------+----------+-------------+
| id   | select_type | table | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra       |
+------+-------------+-------+-------+---------------+--------------+---------+------+------+----------+-------------+
|    1 | SIMPLE      | city  | index | NULL          | indcity_name | 60      | NULL |    5 |   100.00 | Using index |
+------+-------------+-------+-------+---------------+--------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

MariaDB [(none)]> show warnings;
+-------+------+---------------------------------------------------+
| Level | Code | Message                                           |
+-------+------+---------------------------------------------------+
| Note  | 1003 | select count(0) AS `count(*)` from `lagou`.`city` |
+-------+------+---------------------------------------------------+
1 row in set (0.00 sec)

MariaDB [(none)]>

可以看到使用EXPLAIN EXTENDED後,列中多了filtered列,執行結果多了一個1 warning,通過命令show warnings可看到具體的warning信息。

  • EXPLAIN PARTITIONS自 MySQL 5.1.5起可用。只有在測試(exam)包含分區表(partitioned tables)的查詢時有才有用。具體細節,查看 Partition pruning and selection

  • ANALYZE statement,執行時也會生成(produce) 形如 EXPLAIN的輸出,同時提供真實以及預估的統計數據,(該功能)從 MariaDB 10.1.0開始支持。

Compared to EXPLAIN, ANALYZE produces two extra columns:

r_rows is an observation-based counterpart of the rows column. It shows how many rows were actually read from the table.

r_filtered is an observation-based counterpart of the filtered column. It shows which fraction of rows was left after applying the WHERE condition. — ANALYZE statement

在配置文件中添加參數

[mysqld]
log-slow-verbosity=query_plan,explain

MariaDB 10.0 起,SHOW EXPLAIN 顯示一個運行中的語句(running statement)的輸出。在一些場景中,其輸出比EXPLAIN更接近真實情況。

MariaDB 10.1ANALYZE statement運行語句,返回關於執行計劃的信息。同時顯示增加列,檢查優化器估算的,接近實際情況的 過濾找到 的行數。

It also shows additional columns, to check how much the optimizer’s estimation about filtering and found rows are close to reality.

此處有在線的 EXPLAIN Analyzer,可用於與其他人分享EXPLAINEXPLAIN EXTENDED的輸出信息。

EXPLAIN can acquire metadata locks in the same way that SELECT does, as it needs to know table metadata and, sometimes, data as well.

The columns in EXPLAIN … SELECT

Column name Description
id 顯示連接哪些順序表的序列號; Sequence number that shows in which order tables are joined.
select_type 數據表表來自於哪種SELECT類型; What kind of SELECT the table comes from.
table 數據表別名,用於子查詢的實例化的臨時表命名格式是<subquery#>; Alias name of table. Materialized temporary tables for sub queries are named
type 行數據在數據表中以何種形式找到(join類型);How rows are found from the table (join type).
possible_keys 數據表中可用於找到(符合條件的)行數據的key;keys in table that could be used to find rows in the table
key 可用於獲取(符合條件的)行數據的key名,NULL意爲沒有key被使用;The name of the key that is used to retrieve rows. NULL is no key was used.
key_len key有多少字節被使用(顯示我們是否值使用多列key中的部分);How many bytes of the key that was used (shows if we are using only parts of the multi-column key).
ref 用於作爲key值的reference;The reference that is used to as the key value.
rows 預估將爲每個key查詢,從數據表中找出多少行數據;An estimate of how many rows we will find in the table for each key lookup.
Extra 關於這此join的額外信息;Extra information about this join.

此處有一些描述,關於EXPLAIN ... SELECT中的更複雜的列的值。

“select_type” column

select_type可有以下值

Value Description
PRIMARY 類型是PRIMARY;The SELECT is a PRIMARY one.
SIMPLE 類型是SIMPLE;The SELECT is a SIMPLE one.
DERIVED 源自於PRIMARY;The SELECT is DERIVED from the PRIMARY.
SUBQUERY 是PRIMARY的子查詢;The SELECT is a SUBQUERY of the PRIMARY.
DEPENDENT SUBQUERY 子查詢是依賴的;The SUBQUERY is DEPENDENT.
UNCACHEABLE SUBQUERY 子查詢是不可緩存的;The SUBQUERY is UNCACHEABLE.
UNION 是PRIMARY的聯合(UNION);The SELECT is a UNION of the PRIMARY.
UNION RESULT 聯合(UNION)的結果;The result of the UNION.
DEPENDENT UNION 聯合(UNION)是依賴的;The UNION is DEPENDENT.
UNCACHEABLE UNION 聯合(UNION)是不可緩存的;The UNION is UNCACHEABLE.

“Type” column

此列包含數據表是如何被訪問的信息 This column contains information on how the table is accessed.

Value Description
ALL 對數據表進行全表掃描(讀取所有行),如果數據表很大或數據表與之前的表聯表(join),這是 很糟糕 的!當優化器無法找到任何可用於訪問數據行的索引時會發生(這種情況);A full table scan is done for the table (all rows are read). This is bad if the table is large and the table is joined against a previous table! This happens when the optimizer could not find any usable index to access rows.
const 數據表中只有一種可能的匹配數據行,該數據行在優化階段之前被讀取,表中所有列都被視作常數;There is only one possibly matching row in the table. The row is read before the optimization phase and all columns in the table are treated as constants.
eq_ref 惟一索引,用於查找數據行,是最好的查找數據行的可能計劃;A unique index is used to find the rows. This is the best possible plan to find the row.
fulltext 全文索引,用於訪問數據行;A fulltext index is used to access the rows.
index_merge 多個索引的範圍訪問,找到的數據行被合併,列key顯示被使用的key;A ‘range’ access is done for for several index and the found rows are merged. The key column shows which keys are used.
index_subquery ref類似,但用於變換爲key查找的子查詢;This is similar as ref, but used for sub queries that are transformed to key lookups.
index 對使用到的索引進行全索引掃描,比ALL要好,但如果索引很大或數據表與之前的表聯表(join),情況仍很糟糕;A full scan over the used index. Better than ALL but still bad if index is large and the table is joined against a previous table.
range 數據表將被訪問,在一個或多個值範圍的key中; The table will be accessed with a key over one or more value ranges.
ref_or_null ref,除此之外,如果第一個值未被找到,其它用於搜索null的值也會生效(?done),通常發生於子查詢;Like ‘ref’ but in addition another search for the ‘null’ value is done if the first value was not found. This happens usually with sub queries.
ref 用於查找數據行的非惟一索引或惟一索引的前綴,如果前綴沒有匹配過多數據行,可以認爲Good;A non unique index or prefix of an unique index is used to find the rows. Good if the prefix doesn’t match many rows.
system 表有0行或1行數據;The table has 0 or 1 rows.
unique_subquery eq_ref類似,但用於變換爲key查找的子查詢;This is similar as eq_ref, but used for sub queries that are transformed to key lookups

“Extra” column

此列包含一個或多個值,用分號;分隔

注意 其中的一些值是在優化階段(optimization phase)後被偵測到的。

優化階段(optimization phase)可爲WHERE子句(clause)做如下改變: * 從ONUSING字句添加表達式到WHERE字句; * 常數傳播(Constant propagation):如果有column=constant,使用該常數替換所有列實例(column instance); * 用const表中的值替換所有列 (Replace all columns from ‘const’ tables with their values.); * 從WHERE移除使用過的key列(key column), (將被作爲key查找的部分來測試); * 移除不可能的恆等子表達式(constant sub expression),例如WHERE '(a=1 and a=2) OR b=1'變成b=1; * 替換列爲其它含有相關值的列,如WHERE a=b a=c可能被視爲WHERE a=b and a=c and b=c * 添加額外的條件,以期更早的發現(detect)不可能的行條件。主要發生在OUTER JOIN,在某些場景中,在WHRER中添加偵測值NULLNot exists優化的部分)。可能會在Extra列中造成不期望的Using where; * 對每個table level,當熱區之前的行數據時,移除已經被測試過的表達式。例如,使用WHERE 't1.a=1 and t1.a=t2.b'對表t1t2進行聯接(join),當檢查表t2中的行數據時,我們不需要測試t1.a=1,因爲我們已經知道這個表達式是成立的(true);

Value Description
const row not found The table was a system table (a table with should exactly one row), but no row was found.
Distinct If distinct optimization (remove duplicates) was used. This is marked only for the last table in the SELECT.
Full scan on NULL key The table is a part of the sub query and if the value that is used to match the sub query will be NULL, we will do a full table scan.
Impossible HAVING The used HAVING clause is always false so the SELECT will return no rows.
Impossible WHERE noticed after reading const tables. The used WHERE clause is always false so the SELECT will return no rows. This case was detected after we had read all ‘const’ tables and used the column values as constant in the WHERE clause. For example: WHERE const_column=5 and const_column had a value of 4.
Impossible WHERE The used WHERE clause is always false so the SELECT will return no rows. For example: WHERE 1=2
No matching min/max row During early optimization of MIN()/MAX() values it was detected that no row could match the WHERE clause. The MIN()/MAX() function will return NULL.
no matching row in const table The table was a const table (a table with only one possible matching row), but no row was found.
No tables used The SELECT was a sub query that did not use any tables. For example a there was no FROM clause or a FROM DUAL clause.
Not exists Stop searching after more row if we find one single matching row. This optimization is used with LEFT JOIN where one is explicitly searching for rows that doesn’t exists in the LEFT JOIN TABLE. Example: SELECT * FROM t1 LEFT JOIN t2 on (…) WHERE t2.not_null_column IS NULL. As t2.not_null_column can only be NULL if there was no matching row for on condition, we can stop searching if we find a single matching row.
Open_frm_only For INFORMATION_SCHEMA tables. Only the frm (table definition file was opened) was opened for each matching row.
Open_full_table For INFORMATION_SCHEMA tables. A full table open for each matching row is done to retrieve the requested information. (Slow)
Open_trigger_only For INFORMATION_SCHEMA tables. Only the trigger file definition was opened for each matching row.
Range checked for each record (index map: ...) This only happens when there was no good default index to use but there may some index that could be used when we can treat all columns from previous table as constants. For each row combination the optimizer will decide which index to use (if any) to fetch a row from this table. This is not fast, but faster than a full table scan that is the only other choice. The index map is a bitmask that shows which index are considered for each row condition.
Scanned 0/1/all databases For INFORMATION_SCHEMA tables. Shows how many times we had to do a directory scan.
Select tables optimized away All tables in the join was optimized away. This happens when we are only using COUNT(*), MIN() and MAX() functions in the SELECT and we where able to replace all of these with constants.
Skip_open_table For INFORMATION_SCHEMA tables. The queried table didn’t need to be opened.
unique row not found The table was detected to be a const table (a table with only one possible matching row) during the early optimization phase, but no row was found.
Using filesort Filesort is needed to resolve the query. This means an extra phase where we first collect all columns to sort, sort them with a disk based merge sort and then use the sorted set to retrieve the rows in sorted order. If the column set is small, we store all the columns in the sort file to not have to go to the database to retrieve them again.
Using index Only the index is used to retrieve the needed information from the table. There is no need to perform an extra seek to retrieve the actual record.
Using index condition Like ‘Using where’ but the where condition is pushed down to the table engine for internal optimization at the index level.
Using index condition(BKA) Like ‘Using index condition’ but in addition we use batch key access to retrieve rows.
Using index for group-by The index is being used to resolve a GROUP BY or DISTINCT query. The rows are not read. This is very efficient if the table has a lot of identical index entries as duplicates are quickly jumped over.
Using intersect(...) For index_merge joins. Shows which index are part of the intersect.
Using join buffer We store previous row combinations in a row buffer to be able to match each row against all of the rows combinations in the join buffer at one go.
Using sort_union(...) For index_merge joins. Shows which index are part of the union.
Using temporary A temporary table is created to hold the result. This typically happens if you are using GROUP BY, DISTINCT or ORDER BY.
Using where A WHERE expression (in additional to the possible key lookup) is used to check if the row should be accepted. If you don’t have ‘Using where’ together with a join type of ALL, you are probably doing something wrong!
Using where with pushed condition Like ‘Using where’ but the where condition is pushed down to the table engine for internal optimization at the row level.
Using buffer The UPDATE statement will first buffer the rows, and then run the updates, rather than do updates on the fly. See Using Buffer UPDATE Algorithm for a detailed explanation.

EXPLAIN EXTENDED

EXTENDED關鍵詞在輸出中添加了名爲filtered的列,是表中數據行將被條件過濾的估算百分比。

The EXTENDED keyword adds another column, filtered, to the output. This is a percentage estimate of the table rows that will be filtered by the condition.

EXPLAIN EXTENDED通常會拋出一條warning信息,添加額外的消息信息到隨後的SHOW WARNINGS語句中。包含在優化和重寫規則被採用後,SELECT query將看起來像什麼,優化器如何描述列和表。

An EXPLAIN EXTENDED will always throw a warning, as it adds extra Message information to a subsequent SHOW WARNINGS statement. This includes what the SELECT query would look like after optimizing and rewriting rules are applied and how the optimizer qualifies columns and tables.

Examples

DESCRIBESHOW COLUMNS FROM的同義詞。

DESCRIBE city;
+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| Id         | int(11)  | NO   | PRI | NULL    | auto_increment |
| Name       | char(35) | YES  |     | NULL    |                |
| Country    | char(3)  | NO   | UNI |         |                |
| District   | char(20) | YES  | MUL |         |                |
| Population | int(11)  | YES  |     | NULL    |                |
+------------+----------+------+-----+---------+----------------+

一組簡單的例子,看EXPLAIN如何識別差(poor)的索引使用:

A simple set of examples to see how EXPLAIN can identify poor index usage:

CREATE TABLE IF NOT EXISTS `employees_example` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `first_name` varchar(30) NOT NULL,
  `last_name` varchar(40) NOT NULL,
  `position` varchar(25) NOT NULL,
  `home_address` varchar(50) NOT NULL,
  `home_phone` varchar(12) NOT NULL,
  `employee_code` varchar(25) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `employee_code` (`employee_code`),
  KEY `first_name` (`first_name`,`last_name`)
) ENGINE=Aria;

INSERT INTO `employees_example` (`first_name`, `last_name`, `position`, `home_address`, `home_phone`, `employee_code`)
  VALUES
  ('Mustapha', 'Mond', 'Chief Executive Officer', '692 Promiscuous Plaza', '326-555-3492', 'MM1'),
  ('Henry', 'Foster', 'Store Manager', '314 Savage Circle', '326-555-3847', 'HF1'),
  ('Bernard', 'Marx', 'Cashier', '1240 Ambient Avenue', '326-555-8456', 'BM1'),
  ('Lenina', 'Crowne', 'Cashier', '281 Bumblepuppy Boulevard', '328-555-2349', 'LC1'),
  ('Fanny', 'Crowne', 'Restocker', '1023 Bokanovsky Lane', '326-555-6329', 'FC1'),
  ('Helmholtz', 'Watson', 'Janitor', '944 Soma Court', '329-555-2478', 'HW1');

SHOW INDEXES FROM employees_example\G
*************************** 1. row ***************************
        Table: employees_example
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id
    Collation: A
  Cardinality: 6
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
*************************** 2. row ***************************
        Table: employees_example
   Non_unique: 0
     Key_name: employee_code
 Seq_in_index: 1
  Column_name: employee_code
    Collation: A
  Cardinality: 6
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
*************************** 3. row ***************************
        Table: employees_example
   Non_unique: 1
     Key_name: first_name
 Seq_in_index: 1
  Column_name: first_name
    Collation: A
  Cardinality: NULL
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
*************************** 4. row ***************************
        Table: employees_example
   Non_unique: 1
     Key_name: first_name
 Seq_in_index: 2
  Column_name: last_name
    Collation: A
  Cardinality: NULL
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
4 rows in set (0.00 sec)

SELECTING on a primary key:

EXPLAIN SELECT * FROM employees_example WHERE id=1\G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees_example
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra:

type值是const,意味着只有一行可能的結果被返回。現在,使用home_phone字段來返回相同的數據:

The type is const, which means that only one possible result could be returned. Now, returning the same record but searching by their phone number:

EXPLAIN SELECT * FROM employees_example WHERE home_phone='326-555-3492'\G

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees_example
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6
        Extra: Using where

此處type值是All,意味着沒有索引被使用到。看rows的計數,爲了返回記錄進行了全表掃描(所有的數據6行)。如果需要通過 phone number 來搜索,需要在字段home_phone上建立索引。

Here, the type is All, which means no index could be used. Looking at the rows count, a full table scan (all six rows) had to be performed in order to retrieve the record. If it’s a requirement to search by phone number, an index will have to be created.

SHOW EXPLAIN示例

SHOW EXPLAIN FOR 1;
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
|    1 | SIMPLE      | tbl   | index | NULL          | a    | 5       | NULL | 1000107 | Using index |
+------+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
1 row in set, 1 warning (0.00 sec)

Example of ref_or_null optimization

SELECT * FROM table_name
  WHERE key_column=expr OR key_column IS NULL;

ref_or_null通常發生在使用NOT IN的子查詢時,如果第一個值沒有匹配到行數據,另外一個將會檢查NULL值。

ref_or_null is something that often happens when you use subqueries with NOT IN as then one has to do an extra check for NULL values if the first value didn’t have a matching row.

See also

References

Change Logs

  • 2016.03.18 14:50 Fri Asia/Beijing
    • 初稿完成

  • 2016.03.18 14:50 Fri
  • Note Location: Asia/Beijing
  • Writer: lempstacker