MYSQL 阿里的一個sql優(yōu)化疑問

2024-07-24 12:34:55

字體：大中小

來源：轉載

供稿：網友

　　今天研究了一天innodb事物，臨近下班了同事田興春告訴我有個阿里發(fā)出來的優(yōu)化題，并且把建表和語句給我了，并且告訴我語句里面有隱式轉換
　　昨天群里也有人說這道題但是一直沒空看，剛好這會沒事就看了一下，整個腳本如下：

　　點擊(此處)折疊或打開

　　數據準備：

　　create table a (id int auto_increment,seller_id bigint,seller_name varchar(100) collate utf8_bin ,gmt_create varchar(30),primary key(id));

　　insert into a (seller_id,seller_name,gmt_create) values (100000,'uniqla','2017-01-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100001,'uniqlb','2017-02-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100002,'uniqlc','2017-03-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100003,'uniqld','2017-04-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100004,'uniqle','2017-05-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100005,'uniqlf','2017-06-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100006,'uniqlg','2017-07-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100007,'uniqlh','2017-08-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100008,'uniqli','2017-09-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100009,'uniqlj','2017-10-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100010,'uniqlk','2017-11-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100011,'uniqll','2017-12-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100012,'uniqlm','2018-01-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100013,'uniqln','2018-02-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100014,'uniqlo','2018-03-01');
　　insert into a (seller_id,seller_name,gmt_create) values (100015,'uniqlp','2018-04-01');

　　待優(yōu)化SQL：
　　select a.seller_id,a.seller_name,b.user_name,c.state
　　from a,b,c
　　where a.seller_name=b.seller_name
　　and b.user_id=c.user_id
　　and c.user_id=17
　　and a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL - 600 MINUTE)
　　AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)
　　order by a.gmt_create
　　先說明這個優(yōu)化題目主要考察下面5點：
　　1、BNL和NJL的區(qū)別
　　2、NJL的實現
　　3、DBA對于數據分布的觀察
　　4、隱式轉換索引不能使用
　　5、比較字符集不同索引不能使用

　　一、我們先來分別描述
　　1、BNL和NJL的區(qū)別
　　這個區(qū)別參考我的文章
　　http://blog.itpub.net/7728585/viewspace-2129502/
　　(從順序隨機I/O原理來討論MYSQL MRR NLJ BNL BKA )
　　簡單的說BNL一般用于TYPE=INDEX以及TYPE=ALL的情況，因為被驅動表連接條件沒有索引，而需要join buffer 將驅動表中待連接的
　　數據取出來(物理/邏輯讀取)，放到join buffer，主要目的在于減少被驅動表的驅動次數，從而提高效率，因為沒有索引的情況
　　被驅動表掃描一次實在太慢了，這里的B就是BLOCK的意思.
　　而NJL一般用于被驅動表連接條件有索引的情況，通過索引上的ref或者eq_ref(取決于索引是否唯一)就理所當然的快很多很多，這個時候join buffer是不會
　　使用的，它只需要讀取一條數據(物理/邏輯讀取)來驅動一次驅動表，因為驅動表連接條件有索引，自然就快了(索引定位回表)
　　2、NJL的實現
　　同樣可以參考上面的文章，上面也大概說了一下，就不在廢話了
　　3、DBA對于數據分布的觀察
　　這一點是人為可以達到的，簡單的說比如一個表有100條數據 99條為no=1 1條為no=2，那么我們
　　需要對這個有所警覺，如果這個表示用作驅動表那么no=2的時候效果要遠遠好于no=1。這道題也有
　　這個因素
　　明顯and c.user_id='17' 只有一條數據
　　4、隱式轉換索引不能使用
　　這個不管是MYSQL還是ORACLE都有的問題，
　　ORACLE會顯示給出來to_char(id)='1'之類的
　　MYSQL中會有如下類似的警告
　　| Warning | 1739 | Cannot use ref access on index 'user_id' due to type or collation conversion on field 'user_id'                                                                                                                                             |
　　| Warning | 1739 | Cannot use range access on index 'user_id' due to type or collation conversion on field 'user_id'

　　比如這里的
　　c.user_id=17
　　而
　　user_id 是varchar類型不是int類型
　　又比如這里的
　　 a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL - 600 MINUTE)
　　AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)
　　這里
　　gmt_create varchar(30) 居然也是varchar 擦??！
　　5、比較字符集不同索引使用異常
　　這個關于字符串的比較問題我已經在文章里面有所描述
　　http://blog.itpub.net/7728585/viewspace-2141914/
　　簡單的說這里
　　a.seller_name=b.seller_name
　　a.seller_name 比較字符集是utf8_bin 區(qū)分大小寫
　　而
　　b.seller_name 是不區(qū)分大小寫的這是默認的。
　　他們之間做join必然被驅動表用不到索引使用異常。(innodb 可以icp)
　　也會有類似如下的警告：
　　 Cannot use ref access on index 'seller_name' due to type or collation conversion on field 'seller_name'

　　二、優(yōu)化原則問題
　　我們知道基本所有的語句執(zhí)行算法邏輯都在MYSQL層次，INNODB只是負責將數據通過幾種方式
　　(PAGE_CUR_G，PAGE_CUR_GE，PAGE_CUR_L，PAGE_CUR_LE)掃描出來，遞送給MYSQL層次進行處理，這之間存在掃描拿到
　　innodb record-->innodb tuple-->mysql record的一個轉換的過程，這個步驟大部分被標記為sending data過程
　　(update/delete為updating)，那么我們就有必要減少中間結果集的產生，來減少整個從innodb拿數據到MYSQL層的
　　整個數據量。這里以NJL的優(yōu)化原則為列解釋，因為這道題就是這個目的

　　1、減少NJL驅動結果集的數據
　　這事顯而易見的，減少驅動次數自然就減少了數據在innodb和mysql之間的傳遞
　　2、被驅動表的索引唯一性要盡量好
　　這個問題稍微難理解一點，但是仔細想一下也沒什么，如果被驅動表索引唯一性更好，那么通過索引回表的次數就少了，
　　這里可以通過rows和filter進行大概判斷，大概是因為他們本來就不準。
　　曾經我們就有一個列子也是同事田興春和我一起看的。一個被驅動表有兩個鏈接條件，一個索引唯一性很差，而唯一性好的連接
　　列上沒有索引，我們在唯一性好的列上建立了索引性能馬上提升了。

（編輯：武林網）

上一篇：Linux下安裝MySQL后root密碼的設定

下一篇：Mysql導出文件及備份操作筆記