再检查执行计划:子查询物化后(select_type=DERIVED)参与 JOIN。虽然估算行扫描仍然为90万,但是利用了索引以及 LIMIT 子句后,实际执行时间变得很小。
- +----+-------------+------------+--------+---------------+---------+---------+-------+--------+----------------------------------------------------+
- | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
- +----+-------------+------------+--------+---------------+---------+---------+-------+--------+----------------------------------------------------+
- | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 15 | Using temporary; Using filesort |
- | 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | o.uid | 1 | NULL |
- | 1 | PRIMARY | p | ALL | PRIMARY | NULL | NULL | NULL | 6 | Using where; Using join buffer (Block Nested Loop) |
- | 2 | DERIVED | o | index | NULL | idx_1 | 5 | NULL | 909112 | Using where |
- +----+-------------+------------+--------+---------------+---------+---------+-------+--------+----------------------------------------------------+
8、中间结果集下推
再来看下面这个已经初步优化过的例子(左连接中的主表优先作用查询条件):
- SELECT a.*,
- c.allocated
- FROM (
- SELECT resourceid
- FROM my_distribute d
- WHERE isdelete = 0
- AND cusmanagercode = '1234567'
- ORDER BY salecode limit 20) a
- LEFT JOIN
- (
- SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
- FROM my_resources
- GROUP BY resourcesid) c
- ON a.resourceid = c.resourcesid
那么该语句还存在其它问题吗?不难看出子查询 c 是全表聚合查询,在表数量特别大的情况下会导致整个语句的性能下降。
其实对于子查询 c,左连接最后结果集只关心能和主表 resourceid 能匹配的数据。因此我们可以重写语句如下,执行时间从原来的2秒下降到2毫秒。
- SELECT a.*,
- c.allocated
- FROM (
- SELECT resourceid
- FROM my_distribute d
- WHERE isdelete = 0
- AND cusmanagercode = '1234567'
- ORDER BY salecode limit 20) a
- LEFT JOIN
- (
- SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
- FROM my_resources r,
- (
- SELECT resourceid
- FROM my_distribute d
- WHERE isdelete = 0
- AND cusmanagercode = '1234567'
- ORDER BY salecode limit 20) a
- WHERE r.resourcesid = a.resourcesid
- GROUP BY resourcesid) c
- ON a.resourceid = c.resourcesid
但是子查询 a 在我们的SQL语句中出现了多次。这种写法不仅存在额外的开销,还使得整个语句显的繁杂。使用 WITH 语句再次重写:
- WITH a AS
- (
- SELECT resourceid
- FROM my_distribute d
- WHERE isdelete = 0
- AND cusmanagercode = '1234567'
- ORDER BY salecode limit 20)
- SELECT a.*,
- c.allocated
- FROM a
- LEFT JOIN
- (
- SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
- FROM my_resources r,
- a
- WHERE r.resourcesid = a.resourcesid
- GROUP BY resourcesid) c
- ON a.resourceid = c.resourcesid
总结
数据库编译器产生执行计划,决定着SQL的实际执行方式。但是编译器只是尽力服务,所有数据库的编译器都不是尽善尽美的。
上述提到的多数场景,在其它数据库中也存在性能问题。了解数据库编译器的特性,才能避规其短处,写出高性能的SQL语句。
程序员在设计数据模型以及编写SQL语句时,要把算法的思想或意识带进来。
编写复杂SQL语句要养成使用 WITH 语句的习惯。简洁且思路清晰的SQL语句也能减小数据库的负担 。 (编辑:晋中站长网)
【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容!
|