本文共 3697 字,大约阅读时间需要 12 分钟。
-- 当我们需要排序字段时,比如order by name,如果name字段定义时没有not null时,就有可能涉及到null值的排序-- 如果不注意,可能会造成隐藏的bug,pg默认null是无穷大,在升序时排在最后面,当然在排序时也可以指定 nulls first 或 nulls last-- 具体使用方法在此不在复述,本文主要是讲的是在创建索引时指定 nulls first 或 nulls last 对查询的影响-- 基础创建索引语法如下CREATE INDEX [ ASC | DESC ] [ NULLS { FIRST | LAST } ] -- 在升序或降序时的默认值-- NULLS FIRSTSpecifies that nulls sort before non-nulls. This is the default when DESC is specified.-- NULLS LASTSpecifies that nulls sort after non-nulls. This is the default when DESC is not specified.--创建测试表create table t as select n id ,'rudy'||n as name ,n||'password' as password,now() + (n||' second')::interval as create_date from generate_seies(1,1000000) n;postgres=# update t set name = null where mod(id,2)=0;UPDATE 500000-- 构造随机的null值,本次测试取对半postgres=# create table t1 as select * from t; SELECT 1000000postgres=# create index on t(name);CREATE INDEXpostgres=# create index on t1(name nulls first);CREATE INDEX-- 在不指定 null first 时,由于pg默认null无限大,故当排序为升序时,t表能够使用索引,t1表却是使用全表扫描postgres=# explain verbose select * from t where id>1000 order by name limit 10; QUERY PLAN ------------------------------------------------------------------------------------------- Limit (cost=0.42..1.26 rows=10 width=33) Output: id, name, password, create_date -> Index Scan using idx_t_name on public.t (cost=0.42..83387.28 rows=999027 width=33) Output: id, name, password, create_date Filter: (t.id > 1000)(5 rows)postgres=# explain verbose select * from t1 where id>1000 order by name limit 10; QUERY PLAN ------------------------------------------------------------------------------- Limit (cost=40949.92..40949.94 rows=10 width=33) Output: id, name, password, create_date -> Sort (cost=40949.92..43447.52 rows=999041 width=33) Output: id, name, password, create_date Sort Key: t1.name -> Seq Scan on public.t1 (cost=0.00..19361.00 rows=999041 width=33) Output: id, name, password, create_date Filter: (t1.id > 1000)(8 rows)-- 在指定 nulls first 时,由于t1表创建索引时null值放在最前面,而且排序字段为升序,所以t1表使用索引,t表使用全表扫描postgres=# explain verbose select * from t where id>1000 order by name nulls first limit 10; QUERY PLAN ------------------------------------------------------------------------------ Limit (cost=46095.61..46095.64 rows=10 width=33) Output: id, name, password, create_date -> Sort (cost=46095.61..48593.18 rows=999027 width=33) Output: id, name, password, create_date Sort Key: t.name NULLS FIRST -> Seq Scan on public.t (cost=0.00..24507.00 rows=999027 width=33) Output: id, name, password, create_date Filter: (t.id > 1000)(8 rows)postgres=# explain verbose select * from t1 where id>1000 order by name nulls first limit 10; QUERY PLAN --------------------------------------------------------------------------------------------- Limit (cost=0.42..0.98 rows=10 width=33) Output: id, name, password, create_date -> Index Scan using t1_name_idx on public.t1 (cost=0.42..55921.45 rows=999041 width=33) Output: id, name, password, create_date Filter: (t1.id > 1000)(5 rows)-- 综上,为什么会出现这样的结果,因为我们默认创建的btree索引,其叶子结点的数据是有序排列的,当创建索引不指定nulls first时,pgl默认把null值放在叶子节点的最后-- 如果排序时只是order by name,未指定nulls first,pg只需要根据索引顺序的返回需要的数据则可,否则,如果order by name nulls last,pg如果使用索引,-- 其可能先在叶子未尾节点返回null值的数据,再在叶子起始节点开始返回数据(假设需要返回10条数据,null值为5条,非null值有5条),其明显此时使用索引不是高效的
转载地址:http://yicxl.baihongyu.com/