Teradata SQLパフォーマンス改善Tips(2) - not in で比較する列にはnot null制約を付ける


こんなSQLでパフォーマンスが出ない場合、

	SELECT *
	FROM t1
	WHERE c1 NOT IN (
		SELECT c1 FROM t2
	)

not inで比較してる列=t1.c1, t2.c1 にnot null制約が付いているか確認してみるといい。
もしもc1にnullがあり得ないにもかかわらず、not null制約が抜け落ちているなら、制約を付けることによって、多分2倍ぐらい速くなるだろう。


t1.c1, t2.c1にnot null制約が付いている場合、上記SQLの実行計画(の核になる部分)はこうなる:

	  4) We execute the following steps in parallel. 
	       1) We do an all-AMPs RETRIEVE step from TST.t1 by way of an
	          all-rows scan with no residual conditions into Spool 2
	          (all_amps), which is built locally on the AMPs.  Then we do a
	          SORT to order Spool 2 by the hash code of (TST.t1.c1). 
	          The size of Spool 2 is estimated with low confidence to be
	          1,537,850 rows (138,406,500 bytes).  The estimated time for
	          this step is 1 minute and 11 seconds. 
	       2) We do an all-AMPs RETRIEVE step from TST.t2 by way of an
	          all-rows scan with no residual conditions into Spool 3
	          (all_amps), which is duplicated on all AMPs.  Then we do a
	          SORT to order Spool 3 by the hash code of (TST.t2.c1). 
	          The size of Spool 3 is estimated with low confidence to be
	          149,716 rows (3,293,752 bytes).  The estimated time for this
	          step is 3.34 seconds. 
	  5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
	     all-rows scan, which is joined to Spool 3 (Last Use) by way of an
	     all-rows scan.  Spool 2 and Spool 3 are joined using an exclusion
	     merge join, with a join condition of ("c1 = c1"). 
	     The result goes into Spool 1 (group_amps), which is built locally
	     on the AMPs.  The size of Spool 1 is estimated with index join
	     confidence to be 1,537,850 rows (150,709,300 bytes).  The
	     estimated time for this step is 22.97 seconds. 
  1. 150万件のt1を各AMP上でc1順にソート
  2. 7.5万件のt2を全AMPにコピーしてc1順にソート
  3. 1,2の結果をexclusion merge joinして「t2にないもの」を抽出

という、素直な実行計画になっている。


ここで、t1.c1, t2.c1からnot null制約を削除すると、こんな風になってしまう:

	  4) We do an all-AMPs SUM step to aggregate from TST.t1 by way of an
	     all-rows scan with no residual conditions.  Aggregate Intermediate
	     Results are computed globally, then placed in Spool 4. 
	  5) We do an all-AMPs SUM step to aggregate from TST.t2 by way of an
	     all-rows scan with no residual conditions.  Aggregate Intermediate
	     Results are computed globally, then placed in Spool 7. 
	  6) We execute the following steps in parallel. 
	       1) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by
	          way of an all-rows scan into Spool 2 (all_amps), which is
	          duplicated on all AMPs. 
	       2) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by
	          way of an all-rows scan into Spool 3 (all_amps), which is
	          duplicated on all AMPs. 
	  7) We execute the following steps in parallel. 
	       1) We do an all-AMPs RETRIEVE step from TST.t1 by way of an
	          all-rows scan with no residual conditions into Spool 6
	          (all_amps), which is built locally on the AMPs.  Then we do a
	          SORT to order Spool 6 by the hash code of (TST.t1.c1),
	          and null value information in Spool 3 and Spool 2.  Skip this
	          retrieve step if null exists.  The size of Spool 6 is
	          estimated with low confidence to be 1,535,724 rows (
	          138,215,160 bytes).  The estimated time for this step is 1
	          minute and 11 seconds. 
	       2) We do an all-AMPs RETRIEVE step from TST.t2 by way of an
	          all-rows scan with no residual conditions into Spool 9
	          (all_amps), which is duplicated on all AMPs.  Then we do a
	          SORT to order Spool 9 by the hash code of (TST.t2.c1),
	          and null value information in Spool 3 and Spool 2.  Skip this
	          retrieve step if null exists.  The size of Spool 9 is
	          estimated with low confidence to be 149,716 rows (3,293,752
	          bytes).  The estimated time for this step is 3.34 seconds. 
	  8) We execute the following steps in parallel. 
	       1) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of
	          an all-rows scan, which is joined to Spool 9 (Last Use) by
	          way of an all-rows scan.  Spool 6 and Spool 9 are joined
	          using an exclusion merge join, with a join condition of (
	          "c1 = c1"), and null value information in Spool
	          3 and Spool 2.  Skip this join step if null exists.  The
	          result goes into Spool 1 (group_amps), which is built locally
	          on the AMPs.  The size of Spool 1 is estimated with index
	          join confidence to be 1,535,724 rows (150,500,952 bytes). 
	          The estimated time for this step is 29.22 seconds. 
	       2) We do an all-AMPs RETRIEVE step from TST.t1 by way of an
	          all-rows scan with no residual conditions into Spool 10
	          (all_amps), which is built locally on the AMPs.  Then we do a
	          SORT to order Spool 10 by the hash code of (TST.t1.c1),
	          and null value information in Spool 3 and Spool 2.  Skip this
	          retrieve step if there is no null.  The size of Spool 10 is
	          estimated with low confidence to be 1,535,724 rows (
	          138,215,160 bytes).  The estimated time for this step is 1
	          minute and 11 seconds. 
	       3) We do an all-AMPs RETRIEVE step from TST.t2 by way of an
	          all-rows scan with no residual conditions into Spool 11
	          (all_amps), which is duplicated on all AMPs.  Then we do a
	          SORT to order Spool 11 by the hash code of (TST.t2.c1),
	          and null value information in Spool 3 and Spool 2.  Skip this
	          retrieve step if there is no null.  The size of Spool 11 is
	          estimated with low confidence to be 149,716 rows (3,293,752
	          bytes).  The estimated time for this step is 3.34 seconds. 
	  9) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of an
	     all-rows scan, which is joined to Spool 11 (Last Use) by way of an
	     all-rows scan.  Spool 10 and Spool 11 are joined using an
	     exclusion merge join, with a join condition of ("c1 =
	     c1"), and null value information in Spool 3 (Last Use) and
	     Spool 2 (Last Use).  Skip this join step if there is no null.  The
	     result goes into Spool 1 (group_amps), which is built locally on
	     the AMPs.  The size of Spool 1 is estimated with index join
	     confidence to be 1,535,724 rows (150,500,952 bytes).  The
	     estimated time for this step is 29.22 seconds. 

正直、これは何をやっているのか読み切れない。が、t1, t2ともに最低2回ずつテーブルスキャンしているので、not null版よりかなり遅くなることが予想できる。
not inは、サブクエリが返す結果表に1行でもnullが含まれている場合、メインクエリの結果を0件にする必要があるので、その辺の手当てのためにStepが増えているようだ。
何やってるかを理解するのは元気なときにやることにして、とりあえず「本来あるべきnot nullの付け忘れは、not in の動作を遅くする」ということだけ覚えておこう。


ちなみに上記SQLを not exists で書き直すと、not null制約を付け忘れたままでも、速い方の実行計画が出てくる。
これは「not existsはサブクエリ側の t2.c1 にnullが混じっていても、(not nullのように)メインクエリの結果が0件にはならない」ことに対応している、のだろう。