Understanding how hash join works through event 10104

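A hash join runs in one of three modes: optimal, onepass, or multipass. When two tables are hash joined, the smaller table is normally used to build a hash table in memory, and the other table is then filtered against it. Rows whose hash values match are checked again against the actual join condition, because hashing is subject to hash collisions: different row values can produce the same hash value, so the check is mandatory. Building the hash table for the small table can run out of memory, which is where the three modes above come from. The rest of this post uses event 10104 to look at how the mechanism actually works.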
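
The build, probe, and re-check steps described above can be sketched in a few lines. This is only an illustration, not Oracle's implementation: `weak_hash`, `hash_join`, and the sample rows are made-up names and data, and the hash function is deliberately weak so that a collision actually happens.

```python
from collections import defaultdict

def weak_hash(key, buckets=4):
    # Hypothetical hash with very few buckets, so different keys collide.
    return key % buckets

def hash_join(build_rows, probe_rows, key=lambda r: r[0]):
    # Build phase: hash the smaller input into buckets.
    table = defaultdict(list)
    for row in build_rows:
        table[weak_hash(key(row))].append(row)
    # Probe phase: apply the same hash to the other input,
    # then re-check the real join key on every candidate.
    out = []
    for row in probe_rows:
        for cand in table.get(weak_hash(key(row)), []):
            if key(cand) == key(row):  # guards against hash collisions
                out.append((cand, row))
    return out

small = [(1, "a"), (5, "b")]           # keys 1 and 5 share a bucket
big = [(5, "x"), (9, "y"), (2, "z")]   # key 9 lands in that bucket too
result = hash_join(small, big)
print(result)
```

Key 9 hashes to the same bucket as 1 and 5, but the equality check rejects it, which is exactly why the verification step cannot be skipped.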

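I. Overview

1. Overall allocation of the hash area

    hash area size = hash table size + bitmap table size + management overhead

This shows up in the trace file as follows:

Original hash-area size: 1009238
-- Size of the hash area allocated by the database, determined by the hash_area_size parameter

Memory for slot table: 737280
-- Memory that actually holds the small table's hashed rows; we can call it the hash table, or the set of partitions

Calculated overhead for partitions and row/slot managers: 271958
-- Mainly the bitmap table and management information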
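
As a quick sanity check, the three byte counts from the trace can be added up. The variable names below are my own; only the numbers come from the trace.

```python
# Values copied from the trace excerpt above (bytes).
original_hash_area = 1009238   # "Original hash-area size"
slot_table = 737280            # "Memory for slot table"
overhead = 271958              # "Calculated overhead for partitions and row/slot managers"

# If the decomposition is complete, nothing is left over.
remainder = original_hash_area - slot_table - overhead
slot_share = slot_table / original_hash_area

print(remainder)
print(f"{slot_share:.2%} of the hash area holds build rows")
```

In this trace the slot table and the overhead account for the whole hash area, with roughly 73% going to the slot table.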

2. Allocation of the hash table

    hash table size = sum(partitions) = N * slot size = N * block size * multiblock IO
     -- Partitions are not necessarily the same size
     -- N: the number of slots
    For example: hash table size = 18 * 8 KB * 5 = 720 KB = 737280 bytes

Number of partitions: 8
-- The database divides the hash table memory into 8 partitions

Number of slots: 18
-- Memory is managed in chunks; here there are 18 chunks, which the trace file calls slots or clusters.

Multiblock IO: 5
-- Each IO against the hash table is 5 blocks

Block size(KB): 8
-- Each block is 8 KB

Cluster (slot) size(KB): 40
-- Each slot is 40 KB: block size * multiblock IO = 8 * 5 = 40 KB

Minimum number of bytes per block: 8160
-- The minimum usable space in each block is 8160 bytes
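
The slot arithmetic can be reproduced from the four trace values. This is a small sketch with variable names of my own choosing:

```python
KB = 1024

# Values copied from the trace excerpt above.
slots = 18           # "Number of slots"
multiblock_io = 5    # "Multiblock IO"
block_size_kb = 8    # "Block size(KB)"

slot_size_kb = block_size_kb * multiblock_io   # one slot = one multiblock IO
slot_table_kb = slots * slot_size_kb
slot_table_bytes = slot_table_kb * KB

print(slot_size_kb, slot_table_kb, slot_table_bytes)
```

The result matches both "Cluster (slot) size(KB): 40" and "Memory for slot table: 737280".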

3. Space allocated to the bitmap table

Bit vector memory allocation(KB): 32
-- 32 KB is allocated to the bitmap table

Per partition bit vector length(KB): 4
-- With 8 partitions, each partition's bit vector is 32 / 8 = 4 KB

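4. Build table information
Maximum possible row length: 577
-- This is the maximum possible length of a single build row, 577 bytes; it is a row length, not a row count. The build table we actually need has 500 rows, so even at the maximum length it takes about 500 * 577 = 288500 bytes, well under the 737280-byte slot table. We can therefore roughly guess that this will be an optimal hash join.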

II. Optimal hash join

1. Hash table summary

*** RowSrcId: 1 HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of partitions: 8
-- There are 8 partitions in total

Number of partitions which could fit in memory: 8
-- 8 partitions fit in memory, which means the small table fits in memory entirely

Number of partitions left in memory: 8
-- 8 partitions remain in memory

Total number of slots in in-memory partitions: 8
-- The in-memory partitions use 8 slots, i.e. 8 * 40 KB = 320 KB

Total number of rows in in-memory partitions: 500
-- The in-memory partitions hold 500 rows of the small table, so every row of the small table is in memory

2. Row counts for each in-memory partition

### Partition Distribution ###
Partition:0    rows:67         clusters:1      slots:1      kept=1
Partition:1    rows:67         clusters:1      slots:1      kept=1
Partition:2    rows:46         clusters:1      slots:1      kept=1
Partition:3    rows:74         clusters:1      slots:1      kept=1
Partition:4    rows:57         clusters:1      slots:1      kept=1
Partition:5    rows:58         clusters:1      slots:1      kept=1
Partition:6    rows:74         clusters:1      slots:1      kept=1
Partition:7    rows:57         clusters:1      slots:1      kept=1

kept=1 means the partition is in memory; kept=0 means the partition is on disk.

Final number of hash buckets: 1024
-- The hash table is managed through 1024 hash buckets

3. Detailed contents of the hash table:
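
When the trace is long, the partition distribution is easier to check with a small parser. The regular expression below is written against the eight lines shown above only; it is an assumption that other Oracle versions print the same layout.

```python
import re

trace = """\
Partition:0    rows:67         clusters:1      slots:1      kept=1
Partition:1    rows:67         clusters:1      slots:1      kept=1
Partition:2    rows:46         clusters:1      slots:1      kept=1
Partition:3    rows:74         clusters:1      slots:1      kept=1
Partition:4    rows:57         clusters:1      slots:1      kept=1
Partition:5    rows:58         clusters:1      slots:1      kept=1
Partition:6    rows:74         clusters:1      slots:1      kept=1
Partition:7    rows:57         clusters:1      slots:1      kept=1
"""

pattern = re.compile(
    r"Partition:(\d+)\s+rows:(\d+)\s+clusters:(\d+)\s+slots:(\d+)\s+kept=(\d)"
)
# Each tuple is (partition, rows, clusters, slots, kept).
parts = [tuple(map(int, m.groups())) for m in pattern.finditer(trace)]

total_rows = sum(p[1] for p in parts)
total_slots = sum(p[3] for p in parts)
spilled = [p[0] for p in parts if p[4] == 0]   # partitions written to disk

print(total_rows, total_slots, spilled)
```

The totals agree with the summary lines: 500 rows, 8 slots, and no partition with kept=0.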

645+282+77+18+2 =1024
### Hash table ###
Number of buckets with   0 rows:        645
Number of buckets with   1 rows:        282
Number of buckets with   2 rows:         77
Number of buckets with   3 rows:         18
Number of buckets with   4 rows:          2
Number of buckets with   5 rows:          0
Number of buckets with   6 rows:          0
Number of buckets with   7 rows:          0
Number of buckets with   8 rows:          0
Number of buckets with   9 rows:          0
Number of buckets with between 10 and 19 rows:          0
Number of buckets with between 20 and 29 rows:          0
Number of buckets with between 30 and 39 rows:          0
Number of buckets with between 40 and 49 rows:          0
Number of buckets with between 50 and 59 rows:          0
Number of buckets with between 60 and 69 rows:          0
Number of buckets with between 70 and 79 rows:          0
Number of buckets with between 80 and 89 rows:          0
Number of buckets with between 90 and 99 rows:          0
Number of buckets with 100 or more rows:          0

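-- The hash table is managed through hash buckets. From the output above, the 500 rows are distributed as follows:
282 hash buckets hold 1 row
77 hash buckets hold 2 rows
18 hash buckets hold 3 rows
2 hash buckets hold 4 rows

Number of non-empty hash buckets = 282 + 77 + 18 + 2 = 379

If the distribution across hash buckets is badly skewed, for example one hash bucket holds a very large number of rows, the likely cause is too many duplicate values in the small table's join columns. In that case the statement may need to be rewritten to eliminate the duplicates, because the more rows a hash bucket holds, the more CPU is consumed.

4. Hash table statistics: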
Total buckets: 1024 Empty buckets: 645 Non-empty buckets: 379
Total number of rows: 500
Maximum number of rows in a bucket: 4
Average number of rows in non-empty buckets: 1.319261
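
The statistics can be recomputed from the bucket histogram. The dictionary below is transcribed by hand from the "### Hash table ###" output and keeps only the non-zero lines. One thing the sketch brings out: the histogram as transcribed sums to 498 rows, two fewer than the 500 the trace reports. The trace gives no explanation for that gap, and the reported average is consistent with 500, not 498.

```python
# rows per bucket -> number of buckets, from the histogram above
histogram = {0: 645, 1: 282, 2: 77, 3: 18, 4: 2}
reported_rows = 500   # "Total number of rows"

total_buckets = sum(histogram.values())
non_empty = total_buckets - histogram[0]
rows_from_histogram = sum(n * count for n, count in histogram.items())
max_rows = max(n for n, count in histogram.items() if count > 0)
avg_reported = reported_rows / non_empty

print(total_buckets, non_empty, max_rows)
print(rows_from_histogram)
print(f"{avg_reported:.6f}")
```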

5. Bitmap filtering of the large table during matching
Used bitmap filtering: filtered rows=7 minimum required=50 out f=1000
Used bitmap filtering: filtered rows=114 minimum required=50 out f=1000
Used bitmap filtering: filtered rows=258 minimum required=50 out f=1000
Used bitmap filtering: filtered rows=427 minimum required=50 out f=1000
.....................................................

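6. Summary of the optimal mode described above

A. The hash area is first split into three parts. The first is memory Oracle uses for internal management. The second is the in-memory hash table, divided into 8 partitions and managed through hash buckets. The third is the bitmap table.

B. The small table is built into a hash table in memory. A hash function is applied to the join column values, and the result identifies a hash bucket. In the example above, hashing the 500 rows takes 379 hash buckets; my manual calculation gave 376, and I do not know why it is off by 3. The columns needed from the small table are stored in the in-memory hash table, and the bitmap bit corresponding to each hash bucket used is set to 1.

C. When the large table is hash joined to the small table, the same hash function is applied to its join columns and the result is checked against the bitmap. If the bit is 0, the corresponding hash bucket holds no rows from the small table, so the row is discarded immediately. If the bit is 1, the small-table rows stored in that hash bucket are located and compared with the large-table row on the actual join column values. If they match, the row satisfies the query; if not, it is discarded.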
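
Steps B and C can be sketched as follows. This is a simplified model of the description above, not Oracle's actual code: the hash function (`key % NUM_BUCKETS`, integer keys only), the one-bit-per-bucket bitmap, and the sample data are all assumptions made for illustration.

```python
NUM_BUCKETS = 1024   # matches "Final number of hash buckets" in the trace

def bucket_of(key):
    # Stand-in hash function for integer keys; Oracle's real one is internal.
    return key % NUM_BUCKETS

def build(rows, key):
    # Step B: place each small-table row in its bucket and set its bit.
    buckets = [[] for _ in range(NUM_BUCKETS)]
    bitmap = [0] * NUM_BUCKETS
    for row in rows:
        b = bucket_of(key(row))
        buckets[b].append(row)
        bitmap[b] = 1
    return buckets, bitmap

def probe(rows, key, buckets, bitmap):
    # Step C: test the bit first, then compare real join column values.
    joined, filtered = [], 0
    for row in rows:
        b = bucket_of(key(row))
        if not bitmap[b]:
            filtered += 1          # discarded without touching any bucket
            continue
        for cand in buckets[b]:
            if key(cand) == key(row):
                joined.append((cand, row))
    return joined, filtered

def first(row):
    return row[0]

small = [(k, f"s{k}") for k in range(500)]          # hypothetical build table
big = [(k, f"b{k}") for k in range(250, 1250)]      # hypothetical probe table
buckets, bitmap = build(small, first)
joined, filtered = probe(big, first, buckets, bitmap)

print(len(joined), filtered)
```

With this sample data, 250 probe rows join and 524 are dropped by the bitmap alone. The other 226 rows pass the bitmap because their bucket is occupied, and are only rejected by the value comparison, which shows that the bitmap is a cheap pre-filter and not a substitute for the final check.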
