全表扫描成本估算源码分析
目录1. 目的2.跟踪执行计划optimizer_trace方法3.源码中计算步骤4.核心源码详解4.1 主流程4.2 estimate_in_memory_buffer函数很重要在index扫描场景会调用该函数5.通过OPTIMIZER_TRACE去验证全表扫描成本源码逻辑结论1. 目的通过trace和gdb工具验证全表扫描成本估算源码逻辑这里有个注意事项这里开始一直无法复现通过gdb发现不带where条件的全表扫描是不走test_quick_select函数的简单总结1. 无WHERE 条件的全表扫描 - 不会调用 test_quick_select优化器直接选择全表扫描2. 有 WHERE 条件的查询- 会调用 test_quick_select 评估是否使用索引2.跟踪执行计划optimizer_trace方法set optimizer_trace_max_mem_size102400000; set optimizer_traceenabledon; SELECT * FROM platform_field FORCE INDEX (PRIMARY) WHERE id 0; select * from information_schema.optimizer_trace;3.源码中计算步骤// 计算全表扫描的成本 double scan_time cost_model-row_evaluate_cost(records) 1; // CPU 成本 Cost_estimate cost_est table-file-table_scan_cost(); // I/O 成本 cost_est.add_io(1.1); // 额外 I/O 成本 总的io成本额外io成本io成本4.核心源码详解4.1 主流程入口函数int test_quick_select DBUG_PRINT(enter,(keys_to_use: %lu prev_tables: %lu const_tables: %lu, (ulong) keys_to_use.to_ulonglong(), (ulong) prev_tables, (ulong) const_tables)); const Cost_model_server *const cost_model thd-cost_model(); /*核心目标获取成本模型如加载默认成本常量值 ROW_EVALUATE_COST 0.2 KEY_COMPARE_COST 0.05 MEMORY_TEMPTABLE_CREATE_COST 1.0 */ TABLE *const head tab-table(); /*获取表对象 */ ha_rows records head-file-stats.records; /* 获取表的统计信息中的记录数 */ if (!records) records; /* purecov: inspected */ double scan_time cost_model-row_evaluate_cost(static_castdouble(records)) 1; /* 计算cpu总成本 Cost_model_server::row_evaluate_cost(rows) - rows * m_server_cost_constants-row_evaluate_cost()- Server_cost_constants::row_evaluate_cost() - return m_row_evaluate_cost; // 返回常量值 0.2 核心目标CPU总成本 行数 × 单行评估成本scan_time records × 0.2 1 主要干了2件事获取ROW_EVALUATE_COST (0.2)。然后将该值和rows行数取乘积 | MySQL 版本 | ROW_EVALUATE_COST 默认值 | |-----------|------------------------| | MySQL 5.7 | 0.2 | | MySQL 8.0 | 0.1 | */ Cost_estimate cost_est head-file-table_scan_cost(); /* io成本 */ cost_est.add_io(1.1); /* 额外io成本*/ /* 计算io总成本 table_scan_cost()- scan_time() * table-cost_model()-page_read_cost(1.0)- return ulonglong2double(stats.data_file_length) / IO_SIZE 2; scan_time -stats.data_file_length: 数据文件的字节大小 -IO_SIZE: 通常是 4096 字节一个页面 -2: 额外的固定开销启动成本 page_read_cost(1.0) - buffer_block_read_cost(pages_in_mem) io_block_read_cost(pages_on_disk); -blocks * m_se_cost_constants-memory_block_read_cost() -默认值: 0.25MySQL 5.7和8.0.32 -含义: 从内存缓冲区读取一个块的成本 -配置: mysql.engine_cost 表中的 memory_block_read_cost -blocks * m_se_cost_constants-io_block_read_cost() -默认值: 1.0MySQL 5.7和8.0.32 -含义: 从磁盘读取一个块的成本 -配置: mysql.engine_cost 表中的 io_block_read_cost 计算单页读取成本:in_mem × buffer_cost on_disk × io_cost 这个page_read_cost函数的估算page在内存中的占比核心函数estimate_in_memory_buffer 然后分别计算buffer读io成本及磁盘读io成本总和出io的成本 */ cost_est.add_cpu(scan_time); /* 全表扫描的总成本 */ .... Opt_trace_object trace_range(trace, range_analysis); /*这里注意这个提示词*/ Opt_trace_object(trace, table_scan). add(rows, head-file-stats.records). add(cost, cost_est); total cost 启发式算法计算page在内存中的占比传入一个表的统计信息值来做计算 data_file_length stat_clustered_index_size × page_size - stat_clustered_index_size: 聚簇索引的页数 - page_size: 页大小默认 16KB 16384 字节 --这里获取 SELECT * FROM mysql.innodb_table_stats WHERE database_name otter AND table_name platform_field;4.2 estimate_in_memory_buffer函数很重要在index扫描场景会调用该函数double handler::estimate_in_memory_buffer(ulonglong table_index_size) const { // 获取内存缓冲区大小 longlong memory_buf_size get_memory_buffer_size(); if (memory_buf_size 0) memory_buf_size 100 * 1024 * 1024; // 默认 100 MB // 计算表在缓冲区的比例逻辑 const double percent_of_mem static_castdouble(table_index_size) / memory_buf_size; double in_mem_est; if (percent_of_mem 0.2) // 20% in_mem_est 1.0; // 完全在内存中 else if (percent_of_mem 1.0) // 100% in_mem_est 0.0; // 完全在磁盘上 else // 20% ~ 100% in_mem_est 1.0 - (percent_of_mem - 0.2) / 0.8; // 线性插值 return in_mem_est; }5.通过OPTIMIZER_TRACE去验证全表扫描成本源码逻辑结论基于8.0的环境进行分析的 mysql select count(*) from platform_field; ---------- | count(*) | ---------- | 6269 | ---------- 1 row in set (0.01 sec) # where条件很重要让其走到上述代码逻辑上range_analysis mysql show table status like platform_field ; --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | platform_field | InnoDB | 10 | Dynamic | 6307 | 251 | 1589248 | 0 | 458752 | 3145728 | 7655 | 2024-12-09 17:47:50 | NULL | NULL | utf8mb3_general_ci | NULL | row_formatDYNAMIC | | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 1 row in set (0.00 sec) SELECT * FROM platform_field WHERE ID1000; SELECT * FROM information_schema.OPTIMIZER_TRACE\G { rows_estimation: [ { table: platform_field, range_analysis: { table_scan: { rows: 6307, cost: 657.05 }, potential_range_indexes: [ { index: PRIMARY, usable: true, key_parts: [ id ] }, 获取对应表在内存和磁盘的page数及比例 SELECT t.table_name, t.data_length AS table_size_bytes, ROUND(t.data_length / 1024 / 1024, 2) AS table_size_mb, v.variable_value AS buffer_pool_size, ROUND(v.variable_value / 1024 / 1024 / 1024, 2) AS buffer_pool_gb, ROUND(t.data_length / v.variable_value * 100, 2) AS percent_of_buffer, ROUND(t.data_length / 16384, 0) AS total_pages, CASE WHEN t.data_length / v.variable_value 0.2 THEN 100.0 WHEN t.data_length / v.variable_value 1.0 THEN 0.0 ELSE ROUND( (1.0 - (t.data_length / v.variable_value - 0.2) / 0.8) * 100, 2 ) END AS percent_in_memory, ROUND( t.data_length / 16384 * CASE WHEN t.data_length / v.variable_value 0.2 THEN 1.0 WHEN t.data_length / v.variable_value 1.0 THEN 0.0 ELSE (1.0 - (t.data_length / v.variable_value - 0.2) / 0.8) END, 0 ) AS pages_in_memory, ROUND( t.data_length / 16384 * ( 1 - CASE WHEN t.data_length / v.variable_value 0.2 THEN 1.0 WHEN t.data_length / v.variable_value 1.0 THEN 0.0 ELSE (1.0 - (t.data_length / v.variable_value - 0.2) / 0.8) END ), 0 ) AS pages_on_disk FROM information_schema.tables t CROSS JOIN ( SELECT variable_value FROM performance_schema.global_variables WHERE variable_name innodb_buffer_pool_size ) v WHERE t.table_schema otter AND t.table_name platform_field \G; *************************** 1. row *************************** TABLE_NAME: platform_field table_size_bytes: 1589248 table_size_mb: 1.52 buffer_pool_size: 10737418240 buffer_pool_gb: 10 percent_of_buffer: 0.01 total_pages: 97 percent_in_memory: 100 pages_in_memory: 97 pages_on_disk: 0 1 row in set (0.00 sec) 说明100%在内存中 cpu_cost6307*0.11631.7 io_cost97*0.2524.25 额外io1.1 总和是657.05 和上面的总成本对能对齐 从源码来看cpu成本计算不复杂复杂的是在io成本中特别是针对page在内存和磁盘中的分布情况 分为两个逻辑一种是在引擎中设置好的另一种通过启发式来算内存中page的数量及比例 核心函数estimate_in_memory_buffer