Upgrade to Pro — share decks privately, control downloads, hide ads and more …

RMUG - MySQL Indexes and Histograms

lefred
November 03, 2023

RMUG - MySQL Indexes and Histograms

MySQL, the most popular Open Source Database is easy to use. But you can improve your queries using indexes and histograms. Join this session to learn how to create optimal indexes and histograms for your queries.

Learn how to display and understand the Query Execution Plan of your queries, find those that need to be optimized, ...

Latest MySQL Innovation 8.2 features for EXPLAIN are also covered.

lefred

November 03, 2023
Tweet

More Decks by lefred

Other Decks in Technology

Transcript

  1. Frédéric Descamps Community Manager Oracle MySQL RIGA MySQL User Group

    - November 2023 MySQL Indexes and Histograms How They Really Speed Up Your Queries
  2. @lefred MySQL Evangelist using MySQL since version 3.20 devops believer

    living in h ps://lefred.be Frédéric Descamps Copyright @ 2023 Oracle and/or its affiliates. 3
  3. Speed and Performance ! Nobody ever complains that the database

    is too fast ! Copyright @ 2023 Oracle and/or its affiliates. 5
  4. Speed and Performance ! Nobody ever complains that the database

    is too fast ! It's often the reverse... blame the database ! Copyright @ 2023 Oracle and/or its affiliates. 5
  5. Speed and Performance ! Nobody ever complains that the database

    is too fast ! It's often the reverse... blame the database ! Speeding up queries is not a 'dark art'... but understanding how to speed up queries is often treated as magic. Copyright @ 2023 Oracle and/or its affiliates. 5
  6. Speed and Performance ! Nobody ever complains that the database

    is too fast ! It's often the reverse... blame the database ! Speeding up queries is not a 'dark art'... but understanding how to speed up queries is often treated as magic. So we will be looking at the proper use of indexes, histograms and see how to keep the rigth balance for your workload. Copyright @ 2023 Oracle and/or its affiliates. 5
  7. No coverage today of: System Con guration OS MySQL Hardware

    Networking/Cloud This is a dry subject ! Do not try to absorb all the content at once, get the slides (h ps://speakerdeck.com/lefred) Copyright @ 2023 Oracle and/or its affiliates. 6
  8. Daniel Nichter, E cient MySQL Performance - Best Practices and

    Techniques, O'Reilly, 2021 Query Response time Query response time is the only metric anyone truly cares about [...] because query response time is the only metric we experience. When a query takes 7.5 seconds to execute, we experience 7.5 seconds of impatience. That same query might examine a million rows, but we don't experience a million rows examined. Our time is precious.(*) Copyright @ 2023 Oracle and/or its affiliates. 8
  9. What are bad queries ? We can de ne bad

    queries in two di erent categories: Copyright @ 2023 Oracle and/or its affiliates. 11
  10. queries called way to often queries that are way too

    slow full table scan using lesort using temporary tables What are bad queries ? We can de ne bad queries in two di erent categories: Copyright @ 2023 Oracle and/or its affiliates. 11
  11. If there could be only one ? If you should

    optimize only one query, the best candidate should be the query that consumes the most of the execution time (seen as latency in PFS, but usually called "response time"). Copyright @ 2023 Oracle and/or its affiliates. 13
  12. If there could be only one ? If you should

    optimize only one query, the best candidate should be the query that consumes the most of the execution time (seen as latency in PFS, but usually called "response time"). sys Schema contains all the necessary info to nd that ugly Duckling: Copyright @ 2023 Oracle and/or its affiliates. 13
  13. If there could be only one ? If you should

    optimize only one query, the best candidate should be the query that consumes the most of the execution time (seen as latency in PFS, but usually called "response time"). sys Schema contains all the necessary info to nd that ugly Duckling: SELECT SELECT schema_name schema_name, , format_pico_time format_pico_time( (total_latency total_latency) ) tot_lat tot_lat, , exec_count exec_count, , format_pico_time format_pico_time( (total_latency total_latency/ /exec_count exec_count) ) latency_per_call latency_per_call, , query_sample_text query_sample_text FROM FROM sys sys. .x$statements_with_runtimes_in_95th_percentile x$statements_with_runtimes_in_95th_percentile AS AS t1 t1 JOIN JOIN performance_schema performance_schema. .events_statements_summary_by_digest events_statements_summary_by_digest AS AS t2 t2 ON ON t2 t2. .digest digest= =t1 t1. .digest digest WHERE WHERE schema_name schema_name NOT NOT in in ( ('performance_schema' 'performance_schema', , 'sys' 'sys') ) ORDER ORDER BY BY ( (total_latency total_latency/ /exec_count exec_count) ) desc desc LIMIT LIMIT 1 1\G \G Copyright @ 2023 Oracle and/or its affiliates. 13
  14. If there could be only one ? If you should

    optimize only one query, the best candidate should be the query that consumes the most of the execution time (seen as latency in PFS, but usually called "response time"). sys Schema contains all the necessary info to nd that ugly Duckling: SELECT SELECT schema_name schema_name, , format_pico_time format_pico_time( (total_latency total_latency) ) tot_lat tot_lat, , exec_count exec_count, , format_pico_time format_pico_time( (total_latency total_latency/ /exec_count exec_count) ) latency_per_call latency_per_call, , query_sample_text query_sample_text FROM FROM sys sys. .x$statements_with_runtimes_in_95th_percentile x$statements_with_runtimes_in_95th_percentile AS AS t1 t1 JOIN JOIN performance_schema performance_schema. .events_statements_summary_by_digest events_statements_summary_by_digest AS AS t2 t2 ON ON t2 t2. .digest digest= =t1 t1. .digest digest WHERE WHERE schema_name schema_name NOT NOT in in ( ('performance_schema' 'performance_schema', , 'sys' 'sys') ) ORDER ORDER BY BY ( (total_latency total_latency/ /exec_count exec_count) ) desc desc LIMIT LIMIT 1 1\G \G Copyright @ 2023 Oracle and/or its affiliates. *************************** 1. row *************************** schema_name: piday tot_lat: 4.29 h exec_count: 5 latency_per_call: 51.51 min query_sample_text: select a.device_id, max(a.value) as `max temp`, min(a.value) as `min temp`, avg(a.value) as `avg temp`, max(b.value) as `max humidity`, min(b.value) as `min humidity`, avg(b.value) as `avg humidity` from temperature_history a join humidity_history b on b.device_id=a.device_id where date(a.time_stamp) = date(now()) and date(b.time_stamp)=date(now()) group by device_id 13
  15. More info about Queries Sys Schema contains all the required

    information in these tables : statements_with_full_table_scans statements_with_runtimes_in_95th_percentile statements_with_sorting statements_with_temp_tables Copyright @ 2023 Oracle and/or its affiliates. 14
  16. More info about Queries Sys Schema contains all the required

    information in these tables : statements_with_full_table_scans statements_with_runtimes_in_95th_percentile statements_with_sorting statements_with_temp_tables And since MySQL 8.0 you can join the table performance_schema.events_statements_summary_by_digest to have a sample you can use. Copyright @ 2023 Oracle and/or its affiliates. 14
  17. More info about Queries Sys Schema contains all the required

    information in these tables : statements_with_full_table_scans statements_with_runtimes_in_95th_percentile statements_with_sorting statements_with_temp_tables And since MySQL 8.0 you can join the table performance_schema.events_statements_summary_by_digest to have a sample you can use. We will check the meaning of this tables in some slides... be patient ;) Copyright @ 2023 Oracle and/or its affiliates. 14
  18. The MySQL Optimizer Consider the Optimizer the brain and nervous

    system of MySQL Query optimization is a feature of many Relational Database Management Systems The query optimizer a empts to determine the most e ective way to execute a given query by considering the possible query plans (h ps://en.wikipedia.org/wiki/Query_optimization) Copyright @ 2023 Oracle and/or its affiliates. 17
  19. The MySQL Optimizer - estimation One of the hardest problems

    in query optimization is to accurately estimate the costs of alternative query plans. These costs are the result of a mathematical model of query execution costs that relies heavily on estimates of the cardinality, or number of tuple, owing through each edge in a query plan. Copyright @ 2023 Oracle and/or its affiliates. 18
  20. The MySQL Optimizer - evaluation of the options The Optimizer

    wants to get your data the cheapest way possible. Like a route planner, the cost is built on historical statistics. And these statistics can change while the optimizer is working. So like a tra c jam, washed out road, or other tra c problem, the optimizer may be making poor decisions for the present situation... but this is very rare ! The nal determination from the optimizer is called the Query Execution Plan (or QEP, or Query Plan). MySQL wants to optimize each query every time it sees it (there is no locking down the query plan like Oracle). Copyright @ 2023 Oracle and/or its affiliates. 19
  21. 120 if your query has ve joins the optimizer may

    have to evaluate 120 di erent options 5! (5 * 4 * 3 * 2 * 1) Copyright @ 2023 Oracle and/or its affiliates. 20
  22. EXPLAIN is the command used to obtain the Query Execution

    Plan for a query including information about how tables are joined and in which order, which indexes are used and estimation of rows, ... EXPLAIN Syntax Copyright @ 2023 Oracle and/or its affiliates. 22
  23. EXPLAIN Example this is an ESTIMATION on how MySQL would

    run the query as it is not executed ! Copyright @ 2023 Oracle and/or its affiliates. 23
  24. system: the table contains excatly 1 row const: at most

    1 row is matched for the table eq_ref: the table is the right-hand table in a join where the condition is on a PK or not null unique key. ref: the table is ltered by a nonunique secondary index. ref_or_null: the same as ref but the ltered column may also be NULL. index_merge: the Optimizer chooses a combination of two or more indexes to resolve a lter that includes an OR or AND between columns in di erent indexes. fulltext: use of a full text index to lter the table. range: this is used when an index is used to look up several values either in sequence or in groups. EXPLAIN - Access type Copyright @ 2023 Oracle and/or its affiliates. 24
  25. index: the Optimizer chosen to perform a full index scan.

    ALL: full table scan !! EXPLAIN - Access type (2) Get much more info an examples in Chapter 20, Analyzing Queries from Jesper Wisborg Krogh's book: MySQL 8 Query Performance Tuning, Apress, 2020. Copyright @ 2023 Oracle and/or its affiliates. 25
  26. EXPLAIN FORMAT=JSON Example this is the most detailed estimation !

    Copyright @ 2023 Oracle and/or its affiliates. 29
  27. can we know the real numbers ? Copyright @ 2023

    Oracle and/or its affiliates. 30
  28. Estimated cost Actual execution statistics Time to return rst row

    Time to return all rows Number of rows returned Number of loops EXPLAIN ANALYZE Copyright @ 2023 Oracle and/or its affiliates. 31
  29. Estimated cost Actual execution statistics Time to return rst row

    Time to return all rows Number of rows returned Number of loops Instruments and executes the query EXPLAIN ANALYZE Copyright @ 2023 Oracle and/or its affiliates. 31
  30. More with EXPLAIN It's also possible to get the QEP

    of a running connection using EXPLAIN FOR CONNECTION: show show processlist processlist; ; + +----+-------+--------+------+---------+-------+--------------+-----------------------------------------+ ----+-------+--------+------+---------+-------+--------------+-----------------------------------------+ | | Id Id | | User User | | Host Host | | db db | | Command Command | | Time Time | | State State | | Info Info | | + +----+-------+--------+------+---------+-------+--------------+-----------------------------------------+ ----+-------+--------+------+---------+-------+--------------+-----------------------------------------+ | | 8 8 | | root root | | localh localh | | NULL NULL | | Query Query | | 0 0 | | init init | | show show processlist processlist | | | | 9 9 | | root root | | localh localh | | test test | | Query Query | | 7 7 | | User User sleep sleep | | select select * *, , SLEEP SLEEP( (10 10) ) from from foo foo where where id id> >3 3 | | + +----+-------+--------+------+---------+-------+--------------+-----------------------------------------+ ----+-------+--------+------+---------+-------+--------------+-----------------------------------------+ 3 3 rows rows in in set set, , 0 0 warning warning ( (0.0006 0.0006 sec sec) ) explain explain for for connection connection 9 9; ; + +----+-------------+-------+---------+-------+---------------+---------+---------+------+------+----------+-------------+ ----+-------------+-------+---------+-------+---------------+---------+---------+------+------+----------+-------------+ | | id id | | select_type select_type | | table table | | partit partit. . | | type type | | possible_keys possible_keys | | key key | | key_len key_len | | ref ref | | rows rows | | filtered filtered | | Extra Extra | | + +----+-------------+-------+---------+-------+---------------+---------+---------+------+------+----------+-------------+ ----+-------------+-------+---------+-------+---------------+---------+---------+------+------+----------+-------------+ | | 1 1 | | SIMPLE SIMPLE | | foo foo | | NULL NULL | | range range | | PRIMARY PRIMARY | | PRIMARY PRIMARY | | 4 4 | | NULL NULL | | 2 2 | | 100 100 | | Using Using where where | | + +----+-------------+-------+---------+-------+---------------+---------+---------+------+------+----------+-------------+ ----+-------------+-------+---------+-------+---------------+---------+---------+------+------+----------+-------------+ Copyright @ 2023 Oracle and/or its affiliates. 37
  31. More with EXPLAIN (2) Since MySQL 8.2 you can secify

    the default format of EXPLAIN using the variable explain_format. When using FORMAT=json it's also possible to save the QEP into a variable using the option INTO: EXPLAIN EXPLAIN FORMAT FORMAT= =JSON JSON INTO INTO @myselect @myselect SELECT SELECT name name FROM FROM a a WHERE WHERE id id = = 2 2; ; It's alo possible to use EXPLAIN FOR SCHEMA now. Copyright @ 2023 Oracle and/or its affiliates. 38
  32. Indexes nd rows with speci c column values quickly Copyright

    @ 2023 Oracle and/or its affiliates. 40
  33. Indexes A database index is a data structure that improves

    the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and e cient access of ordered records. (h ps://en.wikipedia.org/wiki/Database_index) Copyright @ 2023 Oracle and/or its affiliates. 41
  34. MySQL supports multiple kind of indexes: primary key / clustered

    index secondary index full-text index spatial index Indexes in MySQL Copyright @ 2023 Oracle and/or its affiliates. 42
  35. pre x of a column mutli-column unique covering functional multi-value

    Indexes in MySQL (2) MySQL supports 2 types of indexes: BTREE HASH and Indexes can have some "properties": Copyright @ 2023 Oracle and/or its affiliates. 43
  36. Clustered Indexes Each InnoDB table has a special index called

    the clustered index that stores row data. Typically, the clustered index is synonymous with the primary key. Copyright @ 2023 Oracle and/or its affiliates. 44
  37. Clustered Indexes Each InnoDB table has a special index called

    the clustered index that stores row data. Typically, the clustered index is synonymous with the primary key. Let's check now an example on how we usually mentally represent a table and an index. Copyright @ 2023 Oracle and/or its affiliates. 44
  38. Mental representation of table and an index Table 1 Indexed

    Column column 1 column 2 column x column n Index Copyright @ 2023 Oracle and/or its affiliates. 45
  39. Mental representation of table and an index Table 1 Index

    Indexed Column column 1 column 2 column x column n 5001 5001 a b 2022-03-14 NULL Copyright @ 2023 Oracle and/or its affiliates. 46
  40. Mental representation of a table and an index Table 1

    Index Indexed Column column 1 column 2 column x column n 5001 3 3 5001 a a c b 2022-03-14 2022-03-15 NULL NULL Copyright @ 2023 Oracle and/or its affiliates. 47
  41. Mental representation of a table and an index Table 1

    Index Indexed Column column 1 column 2 column x column n 5001 3 3 6 a a c b 2022-03-14 2022-03-15 NULL NULL 6 5001 be f 2022-03-15 1 Copyright @ 2023 Oracle and/or its affiliates. 48
  42. Mental representation of a table and an index Table 1

    Index Indexed Column column 1 column 2 column x column n 5001 3 3 6 a a c b 2022-03-14 2022-03-15 NULL NULL 6 27 be f 2022-03-15 1 27 5001 it d 2022-03-16 101 Copyright @ 2023 Oracle and/or its affiliates. 49
  43. Mental representation of a table and an index Table 1

    Index Indexed Column column 1 column 2 column x column n 5001 3 3 6 a a c b 2022-03-14 2022-03-15 NULL NULL 6 27 be f 2022-03-15 1 27 12 it d 2022-03-16 101 12 5001 uk e 2022-03-22 NULL Copyright @ 2023 Oracle and/or its affiliates. 50
  44. Mental representation of a table and an index Table 1

    Index Indexed Column column 1 column 2 column x column n 5001 3 3 6 a a c b 2022-03-14 2022-03-15 NULL NULL 6 27 be f 2022-03-15 1 27 12 it d 2022-03-16 101 12 5001 uk e 2022-03-22 NULL Copyright @ 2023 Oracle and/or its affiliates. 51
  45. InnoDB representation of a table and PK Table 1 PRIMARY

    KEY column 1 column 2 column x column n 5001 a b 2022-03-14 NULL insert into table1 values (5001,'a','b','2022-03-14', NULL); Copyright @ 2023 Oracle and/or its affiliates. 52
  46. InnoDB representation of a table and PK Table 1 PRIMARY

    KEY column 1 column 2 column x column n 5001 a b 2022-03-14 NULL insert into table1 values (3,'c','a','2022-03-15', NULL); 3 a c 2022-03-15 NULL Copyright @ 2023 Oracle and/or its affiliates. 53
  47. InnoDB representation of a table and PK 6 be f

    2022-03-15 1 Table 1 PRIMARY KEY column 1 column 2 column x column n 5001 a b 2022-03-14 NULL insert into table1 values (6,'f','be','2022-03-15', 1); 3 a c 2022-03-15 NULL Copyright @ 2023 Oracle and/or its affiliates. 54
  48. InnoDB representation of a table and PK 6 be f

    2022-03-15 1 Table 1 PRIMARY KEY column 1 column 2 column x column n 5001 a b 2022-03-14 NULL insert into table1 values (27,'d','it','2022-03-16', 101); 3 a c 2022-03-15 NULL 27 it d 2022-03-16 101 Copyright @ 2023 Oracle and/or its affiliates. 55
  49. InnoDB representation of a table and PK 6 be f

    2022-03-15 1 Table 1 PRIMARY KEY column 1 column 2 column x column n 5001 a b 2022-03-14 NULL insert into table1 values (12,'e','uk','2022-03-22', NULL); 3 a c 2022-03-15 NULL 27 it d 2022-03-16 101 12 uk e 2022-03-22 NULL Copyright @ 2023 Oracle and/or its affiliates. 56
  50. InnoDB representation of a table and PK Table 1 PRIMARY

    KEY column 1 column 2 column x column n 5001 a b 2022-03-14 NULL 3 a c 2022-03-15 NULL 6 be f 2022-03-15 1 27 it d 2022-03-16 101 12 uk e 2022-03-22 NULL Copyright @ 2023 Oracle and/or its affiliates. 57
  51. InnoDB representation of a table and PK Table 1 PRIMARY

    KEY column 1 column 2 column x column n 5001 a b 2022-03-14 NULL 3 a c 2022-03-15 NULL 6 be f 2022-03-15 1 27 it d 2022-03-16 101 12 uk e 2022-03-22 NULL This is the clustered index representation: stored by order of Primary Key Copyright @ 2023 Oracle and/or its affiliates. 57
  52. InnoDB Primary Key InnoDB stores data in table spaces. And

    so far, we know that records are stored and sorted using the clustered index. The Primary Key is a key for the index that uniquely de ned for a row, should be immutable. InnoDB needs a PRIMARY KEY No NULL values are allowed Monotonically increasing use UUID_TO_BIN() if you must use UUIDs, otherwise avoid them Copyright @ 2023 Oracle and/or its affiliates. 58
  53. InnoDB Primary Key (2) What we don't know is that

    all secondary indexes also contain the primary key as the right- most column in the index (even if this is not exposed). That means when a secondary index is used to retrieve a record, two indexes are used: rst the secondary one pointing to the primary key that will be used to nally retrieve the record. Copyright @ 2023 Oracle and/or its affiliates. 59
  54. InnoDB Primary Key (2) What we don't know is that

    all secondary indexes also contain the primary key as the right- most column in the index (even if this is not exposed). That means when a secondary index is used to retrieve a record, two indexes are used: rst the secondary one pointing to the primary key that will be used to nally retrieve the record. When no primary key is de ned, the rst unique not null key is used. And if none is available, InnoDB will create an hidden primary key (6 bytes). Copyright @ 2023 Oracle and/or its affiliates. 59
  55. InnoDB Primary Key (2) What we don't know is that

    all secondary indexes also contain the primary key as the right- most column in the index (even if this is not exposed). That means when a secondary index is used to retrieve a record, two indexes are used: rst the secondary one pointing to the primary key that will be used to nally retrieve the record. When no primary key is de ned, the rst unique not null key is used. And if none is available, InnoDB will create an hidden primary key (6 bytes). The problem with such key is that you don’t have any control on it and worse, this value is global to all tables without primary keys and can be a contention problem if you perform multiple simultaneous writes on such tables (dict_sys->mutex). Copyright @ 2023 Oracle and/or its affiliates. 59
  56. InnoDB Primary Key (2) What we don't know is that

    all secondary indexes also contain the primary key as the right- most column in the index (even if this is not exposed). That means when a secondary index is used to retrieve a record, two indexes are used: rst the secondary one pointing to the primary key that will be used to nally retrieve the record. When no primary key is de ned, the rst unique not null key is used. And if none is available, InnoDB will create an hidden primary key (6 bytes). The problem with such key is that you don’t have any control on it and worse, this value is global to all tables without primary keys and can be a contention problem if you perform multiple simultaneous writes on such tables (dict_sys->mutex). If you use HA solutions, Primary Keys are mandatory ! Copyright @ 2023 Oracle and/or its affiliates. 59
  57. InnoDB Primary Key (3) Primary Keys impact how the values

    are inserted and the size of the secondary indexes. A non sequential PK can lead to many random IOPS. Copyright @ 2023 Oracle and/or its affiliates. 60
  58. Also, it's more and more common to use applications that

    generate complete random primary keys...that means if the Primary Key is not sequential, InnoDB will have to heavily re-balance all the pages on inserts. InnoDB Primary Key (3) Primary Keys impact how the values are inserted and the size of the secondary indexes. A non sequential PK can lead to many random IOPS. Copyright @ 2023 Oracle and/or its affiliates. 60
  59. InnoDB Primary Key (4) If we compare the same load

    (inserts) when using an auto_increment integer as Primary Key, we can see that only the latest pages are recently touched: Generated with h ps://github.com/jeremycole/innodb_ruby from @jeremycole Copyright @ 2023 Oracle and/or its affiliates. 61
  60. InnoDB Primary Key (5) < > Copyright @ 2023 Oracle

    and/or its affiliates. My legacy application didn't define any Primary Key, adding one (auto_increment) breaks the application ! What can I do ? 62
  61. InnoDB Primary Key (5) < > Copyright @ 2023 Oracle

    and/or its affiliates. My legacy application didn't define any Primary Key, adding one (auto_increment) breaks the application ! What can I do ? Easy, just create a new invisible column and define it as Primary Key ! 62
  62. select select * * from from actors actors; ; +

    +----------------+-----+ ----------------+-----+ | | name name | | age age | | + +----------------+-----+ ----------------+-----+ | | Al Pacino Al Pacino | | 80 80 | | | | Robert De Niro Robert De Niro | | 77 77 | | | | Joe Pesci Joe Pesci | | 78 78 | | | | Sharon Stone Sharon Stone | | 63 63 | | | | Diane Keaton Diane Keaton | | 75 75 | | | | Talia Shire Talia Shire | | 74 74 | | + +----------------+-----+ ----------------+-----+ Invisible column as Primary Key Copyright @ 2023 Oracle and/or its affiliates. 63
  63. select select * * from from actors actors; ; +

    +----------------+-----+ ----------------+-----+ | | name name | | age age | | + +----------------+-----+ ----------------+-----+ | | Al Pacino Al Pacino | | 80 80 | | | | Robert De Niro Robert De Niro | | 77 77 | | | | Joe Pesci Joe Pesci | | 78 78 | | | | Sharon Stone Sharon Stone | | 63 63 | | | | Diane Keaton Diane Keaton | | 75 75 | | | | Talia Shire Talia Shire | | 74 74 | | + +----------------+-----+ ----------------+-----+ Do we have a Primary Key ? Invisible column as Primary Key Copyright @ 2023 Oracle and/or its affiliates. 63
  64. Invisible column as Primary Key (2) Let's nd out by

    listing all tables where the clustered index was generated (internal hidden key): select select i i. .table_id table_id, , t t. .name name from from information_schema information_schema. .innodb_indexes i innodb_indexes i join join information_schema information_schema. .innodb_tables t innodb_tables t on on ( (i i. .table_id table_id = = t t. .table_id table_id) ) where where i i. .name name= ='GEN_CLUST_INDEX' 'GEN_CLUST_INDEX'; ; + +----------+------------------+ ----------+------------------+ | | table_id table_id | | name name | | + +----------+------------------+ ----------+------------------+ | | 1293 1293 | | hollywood hollywood/ /actors actors | | + +----------+------------------+ ----------+------------------+ 1 1 row row in in set set ( (0.0211 0.0211 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. 64
  65. Invisible column as Primary Key (3) We can verify with

    the table's de nition: show show create create table table actors\G actors\G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * Table Table: actors : actors Create Create Table Table: : CREATE CREATE TABLE TABLE ` `actors actors` ` ( ( ` `name name` ` varchar varchar( (20 20) ) DEFAULT DEFAULT NULL NULL, , ` `age age` ` int int unsigned unsigned DEFAULT DEFAULT NULL NULL ) ) ENGINE ENGINE= =InnoDB InnoDB DEFAULT DEFAULT CHARSET CHARSET= =utf8mb4 utf8mb4 COLLATE COLLATE= =utf8mb4_0900_ai_ci utf8mb4_0900_ai_ci Copyright @ 2023 Oracle and/or its affiliates. 65
  66. Invisible column as Primary Key (3) We can verify with

    the table's de nition: show show create create table table actors\G actors\G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * Table Table: actors : actors Create Create Table Table: : CREATE CREATE TABLE TABLE ` `actors actors` ` ( ( ` `name name` ` varchar varchar( (20 20) ) DEFAULT DEFAULT NULL NULL, , ` `age age` ` int int unsigned unsigned DEFAULT DEFAULT NULL NULL ) ) ENGINE ENGINE= =InnoDB InnoDB DEFAULT DEFAULT CHARSET CHARSET= =utf8mb4 utf8mb4 COLLATE COLLATE= =utf8mb4_0900_ai_ci utf8mb4_0900_ai_ci Now let's add a hidden column as primary key: alter alter table table actors actors add add id id int int unsigned unsigned auto_increment auto_increment primary primary key key invisible invisible first first; ; Copyright @ 2023 Oracle and/or its affiliates. 65
  67. select select * * from from actors actors; ; +

    +----------------+-----+ ----------------+-----+ | | name name | | age age | | + +----------------+-----+ ----------------+-----+ | | Al Pacino Al Pacino | | 80 80 | | | | Robert De Niro Robert De Niro | | 77 77 | | | | Joe Pesci Joe Pesci | | 78 78 | | | | Sharon Stone Sharon Stone | | 63 63 | | | | Diane Keaton Diane Keaton | | 75 75 | | | | Talia Shire Talia Shire | | 74 74 | | + +----------------+-----+ ----------------+-----+ Invisible column as Primary Key (4) We can now test again our application's queries: Copyright @ 2023 Oracle and/or its affiliates. 66
  68. select select * * from from actors actors; ; +

    +----------------+-----+ ----------------+-----+ | | name name | | age age | | + +----------------+-----+ ----------------+-----+ | | Al Pacino Al Pacino | | 80 80 | | | | Robert De Niro Robert De Niro | | 77 77 | | | | Joe Pesci Joe Pesci | | 78 78 | | | | Sharon Stone Sharon Stone | | 63 63 | | | | Diane Keaton Diane Keaton | | 75 75 | | | | Talia Shire Talia Shire | | 74 74 | | + +----------------+-----+ ----------------+-----+ show show create create table table actors\G actors\G * ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * Table Table: actors : actors Create Create Table Table: : CREATE CREATE TABLE TABLE ` `actors actors` ` ( ( ` `id id` ` int int unsigned unsigned NOT NOT NULL NULL AUTO_INCREMENT AUTO_INCREMENT /*!80023 INVISIBLE */ /*!80023 INVISIBLE */, , ` `name name` ` varchar varchar( (20 20) ) DEFAULT DEFAULT NULL NULL, , ` `age age` ` int int unsigned unsigned DEFAULT DEFAULT NULL NULL, , PRIMARY PRIMARY KEY KEY ( (` `id id` `) ) ) ) ENGINE ENGINE= =InnoDB InnoDB AUTO_INCREMENT AUTO_INCREMENT= =7 7 DEFAULT DEFAULT CHARSET CHARSET= =utf8mb4 utf8mb4 COLLATE COLLATE= =utf8mb4_0900_ai_ci utf8mb4_0900_ai_ci Invisible column as Primary Key (4) We can now test again our application's queries: Copyright @ 2023 Oracle and/or its affiliates. 66
  69. Invisible column as Primary Key (5) < > Copyright @

    2023 Oracle and/or its affiliates. Great ! But what about inserts without specifying the columns ? 67
  70. Invisible column as Primary Key (5) < > insert insert

    into into actors actors values values ( ('James Caan' 'James Caan', , 81 81) ); ; Query OK Query OK, , 1 1 row row affected affected ( (0.0248 0.0248 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. Great ! But what about inserts without specifying the columns ? 67
  71. select select id id, , a a. .* * from

    from actors a actors a; ; + +----+----------------+-----+ ----+----------------+-----+ | | id id | | name name | | age age | | + +----+----------------+-----+ ----+----------------+-----+ | | 1 1 | | Al Pacino Al Pacino | | 80 80 | | | | 2 2 | | Robert De Niro Robert De Niro | | 77 77 | | | | 3 3 | | Joe Pesci Joe Pesci | | 78 78 | | | | 4 4 | | Sharon Stone Sharon Stone | | 63 63 | | | | 5 5 | | Diane Keaton Diane Keaton | | 75 75 | | | | 6 6 | | Talia Shire Talia Shire | | 74 74 | | | | 7 7 | | James Caan James Caan | | 81 81 | | + +----+----------------+-----+ ----+----------------+-----+ Invisible column as Primary Key (6) But if needed we have access to that PK id: Copyright @ 2023 Oracle and/or its affiliates. 68
  72. select select id id, , a a. .* * from

    from actors a actors a; ; + +----+----------------+-----+ ----+----------------+-----+ | | id id | | name name | | age age | | + +----+----------------+-----+ ----+----------------+-----+ | | 1 1 | | Al Pacino Al Pacino | | 80 80 | | | | 2 2 | | Robert De Niro Robert De Niro | | 77 77 | | | | 3 3 | | Joe Pesci Joe Pesci | | 78 78 | | | | 4 4 | | Sharon Stone Sharon Stone | | 63 63 | | | | 5 5 | | Diane Keaton Diane Keaton | | 75 75 | | | | 6 6 | | Talia Shire Talia Shire | | 74 74 | | | | 7 7 | | James Caan James Caan | | 81 81 | | + +----+----------------+-----+ ----+----------------+-----+ And this id is sequential, used as clustered index to store the data and externalized for replication ! Invisible column as Primary Key (6) But if needed we have access to that PK id: Copyright @ 2023 Oracle and/or its affiliates. 68
  73. Invisible column as Primary Key (7) Since MySQL 8.0.30 you

    can also enable GIPK mode ! GIPK mode is controlled by the sql_generate_invisible_primary_key server system variable. When MySQL is running in GIPK mode, a primary key is added to a table by the server, the column and key name is always my_row_id. Copyright @ 2023 Oracle and/or its affiliates. 69
  74. InnoDB Secondary Key Indexes other than the clustered index are

    known as secondary indexes. Remember that in InnoDB, each record in a secondary index contains the primary key columns for the row (right most), as well as the columns speci ed for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index. If the Primary Key is long, the secondary indexes use more space. It's advantageous to have a short Primary Key. Copyright @ 2023 Oracle and/or its affiliates. 70
  75. Indexing on a pre x of a column create create

    index index part_of_name part_of_name on on city city ( (name name( (10 10) )) ); ; Only the rst 10 characters are indexed in this example and this can save space/speed. Copyright @ 2023 Oracle and/or its affiliates. 71
  76. Indexing on a pre x of a column (2) Let's

    compare the between this pre x index and an index using the full column: select select database_name database_name, , table_name table_name, , index_name index_name, , stat_value stat_value * * @ @@innodb_page_size @innodb_page_size as as size_in_bytes size_in_bytes from from mysql mysql. .innodb_index_stats innodb_index_stats where where stat_name stat_name = = 'size' 'size' and and database_name database_name= ='world' 'world' and and table_name table_name= ='city' 'city' and and index_name index_name like like '%name%' '%name%' order order by by size_in_bytes size_in_bytes desc desc; ; + +---------------+------------+--------------+---------------+ ---------------+------------+--------------+---------------+ | | database_name database_name | | table_name table_name | | index_name index_name | | size_in_bytes size_in_bytes | | + +---------------+------------+--------------+---------------+ ---------------+------------+--------------+---------------+ | | world world | | city city | | name_idx name_idx | | 212992 212992 | | | | world world | | city city | | part_of_name part_of_name | | 114688 114688 | | + +---------------+------------+--------------+---------------+ ---------------+------------+--------------+---------------+ Copyright @ 2023 Oracle and/or its affiliates. 72
  77. Indexing on a pre x of a column (3) Copyright

    @ 2023 Oracle and/or its affiliates. 73
  78. Indexing on a pre x of a column (4) Copyright

    @ 2023 Oracle and/or its affiliates. 74
  79. Indexing on a pre x of a column (4) We

    see that both indexes on name are candidates and the partial one got the preference. Copyright @ 2023 Oracle and/or its affiliates. 74
  80. Index key_len What does that 40 mean ? Copyright @

    2023 Oracle and/or its affiliates. 75
  81. Index key_len What does that 40 mean ? The key_len

    column indicates the length of the key that MySQL decided to use. Copyright @ 2023 Oracle and/or its affiliates. 75
  82. Index key_len (2) < > Copyright @ 2023 Oracle and/or

    its affiliates. Oh... Okay... but why 40 ? It doesn't make any sense, does it ? 76
  83. Index key_len (2) < > Copyright @ 2023 Oracle and/or

    its affiliates. Oh... Okay... but why 40 ? It doesn't make any sense, does it ? In fact, we indexed the first 10 characters of the 'name' column... but this uses utf8mb4 charset: 1 character is encoded in up to 4 bytes 10 x 4 bytes = 40 bytes per record in the index 76
  84. Multi-column Index It's also possible to index multiple columns in

    one single index: create create index index first_last_idx first_last_idx on on employees employees ( (first_name first_name, , last_name last_name) ); ; Copyright @ 2023 Oracle and/or its affiliates. 77
  85. Multi-column Index It's also possible to index multiple columns in

    one single index: create create index index first_last_idx first_last_idx on on employees employees ( (first_name first_name, , last_name last_name) ); ; This index will be work on ( rst_name, lastname) and ( rst_name) but not on (last_name). Put highest cardinality eld rst ! Copyright @ 2023 Oracle and/or its affiliates. 77
  86. Multi-column Index It's also possible to index multiple columns in

    one single index: create create index index first_last_idx first_last_idx on on employees employees ( (first_name first_name, , last_name last_name) ); ; This index will be work on ( rst_name, lastname) and ( rst_name) but not on (last_name). Put highest cardinality eld rst ! Indexes are parsed from left to right Copyright @ 2023 Oracle and/or its affiliates. 77
  87. Multi-column Index Example The value of key_len enables you to

    determine how many parts of a multiple-part key MySQL actually uses. Copyright @ 2023 Oracle and/or its affiliates. 78
  88. Multi-column Index Example The value of key_len enables you to

    determine how many parts of a multiple-part key MySQL actually uses. Copyright @ 2023 Oracle and/or its affiliates. show show create create table table employees\G employees\G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * Table Table: employees : employees Create Create Table Table: : CREATE CREATE TABLE TABLE ` `employees employees` ` ( ( ` `emp_no emp_no` ` int int NOT NOT NULL NULL, , ` `birth_date birth_date` ` date date NOT NOT NULL NULL, , ` `first_name first_name` ` varchar varchar( (14 14) ) NOT NOT NULL NULL, , ` `last_name last_name` ` varchar varchar( (16 16) ) NOT NOT NULL NULL, , ` `gender gender` ` enum enum( ('M' 'M', ,'F' 'F') ) NOT NOT NULL NULL, , ` `hire_date hire_date` ` date date NOT NOT NULL NULL, , PRIMARY PRIMARY KEY KEY ( (` `emp_no emp_no` `) ), , KEY KEY ` `first_last_idx first_last_idx` ` ( (` `first_name first_name` `, ,` `last_name last_name` `) ) ) ) ENGINE ENGINE= =InnoDB InnoDB DEFAULT DEFAULT CHARSET CHARSET= =utf8mb4 utf8mb4 COLLATE COLLATE= =utf8mb4_0900_ai_ci utf8mb4_0900_ai_ci select select ( (14 14* *4 4) )+ +2 2+ +( (16 16* *4 4) )+ +2 2; ; + +-------------------+ -------------------+ | | ( (14 14* *4 4) )+ +2 2+ +( (16 16* *4 4) )+ +2 2 | | ( (+ +2 2 --> VARCHAR's length is coded on 2 bytes) --> VARCHAR's length is coded on 2 bytes) + +-------------------+ -------------------+ | | 124 124 | | + +-------------------+ -------------------+ 78
  89. Multi-column Index Example (2) 14 x 4 + 2 =

    58 Copyright @ 2023 Oracle and/or its affiliates. 79
  90. Multi-column Index Example (3) the left-most part of the index

    cannot be used --> the index is not used Copyright @ 2023 Oracle and/or its affiliates. 80
  91. select select emp_no emp_no, , first_name first_name, , last_name last_name,

    , hire_date hire_date from from employees employees where where last_name last_name like like 'de%' 'de%' limit limit 10 10; ; select select emp_no emp_no, , first_name first_name, , last_name last_name, , hire_date hire_date from from employees employees where where last_name last_name like like 'de%' 'de%' order order by by first_name first_name limit limit 10 10; ; Multi-column Index: challenge What do you think about these two statements: [A] none uses the index [B] the left-one uses the index [C] the right-one uses the index Copyright @ 2023 Oracle and/or its affiliates. 81
  92. select select emp_no emp_no, , first_name first_name, , last_name last_name,

    , hire_date hire_date from from employees employees where where last_name last_name like like 'de%' 'de%' limit limit 10 10; ; select select emp_no emp_no, , first_name first_name, , last_name last_name, , hire_date hire_date from from employees employees where where last_name last_name like like 'de%' 'de%' order order by by first_name first_name limit limit 10 10; ; Multi-column Index: challenge What do you think about these two statements: [A] none uses the index [B] the left-one uses the index [C] the right-one uses the index Copyright @ 2023 Oracle and/or its affiliates. 81
  93. Multi-column Index: hashing values If you need to perform search

    of the exact value (not using like or range) of multiple large columns, some times it could be more e cient to use a hash function and index its result: alter alter table table employees employees add add column column hash_bin_names hash_bin_names binary binary( (16 16) ) generated always generated always as as ( (unhex unhex( (md5 md5( (concat concat( (first_name first_name, , last_name last_name) )) )) )) ) virtual virtual, , add add key key hash_bin_idx hash_bin_idx( (hash_bin_names hash_bin_names) ); ; Copyright @ 2023 Oracle and/or its affiliates. 82
  94. Multi-column Index: hashing values And now let's rewrite the query

    and exame the QEP: explain explain select select emp_no emp_no, , first_name first_name, , last_name last_name, , hire_date hire_date from from employees employees where where hash_bin_names hash_bin_names= =unhex unhex( (md5 md5( ('AamodDeville' 'AamodDeville') )) ) and and first_name first_name= ='Aamod' 'Aamod' and and last_name last_name like like 'Deville' 'Deville' order order by by first_name first_name limit limit 10 10\G \G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * id: id: 1 1 select_type: select_type: SIMPLE SIMPLE table table: employees : employees partitions: partitions: NULL NULL type type: ref : ref possible_keys: first_last_idx possible_keys: first_last_idx, ,hash_bin_idx hash_bin_idx key key: hash_bin_idx : hash_bin_idx key_len: key_len: 17 17 ref: const ref: const rows rows: : 1 1 filtered: filtered: 5 5 Extra: Extra: Using Using where where Copyright @ 2023 Oracle and/or its affiliates. 83
  95. Functional Indexes MySQL supports functional key parts that index expression

    values rather than column or column pre x values. Use of functional key parts enables indexing of values not stored directly in the table. Copyright @ 2023 Oracle and/or its affiliates. 84
  96. Functional Indexes MySQL supports functional key parts that index expression

    values rather than column or column pre x values. Use of functional key parts enables indexing of values not stored directly in the table. Let's suppose we want to retriev all employees that were hired in March: select select first_name first_name, , hire_date hire_date from from employees employees where where month month( (hire_date hire_date) )= =3 3; ; Copyright @ 2023 Oracle and/or its affiliates. 84
  97. Functional Indexes (2) Get the Query Execution Plan: explain explain

    select select first_name first_name, , hire_date hire_date from from employees employees where where month month( (hire_date hire_date) )= =3 3\G \G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * id: id: 1 1 select_type: select_type: SIMPLE SIMPLE table table: employees : employees partitions: partitions: NULL NULL type type: : ALL ALL possible_keys: possible_keys: NULL NULL key key: : NULL NULL key_len: key_len: NULL NULL ref: ref: NULL NULL rows rows: : 299379 299379 filtered: filtered: 100 100 Extra: Extra: Using Using where where Copyright @ 2023 Oracle and/or its affiliates. 85
  98. Functional Indexes (2) Get the Query Execution Plan: explain explain

    select select first_name first_name, , hire_date hire_date from from employees employees where where month month( (hire_date hire_date) )= =3 3\G \G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * id: id: 1 1 select_type: select_type: SIMPLE SIMPLE table table: employees : employees partitions: partitions: NULL NULL type type: : ALL ALL possible_keys: possible_keys: NULL NULL key key: : NULL NULL key_len: key_len: NULL NULL ref: ref: NULL NULL rows rows: : 299379 299379 filtered: filtered: 100 100 Extra: Extra: Using Using where where FULL TABLE SCAN ! Copyright @ 2023 Oracle and/or its affiliates. 85
  99. Functional Indexes (3) create create index index month_hire_idx month_hire_idx on

    on employees employees ( (( (month month( (hire_date hire_date) )) )) ); ; please mind the please mind the ( (( (. .. .. .) )) ) notation notation Copyright @ 2023 Oracle and/or its affiliates. 86
  100. Functional Indexes (3) create create index index month_hire_idx month_hire_idx on

    on employees employees ( (( (month month( (hire_date hire_date) )) )) ); ; please mind the please mind the ( (( (. .. .. .) )) ) notation notation explain explain select select first_name first_name, , hire_date hire_date from from employees employees where where month month( (hire_date hire_date) )= =3 3\G \G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * id: id: 1 1 select_type: select_type: SIMPLE SIMPLE table table: employees : employees partitions: partitions: NULL NULL type type: ref : ref possible_keys: month_hire_idx possible_keys: month_hire_idx key key: month_hire_idx : month_hire_idx key_len: key_len: 5 5 ref: const ref: const rows rows: : 51684 51684 filtered: filtered: 100 100 Extra: Extra: NULL NULL Copyright @ 2023 Oracle and/or its affiliates. 86
  101. Please Keep in Mind... If there is a choice between

    multiple indexes, MySQL normally uses the index that nds the smallest number of rows (the most selective index). MySQL can use indexes on columns more e ciently if they are declared as the same type and size. Copyright @ 2023 Oracle and/or its affiliates. 87
  102. NULL is used to designate a LACK of data: False

    0 True 1 Don't know NULL NULL Indexing NULL values really drives down the performances of indexes. Copyright @ 2023 Oracle and/or its affiliates. 88
  103. Before Invisible Indexes . doubt usefulness of index . remove

    that index . get phone/text/screams from power user about slow performance . the rest of the planet seems to need that dang index ! . recreate the index... and it can take a looooooong time After Invisible Indexes . doubt usefulness of index . make index invisible - optimizer can not see it! . get phone/text/screams from power user about slow performance . make index visible again . blame a problem on { network | hardware | cloud | a colleague} Invisible Indexes MySQL o ers the possibility to hide indexes from the optimizer. This feature is very useful for testing the relevance of indexes before deleting them. And very useful for the operations team. Copyright @ 2023 Oracle and/or its affiliates. 89
  104. How to use INVISIBLE INDEX alter alter table table employees

    employees alter alter index index first_last_idx invisible first_last_idx invisible; ; alter alter table table employees employees alter alter index index first_last_idx visible first_last_idx visible; ; Copyright @ 2023 Oracle and/or its affiliates. 90
  105. How to use INVISIBLE INDEX alter alter table table employees

    employees alter alter index index first_last_idx invisible first_last_idx invisible; ; alter alter table table employees employees alter alter index index first_last_idx visible first_last_idx visible; ; List all invisible indexes: select select table_name table_name, , index_name index_name, , is_visible is_visible from from information_schema information_schema. .statistics statistics where where is_visible is_visible= ='no' 'no' group group by by table_name table_name, , index_name index_name; ; + +------------+----------------+------------+ ------------+----------------+------------+ | | TABLE_NAME TABLE_NAME | | INDEX_NAME INDEX_NAME | | IS_VISIBLE IS_VISIBLE | | + +------------+----------------+------------+ ------------+----------------+------------+ | | employees employees | | first_last_idx first_last_idx | | NO NO | | + +------------+----------------+------------+ ------------+----------------+------------+ Copyright @ 2023 Oracle and/or its affiliates. 90
  106. Unused Indexes We learned that updating indexes that are not

    used can be expensive and increase the iops. Using sys Schema and innodb_index_stats it's possible to identify those unused indexes: select select database_name database_name, , table_name table_name, , t1 t1. .index_name index_name, , format_bytes format_bytes( (stat_value stat_value * * @ @@innodb_page_size @innodb_page_size) ) size size from from mysql mysql. .innodb_index_stats t1 innodb_index_stats t1 join join sys sys. .schema_unused_indexes t2 schema_unused_indexes t2 on on object_schema object_schema= =database_name database_name and and object_name object_name= =table_name table_name and and t2 t2. .index_name index_name= =t1 t1. .index_name index_name where where stat_name stat_name= ='size' 'size' order order by by stat_value stat_value desc desc; ; Copyright @ 2023 Oracle and/or its affiliates. 91
  107. Unused Indexes We learned that updating indexes that are not

    used can be expensive and increase the iops. Using sys Schema and innodb_index_stats it's possible to identify those unused indexes: select select database_name database_name, , table_name table_name, , t1 t1. .index_name index_name, , format_bytes format_bytes( (stat_value stat_value * * @ @@innodb_page_size @innodb_page_size) ) size size from from mysql mysql. .innodb_index_stats t1 innodb_index_stats t1 join join sys sys. .schema_unused_indexes t2 schema_unused_indexes t2 on on object_schema object_schema= =database_name database_name and and object_name object_name= =table_name table_name and and t2 t2. .index_name index_name= =t1 t1. .index_name index_name where where stat_name stat_name= ='size' 'size' order order by by stat_value stat_value desc desc; ; Copyright @ 2023 Oracle and/or its affiliates. select select database_name database_name, , table_name table_name, , t1 t1. .index_name index_name, , format_bytes format_bytes( (stat_value stat_value * * @ @@innodb_page_size @innodb_page_size) ) size size from from mysql mysql. .innodb_index_stats t1 innodb_index_stats t1 join join sys sys. .schema_unused_indexes t2 schema_unused_indexes t2 on on object_schema object_schema= =database_name database_name and and object_name object_name= =table_name table_name and and t2 t2. .index_name index_name= =t1 t1. .index_name index_name where where stat_name stat_name= ='size' 'size' and and database_name database_name= ="employees" "employees" order order by by stat_value stat_value desc desc; ; + +---------------+--------------+---------------------+-----------+ ---------------+--------------+---------------------+-----------+ | | database_name database_name | | table_name table_name | | index_name index_name | | size size | | + +---------------+--------------+---------------------+-----------+ ---------------+--------------+---------------------+-----------+ | | employees employees | | employees employees | | hash_bin_names2 hash_bin_names2 | | 9.52 9.52 MiB MiB | | | | employees employees | | employees employees | | month_year_hire_idx month_year_hire_idx | | 6.52 6.52 MiB MiB | | | | employees employees | | dept_emp dept_emp | | dept_no dept_no | | 5.52 5.52 MiB MiB | | | | employees employees | | dept_manager dept_manager | | dept_no dept_no | | 16.00 16.00 KiB KiB | | + +---------------+--------------+---------------------+-----------+ ---------------+--------------+---------------------+-----------+ 4 4 rows rows in in set set ( (0.0252 0.0252 sec sec) ) 91
  108. Duplicate Indexes And this is the same behaviour for duplicate

    indexes. There is no reason to keep maintaining them: select select t2 t2. .* *, , format_bytes format_bytes( (stat_value stat_value * * @ @@innodb_page_size @innodb_page_size) ) size size from from mysql mysql. .innodb_index_stats t1 innodb_index_stats t1 join join sys sys. .schema_redundant_indexes t2 schema_redundant_indexes t2 on on table_schema table_schema= =database_name database_name and and t2 t2. .table_name table_name= =t1 t1. .table_name table_name and and t2 t2. .redundant_index_name redundant_index_name= =t1 t1. .index_name index_name where where stat_name stat_name= ='size' 'size' order order by by stat_value stat_value desc desc\G \G Copyright @ 2023 Oracle and/or its affiliates. 92
  109. Duplicate Indexes And this is the same behaviour for duplicate

    indexes. There is no reason to keep maintaining them: select select t2 t2. .* *, , format_bytes format_bytes( (stat_value stat_value * * @ @@innodb_page_size @innodb_page_size) ) size size from from mysql mysql. .innodb_index_stats t1 innodb_index_stats t1 join join sys sys. .schema_redundant_indexes t2 schema_redundant_indexes t2 on on table_schema table_schema= =database_name database_name and and t2 t2. .table_name table_name= =t1 t1. .table_name table_name and and t2 t2. .redundant_index_name redundant_index_name= =t1 t1. .index_name index_name where where stat_name stat_name= ='size' 'size' order order by by stat_value stat_value desc desc\G \G Copyright @ 2023 Oracle and/or its affiliates. * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * table_schema: world table_schema: world table_name: city table_name: city redundant_index_name: part_of_name redundant_index_name: part_of_name redundant_index_columns: Name redundant_index_columns: Name redundant_index_non_unique: redundant_index_non_unique: 1 1 dominant_index_name: name_idx dominant_index_name: name_idx dominant_index_columns: Name dominant_index_columns: Name dominant_index_non_unique: dominant_index_non_unique: 1 1 subpart_exists: subpart_exists: 1 1 sql_drop_index: sql_drop_index: ALTER ALTER TABLE TABLE ` `world world` `. .` `city city` ` DROP DROP INDEX INDEX ` `part_of_name part_of_name` ` size: size: 112.00 112.00 KiB KiB * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 2. 2. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * table_schema: world table_schema: world table_name: countrylanguage table_name: countrylanguage redundant_index_name: CountryCode redundant_index_name: CountryCode redundant_index_columns: CountryCode redundant_index_columns: CountryCode redundant_index_non_unique: redundant_index_non_unique: 1 1 dominant_index_name: dominant_index_name: PRIMARY PRIMARY dominant_index_columns: CountryCode dominant_index_columns: CountryCode, ,Language Language dominant_index_non_unique: dominant_index_non_unique: 0 0 subpart_exists: subpart_exists: 0 0 sql_drop_index: sql_drop_index: ALTER ALTER TABLE TABLE ` `world world` `. .` `countrylanguage countrylanguage` ` DROP DROP INDEX INDEX ` `CountryCode CountryCode` ` size: size: 64.00 64.00 KiB KiB 2 2 rows rows in in set set ( (0.0330 0.0330 sec sec) ) 92
  110. Don't forget ! Do not take recommendations at face value,

    check before deleting an index. Do not delete an index immediately, but rst set it as INVISIBLE for some time. Once in a while this index might be used, like for a monthly report. Copyright @ 2023 Oracle and/or its affiliates. 93
  111. Don't forget ! Do not take recommendations at face value,

    check before deleting an index. Do not delete an index immediately, but rst set it as INVISIBLE for some time. Once in a while this index might be used, like for a monthly report. Copyright @ 2023 Oracle and/or its affiliates. But when I add or remove an Index, can I estimate the time left ? 93
  112. ALTER Progression select select stmt stmt. .thread_id thread_id, , stmt

    stmt. .sql_text sql_text, , stage stage. .event_name event_name as as state state, , stage stage. .work_completed work_completed, , stage stage. .work_estimated work_estimated, , lpad lpad( (concat concat( (round round( (100 100* *stage stage. .work_completed work_completed/ /stage stage. .work_estimated work_estimated, , 2 2) ), ,"%" "%") ), ,10 10, ," " " ") ) as as completed_at completed_at, , lpad lpad( (format_pico_time format_pico_time( (stmt stmt. .timer_wait timer_wait) ), , 10 10, , " " " ") ) as as started_ago started_ago, , lpad lpad( (format_pico_time format_pico_time( (stmt stmt. .timer_wait timer_wait/ /round round( (100 100* *stage stage. .work_completed work_completed/ /stage stage. .work_estimated work_estimated, ,2 2) )* *100 100) ), , 10 10, , " " " ") ) as as estimated_full_time estimated_full_time, , lpad lpad( (format_pico_time format_pico_time( (( (stmt stmt. .timer_wait timer_wait/ /round round( (100 100* *stage stage. .work_completed work_completed/ /stage stage. .work_estimated work_estimated, ,2 2) )* *100 100) ) - -stmt stmt. .timer_wait timer_wait) ), , 10 10, , " " " ") ) as as estimated_remaining_time estimated_remaining_time, , current_allocated memory current_allocated memory from from performance_schema performance_schema. .events_statements_current stmt events_statements_current stmt inner inner join join sys sys. .memory_by_thread_by_current_bytes mt memory_by_thread_by_current_bytes mt on on mt mt. .thread_id thread_id = = stmt stmt. .thread_id thread_id inner inner join join performance_schema performance_schema. .events_stages_current stage events_stages_current stage on on stage stage. .thread_id thread_id = = stmt stmt. .thread_id\G thread_id\G Copyright @ 2023 Oracle and/or its affiliates. 94
  113. Index Creation is slow < > Copyright @ 2023 Oracle

    and/or its affiliates. Creating indexes is a very slow operation even on my powerfull server with multiple cores ! Anything I can do ? 96
  114. Index Creation is slow < > Copyright @ 2023 Oracle

    and/or its affiliates. Creating indexes is a very slow operation even on my powerfull server with multiple cores ! Anything I can do ? Since MySQL 8.0.27, you have the possibility to control the maximum of parallel threads InnoDB can use to create seconday indexes ! 96
  115. Parallel Index Creation The amount of parallel threads used by

    InnoDB is controlled by innodb_ddl_threads. This new variable is coupled with another new variable: innodb_ddl_buffer_size. If you have fast storage and multiple CPU cores, tuning these variables can speed up secondary index creation. Copyright @ 2023 Oracle and/or its affiliates. 97
  116. Parallel Index Creation - example SQL SQL> > alter alter

    table table booking booking add add index index idx_2 idx_2( (flight_id flight_id, , seat seat, , passenger_id passenger_id) ); ; Query OK Query OK, , 0 0 rows rows affected affected ( (9 9 min min 0.6838 0.6838 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. 98
  117. Parallel Index Creation - example SQL SQL> > alter alter

    table table booking booking add add index index idx_2 idx_2( (flight_id flight_id, , seat seat, , passenger_id passenger_id) ); ; Query OK Query OK, , 0 0 rows rows affected affected ( (9 9 min min 0.6838 0.6838 sec sec) ) The default se ings are: innodb_ddl_threads = 4 innodb_ddl_buffer_size = 1048576 innodb_parallel_read_threads = 4 Copyright @ 2023 Oracle and/or its affiliates. 98
  118. Parallel Index Creation - example SQL SQL> > alter alter

    table table booking booking add add index index idx_2 idx_2( (flight_id flight_id, , seat seat, , passenger_id passenger_id) ); ; Query OK Query OK, , 0 0 rows rows affected affected ( (9 9 min min 0.6838 0.6838 sec sec) ) The default se ings are: innodb_ddl_threads = 4 innodb_ddl_buffer_size = 1048576 innodb_parallel_read_threads = 4 The innodb_ddl_buffer_size is shared between all innodb_ddl_threads de ned. If you increase the amount of threads, I recommend that you also increase the bu er size. Copyright @ 2023 Oracle and/or its affiliates. 98
  119. Parallel Index Creation - example (2) To nd the best

    values for these variables, let's have a look at the amount of CPU cores: SQL SQL> > select select count count from from information_schema information_schema. .INNODB_METRICS INNODB_METRICS where where name name = = 'cpu_n' 'cpu_n'; ; + +-------+ -------+ | | count count | | + +-------+ -------+ | | 16 16 | | + +-------+ -------+ We have then 16 cores to share. As my machine as plenty of memory, I will allocate 1GB for the InnoDB DDL bu er. Copyright @ 2023 Oracle and/or its affiliates. 99
  120. Parallel Index Creation - example (3) SQL SQL> > SET

    SET innodb_ddl_threads innodb_ddl_threads = = 8 8; ; SQL SQL> > SET SET innodb_parallel_read_threads innodb_parallel_read_threads = = 8 8; ; SQL SQL> > SET SET innodb_ddl_buffer_size innodb_ddl_buffer_size = = 1048576000 1048576000; ; Copyright @ 2023 Oracle and/or its affiliates. 100
  121. Parallel Index Creation - example (3) SQL SQL> > SET

    SET innodb_ddl_threads innodb_ddl_threads = = 8 8; ; SQL SQL> > SET SET innodb_parallel_read_threads innodb_parallel_read_threads = = 8 8; ; SQL SQL> > SET SET innodb_ddl_buffer_size innodb_ddl_buffer_size = = 1048576000 1048576000; ; We can now retry the same index creation as previously: SQL SQL> > alter alter table table booking booking add add index index idx_2 idx_2( (flight_id flight_id, , seat seat, , passenger_id passenger_id) ); ; Query OK Query OK, , 0 0 rows rows affected affected ( (3 3 min min 9.1862 9.1862 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. 100
  122. Parallel Index Creation - example (4) I recommend to make

    tests to de ne the optimal se ings for your database, your hardware and data. For example, I got the best result se ing the bu er size to 2GB and both ddl threads and parallel read threads to 4. It took 2 min 43 sec, much be er than the initial 9 minutes ! For more information, go to h ps://lefred.be/content/mysql-8-0-innodb-parallel-threads- for-online-ddl-operations/ Copyright @ 2023 Oracle and/or its affiliates. 101
  123. Histograms What is a histogram? Wikipedia declares a histogram is

    an accurate representation of the distribution of numerical data. For RDBMS, a histogram is an approximation of the data distribution within a speci c column. So in MySQL, histograms help the optimizer to nd the most e cient Query Plan to fetch that data. Copyright @ 2023 Oracle and/or its affiliates. 103
  124. Histograms in MySQL MySQL provides: statement histograms optimizer histograms the

    second category is what we need to focus on today ! Copyright @ 2023 Oracle and/or its affiliates. 104
  125. Statements Histograms This is an example of query response time

    distribution for a statement: Copyright @ 2023 Oracle and/or its affiliates. 105
  126. Global Statements Histograms If you want a global overview of

    all statements: SELECT SELECT CONCAT CONCAT( ('<' '<', ,ROUND ROUND( (BUCKET_TIMER_HIGH BUCKET_TIMER_HIGH/ /1000000 1000000, ,2 2) ), , ' microsec (<' ' microsec (<', ,ROUND ROUND( (BUCKET_TIMER_HIGH BUCKET_TIMER_HIGH/ /1000000000 1000000000, ,2 2) ) , ,'ms)' 'ms)') ) QRT QRT, , CONCAT CONCAT( (RPAD RPAD( ('' '', ,ROUND ROUND( (BUCKET_QUANTILE BUCKET_QUANTILE* *100 100) ), ,'*' '*') ), , ROUND ROUND( (BUCKET_QUANTILE BUCKET_QUANTILE* *100 100, ,2 2) ), ,"%" "%") ) bar bar FROM FROM events_statements_histogram_global events_statements_histogram_global WHERE WHERE count_bucket count_bucket> >0 0; ; Copyright @ 2023 Oracle and/or its affiliates. 106
  127. Optimizer Histograms in MySQL A histogram is a distribution of

    data into logical buckets There are two types of histograms: singleton equi-height The maximum number of buckets is 1024. Copyright @ 2023 Oracle and/or its affiliates. 107
  128. Optimizer Histograms in MySQL (2) So to help the optimizer

    to nd the most e cient Query Plan, histograms can be created. As we know, a histogram is an approximation of the data distribution within a speci c column. Histograms are useful for columns NOT being candidate to have indexes. A histogram is created or updated only on demand, so it adds no overhead when table data is modi ed. On the other hand, the statistics become progressively more out of date when table modi cations occur, until the next time they are updated. Copyright @ 2023 Oracle and/or its affiliates. 108
  129. Optimizer Histograms in MySQL (2) So to help the optimizer

    to nd the most e cient Query Plan, histograms can be created. As we know, a histogram is an approximation of the data distribution within a speci c column. Histograms are useful for columns NOT being candidate to have indexes. A histogram is created or updated only on demand, so it adds no overhead when table data is modi ed. On the other hand, the statistics become progressively more out of date when table modi cations occur, until the next time they are updated. As an example is be er than 1000 words, let's see how we can create and bene t from histograms. (*)Example from MySQL 8 Query Performance Tuning, Jesper Wisborg Krogh, Apress, 2020 Copyright @ 2023 Oracle and/or its affiliates. 108
  130. SELECT SELECT film_id film_id, , title title, , length length,

    , GROUP_CONCAT GROUP_CONCAT( ( CONCAT_WS CONCAT_WS( (' ' ' ', , first_name first_name, , last_name last_name) ) ) ) AS AS actors actors FROM FROM sakila sakila. .film film INNER INNER JOIN JOIN sakila sakila. .film_actor film_actor USING USING ( (film_id film_id) ) INNER INNER JOIN JOIN sakila sakila. .actor actor USING USING ( (actor_id actor_id) ) WHERE WHERE length length < < 55 55 AND AND first_name first_name = = 'Groucho' 'Groucho' GROUP GROUP BY BY film_id film_id; ; Optimizer Histograms in MySQL (3) Consider this query: Copyright @ 2023 Oracle and/or its affiliates. 109
  131. SELECT SELECT film_id film_id, , title title, , length length,

    , GROUP_CONCAT GROUP_CONCAT( ( CONCAT_WS CONCAT_WS( (' ' ' ', , first_name first_name, , last_name last_name) ) ) ) AS AS actors actors FROM FROM sakila sakila. .film film INNER INNER JOIN JOIN sakila sakila. .film_actor film_actor USING USING ( (film_id film_id) ) INNER INNER JOIN JOIN sakila sakila. .actor actor USING USING ( (actor_id actor_id) ) WHERE WHERE length length < < 55 55 AND AND first_name first_name = = 'Groucho' 'Groucho' GROUP GROUP BY BY film_id film_id; ; It returns 6 rows: 6 rows in set (0.0143 sec) Optimizer Histograms in MySQL (3) Consider this query: Copyright @ 2023 Oracle and/or its affiliates. 109
  132. Optimizer Histograms in MySQL (4) Let's have a look at

    the Query Execution Plan: Copyright @ 2023 Oracle and/or its affiliates. 110
  133. Optimizer Histograms in MySQL (5) Now we will create an

    histogram on the length column of the lm table: ANALYZE ANALYZE TABLE TABLE sakila sakila. .film film UPDATE UPDATE HISTOGRAM HISTOGRAM ON ON length length WITH WITH 256 256 BUCKETS\G BUCKETS\G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * Table Table: sakila : sakila. .film film Op: histogram Op: histogram Msg_type: Msg_type: status status Msg_text: Histogram Msg_text: Histogram statistics statistics created created for for column column 'length' 'length'. . Copyright @ 2023 Oracle and/or its affiliates. 111
  134. Optimizer Histograms in MySQL (6) Let's have a new look

    at the Query Execution Plan: Copyright @ 2023 Oracle and/or its affiliates. 112
  135. Optimizer Histograms in MySQL (7) For info, the query is

    now faster: 6 rows in set (0.0035 sec) Copyright @ 2023 Oracle and/or its affiliates. 113
  136. Optimizer Histograms in MySQL (8) You can also retrieve some

    information about the histograms: SELECT SELECT SCHEMA_NAME SCHEMA_NAME, , TABLE_NAME TABLE_NAME, , COLUMN_NAME COLUMN_NAME, , HISTOGRAM HISTOGRAM- -> >'$."data-type"' '$."data-type"' 'data-type' 'data-type', , HISTOGRAM HISTOGRAM- -> >'$."last-updated"' '$."last-updated"' 'last-updated' 'last-updated', , HISTOGRAM HISTOGRAM- -> >'$."histogram-type"' '$."histogram-type"' 'histogram-type' 'histogram-type', , HISTOGRAM HISTOGRAM- -> >'$."number-of-buckets-specified"' '$."number-of-buckets-specified"' 'specified-buckets' 'specified-buckets' FROM FROM information_schema information_schema. .column_statistics column_statistics WHERE WHERE COLUMN_NAME COLUMN_NAME = = 'length' 'length'; ; +-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+ | SCHEMA_NAME | TABLE_NAME | COLUMN_NAME | data-type | last-updated | histogram-type | specified-buckets | +-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+ | sakila | film | length | "int" | "2021-02-24 09:59:13.046631" | "singleton" | 256 | +-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+ Copyright @ 2023 Oracle and/or its affiliates. 114
  137. select select v v value value, , concat concat( (round

    round( (c c* *100 100, ,1 1) ), ,'%' '%') ) cumulfreq cumulfreq, , concat concat( (round round( (( (c c - - LAG LAG( (c c, , 1 1, , 0 0) ) over over( () )) ) * * 100 100, ,1 1) ), , '%' '%') ) freq freq from from information_schema information_schema. .column_statistics column_statistics, , JSON_TABLE JSON_TABLE( (histogram histogram- -> >'$.buckets' '$.buckets', , '$[*]' '$[*]' COLUMNS COLUMNS( (v v VARCHAR VARCHAR( (60 60) ) PATH PATH '$[0]' '$[0]', , c c double double PATH PATH '$[1]' '$[1]') )) ) hist hist where where schema_name schema_name = = 'sakila' 'sakila' and and table_name table_name = = 'film' 'film' and and column_name column_name = = 'length' 'length'; ; +-------+-----------+------+ | value | cumulfreq | freq | +-------+-----------+------+ | 46 | 0.5% | 0.5% | | 47 | 1.2% | 0.7% | | 48 | 2.3% | 1.1% | | 49 | 2.8% | 0.5% | | 50 | 3.7% | 0.9% | | 51 | 4.4% | 0.7% | | 52 | 5.1% | 0.7% | | 53 | 6% | 0.9% | | 54 | 6.6% | 0.6% | | 55 | 6.8% | 0.2% | | 56 | 7.3% | 0.5% | ... | 185 | 100% | 1% | +-------+-----------+------+ 140 rows in set (0.0040 sec) Optimizer Histograms (7) Copyright @ 2023 Oracle and/or its affiliates. 115
  138. Two types of Histograms Equi-height: One bucket represents a range

    of values. This type of histograms will be created when distinct values in the column are greater than the number of buckets speci ed in the ANALYZE TABLE syntax. Think A-G H-L M-T U-Z. Singleton: One bucket represents one single value in the column, it is the most accurate and will be created when the number of distinct values in the column is less than or equal to the number of buckets speci ed in the ANALYZE TABLE statement. Copyright @ 2023 Oracle and/or its affiliates. 116
  139. Optimizer Histograms Syntax ANALYZE ANALYZE TABLE TABLE t t UPDATE

    UPDATE HISTOGRAM HISTOGRAM ON ON c1 c1, , c2 c2, , c3 c3 WITH WITH 10 10 BUCKETS BUCKETS; ; ANALYZE ANALYZE TABLE TABLE t t UPDATE UPDATE HISTOGRAM HISTOGRAM ON ON c1 c1, , c3 c3 WITH WITH 10 10 BUCKETS BUCKETS; ; ANALYZE ANALYZE TABLE TABLE t t DROP DROP HISTOGRAM HISTOGRAM ON ON c2 c2; ; Note that the rst statement creates three di erent histograms on c1, c2 and c3 as an histogram is created per columns Copyright @ 2023 Oracle and/or its affiliates. 117
  140. Optimizer Histograms Syntax ANALYZE ANALYZE TABLE TABLE t t UPDATE

    UPDATE HISTOGRAM HISTOGRAM ON ON c1 c1, , c2 c2, , c3 c3 WITH WITH 10 10 BUCKETS BUCKETS; ; ANALYZE ANALYZE TABLE TABLE t t UPDATE UPDATE HISTOGRAM HISTOGRAM ON ON c1 c1, , c3 c3 WITH WITH 10 10 BUCKETS BUCKETS; ; ANALYZE ANALYZE TABLE TABLE t t DROP DROP HISTOGRAM HISTOGRAM ON ON c2 c2; ; Note that the rst statement creates three di erent histograms on c1, c2 and c3 as an histogram is created per columns Histograms can be created for almost any data type. If a type is not supported you will get: The The column column 'doc' 'doc' has an unsupported has an unsupported data data type type. . Copyright @ 2023 Oracle and/or its affiliates. 117
  141. Optimizer Histograms Syntax (2) I've created histograms for some columns

    in di erent tables in the world database. Information_Schema can be used to retrieve the info related to these histograms: select select table_name table_name, , column_name column_name, , histogram histogram- ->> >>'$."data-type"' '$."data-type"' AS AS 'data-type' 'data-type', , json_length json_length( (histogram histogram- ->> >>'$."buckets"' '$."buckets"') ) AS AS 'bucket-count' 'bucket-count' from from information_schema information_schema. .column_statistics column_statistics; ; + +------------+-------------+-----------+--------------+ ------------+-------------+-----------+--------------+ | | TABLE_NAME TABLE_NAME | | COLUMN_NAME COLUMN_NAME | | data data- -type type | | bucket bucket- -count count | | + +------------+-------------+-----------+--------------+ ------------+-------------+-----------+--------------+ | | country country | | Population Population | | int int | | 226 226 | | | | city city | | Population Population | | int int | | 1024 1024 | | | | countrylan countrylan | | Language Language | | string string | | 457 457 | | + +------------+-------------+-----------+--------------+ ------------+-------------+-----------+--------------+ Copyright @ 2023 Oracle and/or its affiliates. 118
  142. Where Histograms Shine create create table table h1 h1 (

    (id id int int unsigned unsigned auto_increment auto_increment, , x x int int unsigned unsigned, , primary primary key key( (id id) )) ); ; insert insert into into h1 h1 ( (x x) ) values values ( (1 1) ), ,( (1 1) ), ,( (2 2) ), ,( (2 2) ), ,( (2 2) ), ,( (3 3) ), ,( (3 3) ), ,( (3 3) ), ,( (3 3) ); ; select select x x, , count count( (x x) ) from from h1 h1 group group by by x x; ; + +---+----------+ ---+----------+ | | x x | | count count( (x x) ) | | + +---+----------+ ---+----------+ | | 1 1 | | 2 2 | | | | 2 2 | | 3 3 | | | | 3 3 | | 4 4 | | + +---+----------+ ---+----------+ 3 3 rows rows in in set set ( (0.0234 0.0234 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. 119
  143. explain explain select select * * from from h1 h1

    where where x x > > 0 0\G \G * ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * id: id: 1 1 select_type: select_type: SIMPLE SIMPLE table table: h1 : h1 partitions: partitions: NULL NULL type type: : ALL ALL possible_keys: possible_keys: NULL NULL key key: : NULL NULL key_len: key_len: NULL NULL ref: ref: NULL NULL rows rows: : 9 9 filtered: filtered: 33.33 33.33 Extra: Extra: Using Using where where The ltered column indicates an estimated percentage of table rows that will be ltered by the table condition. The maximum value is 100, which means no ltering of rows occurred. Values decreasing from 100 indicate increasing amounts of ltering. rows shows the estimated number of rows examined and rows × ltered shows the number of rows the opimizer plans to retrieve. In this example 9 rows x 33% = 3 rows The Optimizer estimates about 1/3 of the data will go match the 'X > 0' condition Where Histograms Shine (2) Copyright @ 2023 Oracle and/or its affiliates. 120
  144. Where Histograms Shine (3) analyze analyze table table h1 h1

    update update histogram histogram on on x x with with 3 3 buckets\G buckets\G * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * Table Table: test : test. .h1 h1 Op: histogram Op: histogram Msg_type: Msg_type: status status Msg_text: Histogram Msg_text: Histogram statistics statistics created created for for column column 'x' 'x'. . 1 1 row row in in set set ( (0.0287 0.0287 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. 121
  145. explain explain select select * * from from h1 h1

    where where x x > > 0 0\G \G * ** ** ** ** ** ** ** * 1. 1. row row * ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** * id: id: 1 1 select_type: select_type: SIMPLE SIMPLE table table: h1 : h1 partitions: partitions: NULL NULL type type: : ALL ALL possible_keys: possible_keys: NULL NULL key key: : NULL NULL key_len: key_len: NULL NULL ref: ref: NULL NULL rows rows: : 9 9 filtered: filtered: 100 100 Extra: Extra: Using Using where where all rows ! Where Histograms Shine (4) Copyright @ 2023 Oracle and/or its affiliates. 122
  146. Without Histogram: id: id: 1 1 select_type: select_type: SIMPLE SIMPLE

    table table: h1 : h1 partitions: partitions: NULL NULL type type: : ALL ALL possible_keys: possible_keys: NULL NULL key key: : NULL NULL key_len: key_len: NULL NULL ref: ref: NULL NULL rows rows: : 9 9 filtered: filtered: 33.33 33.33 Extra: Extra: Using Using where where With Histogram: id: id: 1 1 select_type: select_type: SIMPLE SIMPLE table table: h1 : h1 partitions: partitions: NULL NULL type type: : ALL ALL possible_keys: possible_keys: NULL NULL key key: : NULL NULL key_len: key_len: NULL NULL ref: ref: NULL NULL rows rows: : 9 9 filtered: filtered: 44.44 44.44 Extra: Extra: Using Using where where Where Histograms Shine (5) Let's change the condition to x > 2: Copyright @ 2023 Oracle and/or its affiliates. 123
  147. Where Histograms Shine (6) select select round round( (9 9*

    *0.3333 0.3333) ) 'without histogram' 'without histogram', , round round( (9 9* *0.4444 0.4444) ) 'with histogram' 'with histogram'; ; + +-------------------+----------------+ -------------------+----------------+ | | without histogram without histogram | | with with histogram histogram | | + +-------------------+----------------+ -------------------+----------------+ | | 3 3 | | 4 4 | | + +-------------------+----------------+ -------------------+----------------+ select select count count( (* *) ) from from h1 h1 where where x x > > 2 2; ; + +----------+ ----------+ | | count count( (* *) ) | | + +----------+ ----------+ | | 4 4 | | + +----------+ ----------+ Copyright @ 2023 Oracle and/or its affiliates. 124
  148. Histogram vs Index Why you might consider a histogram instead

    of an index: Maintaining an index has a cost. If you have an index, every INSERT/UPDATE/DELETE causes the index to be updated. This is not free, and will have an impact on your performance. A histogram on the other hand is created once and never updated unless you explicitly ask for it. It will thus not hurt your write performance. If you have an index, the optimizer will need to use it to do what we call "index dives" to estimate the number of records in a given range. This might become too costly if you have for instance very long IN-lists in your query. Histogram statistics are much cheaper in this case, and might thus be more suitable. Copyright @ 2023 Oracle and/or its affiliates. 125
  149. MySQL HeatWave I need more performance, much more ! Copyright

    @ 2023 Oracle and/or its affiliates. 126
  150. MySQL HeatWave MySQL is also available in Oracle Cloud Infrastructure

    (OCI) as a managed service. HeatWave is a massively, high performance, in-memory query accelerator for OCI MySQL Database Service that accelerates MySQL performance by orders of magnitude for analytics and mixed workload. HeatWave can be enabled on demand. Copyright @ 2023 Oracle and/or its affiliates. 127
  151. When queries are still too slow Some times, the data

    is to heavy and the indexes are not manageable or don't t in memory... in that case it's very complicate to perform query optimization. This is especially true for Analytics queries. Copyright @ 2023 Oracle and/or its affiliates. 128
  152. HeatWave Example - data from PiDay SQL SQL > >

    select select * * from from ( ( select select date date( (time_stamp time_stamp) ) as as ` `day day` `, , device_id device_id, , count count( (* *) ) as as ` `tot tot` `, , max max( (value value) ) as as ` `max hum max hum` `, , min min( (value value) ) as as ` `min hum min hum` `, , avg avg( (value value) ) as as ` `avg hum avg hum` ` from from humidity_history humidity_history group group by by device_id device_id, , day day) ) a a natural natural join join ( ( select select date date( (time_stamp time_stamp) ) as as ` `day day` `, , device_id device_id, , count count( (* *) ) as as ` `tot tot` `, , max max( (value value) ) as as ` `max temp max temp` `, , min min( (value value) ) as as ` `min temp min temp` `, , avg avg( (value value) ) as as ` `avg temp avg temp` ` from from temperature_history temperature_history group group by by device_id device_id, , day day) ) b b order order by by day day, , device_id device_id; ; + +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ | | day day | | device_id device_id | | tot tot | | max hum max hum | | min hum min hum | | avg hum avg hum | | max max temp temp | | min min temp temp | | avg avg temp temp | | + +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ | | 2022 2022- -03 03- -09 09 | | 00006227543 00006227543c0000000000000002 c0000000000000002 | | 14534 14534 | | 65.00 65.00 | | 55.00 55.00 | | 60.009273 60.009273 | | 29.99 29.99 | | 20.00 20.00 | | 22.597118 22.597118 | | | | 2022 2022- -03 03- -09 09 | | 00006227543 00006227543c0000000000000003 c0000000000000003 | | 31605 31605 | | 800.21 800.21 | | 1.00 1.00 | | 8.570861 8.570861 | | 814.36 814.36 | | 0.00 0.00 | | 5.079733 5.079733 | | | | 2022 2022- -03 03- -09 09 | | 00006227543 00006227543c0000000000000004 c0000000000000004 | | 31284 31284 | | 279.32 279.32 | | 30.00 30.00 | | 35.294440 35.294440 | | 288.44 288.44 | | 10.00 10.00 | | 12.797445 12.797445 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000001 c0000000000000001 | | 114906 114906 | | 50.00 50.00 | | 40.00 40.00 | | 45.001613 45.001613 | | 14.00 14.00 | | 9.00 9.00 | | 11.499796 11.499796 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000002 c0000000000000002 | | 100913 100913 | | 65.00 65.00 | | 55.00 55.00 | | 59.999105 59.999105 | | 25.00 25.00 | | 20.00 20.00 | | 22.501319 22.501319 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000003 c0000000000000003 | | 101465 101465 | | 11.00 11.00 | | 1.00 1.00 | | 5.998472 5.998472 | | 5.00 5.00 | | 0.00 0.00 | | 2.501763 2.501763 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000004 c0000000000000004 | | 101044 101044 | | 40.00 40.00 | | 30.00 30.00 | | 34.991012 34.991012 | | 15.00 15.00 | | 10.00 10.00 | | 12.496505 12.496505 | | + +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ 7 7 rows rows in in set set ( (1.2717 1.2717 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. 129
  153. HeatWave Example - data from PiDay Same Query after having

    enabled and loaded data to HeatWave SQL SQL > > select select * * from from ( ( select select date date( (time_stamp time_stamp) ) as as ` `day day` `, , device_id device_id, , count count( (* *) ) as as ` `tot tot` `, , max max( (value value) ) as as ` `max hum max hum` `, , min min( (value value) ) as as ` `min hum min hum` `, , avg avg( (value value) ) as as ` `avg hum avg hum` ` from from humidity_history humidity_history group group by by device_id device_id, , day day) ) a a natural natural join join ( ( select select date date( (time_stamp time_stamp) ) as as ` `day day` `, , device_id device_id, , count count( (* *) ) as as ` `tot tot` `, , max max( (value value) ) as as ` `max temp max temp` `, , min min( (value value) ) as as ` `min temp min temp` `, , avg avg( (value value) ) as as ` `avg temp avg temp` ` from from temperature_history temperature_history group group by by device_id device_id, , day day) ) b b order order by by day day, , device_id device_id; ; + +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ | | day day | | device_id device_id | | tot tot | | max hum max hum | | min hum min hum | | avg hum avg hum | | max max temp temp | | min min temp temp | | avg avg temp temp | | + +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ | | 2022 2022- -03 03- -09 09 | | 00006227543 00006227543c0000000000000002 c0000000000000002 | | 14534 14534 | | 65.00 65.00 | | 55.00 55.00 | | 60.009272 60.009272 | | 29.99 29.99 | | 20.00 20.00 | | 22.597117 22.597117 | | | | 2022 2022- -03 03- -09 09 | | 00006227543 00006227543c0000000000000003 c0000000000000003 | | 31605 31605 | | 800.21 800.21 | | 1.00 1.00 | | 8.570860 8.570860 | | 814.36 814.36 | | 0.00 0.00 | | 5.079732 5.079732 | | | | 2022 2022- -03 03- -09 09 | | 00006227543 00006227543c0000000000000004 c0000000000000004 | | 31284 31284 | | 279.32 279.32 | | 30.00 30.00 | | 35.294440 35.294440 | | 288.44 288.44 | | 10.00 10.00 | | 12.797445 12.797445 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000001 c0000000000000001 | | 115609 115609 | | 50.00 50.00 | | 40.00 40.00 | | 45.001736 45.001736 | | 14.00 14.00 | | 9.00 9.00 | | 11.499157 11.499157 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000002 c0000000000000002 | | 100913 100913 | | 65.00 65.00 | | 55.00 55.00 | | 59.999104 59.999104 | | 25.00 25.00 | | 20.00 20.00 | | 22.501318 22.501318 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000003 c0000000000000003 | | 101465 101465 | | 11.00 11.00 | | 1.00 1.00 | | 5.998472 5.998472 | | 5.00 5.00 | | 0.00 0.00 | | 2.501762 2.501762 | | | | 2022 2022- -03 03- -10 10 | | 00006227543 00006227543c0000000000000004 c0000000000000004 | | 101044 101044 | | 40.00 40.00 | | 30.00 30.00 | | 34.991011 34.991011 | | 15.00 15.00 | | 10.00 10.00 | | 12.496504 12.496504 | | + +------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ ------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+ 7 7 rows rows in in set set ( (0.1267 0.1267 sec sec) ) Copyright @ 2023 Oracle and/or its affiliates. 130
  154. HeatWave Example - data from PiDay 0.1267 sec VS 1.2717

    sec 10x faster... but that is only 1 day of data... Now lets increase the data to 14 days: +11G of data: + +----------+----------+------------+ ----------+----------+------------+ | | DATA DATA | | INDEXES INDEXES | | TOTAL SIZE TOTAL SIZE | | + +----------+----------+------------+ ----------+----------+------------+ | | 8.66 8.66 GiB GiB | | 2.93 2.93 GiB GiB | | 11.59 11.59 GiB GiB | | + +----------+----------+------------+ ----------+----------+------------+ Copyright @ 2023 Oracle and/or its affiliates. 131
  155. Without HeatWave: 44 44 rows rows in in set set

    ( (10 10 min min 14.1022 14.1022 sec sec) ) With HeatWave: 45 45 rows rows in in set set ( (1.6051 1.6051 sec sec) ) HeatWave Example - data from PiDay 14 days of data (11GB) 69M ROWS 383x faster ! Copyright @ 2023 Oracle and/or its affiliates. 132
  156. HeatWave - loading data alter alter table table temperature_history secondary_engine

    temperature_history secondary_engine= =rapid rapid; ; Query OK Query OK, , 0 0 rows rows affected affected ( (0.0257 0.0257 sec sec) ) alter alter table table temperature_history secondary_load temperature_history secondary_load; ; Query OK Query OK, , 0 0 rows rows affected affected ( (17.3070 17.3070 sec sec) ) 4.6GB of data for this table (whitout indexes) loaded to HeatWave Copyright @ 2023 Oracle and/or its affiliates. 133
  157. MySQL Workload Read or Write Intensive ? You should know

    ! And many people just have an idea (and usually wrong...). Copyright @ 2023 Oracle and/or its affiliates. 136
  158. MySQL Shell with plugins can help ! MySQL Workload Read

    or Write Intensive ? You should know ! And many people just have an idea (and usually wrong...). Copyright @ 2023 Oracle and/or its affiliates. 136
  159. MySQL Workload Changes What can cause a (one-time) performance drop

    ? Copyright @ 2023 Oracle and/or its affiliates. 138
  160. MySQL Workload Changes What can cause a (one-time) performance drop

    ? Backup (physical, logical), Clone Filesystem snapshot External process consuming resources (IOPS, CPU...) Maintenance operation (ALTER, Changes in partitions, Archiving data..) Hardware problem or maintenance Di erent statistics, di erent Query Execution Plan Copyright @ 2023 Oracle and/or its affiliates. 138
  161. Save the QEP for your queries I also recommend to

    save the Query Execution Plan of some of your queries in a table to compare them: Copyright @ 2023 Oracle and/or its affiliates. 139
  162. Save the QEP for your queries (2) Copyright @ 2023

    Oracle and/or its affiliates. 140
  163. MySQL Shell is now included in Visual Studio Code: Copyright

    @ 2023 Oracle and/or its affiliates. 144
  164. Share your ❤ to MySQL #mysql #MySQLCommunity Join our slack

    channel! bit.ly/mysql-slack Copyright @ 2023 Oracle and/or its affiliates. 146