I Use This!
High Activity

News

Analyzed 1 day ago. based on code collected 1 day ago.
Posted over 2 years ago by Frederic Descamps
This post is the fourth post of a series of articles on extending MySQL with the Component Infrastructure, the list above will be updated as new articles are published: Extending MySQL using the Component Infrastructure – part 1Extending MySQL ... [More] using the Component Infrastructure – part 2: building the serverExtending MySQL using the Component Infrastructure – part 3: component servicesExtending MySQL using the Component Infrastructure – part 4: error loggingExtending MySQL using the Component Infrastructure – part 5: privilegesExtending MySQL using the Component Infrastructure – part 6: functionsIn this article we will add to our component the capability to add messages in error log (and performance_schema.error_log table) using the log builtins component service. You will notice that this very easy and fast. We start as usual with scan.h: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters /* Copyright (c) 2017, 2022, Oracle and/or its affiliates. All rights reserved. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2.0, as published by the Free Software Foundation. This program is also distributed with certain software (including but not limited to OpenSSL) that is licensed under separate terms, as designated in a particular file or component or in included license documentation. The authors of MySQL hereby grant you an additional permission to link the program and your derivative works with the separately licensed software that they have included with MySQL. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License, version 2.0, for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ #define LOG_COMPONENT_TAG "viruscan" #include #include /* LogComponentErr */ #include /* Errors */ extern REQUIRES_SERVICE_PLACEHOLDER(log_builtins); extern REQUIRES_SERVICE_PLACEHOLDER(log_builtins_string); extern SERVICE_TYPE(log_builtins) * log_bi; extern SERVICE_TYPE(log_builtins_string) * log_bs; view raw scan.h hosted with ❤ by GitHub On line 22 and 23 we add the headers. mysqld_error.h is required to define ER_LOG_PRINTF_MSG. Then we specify the services that we need. We can then update scan.cc to also include those services and use them on 42 and 54: This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters Show hidden characters /* Copyright (c) 2017, 2022, Oracle and/or its affiliates. All rights reserved. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2.0, as published by the Free Software Foundation. This program is also distributed with certain software (including but not limited to OpenSSL) that is licensed under separate terms, as designated in a particular file or component or in included license documentation. The authors of MySQL hereby grant you an additional permission to link the program and your derivative works with the separately licensed software that they have included with MySQL. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License, version 2.0, for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ #define LOG_COMPONENT_TAG "viruscan" #define NO_SIGNATURE_CHANGE 0 #define SIGNATURE_CHANGE 1 #include REQUIRES_SERVICE_PLACEHOLDER(log_builtins); REQUIRES_SERVICE_PLACEHOLDER(log_builtins_string); SERVICE_TYPE(log_builtins) * log_bi; SERVICE_TYPE(log_builtins_string) * log_bs; static mysql_service_status_t viruscan_service_init() { mysql_service_status_t result = 0; log_bi = mysql_service_log_builtins; log_bs = mysql_service_log_builtins_string; LogComponentErr(INFORMATION_LEVEL, ER_LOG_PRINTF_MSG, "initializing…"); return result; } static mysql_service_status_t viruscan_service_deinit() { mysql_service_status_t result = 0; log_bi = mysql_service_log_builtins; log_bs = mysql_service_log_builtins_string; LogComponentErr(INFORMATION_LEVEL, ER_LOG_PRINTF_MSG, "uninstalled."); return result; } BEGIN_COMPONENT_PROVIDES(viruscan_service) END_COMPONENT_PROVIDES(); BEGIN_COMPONENT_REQUIRES(viruscan_service) REQUIRES_SERVICE(log_builtins), REQUIRES_SERVICE(log_builtins_string), END_COMPONENT_REQUIRES(); /* A list of metadata to describe the Component. */ BEGIN_COMPONENT_METADATA(viruscan_service) METADATA("mysql.author", "Oracle Corporation"), METADATA("mysql.license", "GPL"), METADATA("mysql.dev", "lefred"), END_COMPONENT_METADATA(); /* Declaration of the Component. */ DECLARE_COMPONENT(viruscan_service, "mysql:viruscan_service") viruscan_service_init, viruscan_service_deinit END_DECLARE_COMPONENT(); /* Defines list of Components contained in this library. Note that for now we assume that library will have exactly one Component. */ DECLARE_LIBRARY_COMPONENTS &COMPONENT_REF(viruscan_service) END_DECLARE_LIBRARY_COMPONENTS view raw scan.cc hosted with ❤ by GitHub We also need to tell that our component requires those services on line 63 and 64. We can compile it again (using make component_viruscan), restart mtr and test it again: And in error log file, we can also see the following lines: 2022-01-06T09:26:14.450784Z 8 [Note] [MY-011071] [Server] Component viruscan reported: 'initializing...' 2022-01-06T09:26:21.906295Z 8 [Note] [MY-011071] [Server] Component viruscan reported: 'uninstalled.' On the next article, we will create our new privilege, stay tuned and enjoy MySQL! [Less]
Posted over 2 years ago by Vitess
What is Schema Tracking? # In a distributed relational database system, like Vitess, a central component is responsible for serving queries across multiple shards. For Vitess, it is VTGate. One of the challenges this component faces is being aware of ... [More] the underlying SQL schema being used. This awareness facilitates query planning. Table schemas are stored in MySQL’s information_schema, meaning that they are located in a VTTablet’s MySQL instance and not in VTGate. [Less]
Posted over 2 years ago by Oracle MySQL Group
MySQL Query Optimization is usually simple engineering. But seeking information about how to tune queries is treated on many web reference sites like some Harry Potter-ish spell casting. There are simple tips you need to be aware of to get the best of your queries...
Posted over 2 years ago by Oracle MySQL Group
Performance issues are often stubborn things to cure. Suddenly a query or operation seems slow, or slower-ish. Sadly, the database is often pointed at as the culprit by developers who are either positive their code cannot be at fault or those who see databases as a mysterious box that does odd things ...
Posted over 2 years ago by Kristian Köhntopp
On Twitter, Jan Wildeboer linked an article by Adam Hooper on MySQL and the weird utf8mb4 character set. The recommendation is correct: In MySQL, use utf8mb4 when you mean to work with utf8 in your programming language. The background and ... [More] reasoning why this is is wrong and way more complicated. So let’s walk through this: MySQL utf8 means Unicode 1.0 BMP UTF8 in MySQL encodes the Unicode BMP . The Unicode Basic Multilingual Plane, or BMP, is the original 65536 character plane of Unicode 1.0. It was thought to be enough for all scripts in the world. It wasn’t. Unicode was extended, as early as with the 2.0 release, to have more characters than that, in more code planes. There can now be up to 17 Unicode planes. There are various ways to encode codepoints of Unicode. Some of them are fixed with. For example, Windows uses internally a 16 bit character encoding, also from the time when people thought that Unicode 1.0 would solve the worlds writing problems. This is not only limited, but also wasteful: When writing western text, every other byte in Windows Unicode text is a null byte. In Unix, specifically originally in Plan 9 from AT&T, a variable length encoding for Unicode was developed. Basically, the number of consecutive high bits of the first byte determines how long the byte sequence is. So a thing starting with 0x8? denotes a single character sequence, a thing starting with 0xc? identifies a two character sequence and a 0xe? a three byte sequence. Four byte codepoints start with 0xf? MySQL also is from the time when people still believed in 65536 glyphs, and offers a number of encodings. You can list the interesting ones with SHOW CHARSET WHERE MAXLEN > 1. kris@localhost [kris]> show charset where maxlen > 1; +---------+---------------------------------+---------------------+--------+ | Charset | Description | Default collation | Maxlen | +---------+---------------------------------+---------------------+--------+ | big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 | | cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 | | eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 | | euckr | EUC-KR Korean | euckr_korean_ci | 2 | | gb18030 | China National Standard GB18030 | gb18030_chinese_ci | 4 | | gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 | | gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 | | sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 | | ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 | | ujis | EUC-JP Japanese | ujis_japanese_ci | 3 | | utf16 | UTF-16 Unicode | utf16_general_ci | 4 | | utf16le | UTF-16LE Unicode | utf16le_general_ci | 4 | | utf32 | UTF-32 Unicode | utf32_general_ci | 4 | | utf8 | UTF-8 Unicode | utf8_general_ci | 3 | | utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci | 4 | +---------+---------------------------------+---------------------+--------+ 15 rows in set (0.00 sec) Specifically, there are the utf16, utf16le and utf32 encodings. And we can check what they do. In my terminal, which incidentally uses utf8 to talk to the database, I can select hex("ö") as ch to check the representation of an o-Umlaut as bytes. I can also select hex(convert("ö" using ))) as ch to check what these bytes would be in other encodings. Let’s do that: kris@localhost [kris]> select hex("ö") as ch; +------+ | ch | +------+ | C3B6 | +------+ 1 row in set (0.00 sec) kris@localhost [kris]> select hex(convert("ö" using latin1)) as ch; +------+ | ch | +------+ | F6 | +------+ 1 row in set (0.00 sec) kris@localhost [kris]> select hex(convert("ö" using utf16)) as ch; +------+ | ch | +------+ | 00F6 | +------+ 1 row in set (0.00 sec) kris@localhost [kris]> select hex(convert("ö" using utf16le)) as ch; +------+ | ch | +------+ | F600 | +------+ 1 row in set (0.00 sec) kris@localhost [kris]> select hex(convert("ö" using utf32)) as ch; +----------+ | ch | +----------+ | 000000F6 | +----------+ 1 row in set (0.00 sec) So we learn: The UTF8 o-Umlaut is 0xc3b6. In the fixed 8-bit charset latin1 that becomes 0xf6. In UTF16, a superset of that, it becomes 0x00f6. Unsurprisingly, in UTF16LE that is now the same, backwards, 0xf600. And in the staggering wasteful UTF32 we actually get three null bytes per glyph when storing western script, 0x000000f6. In fact, to encode all possible 17 unicode planes we require only 21 bits, so UTF32 is hopelessly wasteful even for non-western scripts. That is at least one null byte for each glyph in any script. Later Unicode revisions add more planes But: The singular Basic Multilingual Plane was not enough for all of earths scripts, so it needed to be extended. So we get characters that no longer fit a Windows wchar, and also not the 3-byte utf8 encoding. MySQL extends the character representation to 4 bytes, which covers all possible 17 Unicode planes, and names that character set utf8mb4. Case closed. When working with modern Unicode in MySQL, always define the character set as utf8mb4. It is that simple. But why is the 4-byte utf8 named differently? So why did MySQL name the 4-byte utf8 weirdly utf8mb4 and did not update the definition of utf8 to 4 bytes? That is, because in databases you cannot ever change the definition of a character set or character set sort order (“collation”) without great pains. You can easily add more, but you can never change what you have delivered to customers, ever. Indexes physically materialize columns in sort order for fast lookups So, databases have this thing called indexes. Suppose you have a table with a character column (a column of any datatype that has a charset and a collation), it may be a char, varchar or any of the four text types. And you define an index on it: kris@localhost [kris]> create table kris ( id serial, c varchar(20), index(c), d integer); Query OK, 0 rows affected (0.28 sec) kris@localhost [kris]> show create table kris\G Table: kris Table: kris Create Table: CREATE TABLE `kris` ( `id` bigint unsigned NOT NULL AUTO_INCREMENT, `c` varchar(20) DEFAULT NULL, `d` int DEFAULT NULL, UNIQUE KEY `id` (`id`), KEY `c` (`c`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 1 row in set (0.00 sec) What happens here? We store rows in the table kris by inserting data. The database assigns a primary key id, and stores the tuple (id, c, d) in the table. It also needs to maintain the index on c, so it will take the c-value of each row and store it in the index in sorted order with a pointer to the full row. In InnoDB, the row pointer used by secondary indexes is actually the primary, id, so we do get the index c as tuples of (c, id). That means each secondary has the primary key as a tail, so the primary key better be short. In our case, it is a BIGINT, so 8 bytes are added as a row pointer to the index, for each row. Notice the fat in sorted order? The index c is sorted, in the order defined by the collation for the character set. That is going to be critically important. So let’s have a look at character ordering. Collations are comparison rules for characters Our collation is given as utf8mb4_0900_ai_ci, and that are the Unicode Collation Algorithm (UCA) 9.0 sorting rules for 4-byte UTF8, with accent insensitivity (ai)and case insensitivity (ci`). We need to learn even more: Collations are rules for handling comparisons between glyphs in a specific encoding. There are encodings for latin1, other ones specific to utf8 and again different ones for utf8mb4. Collations define sort orders and equality rules. ci means the sort order is case insensitive, so “Köhntopp”, “KÖHNTOPP” and “köhntopp” sort the same. With cs they would not. ai means accent insentitive, so WHERE c = "Kohntopp" matches “Köhntopp”. With as it would not. We can show the collations for a given character set. There are over 250 of them, and 75 alone for utf8mb4. Let’s check just the utf8mb4%_ai_ci ones. kris@localhost [kris]> show collation where collation like "utf8mb4%_ai_ci"; +----------------------------+---------+-----+---------+----------+---------+---------------+ | Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute | +----------------------------+---------+-----+---------+----------+---------+---------------+ | utf8mb4_0900_ai_ci | utf8mb4 | 255 | Yes | Yes | 0 | NO PAD | | utf8mb4_cs_0900_ai_ci | utf8mb4 | 266 | | Yes | 0 | NO PAD | ... | utf8mb4_vi_0900_ai_ci | utf8mb4 | 277 | | Yes | 0 | NO PAD | +----------------------------+---------+-----+---------+----------+---------+---------------+ 22 rows in set (0.00 sec) So many nation specific sort orders. And if you define an index on a character column, it has a collation attached and the data will be physically sorted in that nations sort order. These things are in fact even versioned: If there is an UCA 9.0, are there other, earlier ones? Sure enough: kris@localhost [information_schema]> select * from collations where collation_name like "%general_mysql%"; +--------------------------+--------------------+-----+------------+-------------+---------+---------------+ | COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN | PAD_ATTRIBUTE | +--------------------------+--------------------+-----+------------+-------------+---------+---------------+ | ucs2_general_mysql500_ci | ucs2 | 159 | | Yes | 1 | PAD SPACE | | utf8_general_mysql500_ci | utf8 | 223 | | Yes | 1 | PAD SPACE | +--------------------------+--------------------+-----+------------+-------------+---------+---------------+ 2 rows in set (0.00 sec) The fine manual notes, at the bottom of Unicode Character Sets : Miscellaneous Information The xxx_general_mysql500_ci collations preserve the pre-5.1.24 ordering of the original xxx_general_ci collations and permit upgrades for tables created before MySQL 5.1.24 (Bug #27877). MySQL 5.1.24 was released on 2008-04-08, 14 years ago. We have buggy sort orders used back then maintained in current MySQL for the sake of upgradability. TL;DR, so far Databases use indexes to access data quickly. Indexes physically materialize a column (and row pointers) in sort order. Sort order for character sets character set dependent, and national language dependent. Sort order also got major upgrades during the evolution of Unicode: Not only did we move from “only BMP” to 17 planes, we also went through at least 9 revisions of the Unicode Collation Algorithm. So: If you change the collation (the sort order) of an index, the index needs to be rebuilt. Sort orders are a property of character sets. UTF8 sorted according to UCA 4.x and UCA 5.0, UTF8MB4 sorts according to UCA 9.0. Specifically, the concept of segregating sort order comparison from equality comparison, and introducing accent insensitivity is new in UCA 9.0 and not present in previous UCA versions. That means, while utf8 is a strict subset of utf8mb4 in terms of character repertoire and encoding (modulo bug fixes), it sorts differently. An ALTER TABLE kris MODIFY COLUMN c varchar(20) charset utf8 in our example table implies a drop on the index c, and it’s recreation with a different sort order, from scratch. Not fast on a large table. So what would happen if utf8 suddently meant utf8mb4? Let’s imagine a MySQL 9.0 comes out, and in that release utf8 suddenly means utf8mb4. If you greenfield a new installation, no problem. If you apply the update to an existing installation, all your character columns that are part of an index need to be checked, and if required, rebuilt. At the time when you apply the database update. Which can take a very long time, during which the database is not available. That can be quite inconvenient. What happens instead if we don’t do that? In that case, utf8 means utf8, nothing changes, and utf8mb4 is added and not used at all. Unless you do. So you could create new columns using the new character set and collations, even in existing tables. That is possible, because unlike in the other database, in MySQL a charset and collation is a column attribute. Tables, databases and servers just provide defaults that are inherited at the time of subordinate object creation. So if you wanted to upgrade your database, you’d install a new version. Then you would check all columns and note which one need changing. And then you would change them, table by table, at your own leisure. Possibly using tooling such as Percona Online Table Change (“OSC”), making the change in the background without service interruption. That is what MySQL does. It never changes collations any more (it did, in the dark, evil past, creating a lot of suffering). It is not even correcting bugs in existing collations any more (except by creating newly renamed collations). Instead you run your collation migrations as it fits into your operational schedule. For Unicode that means, 4-byte Unicode is utf8mb4 in MySQL, because the name utf8 was already taken by a previous version of the Unicode character set. Not using libc. MySQL does not use libc and the character set comparison functions in there for itself. Instead it brings its own collations. That is also an important lesson. It means that, unlike the other database, MySQL does not break when you upgrade the operating system libc on your machine. If MySQL used libc fort sorting and comparing, the database could break without any changes to the database whatsoever - an innocent security update to the operation systems libc that for some reason also touched libc sorting and comparison would break all character indexes in your database. MySQL does not do that. It lets you migrate these things in a way that is operationally convenient. It also means that MySQL versions sort the same between Linux, FreeBSD, Solaris and even Windows. Just like the on disk format is compatible across all platforms, so is character handling and time zone handling. MySQL is MySQL on any platform, and copying files and replication always work across these boundaries. Dude, that’s complicated Yes. Databases are weird, complicated and large. That is because they keep the state for your program, so that your program can be stateless and easy to manage. Guess where all that complexity and technical debt management went? Where I work there is a replication hierarchy where the database instances are 120 TB in size, with the largest table being 35 TB in size. Cloning a new instance at 400 MB/s takes around a week, plus replication catchup, so you wait around two weeks for a new instance, if nothing goes wrong. Moving this to utf8mb4 is a 6 month project, at least, and most of it is waiting. This database was created around 16 years ago. It has an unbroken chain of state changes (“inserts, updates and deletes”) from back then to the present. It was born as MySQL 4.x, at some point became a MariaDB and is now an Oracle MySQL 5.7. None of the code that was running when it was created still exists. But the bugs and faulty updates that code did, back then, still exist somewhere in the data, unless they have been found and corrected. Persistence systems - databases and their NoSQL brethren that actually manage to write data to disk properly - are stateful systems. The data they store usually outlives the code that created the records. Changes to this data are forever. Mistakes in handling the data or the updates are forever, too. DBA are a very special folk. They are very paranoid. They have tested restores of their backups, and their backups have backups. That is, because if they make a mistake and data is gone, it is gone forever, and no rollout to fix things is possible. Sorting stuff Not only can databases, tables and individual indexes be very large, they also need to be kept sorted, and sometimes data needs to be re-sorted. Databases ususally have a large number of strategies for that. Small things are sorted in memory. That is, the database creates the full rows of the result table in memory, using all columns. It then sorts these rows in memory by shuffling around the full rows. That uses up a lot of memory bandwidth, because we copy around full rows, which can be large. Sometimes we do not want to move around full rows. So we only sort the columns specified in the ORDER BY CLAUSE, which is a tiny subset of the full columns set of each row of the result table, and attach a row pointer to each of these sort rows. In the end we have a tiny in-memory table in sort order, which only holds columns from the ORDER BY, and each row in there points to the unsorted full rows somewhere else - in memory or on disk. We then output the full rows in sort order, but that means we generate a lot of seeks - possibly on disk. That is, because the full rows have not been sorted, so the row pointers from the sorted data to the full rows are a mess. Sometimes we have very many rows. So we do as before, but for a subset of all rows - as many as fit into our sort buffer. When the sort buffer is full, we write it out to disk. Then we continue with the next batch. Across all partial results we perform a merge sort. If the things sorted were full rows, we can emit full result rows rather quickly. If the things sorted were only ORDER BY columns or other partial result set rows with row pointers attached, we may have to perform one or multiple seeks in order to get the missing columns from all tables involved. The optimizer has to decide which strategy to use here, based on estimates of the row width, size of the result set (which has not yet been produced, so it’s a guess) and the available amount of resources. And then, implementation issues Older versions of MySQL have different storage engines: MYISAM, INNODB, MEMORY and others. Modern MySQL has only INNODB, but for temporary tables has a few tricks ready to make things faster (for example, temporary tables can be recreated and do not need to be crash safe, so no redo and undo logging is required). Other storage engines than INNODB had restrictions. For example, MEMORY tables could not have variable length columns, there was no VARCHAR, only CHAR. So when sorting with temporary tables in memory, MySQL created these tables as MEMORY tables, promoting every varchar(255) to CHAR(255), and so a row with a varchar value of hello suddenly used 255 bytes instead of 6 bytes to represent the string hello (plus a lot of padding). Of course, with utf8, a varchar(255) charset utf8 is promoted to a char(255) charset utf8, which instead allocates 765 bytes of memory, out of which the first five bytes contain the string hello, and the rest is wasteful padding. MEMORY tables had a configurable size limit, and if your table exceeded that (which happened rather quickly, given the above restrictions), it was converted to an on-disk temporary table in MYISAM format. That, of course had varchar(255) charset utf8 as a native type and your hello would again shrink to six bytes. Looking at the on disk temporary file, which was rather small, you might wander what caused the spill. MySQL 8 introduces unlogged INNODB temporary tables (and a failed experiment involving mmap that you may want to disable). So this problem is gone, once and for all: varchar is varchar at all times, and representing Unicode variants in temporary tables is no longer a problem. In MySQL 8. This is in fact one of the major reasons to graduate from 5.7 for many people. Sorting TLDR and Unicode Sorting Unicode columns was expensive in versions of MySQL before MySQL 8. That is because the memory engine used did not understand variable length data types at all. That led to all kinds of recommendations restricting the use of Unicode columns: Don’t use them, if you do not need them. Keep them short. Exclude BLOB and TEXT columns from sorting. And so on. MySQL 8 fixes all that. Use Unicode columns as you see fit. They work well, and as expected. BLOB and TEXT are no longer a problem with sorting (well, they are, but for other reasons, but they do not longer trigger an immediate spill, no matter how tiny they are). TLDR We learned: Your programming languages utf8 is called utf8mb4 in MySQL. utf8 and utf8mb4 sort differently due to changes in the Unicode Collation Algorithm (UCA). Indexes are physically materialized sort orders of column sets. They can become pretty large. MySQL never changes sort orders of character sets (collations), even if they are buggy. Instead a new collation with a new name is created. That is, because changing a sort order may require dropping and recreating an index, which is expensive if the index is large. Sorting things can be done in many different ways, depending on data set size, column size and other considerations. Implementation details can make sorting even more complicated. MySQL 8 is a worthwhile upgrade. Databases manage state so you don’t have to. If you think databases are complicated, consider what you would have to do if the database and your DBA would not be there for you. [Less]
Posted over 2 years ago by MyDBOPS
We have planned for archiving the data to improve the DB performance and to reclaim the space. We were evaluating Compression in InnoDB and TokuDB. To find out the best compression method. We started benchmarking the compression ratio between ... [More] InnoDB and TokuDB. Everything goes well for some time, but after a few hours got an error message that can’t able to insert the data into the TokuDB table due to storage being full. It is so strange and the host has enough free space. Table structure:- mysql> show create table mydbops.tokudb\G *************************** 1. row *************************** Table: tokudb Create Table: CREATE TABLE `tokudb` ( `ID` int DEFAULT NULL, `Name` longtext, `Image` blob ) ENGINE=TokuDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 1 row in set (2.18 sec) mysql> show create table mydbops.innodb\G *************************** 1. row *************************** Table: innodb Create Table: CREATE TABLE `innodb` ( `ID` int DEFAULT NULL, `Name` longtext, `Image` blob ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 1 row in set (2.27 sec) Server version root@centos17:/var/lib/mysql# mysql --version mysql Ver 8.0.25-15 for Linux on x86_64 (Percona Server (GPL), Release '15', Revision 'a558ec2') Error faced:- mysql> INSERT IGNORE INTO mydbops.tokudb (ID,Name,Image) VALUES (583532949, "aut maxime sint dolores quibusdam nam fuga eos hic.", "incidunt qui maxime consectetur nulla delectus."); ERROR 1030 (HY000): Got error 28 - 'No space left on device' from storage engine;- Available storage:- root@centos17:/var/lib/mysql# df -h / Filesystem Size Used Avail Use% Mounted on /dev/mapper/packer--debian--9--amd64--vg-root 78G 69G 4.4G 95% / Around 4.4 GB of free space is available. But the inserts are not failing for the table with the InnoDB engine. For InnoDB mysql> INSERT IGNORE INTO mydbops.innodb (ID,Name,Image) VALUES (583532949, "aut maxime sint dolores quibusdam nam fuga eos hic.", "incidunt qui maxime consectetur nulla delectus."); Query OK, 1 row affected (0.20 sec) After around analysis and discussions, it is found that a TokuDB internal variable is blocking the writes. The “tokudb_fs_reserve_percent” variable will block the writes (Inserts) to the TokuDB table when the server reaches the mentioned percentage of free space. The default value is 5. TokuDB will check the filesystem for every 5 sec in the background to determine how much percentage of free space is available, if the free space drops below the mentioned value, then the inserts will get blocked. This kind of behavior is set to avoid the server crash due to disk FULL. Even the disk got filled due to other engines. Then the TokuDB will freeze instead of crash. Until the free space is created/recovered. It is a static variable, so changing the value requires a MySQL server restart. mysql> show global variables like 'tokudb_fs_reserve_percent'; +---------------------------+-------+ | Variable_name | Value | +---------------------------+-------+ | tokudb_fs_reserve_percent | 5 | +---------------------------+-------+ 1 row in set (2.62 sec) Hope we have figured it out, so in the future, if we faced any disk issue reg the TokuDB engine we can have a look at both OS as well as its configuration as well. Happy troubleshooting. [Less]
Posted over 2 years ago by Corrado Pandiani
Working with hundreds of different customers I often face similar problems around running queries. One very common problem when trying to optimize a database environment is index usage. A query that cannot use an index is usually a long-running one ... [More] , consuming more memory or triggering more disk iops. A very common case is when a query uses a filter condition against a column that is involved in some kind of functional expression. An index on that column can not be used. Starting from MySQL 8.0.13 functional indexes are supported. In this article, I’m going to show what they are and how they work. The Well-Known Problem As already mentioned, a very common problem about index usage is when you have a filter condition against one or more columns involved in some kind of functional expression. Let’s see a simple example. You have a table called products containing the details of your products, including a create_time TIMESTAMP column. If you would like to calculate the average price of your products on a specific month you could do the following:mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10; +------------+ | AVG(price) | +------------+ | 202.982582 | +------------+The query returns the right value, but take a look at the EXPLAIN:mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G *************************** 1. row ***************************            id: 1   select_type: SIMPLE         table: products    partitions: NULL          type: ALL possible_keys: NULL           key: NULL       key_len: NULL           ref: NULL          rows: 99015      filtered: 100.00         Extra: Using where  The query triggers a full scan of the table. Let’s create an index on create_time and check again:mysql> ALTER TABLE products ADD INDEX(create_time); Query OK, 0 rows affected (0.71 sec) Records: 0  Duplicates: 0  Warnings: 0mysql> explain SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G *************************** 1. row ***************************            id: 1   select_type: SIMPLE         table: products    partitions: NULL          type: ALL possible_keys: NULL           key: NULL       key_len: NULL           ref: NULL          rows: 99015      filtered: 100.00         Extra: Using where  A full scan again. The index we have created is not effective. Indeed any time an indexed column is involved in a function the index can not be used. To optimize the query the workaround is rewriting it differently in order to isolate the indexed column from the function. Let’s test the following equivalent query:mysql> SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01'; +------------+ | AVG(price) | +------------+ | 202.982582 | +------------+mysql> EXPLAIN SELECT AVG(price) FROM products WHERE create_time BETWEEN '2019-10-01' AND '2019-11-01'\G *************************** 1. row ***************************            id: 1   select_type: SIMPLE         table: products    partitions: NULL          type: range possible_keys: create_time           key: create_time       key_len: 5           ref: NULL          rows: 182      filtered: 100.00         Extra: Using index condition  Cool, now the index is used. Then rewriting the query was the typical suggestion. Quite a simple solution, but not all the times it was possible to change the application code for many valid reasons. So, what to do then?   MySQL 8.0 Functional Indexes Starting from version 8.0.13, MySQL supports functional indexes. Instead of indexing a simple column, you can create the index on the result of any function applied to a column or multiple columns. Long story short, now you can do the following:mysql> ALTER TABLE products ADD INDEX((MONTH(create_time))); Query OK, 0 rows affected (0.74 sec) Records: 0  Duplicates: 0  Warnings: 0Be aware of the double parentheses. The syntax is correct since the expression must be enclosed within parentheses to distinguish it from columns or column prefixes. Indeed the following returns an error:mysql> ALTER TABLE products ADD INDEX(MONTH(create_time)); ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'create_time))' at line 1Let’s check now our original query and see what happens to the EXPLAINmysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10; +------------+ | AVG(price) | +------------+ | 202.982582 | +------------+ mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G *************************** 1. row ***************************            id: 1   select_type: SIMPLE         table: products    partitions: NULL          type: ref possible_keys: functional_index           key: functional_index       key_len: 5           ref: const          rows: 182      filtered: 100.00         Extra: NULL  The query is no longer a full scan and runs faster. The functional_index has been used, with only 182 rows examined. Awesome. Thanks to the functional index we are no longer forced to rewrite the query. Which Functional Indexes are Permitted We have seen an example involving a simple function applied to a column, but you are granted to create more complex indexes. A functional index may contain any kind of expressions, not only a single function. The following patterns are valid functional indexes: INDEX( ( col1 + col2 ) ) INDEX( ( FUNC(col1) + col2 – col3 ) ) You can use ASC or DESC as well: INDEX( ( MONTH(col1) ) DESC ) You can have multiple functional parts, each one included in parentheses: INDEX( ( col1 + col2 ), ( FUNC(col2) ) ) You can mix functional with nonfunctional parts: INDEX( (FUNC(col1)), col2, (col2 + col3), col4 ) There are also limitations you should be aware of: A functional key can not contain a single column. The following is not permitted: INDEX( (col1), (col2) ) The primary key can not include a functional key part The foreign key can not include a functional key part SPATIAL and FULLTEXT indexes can not include functional key parts A functional key part can not refer to a column prefix At last, remember that the functional index is useful only to optimize the query that uses the exact same expression. An index created with nonfunctional parts can be used instead to solve multiple different queries. For example, the following conditions can not rely on the functional index we have created: WHERE YEAR(create_time) = 2019 WHERE create_time > ‘2019-10-01’ WHERE create_time BETWEEN ‘2019-10-01’ AND ‘2019-11-01’ WHERE MONTH(create_time+INTERVAL 1 YEAR) All these will trigger a full scan. Functional Index Internal The functional indexes are implemented as hidden virtual generated columns. For this reason, you can emulate the same behavior even on MySQL 5.7 by explicitly creating the virtual column. We can test this, starting by dropping the indexes we have created so far.mysql> SHOW CREATE TABLE products\G *************************** 1. row ***************************        Table: products Create Table: CREATE TABLE `products` (   `id` int unsigned NOT NULL AUTO_INCREMENT,   `description` longtext,   `price` decimal(8,2) DEFAULT NULL,   `create_time` timestamp NULL DEFAULT NULL,   PRIMARY KEY (`id`),   KEY `create_time` (`create_time`),   KEY `functional_index` ((month(`create_time`))) ) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci mysql> ALTER TABLE products DROP INDEX `create_time`, DROP INDEX `functional_index`; Query OK, 0 rows affected (0.03 sec)We can try now to create the virtual generated column:mysql> ALTER TABLE products ADD COLUMN create_month TINYINT GENERATED ALWAYS AS (MONTH(create_time)) VIRTUAL; Query OK, 0 rows affected (0.04 sec)Create the index on the virtual column:mysql> ALTER TABLE products ADD INDEX(create_month); Query OK, 0 rows affected (0.55 sec) mysql> SHOW CREATE TABLE products\G *************************** 1. row ***************************        Table: products Create Table: CREATE TABLE `products` (   `id` int unsigned NOT NULL AUTO_INCREMENT,   `description` longtext,   `price` decimal(8,2) DEFAULT NULL,   `create_time` timestamp NULL DEFAULT NULL,   `create_month` tinyint GENERATED ALWAYS AS (month(`create_time`)) VIRTUAL,   PRIMARY KEY (`id`),   KEY `create_month` (`create_month`) ) ENGINE=InnoDB AUTO_INCREMENT=149960 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci  We can now try our original query. We expect to see the same behavior as the functional index.mysql> SELECT AVG(price) FROM products WHERE MONTH(create_time)=10; +------------+ | AVG(price) | +------------+ | 202.982582 | +------------+mysql> EXPLAIN SELECT AVG(price) FROM products WHERE MONTH(create_time)=10\G *************************** 1. row ***************************            id: 1   select_type: SIMPLE         table: products    partitions: NULL          type: ref possible_keys: create_month           key: create_month       key_len: 2           ref: const          rows: 182      filtered: 100.00         Extra: NULL  Indeed, the behavior is the same. The index on the virtual column can be used and the query is optimized. The good news is that you can use this workaround to emulate a functional index even on 5.7, getting the same benefits. The advantage of MySQL 8.0 is that it is completely transparent, no need to create the virtual column. Since the functional index is implemented as a hidden virtual column, there is no additional space needed for the data, only the index space will be added to the table. By the way, this is the same technique used for creating indexes on JSON documents’ fields. Conclusion The functional index support is an interesting improvement you can find in MySQL 8.0. Some of the queries that required rewriting to get optimized don’t require that anymore. Just remember that only the queries having the same filter pattern can rely on the functional index. Then you need to create additional indexes or other functional indexes to improve other search patterns. The same feature can be implemented on MySQL 5.7 with the explicit creation of a virtual generated column and the index. For more detailed information, read the following page: https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-functional-key-parts [Less]
Posted over 2 years ago by Joshua Otwell
It’s that time once again. I’ve published another issue of OpenLamp.tech, the newsletter for PHP/MySQL developers, over the weekend with all the curated content for your reading experience. There is plenty to learn for everyone so dig in… I ... [More] start my day every day with Refind. You should too. Self-Promotion: If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like! This week in the OpenLamp.tech newsletter, we have articles covering: WordPress theme migration checklist Form validation in CodeIgniter 4Hitchhiker’s Guide to SQL InjectionThe databases of WordPressPHP ternary blockAnd much more…Are you a Medium member? If so, receive an email notification each time I publish a blog post if you prefer the Medium platform. Not a member? No worries! Use my sign-up link (I will get a commission at no extra cost to you) and join. I really enjoy reading all the great content there and I know you will too!!! Instead of waiting on me to republish the OpenLamp.tech newsletter here each week, subscribe and have it delivered directly to your inbox each week: The Newsletter for PHP and MySQL Developers Without further ado, here is this week’s issue: OpenLamp.tech issue #8. Like what you have read? See anything incorrect? Please comment below and thank you for reading!!! A Call To Action! Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well. Visit the Portfolio-Projects page to see blog posts/technical writing I have completed for clients. To receive email notifications (Never Spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the ‘Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…) Be sure and visit the “Best Of” page for a collection of my best blog posts. Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters. Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided, are performed on a personal development/learning workstation environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own. How can I help you? Are you thinking of starting up a blog? I use WordPress for my blog. Let’s both save money on the plans offered. Grab a Gmail HTML Email Signature template from my Etsy shop and make your emails pop and stand out. Need hosting for your next web application or WordPress site? I use and highly recommend Hostinger. They have great pricing and service.I enjoy reading Refind: The essence of the web, every morning in your inbox. Subscribe for free. Help me get a premium subscription by signing up yourself with my referral link.Grab a free pack of mobile Creator wallpapers.Disclosure: Some of the services and products links in this post are affiliate links. At no additional cost to you, should you make a purchase by clicking through one of them, I will receive a commission. The post OpenLamp.tech issue #8 appeared first on Digital Owl's Prose. [Less]
Posted over 2 years ago by Joshua Otwell
It’s that time once again. I’ve published another issue of OpenLamp.tech, the newsletter for PHP/MySQL developers, over the weekend with all the curated content for your reading experience. There is plenty to learn for everyone so dig in… I ... [More] start my day every day with Refind. You should too. Self-Promotion: If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like! This week in the OpenLamp.tech newsletter, we have articles covering: WordPress theme migration checklist Form validation in CodeIgniter 4Hitchhiker’s Guide to SQL InjectionThe databases of WordPressPHP ternary blockAnd much more…Are you a Medium member? If so, receive an email notification each time I publish a blog post if you prefer the Medium platform. Not a member? No worries! Use my sign-up link (I will get a commission at no extra cost to you) and join. I really enjoy reading all the great content there and I know you will too!!! Instead of waiting on me to republish the OpenLamp.tech newsletter here each week, subscribe and have it delivered directly to your inbox each week: The Newsletter for PHP and MySQL Developers Without further ado, here is this week’s issue: OpenLamp.tech issue #8. Like what you have read? See anything incorrect? Please comment below and thank you for reading!!! A Call To Action! Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well. Visit the Portfolio-Projects page to see blog posts/technical writing I have completed for clients. To receive email notifications (Never Spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the ‘Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…) Be sure and visit the “Best Of” page for a collection of my best blog posts. Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters. Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided, are performed on a personal development/learning workstation environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own. How can I help you? Are you thinking of starting up a blog? I use WordPress for my blog. Let’s both save money on the plans offered. Grab a Gmail HTML Email Signature template from my Etsy shop and make your emails pop and stand out. Need hosting for your next web application or WordPress site? I use and highly recommend Hostinger. They have great pricing and service.I enjoy reading Refind: The essence of the web, every morning in your inbox. Subscribe for free. Help me get a premium subscription by signing up yourself with my referral link.Grab a free pack of mobile Creator wallpapers.Disclosure: Some of the services and products links in this post are affiliate links. At no additional cost to you, should you make a purchase by clicking through one of them, I will receive a commission. The post OpenLamp.tech issue #8 appeared first on Digital Owl's Prose. [Less]
Posted over 2 years ago by Joshua Otwell
It’s that time once again. I’ve published another issue of OpenLamp.tech, the newsletter for PHP/MySQL developers, over the weekend with all the curated content for your reading experience. There is plenty to learn for everyone so dig in… I ... [More] start my day every day with Refind. You should too. Self-Promotion: If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like! This week in the OpenLamp.tech newsletter, we have articles covering: WordPress theme migration checklist Form validation in CodeIgniter 4Hitchhiker’s Guide to SQL InjectionThe databases of WordPressPHP ternary blockAnd much more…Are you a Medium member? If so, receive an email notification each time I publish a blog post if you prefer the Medium platform. Not a member? No worries! Use my sign-up link (I will get a commission at no extra cost to you) and join. I really enjoy reading all the great content there and I know you will too!!! Instead of waiting on me to republish the OpenLamp.tech newsletter here each week, subscribe and have it delivered directly to your inbox each week: The Newsletter for PHP and MySQL Developers Without further ado, here is this week’s issue: OpenLamp.tech issue #8. Like what you have read? See anything incorrect? Please comment below and thank you for reading!!! A Call To Action! Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well. Visit the Portfolio-Projects page to see blog posts/technical writing I have completed for clients. To receive email notifications (Never Spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the ‘Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…) Be sure and visit the “Best Of” page for a collection of my best blog posts. Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters. Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided, are performed on a personal development/learning workstation environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own. How can I help you? Are you thinking of starting up a blog? I use WordPress for my blog. Let’s both save money on the plans offered. Grab a Gmail HTML Email Signature template from my Etsy shop and make your emails pop and stand out. Need hosting for your next web application or WordPress site? I use and highly recommend Hostinger. They have great pricing and service.I enjoy reading Refind: The essence of the web, every morning in your inbox. Subscribe for free. Help me get a premium subscription by signing up yourself with my referral link.Grab a free pack of mobile Creator wallpapers.Just getting started or wanting to learn MySQL? Find out about my premium blog posts and MySQL Beginner Series here.Disclosure: Some of the services and products links in this post are affiliate links. At no additional cost to you, should you make a purchase by clicking through one of them, I will receive a commission. The post OpenLamp.tech issue #8 appeared first on Digital Owl's Prose. [Less]