
News

Posted over 12 years ago
I’ve just arrived at ANU in Canberra for the Open Source Developers Conference 2011 (OSDC). I’ve spoken at several of the past OSDCs that have been held around Australia: 2005, 2007, 2008, 2010 and now 2011. It’s one of those conferences with great energy and great people that’s organised by dedicated members of the community who build the conference they want to go to. I’ll be giving two talks this year:

* Dropping ACID: Eating Data in a Web 2.0 Cloud World
* Drizzle 7: GA and Supported: Current and Future Features
Posted over 12 years ago
I tend to speak highly of the random query generator as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard to duplicate bugs. Of course, I would also like to think that the way we have packaged our randgen tests into unittest format for dbqp played some small part, but I might be mildly biased.

The best description of the randgen’s power comes courtesy of Andrew Hutchings: “fishing with dynamite”. This is a very apt metaphor for how the tool works. It can be quite effective for stressing a server and finding bugs, but it can also be quite messy, possibly even fatal if one is careless ; ) However, I am not writing this to share any horror stories, but glorious tales of bug hunting!

The randgen uses yacc-style grammar files that define a realm of possible queries (provided you did it right…the zen of grammar writing is a topic for another day). Doing this allows us to produce high volumes of queries that are hopefully interesting (see previous comment about grammar-writing-zen). It takes a certain amount of care to produce a grammar that is useful and interesting, but the gamble is that this effort will produce more interesting effects on the database than the hand-written queries that could be produced in similar time. This is especially useful when you aren’t quite sure where a problem is and are just trying to see what shakes out under a certain type of stress. Another win is that a well-crafted grammar can be used for a variety of scenarios. The transactional grammars that were originally written for testing Drizzle’s replication system have been reused many times (including for two of these bugs!)

This brings us to our first bug: mysql process crashes after setting innodb_dict_size. The basics of this were that the server was crashing under load when innodb_dict_size_limit was set to a smaller value. In order to simulate the situation, Stewart suggested we use a transactional load against a large number of tables. We were able to make this happen in 4 easy steps:

1) Create a test case module that we can execute. All of the randgen test cases are structured similarly, so all we had to do was copy an existing test case and tweak our server options and randgen command line as needed.

2) Make an altered copy of the general percona.zz gendata file. This file is used by the randgen to determine the number, composition, and population of any test tables we want to use, and to generate them for us. The original reporter indicated they had a fair number of tables:

    $tables = {
        rows => [1..50],
        partitions => [ undef ]
    };

The value in the ‘rows’ section tells the data generator to produce 50 tables, with sizes from 1 row to 50 rows.

3) Specify the server options. We wanted the server to hit similar limits as the original bug reporter, but we were working on a smaller scale. To make this happen, we set the following options in the test case:

    server_requirements = [["--innodb-dict-size-limit=200k --table-open-cache=10"]]

Granted, these are insanely small values, but this is a test and we’re trying to do horrible things to the server ; )

4) Set up our test_* method in our testcase class. This is all we need to specify in our test case:

    def test_bug758788(self):
        test_cmd = ("./gentest.pl "
                    "--gendata=conf/percona/innodb_dict_size_limit.zz "
                    "--grammar=conf/percona/translog_concurrent1.yy "
                    "--queries=1000 "
                    "--threads=1")
        retcode, output = execute_randgen(test_cmd, test_executor, servers)
        self.assertTrue(retcode == 0, output)

The test simply ensures that the server remains up and running under a basic transactional load. From there, we only need the following command to execute the test:

    ./dbqp.py --default-server-type=mysql --basedir=/path/to/Percona-Server --suite=randgen_basic innodbDictSizeLimit_test

This enabled us to reproduce the crash within 5 seconds. The reason I think this is interesting is that we were unable to duplicate this bug otherwise. The combination of the randgen’s power and dbqp’s organization helped us knock this out with about 15 minutes of tinkering. Once we had a bead on this bug, we went on to try a couple of other bugs:

Crash when query_cache_strip_comments enabled. For this one, we only modified the grammar file to include this as a possible WHERE clause for SELECT queries:

    WHERE X . char_field_name != 'If you need to translate Views labels into other languages, consider installing the <a href=\" !path\">Internationalization</a> package\'s Views translation module.'

The test value was taken from the original bug report. Similar creation of a test case file, plus modifications, resulted in another easily reproduced crash. I will admit that there may be other ways to go about hitting that particular bug, but we *were* practicing with new tools, and playing with dynamite can be quite exhilarating ; )

parallel option breaks backups and restores. For this bug, we needed to ensure that the server used --innodb_file_per_table and that we used Xtrabackup’s --parallel option. I also wanted to create multiple schemas, which we did via a little randgen / python magic:

    # populate our server with a test bed
    test_cmd = "./gentest.pl --gendata=conf/percona/bug826632.zz "
    retcode, output = execute_randgen(test_cmd, test_executor, servers)
    # create additional schemas for backup
    schema_basename = 'test'
    for i in range(6):
        schema = schema_basename + str(i)
        query = "CREATE SCHEMA %s" % (schema)
        retcode, result_set = execute_query(query, master_server)
        self.assertEquals(retcode, 0, msg=result_set)
        retcode, output = execute_randgen(test_cmd, test_executor, servers, schema)

This gave us 7 schemas, each with 100 tables (with rows 1-100). From here we take a backup with --parallel=50 and then try to restore it. These are basically the same steps we use in our basic_test from the xtrabackup suite; we just copied and modified the test case to suit our needs for this bug. With this setup, we get a crash / failure during the prepare phase of the backup. Interestingly, this only happens with this number of tables, schemas, and --parallel threads.

Not too shabby for about 30 minutes of hacking and explaining things, if I do say so myself. One of the biggest difficulties in fixing bugs comes from being able to recreate them reliably and easily. Between the randgen’s brutal ability to produce test data and queries and dbqp’s efficient test organization, we are now able to quickly produce complicated test scenarios and reproduce more bugs so our amazing dev team can fix them into oblivion : )
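For readers who want to see how those fragments hang together, below is a minimal, self-contained sketch of such a randgen test module. The gentest.pl command line and the server_requirements value are taken from the post; the execute_randgen stub, the class name, and the use of plain unittest are my own simplifications, since dbqp’s real helpers and scaffolding may differ.

    import shlex
    import subprocess
    import unittest

    # Stand-in for dbqp's execute_randgen helper (an assumption; the
    # real helper also knows about the servers dbqp manages).
    def execute_randgen(test_cmd, cwd=None):
        proc = subprocess.Popen(shlex.split(test_cmd), cwd=cwd,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output, _ = proc.communicate()
        return proc.returncode, output

    # Options dbqp passes to the server under test, as shown above.
    server_requirements = [["--innodb-dict-size-limit=200k --table-open-cache=10"]]

    class innodbDictSizeLimitTest(unittest.TestCase):
        def test_bug758788(self):
            # Drive a transactional randgen load against many small tables;
            # cwd would point at the randgen install (hypothetical path).
            test_cmd = ("./gentest.pl "
                        "--gendata=conf/percona/innodb_dict_size_limit.zz "
                        "--grammar=conf/percona/translog_concurrent1.yy "
                        "--queries=1000 "
                        "--threads=1")
            retcode, output = execute_randgen(test_cmd, cwd="/path/to/randgen")
            # The whole test hinges on one assertion: the server survived.
            self.assertTrue(retcode == 0, output)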
Posted over 12 years ago
So I’m back from the Percona dev team’s recent meeting. While there, we spent a fair bit of time discussing Xtrabackup development. One of our challenges is that as we add richer features to the tool, we need equivalent testing capabilities. However, it seems a constant in the MySQL world that available QA tools often leave something to be desired. The randgen is a literal wonder-tool for database testing, but it is also occasionally frustrating / doesn’t scratch every testing itch. It is based on technology SQL Server was using in 1998 (MySQL began using it in ~2007, IIRC). So this is no knock, it is merely meant to be an example of a poor QA engineer’s frustrations ; ) While the current Xtrabackup test suite is commendable, it also has its limitations. Enter the flexible, adaptable, and expressive answer: dbqp.

One of my demos at the dev meeting was showing how we can set up tests for Xtrabackup using the unittest paradigm. While this sounds fancy, basically we take advantage of Python’s unittest and write classes that use its code. The biggest bits dbqp does are searching the specified server code (to make sure we have everything we should), allocating and managing servers as requested by the test cases, and doing some reporting and management of the test cases. As the tool matures, I will be striving to let more of the work be done by unittest code rather than things I have written : )

To return to my main point, we now have two basic tests of xtrabackup:

Basic test of backup and restore:
* Populate the server
* Take a validation snapshot (mysqldump)
* Take the backup (via innobackupex)
* Clean the datadir
* Restore from backup
* Take a restored-state snapshot and compare it to the original state

Slave setup:
* Similar to our basic test, except we create a slave from the backup, replicating from the backed-up server.
* After the initial setup, we ensure replication is set up ok, then we do additional work on the master and compare master and slave states.

One of the great things about this is that we have the magic of assertions. We can insert them at any point of the test we feel like validating, and the test will fail with useful output at that stage. The backup didn’t take correctly? No point going through any other steps — FAIL! : ) The assertion methods just make it easy to express what behavior we are looking for. We want the innobackupex prepare call to run without error? Boom goes the dynamite!:

    # prepare our backup
    cmd = ("%s --apply-log --no-timestamp --use-memory=500M "
           "--ibbackup=%s %s" % (innobackupex, xtrabackup, backup_path))
    retcode, output = execute_cmd(cmd, output_path, exec_path, True)
    self.assertEqual(retcode, 0, msg=output)

From these basic tests, it will be easy to craft more complex test cases. Creating the slave test was simply a matter of adapting the initial basic test case slightly. Our plans include: *heavy* crash testing of both xtrabackup and the server, enhancing / expanding replication tests by creating heavy randgen loads against the master during backup and slave setup, and other assorted crimes against database software. We will also be porting the existing test suite to use dbqp entirely…who knows, we may even start working on Windows one day ; )

These tests are by no means the be-all-end-all, but I think they do represent an interesting step forward. We can now write actual, honest-to-goodness Python code to test the server. On top of that, we can make use of the included unittest module to give us all sorts of assertive goodness to express what we are looking for. We will need to, and plan to, refine things as time moves forward, but at the moment we are able to do some cool testing tricks that weren’t easily doable before.

If you’d like to try these tests out, you will need the following:
* dbqp (bzr branch lp:dbqp)
* DBD::mysql installed (the tests use the randgen and this is required…hey, it is a WONDER-tool!) : )
* innobackupex, a MySQL / Percona server, and the appropriate xtrabackup binary

The tests live in dbqp/percona_tests/xtrabackup_basic and are named basic_test.py and slave_test.py, respectively. To run them:

    $ ./dbqp.py --suite=xtrabackup_basic --basedir=/path/to/mysql --xtrabackup-path=/mah/path --innobackupex-path=/mah/other/path --default-server-type=mysql --no-shm

Some next steps for dbqp include:
1) Improved docs
2) Merging into the Percona Server trees
3) Setting up test jobs in Jenkins (crashme / sqlbench / randgen)
4) Other assorted awesomeness

Naturally, this testing goodness will also find its way into Drizzle (which currently has a 7.1 beta out). We definitely need to see some Xtrabackup test cases for Drizzle’s version of the tool (mwa ha ha!) >: )
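To make the basic test’s shape concrete, here is a minimal sketch of the backup/restore flow as a plain unittest case. The step order and the failure-stops-here assertion style mirror the post; the execute_cmd stub, the take_snapshot helper, and the hard-coded paths are hypothetical stand-ins for dbqp’s real plumbing.

    import shlex
    import subprocess
    import unittest

    # Simplified stand-in for dbqp's execute_cmd helper.
    def execute_cmd(cmd):
        proc = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output, _ = proc.communicate()
        return proc.returncode, output

    class backupRestoreTest(unittest.TestCase):
        # Hypothetical paths; dbqp supplies the real ones per test run.
        mysqldump = "/usr/bin/mysqldump"
        innobackupex = "/usr/bin/innobackupex"
        backup_path = "/tmp/backup"

        def take_snapshot(self):
            # Validation snapshot of all databases; a real comparison
            # would normalize the timestamp lines mysqldump emits.
            retcode, output = execute_cmd("%s --all-databases" % self.mysqldump)
            self.assertEqual(retcode, 0, msg=output)
            return output

        def test_basic_backup_restore(self):
            before = self.take_snapshot()
            # Take the backup; if this fails, stop right here -- FAIL!
            retcode, output = execute_cmd("%s --no-timestamp %s"
                                          % (self.innobackupex, self.backup_path))
            self.assertEqual(retcode, 0, msg=output)
            # ...cleaning the datadir and restoring (copy-back) go here...
            after = self.take_snapshot()
            # The restored server must match the original state.
            self.assertEqual(before, after)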
Posted over 12 years ago
Replication in Drizzle is very simple, and multi-source replication is supported. For a walkthrough of multi-master (multi-source) replication, see David Shrewsbury’s excellent post here. Because it was very succinctly written, I am quoting a lot from his post on provisioning a new slave, but I have added detail on the slave.cfg file for clarity for newbies like me, as well as more detail on the options and their purpose. A lot of this can also be found in the documentation, but here I’m going to walk through the steps. Also see the slave docs here for any questions you may have.

For our purposes we will walk through setting up basic replication between a master and a slave server. You will need to set up your slave.cfg file before you do anything else. It is usually located in the /usr/local directory but could be located anywhere you like; mine is in /tmp/slave.cfg. This is a typical setup:

    master-host = "your ip address"
    master-port = 4427
    master-user = kent
    master-pass = samplepassword
    io-thread-sleep = 10
    applier-thread-sleep = 10

Setting up the master is the next step. An important requirement is to start the master Drizzle database server with the --innodb.replication-log option, plus a few other options in most circumstances. More options can be found in the options documentation. These are the most common options needed for a replication master:

* The InnoDB replication log must be running: --innodb.replication-log
* The PID file must be set: --pid-file=/var/run/drizzled/drizzled.pid
* The address binding for Drizzle’s default port (4427): --drizzle-protocol.bind-address=0.0.0.0
* The address binding for systems replicating through MySQL’s default port (3306): --mysql-protocol.bind-address=0.0.0.0
* The data directory can be set to something other than the default: --datadir=$PWD/var
* For more complex setups, the server id option may be appropriate: --server-id=?
* To run Drizzle in the background, thereby keeping the database running if the user logs out: --daemon

So the start command looks like this on my server:

    master> /usr/local/sbin/drizzled \
      --innodb.replication-log \
      --pid-file=/var/run/drizzled/drizzled.pid \
      --drizzle-protocol.bind-address=0.0.0.0 \
      --mysql-protocol.bind-address=0.0.0.0 \
      --daemon

Starting the slave is very similar to starting the master, but there are a couple of steps before you are ready to start it up. The following is quoted from David’s blog post on simple replication:

“1. Make a backup of the master databases.
2. Record the state of the master transaction log at the point the backup was made.
3. Restore the backup on the new slave machine.
4. Start the new slave and tell it to begin reading the transaction log from the point recorded in #2.

Steps #1 and #2 are covered with the drizzledump client program. If you use the --single-transaction option to drizzledump, it will place a comment near the beginning of the dump output with the InnoDB transaction log metadata. For example:

    master> drizzledump --all-databases --single-transaction > master.backup
    master> head -1 master.backup
    -- SYS_REPLICATION_LOG: COMMIT_ID = 33426, ID = 35074

The SYS_REPLICATION_LOG comment tells the slave where to start reading from. It has two pieces of information:

* COMMIT_ID: This value is the commit sequence number recorded for the most recently executed transaction stored in the transaction log. We can use this value to determine proper commit order within the log. The unique transaction ID cannot be used since that value is assigned when the transaction is started, not when it is committed.
* ID: This is the unique transaction identifier associated with the most recently executed transaction stored in the transaction log.

Now you need to start the server without the slave plugin, then import the backup from the master, then shut down and restart the server with the slave plugin. This is straight out of the docs:

    slave> sbin/drizzled --datadir=$PWD/var &
    slave> drizzle < master.backup
    slave> drizzle --shutdown

Now that the backup is imported, restart the slave with the replication slave plugin enabled and use a new option, --slave.max-commit-id, to force the slave to begin reading the master’s transaction log at the proper location:”

You need two options for sure: adding the slave plugin and defining the slave.cfg file. So the most basic start command is:

    slave> /usr/local/sbin/drizzled \
      --plugin-add=slave \
      --slave.config-file=/usr/local/etc/slave.cfg

A more typical startup will need more options. My startup looks like this:

    slave> /usr/local/sbin/drizzled \
      --plugin-add=slave \
      --datadir=$PWD/var \
      --slave.config-file=/usr/local/etc/slave.cfg \
      --pid-file=/var/run/drizzled/drizzled.pid \
      --drizzle-protocol.bind-address=0.0.0.0 \
      --mysql-protocol.bind-address=0.0.0.0 \
      --daemon \
      --slave.max-commit-id=33426

The slave.max-commit-id value is found in the dump file that we made from the master and tells the slave where to start reading from. If you need more info for your particular setup, the sys replication log and the innodb replication log tables provide a lot of detail that will help. Two tables in the DATA_DICTIONARY schema provide different views into the transaction log: the SYS_REPLICATION_LOG table and the INNODB_REPLICATION_LOG table.

    drizzle> SHOW CREATE TABLE data_dictionary.sys_replication_log\G
    *************************** 1. row ***************************
           Table: SYS_REPLICATION_LOG
    Create Table: CREATE TABLE `SYS_REPLICATION_LOG` (
      `ID` BIGINT,
      `SEGID` INT,
      `COMMIT_ID` BIGINT,
      `END_TIMESTAMP` BIGINT,
      `MESSAGE_LEN` INT,
      `MESSAGE` BLOB,
      PRIMARY KEY (`ID`,`SEGID`) USING BTREE,
      KEY `COMMIT_IDX` (`COMMIT_ID`,`ID`) USING BTREE
    ) ENGINE=InnoDB COLLATE = binary

    drizzle> SHOW CREATE TABLE data_dictionary.innodb_replication_log\G
    *************************** 1. row ***************************
           Table: INNODB_REPLICATION_LOG
    Create Table: CREATE TABLE `INNODB_REPLICATION_LOG` (
      `TRANSACTION_ID` BIGINT NOT NULL,
      `TRANSACTION_SEGMENT_ID` BIGINT NOT NULL,
      `COMMIT_ID` BIGINT NOT NULL,
      `END_TIMESTAMP` BIGINT NOT NULL,
      `TRANSACTION_MESSAGE_STRING` TEXT COLLATE utf8_general_ci NOT NULL,
      `TRANSACTION_LENGTH` BIGINT NOT NULL
    ) ENGINE=FunctionEngine COLLATE = utf8_general_ci REPLICATE = FALSE

There you are, you should be up and running with your replication setup. For more details you can always check the online documentation. And make sure you check out dshrewsbury.blogspot.com.
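Since the only hand-off from master to slave is that single COMMIT_ID value in the dump header, the bookkeeping is easy to script. Below is a small sketch that pulls COMMIT_ID out of master.backup and prints the matching slave start command; the option names come from the post above, while the file path and helper function are hypothetical.

    import re

    # drizzledump --single-transaction writes a comment like this at the
    # top of the dump:
    # -- SYS_REPLICATION_LOG: COMMIT_ID = 33426, ID = 35074
    def read_commit_id(dump_path):
        with open(dump_path) as dump:
            for line in dump:
                match = re.search(r"SYS_REPLICATION_LOG: COMMIT_ID = (\d+)", line)
                if match:
                    return int(match.group(1))
        raise ValueError("no SYS_REPLICATION_LOG comment found in %s" % dump_path)

    commit_id = read_commit_id("master.backup")  # hypothetical path
    print("/usr/local/sbin/drizzled \\\n"
          "  --plugin-add=slave \\\n"
          "  --slave.config-file=/usr/local/etc/slave.cfg \\\n"
          "  --slave.max-commit-id=%d" % commit_id)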
Posted over 12 years ago
The Fremont beta2, version 2011.11.29, is out and ready to be tested. In this release:

* continuing refactoring, restructuring, and code quality improvements
* many more documentation improvements
* documentation available at docs.drizzle.org
* fixes to libdrizzle .pc support
* fixes to build scripts
* additional bugs fixed

The Drizzle download file can be found here.
Posted over 12 years ago
I’m surprised and delighted to see that the Drizzle documentation was updated recently. Last time I looked, it was the original documentation which was missing, among other things, information about its 70+ plugins. So Henrik and I began filling in missing pieces of crucial information like administering Drizzle. I even generated skeleton documentation for every [...]
Posted over 12 years ago
The entire drizzle.org domain was unavailable for about 10 hours today. This made our website, documentation, Jenkins master and mail server inaccessible. On the other hand, as we use public services such as Launchpad and Freenode for our code repository, bug tracking, mailing list and IRC, development work continued as actively as ever - in fact I think it was the most active day on the IRC #drizzle channel in a while!

The DNS outage was related to our transferring of the drizzle.org domain from an individual Drizzle developer to Software in the Public Interest, Inc, our umbrella non-profit corporation. We don't know exactly why, but something went wrong between the registrars, so that the Whois record listed Tucows, the sponsoring registrar used by SPI, as the new registrar, but all other information was still pointing to the old registrar, including some Godaddy nameservers. As Godaddy eventually stopped answering DNS queries for drizzle.org - as they should - the drizzle.org domain became unavailable. 10 hours later the issue was fixed, and the correct SPI nameservers started to propagate through the DNS system. At the time of this writing, everything should have been working normally for some hours already.

And yes, in related news, drizzle.org has now been transferred to the ownership of Software in the Public Interest. This is yet another step in our process of becoming a solid non-profit community project, with fiscal services provided by the SPI. So far the experience has been enjoyable and we've really felt a warm welcome into the family of SPI-hosted free and open source software projects. On that note I'd like to thank Ganneff, Solver and Hydroxide from the #spi channel for actively helping in troubleshooting and fixing the problem today.
Posted over 12 years ago by Mark Atwood
The Drizzle project regularly gets people asking what they can do to get involved in the project. One very easy way to brush up on your C++ skills and dip your toe into our open development process is to fix minor warnings. We are very proud that Drizzle builds with zero warnings with "gcc -Wall -Wextra". But we can be even better! Our JenkinsCI system has a target that is even more picky, and also a target that runs cppcheck. Go to one of those pages, pick a build log off the build history, find a warning that you think you can fix, and then ask us in the #drizzle channel on Freenode how to send your fix to us. After you've done that a few times, you'll be ready to fix some low-hanging fruit. We've had people graduate from this process into becoming a Google Summer of Code student, and eventually having a full-time paying job hacking on Drizzle and other open source software. And it all starts with writing a simple warning fix.
Posted over 12 years ago
Here are the slides to my second talk at last week's Percona Live event in London: Fixed in drizzle.