Eric Day
Thoughts, code, and other oddments.
Archive for the "MySQL" Category

Drizzle Protocol Changes
March 17th, 2010

On an entirely unrelated note to the MySQL protocol discussions happening yesterday, the MySQL protocol is now the default protocol in Drizzle as of Monday’s tarball (3/15). Drizzle supports a limited version of the MySQL protocol, covering only the subset of commands Drizzle cares about (no server-side prepared statements, replication, or commands deprecated in favor of SQL query equivalents). Not all MySQL clients have been fully tested with it, but our entire test suite is using it now with the libdrizzle MySQL implementation. The latest release of libdrizzle also defaults to the MySQL protocol and port for Drizzle connections.

There has been some debate about this change, even amongst some of the core developers. The current Drizzle protocol is a slight modification of the MySQL protocol. It has been running on the IANA assigned Drizzle port (4427) by default since the beginning. We are developing a new Drizzle protocol which will have a number of new features, as well as being more extensible than the old protocol. It may be some time before this protocol is stable and tested well enough to make it the default. Thinking into the future, there are a couple of upgrade paths to consider when it is ready:

The first option is to declare a certain release the “new protocol release” and clients talking to it would need to be upgraded at the same time. This would switch the line-level protocol on port 4427 from the old Drizzle protocol to the new. Clients would be required to upgrade because those using the old protocol would break when trying to connect to the new server-side protocol module (there are no clever, efficient hacks here because different sides send the first byte for each protocol).
This option doesn’t involve any dependencies on the MySQL protocol module or APIs at all, but it does force an upgrade in the future for client tools and APIs.

The second option is to make the MySQL protocol the default protocol now, and when the new Drizzle protocol module is ready and available, tell folks they can start using it. Those people using libdrizzle or its derived APIs will have an easy transition because a libdrizzle release can change the default as well. This allows us to put the Drizzle protocol module (port 4427) in an experimental mode that changes between releases as we develop it. This introduces a required dependency on the MySQL protocol module for the time being, and possible confusion when we make the switch back.

Both approaches have pros and cons, and for better or for worse, it has been decided to take the latter approach. If you start running the drizzle tarball being released today, the client tools will speak MySQL to port 3306 by default, and the latest libdrizzle release defaults to this as well. If you are running Drizzle on the same machine as MySQL/MariaDB and you get a port conflict on 3306 when starting, be sure to start drizzled with --mysql-protocol-port=X to bind to a different port (you’ll of course need to use the same port in the client utils/APIs when connecting).

Posted in Drizzle, Main, MySQL | 8 Comments

Open Source Bridge 2010
March 16th, 2010

A couple of months ago Selena Deckelmann asked if I wanted to co-chair the Open Source Bridge Conference this year, and I was thrilled to say yes! This conference is run entirely by some of the most dedicated volunteers I have ever seen, and I’m excited to be working with such a fantastic group of people. The conference is also backed by the 501(c)(3) non-profit Technocation, which is primarily run by Sheeri Cabral, who is well known in the MySQL community. The conference is June 1-4 in Portland, OR, and will be held at the Portland Art Museum.
The call for proposals is open until the end-of-day on March 25th, so please submit your ideas now! Early registration is also open until April 1st, so now would also be a great time to sign up. We are still looking for sponsors, so if your organization or company is interested, please get in touch!

Emphasizing what our about page describes, this conference is slightly different from others in that we’re focused on open source citizenship. We want folks to openly share ideas in a variety of ways and to get things done when they attend. We are also trying to keep the cost as low as possible so it is accessible to a larger audience. We’re also partnering with O’Reilly’s OSCON, which is in Portland during late July, because both sides feel the two conferences complement each other; we are in no way trying to compete. Portland is bursting with so much open source, one conference could not contain it! You should come check it out. :)

Posted in Drizzle, Main, MySQL | 1 Comment

Drizzling from the Rackspace Cloud
March 8th, 2010

Since I left Sun back in January, folks have been asking what was next. I’m happy to say that I’m going to continue hacking on open source projects like Drizzle and Gearman, but now at the Rackspace Cloud. Not only will I be there, but I get to continue working closely with a few of the amazing Drizzle hackers who have also joined, including Monty Taylor, Jay Pipes, Stewart Smith, and Lee Bieber.

Why Rackspace Cloud? Late last year I was considering what I wanted to do next with the Oracle acquisition looming near, and this was one of the options that presented itself. Rackspace had been a supporter of Drizzle from early on by offering virtual machines to develop and test on, and when talking to some folks more closely, something really hit home. Rackspace provides first-class service and “fanatical” support; they are not a software company.
One might ask why an open source software developer would be interested in a company that doesn’t create software or vice-versa, and the answer is that Rackspace wants to find ways to offer the best possible service now and into the future. What better way than to help develop the next generation of service software and get a jump start on integrating this into their architecture? Both the open source community and Rackspace win.

Another thing I learned while talking with Rackspace is that one of their core principles is transparency. This applies to both customers and employees, and anyone within an open source community can appreciate this. The more I learned about the company and the folks within it, the more impressed I was at the lack of internal barriers or “need-to-know” information. One of Drizzle’s core goals is also transparency, from discussing design decisions on public mailing lists and IRC, to having the entire project management infrastructure hosted out in the open at Launchpad.

What does this mean for the Drizzle project? It means continued support for a number of core developers, more infrastructure for development, and most importantly in my eyes, more context. One of the Drizzle tag-lines is “A Lightweight SQL Database for Cloud and Web,” so what better place to develop a database designed for the cloud than on one of the fastest growing cloud platforms? We’ll get a detailed look at the demands, get feedback from cloud customers, and have the perfect test bed for offering new services. We’ll also be able to work closely with a top-notch group of DBAs, developers, and sysadmins in one of the most demanding service architectures out there. This invaluable context will help the Drizzle developers make more informed decisions moving forward, which also means better software for the community.

Personally, this also means getting back to my hosting roots. Before Sun, I worked at Concentric for almost 10 years in a clustered hosting environment.
I’m very familiar with many of the multi-tenant scalability concerns Rackspace has, and I’m excited to be working in this type of environment again. We’ve already been working closely with the MySQL DBAs at Rackspace to learn what the biggest pain points are for a multi-tenant architecture, and we’ll be taking steps to address these as it will help anyone wanting to run Drizzle in a cloud-like environment. Drizzle’s modular architecture has already proved useful, as some of these concerns are easily answered with “oh, we have a plugin point for that.”

I’m excited, this is going to be a fun ride.

Posted in Drizzle, Gearman, Main, MySQL | 16 Comments

C++, or Something Like It
March 5th, 2010

I’ve developed primarily in C most of my career, and recently decided to give C++ a shot as my “primary language” due to hacking on Drizzle and MySQL. The past few months I’ve read and experimented with most features C++ provides over C, including reading Scott Meyers’ excellent “Effective” series books (highly recommended). Along the way I’ve been developing a project I’ve wanted to write for a while, and I’m finding some features to be problematic. I thought I’d share these issues so others can be aware of them and perhaps I can learn better workarounds.

The project I’ve been working on uses dynamic shared object loading at runtime (using dlopen() and friends), is threaded, and has just about every strict compiler warning you can find turned on and treated as errors (thanks to Monty Taylor’s pandora-build project). I’m also testing on various platforms and compilers, including Linux, OpenSolaris, and OSX. I have also been trying my best to avoid any dependencies on large C++ libraries like Boost and just stick to the standard language and STL.
With these requirements in mind, here are the issues I’ve run into:

Can’t Reliably Use Exceptions

My first pass relied on exceptions, but this proved problematic on some architectures as soon as custom exceptions were being thrown across module boundaries. This comes down to ABI issues for some shared object formats generated by some compiler versions. While you can make it work in some environments, it’s not going to be portable. This means I’ve had to catch exceptions closer to where they are thrown, requiring a lot more try/catch blocks, and not being able to take full advantage of automatic stack cleanup. This also means resorting back to the C way of handling errors: returning and checking return codes while generating error strings. To be completely exception safe, this means not using std::string for error returns, since it can throw exceptions while building useful error messages. Not using exceptions has had a viral effect throughout the rest of the design of the code, making it look more like C. I was a bit disappointed by this, as not having to check every function’s return code was keeping the code very clean. :)

Limited Use of the STL and std::string

I was excited to take advantage of the STL, as writing things like doubly-linked lists and hash tables for every C struct was getting a bit old (I did have a set of macros I used, but they were not the most popular in some circles because of certain C-preprocessor features). When I learned more about the internals of the STL, and how it relies heavily on copying objects, my heart sank a little. It completely makes sense in the design, it’s just not as efficient as it could be (especially coming from a place where I would optimize to reduce pointer copies in C). No worries, I just created private copy constructors/assignment operators and only used pointers to objects. This came with its own set of issues with pointer management and avoiding leaks if the ‘new’ operator were to fail.
Once I had worked out the memory management issues, there were still exceptions to watch out for, including figuring out all the methods that may throw (usually due to an internal allocation). This is especially annoying when doing simple std::string operations like assignment or concatenation, and having to always catch around those. With other annoyances like the reference-to-reference issues and std::unary_function having a non-virtual destructor, I’ve ended up using a watered down set of STL algorithms and resorted to a mix of non-STL containers and custom algorithms for some things. The lack of thread safety in STL containers and differences in implementations have also led me to not use STL containers for thread communication (using a mutex for every access is not efficient).

Conclusion

For the sake of consistency, I’ve wondered whether it’s worth incorporating STL components at all. Is it better to have a mix or none at all? This would leave inheritance, polymorphism, member protections, namespaces, and automatic object destruction as the only C++ features being used. These are still very good reasons to use C++, but I’ve found the transition to not be as productive as I had hoped. I am very curious to hear other folks’ thoughts on their experience with any of the issues above.

Posted in Drizzle, Main, MySQL | 4 Comments

MySQL Conf & Drizzle Dev Day
February 17th, 2010

I’m glad to announce that we’ll be having a Drizzle developer day again this year on the Friday after the MySQL Conference! Be sure to sign up and add any topic ideas you may have so we know what folks are interested in. Space is limited!

While at the MySQL Conference, I’ll be speaking with Monty Taylor on “Using Drizzle.” This will take a non-developer approach to the project, so everyday DBAs and web developers should find this interesting. I’ll also be teaming up with Giuseppe Maxia to talk about Gearman in three sessions. These include:
We’re also going to have a combo Drizzle/Gearman booth in the expo hall, so be sure to stop by and chat. See you there!

Posted in Drizzle, Gearman, MySQL | No Comments

Moving On
January 11th, 2010

Friday was my last day at Sun Microsystems, and today is the first day at my new job (location coming soon). I’ve had a great time at Sun, and thank them for all the opportunities given to me there. I’ll be doing mostly the same work at the new gig, working on projects like Drizzle, but with a slightly different focus. For the most part my day-to-day won’t change much.

Right now I’m focusing on libdrizzle again and am implementing the prepared statement API, cleaning up the MySQL protocol support a little, and also implementing the new Drizzle client/server protocol. I’ll continue to work on Gearman as well, especially where it is relevant to Drizzle. I also need to start blogging again with specific topics in the projects I’m working on; I’ve been fairly quiet lately.

I’ll be in New Zealand next week at Linux Conf AU (yes, it’s not in AU this year). I have a talk on Gearman, and it looks like I’ll also be helping out with the Drizzle talk. It will be really nice to escape the Portland, OR winter for a bit. :)

Posted in Drizzle, Gearman, Main, MySQL | 4 Comments

Pluggable Database Client Tool
November 23rd, 2009

A few weeks ago I wrote about a student group who will be working with the Drizzle community to build a new database client tool. While the tool will be the primary replacement for the Drizzle client tool, we hope it will be generic (using the Python DB API) so it will work with others like MySQL and PostgreSQL. We’ve had a number of great discussions, including a session at OpenSQL camp last weekend. I wanted to toss out a few ideas of how such a tool could be structured to allow for maximum extensibility.
One possibility is to borrow from typical Unix shells and DSP processing systems where you have a number of modules with I/O interfaces and data exchange formats between each module. Each module provides a specific signature so you know what other modules it can plug into. Here is a simple example:
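The sketch below is not the example from the original post. It is a minimal, hypothetical illustration in Python of the idea just described: each module declares what data format it accepts and produces (its "signature"), and modules can only be plugged together when those formats line up. The names `Module`, `connect`, and `fake_query`, and the format labels "text" and "rows", are all invented here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Module:
    """One pipeline stage with a declared input/output signature."""
    name: str
    accepts: str   # data format consumed, e.g. "text" or "rows"
    produces: str  # data format emitted
    run: Callable[[Iterable], Iterable]


def connect(modules: List[Module]) -> Callable[[Iterable], list]:
    """Check that adjacent signatures match, then return the composed pipeline."""
    for left, right in zip(modules, modules[1:]):
        if left.produces != right.accepts:
            raise TypeError(
                f"{left.name} produces {left.produces!r} but "
                f"{right.name} accepts {right.accepts!r}"
            )

    def pipeline(data: Iterable) -> list:
        for module in modules:
            data = module.run(data)
        return list(data)

    return pipeline


def fake_query(statements):
    # Hypothetical stand-in: a real module would send each statement to a server.
    for _ in statements:
        yield from [("alice", 30), ("bob", 25)]


query = Module("query", "text", "rows", fake_query)
adults = Module("filter", "rows", "rows",
                lambda rows: (r for r in rows if r[1] >= 30))
to_csv = Module("csv", "rows", "text",
                lambda rows: (",".join(map(str, r)) for r in rows))

client = connect([query, adults, to_csv])
print(client(["SELECT name, age FROM users"]))  # ['alice,30']
```

Checking signatures at connect time, rather than when data flows, means a mismatched chain such as `connect([to_csv, adults])` fails immediately with a `TypeError`, much like a shell refusing a pipeline it cannot wire up.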
New Database Command Line Client
October 29th, 2009

A few weeks ago I proposed a project to students at Portland State University for their senior capstone class, and this weekend I found out it was chosen by a group! The project will be a rewrite of the command line tool (the Drizzle tool is currently based on the ‘mysql’ tool), plus a lot of new features. We’re really excited to be working with them, and they seem equally excited about the project too. I hope DBAs, developers, and other folks in the Drizzle/MySQL/MariaDB communities will work with them to help define what features should be part of this new command line client.

Some new features we have in mind are background queries, piping and redirection of queries (like a normal shell), and plugin support. It will also support at least the MySQL/MariaDB protocol since it will be built on libdrizzle, but possibly more if we end up using a common DB API (we’re pondering Python). If you have any ideas or feature requests, feel free to leave a comment. The student group will be sending plans to the Drizzle mailing list soon for feedback, as well as attending OpenSQL Camp and leading a session on what folks would like to see in a client tool.

Join me in welcoming Clark, Ken, Max, Victoria, David, and Andreas!

Posted in Drizzle, Main, MySQL | 5 Comments

OpenSQL Camp, SQL vs NoSQL
October 26th, 2009

The upcoming OpenSQL Camp is almost full! We have space for 130 people to register, and as of this writing only 10 spots are free. If you want to attend, sign up before it’s too late! We’re still looking for a few sponsors if anyone is interested in helping cover food and t-shirt costs.

I’m organizing the closing keynote panel, “SQL vs NoSQL”, which will include core community members and committers from a number of open source databases. Selena has offered to take the PostgreSQL position if we don’t find another worthy contender. So far, it will include:
I’ll be sure to report who is the last one standing so we know which project to follow the closest. :)

Posted in Drizzle, Main, MySQL | 1 Comment

Eventually Consistent Relational Database?
October 12th, 2009

This weekend I attended Drupal Camp PDX and listened to a session titled “Drupal in the Cloud”. The presenter, Josh Koenig from Chapter Three, gave a great introduction to what moving to “the cloud” really means, especially in the context of a typical web application like Drupal. The problem, which is of course no fault of Josh’s, is that the best high availability database practices are harder to deploy because you’re working within a different set of constraints in the cloud. Sure, you can set up MySQL replication, but without the ability to insert a hardware load balancer or better control over floating IPs, reliable single-master solutions are difficult at best.

I spoke with Josh for a bit after and discussed how Drizzle is doing things to help and what it would take to have a Drizzle back end for Drupal (turns out it should not be too difficult). We then got onto the topic of what some of the newer non-relational databases would look like for Drupal, and the short answer is it would be extremely difficult. Drupal, in both the core and many of the modules, depends on a relational model for the underlying data. This is not unique to Drupal. People, and the software they write, have thought “relational” for decades when it comes down to data. Sure, the various NoSQL projects are becoming more popular, but the masses are still thinking in terms of joining tables.

Silver Bullet

So, what would be the silver bullet? A relational database that did not depend on a single master. Not just dual-master setups with offset auto-increment, I’m talking about removing the entire concept of master-slave for replication. This is obviously nothing new in the industry, but it’s never been easy to accomplish.
Just do some reading on distributed locking algorithms and you’ll get the idea. The main problem with distributed locking algorithms is that they don’t scale.

But, what about an eventually consistent replication model for a relational database? So far eventually consistent databases have not been relational (document based like CouchDB or simple key/value pairs) and relational databases have always focused on atomic consistency or some close relaxed relative (various levels of serialization). As a thought experiment, I’m going to attempt to describe what this may look like under the hood at a high level.

Eventually consistent?

Not familiar with this term? Take a look at Werner Vogels’ article on the topic. The main idea behind EC is that you sacrifice the ability for all nodes to see the exact same thing at any given time (consistency), but in return you can tolerate network partitions and you have availability. This directly relates to the CAP theorem, which states you only get two of: Consistency, Availability, and tolerance to network Partitions. So, we are throwing out “C” so we can get rid of those nasty distributed locking algorithms, but in return we take on “EC”.

MyEventuallyConsistentSQL

Let’s start off with a traditional relational database and start modifying it until we have something that looks like an ECRDBMS (ok, maybe this acronym is a bit wordy).
What are we missing? What else would break down if we toss out atomic consistency and make the above changes? One thing I left out is DDL operations. Those would require some more thought, but I’m pretty sure we could figure out a way to handle conflicting events, possibly with configuration parameters to control the decisions made in conflict resolution algorithms. For example, if an UPDATE event gets applied after an ALTER TABLE that removed a column referenced in the UPDATE, you could just ignore that value and apply the other updates (if any). Chances are you didn’t want that column if it was removed at about the same time. This model has the major benefit of not having to worry about which node is the master or keeping an ordered replication log; the nodes all operate independently and toss deterministic events which can be applied in any order.

Summary

This would-be-ECRDBMS looks a bit different on the inside, but from the outside it will look pretty familiar. From the normal web application perspective we are still creating tables, inserting data, joining data, and doing all the things we depend on from a relational database. This may not be a great idea, but I think it would be possible if you are willing to accept some of the behaviors that come along with it.

So what do you think? How can it be improved? Would you use it for your application?

Posted in Drizzle, Main, MySQL | 9 Comments
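The conflict-handling idea from the post above (an UPDATE arriving after an ALTER TABLE that dropped one of its columns, and events that can be applied in any order) can be sketched in Python. This is only a toy model under loud assumptions, not the design from the post: the `Replica` class, the event dictionaries, and the last-writer-wins rule keyed on a `(timestamp, node id)` stamp are all hypothetical, and a dropped column can never be re-added here.

```python
from dataclasses import dataclass, field
from itertools import permutations
from typing import Any, Dict, Set, Tuple

Stamp = Tuple[int, str]  # (logical timestamp, node id); node id breaks ties


@dataclass
class Replica:
    # (primary key, column) -> (stamp of the winning write, value)
    cells: Dict[Tuple[Any, str], Tuple[Stamp, Any]] = field(default_factory=dict)
    dropped: Set[str] = field(default_factory=set)  # columns removed by DDL

    def apply(self, event: dict) -> None:
        if event["type"] == "drop_column":
            col = event["column"]
            self.dropped.add(col)
            for key in [k for k in self.cells if k[1] == col]:
                del self.cells[key]
        elif event["type"] == "update":
            for col, value in event["values"].items():
                if col in self.dropped:
                    continue  # column is gone: ignore this value, keep the rest
                current = self.cells.get((event["pk"], col))
                # Last writer wins: only a strictly newer stamp replaces a cell.
                if current is None or event["stamp"] > current[0]:
                    self.cells[(event["pk"], col)] = (event["stamp"], value)

    def row(self, pk: Any) -> Dict[str, Any]:
        return {col: v for (k, col), (_, v) in self.cells.items() if k == pk}


events = [
    {"type": "update", "pk": 1, "stamp": (1, "a"),
     "values": {"name": "Ann", "nick": "annie"}},
    {"type": "update", "pk": 1, "stamp": (2, "b"), "values": {"name": "Anna"}},
    {"type": "drop_column", "column": "nick"},
]

# Apply the same three events in every possible order on a fresh replica.
final_rows = []
for order in permutations(events):
    replica = Replica()
    for event in order:
        replica.apply(event)
    final_rows.append(replica.row(1))

print(all(r == {"name": "Anna"} for r in final_rows))  # True
```

Every ordering converges on the same row because each event's effect depends only on the stamps and the set of dropped columns, never on arrival order. Real DDL (re-adding a column, renames, type changes) would need the additional thought the post mentions.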
Copyright (C) Eric Day - eday@oddments.org
All content licensed under the Creative Commons Attribution 3.0 License. Hosted by Rackspace Cloud