< New Database Command Line Client || Moving On >
Pluggable Database Client Tool
November 23rd, 2009
A few weeks ago I wrote about a student group who will be working with the Drizzle community to build a new database client tool. While the tool will be the primary replacement for the Drizzle client tool, we hope it will be generic (using the Python DB API) so it will work with others like MySQL and PostgreSQL. We’ve had a number of great discussions, including a session at OpenSQL camp last weekend. I wanted to toss out a few ideas of how such a tool could be structured to allow for maximum extensibility.
One possibility is to borrow from typical Unix shells and DSP processing systems where you have a number of modules with I/O interfaces and data exchange formats between each module. Each module provides a specific signature so you know what other modules it can plug into. Here is a simple example:
 |
Simple
drizzle> SELECT TABLE_SCHEMA,TABLE_NAME FROM INFORMATION_SCHEMA.TABLES LIMIT 3;
+--------------------+---------------------------------------+
| TABLE_SCHEMA | TABLE_NAME |
+--------------------+---------------------------------------+
| information_schema | CHARACTER_SETS |
| information_schema | COLLATIONS |
| information_schema | COLLATION_CHARACTER_SET_APPLICABILITY |
+--------------------+---------------------------------------+
3 rows in set (0 sec)
The client tool would start with a single active module, the Command Line reader that provides an interactive prompt. When a command is read, it creates a pipeline including a Query Executer module then a Console Output module (these are all the sane “default” modules to use). Once the pipeline is created, it then provides a Command message to the Query Executer, which talks to the appropriate Drizzle database server, and then starts providing column headers and rows in a Data Set message format to the Console output plugin. Initially this may seem a bit over engineered for such a simple use case, but let’s explore the flexibility this provides.
|
 |
File Reader/Writer
drizzle> grep foo < commands > results
In this example, the client tool is started with the input module being a file reader. This module will read commands from the file (or stdin if things are piped into stdin) and feed each command to the Query Executer module. Since we specified an optional “grep foo” on the command line, this will initialize the ‘grep’ module (it may be more efficient to put this filtering into the WHERE clause, but ignore that for now). The result set will therefore be piped into the grep module, and then a possibly modified result set is pipelined out of that and into the File Writer module which will write the data set to ‘results’. Now imagine being able to write your own plugins, similar to the grep plugin, that can easily plug right into the pipeline just by using the common name. Each plugin can have any number of options and arguments, allowing you to create flexible modules to do client-side data set processing.
CSV Processor
drizzle> csv input.csv | INSERT INTO x VALUES ($1, $3, $5);
In this case, the command ‘csv’ causes a module to be initialized that will read the file input.csv file. The fields are separated out and formatted into a Data Set exchange message that is then fed into a query generator. The argument to the query generator is the INSERT command from the command line, and a number of INSERT commands would then be generated from the CSV input. These commands would then be piped into the query executer module and eventually results would be displayed on the console. Any other modules could also be plugged in to work on either the input or output data sets (filtering, rewriting, …).
|
 |
 |
Multi-Server/Result Aggregator
drizzle> \servers 10.0.0.1,10.0.0.2,10.0.0.3
drizzle> SELECT TABLE_SCHEMA,TABLE_NAME FROM INFORMATION_SCHEMA.TABLES LIMIT 3; | merge -c
+-------+--------------------+---------------------------------------+
| COUNT | TABLE_SCHEMA | TABLE_NAME |
+-------+--------------------+---------------------------------------+
| 3 | information_schema | CHARACTER_SETS |
| 3 | information_schema | COLLATIONS |
| 3 | information_schema | COLLATION_CHARACTER_SET_APPLICABILITY |
+-------+--------------------+---------------------------------------+
3 rows in set (0 sec)
This example takes a list of servers, executes a command on all of them, aggregates the results, and displays the final result to the console. This specific command may not be all that useful, but similar aggregate data sets could be very useful when you manage a number of servers or have sharded data sets across multiple servers. In the command above, the ‘merge -c’ command merges all the resulting data sets into a single data set, and also inserts a ‘count’ column showing how many of each row appeared during the merge (think of the uniq -c unix command line tool).
If you took this concept a step further, you could read result sets from one command and pipe them into a another command generator that could talk to one or more servers. This allows the tool to become a generic “data set router” for various purposes. Rather than using a command line interface, you could replace the Command Line and Console Output plugins with a simple GUI input/out. The entire system could also be used internally within your applications to provide a richer set of functionality that the normal simple DB API provides. Building such a system up from a series of equal modules (ie, there are no first class modules) would allow great flexibility for users and developers to customize to their environment.
|
Posted in Drizzle, Main, MySQL
Leave a Reply
< New Database Command Line Client || Moving On > |
Blog
Wiki
About
Resume
RSS
Comments
E-Mail
Launchpad
LinkedIn
Twitter
identi.ca
Facebook
OpenStack
Scale Stack
Gearman
NW Veg
Veg Food & Fit

|
very good to have seperate modules / classes like command interface, query executer and console display.
A web based client tool developer can just modify the last one.
or there can be a firefox pluggin as client tool just like sqlitemanager.. who knows the future :)
but other things like multiserver Aggregator looks an overkill now.
anyway modularity is good for future.
Thanks for the post.