Eric Day

Thoughts, code, and other oddments.

Asynchronous I/O – How It Could Speed Up Your App

May 6th, 2008

In a previous post I wrote about how I have started implementing asynchronous I/O into the MySQL client library. I plan on contacting and working with other client API maintainers (PHP, Python, Ruby, …) to make sure this functionality gets pushed out to those places too. Any comments or suggestions on how the interfaces should behave are of course welcome, and I’ll get patches posted somewhere for testing once I have the basics working. This is also my first project going through the MySQL Community Contributions Program so it can be included as part of a later release. I sent out the first contact e-mail to MySQL a week ago, but have not received any response yet. Anyone at MySQL listening? :)

For those of you new to the idea of asynchronous clients, check out the Wikipedia page for an introduction. The basic idea is to be able to issue a query, have the function return immediately, do some other work (while the server processes the query), and then check for the query response and process the result (as normal). This may not be much of a gain (if any) for fast queries, but for those potentially sluggish queries that take a few hundred milliseconds, this could be significant (assuming you have something else to process during that time). This is especially so if you need to issue multiple queries that each take a little time, since they could all run in parallel on separate connections.

Let me demonstrate with a simple use case, ignoring error checking for brevity:

...
$mysql = mysql_connect("localhost", "myuser", "mypass");
$result = mysql_query("...query that takes 500ms...");
...process result...
$result = mysql_query("...query that takes 500ms...");
...process result...
...

Imagine this is part of a PHP script for our new webpage, and it needs to issue two queries to the MySQL server. Each of these queries takes approximately 500ms, resulting in a total processing time of ~1 second. Now if the code uses asynchronous I/O:

...
$mysql[0] = mysql_connect("localhost", "myuser", "mypass", 1, MYSQL_CLIENT_ASIO);
$mysql[1] = mysql_connect("localhost", "myuser", "mypass", 1, MYSQL_CLIENT_ASIO);
mysql_query_start("...query that takes 500ms...", $mysql[0]);
mysql_query_start("...query that takes 500ms...", $mysql[1]);
$result = mysql_wait($mysql[0]);
...process result...
$result = mysql_wait($mysql[1]);
...process result...
...

(Note: this is pseudocode, and it will just break if you try it currently!)

In this example, the “mysql_query_start” function calls will return immediately (send the packet out on the socket and return). This allows the server to process both of them in parallel, resulting in a total processing time closer to ~500ms. We just cut our page load time in half! (…well, ignoring network latency, but you get the idea)
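To make the timing claim concrete, here is a minimal Python sketch that runs against a stand-in server rather than MySQL. Everything in it is an assumption for illustration: `SlowHandler` fakes a server that takes 500ms per query, and `query_start` and `wait` are hypothetical analogues of the proposed "mysql_query_start" and "mysql_wait" calls, not part of any real client library.

```python
import socket
import socketserver
import threading
import time

QUERY_TIME = 0.5  # seconds each simulated query takes on the fake server


class SlowHandler(socketserver.BaseRequestHandler):
    """Stand-in for a database server: read a query, wait, send a reply."""

    def handle(self):
        query = self.request.recv(1024)
        time.sleep(QUERY_TIME)  # pretend to execute the query
        self.request.sendall(b"result for " + query)


class Server(socketserver.ThreadingTCPServer):
    daemon_threads = True  # one thread per connection, like concurrent sessions


def query_start(conn, query):
    """Hypothetical analogue of mysql_query_start: send and return at once."""
    conn.sendall(query)


def wait(conn):
    """Hypothetical analogue of mysql_wait: block until the reply arrives."""
    return conn.recv(1024)


def run_demo():
    server = Server(("127.0.0.1", 0), SlowHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conns = [socket.create_connection(server.server_address) for _ in range(2)]

    start = time.monotonic()
    query_start(conns[0], b"q0")  # both queries are now in flight
    query_start(conns[1], b"q1")
    results = [wait(conn) for conn in conns]
    elapsed = time.monotonic() - start

    for conn in conns:
        conn.close()
    server.shutdown()
    server.server_close()
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(results, "in %.2f seconds" % elapsed)
```

Because both requests are sent before either reply is read, the two simulated queries overlap on the server, and the elapsed time should land near one query's duration rather than the sum of both.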

You can of course expand on this by doing some other time-consuming processing before waiting, say image manipulation. You can also issue more than two SQL queries (possibly to different servers) and then collect all the responses at the end. I also plan on adding a query pool so that you can add multiple pending queries to it and wait for the first one that returns. This way you would not need to order your “mysql_wait” calls by how you *think* they will return; you will always get the results in the order in which they finish.
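The query pool idea can be sketched in Python with the standard `selectors` module. This is only an illustration under stated assumptions: `fake_query` is a hypothetical helper that simulates a pending query with a socket pair and a timer thread, and `drain_pool` is an invented name, not the interface the planned library will expose.

```python
import selectors
import socket
import threading
import time


def fake_query(delay, reply):
    """Return a socket whose peer sends `reply` after `delay` seconds.

    Hypothetical stand-in for a query that is pending on its own connection.
    """
    client, server = socket.socketpair()

    def respond():
        time.sleep(delay)
        server.sendall(reply)
        server.close()

    threading.Thread(target=respond, daemon=True).start()
    return client


def drain_pool(pending):
    """Yield (name, reply) pairs in the order the replies arrive."""
    sel = selectors.DefaultSelector()
    for name, conn in pending.items():
        sel.register(conn, selectors.EVENT_READ, data=name)
    remaining = len(pending)
    while remaining:
        # select() blocks until at least one connection has data to read.
        for key, _events in sel.select():
            reply = key.fileobj.recv(1024)
            sel.unregister(key.fileobj)
            key.fileobj.close()
            remaining -= 1
            yield key.data, reply
    sel.close()


if __name__ == "__main__":
    pending = {
        "slow": fake_query(0.3, b"slow done"),
        "fast": fake_query(0.1, b"fast done"),
    }
    for name, reply in drain_pool(pending):
        print(name, reply)
```

The caller registers the queries in any order and still receives the faster reply first, which is the property described above.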

Again, this is still in its infancy, so please let me know if you have any comments or suggestions!

Posted in MySQL

24 Responses to "Asynchronous I/O – How It Could Speed Up Your App"

  1. Hi,
    we already have code for ext/mysqli that does ASYNC queries, but due to lack of time for extensive test writing (QA) we haven’t merged the code into PHP CVS. It’s implemented with mysqlnd, the new driver for PHP. Implementing it with libmysql won’t be any harder after checking out the currently implemented ext/mysqli + mysqlnd code.
    The sources are in our public svn :
    http://svn.mysql.com/svnpublic/php-mysqlnd/trunk/

    Look in php5/ext/mysqli/mysqli_nonapi.c for mysqli_poll().
    mysqli_query() takes a third parameter – how to store the data. Currently it’s either MYSQLI_USE_RESULT or MYSQLI_STORE_RESULT. In the patched sources the value is a bitmap -> MYSQLI_USE_RESULT | MYSQLI_ASYNC, or MYSQLI_STORE_RESULT | MYSQLI_ASYNC

    We have a small test file, which will show you how to use the ASYNC queries :
    http://svn.mysql.com/svnpublic/php-mysqlnd/trunk/tests/ext/mysqli/mysqli_async_query.phpt

  2. Interesting. Are you also working with the MySQLND people at all ?

  3. Matic says:

    I can see how this could be useful. But, in a highly concurrent web environment (busy web site), wouldn’t this decrease the performance overall, since the SQL servers would have more queries to process at once?

  4. Ulf Wendel says:

    mysqlnd does support asynchronous queries in PHP but we have not tested and thus not promoted this feature yet

  5. Ulf Wendel says:

    Please replace PHP ext/mysql examples by ext/mysqli examples! ext/mysql does not support all features of MySQL 4.1+ and should be deprecated ASAP (which is unfortunately not possible due to the user base).

  6. I wrote something somewhat related in a post about using the built-in event scheduler to perform parallel work: http://blog.shlomoid.com/2008/04/using-mysql-event-scheduler-to-emulate.html

    It’s the same idea, but a completely different implementation concept.

  7. Olaf van der Spek says:

    Will you use background threads to do the actual IO? Or will ‘polling’ be used?

  8. Eric Day says:

    Andrey: Thank you for the pointers! I will certainly check out that code. I want to start with the C lib (what I need it for the most), but I’ll see how I can work what you have into there. Thanks!

    Mathieu: Nope, as you can see from the first comment. :)

    Matic: Not necessarily. If your webserver is taking enough traffic to handle concurrent queries, your MySQL server is also already serving concurrent queries. This is just pushing the queries for a single hit into a smaller time frame, not increasing the number of queries in any way. A particular hit has more overlap with its own queries, but less overlap with queries from other hits.

    Ulf: I’ll start working with ext/mysqli instead, I wasn’t aware it was deprecated (I’m not a big PHP guy). Thanks!

    Shlomo: Interesting! I could see building generic interfaces to use the event scheduler to do the same thing, but I wonder how it would scale once you start doing hundreds or thousands of queries per second. Something to test I suppose.

    Olaf: It will be polling. The libmysql already uses poll(), so it’s just a matter of tweaking the interface to return when it would normally block (or block only when you want to).
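    The "return when it would normally block" behaviour can be sketched in Python on a plain socket. This is not libmysql code: `try_read` and `wait_readable` are hypothetical helper names, a socket pair stands in for a server connection, and `select.poll` assumes a POSIX system.

```python
import select
import socket


def try_read(conn):
    """Return available data, or None if reading would block right now."""
    try:
        return conn.recv(4096)
    except BlockingIOError:
        return None


def wait_readable(conn, timeout_ms):
    """Block for up to timeout_ms until conn has data; return True if it does."""
    poller = select.poll()
    poller.register(conn, select.POLLIN)
    return bool(poller.poll(timeout_ms))


if __name__ == "__main__":
    client, server = socket.socketpair()  # stand-in for a server connection
    client.setblocking(False)

    print(try_read(client))        # nothing sent yet, so this does not block
    server.sendall(b"reply")
    if wait_readable(client, 1000):
        print(try_read(client))    # the reply is now available
    client.close()
    server.close()
```

    The caller decides whether to poll and carry on with other work (`try_read`) or to block until the reply is there (`wait_readable`), which mirrors the "block only when you want to" option.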

  9. Bill says:

    The asynchronous call should return with the id (the same number returned by a SHOW PROCESSLIST command), which would allow you to kill the query if the user got tired of waiting for a long-running report query.

  10. Ulf Wendel says:

    Eric: ext/mysql is not deprecated, but it’s the oldest API and the least preferable in my eyes. I’d love to get its usage figures down and call it deprecated, but that is future music and up to the community to decide. All PHP MySQL drivers accept and live from community contributions. If the users want ext/mysql – bad luck for me….

    Olaf: we tried background threads in PHP/mysqlnd. For smaller queries the background thread was simply too fast. Fetching was done before PHP could even ask for the result. Therefore I’m a little in favor of not using extra threads, if possible. Extra threads add extra overhead to the implementation. It seems not worth it for PHP. poll() or anything related seems to be the better option.

    When discussing background/async stuff in the web area, recall that nowadays there is AJAX and similar. This might help working around some of your problems without implementing anything new. Our earlier questionnaires showed that background/async is seen as a rather exotic and advanced feature in the PHP community. I think MySQL should support it but I don’t expect it to gain mainstream usage soon.

    Ulf

  11. “I sent out the first contact e-mail to MySQL a week ago, but have not received any response yet. Anyone at MySQL listening? :)”

    Well, Andrey and Ulf are listening ;)

    The community team is tied up this week in SF at Community One. I am sure they will get back to you as soon as they have some time to respond in a way that is appropriate for your type of contribution. Please don’t be impatient – we value all contributions. Shoot me an email if things get stuck anyway.

    Roland Bouman
    roland at mysql dot com.

    http://rpbouman.blogspot.com/

  12. Eric Day says:

    Bill: Good idea, I’ll try to find an efficient way to incorporate that, possibly as an added option.

    Ulf: Thanks for the clarification, that makes more sense. As I said, my primary interest is with the C API, but most people would find the scripting language interfaces more useful. I’ll focus on ext/mysqli when I get to that point.

    Roland: Thanks for the update, that makes sense that the community team is tied up at Community One. :) I didn’t mean to give the impression I was being impatient, just curious when I might hear something back.

  13. Ulf Wendel says:

    Eric,

    in this particular case the “technicians” to talk to first at MySQL are probably Andrey and me for PHP as well as Georg and JimW for all C-ish languages.

    The connectors team is doing some brainstorming these days about which features *could* be added to the assorted client libraries in the future. Guess what is on the brainstorming list. I don’t say we *will* implement the feature – all I say is that it’s on a brainstorming list. In my eyes it would be *fantastic* to have someone from the community work on this. However, it’s not up to me to decide anything. Like Roland I can offer help finding the right people to talk to and point them to you in case of unexpected delays.

    Ulf

  14. Olaf : mysqlnd at the moment can do background fetching of results, the only extension that does use threads, although the code currently works only on systems that have POSIX threads and forces a ZTS build, even with script per process servers.
    Unfortunately, the background fetching isn’t any faster; we have even seen a slowdown of 1-2%. The reason is that the background thread was doing only IO work and PHP is very fast in executing code while IO is being done.
    Every query goes through several phases
    - sending the query to the server
    - the server processes the query, which might take some time, the client waits for data
    - reading the status of the query. The first byte is the number of the columns in the result set : 0-UPSERT query, > 0 – ResultSet, 0xFF – Error. This is the reason that in a SELECT there could be up to 254 columns.
    - if it’s a result set, the fields meta data is being read which is ended with EOF packet. For every field there is a packet with metadata.
    - Then comes the data of the result set, if there is any.
    - And again an EOF packet

    For small result sets, most of the client’s time is spent waiting while the server processes the query and sends the data. Thus, background fetching has no advantage, because proportionally it takes only a small part of the timeframe. ASYNC queries, OTOH, send the query and don’t block waiting for the result. Thus we don’t know right away whether the query was successful, but we can check the status at a later point.
    Now, one can think of mixing BACKGROUND and ASYNC and see whether this helps – storing the data in the background even before the user polls for a result. mysqlnd can’t do that at the moment and unfortunately I don’t have the time to play with it, as I am working on another connector ATM.
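    The status byte from the list of phases above can be sketched as a small Python classifier. It covers only the three cases named in this comment; the real wire protocol has further cases that this sketch ignores, and the name `classify_response` and its return values are made up for illustration.

```python
def classify_response(first_byte):
    """Classify a query response by its first byte, per the comment above.

    Simplified: only the three cases described in the comment are handled.
    """
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must fit in one byte")
    if first_byte == 0x00:
        return ("no_result_set", 0)   # e.g. an INSERT or UPDATE succeeded
    if first_byte == 0xFF:
        return ("error", None)        # an error packet follows
    return ("result_set", first_byte)  # value is the number of columns


if __name__ == "__main__":
    for byte in (0x00, 0x03, 0xFF):
        print(hex(byte), classify_response(byte))
```

    An ASYNC client would run a check like this only once the connection reports that data is ready to read, which is the "check the status at a later point" step.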

  15. Evert says:

    Small detail, I think the ASIO setting should not be set while connecting to MySQL, but when executing a query.. Makes a lot more sense there

  16. Eric Day says:

    Thanks Ulf and Andrey for the extra info. I’ll be in touch soon once I have some patches to share.

    Evert: You actually want to have the ASIO setting during connect because you may be talking to a MySQL server over a connection with a noticeable RTT. This allows you to return from connect immediately (after sending SYN packet), possibly do something else, and then poll until the socket is ready for writing (ACK received). If you have nothing to do until the connection is established and ready for queries, you could just call the wait function immediately. There are still some other details and options to be worked out, but this is a feature I know I’ll need.
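    A non-blocking connect of this kind can be sketched in Python on a plain TCP socket. The names `connect_start` and `connect_wait` are hypothetical, not the planned API, and the error handling assumes POSIX behaviour, where a pending connect reports EINPROGRESS and a finished one makes the socket writable.

```python
import errno
import os
import select
import socket


def connect_start(ip, port):
    """Begin a TCP connect and return immediately with the pending socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex((ip, port))  # pass an IP: name lookups would block
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


def connect_wait(sock, timeout=5.0):
    """Block until the pending connect finishes, then report success or failure."""
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        raise TimeoutError("connect did not complete in time")
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        raise OSError(err, os.strerror(err))
    return sock


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # local stand-in for a remote server
    listener.listen(1)

    pending = connect_start(*listener.getsockname())
    # ...other work could happen here while the handshake is in flight...
    conn = connect_wait(pending)
    print("connected to", conn.getpeername())
    conn.close()
    listener.close()
```

    If there is nothing else to do in the meantime, calling `connect_wait` straight after `connect_start` behaves like an ordinary blocking connect.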

  17. Matthew says:

    I would really like to see something like this.

    What would be ideal is a C API like Postgres’ asynchronous query API, where you can keep polling for results. This means that it can be used to write non-blocking extension code for a scripting language interpreter like Ruby 1.8 which doesn’t support native threads.

  18. roger says:

    There has been some work recently for ruby asynchronous mysql:
    http://betterlogic.com/roger/?p=339

    still in beta, but hey :)

    Having a C api like postgres’ would be…even nicer :) [esp. if you could wrap the sockets in your own event loops]

  19. [...] also a real need for an asynchronous C API for MySQL which Ruby library authors could use. This project appears to have been trying – looking forward to [...]

  20. Matthew says:

    Roger,

    That link isn’t working – would love a peek at the beta though!

  21. roger says:

    Has there been any work on this? Is there a site to visit? Seems that it might be useful, seeing as there’s a PHP asynch out there, and also a perl asynch, and a ruby asynch…shared code would be nice [like the client proposed here].
    Thanks!
    -=R

  22. [...] I’m now back on track with where I left off. I’ve been making good progress on the asynchronous MySQL library I talked about in this post in the form of a new drizzle client library. This library is also compatible with MySQL since they [...]
