Tuesday, September 17, 2013

PHP Stream Notifications ❤ MongoDB

PHP Streams have several pretty nifty features that most people don't really know about: Filters, Wrappers, Contexts, and Notifications. Documenting these is a bit difficult, and getting users to discover them is even more problematic, as these things usually live outside of the normal path (the function reference).
Maybe I'll blog about these things in the future, but for now I want to talk about the Stream (context) Notifications, or more specifically, Stream Notifications in the MongoDB extension for PHP.


The Stream Notifications are essentially pretty simple: when a stream does something, it notifies something (a callback function) that it did something (notification codes, importance, message code, ..).
Since this feature is a little neglected and people don't seem to know about it, this "something happened" isn't really a lot of things. When PHP reads from the http stream it will tell you if it came across a "redirect header", "mime-type", and, when available, the content size. Simple things.
This is actually pretty powerful and allows you to do some work even while file_get_contents() is fetching a large file, so the call no longer has to be a silent, blocking operation.
An example of a simple, yet pretty cool, script would be a wget/curl style download application with a progress bar (we actually use that example to download pear things when you install PHP and don't have wget or fetch).
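A minimal sketch of such a download script could look like the following. It is not the script PHP ships for downloading PEAR, just an illustration: render_progress_bar() and download_notify() are made-up names, while the STREAM_NOTIFY_* constants, the callback signature, and stream_context_set_params() are the ones documented for stream_notification_callback. It assumes the CLI (for STDERR and $argv) and a server that sends a Content-Length header.

```php
<?php
// Hypothetical helper: build a fixed-width text progress bar.
function render_progress_bar($transferred, $total, $width = 10)
{
    if ($total <= 0) {
        // Size unknown (no Content-Length): report only the byte count.
        return sprintf("%d bytes", $transferred);
    }
    $transferred = min($transferred, $total);
    $filled  = (int) floor($transferred * $width / $total);
    $percent = (int) floor($transferred * 100 / $total);
    return sprintf(
        "[%s%s] %3d%%",
        str_repeat("#", $filled),
        str_repeat("-", $width - $filled),
        $percent
    );
}

// Hypothetical callback name; the parameter list is the documented one.
function download_notify($code, $severity, $message, $message_code, $bytes_transferred, $bytes_max)
{
    switch ($code) {
        case STREAM_NOTIFY_REDIRECTED:
            fwrite(STDERR, "Redirected to: $message\n");
            break;
        case STREAM_NOTIFY_MIME_TYPE_IS:
            fwrite(STDERR, "MIME type: $message\n");
            break;
        case STREAM_NOTIFY_FILE_SIZE_IS:
            fwrite(STDERR, "Size: $bytes_max bytes\n");
            break;
        case STREAM_NOTIFY_PROGRESS:
            // "\r" redraws the bar on the same terminal line.
            fwrite(STDERR, "\r" . render_progress_bar($bytes_transferred, $bytes_max));
            break;
    }
}

// Attach the callback to a stream context.
$context = stream_context_create();
stream_context_set_params($context, array("notification" => "download_notify"));

// Only touch the network when a URL is given on the command line.
if (PHP_SAPI === "cli" && isset($argv[1])) {
    $data = file_get_contents($argv[1], false, $context);
    fwrite(STDERR, "\n");
    if ($data !== false) {
        file_put_contents(basename(parse_url($argv[1], PHP_URL_PATH)) ?: "download.out", $data);
    }
}
```

Saved as, say, download.php, it would be run as `php download.php <some-http-url>`. Which notifications actually fire depends on the stream wrapper and on what the server sends.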

Enter MongoDB for PHP.
As of pecl/mongo 1.4.0 we have switched out our homegrown sockets library in favor of the native PHP streams. This gives us several advantages, such as greater portability, automatic support for SSL, and memory reporting that is more in line with reality, as everything is now allocated using the PHP allocation functions.

Now, since we are using the PHP streams, a pretty cool thing happens: we can play around with the notification API and Stream Contexts.
A highly experimental (undocumented) API was included in the 1.4.0 release, and now that we are prepping for 1.5.0, I've spent more time on it, adding things that are "useless, until you play with it". Such as progress reporting for read/write.. Wait what?

Yeah. OMG LOL For realz.
I created a little example, called spindle.php, which queries the Enron dataset and prints out a progress bar for all socket reads and writes, along with meta information like the actual command and query being sent, and even some information about the actual wire protocol operation, in case you are interested. Writing the queries and reading the data is too fast on my laptop, so I had to add usleep() to the code to see the progress bar spinning.. but it's still fun.

I can't really think of a practical use case for reporting socket read/write progress updates.. But it's there. Hopefully someone can come up with some cool things using it :)


Note: The Pull Request hasn't been merged yet, and the API is still considered experimental, so you'll have to use my improved-notify-support branch for now if you want to play with it.


Progress reporting obviously isn't the only thing the new Notification API does. I'll blog about the more practical notifications later :)

-Hannes

5 comments:

  1. Great work and cool idea!

  2. I could see a use case where you are doing huge bulk operations and want to notify gearman, or whatever worker you are using, about the progress

    1. Interesting, that is a pretty good fit!
      Importing data from external systems in large batches, which quite frequently takes hours and hours, is a good candidate for things like this :)

  3. Your "improved-notify-branch" is dead. I've got a situation where I am trying to get status on a php backgrounded process. It's only running for about 30-60 seconds. Polling the pipe hangs the main thread and I thought 'stream_notification_callback' would be the answer but apparently it's not working in the unix wrapper(?).

  4. The improved-notify-branch is merged, so all the new things are part of 1.5.0 (which is in RC2 now).

    As for your "background process" that you are polling via the "unix wrapper".. is this MongoDB related?


    Vanilla PHP has some notification events, which I documented on http://php.net/stream_notification_callback, but not all stream wrappers support all events.
