Tuesday, June 19, 2007

Comet and Server push

There is something I'm checking out in my spare time: a technology that has been around for a while, called "comet". There are other approaches to server push, but I prefer to look first into the one that seems a bit more standardized and thought out. I've read threads in newsgroups stating that comet is overkill in many situations.

Server push isn't actually "push" in the sense that the server initiates the connection. It's more like a delayed client pull, with features on the server that keep the waiting connections from draining too many resources.

In this model, the client connects to the server and waits for information. Browsers have timeouts, and sometimes servers do too, so the client reconnects every x seconds if no data has been sent over the link. It also reconnects on error. If the connection fails x times, the client stops reconnecting and displays an error about the service being unavailable.
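The reconnect behaviour described above can be sketched roughly as follows. This is only a minimal illustration in Python, not how any particular browser or comet library does it. The names `long_poll`, `fetch`, `handle`, `max_errors` and `ServiceUnavailable` are all made up for the example, and I'm assuming that `fetch` returns `None` when the wait times out without data and raises `ConnectionError` when the link fails.

```python
import time
from typing import Callable, Optional


class ServiceUnavailable(Exception):
    """Raised when the client gives up after too many errors in a row."""


def long_poll(fetch: Callable[[], Optional[str]],
              handle: Callable[[str], None],
              max_errors: int = 3,
              retry_delay: float = 0.0,
              max_polls: int = 100) -> int:
    """Poll the server repeatedly and return the number of messages handled.

    fetch() is assumed to block until data arrives or the wait times out.
    max_polls only exists so that this sketch terminates.
    """
    errors = 0
    handled = 0
    for _ in range(max_polls):
        try:
            message = fetch()
        except ConnectionError:
            errors += 1
            if errors >= max_errors:
                # Stop reconnecting and report the service as unavailable.
                raise ServiceUnavailable("service unavailable")
            time.sleep(retry_delay)
            continue
        errors = 0  # a successful round trip resets the error counter
        if message is not None:
            handle(message)
            handled += 1
        # On a timeout (None) the loop simply reconnects.
    return handled
```

Resetting the counter after a successful round trip means that only consecutive failures count towards giving up. That is my reading of "errors x times", and other policies are possible.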

Comet has in the meantime been implemented in Tomcat as well as Jetty. Jetty has documented in more detail how it implements server-side processing, and it has published some statistics about expected resource usage and server load.

The number of open connections could become a problem for a server, but the first thing to run out is processing threads.

The comet model basically uses asynchronous I/O processing with thread pools. It's a way to multiplex your client connections, so that each one gets time from one of the threads in the pool only when it needs it. This of course requires the session and client information to be available to whichever thread picks up the connection. A thread gets assigned to a connection when there is data to be read, or when an internal server event (coming through the application layer, actually) writes data to a client channel.
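To make the multiplexing idea a bit more concrete, here is a small Python sketch. It is not how Tomcat or Jetty implement it, and it leaves out the actual socket I/O. `Connection` and `EventServer` are hypothetical names for this example. The point is only that many parked connections share a handful of pool threads, and a thread is borrowed just for the moment an event has to be written.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List, Optional


class Connection:
    """Hypothetical stand-in for one parked client connection."""

    def __init__(self, client_id: int) -> None:
        self.client_id = client_id
        self.received: List[str] = []
        self.served_by: Optional[str] = None

    def write(self, data: str) -> None:
        # A real server would write to the client's channel here.
        self.received.append(data)
        self.served_by = threading.current_thread().name


class EventServer:
    """Keeps idle connections in a table; they hold no thread while waiting."""

    def __init__(self, workers: int = 4) -> None:
        self.pool = ThreadPoolExecutor(max_workers=workers,
                                       thread_name_prefix="worker")
        self.connections: Dict[int, Connection] = {}

    def park(self, conn: Connection) -> None:
        self.connections[conn.client_id] = conn

    def publish(self, client_id: int, data: str) -> Future:
        # Only now does the connection get a slice of a pool thread.
        conn = self.connections[client_id]
        return self.pool.submit(conn.write, data)

    def shutdown(self) -> None:
        self.pool.shutdown(wait=True)
```

With, say, 100 parked connections and 4 workers, every client still gets its event, but never more than 4 threads are involved.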

Using this model, the browser will also have to implement some kind of event switch: it receives the data and does something with it, such as updating a chat window or showing a popup. In general, a browser only opens two simultaneous connections to the same server, which is the limit the HTTP/1.1 specification recommends. This can be a bit limiting for data transfers, since the comet connection permanently occupies one of them. Image loads and asynchronous data loads then take longer, because they are serialized over the remaining connection. One way around this is to use virtual hosting techniques to send the comet connection to some kind of HTTP event server, and the other connections to other web servers that process data requests. A potential problem here is security: JavaScript applications are not always allowed to connect to a server other than the one the script came from.
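The event switch itself is little more than a lookup table from event type to handler. In a browser this would be written in JavaScript; the sketch below is in Python only to show the shape of it. The JSON message format with `type` and `payload` fields is my own assumption, and `EventSwitch` is a hypothetical name.

```python
import json
from typing import Any, Callable, Dict


class EventSwitch:
    """Routes incoming comet messages to a handler chosen by event type."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[Any], None]] = {}

    def on(self, event_type: str, handler: Callable[[Any], None]) -> None:
        self.handlers[event_type] = handler

    def dispatch(self, raw: str) -> bool:
        """Decode one message; return False if no handler is registered."""
        event = json.loads(raw)
        handler = self.handlers.get(event.get("type"))
        if handler is None:
            return False
        handler(event.get("payload"))
        return True
```

A chat window would register a handler for a "chat" type, a popup for a "notice" type, and so on. Unknown types are ignored rather than treated as errors.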

Well, the usual applications arise from comet technology: chat and so on. But there should be other possibilities as well if we get the network security right (and issues like NAT and so on!). Consider connections from browser to browser without an intervening server. The immediate services are basically event-driven client applications that react to server events. You might connect to a certain site, and whatever happens in the system, you get notified of it. This is a very interesting feature for many sites, since the user is no longer 100% responsible for pulling all that information himself.

Moreover, from a database access point of view, perhaps persistence frameworks can finally become more consistent. If the server knows that you are watching some piece of information and possibly editing it, then any other user who edits that information before you do should cause an event to be sent to your browser, notifying you that there is new, unseen data to consider. The browser might even retrieve the new data and show it alongside yours. It should not be too difficult to get this done. Caching mechanisms, for example, can help in detecting which objects are being viewed by whom, and when objects in the cache get refreshed through an edit.
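The bookkeeping for this could look something like the following Python sketch. It is just the idea on paper, not taken from any persistence framework. `WatchRegistry` and its methods are hypothetical, and the `notify` callback stands in for whatever pushes the event down a user's comet connection.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set


class WatchRegistry:
    """Tracks which users are viewing which objects."""

    def __init__(self, notify: Callable[[str, str], None]) -> None:
        self._watchers: DefaultDict[str, Set[str]] = defaultdict(set)
        self._notify = notify  # assumed to push an event to one user

    def watch(self, user: str, object_id: str) -> None:
        self._watchers[object_id].add(user)

    def unwatch(self, user: str, object_id: str) -> None:
        self._watchers[object_id].discard(user)

    def edited(self, editor: str, object_id: str) -> List[str]:
        """Notify everyone watching the object except the editor."""
        stale = sorted(self._watchers.get(object_id, set()) - {editor})
        for user in stale:
            self._notify(user, object_id)
        return stale
```

The cache layer would call `edited` whenever an object is refreshed through an edit, and the page would call `watch` and `unwatch` as the user opens and closes a view.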
