Saturday, October 16, 2010

ZeroMQ

I'm evaluating a couple of products for a larger project at work. At some point the project will need some kind of distributed storage, a need that may well be filled by the "ceph" project. Although it is experimental, a good number of its features already work. For this evaluation, I've also looked at other work that has been done in this area, most notably what Google has published (GFS, Chubby, Percolator and BigTable). Some of the concepts described there are implemented in Ceph as well, most specifically the Paxos algorithm for reliable replication.

Distributed storage is great for anything you'd like to query and that can tolerate becoming available some 3-4 seconds after it was produced. It is not a very good basis for broadcasting/multicasting/unicasting data as fast as possible after it is produced, because clients end up polling the file system for changes. For efficiency's sake and your own sanity, it is then better to use some kind of message queue. There are two main architectures here: "bus" and "broker". The bus is basically a local subnet broadcast, where everything attached to the bus has to parse each message and discard it if there is no interest. Brokers can be configured to receive messages from producers and route them through some broker network, reducing the number of times a message is copied and re-transmitted over any single link. Buses are useful only if you want all servers, or the majority of them, to see the data. Brokers are useful if you have geographically dispersed locations, want full control over how data is retransmitted, etc... Some brokers can also act more like "hubs", if you follow the "pubsubhubbub" project.
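To make that distinction concrete, jumping ahead to 0MQ (which I get into below): the choice between bus- and broker-style distribution can be as small as the endpoint string a subscriber connects to. A minimal sketch against the 0MQ 2.x C API; the interface name, multicast group and broker host are made-up placeholders, and the epgm transport assumes the library was built with PGM support:

    #include <zmq.h>

    int main(void)
    {
        void *ctx = zmq_init(1);                    /* one I/O thread */
        void *sub = zmq_socket(ctx, ZMQ_SUB);
        zmq_setsockopt(sub, ZMQ_SUBSCRIBE, "", 0);  /* no filter: receive everything */

        /* bus-style: multicast on the local subnet; everything attached
           to the "bus" sees the traffic and discards what it doesn't want */
        zmq_connect(sub, "epgm://eth0;239.192.1.1:5557");

        /* broker-style alternative: all traffic flows through a forwarder
           on a well-known host, crossing each WAN link only once */
        /* zmq_connect(sub, "tcp://broker.example.com:5557"); */

        /* ... receive loop would go here ... */
        zmq_close(sub);
        zmq_term(ctx);
        return 0;
    }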

Sidenote: if you can avoid 'realtime' and MQs altogether and work through some file system, I'd recommend it. The concepts are simpler, it's easier to deal with faults, and should you need any group locks, the file system, even a distributed one like ceph, will be able to provide them for you. An MQ puts the onus on the client for data caching, storing, reordering and all that. So file-based applications are always easier to write (at least from the client perspective :).

Second Life published a quite complete evaluation of message queues. Note what is said there about "AMQP". I believe that standards should evolve from things that work, not the other way around. For more information, check out:

http://bhavin.directi.com/selecting-a-message-queue-amqp-or-zeromq/

http://www.imatix.com/articles:whats-wrong-with-amqp

I've downloaded 0MQ and decided to give it a spin. The first thing I did was visit the IRC channel on a Saturday, and there were 50 people in there; usually a very good sign that a project is popular. There's a good chance that 0MQ will be getting some serious attention soon. It may not be the absolute top performer, but it may be the best choice if you're looking for flexible solutions, prototyping and experimentation. And if you're not running a super-duper high-performance site that needs to squeeze all the juice out of your CPUs and NICs, ZeroMQ will do just fine for you, no problems whatsoever.

To set expectations right: it's more of a library for internetworking your applications using multicast/tcp/ipc transports, but that's already a big gain if a library does that for you. This is typically a layer where many developers think they need something special, but it ends up being the same requirement as everybody else's. Relying on a 3rd-party lib that's widely used, and thus well tested, is a very good decision. The only thing you end up writing is the application- or service-specific implementation and the message marshalling; the library does the rest.
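As an impression of how little code that leaves for you: here is a minimal request/reply client against the 0MQ 2.x C API. It's a sketch; it assumes some REP server is already bound on port 5555, and the port number and payload are arbitrary:

    #include <stdio.h>
    #include <string.h>
    #include <zmq.h>

    int main(void)
    {
        void *ctx = zmq_init(1);                   /* one I/O thread */
        void *req = zmq_socket(ctx, ZMQ_REQ);
        zmq_connect(req, "tcp://localhost:5555");  /* assumes a REP server there */

        zmq_msg_t msg;
        zmq_msg_init_size(&msg, 5);
        memcpy(zmq_msg_data(&msg), "hello", 5);
        zmq_send(req, &msg, 0);                    /* 2.x send signature */
        zmq_msg_close(&msg);

        zmq_msg_init(&msg);
        zmq_recv(req, &msg, 0);                    /* blocks until the reply arrives */
        printf("got %d bytes back\n", (int) zmq_msg_size(&msg));
        zmq_msg_close(&msg);

        zmq_close(req);
        zmq_term(ctx);
        return 0;
    }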

For anyone doing enterprise messaging, 0MQ won't give you much yet. There are no tools to replay messages, retrieve them from error logs and what have you, nor much in the way of development tools. You may need something like OpenAMQ or RabbitMQ instead. To be honest though, if you develop everything in house, you do not necessarily need anything more complicated than 0MQ.

Many services nowadays use XML/XSD to specify message formats. My main gripe with XML is that for too many people it's about dealing with format instead of function. The issues with namespaces and all these other things make XML a really big beast to handle. Read up on _message matching_ to get some ideas where this goes wrong. Brokers delayed by all that parsing and matching eventually slow down your entire business: they become unreliable, delay transactions that need to go fast, or negatively impact other processes on the same machine, requiring you to set up 2 brokers and doubling the probability that something goes wrong... etc. Better to get rid of it, move on, separate 'meaning' from 'transmission' and verify things later. There is no need to encapsulate each message with the entire ontology every time a message is sent. I have a couple of interesting ideas in this respect that I'll share later, perhaps as a research publication.
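To illustrate what separating 'meaning' from 'transmission' could look like (my own sketch, not an established scheme): put a compact, fixed wire format on the queue and keep the ontology out-of-band, validating against it at the edges rather than inside the broker. The struct and its fields are hypothetical:

    #include <stdint.h>

    /* hypothetical compact wire format: the broker only moves bytes.
       What msg_type 7 *means* is agreed out-of-band in a shared schema,
       and validation happens after transmission, off the critical path. */
    typedef struct {
        uint32_t msg_type;    /* message kind, per the shared schema */
        uint32_t source_id;   /* producer identifier */
        int64_t  timestamp;   /* microseconds since epoch */
        double   value;       /* the actual payload */
    } wire_msg_t;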

Update: 0MQ isn't without 'issues', but as far as I can determine so far there are no serious bugs in the code like memory leaks or reconnect problems. I found two issues so far:
  • Messages are not throttled at the server side. If you look at the weather server example, which sends a lot of messages as fast as possible, under the hood these messages get piled up in a list for the I/O subsystem to transfer asynchronously. If the I/O subsystem simply cannot keep up, you'll deplete your memory very quickly. The solution is to set a high water mark and/or a swap, or to use other methods to throttle message throughput (see the first sketch after this list).
  • zmq_poll works, but because it assumes you're listening on multiple sockets, the idea is to visit each socket as fast as possible to collect any new messages that may have arrived. You get one message per call and some kind of interleaving between sockets (if they're both really fast). Although this sounds nice, the underlying code assumes you don't want any kind of "sleep" to give CPU time back to other processes. So when you start 10 clients as in the examples implementing zmq_poll, no client yields and they all try to get 100% CPU time. If your client app can deal with it, the workaround is to embed a simple sleep of a few ms; I recommend sleeping only when the last cycle didn't actually read any messages (no event was set), as in the second sketch below.
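A sketch of the first workaround, using the 0MQ 2.x socket options; the limits are arbitrary examples:

    #include <stdint.h>
    #include <zmq.h>

    int main(void)
    {
        void *ctx = zmq_init(1);
        void *pub = zmq_socket(ctx, ZMQ_PUB);

        /* cap the number of messages queued in memory... */
        uint64_t hwm = 1000;
        zmq_setsockopt(pub, ZMQ_HWM, &hwm, sizeof hwm);

        /* ...and optionally spill up to 25 MB to disk instead of
           letting the outbound list grow without bound */
        int64_t swap = 25 * 1024 * 1024;
        zmq_setsockopt(pub, ZMQ_SWAP, &swap, sizeof swap);

        zmq_bind(pub, "tcp://*:5556");
        /* ... publish loop ... */
        zmq_close(pub);
        zmq_term(ctx);
        return 0;
    }

And the second workaround: only sleep when a poll cycle saw no events, so a busy client keeps its throughput while an idle one yields the CPU. A sketch, assuming two already-created sockets:

    #include <unistd.h>
    #include <zmq.h>

    /* called after the sockets have been set up elsewhere */
    void poll_loop(void *sock_a, void *sock_b)
    {
        while (1) {
            zmq_pollitem_t items[] = {
                { sock_a, 0, ZMQ_POLLIN, 0 },
                { sock_b, 0, ZMQ_POLLIN, 0 }
            };
            int rc = zmq_poll(items, 2, 0);  /* non-blocking poll */
            if (rc == 0) {
                usleep(10 * 1000);           /* nothing read: yield ~10 ms */
                continue;
            }
            /* ... zmq_recv from whichever items[i].revents has ZMQ_POLLIN ... */
        }
    }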

Sunday, October 10, 2010

Ion channels of excitable membranes

I've just finished reading a book about ion channels in excitable membranes. Neurons are among the cells with excitable membranes, hence the interest. The kinetics of these types of cells are really interesting to try to understand, since the more complicated, integral parts of sensory processing (whatever that turns out to mean) are performed by such cells.

The different types of channels, and subtypes thereof, have evolved over time by responding to particular messages, or have refined their function to respond better to what is happening in the environment. As this blog is not academic, I can voice some gut feelings / hypotheses here without bothering too much about scientific grounding. Most of the time, people philosophize over the function of the brain using analogies with existing machinery, comparing it to a computer and so on, but it usually becomes really difficult to trace those analogies down to actual cell function.

I think computer scientists made a bit of a mistake some 60 years ago by making sweeping generalizations over the function of a neuron. The attempt was made to map the function of neurons to something that could run on computer hardware. It started with a model where neurons, because they emit a potential (a pulse?!), could be compared to logic building blocks which emit either a 0 or a 1, just like electronics: the "1" is basically the 5V you'd normally put on a component to communicate a signal.

At some point the "binary" neuron made way for the sigmoid neuron (the 'processing kernel' was simply replaced), and this led to networks that were slightly more capable of performing some kinds of functions. At the moment a lot of research is going into Spiking Neural Nets (SNN), but there's still no ground-breaking success with those networks.

I think the reason is that classical nets didn't pose any constraints on the timing of events, because they were mostly used for situations in which each state was information-complete and where it was possible to derive the entire output from the combination of inputs (or some approximation of the output; as long as it worked, it was fine). Spiking nets do require specific timing constraints, and there's not enough knowledge yet about how such neurons should behave over both the signal domain and time to respond appropriately.

The refinement of using a sigmoid (or tanh) kernel instead of a binary one is already an important step. The binary kernel changed so abruptly that it was only useful in very specific situations; sigmoid kernels are more useful because they generate proper outputs over a larger dynamic range. But the possibilities opened up by replacing the kernel haven't been fully exploited yet.
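A quick sketch of the difference, in C just to illustrate (the function names are mine):

    #include <math.h>

    /* the old all-or-nothing kernel: abrupt at the threshold */
    static double step_kernel(double x)
    {
        return x >= 0.0 ? 1.0 : 0.0;
    }

    /* the sigmoid kernel: a graded output over a larger dynamic range */
    static double sigmoid_kernel(double x)
    {
        return 1.0 / (1.0 + exp(-x));
    }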

The ion channels of membranes come in roughly three types:
- excitatory (generating the potential)
- regulatory / rectifying (attempting to establish equilibrium)
- modulatory (acting over a longer time period, affecting the excitability of the cell)

Classical nets are generally built with some assumed kernel in place, with every kernel having the same configuration and parameters. The modulatory effects are not modeled in any way, and the calibration of the network focuses almost entirely on calculating the weights, which are assumed to represent the efficiency of neurotransmitter transfer in the synapse.

The book demonstrates that the fast excitatory and regulatory actions can themselves be modulated (this is like saying that the sigmoid function changes its shape), and that over time, or over multiple "calculations", this shape can also be modified on the basis of the kernel's own output in previous steps, through some kind of accumulation function.
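A toy sketch of that idea, in the same vein as the kernels above. Everything here (the leaky trace, the gain update, the constants) is my own made-up illustration of an accumulation function, not a model from the book:

    #include <math.h>

    typedef struct {
        double trace;   /* leaky accumulation of recent outputs */
        double gain;    /* current steepness of the sigmoid kernel */
    } neuron_t;

    /* each firing reshapes the kernel used for later firings:
       sustained high output lowers the gain (reduced excitability) */
    static double fire(neuron_t *n, double input)
    {
        double out = 1.0 / (1.0 + exp(-n->gain * input));
        n->trace = 0.9 * n->trace + 0.1 * out;   /* accumulate output history */
        n->gain  = 2.0 / (1.0 + n->trace);       /* modulate the kernel's shape */
        return out;
    }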

Classical nets are very useful as an alternative, mathematical model for very complicated problems, but their abilities might be enhanced by looking at biological models in more detail. Neurons do not always function the same way; you need to look at their behavior over time to see the longer-term effects of repetitive firing and the way other inputs (neurotransmitters) may influence their behavior, increasing or decreasing their excitability (or even suspending it).

However, I am convinced that networks that need to perform real-time tasks, or any task where time is explicitly a concern, can only be built with networks where this type of modulation and excitation is explicitly built in. The classical nets were never built from this perspective, though they remain interesting for classification or analysis of historical data. Whoever builds the controllers (robotics?) of tomorrow should attempt to understand in more depth the dynamics and kinetics of the ion channels in neurons, and the ability of neurons to time things.