There are some disadvantages to the use of a DHT:
- It doesn't provide absolute guarantees on data consistency and integrity (though it doesn't necessarily make a total mess of things either; see Amazon's paper on Dynamo).
- It's not very useful for "group" queries, range queries or other kinds of data lookups.
- It has no good native support for events or triggers.
- There is no central authority in the network, so nodes have to cooperate with each other whenever decisions need to be made.
- Data lookup is O(log n) and may take two or more seconds, depending on where the data actually resides, how many nodes there are, and the individual latencies between them (see the back-of-the-envelope estimate after this list).
On the other hand, there are clear advantages:
- It's highly resilient to nodes joining and leaving and to other big changes in the network; most implementations handle node churn very well.
- Data is distributed automatically, according to specific configuration options. How the data eventually ends up organized depends on the chosen design strategy; for example, the way nodes are organized together (in a ring, for instance) is very important.
- Data is replicated across nodes, so it's difficult to lose it.
- The removal of any single node doesn't meaningfully impact the network as a whole.
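To put rough numbers on that lookup cost: with Kademlia- or Chord-style routing, each lookup takes on the order of log2(n) hops. A back-of-the-envelope sketch (the network size and per-hop latency below are assumptions, not measurements):

```python
import math

nodes = 1_000_000  # assumed network size
hop_latency = 0.1  # assumed average seconds per hop between nodes

hops = math.ceil(math.log2(nodes))  # ~20 hops for a million nodes
print(f"~{hops} hops x {hop_latency}s = ~{hops * hop_latency:.1f}s per lookup")
```

With a million nodes and 100 ms per hop, that comes out to roughly two seconds per lookup, which matches the figure above.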
Because keys allow multiple values, you can add yourself as a peer sharing the file as well, so that others can negotiate with you for the file parts you may have that they don't. In this context, the DHT is used as a very effective mechanism for finding peers to communicate with.
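A minimal sketch of that pattern. The put/get interface and all names here are hypothetical; a toy single-process dictionary stands in for a real DHT implementation:

```python
import hashlib

# Toy single-process stand-in for a DHT mapping keys to sets of values.
store = {}
def put(key, value): store.setdefault(key, set()).add(value)
def get(key): return store.get(key, set())

def file_key(name):
    # Keys are hashed, so a lookup needs the exact key, never a fragment.
    return hashlib.sha1(name.encode()).hexdigest()

key = file_key("ubuntu.iso")

# Announce ourselves as a peer holding (parts of) the file...
put(key, ("203.0.113.5", 6881))

# ...then look up the same key to find other peers to negotiate with.
print(get(key))
```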
There are other uses of a DHT once you start building some logic on top of this simple interface. You could create a virtual file system, for example. The key "/" can be loaded with a set of values, which are the 'subdirectories' that are valid under /. You just keep descending until there are no more entries. To get a file directly, you look up its full key, e.g. "/work/myfiles/important-document", to get information about the location of that file.
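Here's a sketch of that layout, again with a toy dictionary standing in for the DHT; the dir:/file: tagging convention is made up for illustration:

```python
# Toy stand-in for the multi-value DHT interface, as before.
store = {}
def put(key, value): store.setdefault(key, set()).add(value)
def get(key): return store.get(key, set())

# Each directory key holds tagged entries; a file key holds its location.
put("/", "dir:work")
put("/work", "dir:myfiles")
put("/work/myfiles", "file:important-document")
put("/work/myfiles/important-document", "node7:/data/blk42")

def walk(path="/"):
    """Recursively list everything reachable from a directory key."""
    for entry in sorted(get(path)):
        kind, name = entry.split(":", 1)
        child = path.rstrip("/") + "/" + name
        print(child)
        if kind == "dir":
            walk(child)

walk()
# With the full key in hand, a file resolves directly to its location:
print(get("/work/myfiles/important-document"))
```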
Projects like Hadoop replicate large files as blocks of X MB across a huge cluster; a DHT layer underneath such a system could store the locations of each replicated block, letting nodes manage their own files. That could remove the need for the authoritative, centralized HDFS directory server and make the network as a whole more robust and resilient (the central HDFS server is a single point of failure).
It's also possible to index files in two ways: the /x/y/z way, meant for finding information, and the /nodeX/files way, which lists the files located on a single machine. If each node maintains that information itself, the rest of the network can react to a node crash by simply looking up what the node was holding and acting on it.
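A sketch of that dual-indexing scheme, with hypothetical node names and the same toy stand-in store:

```python
# Toy stand-in for the multi-value DHT interface again.
store = {}
def put(key, value): store.setdefault(key, set()).add(value)
def get(key): return store.get(key, set())

def register(node, logical_path):
    put(logical_path, node)                    # the /x/y/z way: path -> holders
    put("/" + node + "/files", logical_path)   # the /nodeX/files way: node -> paths

register("node7", "/work/myfiles/important-document")
register("node7", "/work/myfiles/notes")
register("node3", "/work/myfiles/important-document")  # replicated copy

# When node7 crashes, any node can look up what it was holding and
# re-replicate anything that dropped below the desired replica count:
for path in get("/node7/files"):
    holders = get(path) - {"node7"}
    print(path, "still held by", holders or "nobody: re-replicate!")
```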
The above method for storing information doesn't deviate much from Prefix Hash Trees. A complication in a DHT is that you cannot access data by part of a key (a search), because keys are hashed and unreadable; you can only find data when you have the key in its entirety. You could create a database in this network to act as a kind of directory, but that reintroduces a single point of failure. The schemes above make things slightly more accessible. This is also why DHTs are not the right solution for every problem: they work well when you access lots of data by the same primary key. Facebook uses this approach for showing entire user profiles, Slashdot for storing and retrieving rendered comments. The only thing you need next is a method for expiring entries.
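One simple way to handle expiry is to stamp each entry with a time-to-live and drop stale entries lazily on read. This is a single-node illustration of the idea, not how any particular DHT implements it:

```python
import time

store = {}

def put(key, value, ttl=3600):
    # Stamp every entry with an absolute expiry time.
    store[key] = (value, time.time() + ttl)

def get(key):
    if key in store:
        value, expires = store[key]
        if time.time() <= expires:
            return value
        del store[key]  # expired: drop it lazily on read
    return None

put("session:42", {"user": "alice"}, ttl=1)
print(get("session:42"))  # {'user': 'alice'}
time.sleep(1.5)
print(get("session:42"))  # None: the entry has expired
```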
DHTs are generally kept in memory, making lookups really fast. Most of the time this is acceptable, but sometimes you want a bit more persistence, with individual nodes storing their pieces of the network's data durably. The biggest problems arise when a node goes offline for a longer period and then reconnects: the data it stores may have partly expired, and you don't want those stale bits flowing back into the network. It's also possible that updates elsewhere have produced newer versions of the information than the node's old entries contain. Amazon's Dynamo has more or less solved this with "vector clocks"; at some point, you need reconciliation.
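A minimal vector clock sketch, loosely following the idea described in the Dynamo paper: each node increments its own counter when it writes, and on read you can tell whether one version descends from another or whether the two conflict:

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter bumped."""
    clock = dict(clock)
    clock[node] = clock.get(node, 0) + 1
    return clock

def descends(a, b):
    """True if version a is at least as new as b on every node."""
    return all(a.get(node, 0) >= n for node, n in b.items())

v1 = increment({}, "nodeA")   # {'nodeA': 1}
v2 = increment(v1, "nodeA")   # {'nodeA': 2}, supersedes v1
v3 = increment(v1, "nodeB")   # {'nodeA': 1, 'nodeB': 1}, diverged from v2

print(descends(v2, v1))                       # True: v2 can replace v1
print(descends(v2, v3) or descends(v3, v2))   # False: conflict
```

When neither version descends from the other, the versions have genuinely diverged, and reconciliation has to happen somewhere; Dynamo pushes that decision up to the application.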
You could store virtually anything in this network, but of course you need to know what something is when you retrieve it. If you're only dealing with non-binary data, JSON is an efficient way to store it. I doubt you'd need Protocol Buffers, as they're designed for streaming access to large amounts of data, though you could still use the project's compiler to store data in that format. That at least covers the storage side, so that later you can build apps on top of the DHT in Python, C or Java.
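For example, values could be wrapped in a small JSON envelope with a type tag, so a reader knows what it has retrieved; the envelope format here is invented purely for illustration:

```python
import json

def encode(kind, payload):
    # Tag every stored value with its type so readers can interpret it.
    return json.dumps({"type": kind, "data": payload})

def decode(raw):
    envelope = json.loads(raw)
    return envelope["type"], envelope["data"]

raw = encode("peer", {"host": "203.0.113.5", "port": 6881})
kind, data = decode(raw)
print(kind, data)  # peer {'host': '203.0.113.5', 'port': 6881}
```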
So what is so nice about a DHT? The design allows you to work on top of a distributed system, if designed right of course, without worrying about fault tolerance inside the application. Basically, once you are connected, you just read and write to your heart's content and it should work.