A modern communication protocol must be secure, and to do it right, security must be integrated into the design from the very start. Here is a short list of security requirements for DITP:
- authenticate peers
- support exchanged data authentication and encryption
- provide access control on accessible services, objects and methods
- support single and multi-signed information of any kind
- signed information supporting polymorphism and aggregates
- allow anyone to verify any signature with minimal knowledge
Multi-signed information is information signed by more than one person (e.g. a contract).
With a stream oriented encoding, all this implies that we must be able to apply a hash function (e.g. SHA) to the transmitted data while it is being encoded or decoded.
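To illustrate hashing data while it is being encoded, here is a minimal Python sketch. The `HashingWriter` class is a hypothetical name invented for this example, and SHA-256 is only one possible choice of hash function; nothing here is the actual DITP code.

```python
import hashlib
import io


class HashingWriter:
    """Forwards bytes to a sink and hashes them as they pass through."""

    def __init__(self, sink):
        self._sink = sink
        self._hash = hashlib.sha256()  # assumption: any SHA variant would do

    def write(self, data):
        self._hash.update(data)        # incremental update, no full copy kept
        return self._sink.write(data)

    def hexdigest(self):
        return self._hash.hexdigest()


sink = io.BytesIO()
writer = HashingWriter(sink)
for chunk in (b"first chunk, ", b"second chunk"):
    writer.write(chunk)

# The incremental digest equals the digest of the whole message.
whole_digest = hashlib.sha256(sink.getvalue()).hexdigest()
```

The point is that the hash is updated chunk by chunk, so the signature can be computed without ever holding the complete encoded message.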
This is what I am currently implementing. Unfortunately, a server crash monopolized all my time this week. Murphy's law takes its revenge...
IDR is the data encoding used with DITP. It is to DITP what XML is to SOAP. IDR uses bleams, which combine the benefits of block and stream encoding:
- no need to specify total size in front of it
- no upper size limit
- may be encapsulated without depth limit
- no constraint on data, and no reliance on markers or tags
- no need to parse and search payload data to locate the end of a bleam
A bleam is encoded as a sequence of byte blocks of at most 16KB. Each block is preceded by an unsigned short value whose 14 least significant bits encode the number of bytes of data that follow. Its most significant bit is set to one if the block is not the first, and its second most significant bit is set to one if the block is not the last.
When encapsulating a bleam, its sequence of blocks is simply inserted into the sequence of blocks of the encapsulating bleam. The encapsulated bleam can be stored in the data of a block of the encapsulating bleam if it fits entirely in one of its blocks.
The maximum payload size is 16382, so that the biggest block, header included, is 2^14 bytes long. The invalid size value 16383 (0x3FFF) is then used as a signal. A signal block has no payload data.
The signal is used in IDR to inform the receiver that the expected sequence of data is interrupted because an exception or an error occurred. If the signal block is flagged as the end of the bleam, the interruption is anonymous. Otherwise, subsequent bleam data provides information on the reason for the interruption. In IDR it is the serialized exception object and the objects it may reference. Such an interruption will eventually propagate to the enclosing bleams and be encoded as anonymous interruptions.
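The block header and signal rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the byte order of the unsigned short is not specified here, so big-endian is assumed, and all names (`block_header`, `encode_bleam`, `decode_bleam`, `BleamInterrupted`) are hypothetical. The decoder handles a single, non-nested bleam, and the encoder takes the whole payload at once for brevity, whereas a streaming encoder would emit blocks as they fill up.

```python
import struct

MAX_PAYLOAD = 16382   # 2 header bytes + 16382 data bytes = 2**14
SIGNAL = 0x3FFF       # reserved size value: signal block, no payload
SIZE_MASK = 0x3FFF    # the 14 least significant bits
NOT_FIRST = 0x8000    # most significant bit
NOT_LAST = 0x4000     # second most significant bit


class BleamInterrupted(Exception):
    """Raised by the decoder when it meets a signal block."""

    def __init__(self, anonymous):
        super().__init__("bleam interrupted by a signal block")
        self.anonymous = anonymous


def block_header(size, first, last):
    """Pack the 16-bit header (big-endian is an assumption)."""
    header = size
    if not first:
        header |= NOT_FIRST
    if not last:
        header |= NOT_LAST
    return struct.pack(">H", header)


def encode_bleam(payload):
    """Split payload into blocks of at most MAX_PAYLOAD bytes."""
    chunks = [payload[i:i + MAX_PAYLOAD]
              for i in range(0, len(payload), MAX_PAYLOAD)] or [b""]
    out = bytearray()
    for index, chunk in enumerate(chunks):
        out += block_header(len(chunk), index == 0, index == len(chunks) - 1)
        out += chunk
    return bytes(out)


def decode_bleam(data):
    """Reassemble the payload of a single, non-nested bleam."""
    payload = bytearray()
    pos = 0
    while True:
        (header,) = struct.unpack_from(">H", data, pos)
        pos += 2
        size = header & SIZE_MASK
        last = not header & NOT_LAST
        if size == SIGNAL:
            # Flagged as last block: anonymous interruption.
            raise BleamInterrupted(anonymous=last)
        payload += data[pos:pos + size]
        pos += size
        if last:
            return bytes(payload)


encoded = encode_bleam(bytes(40000))   # split into three blocks
interrupted = (block_header(3, True, False) + b"abc"
               + block_header(SIGNAL, False, True))
```

Note that the decoder never searches the payload: it jumps from header to header, which is what makes locating the end of a bleam cheap.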
Encoding and decoding bleams requires some care, but the effort is worth it because of the multiple benefits. With small messages, DITP performs as well as common inter-object communication protocols. With bigger messages, DITP benefits from the stream oriented encoding through reduced latency and memory usage. DITP can thus be used to send huge files, streamed films, etc.
Most inter-object communication protocols encode a message into blocks with the block size encoded in front. This requires that the message be fully encoded, in order to compute the block size, before it is sent.
A stream oriented protocol doesn't need this. The communication process can be pipelined. As shown in the figure below, this reduces the communication latency. Note that in a two way transaction the saving is doubled.
In the absence of the block size, one needs a marker to signal the end of the block. For instance, SOAP uses XML tags as markers. There are two drawbacks with such an encoding: the marker can't show up in regular data, and the marker must be searched for to locate the end of a block. When the block size is available, locating the end of the block is trivial and very fast.
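The two drawbacks of markers can be seen in a small Python sketch. The marker value and the function names are made up for the example, and the 4-byte big-endian size prefix is just one possible layout, not the one used by any particular protocol.

```python
import struct

MARKER = b"</msg>"  # hypothetical end-of-block marker


def frame_with_marker(payload):
    return payload + MARKER


def read_marker_frame(stream):
    end = stream.find(MARKER)  # the data must be scanned to find the end
    if end < 0:
        raise ValueError("marker not found")
    return stream[:end]


def frame_with_size(payload):
    return struct.pack(">I", len(payload)) + payload


def read_sized_frame(stream):
    (size,) = struct.unpack_from(">I", stream, 0)
    return stream[4:4 + size]  # the end is known without looking at the data


tricky = b"text containing </msg> inside"
from_sized = read_sized_frame(frame_with_size(tricky))
from_marker = read_marker_frame(frame_with_marker(tricky))
```

With the size prefix the payload comes back intact, while the marker based reader stops at the marker embedded in the data and returns a truncated payload, unless the data is escaped first.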
IDR and DITP combine the benefits of block and stream encoding by introducing a new encoding called BLEAM. Here is a short list of bleam properties:
- bleams have no size limits
- bleams may be encapsulated with unlimited depth
- bleams impose no constraint on contained data
- no need to search and parse contained data to locate the end of a bleam
The last property is what makes the difference with SOAP and its XML encoding.
See next blog note for a description of the bleam encoding.
DIS finally has its logo. It was clear at the beginning of the design process that it should put forward the worldwide networking feature of DIS, which is key to its working principle. I looked at various networking grids but couldn't find anything original and explicit enough.
It is fortuitous that I saw an instance of the UVG120 grid, which was first described in the article "The Planetary grid: a New Synthesis" written by William Becker and Dr. Bethe Hagens in 1984. I adopted this grid for the logo with the written permission of Dr. Bethe Hagens.
I find this grid beautiful because of its apparent randomness and its subliminal regularity, resulting from combining the vertices of a dodecahedron and an icosahedron mapped onto a sphere. The fact that these volumes integrate the divine proportion may contribute to the impression of beauty.
The logo was designed by the graphic designer Johan VINET. He runs the company grafxtory but is better known as lordyoyo, with his blog offering copious tutorials and enlightening information on graphic design. He has a creative yet professional approach, and a pleasant, benevolent patience that I challenged with my demands and attention to detail.
Progress is slow because I have a daily job and family duties.
I wasted a few days on the garbage collector that doesn't work when code is compiled in release mode. The problem is due to the garbage collector and an initialization failure. Since it only shows up in release mode, it is hard to debug.
I am in the process of extending the IDR API to support pre-encoded IDR data insertion and extraction. This will be used to implement a remote IDR data storage service, event dispatching, relays or broadcasting or simply to optimize generation of frequently emitted requests or responses. The need to support exceptions in this process requires some attention.
... because it reduces communication latency and memory requirement.
Communication latency:
In a message oriented protocol the sender has to fully encode the message before it can be sent, and the receiver must receive the whole message before he can start decoding and processing it. The receiver has thus to wait for the whole message to be generated, sent and received before doing anything.
With a stream oriented protocol the communication process is pipelined. The sender encodes and sends a first chunk of the message allowing the receiver to start decoding and processing it, while the sender encodes and sends the next chunk of the message. The sender and the receiver then work in parallel and not in sequence.
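The interleaving of sender and receiver can be illustrated with Python generators. This is only a sketch with hypothetical names: generators interleave the two sides within a single thread rather than running them truly in parallel, but the order of events shows that the receiver starts working before the sender has finished encoding.

```python
def sender(message, chunk_size, log):
    """Encode and send the message one chunk at a time."""
    for start in range(0, len(message), chunk_size):
        chunk = message[start:start + chunk_size]
        log.append(f"sent {len(chunk)}")
        yield chunk


def receiver(chunks, log):
    """Decode and process each chunk as soon as it arrives."""
    total = 0
    for chunk in chunks:
        log.append(f"processed {len(chunk)}")
        total += len(chunk)
    return total


log = []
total = receiver(sender(b"x" * 10, 4, log), log)
```

The log alternates between "sent" and "processed" entries. With a message oriented protocol, all the "sent" entries would come before the first "processed" one.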
Manually splitting the data is possible with a message oriented protocol, but at the price of a significant increase in the complexity of user code, and such splitting can become very tricky when dealing with object aggregates.
Memory requirement:
A message oriented protocol requires that the sender and the receiver hold an encoded copy of the full message. This puts the memory management system under stress because these memory blocks have a short life span, are of varying size and are sometimes big.
With a stream oriented protocol, the memory requirement is limited to the storage of the encoded message chunk, and this even if the message size is huge. The buffer holding a message chunk can also be easily recycled because it is of a fixed size.
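The fixed memory footprint can be sketched in Python with a single buffer that is recycled for every chunk. The function name `stream_copy` and the 16KB chunk size are assumptions made for the example.

```python
import io

CHUNK_SIZE = 16 * 1024  # assumption: one fixed-size chunk buffer


def stream_copy(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink using one reusable buffer; return bytes copied."""
    buffer = bytearray(chunk_size)   # allocated once, recycled for each chunk
    view = memoryview(buffer)
    total = 0
    while True:
        count = source.readinto(buffer)
        if not count:
            return total
        sink.write(view[:count])
        total += count


source = io.BytesIO(bytes(100_000))
sink = io.BytesIO()
copied = stream_copy(source, sink)
```

Whatever the message size, the only working memory the copy loop needs is the one chunk buffer.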
DITP has thus been designed to be a stream oriented protocol, while most inter-object communication protocols are classically message oriented (e.g. CORBA, RMI, ICE). SOAP is the only inter-object communication protocol I know of that is stream oriented. It thus benefits from the reduced latency and small memory requirement, but these benefits are spoilt by the ASCII and XML encoding.
Things have evolved a little bit since the last time I checked the D programming language web site. An updated PDF version of the D language specification is available and the poseidon IDE is gaining momentum.
There are some other nice features I forgot to mention in the previous note. Here is a short list:
- static assert and static if, evaluated at compile time
- mixins of code and templates
- contract programming (pre and post conditions)
- version and debug keywords for conditional compilation
- scopes
There are probably more gems in there, but these are a good start to grasp the added value of D compared to other programming languages. Note that D is not an academic research product. It has been created by compiler writers who capitalized on their experience in designing a new and better language.
The DIS prototype is currently developed in C++ with Microsoft's Visual Studio C++ IDE. This is because it is a very reliable IDE and has the best debugger I know, and at this stage of development debugging plays an important role.
I could have used Java, with Eclipse as IDE, which is as good, and even better in some respects; an impressive masterpiece from IBM. But I first need a library of compiled code to be used on servers. I also have the impression that it is easier to translate C++ code into Java than the opposite. C# is on my list, but I need a language that is OS independent and totally free to use. As far as I know, this is currently not the case.
So, the D programming language is very close to the top of my list. This language has very attractive features and I foresee a brilliant future for it. But, like Java in its very early days, D is lacking a good IDE with accurate debugging support. The other weakness is its documentation, but this shouldn't last long.
It has all the features one would expect from a modern programming language and I have just learned about the scope keyword and its purpose. This is great. Read this article for a clear and detailed explanation on its purpose and usage.
Another interesting feature is DWT, its port of the SWT library. It makes it easy to implement a portable GUI application. You can also very easily use your pet C library with D. The reverse should be possible, but it is not clear yet how to do this. It would be interesting for people developing libraries in D who want to keep them usable from C or C++. Something I would need.
When I was younger I always thought Murphy's law was a kind of joke. It sounds so fatalistic. Intuitively I believed there was some truth in it, but I couldn't point it out. Here is its most popular, concise and general formulation.
"Everything that can go wrong will go wrong"
I recently (re)discovered its mathematical justification and thought it might be useful to present it here. Hold your socks, here it goes.
Let p be the probability that the "wrong thing" happens in one event, with p greater than 0. The probability that it doesn't happen is then 1-p. The probability that it never happens in n independent events is then (1-p)^n. Since 1-p is smaller than 1, this probability tends toward 0 when n gets big.
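A few lines of Python make the effect of n tangible. The function name and the sample values of p and n are arbitrary choices for the illustration, and the events are assumed to be independent.

```python
def at_least_once(p, n):
    """Probability that an event of probability p occurs at least once in n tries."""
    return 1.0 - (1.0 - p) ** n


# A 1% chance per event, given more and more opportunities.
for n in (1, 10, 100, 1000):
    print(f"n = {n:4d}: {at_least_once(0.01, n):.4f}")
```

With p = 0.01, the probability of at least one occurrence is roughly 63% after 100 events and above 99.99% after 1000 events.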
So the correct and complete law formulation should be
"Any possible outcome will happen at least once for sure, provided that it is given enough opportunities to occur".
So Murphy's law is not complete and general enough on two aspects:
- it applies for good events as well as bad events;
- it depends on the number of opportunities to occur.
Ok, my formulation is less fun than Murphy's law, and it is probably because of its provocative formulation and partial correctness that Murphy's law attracts so much attention. But just to be sure you didn't miss the point: when you consider the probability of occurrence of something, good or bad, take into account the number of opportunities it has to occur.
That's true for startup success as well as for car or software crashes or anything else. People often focus only on p and forget about n. The YCombinator funding model is based on making n increments cheap. And this is as good for entrepreneurs as for investors.
With most modern programming languages there is no such question because a garbage collector is built in. This is not the case with C++, and since I develop the first prototype in this language, I had to answer it. Do I really need a garbage collector?
IDR needs a garbage collector because it supports object aggregate encoding and should impose minimal constraints on the aggregates: cycles allowed, minimal difference with local objects, user implemented classes, etc.
Another reason results from the reliance on exception handling. This is the price to pay for using the streaming encoding model. If an exception is generated on the encoder side, it has to be propagated to the decoder. And manual memory management with exceptions can become tricky.
This is why I went into the effort of adding garbage collector support to C++. The good news is that it is planned to be added in the next version of the C++ standard. So the effort to implement IDR in C++ with a temporary solution is not a waste of time.