Distributed Information System (DIS)

IDR encoding compared to Go language encoding

5/12/2012

0 Comments

 
As the author of the IDR encoding (yet unpublished), I was very curious to see how it compares to the data encoding proposed with the Go language designed by the Google team (gobs of data).

There are two fundamental differences between the two.

Value encoding

Gobs encodes a value as a tag byte followed by a compact byte encoding of the value. The tag identifies the type of the value and its encoded byte length. The byte encoding drops the trailing zero bytes of the value.

IDR uses the most common internal data representation of computers as its encoding, and thus requires no marshaling work.
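To make the contrast concrete, here is a minimal C++ sketch of the two strategies. The compact encoder is a simplified gob-like scheme, not the exact gob wire format, and a little-endian host is assumed.

    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Gob-like compact value: a tag/length byte followed by only the
    // significant bytes of the value (trailing zero bytes dropped).
    // Simplified sketch, not the exact gob wire format.
    void encodeCompact(std::vector<uint8_t>& out, uint64_t v) {
        uint8_t b[8];
        std::memcpy(b, &v, 8);              // little-endian host assumed
        int n = 8;
        while (n > 1 && b[n - 1] == 0) --n; // drop trailing zero bytes
        out.push_back(uint8_t(n));          // tag byte: encoded length
        out.insert(out.end(), b, b + n);
    }

    // IDR-like native value: the in-memory representation is the wire
    // representation, so marshaling is a plain copy.
    void encodeNative(std::vector<uint8_t>& out, uint64_t v) {
        const uint8_t* p = reinterpret_cast<const uint8_t*>(&v);
        out.insert(out.end(), p, p + sizeof v);
    }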

Advantages

Gobs has two major benefits. The first is that the type of the data is provided with the value, which allows anyone to decode the values of a message without prior knowledge of its content. The second is that the data can be split into blocks anywhere, since decoding proceeds byte after byte.

IDR has the advantage of fast and trivial marshaling as in RPC and IIOP.

Disadvantages

The price to pay with Gobs is the additional tag byte and the marshaling work. With IDR, it is the code complexity needed to preserve the atomicity of base values when a data stream must be split, and the absence of base value type information with the data.

Type encoding

Gobs provides the maximum type information with the message so that it is self-describing. This makes the encoding more complex, since conciseness competes with expressiveness.

RPC, IIOP and ICE rely on the context to determine the type of encoded data. Since these encodings mainly target communication, this optimization makes sense to some extent.

IDR precedes every message with a type reference. The type reference is a key into a distributed database, similar to the DNS, from which a description of the data contained in the message may be obtained. One can obtain either a concise form, for efficient parsing by a program, or a detailed expressive form with comments, for use by humans.

The IDR data type description strategy seems the most efficient because the data type description is written only once. But decoupling the type description from the data exposes one to the risk of losing access to the data description if it gets deleted.

Conclusion

There are some good and bad points on both sides and there is no easy way to merge the good points into a new optimal encoding.

My experience is that the IDR encoding, while simple and efficient in some respects, was quite complex to develop.

Today I still favor IDR's choice because of the marshaling efficiency. Olivier Pisano managed to translate the C++ IDR library into the D language in a very short time. So maybe it is just the design and validation of IDR that took so much time.

I like the smart encoding of base values in Go very much, but not the choice to force all floating point values to be encoded as double precision floats (64-bit). I hope they'll change that.

There are other differences between IDR and Gob that have not been detailed here. What they have in common is that both may use their encoding to support persistence; IDR may do so with its distributed database.




The 8 fallacies of distributed computing

10/17/2009

4 Comments

 
The following two paragraphs are the introductory paragraphs of the document Fallacies of distributed computing (pdf) by Arnon Rotem-Gal-Oz that presents the 8 fallacies of distributed computing.

"Distributed systems already exist for a long tThe software industry has been writing distributed systems for several decades. Two examples include The US Department of Defense ARPANET (which eventually evolved into the Internet) which was established back in 1969 and the SWIFT protocol (used for money transfers) was also established in the same time frame [Britton2001].

Nevertheless, in 1994, Peter Deutsch, a Sun fellow at the time, drafted 7 assumptions architects and designers of distributed systems are likely to make, which prove wrong in the long run - resulting in all sorts of troubles and pains for the solution and architects who made the assumptions. In 1997 James Gosling added another such fallacy [JDJ2004]. The assumptions are now collectively known as "The 8 fallacies of distributed computing" [Gosling]:
  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn't change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous
..."

While in the process of designing a new distributed information system, it is a good idea to check how it positions itself with regard to these 8 fallacies.

The network is reliable

DIS uses TCP, which was designed to be reliable and robust. Reliable means that data is transmitted uncorrupted to the other end, and robust means that it may resist a certain amount of errors. There is however a limit to the robustness of a TCP connection, and in some conditions connecting to a remote service may not even be possible.

DITP, the communication protocol of DIS, is of course designed to handle connection failures. Higher level and distributed services will have to take them into account too.

Making a distributed information system robust implies anticipating connection failures at any stage of the communication. For instance, a flock of servers designed to synchronize with each other may suddenly be partitioned into two or more unconnected flocks by a network failure, and be connected back together later.

Latency is zero

Latency was a major focus in the design of the DITP protocol because DIS is intended to be used for Wide Area Network (WAN) applications. DITP reduces the impact of latency by supporting asynchronous requests. These requests are batched and processed sequentially by the server in the order of emission. If a request in the batch is aborted by an exception, the subsequent requests of the batch are ignored. This provides a fundamental building block for transactional applications.
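As an illustration, here is a minimal C++ sketch of this batching behavior. The Request model and the processBatch function are invented for this example; the real DITP batches encoded method invocations, not callables.

    #include <cstddef>
    #include <functional>
    #include <iostream>
    #include <stdexcept>
    #include <vector>

    // A request is modeled as a callable that may throw to abort.
    using Request = std::function<void()>;

    // Requests of a batch are processed sequentially in emission order;
    // once one aborts with an exception, the rest of the batch is ignored.
    void processBatch(const std::vector<Request>& batch) {
        for (std::size_t i = 0; i < batch.size(); ++i) {
            try {
                batch[i]();
            } catch (const std::exception& e) {
                std::cout << "request " << i << " aborted: " << e.what()
                          << ", skipping the rest of the batch\n";
                return;
            }
        }
    }

    int main() {
        processBatch({
            [] { std::cout << "update account\n"; },
            [] { throw std::runtime_error("constraint violation"); },
            [] { std::cout << "never executed\n"; },
        });
    }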

In addition to this, DIS may also support the ability to send code to be executed by a remote service. This provides the same functionality as JavaScript code embedded in web pages and executed by browsers, which makes it possible to implement powerful and impressive web 2.0 applications.

With DIS, remote code execution is handled by services that the server manager may choose to make available. The services may then process different types of pseudo-code: JavaScript, Haxe, JVM, Python, ... Many different pseudo-code services may thus coexist and evolve independently of DIS. Such functionality is of course also exposed to security issues; see the secure network fallacy below for an insight into how DIS addresses them.

Bandwidth is infinite

This fallacy is the rationale behind the Information Data Representation (IDR) design. It uses a binary, native data representation. In addition to being very fast and easy to marshal, it is also very compact.

DITP also supports user defined processing of transmitted data, so that compression algorithms may be applied to it. DITP also multiplexes concurrent communication channels in the same connection, which makes it possible to apply different data processing to each channel. By choosing the channel, the user may decide whether or not to compress the transmitted data.

The network is secure

A distributed system designed for world wide usage must obviously take security into account. This means securing the transmitted data by means of authentication and encryption, as well as authenticating the communicating parties and enforcing access or action restriction rules.

Communication security is provided by the DITP protocol by means of the user specified processing of transmitted data. Like data compression, this processing can also handle data authentication and encryption. Different authentication and encryption methods and algorithms can coexist in DIS and may evolve independently of the DITP protocol.

Authentication and access control may use conventional password methods as well as user identification certificates. But instead of using X.509 certificates, DIS uses IDR encoded certificates corresponding to instances of certificate classes. Users may then derive their own certificates with class inheritance. They may extend the information carried in the certificate or combine different certificate types together.
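For illustration, here is a minimal C++ sketch of this certificate class idea. The class and field names are invented for this example and are not the actual DIS certificate classes.

    #include <cstdint>
    #include <string>
    #include <vector>

    // Hypothetical base certificate class, serialized with IDR.
    struct Certificate {
        std::string          subject;
        std::vector<uint8_t> signature;
        virtual ~Certificate() = default;
    };

    // A derived certificate extending the carried information, as
    // class inheritance permits.
    struct MemberCertificate : Certificate {
        std::string domain;      // the club or company granting access
        uint64_t    memberId;    // member number within that domain
        int64_t     expiration;  // validity limit as a time stamp
    };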

Authentication based on password checking or user identity certificate matching doesn't scale well for a world wide distributed system because it needs access to a reference database. With distributed services, accessing a remote database introduces latencies, and replicating it (i.e. caches) weakens its security by multiplying the number of breach points.

The authentication mechanism favored in DIS uses member certificates. These certificates are like club or company member access cards. When trying to access a service, the user presents the corresponding certificate and the service simply needs to check the certificate's validity.

With such an authentication mechanism, the service can be scattered all over the Internet and remain lightweight, as is required for embedded applications (e.g. smart phones, car computers, ...). The authentication domain can handle billions of members as easily as a few. Member certificates may be extended to carry specific information and connection parameters.

Topology doesn't change

The ability to handle network topology changes initiated the conception of DIS in 1992. It is thus designed from the start to address this issue in a simple, robust and efficient way. It is not a coincidence that the DIS acronym resembles that of DNS. DIS is a distributed information system as the DNS is a distributed naming system. DIS uses the proven architecture of the DNS and applies it to generic information, with additional functionality such as remote management of the information. The DNS is known to be a cornerstone of the solution to network topology changes, as DIS will be.

There is one administrator

Like the DNS, DIS supports distributed administration. Information domain administrators have full liberty and authority in the way they organize and manage their information domain, as long as the interface to DIS respects some standard rules. As for the DNS, there will be a central administration that defines the operational rules and controls their application. If DIS becomes a broadly adopted system, the central administration will be composed of democratically elected members and coordinated with the Internet governance administration, if such a structure happens to be created.

Transport cost is zero

The transport cost is indeed not zero, but most of it is distributed and shared by the users. There remains however a residual cost for the central services and administration, for which a revenue has to be identified. The DIS system will make it possible to obtain such a revenue, and there is a rational reason why it ought to.

Imposing a financial cost on some domains or features of DIS that are limited, or artificially limited, resources provides a means to apply perceptible pressure on misbehaving users (e.g. spammers).

The network is homogeneous

DITP is designed to support different types of underlying transport connections. The information published in DIS is treated as an opaque byte block and may be of any type, as may its description language. It may be XML with its DTD description, binary with a C-like description syntax, Python pickles or anything else. Of course it will also contain IDR encoded information with its Information Type Description.

Conclusion

The conclusion is that DIS, DITP and IDR have been designed without falling into any of the common fallacies. This is partly due to the long maturation process of their conception. While this may be considered a shortcoming, it may also be a strength, since it allowed all aspects to be examined wisely over time.

Object deserialization handling

2/7/2009

0 Comments

 

In the last month I rewrote the IDR prototype from scratch and translated the IDR specification document into English. During this process I made a few enhancements to the IDR encoding. I removed an ambiguity in exception decoding in some very unlikely situations. The other change was to integrate the 2008 update of the IEEE 754 specification, which now defines four floating point formats: 2, 4, 8 and 16 bytes. It may take some time until these types reach your desk, but IDR had better stick to the standards. So these will be the floating point encodings supported by IDR.

Besides these, there was a much bigger problem left in the API for object deserialization. The problem is to determine what to do when the decoder doesn't recognize the class type of a serialized object. The solution I came up with is very satisfying since it matches all the requirements I had. It remains to check its usage convenience with real examples.

The problem

Object deserialization is a process in which the decoder reconstructs the serialized object aggregate. To do so it has to reconstruct each object of the aggregate and restore their pointers to each other. Objects are reconstructed by using object factories, a classic design pattern. An object factory is an object that "knows" how to reconstruct some types of objects.
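As a reminder of the pattern, here is a minimal C++ sketch of such a factory collection; the names are invented for this example.

    #include <map>
    #include <memory>
    #include <string>

    struct Object { virtual ~Object() = default; };

    // A factory "knows" how to reconstruct one class of object.
    struct Factory {
        virtual ~Factory() = default;
        virtual std::unique_ptr<Object> create() const = 0;
    };

    // The decoder delegates reconstruction to factories looked up by
    // the encoded class type.
    class FactoryRegistry {
        std::map<std::string, const Factory*> factories;
    public:
        void add(const std::string& type, const Factory* f) {
            factories[type] = f;
        }
        // Returns nullptr when no factory is registered for the type:
        // the problematic case discussed below.
        const Factory* find(const std::string& type) const {
            auto it = factories.find(type);
            return it == factories.end() ? nullptr : it->second;
        }
    };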

The decoder thus has a collection of factories to which it delegates the reconstruction of the different types of objects found in the serialized aggregate. But what happens if the decoder can't find an appropriate factory for some type of serialized object? In some use cases this should be considered an error, but in others it might be an acceptable and even desirable situation.

Consider for instance exceptions. In IDR an exception is an object and is handled as such. It is pointless, and even impossible, for a decoder to have a factory for all possible exceptions in the world. It is enough for the decoder to have a factory for the common exception base classes and, of course, the ones it has to deal with. It should then be enough to reconstruct the object as an instance of the closest parent class it has a factory for, a process called object slicing.

The worst case is when the decoder cannot even slice the object because none of the parent classes are "known" to the decoder. In this case the best the decoder can do is to ignore the object and set all references to it to NULL. We'll call this process object pruning. As with slicing, it may be considered an acceptable and even desirable behavior in some use cases (e.g. optional properties), and an error in others, since the lobotomized data structure may end up too crippled or even invalid.

The problem is thus to define the appropriate behaviour of the decoder when slicing or pruning occurs. In some cases it is an error, in others not, and in some cases it depends on what part of the aggregate the slicing or pruning took place in.

The solution

The decision whether it is an error or not is obviously context specific and has thus to be put in the hands of the user. So the problem boiled down to determining how the user would be able to select the appropriate behavior.

The solution I came up with was to provide three object deserialization methods.

1. A strict object decoder that would throw an exception and abort object decoding as soon as a missing object factory is detected. With this you get an exact reconstruction or a failure.

2. A lax object decoder that would slice and prune at will and return whatever comes out of it, and nothing else. This object decoder would for instance be used for exceptions.

3. Another lax object decoder, like the previous one, but that would also return feedback on the missing object factories. The feedback on slicing would be an associative index mapping sliced object references to the list of their unrecognized class types. The feedback on pruning would be a list of the different types of pruned objects, with the list of unrecognized class types and the number of instances pruned.

The latter method would make it possible and easy for the user to determine whether slicing or pruning occurred, what the missing factories are, and to test for specific objects whether slicing took place and to what extent. Since this method gives an easy way to test whether slicing or pruning took place, the strict object decoder may seem unnecessary. The reason for its presence is that it may stop the decoding process as soon as a missing factory is detected, and thus avoid wasting resources when an exact reconstruction is required and no feedback is needed.
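In C++ the resulting API could look like the following sketch; the signatures are invented and only illustrate the three methods described above, not the final IDR API.

    #include <map>
    #include <string>
    #include <vector>

    struct Object;   // root of deserializable classes
    struct Decoder;  // the IDR input stream

    // 1. Strict decoding: throws and aborts as soon as an object
    //    factory is found missing.
    Object* decodeStrict(Decoder& in);

    // 2. Lax decoding: slices and prunes at will, no feedback.
    Object* decodeLax(Decoder& in);

    // Feedback returned by the third method.
    struct DecodeReport {
        // Sliced object -> list of its unrecognized class types.
        std::map<Object*, std::vector<std::string>> sliced;
        // Unrecognized class type -> number of instances pruned.
        std::map<std::string, unsigned> pruned;
    };

    // 3. Lax decoding with feedback on slicing and pruning.
    Object* decodeLax(Decoder& in, DecodeReport& report);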

I'm very satisfied with this solution because it keeps the API simple with only a small effort on the decoder implementation. What I still need to validate is how convenient it is to use.


DIS development roadmap

11/12/2008

0 Comments

 

The following figure shows the kernel components of the Distributed Information System, the road map and how far I am today. The items in black are implemented and operational, and the items in gray still need to be implemented. Progress is going clockwise :).

OID: An OID is to DIS what the URL is to the web. It is a unique, binary encoded and non-reusable reference to a piece of information published in the distributed information system. It was the first tile I designed and implemented. Its simplicity is inversely proportional to the time and effort required to invent it, because I had to explore and compare many different possible and existing solutions.

IDR: It is to DIS what HTML or XML is to the web. IDR is the Information Data Representation used in DIS. It is a stream oriented encoding with support for object serialization and exceptions. The prototype implementation is currently being fully rewritten. It still misses the ability to specify the encoding version, as well as a formalization of the data description. The latter is required to display data in a human readable format, or to automatically generate data manipulation functions or containers mapped to different programming languages.

DITP: It is to DIS what HTTP is to the web. It is the protocol used to exchange information or invoke remote actions in DIS. It is very simple, modular and extensible through the use of dynamically configurable data processing tasks. Support for compression, authentication or encryption is then provided by a kind of plugin. The protocol uses the object oriented model with remote method invocation. The current prototype does not yet support concurrent asynchronous method invocation.

DIS: DIS stands here for Distributed Information Service, not to be confused with Distributed Information System. It is fundamental to DIS, so the confusion is not really a problem. This service combines the properties of DNS and LDAP and would be a new kind of service on the Internet. I can't disclose more on it because it is still in development. A first prototype has been implemented, unfortunately proving the need to support data description.

SEC: This part covers authentication and access control in DIS. It requires a functional DIS service. An interesting feature is that it is designed to scale up, so that a service could cope with millions of different users without having to keep track of millions of accounts and passwords.

IDX: It is a service simply mapping human readable UTF8 strings to OID references. It is equivalent to the list of named entries in a directory. Like any other service, its access is controlled by ACLs and it can thus be modified remotely with appropriate privileges. An index may be huge, with multiple alternate entry points, exactly like the DNS but exclusively as a flat name space. The OID associated with the UTF8 string is stored in an object, so that polymorphism allows images (icons) and other information to be associated with entries by extension.

DIR: It is a graph of IDX services with one root entry. Services or information published in DIS can then be referenced by a human readable path in the IDX graph relative to the root.



It is an ambitious project but, I am convinced, its added value is worth the effort. I wish I could work full time on this project with the help of some other developers, but this would require funding I don't have access to for now.

An application would help demonstrate the added value of the system. I'm still looking for one with an optimal balance between development effort and success potential.


Time value encoding in DIS

5/26/2008

0 Comments

 

One fundamental question is the encoding of a time value. A time value has two types of use: one as a time stamp and the other as a general time reference.

Requirements

On one hand, a time stamp requires a well defined and controlled precision, while the covered time span can be limited (e.g. +/- 200 years). On the other hand, a general time reference needs to be applicable to a very large time span, with fewer constraints on the precision limit.

Options

For the time reference value one could use a double precision float representation with seconds as units. All arithmetic operations are provided right out of the box and are generally hardwired in the processor. Conversion to calendar time is trivial, since one simply has to extract the integer part of the value and convert it to a time_t value. From there one can use the common calendar time conversion and formatting functions.
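For example, assuming the value counts seconds since the time_t epoch:

    #include <cmath>
    #include <cstdio>
    #include <ctime>

    // Convert a general time reference (seconds as a double) into a
    // calendar date plus a fractional second.
    void printTime(double t) {
        double seconds;
        double frac = std::modf(t, &seconds);  // split integer/fraction
        std::time_t tt = static_cast<std::time_t>(seconds);
        char buf[32];
        std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", std::gmtime(&tt));
        std::printf("%s +%.6f s\n", buf, frac);
    }

    int main() { printTime(1211760000.25); }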

For time stamps, using integers seems preferable. But we still have a choice between a split encoding like the timeval structure, a 64-bit fixed point encoding, or an integer with a very small time unit (e.g. nanoseconds).

Discussion

There is not much to discuss about the absolute time: using a double precision float is an optimal solution. For time stamps however we have three different candidates.

From my experience, I've seen that a split time encoding like the timeval structure is not convenient when dealing with time arithmetic. It is even error prone if the user has to program the operations himself.

I also tried to implement a fixed point time encoding class with the decimal point between bits 29 and 30. But this is tricky to get right and some operations are not trivial to implement correctly, because fractional computation requires normalization and careful handling of rounding errors.

A 64-bit integer using nanoseconds as time unit is apparently the simplest and most straightforward time stamp encoding. Converting to seconds is done with a simple 64-bit integer division, which is also hardwired in most recent processors. Conversion to other time units like microseconds, milliseconds, days or weeks is as accurate and simple. Multiplication or division by decimal scalar values is also trivial.

Another advantage of 64-bit integer nanosecond values is that no special functions are needed for conversions or operations. A programmer can easily figure out what to do using conventional arithmetic operations.
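A small example of this, using nothing but ordinary 64-bit integer arithmetic:

    #include <cstdint>
    #include <cstdio>

    const int64_t NS_PER_SEC  = 1000000000;
    const int64_t NS_PER_MSEC = 1000000;

    int main() {
        int64_t t1 = 1211760000 * NS_PER_SEC;  // a time stamp
        int64_t t2 = t1 + 1500 * NS_PER_MSEC;  // 1.5 seconds later
        int64_t delay = t2 - t1;               // plain subtraction
        std::printf("delay: %lld ms\n", (long long)(delay / NS_PER_MSEC));
    }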

With a 64-bit signed integer with nanosecond units, the covered time span is over a +/- 292 year range. One can thus afford to keep the current time_t January 1970 epoch and push the wrapping limit far away.

Conclusion

In DIS, we'll thus use a double precision float for the general time reference value and a 64-bit integer with nanosecond units for time stamp and delay encoding.

Note: I've seen the use of a double precision float for time encoding in some Windows operating system APIs. I still have to see the use of a 64-bit signed integer with nanosecond units. It would make sense as an upgrade of time_t, which is required since we are getting close to the wrapping limit.

Update: It has been brought to my attention that Java stores time values in a signed 64-bit integer with milliseconds as time units relative to January 1, 1970. The covered time span is thus +/- 290 million years. I'll stay with the nanosecond units for time stamps.


Progress status and cardinality encoding....

5/12/2008

0 Comments

 

It is time for a new communication duty on my project. It's still making steady progress, but not as fast as I would like. I now use the latest version of libgc. I spent most of my time last month searching for the source of a major memory leak. I finally found out that it was caused by the STL containers. I changed the code and now use gc_allocator. Even strings had to be changed. Now the client and server run without any memory leak. I thought of changing language (e.g. to D) but I didn't want to cut myself off from the C++ community as potential users. So I had to sort out the problem, and I finally did.

The client service communication model is now finalized and under test. The system is ready to support transmitted data processing (compression, authentication and encryption).

In this note I'll explain the encoding of a key data type I call cardinality. A cardinality is the number of elements in a sequence. Because of its efficient encoding I extended its use to other types of information. What makes a cardinality different from a classical unsigned integer is that small values are much more frequent than big values. Consider for instance strings: most strings are less than 256 bytes long, but from time to time big strings may show up.

We could thus benefit from an encoding that is compact for small values, even if bigger for big values. I spent some time investigating all the possible encodings I could find, and the most interesting ones are those of BER and ICE.

BER's cardinality encoding

BER stands for Basic Encoding Rules and is used in SNMP and X.509 certificate encoding. In this encoding the cardinality value is chopped into 7-bit chunks stored in as many bytes, in big endian order. The leading bytes with value 0 are dropped. The most significant bit of every byte is set to one, except in the last byte where it is set to 0, signaling the last byte.
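In code, these rules translate into something like this minimal sketch:

    #include <cstdint>
    #include <vector>

    // BER-style encoding: 7-bit chunks, big-endian, leading zero
    // chunks dropped; bit 7 set on every byte except the last.
    void berEncode(std::vector<uint8_t>& out, uint64_t v) {
        uint8_t tmp[10];
        int n = 0;
        do {
            tmp[n++] = v & 0x7F;
            v >>= 7;
        } while (v != 0);
        while (n > 1)
            out.push_back(tmp[--n] | 0x80);   // continuation bytes
        out.push_back(tmp[0]);                // last byte, bit 7 clear
    }

    uint64_t berDecode(const uint8_t*& p) {
        uint64_t v = 0;
        while (*p & 0x80)
            v = (v << 7) | (*p++ & 0x7F);
        return (v << 7) | *p++;
    }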

I implemented a BER encoder and decoder a long time ago and, as attractive as this encoding might be, it is not trivial to implement and it requires bit manipulations on every byte, which makes it suboptimal in terms of performance.

ICE's cardinality encoding

ICE is an inter-object communication protocol which is really worth looking at, and eventually using until DITP is available ;). Check their web site for more information.

ICE was apparently initially inspired by IIOP, which simply doesn't have a cardinality: sequence lengths are encoded as straight 4-byte integers. But ICE encodes class and method names as strings, which are generally short, so the 4-byte encoding overhead was very visible. The designers of ICE's encoding therefore added a small change that significantly increased its efficiency.

If the cardinality value is less than 255, the value is stored in a single byte. Otherwise a byte with the value 255 is written, followed by the 4-byte value encoding the cardinality. Encoding and decoding are trivial and much more efficient than the BER encoding. It is not as compact for values bigger than or equal to 255, but the overhead is insignificant when considering the amount of data that follows (e.g. a string).

IDR's cardinality encoding

IDR extends the idea found in ICE by supporting 8-byte encoding and the intermediate 2-byte integer size. As with ICE, if the value is smaller than 255 the cardinality is encoded as a single byte. If it is less than 65535, it is encoded in a total of three bytes: the first byte holds the value 255 and the cardinality value is stored in the following 2 bytes as a classical unsigned integer. If the value is bigger, it is followed by a 4-byte value, etc. up to the 8-byte integer.

The encoding is as trivial and efficient as that of ICE, if we take care to detect and encode small values first. It is more compact for values up to 65535, and then becomes less efficient because big values have a longer encoding. But as already pointed out, this overhead is insignificant when considering the amount of data that follows.
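Here is a sketch of the encoder following one plausible reading of the escape rule (each size level is escaped by the all-ones value of the previous level); the exact IDR rules may differ. Native little-endian byte order is assumed, as elsewhere in IDR.

    #include <cstdint>
    #include <cstring>
    #include <vector>

    void putBytes(std::vector<uint8_t>& out, const void* p, size_t n) {
        const uint8_t* b = static_cast<const uint8_t*>(p);
        out.insert(out.end(), b, b + n);
    }

    // Small values are detected and encoded first, as noted above.
    void encodeCardinality(std::vector<uint8_t>& out, uint64_t v) {
        if (v < 0xFF) { out.push_back(uint8_t(v)); return; }
        out.push_back(0xFF);                               // escape to 2 bytes
        if (v < 0xFFFF) { uint16_t w = v; putBytes(out, &w, 2); return; }
        uint16_t e16 = 0xFFFF; putBytes(out, &e16, 2);     // escape to 4 bytes
        if (v < 0xFFFFFFFF) { uint32_t w = v; putBytes(out, &w, 4); return; }
        uint32_t e32 = 0xFFFFFFFF; putBytes(out, &e32, 4); // escape to 8 bytes
        putBytes(out, &v, 8);
    }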

Use of cardinality in IDR

The cardinality encoding is so efficient that its use has been extended to other types of information. Here is a short list of them.

- Object references in serialized object aggregates are encoded as cardinalities because a reference is simply the object's index number in the sequence of serialized objects. The value 0 is reserved for the null reference, and the first serialized object is thus identified by the reference value 1. Small reference values are expected to be the most frequent.

- Methods are identified by a number instead of a string, as is commonplace with inter-object communication protocols. The encoding is much more compact and efficient to handle and, on the server side, method call dispatching becomes simple and efficient because the identifier can be used as an index into a method pointer table. Encoding the method identifier as a cardinality was an obvious choice since small values will be the norm.

- A channel can host multiple concurrent client-service connections. These bindings are identified by a unique number, starting at 1 and incrementing for each new binding. We can expect the most frequent binding identifiers to have small values, so the use of cardinality encoding imposed itself.

Progress may be slow, but the benefit is that taking the time to think about and explore the possible solutions for every choice yields a better thought out product.


Progress status

12/18/2007

0 Comments

 

Progress on the design and the prototype implementation is going on. I now have a working prototype of the inter-object communication system. This helps me test and refine the design. I also regularly review and update the specification documents.

On DITP, the current point of focus is to find a good way to manage PDU (Protocol Data Unit) processing such as compression, authentication or enciphering. The user must be able to select and set them up in a snap, while keeping the mechanism as versatile and flexible as possible.

On IDR, the current point of focus is a refinement of signed information encoding. A straightforward implementation is to simply append the signature to the signed information. But this annihilates all the benefits of the stream oriented encoding. Besides, an invalid signature or invalid data must be detected as early as possible. A solution has been identified, but fitting it nicely into the current encoding requires some more investigation.

A design process is a difficult task because we have zillions of decisions to make. The more complex the design, the more decisions there are to make, and the more likely we are to make a mistake somewhere. The two heuristics I use to minimize this risk are first to keep the design as simple as possible, and second to minimize the constraints on usage. The former is popular, the latter much less so.


Data memory aligned or not?

10/2/2007

1 Comment

 

When designing a data encoding, once the decision to use a conventional binary representation is made, the next fundamental decision is whether the data should be memory aligned or not. RPC, CORBA and D-BUS use memory aligned data, while ICE and IDR don't.

Memory aligned data ensures that 2, 4, 8, 10 or 16 byte values are stored at an address that is a multiple of their size. For instance a 2-byte value (short integer) would be stored at addresses 0, 2, 4 or 6. A 4-byte value (long integer) would be stored at addresses 0, 4 or 8, and an 8-byte value at addresses 0, 8, etc.

Some processors (Itanium, RISC) can only handle aligned data, and the programmer has to add support for unaligned data himself. The x86 processor supports unaligned data, but with a performance penalty. A quick benchmark on an x86 compatible processor showed that accessing unaligned data is nearly twice as slow as accessing aligned data.

Memory aligned data requires padding space, which on average can represent about 1/3 of the used memory space. On a modern PC with multiple gigabytes of RAM, this memory overhead is not relevant. But for hand held devices, embedded computers, and long term stored or transmitted data, the memory overhead is much more relevant. A multipurpose encoding should thus care about memory usage as much as about encoding/decoding performance.

It is to be noted that generating aligned data may in some cases require additional computation and may thus have its own overhead. The code to marshal unaligned data is much simpler and more straightforward: one uses a simple pointer incremented by the size of the accessed data type. Getting rid of the alignment constraint also simplifies PDU message encoding and encapsulation.
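A minimal sketch of this pointer based marshaling style:

    #include <cstdint>
    #include <cstring>

    // Marshaling unaligned data: a single cursor advanced by the size
    // of each value. memcpy keeps the access safe on processors that
    // fault on unaligned loads.
    template <typename T>
    void put(uint8_t*& cursor, T value) {
        std::memcpy(cursor, &value, sizeof value);
        cursor += sizeof value;
    }

    template <typename T>
    T get(const uint8_t*& cursor) {
        T value;
        std::memcpy(&value, cursor, sizeof value);
        cursor += sizeof value;
        return value;
    }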

So basically the only drawback of using a serialized and unaligned data encoding is the memory access overhead. But this penalty can be removed by ad hoc hardware or processor instructions. For instance, processors could introduce a pointer variant behaving like an iterator over varying size data. This iterator could take advantage of asynchronous memory prefetching.

RPC, CORBA and D-BUS also benefit from a nearly direct mapping between the encoded and internal data structure representations. This is fine when communicating with programs using this representation. But what about Java, Python or Ruby, which have their own data and object representations? This optimization doesn't hold anymore.

This analysis has led us to the decision to use sequential unaligned data for the IDR encoding. What would be your choice on this matter?


Progress on cryptography

7/20/2007

0 Comments

 

The low level C++ wrapper class for cryptographic functions is now finalized. I use XySSL as the low level C cryptographic library. XySSL is an open source project of Christophe Devine, a French computer scientist specialized in security. XySSL will support the VIA PadLock cryptographic engine, which is good news since VIA servers are cheap, cool and low power computers.

The signing algorithm is parameterized so that one can easily switch to a stronger model if needed. For now we'll use the PKCS#1 v1.5 signature scheme described in RFC 3447, because it is stream friendly. The signature model described in IEEE 1363a adds a salt to the hash value. The salt is some random bytes that are hashed before the information to sign.

The problem with this is that the salt is not available when starting to decode the information. To handle it we would have to put the signature in front of the information, but then it is the signature generation that would not be stream friendly: one would have to first serialize the data into some buffer in order to compute the hash value and encode the signature. This breaks the stream processing model.

It is not clear to me how this salt adds any security to the signature. Please add a comment if you have some hints on this. It seems that picking a stronger hash function with a longer digest, or combining the output of multiple hash functions, would contribute more to security than the salt value.
 


Security

6/24/2007

2 Comments

 

A modern communication protocol must be secure. And to do it right, security must be integrated into the design from the very start. Here is a short list of security requirements for DITP:

   - authenticate peers
   - support exchanged data authentication and encryption
   - provide access control on accessible services, objects and methods
   - support single and multi-signed information of any kind
   - signed information supporting polymorphism and aggregates
   - allow anyone to verify any signature with minimal knowledge

Multi-signed information is when more than one person signs a given piece of information (e.g. a contract).

With a stream oriented encoding, all this implies that we must be able to apply a hash function (e.g. SHA) to transmitted data while it is encoded or decoded.
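Here is a sketch of that structure. The Hasher is a deliberately trivial stand-in for a real hash function such as SHA-1 (e.g. from XySSL), and the encoder interface is invented for this example.

    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    // Trivial stand-in for a real hash context; only the
    // update-as-you-encode structure matters here.
    struct Hasher {
        uint64_t state = 0;
        void update(const void* p, std::size_t n) {
            const uint8_t* b = static_cast<const uint8_t*>(p);
            while (n--) state = state * 131 + *b++;
        }
    };

    // An encoder that feeds every byte it emits to the hasher, so a
    // signature can be computed while the stream is being produced.
    struct HashingEncoder {
        uint8_t* out;
        Hasher   hash;
        void write(const void* p, std::size_t n) {
            hash.update(p, n);
            std::memcpy(out, p, n);
            out += n;
        }
    };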

This is what I am currently implementing. Unfortunately, a server crash monopolized all my time this week. Murphy's law revenge...
