Distributed Information System (DIS)

Data memory aligned or not?

10/2/2007


When designing a data encoding, once the decision to use a conventional binary representation is made, the next fundamental decision is whether the data should be memory aligned or not. RPC, CORBA and D-BUS use memory-aligned data, while ICE and IDR don't.

Memory-aligned data ensures that 2, 4, 8 or 16 byte values are stored at an address that is a multiple of their size. For instance, a 2-byte value (a short integer) would be stored at address 0, 2, 4 or 6, a 4-byte value (a 32-bit integer) at address 0, 4 or 8, and an 8-byte value at address 0, 8, and so on.
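The alignment rule can be written down in a couple of lines. The sketch below is only an illustration: the names `padding_for` and `align_up` are made up for this example and do not come from any of the encodings mentioned here.

```python
def padding_for(offset: int, size: int) -> int:
    """Pad bytes needed so a `size`-byte value starts at a multiple of `size`."""
    return (-offset) % size


def align_up(offset: int, size: int) -> int:
    """First address at or after `offset` that is a multiple of `size`."""
    return offset + padding_for(offset, size)


# A 2-byte value that would land on address 1 moves to address 2.
print(align_up(1, 2))
# A 4-byte value that would land on address 5 moves to address 8.
print(align_up(5, 4))
# Address 8 is already a multiple of 8, so no padding is needed.
print(padding_for(8, 8))
```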

Some processors (Itanium and many RISC designs) can only handle aligned data, and the programmer has to add support for unaligned data in software. x86 processors support unaligned data, but with a performance penalty. A quick benchmark on an x86-compatible processor showed that accessing unaligned data was nearly twice as slow as accessing aligned data.
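To give an idea of what "adding support for unaligned data" means in practice, here is a minimal sketch that rebuilds a 32-bit value one byte at a time, so that no multi-byte access ever touches an unaligned address. The function name `load_u32_le` and the little-endian byte order are assumptions made for the illustration. In Python the alignment issue itself does not arise, so this only shows the logic a C or assembly programmer would have to write.

```python
def load_u32_le(buf: bytes, offset: int) -> int:
    """Read a 32-bit little-endian unsigned integer using only byte accesses."""
    value = 0
    for i in range(4):
        # Each byte is fetched on its own and shifted into place.
        value |= buf[offset + i] << (8 * i)
    return value


# The 4-byte value starts at offset 1, which is not a multiple of 4.
data = bytes([0xAA, 0x78, 0x56, 0x34, 0x12])
print(hex(load_u32_le(data, 1)))
```

Four byte loads plus shifts instead of a single word load give a feel for where the extra cost comes from.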

Memory-aligned data requires padding, which on average can represent about 1/3 of the memory used. On a modern PC with several gigabytes of RAM, this overhead is not significant. But for handheld devices, embedded computers, and data that is stored long term or transmitted, it matters much more. A multipurpose encoding should thus care about memory usage as much as about encoding and decoding performance.
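The padding cost is easy to compute for a given record. The sketch below uses a made-up record of five fields (1, 4, 1, 8 and 2 bytes) and hypothetical helper names. It is not a measurement, and the result depends heavily on the order of the fields.

```python
def aligned_size(field_sizes):
    """Bytes used when each field starts at a multiple of its own size."""
    offset = 0
    for size in field_sizes:
        offset += (-offset) % size  # padding inserted before the field
        offset += size              # the field itself
    return offset


def packed_size(field_sizes):
    """Bytes used when fields are stored back to back, without padding."""
    return sum(field_sizes)


fields = [1, 4, 1, 8, 2]           # hypothetical record layout
aligned = aligned_size(fields)     # 26 bytes
packed = packed_size(fields)       # 16 bytes
print(aligned, packed, aligned - packed)

# Sorting the same fields from largest to smallest removes the padding here.
print(aligned_size(sorted(fields, reverse=True)))
```

For this particular layout, 10 of the 26 aligned bytes are padding. An encoder cannot always reorder fields freely, which is one reason the overhead is hard to avoid in general.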

Note that generating aligned data may in some cases require additional computation, and may thus have its own overhead. The code to marshal unaligned data is much simpler and more straightforward: one uses a simple pointer that is incremented by the size of the data type just accessed. Getting rid of the alignment constraint also simplifies PDU encoding and encapsulation.
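The "pointer incremented by the data type size" idea can be sketched with Python's standard `struct` module. The classes `PackedWriter` and `PackedReader` are hypothetical names for this example, and the little-endian byte order (the `<` prefix, which also disables padding) is an assumption. This is not the actual IDR wire format.

```python
import struct


class PackedWriter:
    """Appends values back to back, with no padding between them."""

    def __init__(self):
        self.buf = bytearray()

    def put(self, fmt, value):
        self.buf += struct.pack("<" + fmt, value)


class PackedReader:
    """Reads values sequentially; `pos` plays the role of the pointer."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def get(self, fmt):
        (value,) = struct.unpack_from("<" + fmt, self.data, self.pos)
        self.pos += struct.calcsize("<" + fmt)  # advance by the value's size
        return value


writer = PackedWriter()
writer.put("B", 7)      # 1-byte unsigned integer
writer.put("I", 1000)   # 4-byte unsigned integer, stored at offset 1
writer.put("d", 2.5)    # 8-byte float, stored at offset 5

reader = PackedReader(bytes(writer.buf))
print(len(writer.buf), reader.get("B"), reader.get("I"), reader.get("d"))
```

Neither class ever needs to know the current offset modulo anything, which is the simplification the paragraph above refers to.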

So basically the only drawback of a serialized, unaligned data encoding is the memory access overhead. But this penalty could be removed by ad hoc hardware or processor instructions. For instance, processors could introduce a pointer variant that behaves like an iterator over data of varying size. This iterator could take advantage of asynchronous memory prefetching.

RPC, CORBA and D-BUS also benefit from a nearly direct mapping between the encoded data and the internal data structure representation. This is fine when communicating with programs that use this representation. But what about Java, Python or Ruby, which have their own data and object representations? For them, this optimization no longer holds.

This analysis has led us to the decision to use sequential, unaligned data for the IDR encoding. What would be your choice on this matter?


    Author

    Christophe Meessen is a computer science engineer working in France.

