Date time stamp binary encoding

12/18/2012

[Image: Infinite Clock II by Robbert van der Steeg]

A date time stamp is a reference in time. This post considers only date time stamps used as time references in computer systems with a limited time span, such as now +/- 100 years.

It presents a binary encoding with microsecond resolution that can represent either an absolute time, including time zone information, or a relative time interval for time arithmetic.

Introduction

Operating systems classically represent time as an integer counting the seconds elapsed since 1970-01-01 00:00 UTC. Unfortunately, the coarse one-second resolution and the absence of time zone information make it inconvenient as a time reference for applications that communicate worldwide.

Rationale

The rationale of this encoding is to favor efficient date time comparison, and to make local time computation, or UTC time and time zone extraction, possible with trivial, easy-to-remember operations. Arithmetic operations on time should also be straightforward.

Time zone encoding

According to ISO 8601, the international standard for date and time representation, the time offset relative to UTC has minute granularity. According to this bug report, the smallest time zone offset relative to UTC may be -15:56:00 in Asia/Manila and the largest +15:13:42 in America/Metlakatla. We may round this range to -16:00 to +15:59. This span covers 2 x 960 = 1920 minutes, so 11 bits (2048 values) are sufficient to encode the time zone. The value is encoded as an unsigned integer relative to 1024. Thus -40 is encoded as 1024 - 40 = 984 and +40 as 1024 + 40 = 1064. An hour is 60 minutes, so an offset of +2:04 is 2 x 60 + 4 = 124 minutes, encoded as 1024 + 124 = 1148.
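The offset encoding described above can be sketched in a few lines of Python. The names `TZ_BIAS`, `encode_tz_offset` and `decode_tz_offset` are made up for this illustration and are not part of any library; the accepted range is the rounded -16:00 to +15:59 span from the text.

```python
TZ_BIAS = 1024  # offsets are stored relative to 1024 (hypothetical constant name)


def encode_tz_offset(minutes: int) -> int:
    """Map a UTC offset in minutes to its 11-bit unsigned code."""
    # -16:00 .. +15:59 expressed in minutes
    if not -960 <= minutes <= 959:
        raise ValueError("UTC offset out of the supported range")
    return TZ_BIAS + minutes


def decode_tz_offset(code: int) -> int:
    """Recover the UTC offset in minutes from its 11-bit code."""
    return code - TZ_BIAS


if __name__ == "__main__":
    for minutes in (-40, 40, 2 * 60 + 4):
        print(minutes, "->", encode_tz_offset(minutes))
```

Every valid code is non-zero and fits in 11 bits, which is what later allows a zero field to mark a time interval.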

Time encoding

If we use 64-bit integers, this leaves 53 bits for time encoding. The obvious choice is to use UTC as the universal reference and the time elapsed since 1970-01-01 00:00 in some unit to get an integer representation. This provides a well normalized and easy to remember time reference. It also simplifies conversion from the existing (old) 32-bit system time encoding. Reserving one bit as the sign bit, so that a 64-bit signed integer data type can be used, we have 52 bits left. Using microsecond time units, 2^52 microseconds is about 142 years, so the time value can cover a year range of 1970 +/- 142. This leaves about 100 years ahead of us.
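The +/- 142 year figure can be checked with a short calculation. This is only a sanity check, assuming an average Gregorian year of 365.2425 days; the variable names are illustrative.

```python
# Average Gregorian year length in microseconds (assumption: 365.2425 days).
MICROSECONDS_PER_YEAR = 365.2425 * 24 * 3600 * 1_000_000

# 53 bits for the time field, one of which is the sign bit.
TIME_MAGNITUDE_BITS = 52

span_years = 2**TIME_MAGNITUDE_BITS / MICROSECONDS_PER_YEAR
first_year = 1970 - int(span_years)
last_year = 1970 + int(span_years)

if __name__ == "__main__":
    print(f"span: {span_years:.2f} years, range: {first_year}..{last_year}")
```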

Encoding summary

The time is encoded in a 64-bit signed integer. The 53 most significant bits represent a signed time value in microsecond units.

When the value represents a time interval or the result of some time computation, the 11 least significant bits are 0 so that conventional signed integer arithmetic operations can be used for time computation. The only constraint is with time interval division, where the 11 least significant bits of the result must be cleared.
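A minimal sketch of interval arithmetic under this layout follows. The helper names are hypothetical. Note that Python's `//` rounds toward negative infinity, whereas integer division in C truncates toward zero, so negative intervals may round differently in other languages.

```python
TZ_BITS = 11
TZ_MASK = (1 << TZ_BITS) - 1  # the 11 least significant bits


def interval_from_micros(micros: int) -> int:
    """Build an interval value: microseconds in the upper bits, zero below."""
    return micros << TZ_BITS


def interval_to_micros(interval: int) -> int:
    """Read the microsecond count back (arithmetic shift keeps the sign)."""
    return interval >> TZ_BITS


def divide_interval(interval: int, divisor: int) -> int:
    """Divide an interval by an integer and clear the 11 low bits."""
    return (interval // divisor) & ~TZ_MASK


if __name__ == "__main__":
    one_second = interval_from_micros(1_000_000)
    # Addition and integer multiplication keep the low bits at zero.
    print(interval_to_micros(one_second + 3 * one_second))
    # Division needs the explicit clearing step.
    print(interval_to_micros(divide_interval(one_second, 3)))
```

Without the masking step, the quotient of one second by 3 would carry stray bits in the low field and could be mistaken for an absolute time.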

When the value is an absolute time, the 53 most significant bits encode the time elapsed since 1970-01-01 00:00 UTC. The 11 least significant bits encode the local time offset relative to UTC in minutes, with 1024 added so that it is stored as an unsigned integer. The value 0 (an offset of -1024 minutes) is not a valid time offset.
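As an illustration, an absolute time could be packed and unpacked as below using Python's standard `datetime` module. The function names `encode_absolute` and `decode_absolute` are assumptions made for this sketch, and any seconds in the UTC offset are simply dropped.

```python
from datetime import datetime, timedelta, timezone

TZ_BITS = 11
TZ_BIAS = 1024
TZ_MASK = (1 << TZ_BITS) - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_absolute(dt: datetime) -> int:
    """Pack a timezone-aware datetime into the 64-bit stamp."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("a timezone-aware datetime is required")
    offset_min = offset // timedelta(minutes=1)
    if not -960 <= offset_min <= 959:
        raise ValueError("UTC offset out of the supported range")
    micros = (dt - EPOCH) // timedelta(microseconds=1)
    # UTC microseconds in the upper 53 bits, biased offset in the lower 11.
    return (micros << TZ_BITS) | (TZ_BIAS + offset_min)


def decode_absolute(stamp: int) -> datetime:
    """Unpack a stamp into a datetime carrying the stored UTC offset."""
    micros = stamp >> TZ_BITS  # arithmetic shift keeps the sign
    offset_min = (stamp & TZ_MASK) - TZ_BIAS
    tz = timezone(timedelta(minutes=offset_min))
    return (EPOCH + timedelta(microseconds=micros)).astimezone(tz)


if __name__ == "__main__":
    local = timezone(timedelta(hours=1))
    stamp = encode_absolute(datetime(2012, 12, 18, 10, 30, tzinfo=local))
    print(hex(stamp), decode_absolute(stamp).isoformat())
```

Because the UTC time sits in the most significant bits, two stamps order by instant first, regardless of the offsets they carry.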

Time operations

Testing whether a time value is an absolute time or an interval is performed by testing whether the 11 least significant bits are all 0.
To perform time computation, first clear the 11 least significant bits, then use conventional integer addition, subtraction and multiplication.
To perform time interval division, use the normal integer division operation and clear the 11 least significant bits of the result.
Absolute times can be compared with one another using conventional integer comparison, and so can time intervals. Comparing an absolute time with a time interval does not make sense unless the interval is relative to 1970-01-01 00:00 UTC.
Extracting the UTC offset in minutes is performed by clearing the 53 most significant bits and subtracting 1024 from the resulting value.
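Conversion to double precision floats in second units is trivial, but the result lacks the time zone information. The microsecond count itself fits exactly in a double; dividing by 10^6 to obtain seconds adds only a rounding error below one microsecond.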
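The operations listed above might look like this in Python. All function names are illustrative, and the two sample stamps are built by hand from arbitrary microsecond counts.

```python
TZ_BITS = 11
TZ_BIAS = 1024
TZ_MASK = (1 << TZ_BITS) - 1


def is_interval(value: int) -> bool:
    """An interval has all 11 least significant bits at zero."""
    return value & TZ_MASK == 0


def utc_offset_minutes(stamp: int) -> int:
    """Keep only the low 11 bits and remove the 1024 bias."""
    return (stamp & TZ_MASK) - TZ_BIAS


def strip_tz(value: int) -> int:
    """Clear the offset field before doing arithmetic."""
    return value & ~TZ_MASK


def elapsed(later: int, earlier: int) -> int:
    """Interval between two absolute times."""
    return strip_tz(later) - strip_tz(earlier)


def to_seconds(value: int) -> float:
    """Seconds as a double; the time zone information is dropped."""
    return (value >> TZ_BITS) / 1_000_000


if __name__ == "__main__":
    t0 = 1_355_826_600_000_000  # arbitrary microsecond count since the epoch
    a = (t0 << TZ_BITS) | (TZ_BIAS + 60)                 # stored with UTC+1:00
    b = ((t0 + 1_500_000) << TZ_BITS) | (TZ_BIAS - 300)  # stored with UTC-5:00
    print(a < b, is_interval(elapsed(b, a)), to_seconds(elapsed(b, a)))
```

Plain integer comparison orders `a` before `b` even though their stored offsets differ, since the offset only occupies the low bits.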

Final remarks

This encoding is trivial to understand and to manipulate using conventional integer arithmetic, comparison or bitwise operations. Its value may represent an absolute time or a time interval, and the two kinds of value can be told apart. Time comparison and arithmetic operations in this representation are more efficient than with a double precision float encoding.

This encoding is well suited for date time stamping within the defined limited range, for using such encoded dates as an indexed key in a database, or when stamped information must be sorted. It allows any absolute time to be displayed using the ISO 8601 convention or any country-specific representation.

However, this time encoding has two limitations, both minor weaknesses. The first is the restricted time span covered by the encoding. The second is the inability to encode daylight saving (summer or winter) time information. The latter does not impair absolute time comparison because UTC is used as the reference. The problem is only that one cannot determine whether the time zone offset includes a daylight saving adjustment. But this is also the case with the ISO 8601 representation.
