Here is a document presenting a review on what is good an bad with HTTP. It provides some light on the choices I made for DIS. I couldn't identify the author's name in the text. Sorry. 

What is wrong with HTTP ?


In this essay, the first of a pair on browser apps, I explore how they are better than traditional desktop apps in some ways, but worse in others. Some of the disadvantages of browser apps are deeply rooted in the use of HTTP URLs for naming. In the second essay, I will present a design sketch for a new platform, are placement for HTTP combining both styles' advantages.Right now, we're seeing a massive shift to browser apps, largely server-side browser apps. As I warned in "People, places, things,and ideas," [18] this move to server-side browser apps imperils our software freedom; I outlined how to solve this problem in "The equivalent of free software for online services." [19] This pair of essays represents more detail on this problem and proposed solution.

read more ...

 
 

At work I'm currently working on tomographic reconstruction algorithms. I have to implement a Bayesian iterative algorithm that requires to select the median value in a set of the 27 float values from a cube of 3x3x3 voxels. This operation must be performed for each voxel and for each iteration. We have to expect 256 million voxels to process for each iteration, but "only" 60 to 100 iterations.

Trying to find the most efficient algorithm I came up with a new algorithm considering the one described on the select algorithm page of wikipedia. I then submitted a question on StackOverflow to get some feedback. And I did get valuable feedback. I was first pointed to the C++ nth_element function I didn't know at the time. It was also suggest to optimize by sharing intermediate information which is indeed a smart thing to do to get a better initial guess on the median value.

After multiple tests and changes to the code I finally reached what seem to be a very efficient algorithm. The ratio is so good that I'm still unsure about it, but I checked everything. It could be due to memory cache or particularly favorable parallelization opportunities with sse instructions. I don't know.

Here is the outline of the algorithm. We have 27 values and have to find the value that split the set in two with 13 values smaller or equal to it and 13 values bigger or equal to it. This value is called the median value.

The fundamental idea of the algorithm is to use a heap data structure which has its smallest or biggest value at the top. Such data structure is very efficient for adding an element or to extract its top most element. It can also be very easily mapped into an array.

The algorithm uses two heaps initially empty, with a common top value which is the median value. Each heap has a capacity of at most 14 elements, where one is common to the two heaps, their top value and also the median value.
The algorithm proceed in two phases. In the first phase, the algorithm picks a value as initial median value guess. Subsequent values are then compare with this median value and added to the corresponding heap until one heap becomes full and contains 14 elements.

At this point the median value is a value in the full heap or in the remaining set of values to process. The second phase of the algorithm then starts where the remaining values are processed. Values that would not be inserted in the full heap are ignored. The other values are inserted in the full heap after deleting its top most value. The heap then gets a new top most value and thus also a new median value. When all the remaining elements have been processed, the median of the 27 value set is the top most value of the full heap.

Here are some benchmark results. See StackOverflow for more detailed information.

HeapSort        :2.287 
QuickSort       :2.297 
QuickMedian1
:0.967 
HeapMedian1  
:0.858 
NthElement     :0.616 
QuickMedian2
:1.178 
HeapMedian2  
:0.597 
HeapMedian3  
:0.015  <-- best

It thus seem that HeapMedian3 is 33 times faster than NthElement. I used a 3GHz Intel E8400 processor and the Intel C++ compiler with options -03 and -xS for benchmarking.

Here is the code :

// return the median value in a vector of 27 floats pointed to by a
float
heapMedian3( float*a )
{
   
float left[14], right[14], median,*p;
   
unsigned char nLeft, nRight;

   
// pick first value as median candidate
   p
= a;
   median
= *p++;
   nLeft
= nRight =1;

   
for(;;)
   
{
       
// get next value
       
float val = *p++;

       
// if value is smaller than median, append to left heap
       
if( val < median )
       
{
           
// move biggest value to the heap top
           
unsigned char child = nLeft++, parent = (child -1)/2;
           
while( parent && val > left[parent] )
           
{
               left
[child] = left[parent];
               child
= parent;
               parent
=(parent -1)/2;
           
}
           left
[child] = val;

           
// if left heap is full
           
if( nLeft == 14)
           
{
               
// for each remaining value
               
for( unsigned char nVal = 27-(p - a); nVal; --nVal )
               
{
                   
// get next value
                   val
= *p++;

                   
// if value is to be inserted in the left heap
                   
if( val < median )
                   
{
                       child
= left[2] > left[1] ? 2 : 1;
                       
if( val >= left[child])
                           median
= val;
                       
else
                       
{
                           median
= left[child];
                           parent
= child;
                           child
= parent*2 + 1;
                           
while( child <14 )
                           
{
                               
if( child < 13 && left[child+1] > left[child] )
                                   
++child;
                               
if( val >= left[child] )
                                   
break;
                               left
[parent] = left[child];
                               parent
= child;
                               child
= parent*2 + 1;
                           
}
                           left
[parent] = val;
                       
}
                   
}
               
}
               
return median;
           
}
       
}

       
// else append to right heap
       
else
       
{
           
// move smallest value to the heap top
           
unsigned char child = nRight++, parent = (child -1)/2;
           
while( parent && val < right[parent] )
           
{
               right
[child] = right[parent];
               child
= parent;
               parent
= (parent -1)/2;
           
}
           right
[child] = val;

           
// if right heap is full
           
if( nRight == 14 )
           
{
               
// for each remaining value
               
for( unsigned char nVal = 27-(p - a); nVal; --nVal )
               
{
                   
// get next value
                   val
= *p++;

                   
// if value is to be inserted in the right heap
                   
if( val > median )
                   
{
                       child
= right[2] < right[1] ? 2 : 1;
                       
if( val <= right[child] )
                           median
= val;
                       
else
                       
{
                           median
= right[child];
                           parent
= child;
                           child
= parent*2 + 1;
                           
while( child <14 )
                           
{
                               
if( child < 13 && right[child+1] < right[child] )
                                   
++child;
                               
if( val <= right[child] )
                                   
break;
                               right
[parent] = right[child];
                               parent
= child;
                               child
= parent*2 + 1;
                           
}
                           right
[parent] = val;
                       
}
                   
}
               
}
               
return median;
           
}
       
}
   
}
}

 
 

This is a talk of Steve Jobs at Stanford  in 2005. It's really worth  viewing.

 
 

In some discussion group we just settled, we were asked to write down what would be the most important for us in our lives. And to our big surprise 90% youngsters in the group used the words "recognition" or "social recognition". There wasn't any significant discussion before that could have biased this informal survey. It was surprising enough to catch my attention.

20 years later, here is how I explain this to myself. It is known for many years now that our brain has special centers responsible of fundamental drives in our behavior. These centers have been detected and localized by changes in people's behavior when altered by an accident or a disease. The drives are related to hunger, libido, aggressivity/love, etc.


Assuming that mankind is the product of an evolution driven by natural selection, the existence of such drive centers makes sense. It ensures perennity of our species from the biological and the physiological perspective. Maybe the social perspective did not get the same attention.

To me, the existence of a drive for social recognition would also make sense because it pushes toward social coherence and cohesion and thus also to stability while being flexible enough to preserve the capability of our society to evolve and adapt to context changes.

This social recognition drive is thus equivalent to sexual drive, with all the possible related pathology. Social recognition compulsion can for instance lead to mythomania or eccentric behavior.

So my impression, if you allow me to push the logic to its extrema, is that social network sites are a variant of porn sites. They just tickle the brain centers driving our behavior.

I would thus say that social networking is directly related to the social recognition drive, which is itself a kind of attractive magnetic force required to get social cohesion and coherence. If this is true, animals would show the same fundamental drive, and why not, ETs too.

 
 

DIS is designed to address many shortcomings of existing internet services. But it is also subject to the network effect which means that its value is function of the number of users.

It's like a TV broadcasting system. Whatever the quality of the technology, without interesting broadcasts it won't attract users and without users it won't attract producers and without producers it won't get attractive broadcasts and then again no users. But when the feedback in such system becomes positive we get a virtuous circle which amplifies its spinning by itself. And once the virtuous circle is spinning one can tap its energy with advertisement, freemium or even a fee if the traction is strong enough.

To bootstrap it, one has to pick the right TV broadcast which has to be one that people want to see while also simple and cheap to produce. As you may guess this is a strategic choice.

Starting such type of system is not easy, but once rolling it is quite safe because it is difficult to stop and provides thus a protection against competitors. A good example is Google failing to catch on You Tube which ended up buying it.

Like the TV broadcasting system, DIS offers the liberty degree to pick the right bootstrapping application. It has to be something that people want while being also cheap and simple to produce.

The business model is also strategic. Offering a free service means not tapping energy from the virtuous circle which is good to get maximum acceleration. But at some point we need to tap energy from it so that we can invest it in development of new features and make it more attractive.

The business model and the bootstrapping application are thus key to success, but don't expect me to present them here before they are in production.

 
 

This picture of Marseille was taken from the top of Marseilleveyre, a small mountain south of Marseille. The small harbor in front is the harbor of Pointe Rouge. The Vieux Port, center of Marseille, and Notre Dame de la Garde, are not visible here.

One of the islands in the bay host the Château d'If that was a prison for sometime. This place was made famous by the roman "Le Comte de Monte Cristo" of Alexandre Dumas.

 
 

I just wanted to share with you two nice pictures taken during my summer holidays. 

The picture shows some vineyards and a very old church in a small village close to Blaye. The wine of this region is labeled côte de grave and famous. The river Gironde is just behind the small hill in the background. The famous Bordeau wine region Medoc is just on the other side of the Gironde.

 
 

Eric Schmidt, Google CEO, presented his view of the web evolution track at the Seoul Digital Forum.

DIS was conceived with this view in mind. The protocol and key service are the keystone of such system.

There are other properties that Eric Schmidt didn't evoke but that require a better response than provided by existing systems.

 
 

An enlightening reading on hacking! 'To those about to hack"

As people may have understood by now, I'm more of an Abe than a George...

 
About me... 06/27/2007
 

I've added a photo of my self so you can associate a face to DIS. The city where I live and work is Marseille which is in Provence (south of France). View it with Google maps.

Below you can see the calanque of Sugiton, just next to Marseille, that you can reach after a 30 minute walk through a pine wood from Luminy.