The design of our protocol is geared toward small delay thresholds. Thus, it assumes that summaries are updated via sending the differences. If the delay threshold is large, then it is more economical to send the entire bit array; this approach is adopted in the Cache Digest prototype in Squid 1.2b20 [44].
We added a new opcode in ICP version 2 [48],
ICP_OP_DIRUPDATE (= 20), which stands for directory update messages.
In an update message, an additional header follows the regular ICP header
and consists of: 16 bits of Function_Num, 16 bits of
Function_Bits, 32 bits of BitArray_Size_InBits, and 32 bits of
Number_of_Updates. The header completely specifies the hashing
functions used to probe the filter. There are Function_Num of hashing
functions. The functions are calculated by first taking bits 0 to M-1,
M to 2M-1, 2M to 3M-1, etc. out of the MD5 signature [38,22] of the URL, where M
is Function_Bits, and then modular the bits by
BitArray_Size_InBits.
If 128 bits are not enough, more bits are generated by computing the MD5
signature of the URL concatenated with itself.
The header is followed by a list of 32-bit integers. The most significant bit in an integer specifies whether the bit should be set to 0 or 1, and the rest of the bits specify the index of the bit that needs to be changed. The design is due to the concern that if the message specifies only which bits should be flipped, loss of previous update messages would have cascading effects. The design enables the messages to be sent via a unreliable multicast protocol. Furthermore, every update message carries the header, which specifies the hash functions, so that receivers can verify the information. The design limits the hash table size to be less than 2 billion, which for the time being is large enough.