Symmetric RSS (Receive-Side Scaling)

Receive-Side Scaling (RSS) is a mechanism, supported in hardware by NIC vendors, for load-balancing traffic flows across different cores, based on up to five tuple fields. Keeping each flow on a single core helps locality of reference and cache coherency, which improves performance.

[Figure: RSS (taken from Microsoft)]

The most widely used RSS algorithm, and the one suggested by Microsoft, is the Toeplitz hash. It takes two inputs: (1) a key and (2) the input tuple taken from the packet. It outputs a 32-bit hash value, which is used to determine which hardware queue the packet is delivered to. The relevant core can then poll its associated queue for its data, which may even be application specific. The input is selectable as a 2-, 3-, 4-, or 5-tuple, over IPv4 or IPv6. The drawback is that Microsoft's recommended key does not send the two directions of a flow to the same core. To understand why, we need to know how the Toeplitz hash works.
This is the pseudocode for the hash calculation, where K is the key:

ComputeHash(input[], n)

    result = 0
    For each bit b in input[] from left to right
        if (b == 1) result ^= (left-most 32 bits of K)
        shift K left 1 bit position

    return result
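The pseudocode above translates fairly directly into something runnable. The following is a minimal Python sketch rather than a reference implementation: the name `toeplitz_hash` and the choice to hold the key as one big integer are assumptions made for illustration, and `MS_KEY` is simply Microsoft's default key written out as bytes.

```python
def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash of `data` under `key`, returned as a 32-bit integer."""
    key_bits = len(key) * 8
    data_bits = len(data) * 8
    # The 32-bit window slides one bit per input bit, so the key must be
    # at least 31 bits longer than the input.
    if key_bits < data_bits + 31:
        raise ValueError("key too short for this input")
    k = int.from_bytes(key, "big")
    result = 0
    for i in range(data_bits):
        # Test input bit i, counting from the left-most (most significant) bit.
        if data[i // 8] & (0x80 >> (i % 8)):
            # XOR in the left-most 32 bits of the key after i left shifts.
            result ^= (k >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


# Microsoft's default 40-byte key.
MS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b4"
    "77cb2da38030f20c6a42b73bbeac01fa"
)

# A lone leading 1 bit selects exactly the first 32 bits of the key.
print(hex(toeplitz_hash(MS_KEY, b"\x80\x00\x00\x00")))  # 0x6d5a56da
```

Shifting the big integer right by a decreasing amount is equivalent to shifting K left one position per input bit and reading its left-most 32 bits, which keeps the loop close to the pseudocode.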

As you can see, a 32-bit window of the key is XORed into the result whenever the input has a “1” bit. Let’s assume we have a frame with IP source:, IP destination:, going from UDP port 22 to UDP port 55. This means that the input to the hash function for the 4-tuple will be [][][22][55], and for the opposite direction [][][55][22]. To produce the same hash value for these two inputs, the first 32 bits of the key need to be identical to the second 32 bits, and the 16 bits after that need to be identical to the following 16 bits.
The problem with these key requirements is that they weaken the key, in a way that can produce many collisions and lead to a badly distributed load.
Luckily I ran into this paper.
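To see why the two directions generally land on different queues with the default key, it helps to build both inputs and hash them. The sketch below uses hypothetical addresses (192.0.2.1 and 198.51.100.2, from the documentation ranges, chosen purely for illustration), the ports 22 and 55 from the example, and illustrative helpers `toeplitz_hash` and `ipv4_4tuple`. Because the hash is built only from XORs, hashing the bitwise XOR of the two inputs gives the XOR of the two hashes, so the two directions match only if the key windows selected by the differing bits cancel out.

```python
import socket
import struct


def toeplitz_hash(key: bytes, data: bytes) -> int:
    k, key_bits = int.from_bytes(key, "big"), len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (k >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def ipv4_4tuple(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Hash input layout: source address, destination address, source port, destination port."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


# Microsoft's default 40-byte key.
MS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b4"
    "77cb2da38030f20c6a42b73bbeac01fa"
)

fwd = ipv4_4tuple("192.0.2.1", "198.51.100.2", 22, 55)
rev = ipv4_4tuple("198.51.100.2", "192.0.2.1", 55, 22)
h_fwd = toeplitz_hash(MS_KEY, fwd)
h_rev = toeplitz_hash(MS_KEY, rev)

# XOR-linearity: hash(fwd) ^ hash(rev) == hash(fwd ^ rev).
diff = bytes(a ^ b for a, b in zip(fwd, rev))
linear_ok = (h_fwd ^ h_rev) == toeplitz_hash(MS_KEY, diff)

print(f"forward 0x{h_fwd:08x}  reverse 0x{h_rev:08x}  linear: {linear_ok}")
```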

New Key

Basically, they showed that replacing Microsoft’s original key:

0x6d5a 0x56da 0x255b 0x0ec2
0x4167 0x253d 0x43a3 0x8fb0
0xd0ca 0x2bcb 0xae7b 0x30b4
0x77cb 0x2da3 0x8030 0xf20c
0x6a42 0xb73b 0xbeac 0x01fa

with a new key:

0x6d5a 0x6d5a 0x6d5a 0x6d5a
0x6d5a 0x6d5a 0x6d5a 0x6d5a
0x6d5a 0x6d5a 0x6d5a 0x6d5a
0x6d5a 0x6d5a 0x6d5a 0x6d5a
0x6d5a 0x6d5a 0x6d5a 0x6d5a

yields similar distribution performance. This new, fully symmetric key satisfies both requirements: the first two 32-bit values are identical (first row), and so are the next two 16-bit values (half of the second row).
With this key configured on the NIC, both directions of a TCP connection are now delivered to the same core. This is useful for applications such as IPS, DPI, security, and data-analytics algorithms, which need to process the opposite direction of the traffic too.
The Toeplitz hash is widely used, and there are plenty of benchmarks on the web that explore its collision cases. Usually, the more random the traffic is (which tends to apply to high-rate links), the more uniform the distribution, and the less there is to worry about.
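The symmetry claim is easy to check in software. Below is a small sketch, assuming an illustrative `toeplitz_hash` helper and randomly generated IPv4 4-tuples (neither comes from the paper). It works because the key repeats every 16 bits, so the 32-bit window at bit offset i is identical to the one at offset i+16 or i+32; swapping the two addresses or the two ports moves each input bit onto an identical window. The same periodicity also makes the upper and lower 16 bits of every hash equal, which the sketch checks as well. That typically does not matter when only a few low-order bits select the queue, but it is worth knowing if the full 32-bit value is used elsewhere.

```python
import random


def toeplitz_hash(key: bytes, data: bytes) -> int:
    k, key_bits = int.from_bytes(key, "big"), len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (k >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


# The symmetric key: 0x6d5a repeated to fill 40 bytes.
SYM_KEY = bytes.fromhex("6d5a") * 20


def swap_direction(t: bytes) -> bytes:
    """Swap source/destination addresses and ports of a 12-byte IPv4 4-tuple input."""
    return t[4:8] + t[0:4] + t[10:12] + t[8:10]


rng = random.Random(0)  # fixed seed so the run is repeatable
tuples = [bytes(rng.getrandbits(8) for _ in range(12)) for _ in range(1000)]
hashes = [toeplitz_hash(SYM_KEY, t) for t in tuples]

# Both directions of every flow hash to the same value.
symmetric = all(h == toeplitz_hash(SYM_KEY, swap_direction(t))
                for t, h in zip(tuples, hashes))
# Upper and lower 16 bits of each hash are identical.
halves_equal = all((h >> 16) == (h & 0xFFFF) for h in hashes)

print(symmetric, halves_equal)  # True True
```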
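How the key actually reaches the NIC depends on the driver or framework in use. As one hedged example, on Linux the ethtool utility's `-X` (`--rxfh`) option accepts an `hkey` argument written as colon-separated hex bytes, on drivers that allow the key to be changed. The sketch below only formats the key into that form; the interface name `eth0` is a placeholder, and the 40-byte key length is an assumption to check against what `ethtool -x` reports for the device.

```python
# The symmetric key: 0x6d5a repeated to fill 40 bytes (assumed key length).
SYM_KEY = bytes.fromhex("6d5a") * 20

# Colon-separated hex bytes, the form ethtool's hkey argument expects.
hkey = ":".join(f"{b:02x}" for b in SYM_KEY)

# "eth0" is a placeholder interface name.
command = f"ethtool -X eth0 hkey {hkey}"
print(command)
```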

More on RSS can be found on the MSDN site here.

4 thoughts on “Symmetric RSS (Receive-Side scaling)”

  3. I think the suggested alternative key (repetitions of 0x6d5a) doesn’t work well for RSS when the hash result has to be larger than 16-bit.
    Microsoft RSS spec defines hash result of 32-bits.
    Using 16-bit key for Toeplitz-Hash, results in hash where 16 MSBs are identical to 16 LSBs.
    What am I missing?
