Why is the XOR operator used to compute a hash code?


This MSDN article, http://msdn.microsoft.com/en-us/library/ms132123.aspx, discusses the EqualityComparer<T> class and includes an example. The example, which compares boxes, contains this class:

class BoxSameDimensions : EqualityComparer<Box>
{
    public override bool Equals(Box b1, Box b2)
    {
        if (b1.Height == b2.Height & b1.Length == b2.Length
            & b1.Width == b2.Width)
            return true;
        return false;
    }

    public override int GetHashCode(Box bx)
    {
        int hCode = bx.Height ^ bx.Length ^ bx.Width;
        return hCode.GetHashCode();
    }
}

I don't understand the line int hCode = bx.Height ^ bx.Length ^ bx.Width;


Could someone explain, please? Why the XOR?


The ^ operator is the bitwise exclusive-or (XOR) operator.

In this case it's being used as a convenient way to generate a hash code from three integers. (I don't think it's a very good way, but that's a different issue...)
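
To make the bitwise behaviour concrete, here is a minimal sketch that XORs three small ints the same way the sample does. The class name XorDemo, the method name XorHash, and the values 4, 8 and 10 are made up for illustration; they are not from the MSDN article.

```csharp
using System;

static class XorDemo
{
    // Hypothetical helper: combines three ints with XOR, as the sample's GetHashCode does.
    public static int XorHash(int height, int length, int width)
    {
        return height ^ length ^ width;
    }

    static void Main()
    {
        // In binary: 4 = 0100, 8 = 1000, 10 = 1010.
        // A result bit is 1 when an odd number of the inputs have a 1 in that position.
        int hash = XorHash(4, 8, 10);
        Console.WriteLine(Convert.ToString(hash, 2).PadLeft(4, '0')); // prints 0110
        Console.WriteLine(hash);                                      // prints 6
    }
}
```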

Weirdly, after constructing the hash code, they call GetHashCode() on it again, which is utterly pointless for an int because it just returns the int itself, so it's a no-op.
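
A quick way to check the no-op claim is the sketch below. The class IntHashDemo and the method IsNoOp are hypothetical names used only for this check.

```csharp
using System;

static class IntHashDemo
{
    // Hypothetical helper: true when GetHashCode() on an int returns the int unchanged.
    public static bool IsNoOp(int value)
    {
        return value.GetHashCode() == value;
    }

    static void Main()
    {
        // A few arbitrary sample values, including a negative one and the maximum int.
        foreach (int v in new[] { 0, 42, -7, int.MaxValue })
        {
            Console.WriteLine($"{v}: {IsNoOp(v)}"); // each line ends with True
        }
    }
}
```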


This is how they should have written it:

public override int GetHashCode(Box bx)
{
    return bx.Height ^ bx.Length ^ bx.Width;
}


This SO answer explains why XOR works quite well sometimes: "Why are XOR often used in java hashCode() but another bitwise operators are used rarely?"


Note: The reason I don't like using XOR to build a hash code from three ints like that is:

a ^ b ^ a == b

In other words, if the first and last ints contributing to the hash code are the same, they do not contribute to the final hash code at all: they cancel each other out and the result is always the middle int.
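
The cancellation is easy to see with concrete numbers. In this sketch the three arguments stand in for a box's Height, Length and Width; the class CancelDemo, the method XorHash, and the sample dimensions are assumptions made for illustration.

```csharp
using System;

static class CancelDemo
{
    // Hypothetical helper: same XOR combination as the sample's GetHashCode.
    public static int XorHash(int height, int length, int width)
    {
        return height ^ length ^ width;
    }

    static void Main()
    {
        // Height equals Width in both calls, so the two cancel and only Length (5) survives.
        Console.WriteLine(XorHash(1, 5, 1));     // prints 5
        Console.WriteLine(XorHash(300, 5, 300)); // prints 5
    }
}
```

Two quite different boxes therefore land in the same hash bucket whenever their heights and widths match.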


It's even worse if you are only using two ints because:

a ^ a == 0


So for two ints, the hash code will be zero in every case where they are the same.
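
The sketch below shows the two-int case, and for contrast a multiply-and-add combination that is commonly used instead of plain XOR. That alternative is not part of the answer above; the class PairHashDemo, the method names, and the constants 17 and 31 are assumptions chosen for illustration, not a recommendation from the MSDN article.

```csharp
using System;

static class PairHashDemo
{
    // Plain XOR of two ints: equal inputs always give 0.
    public static int XorHash(int a, int b)
    {
        return a ^ b;
    }

    // Hypothetical alternative: multiply-and-add with small primes.
    // unchecked lets the arithmetic wrap around on overflow instead of throwing.
    public static int MultiplyAddHash(int a, int b)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + a;
            hash = hash * 31 + b;
            return hash;
        }
    }

    static void Main()
    {
        Console.WriteLine(XorHash(7, 7)); // prints 0
        Console.WriteLine(XorHash(9, 9)); // prints 0

        // The multiply-and-add version keeps these two equal-valued pairs apart.
        Console.WriteLine(MultiplyAddHash(7, 7) != MultiplyAddHash(9, 9)); // prints True
    }
}
```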


