Hm, if you really need to keep ice_getHash, I would strongly suggest changing the internals of HashUtil.h to a standard hashing algorithm that distributes hashes far more evenly.
I wrote a little test to demonstrate the shortcomings of the current algorithm:
Code:
#include <iostream>
#include <sstream>
#include <string>
#include <map>
#include <Ice/Ice.h>

// Convert any streamable value to a string.
template<typename T>
inline std::string to_string(const T& val)
{
    std::ostringstream streamOut;
    streamOut << val;
    return streamOut.str();
}

int main(int argc, char** argv)
{
    typedef std::map<Ice::Int, Ice::ObjectPrx> SeenMap;
    SeenMap seen; // hash -> first proxy seen with that hash
    unsigned int collisions = 0;
    unsigned int i = 0;
    unsigned int maxCollisions = 10;
    Ice::CommunicatorPtr communicator = Ice::initialize(argc, argv);
    // Create proxies with identities "0", "1", "2", ... until enough collisions are found.
    for (i = 0; collisions < maxCollisions; ++i)
    {
        std::string oid = to_string(i) + ":default -p 10000";
        Ice::ObjectPrx obj = communicator->stringToProxy(oid);
        Ice::Int hash = obj->ice_getHash();
        if (!seen.insert(std::make_pair(hash, obj)).second)
        {
            // insert() failed, so a different proxy already has this hash.
            Ice::ObjectPrx existingObj = seen[hash];
            std::cerr << "Hash of "
                      << obj->ice_toString()
                      << " == "
                      << existingObj->ice_toString()
                      << " == "
                      << hash
                      << std::endl;
            ++collisions;
        }
    }
    std::cerr << "Found " << collisions << " collisions in " << i << " iterations" << std::endl;
    communicator->destroy();
    return 0;
}
By modifying maxCollisions you can determine the hit/miss ratio. With maxCollisions == 10 the program output is:
Code:
Hash of 20 -t:tcp -p 10000 == 15 -t:tcp -p 10000 == 1490
Hash of 21 -t:tcp -p 10000 == 16 -t:tcp -p 10000 == 1495
Hash of 22 -t:tcp -p 10000 == 17 -t:tcp -p 10000 == 1500
Hash of 23 -t:tcp -p 10000 == 18 -t:tcp -p 10000 == 1505
Hash of 24 -t:tcp -p 10000 == 19 -t:tcp -p 10000 == 1510
Hash of 30 -t:tcp -p 10000 == 25 -t:tcp -p 10000 == 1515
Hash of 31 -t:tcp -p 10000 == 26 -t:tcp -p 10000 == 1520
Hash of 32 -t:tcp -p 10000 == 27 -t:tcp -p 10000 == 1525
Hash of 33 -t:tcp -p 10000 == 28 -t:tcp -p 10000 == 1530
Hash of 34 -t:tcp -p 10000 == 29 -t:tcp -p 10000 == 1535
Found 10 collisions in 35 iterations
That is almost 30% duplicates (10 collisions in 35 proxies).
When run with maxCollisions == 10000, the output is:
Code:
...
Hash of 11984 -t:tcp -p 10000 == 11979 -t:tcp -p 10000 == 192535
Hash of 11990 -t:tcp -p 10000 == 11985 -t:tcp -p 10000 == 192540
Hash of 11991 -t:tcp -p 10000 == 11986 -t:tcp -p 10000 == 192545
Hash of 11992 -t:tcp -p 10000 == 11987 -t:tcp -p 10000 == 192550
Hash of 11993 -t:tcp -p 10000 == 11988 -t:tcp -p 10000 == 192555
Hash of 11994 -t:tcp -p 10000 == 11989 -t:tcp -p 10000 == 192560
Found 10000 collisions in 11995 iterations
That is more than 80% duplicates (10000 collisions in 11995 proxies).
So anyone who currently relies on ice_getHash to get reasonably unique identifiers for proxies within the Ice::Int number space is in serious trouble. I assume the same is true for Object::ice_getHash, since it uses the same simplistic algorithms from HashUtil.h, even though the problem is much less likely to surface for classes.