Setting aside whether a HashSet is the right type for the job at hand, or whether your comparer even makes sense, implementing a proper GetHashCode does seem to make a huge difference.
Here is an example implementation, based on an answer from Marc Gravell:
class KeyWordComparer : EqualityComparer<Keyword>
{
// omitted your Equals implementation for brevity
public override int GetHashCode(Keyword keyword)
{
//return 0; // this was the original
// Marc Gravell https://stackoverflow.com/a/371348/578411
int hash = 13;
// not sure why only the first 8 IDs are used, but I take that as a given
for(var i=0; i < Math.Min(keyword._lista_modelos.Count, 8) ; i++)
{
// unchecked: let the arithmetic wrap around instead of throwing in a checked build
hash = unchecked((hash * 7) + keyword._lista_modelos[i].GetHashCode());
}
return hash;
}
}
When I run this in LINQPad with this test rig
Random randNum = new Random();
var kc = new KeyWordComparer();
HashSet<Keyword> set = new HashSet<Keyword>(kc);
var sw = new Stopwatch();
sw.Start();
for(int i =0 ; i< 10000; i++)
{
set.Add(new Keyword(Enumerable
.Repeat(0, randNum.Next(1,10))
.Select(ir => randNum.Next(1, 256)).ToList()));
}
sw.Stop();
sw.ElapsedMilliseconds.Dump("ms");
this is what I measure:
- 7 ms for 10,000 items
If I switch back to your return 0; implementation for GetHashCode, I measure:
- 4754 ms for 10,000 items
If I increase the test loop to insert 100,000 items, the better GetHashCode still completes in 224 ms on my box. I didn’t wait for your implementation to finish.
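A likely explanation for the gap: when every item returns the same hash code, they all land in the same bucket, so each Add has to call Equals against the items already stored there. The sketch below makes that visible by counting Equals calls. It uses plain int values instead of your Keyword class, and CountingComparer and CollisionDemo are hypothetical names made up for this illustration, so treat it as a rough demonstration rather than a benchmark of your code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: an int comparer with a pluggable hash function
// that counts how often the set has to fall back to Equals.
sealed class CountingComparer : IEqualityComparer<int>
{
    private readonly Func<int, int> _hash;

    public long EqualsCalls { get; private set; }

    public CountingComparer(Func<int, int> hash)
    {
        _hash = hash;
    }

    public bool Equals(int x, int y)
    {
        EqualsCalls++;
        return x == y;
    }

    public int GetHashCode(int value)
    {
        return _hash(value);
    }
}

static class CollisionDemo
{
    // Adds `count` distinct integers to a HashSet and returns
    // the number of Equals calls the set made along the way.
    public static long CountEqualsCalls(Func<int, int> hash, int count)
    {
        var comparer = new CountingComparer(hash);
        var set = new HashSet<int>(comparer);
        for (int i = 0; i < count; i++)
        {
            set.Add(i);
        }
        return comparer.EqualsCalls;
    }

    static void Main()
    {
        const int n = 2000;
        // Same hash for everything, like "return 0;"
        Console.WriteLine("constant hash: " + CountEqualsCalls(_ => 0, n) + " Equals calls");
        // A hash that differs per value
        Console.WriteLine("distinct hash: " + CountEqualsCalls(v => v, n) + " Equals calls");
    }
}
```

With the constant hash, the number of Equals calls should grow roughly with the square of the item count, which would fit the jump from a few milliseconds to several seconds measured above.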
So, if anything, implement a proper GetHashCode method.