The following is a hands-on post meant to demonstrate how GetHashCode()
and Equals()
methods are used by .NET Framework under the hood. For the sake of simplicity I will refer to the popular hashed-base Dictionary type, although any other hash based structure will follow a similar behavior, if not the same one.
After understanding this post you should be able to spot potential problematic behaviors and resolve them, prevent creation of unreachable items in dictionaries and improve CRUD actions performance on hash based structures.
The Theory
GetHashCode()
is used to create a unique integer identifier for objects/structs. The hashcode can be used for two purposes:
- Programmatically, by developers, to distinguish objects/structs form each other (NOTE: Not recommended when the default .NET implementation is used, as it's not guaranteed to preserve the same hash between .NET versions and platforms)
- Internally, by .NET Framework, when using the object/struct as a key in a hashed based list, i.e. Dictionary. This scenario is the main subject of this article.
Microsoft has supplied both Reference and Value types with two default GetHashCode()
implementations. The implementation is designed to provide a generic "good-enough" solution with almost no knowledge of our particular types at runtime. As result the default implementation of GetHashCode()
works, but is not optimized. (The default Value.GetHashCode()
has an additional issue, discussed later in this post).
Add and Search Mechanism
When using an object/struct as a key in a Dictionary, .NET implementation internally calls the given object/struct's GetHashCode()
method and calculates an index from it, based on the length of the list.
Although implementation can vary between .NET versions, the common implementation is similar to the following:
int hash = GetHashCode() % dictionary.Length;Don't worry, we will review the default implementation later-on.
Hashbased structures are composed by buckets that are used to 'hold' the supplied objects. When trying to insert new items in a Dictionary .NET uses the calculated index and tries to insert the item in that specific index position.
If the calculated index is still not in use (the bucket is empty), the item is placed smoothly in the bucket (O(1) algorithm).
In case the calculated index already 'holds' an object(s), meaning that the bucket is already in use, .NET will still try to add the object to the bucket, but only if the identical key is not already used by that specific bucket. This scenario is called "collision"(O(n) algorithm).
In order to check the existence of identical keys in a bucket .NET compares each existing key in the bucket with the new candidate item by calling the Equals()
method on each of the existing items:
bool exist = alreadyInUseKey1.Equals(newKey1); //* Loops through the entire bucket items list
If .NET encounters an equivalence between two keys (exist = true) an 'An item with the same key has already been added' exception will be thrown. Otherwise the item will be added with the provided key.
NOTE: Typically, the domain of a hash function (the set of possible keys) is larger than its range (the number of different table indexes), and so it will map several different keys to the same index (called collision).
When searching an item based on its key object, .NET internally retrieves the key's hashcode, recalculates the index and instructs the CPU to retrieve the object from the specific bucket at the index position.
Optimal Implementation
In order to distinguish objects from each other, we need to make use of all, or parts of, its internal values (properties or fields). There are three common golden rules we must follow in order to maximize use of hashed lists:
- If two objects are equal (as defined by
Equals()
method), they must generate the same hash value, Otherwise, their hashcodes can’t be used to find objects in their correspondent buckets.
The contrary is not required - Multiple non-Equal objects can have the same hashcode (collisions). This is the reason why we need to overrideEquals()
whenGetHashCode()
has been overridden and vice versa, we need them to be synchronized. - For any object A,
A.GetHashCode()
must be an instance invariant. No matter what methods are called upon A,A.GetHashCode()
must always return the same value. That ensures that an object placed in a bucket will always remain reachable. - To reduce the frequency of hashcode collisions, the hash function should generate a uniform distribution among all integers (32/64 bit) based on all, or part of, the object/struct fields. That’s how you achieve efficiency in a hash-based container.
Wikipedia: In probability theory and statistics, the discrete uniform distribution is a symmetric probability distribution whereby a finite number of values are equally likely to be observed; every one of n values has equal probability 1/n. Another way of saying "discrete uniform distribution" would be "a known, finite number of outcomes equally likely to happen". A simple example of the discrete uniform distribution is throwing a fair die. The possible values are 1, 2, 3, 4, 5, 6, and each time the die is thrown the probability of a given score is 1/6. If two dice are thrown and their values added, the resulting distribution is no longer uniform since not all sums have equal probability.
In order to use the large number produced by the GetHashCode()
as an index, we need to reduce it to the length of the dictionary while preserving the good distribution gained by our GetHashCode()
.
For that reason, as explained previously, .NET internally calculates the index by modeling it against the length of the list. By the way, .NET preserves the hashcode in order to use it in case the Dictionary capacity is increased at runtime. In this case .NET recalculates all items indexed with the new Length of the dictionary and replaces the items position accordingly.
A common and successful algorithm is to XOR all the return values from GetHashCode()
on all fields in a type. If your type contains some mutable fields, exclude those fields from the calculations. At this point it should be clear that a poor hashcode algorithm can cause performance issues and even unreachable items (see samples in "Hands On" section below).
Default .NET Implementation
As mentioned before, mediocre GethashCode()
and Equals()
implementations where supplied by default by Microsoft:
System.Object doesn't have to follow rule 3, because it uses a simple sequential integer. So when you create reference types that are meant to be hash keys, you should override GetHashCode()
to get a better distribution of the hash values across all integers for your specific type.
System.ValueType overrides Objects' implementations and performs a hash calculation based on the object's first Property. Rule 2 and 3 depend on the first property of an object. It must be well distributable and immutable.
NOTE:
I have taken the description above, about System.ValueType
implementation, from the book "Effective C#". Several tests later, I now understand that it's not always the case. .NET apparently uses a more complex algorithm. In this article we will follow "Effective C#"'s description.
Hands On
The following is an effective illustration demonstrating how small changes can lead to big differences when talking about hash based structures:
Defaul Value Implelemntation
Given a custom value struct-key, where the first field (id) of the key is always different.
struct MyKey { private int id; private string msg; public MyKey(int id) { this.id = id; this.msg = "Hello"; } } private static void AddDic() { Dictionary <MyKey,int> list = new Dictionary<MyKey,int>(); for (int i = 0; i < 100; i++) list.Add( new MyKey(i), i); }
The code above will take ~8 seconds on my 4 core 8MB RAM machine.
Let's say that time comes and your college made a small refactor:
struct MyKey { private string msg;//* Now first field in the struct private int id; public MyKey(int id) { this.id = id; this.msg = "Hello"; } }
Even though she just changed the order of the fields, now the code will take ~30 seconds, because the first field of the struct is fixed and all the items are inserted in the same bucket.
Custom Equals & GetHashCode
The same is true with custom overridden methods. We need to make sure the hashcode is well distributed, otherwise we will suffer from the same behavior.
Let's consider a similar object-key implementation:
struct MyKey { private int id; public MyKey(int id) { this.id = id; } public override bool Equals(object obj) { int int1 = ((MyKey)obj).id; bool equ = int1 == id; return equ; } public override int GetHashCode() { int hash = 1;//* Always same value return hash; } }
The first sample makes use of the above object-key to generate an identical hashcode for all instances.
private static void AddDic() { Dictionary <MyKey , int> list = new Dictionary<MyKey, int>(); double beforems = DateTime.Now.TimeOfDay.TotalMilliseconds; Console.WriteLine("Befoare Insert: " + beforems); for ( int i = 0; i < 100; i++) { list.Add(new MyKey(i), i); Console.WriteLine("----"); } double afterms = DateTime.Now.TimeOfDay.TotalMilliseconds; Console.WriteLine("After Insert: " + afterms); Console.WriteLine("Total: " + (afterms - beforems)); Console.ReadKey(); }
Adding new items into the Dictionary is very expensive, as mentioned before. When an identical hashcode is encountered, .NET tries to determine equality for all the key objects by looping throughout all the items in the specific bucket. This innocent loop of 100 items suffers from extremely low performance just because the hashcode in the custom key is always the same.
Output:
Equals - Internal Object ID:15 External Object ID: 16 Equals - Internal Object ID:14 External Object ID: 16 Equals - Internal Object ID:13 External Object ID: 16 Equals - Internal Object ID:12 External Object ID: 16 Equals - Internal Object ID:11 External Object ID: 16 Equals - Internal Object ID:10 External Object ID: 16 Equals - Internal Object ID:9 External Object ID: 16 Equals - Internal Object ID:8 External Object ID: 16 Equals - Internal Object ID:7 External Object ID: 16 Equals - Internal Object ID:6 External Object ID: 16 Equals - Internal Object ID:5 External Object ID: 16 Equals - Internal Object ID:4 External Object ID: 16 Equals - Internal Object ID:3 External Object ID: 16 Equals - Internal Object ID:2 External Object ID: 16 Equals - Internal Object ID:1 External Object ID: 16 Equals - Internal Object ID:0 External Object ID: 16 ---- GetHashCode - Internal Object ID:17 HashCode :1 Equals - Internal Object ID:16 External Object ID: 17 Equals - Internal Object ID:15 External Object ID: 17 Equals - Internal Object ID:14 External Object ID: 17 Equals - Internal Object ID:13 External Object ID: 17 Equals - Internal Object ID:12 External Object ID: 17 Equals - Internal Object ID:11 External Object ID: 17 Equals - Internal Object ID:10 External Object ID: 17 Equals - Internal Object ID:9 External Object ID: 17 Equals - Internal Object ID:8 External Object ID: 17 Equals - Internal Object ID:7 External Object ID: 17 Equals - Internal Object ID:6 External Object ID: 17 Equals - Internal Object ID:5 External Object ID: 17 Equals - Internal Object ID:4 External Object ID: 17 Equals - Internal Object ID:3 External Object ID: 17 Equals - Internal Object ID:2 External Object ID: 17 Equals - Internal Object ID:1 External Object ID: 17 Equals - Internal Object ID:0 External Object ID: 17 ---- GetHashCode - Internal Object ID:18 HashCode :1 Equals - Internal Object ID:17 External Object ID: 18 Equals - Internal Object ID:16 External Object ID: 18 Equals - Internal Object ID:15 External Object ID: 18 Equals - Internal Object ID:14 External Object ID: 18 Equals - Internal Object ID:13 External Object ID: 18 Equals - Internal Object ID:12 External Object ID: 18 Equals - Internal Object ID:11 External Object ID: 18 Equals - Internal Object ID:10 External Object ID: 18 Equals - Internal Object ID:9 External Object ID: 18 Equals - Internal Object ID:8 External Object ID: 18 Equals - Internal Object ID:7 External Object ID: 18 Equals - Internal Object ID:6 External Object ID: 18 Equals - Internal Object ID:5 External Object ID: 18 Equals - Internal Object ID:4 External Object ID: 18 Equals - Internal Object ID:3 External Object ID: 18 Equals - Internal Object ID:2 External Object ID: 18 Equals - Internal Object ID:1 External Object ID: 18 Equals - Internal Object ID:0 External Object ID: 18 ----
We can see that every new added object (17 and 18) is tested against each existing item in the dictionary.
By changing our implementation (although not the optimal one) to return a distinct value for each item, we will dramatically increase the overall performance.
struct MyKey { private int id; public MyKey(int id) { this.id = id; } public override bool Equals(object obj) { int int1 = ((MyKey)obj).id; bool equ = int1 == id; return equ; } public override int GetHashCode() { int hash = id;//* Always different return hash; } }
An additional common mistake can occurs when our GetHashCode()
implementation is based on a mutable field, for example, msg
field. In this case we can make items become unreachable if the field gets modified at runtime:
private static void ChangeDic() { Dictionary<MyKeyStruct , int> list = new Dictionary<MyKeyStruct , int>(); MyKeyStruct one = new MyKeyStruct(1); MyKeyStruct oneTwo = new MyKeyStruct(2); MyKeyStruct oneThree = new MyKeyStruct(3); MyKeyStruct oneFour = new MyKeyStruct(4); MyKeyStruct oneFive = new MyKeyStruct(5); MyKeyStruct oneSix = new MyKeyStruct(6); list.Add(one, 1); Console.WriteLine( "----" ); list.Add(oneTwo, 2); Console.WriteLine( "----" ); list.Add(oneThree, 3); Console.WriteLine( "----" ); list.Add(oneFour, 4); Console.WriteLine( "----" ); list.Add(oneFive, 5); Console.WriteLine( "----" ); list.Add(oneSix, 6); one.msg = "x" ;//* Change msg value var item = list[one];//* Exception: The given key was not present in the dictionary. Console.ReadKey(); }
The item '1' is lost somewhere in the hash map because after being stored in a specific bucket, we changed the msg
filled, causing GetHashCode()
method to produce a new hash, and pointing to a different bucket. Next time 'one' key is used, a new hashcode will lead to a different bucket and '1' item will never be found.
Summary
The default behavior, Object.GetHashCode()
works correctly for reference types.
ValueType.GetHashCode()
works only if the first field in your struct is immutable.
Both of them do not necessarily generate an efficient distribution hashcode, and their implementation must be based on immutable values.
ValueType.GetHashCode()
generates an efficient hash code only when the first field in your struct contains values across a meaningful subset of its inputs.
Comments
Dot Net Training in Chennai
Android Training in Chennai
ETL Testing training in chennai| SAP MM training in chennai | Informatica training in chennai
Salesforce Training in Chennai
Big Data Training in Chennai
Android Training in Chennai
Selenium Training in Chennai
JAVA Training in Chennai
German Classes in chennai
cloud computing training in chennai
cloud training in chennai
hosting
india hosting
india web hosting
iran web hosting
technology 11 great image sites like imgur hosting
final year project dotnet server hacking what is web hosting
macao web hosting
inplant training in chennai
inplant training in chennai
inplant training in chennai for it.php
r programming training in chennai
internship in bangalore for ece students
inplant training for mechanical engineering students
summer internships in hyderabad for cse students 2019
final year project ideas for information technology
bba internship certificate
internship in bangalore for ece
internship for cse students in hyderabad
summer training for ece students after second year
robotics courses in chennai
inplant training in chennai
inplant training in chennai
inplant training in chennai for it
Australia hosting
mexico web hosting
moldova web hosting
albania web hosting
andorra hosting
australia web hosting
denmark web hosting
slovakia web hosting
timor lestes hosting
egypt hosting
egypt web hosting
ghana hosting
iceland hosting
italy shared web hosting
jamaica web hosting
kenya hosting
kuwait web hosting
python training in chennai
internships in hyderabad for cse 2nd year students
online inplant training
internships for aeronautical engineering students
kaashiv infotech internship review
report of summer internship in c++
cse internships in hyderabad
python internship
internship for civil engineering students in chennai
robotics course in chennai
aeronautical internship in india
free internship in chennai for mechanical engineering student
architectural firms in chennai for internship
internship in coimbatore for eee
online internships for cse students
mechanical internship certificate
inplant training report
internships in hyderabad for cse
internship for mba students in chennai
internship in trichy for cse
Learn Best Youtube Marketing Course Training in Chennai
Learn Best AWS Developer Course Training in Chennai
Learn Best AWS Architect Course Training in Chennai
Learn Best AWS Cloud Practitioner Certification Course Training in Chennai
Digital Marketing Course In Kolkata
Web Design Course In Kolkata
SEO Course In Kolkata
keep it up!!!
android training in chennai
android online training in chennai
android training in bangalore
android training in hyderabad
android Training in coimbatore
android training
android online training
seo content writing tips
language similar to english
salesforce basics
star certification
hacking books
PLC Training in Chennai
Content Writing Training in Chennai
Sales Training in Chennai
React JS Training in Chennai
WordPress Training in Chennai
Google Analytics Training in Chennai
to know about python training course , use the below link.
Python Training in chennai
Python Course in chennai
Nice post and I am very happy to visit your blog. Keep doing...!
Oracle Training in Chennai
Oracle Training in Bangalore
Oracle Training in Coimbatore
Oracle DBA Training in Chennai
Java Training in Chennai
Java Course in Chennai
java interview questions and answers
selenium interview questions and answers
digital marketing interview questions and answers
hadoop interview questions and answers
oracle interview questions
data science interview questions and answers
Social Media Marketing Courses in Chennai
Social Media Marketing Online Course
Social Media Online Course
Sales Course in Chennai
เว็บแทงบอล
ufabet
ufa
พวงหรีด
โควิด
บาคาร่า
คาสิโนออนไลน์
ufabet
ufa
เว็บบอล
relx
Android and IOS operating systems. Watch online movies, Thai movies, Western movies, Asian movies, Cartoon movies, Netflix Movie, Action Movies, Comedy Movies, Crime Movies, Drama Movies, Horror Movies, Adventure Movies, Crash Movies and still have many new movies to watch. You can watch for free anytime, anywhere 24 hours a day at see4k.com.
GangManga read manga, read manga, read manga online for free, fast loading, clear images in HD quality, all titles, anywhere, anytime, on mobile, tablet, computer. Android and IOS operating systems. Read top comics, action dramas, comedy, adventure, horror and manga. New coming every day to watch many more. Can be read for free anytime anywhere 24 hours a day at gangmanga.com..
It is no secret that football is among the most popular and widely watched sports. Everybody who likes football tries to find the best platform for free soccer streaming. So, what are the best free sports streaming sites? We are going to answer this question. On this page, you can find a detailed overview of the most widespread soccer streaming websites. Keep on reading and make the best choice for you live24th.me.
ufabet , our one another option We are a direct website, not through an agent, where customers can have great confidence without deception The best of online betting sites is that our Ufa will give you the best price
หาคุณกำลังหาเกมส์ออนไลน์ที่สามารถสร้างรายได้ให้กับคุณ เรามีเกมส์แนะนำ เกมยิงปลา รูปแบบใหม่เล่นง่ายบนมือถือ คาสิโนออนไลน์ บนคอม เล่นได้ทุกอุปกรณ์รองรับทุกเครื่องมือ มีให้เลือกเล่นหลายเกมส์ เล่นได้ทั่วโลกเพราะนี้คือเกมส์ออนไลน์แบบใหม่ เกมยิงปลา
อีกทั้งเรายังให้บริการ เกมสล็อต ยิงปลา แทงบอลออนไลน์ รองรับทุกการใช้งานในอุปกรณ์ต่าง ๆ HTML5 คอมพิวเตอร์ แท็บเล็ต สมาทโฟน คาสิโนออนไลน์ และมือถือทุกรุ่น เล่นได้ตลอด 24ชม. ไม่ต้อง Downloads เกมส์ให้ยุ่งยาก ด้วยระบบที่เสถียรที่สุดในประเทศไทย
AWS certification course in Chennai
ufa
ufabet
แทงบอล
แทงบอล
แทงบอล
บาคาร่า
สล็อต
ufa
แทงบอล
how to make a paper airplane that flies far and straight | classic dart paper airplane | how to make a good paper | nakamura lock paper airplane instructions | best paper airplane design for distance and accuracy | best paper airplane design for distance and speed | Boomerang paper airplane instructions | Zazoom internet | Eagle paper airplane | how to make a paper airplane that flies far and straight step by step | 4 aerodynamic forces | Paper airplane templates for distance | the eagle paper airplane | best paper plane in the world | How to make a successful paper airplane
mua ve may bay di my
vietnam airlines quốc tế
khi nào có chuyến bay từ nhật về việt nam
vé máy bay khứ hồi từ đức về việt nam
vé máy bay giá rẻ từ Canada về Việt Nam
mua ve may bay tu han quoc ve viet nam
khách sạn cách ly ở vân đồn
vé máy bay cho chuyên gia nước ngoài
transported by a trained and vetted security chauffeur. We usually enhance our services by offering close protection services in London to boost the security of our high-valued customers.
Python training in chennai | Python course in Chennai
python training centre in chennai
best python institute in chennai
Android Training in Chennai
Android Course Online
Android Training in Bangalore
German Classes in Bangalore
German Language Course in Hyderabad
Tally Course in Tambaram
Tally course in Chennai
Appium Training Online
AWS Training in Tnagar
AWS Training in Chennai
Google Ads Training Courses In Chennai
Google Ads Online Course
Hadoop Training in Chennai