Language Knowledge - JavaHashMap Class Depth Resolution

HashMap It is also more commonly used Java Collected framework classes， This category involves more knowledge， Includes arrays、 linked list、 Red and Black Trees, etc.， And some efficient and clever calculations， And this class has been improved with several versions， There are some differences between the different versions， Here it's all based on JDK8 source code (computing)。 The usual source code translation， See if you can answer a few of the following questions？（ Some parts are really hard to translate， Just look at it, people.）

HashMap Source Code Translation

** Question 1: What is the understanding of initCapacity, size, threshold, loadFactor, bin in HashMap?**

HashMap Stored are key-value pairs， But it's not simply one carrot and one hole。

HashMap Introduction to the noun

1、 (located) at HashMap in the reference constructor of， We specify initCapacity， but will take the number greater than or equal to this 2 power of a number as table Initial capacity of the array， use tableSizeFor(int) approach， as if tableSizeFor(10) = 16（2 of 4 cube (third power, math.)），tableSizeFor(20) = 32（2 of 5 cube (third power, math.)）， that is say table The length of an array is always 2 power of a number。

2. size Records the number of key-value pairs stored in the HashMap.

3. threshold is used to store the maximum number of key-value pairs that can be stored under the current capacity, or is the threshold value for HashMap expansion, when **size >= threshold** When the HashMap is expanded, the**threshold = capacity（table Length of the array） * loadFactor**， But when specifying initCapacity Not yet. put When the key pair，threshold Temporarily equal to capacity values。

4、loadFactor is the load factor， The smaller the load factor， The more space is wasted on arrays， The more uniform the distribution of key-value pairs， The faster you find it， In turn the larger the load factor， The higher the array space utilization， The more uneven the distribution of key-value pairs， The slower the search， So depending on the actual situation， Making choices in time and space。

5、bin (located) at HashMap The comments of the multiple occurrences of， But the word doesn't translate well，table The elements stored in each location of the array（ There may be more than one.） configure (computing) bin， Each location of the array can be seen as a container or a bucket， The container holds one or more elements。

owing to HashMap Only open access to size Method of parameters， So if you want to see the values of other parameters， The usual methods won't work.， You can use reflection to get the values of the above parameters， Write code to verify it， my Test Code。

** issues 2：HashMap How the data is stored internally？**

HashMap internal structure

HashMap internally.** Arrays + Chained Tables + Red Black Trees** implemented to determine the position in the table array for each Node, calculated as **index = (n - 1) & hash**（n because of table Length of the array）， Here. hash is through key of hashCode Calculated， The calculation formula is **hash = key.hashCode ^ (key.hashCode>>>16)**。Node The key-value pairs are stored in the key harmony value and the calculated hash happen to， Still saving the next one Node References next（ If there is no next Node，next = null）， An array position corresponds to a one-way list。 When the list length exceeds the list tree（ Turn the list into a tree structure） the threshold for 8 time， The list is converted to a red and black tree， to increase the speed of the lookup。

** issues 3：HashMap The method of expansion？**

while HashMap in **size >= threshold** time，HashMap It's going to have to be expanded。HashMap same ArrayList The same， Internally, there are arrays of dynamic growth，HashMap Expanded capacity for use resize() approach， compute table The new capacity of the array and Node The new position in the new array， Copy the values in the old array into the new array， This enables automatic expansion。

1、 When empty HashMap When an instance adds an element， will be in default capacity 16 because of table Length expansion of arrays， this time threshold = 16 * 0.75 = 12。

2、 When the not-empty HashMap Instance adds new elements when the array is not large enough， It will take the old capacity of**2 times (multiplier)** Expand capacity， Of course, expansion is also a size limit， The new capacity after expansion should be less than or equal to the specified maximum capacity， Create a new one with the new capacity table array， Then there are the array elements Node copied， compute Node The method of location is **index = (n-1) & hash**， The benefit of this calculation is that it is，Node The position in the new array either remains the same， Either the original position plus the capacity value of the old array， Positions in the new array are predictable（ Regular）， And on the list Node The order of will not change（JDK7 middle HashMap the calculation method of will change Node Order）。

Expand the array node movement

** Question 4: What are the details of the HashMap put method?**

put What is called inside the method putVal() approach， So yes put The analysis of the method is also right putVal Analysis of methods， The whole process is complicated， The flowchart is below：

put Method flowchart

1、 Judge the key value pair array table Whether empty or null， If it is a call resize() Method to expand the capacity

2、 Based on the key value key compute hash worth to the index of the array to be inserted index， in case table[index]==null， instructions There's no node at this location yet.， Add a new node directly to this location， turn 7， in case table[index] Not empty， turn 3；

3. determine whether the first node of table[index] is the same as key, if the same directly overwrite value, otherwise turn to 4, where the same refers to key.hashCode and the equals method of key.

4、 judgment table[index] Whether it is treeNode node， That is, it is table[index] Whether the location is stored is a red and black tree， If it's a red and black tree， Key value pairs are inserted directly into the tree， Otherwise turn 5；

5. traverse table[index], determine whether the length of the chain table is greater than 8, if greater than 8, convert the chain table into a red-black tree, perform the insertion operation in the red-black tree, otherwise perform the insertion operation of the chain table.

6、 If found during the traversal key Direct coverage already exists value can then (do sth)；

7、 After successful insertion， Determine the number of key-value pairs that actually exist size Is the maximum capacity exceeded threshold， If more than， Expand capacity resize。

** issues 5：HashMap Why is the length of the array2 power of a number？**

The main reason is to make it easier to compute the index of the Node's position in the array and to improve the speed of computation. Ideally, the table array in a HashMap holds only one Node, i.e., there are no hash collisions, so that access efficiency is the highest. But the reality is that collisions are hard to avoid, and what we want to do is to distribute the data as evenly as possible in the table array, and the convention is to use **hash % length = index** calculated Node The position in the array， This formula can be replaced by **index = hash - (hash / length) * length**， But this calculation is complicated， We humans use decimal， And computers use binary，2 The powers of are expressed in binary in a very regular way， as if（16）10 = （10000）2， Even more ingenious is when length = 2 the power of，**hash % length = hash & (length - 1)** The bitwise operations are very efficient in computers, and the length - 1 here is also very regular, e.g. (15)10 = (01111)2, any bitwise operation with hash value and 01111 will result in 0000001111 (015), which also happens to be the index of the array. And when the HashMap is expanded, the table array is twice as long, or a power of 2, so that the index of the Node's position in the array can always be calculated quickly.

** Question 6: How do several Map collection classes compare?**

Map The collection class | key | value | Super | JDK | instructions |
---|---|---|---|---|---|

Hashtable | Not allowed to be null | Not allowed to be null | Dictionary | 1.0 | Thread-safe（ obsolete） |

ConcurrentMap | Not allowed to be null | Not allowed to be null | AbstractMap | 1.5 | Thread safety (JDK1.8 uses lock segmentation and CAS, which also performs well) |

TreeMap | Not allowed to be null | Allowed to be null | AbstractMap | 1.2 | The thread is not secure（ Orderly） |

HashMap | Allowed to be null | Allowed to be null | AbstractMap | 1.2 | The thread is not secure（resize There is a dead chain problem、 Easy to lose data， Do not use in multithreaded） |