The prio_tree.c code indexes vmas using 3 different indexes:
        * heap_index  = vm_pgoff + vm_size_in_pages : end_vm_pgoff
        * radix_index = vm_pgoff : start_vm_pgoff
        * size_index = vm_size_in_pages
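
As a minimal sketch of the three definitions above, the helpers below compute
the indexes from a stand-in structure. The names struct vma_sketch, its
vm_size_in_pages field, and the three helper functions are hypothetical and
only mirror the wording of this document; they are not the macros used in
prio_tree.c.

```c
/* Hypothetical stand-in for the vma fields that matter for indexing. */
struct vma_sketch {
    unsigned long vm_pgoff;         /* start offset, in pages */
    unsigned long vm_size_in_pages; /* size value as used in this document */
};

/* radix_index = vm_pgoff : start_vm_pgoff */
static unsigned long radix_index(const struct vma_sketch *v)
{
    return v->vm_pgoff;
}

/* size_index = vm_size_in_pages */
static unsigned long size_index(const struct vma_sketch *v)
{
    return v->vm_size_in_pages;
}

/* heap_index = vm_pgoff + vm_size_in_pages : end_vm_pgoff */
static unsigned long heap_index(const struct vma_sketch *v)
{
    return v->vm_pgoff + v->vm_size_in_pages;
}
```

For a vma with vm_pgoff = 2 and vm_size_in_pages = 5, these helpers give the
triple [2,5,7] in the [radix_index, size_index, heap_index] notation.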

A regular radix-priority-search-tree indexes vmas using only heap_index and
radix_index. The conditions for indexing are:
        * ->heap_index >= ->left->heap_index &&
                ->heap_index >= ->right->heap_index
        * if (->heap_index == ->left->heap_index)
                then ->radix_index < ->left->radix_index;
        * if (->heap_index == ->right->heap_index)
                then ->radix_index < ->right->radix_index;
        * nodes are hashed to left or right subtree using radix_index
          similar to a pure binary radix tree.
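
The heap-ordering conditions listed above can be sketched as a recursive
check. This is an illustration only: struct pst_node, child_ok() and
heap_order_ok() are hypothetical names, and the check covers only the
heap_index/radix_index ordering rules, not the left/right placement by
radix_index.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type carrying only the two indexes used here. */
struct pst_node {
    unsigned long radix_index;
    unsigned long heap_index;
    struct pst_node *left, *right;
};

/* Check the ordering rules between a parent and one (possibly absent) child. */
static bool child_ok(const struct pst_node *parent, const struct pst_node *child)
{
    if (child == NULL)
        return true;
    /* ->heap_index >= ->child->heap_index */
    if (parent->heap_index < child->heap_index)
        return false;
    /* On equal heap_index the parent must have the smaller radix_index. */
    if (parent->heap_index == child->heap_index &&
        parent->radix_index >= child->radix_index)
        return false;
    return true;
}

/* Verify the ordering rules for every node of the subtree rooted at n. */
static bool heap_order_ok(const struct pst_node *n)
{
    if (n == NULL)
        return true;
    return child_ok(n, n->left) && child_ok(n, n->right) &&
           heap_order_ok(n->left) && heap_order_ok(n->right);
}
```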

A regular radix-priority-search-tree helps to store and query
intervals (vmas). However, a regular radix-priority-search-tree is only
suitable for storing vmas with different radix indices (vm_pgoff).

Therefore, prio_tree.c extends the regular radix-priority-search-tree
to handle many vmas with the same vm_pgoff. Such vmas are handled in
2 different ways: 1) all vmas with the same radix _and_ heap indices are
linked using vm_set.list, 2) if there are many vmas with the same radix
index but different heap indices, and the regular radix-priority-search
tree cannot index them all, we build an overflow-sub-tree that indexes such
vmas using heap and size indices instead of heap and radix indices. For
example, in the figure below some vmas with vm_pgoff = 0 (zero) are
indexed by the regular radix-priority-search-tree whereas others are pushed
into an overflow-sub-tree. Note that all vmas in an overflow-sub-tree have
the same vm_pgoff (radix_index) and, if necessary, we build different
overflow-sub-trees to handle each possible radix_index. For example,
in the figure we have 3 overflow-sub-trees corresponding to radix indices
0, 2, and 4.

In the final tree the first few (prio_tree_root->index_bits) levels
are indexed using heap and radix indices whereas the overflow-sub-trees below
those levels (i.e. levels prio_tree_root->index_bits + 1 and higher) are
indexed using heap and size indices. In overflow-sub-trees the size_index
is used for hashing the nodes to appropriate places.
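
A rough sketch of how a left/right decision could switch from radix_index to
size_index once the radix bits are used up. The function branch_right() and
the macro SKETCH_BITS_PER_LONG are hypothetical. Testing bits from the most
significant end is an assumption of this sketch, chosen because it agrees
with the example figure in this document; using the full word width for
size_index follows the BITS_PER_LONG discussion in this document.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Decide whether a vma descends to the right child of a node at the given
 * level (root = level 0). The caller must keep
 * level < index_bits + SKETCH_BITS_PER_LONG.
 */
static bool branch_right(unsigned long radix_index, unsigned long size_index,
                         unsigned int index_bits, unsigned int level)
{
    if (level < index_bits) {
        /* Heap-and-radix part: consume radix bits, highest bit first. */
        return (radix_index >> (index_bits - 1 - level)) & 1UL;
    }
    /* Overflow-sub-tree: consume size bits from the top of the full word. */
    unsigned int k = level - index_bits;
    return (size_index >> (SKETCH_BITS_PER_LONG - 1 - k)) & 1UL;
}
```

Because small size_index values have only zero bits near the top of the
word, this scheme sends them left again and again, which is the skew
visible in the overflow-sub-trees of the example.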

Now, an example prio_tree:

  vmas are represented [radix_index, size_index, heap_index]
                 i.e., [start_vm_pgoff, vm_size_in_pages, end_vm_pgoff]

level  prio_tree_root->index_bits = 3
-----
                                                                                                _
  0                                                [0,7,7]                                      |
                                                  /       \                                     |
                                    --------------         --------------                       |     Regular
                                   /                                     \                      |  radix priority
  1                            [1,6,7]                                 [4,3,7]                  |   search tree
                              /       \                               /       \                 |
                          ----         ----                       ----         ----             |  heap-and-radix
                         /                 \                     /                 \            |      indexed
  2                  [0,6,6]             [2,5,7]             [5,2,7]             [6,1,7]        |
                    /       \           /       \           /       \           /       \       |
  3             [0,5,5]  [1,5,6]    [2,4,6]  [3,4,7]    [4,2,6]  [5,1,6]    [6,0,6]  [7,0,7]    |
                  /                   /                   /                                     _
                 /                   /                   /                                      _
  4          [0,4,4]             [2,3,5]             [4,1,5]                                    |
               /                   /                   /                                        |
  5        [0,3,3]             [2,2,4]             [4,0,4]                                      |  Overflow-sub-trees
             /                   /                                                              |
  6      [0,2,2]             [2,1,3]                                                            |    heap-and-size
           /                   /                                                                |       indexed
  7    [0,1,1]             [2,0,2]                                                              |
         /                                                                                      |
  8  [0,0,0]                                                                                    |
                                                                                                _

Note that we use prio_tree_root->index_bits to optimize the height
of the heap-and-radix indexed tree. Since prio_tree_root->index_bits is
set according to the maximum end_vm_pgoff mapped, we are sure that all
bits (in vm_pgoff) above prio_tree_root->index_bits are 0 (zero). Therefore,
we only use the first prio_tree_root->index_bits bits as radix_index.
Whenever index_bits is increased in prio_tree_expand, we shuffle the tree
to make sure that the first prio_tree_root->index_bits levels of the tree
are indexed properly using heap and radix indices.
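
To illustrate how index_bits relates to the maximum end_vm_pgoff, the sketch
below computes the smallest bit count that can hold a given maximum heap
index. The helper name index_bits_for() is hypothetical, and starting from a
minimum of 1 bit is an assumption of this sketch; it does not reproduce the
tree shuffling done in prio_tree_expand.

```c
#include <limits.h>

/*
 * Hypothetical helper: smallest bit count (at least 1, by assumption) such
 * that every bit of max_heap_index at or above that position is zero.
 */
static unsigned int index_bits_for(unsigned long max_heap_index)
{
    const unsigned int word_bits = sizeof(unsigned long) * CHAR_BIT;
    unsigned int bits = 1;

    /* Grow until no set bit remains at or above position 'bits'. */
    while (bits < word_bits && (max_heap_index >> bits) != 0)
        bits++;
    return bits;
}
```

With the maximum end_vm_pgoff of 7 from the example in this document, the
sketch yields 3, which matches prio_tree_root->index_bits = 3 there.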

We do not optimize the height of overflow-sub-trees using index_bits.
The reason is: there can be many such overflow-sub-trees and all of
them have to be shuffled whenever the index_bits increases. This may involve
walking the whole prio_tree in the prio_tree_insert->prio_tree_expand code
path, which is not desirable. Hence, we do not optimize the height of the
heap-and-size indexed overflow-sub-trees using prio_tree->index_bits.
Instead the overflow-sub-trees are indexed using full BITS_PER_LONG bits
of size_index. This may lead to skewed sub-trees because most of the
higher significant bits of the size_index are likely to be 0 (zero). In
the example above, all 3 overflow-sub-trees are skewed. This may marginally
affect the performance. However, processes rarely map many vmas with the
same start_vm_pgoff but different end_vm_pgoffs. Therefore, we normally
do not require overflow-sub-trees to index all vmas.

From the above discussion it is clear that the maximum height of
a prio_tree can be prio_tree_root->index_bits + BITS_PER_LONG.
However, in most of the common cases we do not need overflow-sub-trees,
so the tree height in the common cases will be prio_tree_root->index_bits.

It is fair to mention here that prio_tree_root->index_bits
is increased on demand; however, index_bits is not decreased when
vmas are removed from the prio_tree. That's tricky to do. Hence, it's
left as a homework problem.