Lecture Notes for SE 598, Data Structures and Algorithms

9 August 2000, Implementing Maps and Multimaps


  1. use a linked-list - binary tree

    1. problem - binary trees can get out of balance; insertion order is important

    2. solution - fix the binary tree if it gets out of balance

    3. problem - rebalancing a binary tree is expensive on each insertion

    4. solutions - don't rebalance perfectly, don't rebalance after every insertion

      1. trade-off - the less perfect the tree, the less often the need to rebalance

    5. examples

      1. avl trees - balance is near perfect (+- 1), rebalance on every insertion

      2. red-black trees - balance is so-so (factor of 2), rebalance can be much less frequent

      3. stl uses red-black trees

  2. use a vector - hash table

    1. vectors are fast to access but slow to find

    2. if a value can be associated with its index, then find would be fast

      1. suppose "alphonse" could be associated with 1

      2. but, the association must be fast

    3. such vectors are known as hash tables

    4. hash functions associate arbitrary values with indices

      1. hash functions turn arbitrary values into indices

      2. fast, constant (almost) associations

      3. index = 0;
        for (int i = sizeof(str) - 1; i >= 0; i--)
          index += str[i] mod v.size()
        

    5. what's the problem - there are many strings but (relatively) few indices

      1. collisions

    6. dealing with collisions

      1. use the first available free slot in the hash table - open addressing

        1. typedef pair hash_table_slot

        2. insert

        3. find

        4. advantages - simple

        5. disadvantages - fixed size, potentially slow lookup

      2. have each hash-table slot be the head of a chain - closed addressing or chaining

        1. each hash-table slot (bucket) points to a linked list of values

        2. insert

        3. find

        4. advantage - simple, arbitrary size

        5. disadvantage - potentially slow lookup

    7. the stl may have hashed associative containers too

      1. advantages - fast

      2. problems - slow, no more sorting, difficult to use right


This page last modified on 9 August 2000.