Hierarchical Tries: Trading Time for Space- 123docz.net

A binary trie, as described in detail in Section 15.4, organizes IP prefixes in a treelike fashion to identify the longest matching prefix. The internal nodes can be considered as decision points and the leaves represent the actual prefix. At each internal node, when the appropriate bit is examined, a 0 means the left branch needs to be traversed, while a 1 indicates that the right branch needs to be traversed. The path from the root to a leaf node gives the bit sequence of the prefix represented by the leaf node. Since IP prefixes can be represented as a range of addresses in a single dimension, the binary trie scheme is viewed as a solution for one-dimensional packet classification (also known as packet classification for a single field).

F I G U R E 16.2 Constructing theF1trie needed for a hierarchical trie.

F I G U R E 16.3 Constructing theF2tries needed for a hierarchical trie.

A hierarchical trie is a simple extension of the binary trie except that these tries can ac- commodate two fields. Let us illustrate hierarchical tries using the example classifier in Ta- ble 16.2 consisting of two fields, F1 andF2. First, construct a binary trie using the distinct prefixes of fieldF1. This trie is shown in Figure 16.2 and is referred to as theF1trie. For the sake of clarity, the Figure 16.2 shows the nodes shaded if the prefix represented by each node appears in fieldF1of any one of the rules. Also, the rules containing the prefix of each node are indicated next to the node.

Now for each unique prefix in fieldF1, construct a binary trie using the corresponding prefixes of fieldF2. Consider theF1prefix0∗. It occurs in rulesR2,R3, andR7and theirF2 prefixes are01∗,0∗, and10∗, respectively. We build a binary trie using these prefixes. Such a trie is called an F2 trie. Each of the five distinctF1 prefixes requires anF2 trie and they are all shown separately in Figure 16.3. In Figure 16.3, above each trie, the correspondingF1 prefixes are shown in boldface. As in theF1trie, the nodes corresponding to prefixes found in the rules are shaded.

The next logical problem is how to establish the relationship between theF1trie and the F2tries according to the rules in the classifier. An additional pointer is stored in each node of theF1trie that relates anF1prefix to itsF2trie. This pointer, called the next-trie pointer, stores the pointer to the root of the F2 trie corresponding to itsF1 prefix. The entire trie, known as the hierarchical trie, is shown in Figure 16.4 where the next-trie pointers are indicated by dashed arrows linking theF1trie to theF2tries. To briefly describe this, a hierarchical trie is

F I G U R E 16.4 Combining theF1andF2tries for building a hierarchical trie.

constructed by first building a trie for all theF1prefixes, and for eachF1prefixPanF2trie is associated that stores the rules whoseF1prefix is exactly the same asP.

Once the hierarchical trie is constructed, the search proceeds as follows. Let us assume, for the sake of discussion, that the header fields F1 and F2 are extracted from the packet and denoted byF`1 andF`2, respectively. The search starts by traversing the F1 trie to find the longest matching prefix for F`1. The algorithm then searches the correspondingF2 trie for the prefixes matchingF`2and updates the best matching rule. The search then backtracks on theF1trie and traverses all theF2tries associated with all the ancestors ofF`1. During the traversal, it continues to keep updating the best matching rule, if an appropriate one is found.

The algorithm terminates once the search backtracks to the root node and its corresponding F2 tries are examined. The last recorded best matching rule during the search provides the final solution. To better understand this, let us go through an example of search.

Example 16.2 Searching the hierarchical trie.

Consider the classification of an incoming packet with fieldsF1=000andF2=000using the hierarchical trie shown in Figure 16.4. The search proceeds from the root of theF1trie to identify the longest prefix match for 000and reaches the node corresponding to the prefix 00∗. The algorithm immediately fetches the next-trie pointer at this node and traverses itsF2 trie using 000. This leads to the matching ruleR1 and is recorded as the best matched rule so far. Now the search backtracks and moves up to the node with the prefix0∗in theF1trie and proceeds to itsF2trie. During the traversal, the search encountersR3, which matches the packet fields. Immediately,R3is compared with the best rule matched so far,R1. According to our criteria, since ruleR1occurs earlier in order in the classifier than R3,R1is retained as the best matching rule. The search again backtracks on theF1trie reaching the root node.

The next-trie pointer of the root node is fetched and itsF2trie is traversed. It finds that none of the rules matches. Subsequently, the search terminates since there are no more nodes to

backtrack andR1is declared to be the best matched rule.

Why does the search need to backtrack? The reason that the search backtracks even after locating the first matching rule is to find the best matching rule. For example, assume the order of rules in the classifier is changed such that ruleR3becomes the first entry and rule R1 is moved to the third entry. In this case, the best matching rule for the same search in Example 16.2 will beR3since it occurs earlier thanR1in the classifier. If the search has not backtracked, the best matching ruleR3would be missed. Hence, the search algorithm needs to traverse all theF2tries in the path as it backtracks on theF1trie.

The lookup cost of this scheme isO(W2)for two fields. It follows from the observation that in the worst case, it is possible to end up searchingW F2 tries, one for each bit of the longest matchingF1prefix. The cost of searching eachF2trie isO(W)and hence the overall search cost isO(W2). The amount of memory consumed isO(NW). Observe that each prefix in fieldF1requiresWnodes for theF1trie in the worst case. If all theNrules contain distinct F1prefixes, the worst-case memory required forF1trie isO(NW). EachF2prefix also requires W nodes in the worst case, and hence the memory required for storing all the F2 tries is O(NW). Therefore, the overall memory required isO(NW)+O(NW)=O(2NW), which is stillO(NW).

Hierarchical Tries: Trading Time for Space

Advanced Topics (Bonus Material on CD-ROM) 1

Bellman–Ford Algorithm and the Distance Vector Approach