Extending Two-Dimensional Solutions

When searching for a solution to a new problem, the natural tendency is to adapt or extend an existing solution to solve the problem. We extended the trie-based IP address lookup algorithms to classify rules in two dimensions. Similarly, can we extend the two-dimensional solutions to handle rules withddimensions? In this section, we investigate such extensions by categorizing them into nạve extensions and native extensions.

16.7.1 Nạve Extensions

A nạve extension, as mentioned in [47], uses any efficient two-dimensional scheme and instead of a single rule at the leaf, a set of rules is stored. Even though the choice of the dimensions can be any two fields in the classifier, typically source and destination address fields are chosen. The primary motivation behind these choices is based on the source–destination matching observation outlined in Section 16.6.2. Recall that this observation indicates that when considering only the source and destination fields, almost all packets match at most five rules and no packet matches more than 20 rules. This will reduce the number of rules searched at the leaf between 5 and 20. A linear search on this reduced set of 20 rules will definitely perform better than the nạve linear search of the entire classifier, assuming the classifier contains more than 20 rules.

F I G U R E 16.10 Extending a two-dimensional scheme for classification on arbitrary number of fields (adapted from [47]).

The nạve extension is illustrated in Figure 16.10. It uses an efficient two-dimensional scheme to find all distinct source–destination pairs(S1,D1),(S2,D2), . . . , (Sn,Dn)that match a packet header. Each distinct pair(Si,Di)is associated with a list of rules that contains(Si,Di) in source and destination fields. All the fields in the rule need not be stored in the list. Instead, only the fields other than the source and destination address need to be stored. The search first traverses all the rules associated with(S1,D1), then(S2,D2), and so on.

Use of such a data structure provides a few advantages. First, each rule is represented only once and hence the memory required is linear in the number of rules. However, some- times replication of rules might be needed to reduce the number of source–destination pairs traversed during the search. Second, since only source and destination address fields are used for the search structure, the blowup of rules because of translating port range to prefixes is eliminated.

The grid-of-tries approach discussed in Section 16.5.3 is an efficient two-dimensional scheme for classifying address prefix pairs and it can be used as the two-dimensional scheme in Figure 16.10. However, this scheme cannot be generalized ford>2. A solution proposed in [47] called extended grid of tries (EGT) uses the standard grid-of-tries scheme for two fields and extends the traversals to find all the matching rules.

16.7.2 Native Extensions

While nạve extensions use any two-dimensional scheme with a list of rules stored at the leaf, native extensions augment these schemes with additional levels of tries to accommodate multiple dimensions or fields. The general scheme is illustrated in Figure 16.11. In this section, we describe such extensions to hierarchical tries and set pruning tries.

Hierarchical tries for two fields can be extended recursively for rules withddimensions as follows:

F I G U R E 16.11 Extending a two-dimensional scheme natively for classification on arbitrary number of fields.

• Ifd=1, the hierarchical trie is just a binary trie.

• If d>1, first construct a binary trie corresponding to a field, sayF1, called the F1 trie.

ThisF1trie contains the distinct prefixesRithat belong to field F1of all the rules in the classifier.

• For each prefix Pin theF1 trie, construct recursively a(d−1)hierarchical trie, sayTP, using the rules that specifyPin the fieldF1. The node representing the prefixPin theF1 trie is linked to trieTPusing a pointer called the next-trie pointer.

A packet classification on an incoming packet header(H1,H2, . . . ,Hd)proceeds recursively as follows. First, theF1trie is traversed based on the bits inH1and finds the node corresponding to the longest matching prefix ofH1. The next-trie pointer of the node is fetched and if it is not null, the search continues recursively on the(d−1)hierarchical trie. As the search proceeds, it keeps track of the best matching rule encountered so far. Once the search terminates on this(d−1)hierarchical trie, it backtracks on theF1trie. For eachF1trie node encountered during backtracking, the search follows the next-trie pointer, if it is not null.

Then it recursively traverses the(d−1)hierarchical trie stored at that node and updates the best matched rule, as more rules are encountered, if needed. This search algorithm is some- times referred to as the backtracking search algorithms due to its recursive nature. It is left as an exercise to the reader to infer that the time complexity forddimensions isO(Wd).

Similar to hierarchical tries, set pruning tries can be extended to accommodate d dimensions. Let us start with a d-dimensional hierarchical trie consisting of an F1 trie and its(d−1)-dimensional hierarchical tries associated with the nodes. Now consider the set of nodes that represents prefixes shorter than a prefixPin theF1trie. Call this set of nodesT.

In ad-dimensional set pruning trie, the rules in the(d−1)tries linked to all nodes inTare replicated to the(d−1)-trie linked to prefixP. The replication of prefixes is carried out in a recursive fashion for the(d−2)-tries and so on.

The search algorithm for a packet with the header(H1,H2, . . . ,Hd)needs to traverse the F1trie only once to find the longest matching prefix node forH1. If its next-trie pointer is not null, the search follows theF2trie and finds the longest matching prefix forH2. The search continues in this fashion forF3tries and so on for alld-fields. Since the rules are replicated in a set pruning trie, the search algorithm ensures that all the matching rules will be encountered in its path. Ford-fields, it can be shown that the memory required isO(dWNd). You can now see how the memory grows exponentially with the addition of each field. However, the time complexity for search isO(dW), which is linear in the number of fields, unlike hierarchical tries.

Advanced Topics (Bonus Material on CD-ROM) 1

Bellman–Ford Algorithm and the Distance Vector Approach