I. INTRODUCTION
Independently of a router's level in the Internet hierarchy (core, edge, or access platform), a function that must be performed in the most efficient manner is packet forwarding: in other words, determining routing, security, and QoS policies for each incoming packet based on information from the packet itself. A prime example is the Internet Protocol's basic routing function (IP-lookup), which determines the next network hop for each incoming packet. Its complexity stems from wildcards in the routing tables and from the Longest Prefix Match (LPM) algorithm mandated by Classless Inter-Domain Routing (CIDR).
Since the advent of CIDR in 1993, IP routes have been identified by a <route prefix, prefix length> pair, where the prefix length is between 1 and 32 bits. For every incoming packet, a search must be performed in the router's forwarding table to determine the packet's next network hop. The search is decomposed into two steps. First, we find the set of routes with prefixes that match the beginning of the incoming packet's IP destination address. Then, among this set of routes, we select the one with the longest prefix. This identifies the next network hop.
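The two-step search can be illustrated with a minimal software sketch. It is only a reference model of LPM semantics, not the hardware scheme proposed in this paper; the function name `longest_prefix_match` and the route-list layout are assumptions made for illustration.

```python
import ipaddress

def longest_prefix_match(routes, dst):
    """routes: list of (CIDR string, next hop). Returns the next hop or None."""
    addr = int(ipaddress.IPv4Address(dst))
    # Step 1: collect every route whose prefix matches the leading bits of dst.
    matches = []
    for prefix, next_hop in routes:
        net = ipaddress.IPv4Network(prefix)
        if addr & int(net.netmask) == int(net.network_address):
            matches.append((net.prefixlen, next_hop))
    if not matches:
        return None
    # Step 2: among the matching routes, select the one with the longest prefix.
    return max(matches, key=lambda m: m[0])[1]
```

For a table holding 10.0.0.0/8, 10.1.0.0/16 and 10.1.2.0/24, the address 10.1.2.3 matches all three routes and the /24 route is selected.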
Georgios Keramidas

What makes IP-lookup an interesting problem is that it must be performed increasingly fast on increasingly large routing tables. One direction to tackle this problem concentrates on partitioning routing tables into optimized data structures, often tries (digital trees), so as to reduce as much as possible the average number of accesses needed to perform LPM [2, 17, 19, 26]. Each lookup, however, requires several (four to six) dependent, serialized memory accesses, stressing conventional memory architectures to the limit. Memory latency, not bandwidth, is the limiting factor with these approaches. Significant effort has been devoted to solving the latency problem either by using fast RAM (e.g., Reduced Latency DRAM, RLDRAM) or by replicating the routing table over several devices so that searches can run in parallel to attain the necessary speeds [3]. The first solution can only mitigate the problem, and the second solution drives up system costs (due to bus replication) and further complicates routing table updates. In all cases the solution is a trade-off among search speed, update speed, and memory size.
TCAMs: A fruitful approach to circumvent latency restrictions is through parallelism: searching all the routes simultaneously. Content Addressable Memories (CAMs) perform exactly this fully-parallel search. To handle route prefixes, Ternary CAMs (TCAMs), which have the capability to represent wildcards, are used. TCAM vendors (e.g., SiberCore [25]) currently offer a large array of TCAM products used in IP-lookup and packet classification.
In a TCAM, IP-lookup is performed by storing routing table entries in order of decreasing prefix length. TCAMs automatically report the first entry among all the entries that match the incoming packet's destination address (topmost match).
The need to maintain a sorted table in a TCAM makes incremental updates a difficult problem. If N is the total number of prefixes to be stored in an M-entry TCAM, naive addition of a new entry can result in O(N) moves. Significant effort has been devoted to addressing this problem [9, 24]; however, all the proposed algorithms require an external entity to manage and partition the routing table.
In addition to the update problems, two other major drawbacks plague TCAMs: high cost/density ratio and high power consumption. [...] ways analogous to set-associative memories, but in this paper we argue for pure set-associative memory structures for IP-lookup: many more "blocks" with less associativity, and separation of the comparators from the storage array. In TCAMs, blocking further complicates routing table management, requiring not only correct sorting but also correct partitioning of the routing tables. Routing table updates also become more complicated. In addition, external logic to select the blocks to be searched is necessary. All these factors further increase the distance between our proposal and TCAMs in terms of ease of use, while still failing to reduce power consumption below that of a straightforward set-associative array.
More seriously, blocked TCAMs can only reduce average power consumption. Since the main constraint in our context is the fixed power budget of a linecard, a reduction of average power consumption is of limited value: maximum power consumption still matters. As we show in this paper, the maximum power consumption of IPStash is less than the power consumption of a comparable blocked TCAM with full power management. Compared to our earlier proposal [SI for a set-associative memory for IP-lookup: i) we have resolved its major shortcoming, which was the significant expansion of the route prefixes (which resulted in expanded routing tables twice their original size); ii) we introduce a new power-management technique leading to new levels of power-consumption efficiency; and iii) while our earlier work concerned a specific point in the design space of set-associative memories for IP-lookup, in this paper we systematically explore a much larger space of possible solutions.
IPStash: To [...]
Structure of this paper: Section II presents the IPStash architecture and our implementation of the LPM algorithm. In Section III we show that IP-lookup needs associativity depending on the routing table size. Section IV presents other features of the architecture. Section V provides simulation results for power consumption and Section VI discusses related work. Finally, Section VII offers our conclusions.
II. IPSTASH ARCHITECTURE
The main idea of IPStash is to use a set-associative memory structure to store routing tables. IPStash functions and looks like a set-associative cache. However, in contrast to a cache, which holds a small part of the data set, IPStash is intended to hold a routing table in its entirety. In other words, it is the main storage for the routing table, not a cache for it. In this section we describe how routing tables can be inserted in a set-associative structure and how LPM is performed in this case.
A. IPStash Basics
To insert routing prefixes in a set-associative structure (as opposed to a TCAM) we first need to define an index. Routing prefixes can be of any length but in reality there are no prefixes [...]
Of course, to perform LPM we need to select the longest of all the matching prefixes in a set. To do this we need another level of length arbitration after the tag match that gives us the longest matching prefix. Again the prefix length, stored with the matching tags, is used in comparisons to select the longest prefix. If the prefix length is stored as a binary value, it is expanded into a full bit mask. The maximum length can be found by comparing the masks with a combinatorial circuit or by using a length arbitration bus with as many lines as the maximum prefix length. Arbitration works as follows: when multiple tags match simultaneously, each asserts the wire that corresponds to its prefix length. Every matching tag sees the others' lengths, and the self-proclaimed winner (the tag with the longest asserted length) outputs its result on the output bus. All other matching tags withdraw.
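The arbitration step can be modeled in software as follows. This is a behavioral sketch only: the bus is represented as an integer with one bit per prefix length, and the names `length_to_mask` and `arbitrate` are hypothetical, not taken from the paper.

```python
def length_to_mask(length, width=32):
    """Expand a binary prefix length into a full bit mask (ones on the left)."""
    return ((1 << length) - 1) << (width - length)

def arbitrate(matching_entries):
    """matching_entries: list of (prefix_length, payload) whose tags matched."""
    bus = 0
    for length, _ in matching_entries:
        bus |= 1 << length              # each matching tag asserts its length wire
    if bus == 0:
        return None                     # no tag matched in this set
    winner_length = bus.bit_length() - 1   # highest asserted wire on the bus
    for length, payload in matching_entries:
        if length == winner_length:     # the self-proclaimed winner drives the output
            return payload
```

Because a longer prefix always produces a numerically larger mask, comparing the expanded masks selects the same winner as reading the highest asserted wire.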
As mentioned above, an 8-bit index, and especially one drawn from the MSB bits, would be disastrous for the associativity requirements of a large routing table. Conflict chains would be unacceptably long.
In the next subsections we show two things. First, how we can increase the index to address a larger number of sets. Second, how we can partition the routing table into classes, each with its own index, to dramatically increase the efficiency of storing a routing table in a set-associative array. Both of these techniques are driven by the structure of the routing tables, which we analyze next.
Fig. 2 shows the distribution of prefix lengths for four tables taken from [22] and from different time periods (from 1999 to 2003). We can easily draw some general conclusions, also noted by other researchers, from the graphs in Fig. 2: the distribution is the same for all tables regardless of their size and creation date. With respect to the actual prefix lengths: 24-bit prefixes comprise about 60% of the tables; prefixes longer than 24 bits are very few (about 1%); there are no prefixes shorter than 8 bits; the bulk (about 97%) of the prefixes have lengths between 16 and 24 bits.
C. Prefix expansion and index selection
A straightforward method to increase the index is to use a controlled prefix expansion technique to expand prefixes to larger lengths. For example, we can expand prefixes of lengths 8, 9, 10, and 11 all to length 12, thus having the opportunity to use up to 12 bits as index. Controlled prefix expansion creates comparatively few additional expanded prefixes at these short lengths simply because there are very few short prefixes to begin with. This, however, is not true for all prefix lengths, as can be seen in Fig. 2. As we expand prefixes to larger and larger lengths, routing-table inflation becomes a significant problem.
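As a hedged illustration of the expansion step, the sketch below enumerates the expanded prefixes for a single route. The function name `expand_prefix` is hypothetical, and the sketch deliberately omits the conflict handling a complete implementation needs when an expanded prefix coincides with an existing longer prefix.

```python
def expand_prefix(value, length, target):
    """Expand a prefix of `length` bits (given as an int) to `target` bits.

    Returns a list of (expanded_value, expanded_length) pairs.
    """
    if length >= target:
        return [(value, length)]        # already long enough, nothing to expand
    extra = target - length             # number of wildcard bits to enumerate
    base = value << extra
    return [(base | i, target) for i in range(1 << extra)]
```

An 8-bit prefix expanded to 12 bits yields 2^4 = 16 entries, while an 11-bit prefix yields only 2, which is why inflation grows quickly with the distance between the original and the target length.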
Unfortunately, it is desirable to expand prefixes to large lengths in order to gain access to the "best" indexing bits. Fig. 3 shows the bit entropy for prefixes of length 16 to 20 (upper graph) and 21 to 24 (lower graph). The y-axis is the prefix length, the x-axis represents the bits (up to bit 24), and the z-axis is the entropy of the bits. Bit entropy is the bit's apparent randomness: how unbiased it seems towards one or zero. The higher the entropy, the better the bit for indexing. Indexing with high-entropy bits helps to spread references more evenly across the memory, minimizing the associativity requirements. MSB bits have very low entropy and are really unsuitable for indexing. Regardless of prefix length, the best bits for indexing start from bit 6 and reach the prefixes' maximum length.
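One way to quantify this notion is the Shannon entropy of each bit position over all prefixes of a given length. The text does not state the exact formula used for Fig. 3, so the binary-entropy definition below is an assumption, as are the function name `bit_entropy` and the convention that bit 1 is the MSB.

```python
import math

def bit_entropy(prefixes, bit, width=32):
    """Entropy (0.0 to 1.0) of one bit position over a list of prefixes.

    prefixes: left-aligned `width`-bit integers; bit: position, 1 = MSB.
    """
    values = [(p >> (width - bit)) & 1 for p in prefixes]
    p1 = sum(values) / len(values)      # fraction of prefixes with this bit set
    if p1 in (0.0, 1.0):
        return 0.0                      # fully biased bit carries no information
    return -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))
```

A bit that is set in exactly half of the prefixes scores 1.0 and is a good index candidate; a bit that is constant across the table scores 0.0.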
The above analysis suggests expansion of prefixes to large lengths and selection of the right-most (non-wildcard) bits as index: prefix expansion creates high-entropy bits. Even if we could accept routing-table inflation, prefix expansion alone is not sufficient for efficient storage of a routing table in a set-associative structure: even with a very good index, a single hashing of the routing [...]
[...] us to treat each prefix length independently of all others. Thus, we can insert, for example, prefixes of length 32 into IPStash using the most appropriate index; similarly, we insert prefixes of length 31, 30, 29, ..., using again the most appropriate index from the available non-wildcard bits. To perform LPM we start by searching the longest prefixes, using their corresponding index to retrieve them. We repeat with progressively shorter prefix lengths until we find the first match, which is the LPM.
But iterating over 24 prefix lengths (lengths 32 to 8) is impractical. First, it would make some searches unacceptably slow if we had to try several different lengths until we found a match. Second, it would introduce great variability in the hit latency, which is clearly undesirable in a router/network processor environment.
Our solution is to partition prefixes into a small set of classes and iterate over the classes. For example, we can partition the routing table into the following classes:
Class 1 contains all the prefixes from 21 to 32 bits. Any 12 (or any other number if we choose so) of the first 21 bits can be used for indexing; bits above 21 are wildcard bits.
Class 2 contains all the prefixes from 17 to 20 bits. Any 12 bits of the first 17 can be used as an index, but bits 18 to 20 contain wildcards.
Class 3 contains all the prefixes from 8 to 16 bits. Only this class (the last class, containing the shortest prefixes) requires prefix expansion of the shorter prefixes to guarantee the availability of the index bits.
Class partitioning is nothing more than a definition of the index (and consequently of the tag) for a set of prefix lengths. It allows us to re-hash a routing table multiple times, each hash using an optimal index. Fig. 4 shows the associativity requirements for 8 routing tables when they are single-hashed (single class), doubly-hashed (2 classes), and triply-hashed (3 classes). The benefit from more than 3 classes is small; we have not seen significant improvement going from 3 to 4 classes. The optimal class partitioning depends on the actual routing table to be stored and can change over time. Thus, IPStash is configurable with respect to the classes used to store and access a routing table.
To put it all together, Fig. 5 shows how the index and tag are extracted from a prefix belonging to some class. The class boundaries define the range of prefix lengths that belong to the class. The lower class boundary guarantees that no bit below that boundary can be a wildcard bit for the prefixes belonging to the specific class. Thus, the index can always be safely chosen from bits below the lower class boundary. Any bits below the lower class boundary besides the index bits form the fixed tag of the prefix, while non-wildcard bits above the lower class boundary form the variable part of the prefix tag. The length of the prefix is used to form a mask that controls exactly how many bits of the tag participate in the tag match. This mask is stored with the tag in each entry.
To insert a prefix in IPStash we first assign it to a class, extract its index, and form its tag by concatenating its fixed tag parts with the variable part. At the same time we form the mask, stored with the tag, that controls the tag match (Fig. 6).
To perform LPM in IPStash we iteratively search all classes until we find a match. For each class we take the incoming IP address, extract the class index, and form the corresponding tag to be compared against the stored prefix tags (Fig. 7). The IP address tag is a full tag containing all the IP address bits, but when it is compared to the stored prefix tags the corresponding masks control which bits participate in the comparison and which bits are ignored (Fig. 7).
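The insertion and lookup procedures can be sketched as a small software model, under several stated assumptions: the three example classes above with a 12-bit index, the index taken from the right-most bits below each lower class boundary, expansion to 12 bits in the last class, and unlimited associativity (a Python list per set). The class name `IPStashSketch` and its helpers are hypothetical. For brevity the model stores the whole prefix and its length instead of separate fixed and variable tag parts.

```python
INDEX_BITS = 12
# (lower boundary, upper boundary, last index bit), searched in this order.
CLASSES = [(21, 32, 21), (17, 20, 17), (8, 16, 12)]

def top_bits(addr, n):
    """Return the n most significant bits of a 32-bit address."""
    return addr >> (32 - n)

def index_of(addr, index_end):
    """Take the INDEX_BITS bits that end at bit position index_end (1 = MSB)."""
    return top_bits(addr, index_end) & ((1 << INDEX_BITS) - 1)

class IPStashSketch:
    def __init__(self):
        self.sets = {}  # (class number, index) -> list of entries

    def insert(self, prefix, length, next_hop):
        """prefix: 32-bit int with all wildcard bits set to zero."""
        for cls, (low, high, index_end) in enumerate(CLASSES):
            if low <= length <= high:
                break
        else:
            raise ValueError("prefix length outside all classes")
        stored_len = max(length, index_end)   # expansion happens only in the last class
        for i in range(1 << (stored_len - length)):
            value = prefix | (i << (32 - stored_len))
            entry = (top_bits(value, stored_len), stored_len, length, next_hop)
            self.sets.setdefault((cls, index_of(value, index_end)), []).append(entry)

    def lookup(self, addr):
        for cls, (_, _, index_end) in enumerate(CLASSES):
            candidates = self.sets.get((cls, index_of(addr, index_end)), [])
            # Masked tag match: compare only the non-wildcard bits of each entry.
            hits = [e for e in candidates if top_bits(addr, e[1]) == e[0]]
            if hits:
                # Length arbitration on the original (pre-expansion) length.
                return max(hits, key=lambda e: e[2])[3]
        return None
```

Keeping the original length alongside each expanded entry is a design choice of this sketch: it lets a genuine 12-bit route win over an 8-bit route that was expanded to the same 12 bits.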
E. Skewed associativity
Although there are significant gains going from a single hash (single class) of the routing table to two and three hashes (2 and 3 classes), possibly accompanied by a prefix expansion to secure an index for the shortest class, Fig. 4 shows that there are still considerable associativity requirements even for triple hashing. Our second proposal for increasing hashing effectiveness and decreasing associativity requirements, orthogonal to class partitioning, is based on Seznec's idea of skewed associativity [23]. Skewed associativity can be applied in IPStash with great success. The basic idea of skewed associativity is to use a different indexing function for each of the set-associative ways. Thus, items that in a standard cache would compete for a place in the same set because of identical indexing across the ways map, in a skewed-associative cache, to different sets. One way to think about skewed associativity is to view it as an additional increase of the entropy of the system through the introduction of additional randomness in the distribution of the items in the cache.
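A minimal sketch of per-way index functions follows. The particular skewing function (XOR of the conventional index with a rotated copy of the neighbouring bits) is an assumption chosen for illustration, and the names `rotl` and `skewed_index` are hypothetical; the text only establishes that each way uses a different indexing function.

```python
INDEX_BITS = 12
MASK = (1 << INDEX_BITS) - 1

def rotl(x, r):
    """Rotate an INDEX_BITS-wide value left by r positions."""
    r %= INDEX_BITS
    return ((x << r) | (x >> (INDEX_BITS - r))) & MASK

def skewed_index(key, way):
    """key: non-wildcard prefix bits as an int; way: associative way number."""
    low = key & MASK                    # the conventional index bits
    high = (key >> INDEX_BITS) & MASK   # neighbouring bits, normally part of the tag
    if way == 0:
        return low                      # way 0 keeps the standard index
    return low ^ rotl(high, way)        # other ways mix in rotated tag bits
```

Two keys that share the conventional index collide in way 0 but are usually sent to different sets in the other ways, which is what shortens the conflict chains.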
The left upper graph of Fig. 8 shows how RT5 is loaded into an "unlimited-associativity" IPStash using 12 bits for index and the three-class approach, without restriction to the number [...] It is the order introduced by the hashing function. The effect of skewing (shown in the right graph of Fig. 8 [...]
The effect of skewed associativity is shown in Fig. 9, which compares the associativity requirements with and without skewing for 1, 2, and 3 classes for all 8 routing tables. The benefits are significant across all cases, comparable and additive to the benefits from multiple hashing. A distinct effect of skewing is to "linearize" the required associativity curves and bring them very close to the best possible outcome, as is further analyzed in Section III.
III. DETAILED ANALYSIS OF MEMORY REQUIREMENTS
Our approach to assessing memory overhead in IPStash is to exhaustively study the choices for different indices and class configurations per index. We examine several different index lengths from 8 to 16 bits. For a given index, we select a class configuration which, for simplicity, is common to all 8 routing tables we use. We have also examined class configurations tailored individually to each routing table, which give us a small additional benefit. Embedded in the class configuration is the prefix expansion in the shortest class. Fig. 10 shows the normalized memory overhead (lower part) and required associativity (upper part) for all the tables used in this paper. In all cases, the class configuration that minimizes the average memory overhead of the 8 routing tables is shown. Detailed results are presented in Table I, which shows the effect of the index on the number of expanded prefixes and on the memory overhead (for both skewed and non-skewed cases). Fig. 10 and Table I show that as the number of index bits grows, memory overhead increases and the required associativity decreases. In both cases, the trends are exponential.
On one hand we are seeking low associativity for an efficient implementation of IPStash. On the other, increasing the index to decrease associativity increases both capacity inefficiencies of IPStash: we have to store larger expanded tables, and the empty slots left in sets correspond to a larger percentage of wasted memory at low associativity. This is clear in Fig. 11 [...] required associativity (skewed case) to the initial size for our eight routing tables. As we can see, this relationship is remarkably linear (which implies good scalability with size) and holds for all indices, albeit at different slopes. The slope of a curve in this graph ("slope") is a measure of the hashing efficiency; the optimal slope ("opt") for each index is 1/sets. The ratio of the slope to its optimal is a measure of its closeness to the optimal.
The most important observation here is that although the slopes of the curves are quite near the theoretical optimal slopes in each case, small indices are closer to the optimal slopes than longer indices, confirming increasing inefficiency with index length.
To conclude, the choice of the index must strike a fine balance between the memory overhead to store a routing table and its associativity requirements. Both memory size and associativity negatively affect the power consumption and performance of an actual IPStash device.
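The balance can be made concrete with a short sketch. It assumes that a perfectly even hash spreads N entries over 2^b sets, so the lower bound on associativity is N / 2^b rounded up; the function names are hypothetical, and `required_associativity` models the non-skewed case only.

```python
import math
from collections import Counter

def optimal_associativity(num_entries, index_bits):
    """Lower bound on associativity if entries spread perfectly over all sets."""
    return math.ceil(num_entries / (1 << index_bits))

def required_associativity(indices):
    """indices: the set index of every stored entry (non-skewed placement).

    The most crowded set determines how many ways the array must provide.
    """
    return max(Counter(indices).values())
```

For 512K entries, a 12-bit index gives a lower bound of 128 ways and a 13-bit index gives 64: each extra index bit halves the bound while the memory overhead grows.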
The above analysis pertains to information (memory overhead, required associativity) that we extract solely from routing tables. The rest of the paper deals with the analysis of architectural trade-offs in the context of designing a memory structure optimized for IP-lookup. This is the topic of Section V, where we use the Cacti tool to study this problem.
To increase capacity in IPStash we add more associativity. This stems from the linear relation between routing table size and required associativity. We extended Cacti to handle more than 32 ways, but as of yet we are unable to validate these numbers.
Thus, we use Cacti's ability to simulate multi-banked caches to increase size and associativity at the same time. In Cacti, multiple banks are accessed in parallel and are intended mainly as an alternative to multiple ports. We use them, however, to simulate higher capacity and associativity.
Our basis for comparison is the Ultra-18 (18 Mbit, 512K IPv4 entries) TCAM from SiberCore [25]. Ultra-18 is presently the top-of-the-line TCAM. Table III shows the power characteristics of the Ultra-18. Since in our study we cannot scale IPStash arbitrarily (because of Cacti's powers-of-two restrictions), we chose to scale the TCAMs instead. Detailed characteristics presented in Table II allow us to project Ultra-18 power consumption for specific capacities. Our approach is to use the IPStash memory overhead factors presented in Table I to scale TCAM capacity. For example, a 512K-entry IPStash with a 12-bit index has a memory overhead of 1.23, meaning that it can store a routing table of about 512K/1.23 = 416K entries. Thus, we compare against a TCAM with the same scaled capacity, i.e., a TCAM with 416K entries.
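The capacity scaling is simple arithmetic, shown here as a small sketch. The function name `scaled_capacity_k` is hypothetical; the only inputs are the device size and the memory overhead factor quoted in the text for a 12-bit index.

```python
def scaled_capacity_k(device_entries_k, memory_overhead):
    """Routing-table entries (in K) that fit in a device of the given size.

    memory_overhead is the ratio of stored entries to original routes.
    """
    return device_entries_k / memory_overhead

# A 512K-entry IPStash with overhead 1.23 holds about 416K original routes,
# so the comparison point is a TCAM scaled to 416K entries.
tcam_entries_k = int(scaled_capacity_k(512, 1.23))
```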
We use Cacti to study various configurations (adjusting associativity, number of sets, and number of banks) of a 512K-entry IPStash. An entry in our case contains the maximum number of prefix bits (aside from index bits) plus the corresponding mask (e.g., for a 12-bit index, 20+20 = 40 bits for tag), and data payload (8-bit port number). Table III shows power and latency results for some of the possible configurations where the associativity (of each bank) is fixed at 32. Power results are normalized for the same throughput, e.g., 100 Million Searches Per Second (Msps), a common performance target for many TCAMs. We restrict solutions to those with a memory overhead less than 2 (Table I). The reasoning is that TCAMs also have a hidden memory overhead to support wildcards, which is exactly 2.
Two more changes are needed in Cacti to simulate IPStash. The first is the extra wired-OR bus required for length arbitration. The arbitration bus adds both latency and power to each access. Using Cacti's estimates we compute the overhead to be less than 0.4 Watts (at 100 Msps). Our estimates for the arbitration bus are based on the power and latency of the cache's bitlines. We consider length arbitration as a separate pipeline stage in IPStash which, however, does not affect cycle time: address decoders define cycle time in all cases. The second change concerns the support for skewed associativity. Skewed index construction (rotations and XORs) introduces negligible latency and power consumption to the design. However, a skewed-associative IPStash requires separate decoders for the wordlines, something Cacti does not do on its own. We compute the latency and power overhead of the separate decoders in all cases. We conclude that the skewed-associative IPStash is slightly faster than a standard IPStash while consuming about the same power. The reason is that the decoders required in the skewed-associative case are faster than the monolithic decoder employed in the standard case. At the same time, although each of the small decoders consumes less power than the original monolithic decoder, all of them together consume slightly more power.
ACCESS CYCLE
With our modifications, Cacti shows that a 512K-entry, 32-way IPStash easily exceeds 100 Msps. In any configuration, pipeline cycle time is on the order of 2 to 5 ns. Power consumption at 100 Msps starts at 2.13 W (including length arbitration and skewing overhead) with a 13-bit index and increases with decreasing index. In the extreme case of an 8-bit index, power is overwhelming, mainly due to routing overhead (among banks). Power results are normalized for the same throughput (100 Msps) instead of frequency. Thus, the operational frequency of IPStash may not be the same as in TCAMs; it is in fact higher.
Results are analogous for the 200 Msps level performance.
Results for the 32-way IPStash configurations show a clear trade-off between power and performance. In the next section we introduce a power management technique for IPStash and present results for the most appealing configurations in terms of power or performance in the entire design space of IPStash devices.
A. Power Management in IPStash
As we have shown in the previous section, for the same performance, IPStash power consumption is significantly lower than the announced minimum power consumption of the Ultra-18 with optimal power management. Power management in the TCAM typically requires both optimal partitioning of the routing tables and external hardware to selectively power up individual TCAM blocks.
In this section we introduce a novel power management technique for IPStash that is simple, transparent, and often very effective. The concept is to assign favorite (but not necessarily exclusive) associative ways or banks of ways to different prefix classes. In the following we refer to banks of ways, but our discussion applies equally well to individual associative ways. The hope is that, for the most part, different classes end up occupying different banks. Since in our LPM we search classes consecutively, when a class occupies specific banks we restrict our search solely to those.
This power management technique can be implemented with very little hardware. First, we assign favorite banks to sets of classes in a very simple manner: Class 1 (the largest) favors the leftmost banks while the combination of Class 2 and Class 3 favors the rightmost banks. All the classes intermix somewhere in the middle. "Bank favoritism" is exhibited on prefix insertion only: we simply steer Class-1 prefixes to the left and Class-2 [...]
Cacti incorporates a simple model to simulate multi-bank caches which is applicable in our case. Cacti considers each bank as fully independent: every bank has its own independent address and data lines. Cacti includes a routing overhead that represents the power and time penalty for driving address and data lines to each bank.
Assuming a 512K-entry IPStash with 16 banks each consisting of 16 ways, our simulations show that 84% of the total associativity is devoted to pure set-associative ways (57% associativity for Class-1 prefixes, 27% for the Class-2 and Class-3 prefixes) and 16% of the associativity is devoted to mixed classes. This means that upon arrival of an incoming packet, in the first lookup (Class 1) only 73% of the banks (12 banks) need to be searched, and only 42% of the banks (7 banks) are needed for the other two sequential searches. Average power consumption in this case is reduced by 37.3%.
[...] indices of 11-14 bits. The horizontal dimension represents the maximum search rate (in Msps) that a specific IPStash can achieve and the vertical dimension represents the maximum power reduction compared to the scaled power consumption of the Ultra-18 TCAM with full memory management. All power results are normalized for the same throughput, 100 Msps. IPStash power consumption without any power management is 61% lower than that of the fully power-managed Ultra-18. When we employ power management in IPStash, a further improvement in power consumption is achieved. In our case, power management introduces negligible overhead, needing no additional external hardware or effort. Considering the search throughput, IPStash devices easily exceed the current top-of-the-line performance of 100 Msps. In some configurations more than 250 Msps are achieved.
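The steering policy can be modeled with a short sketch. It is an illustrative software model, not the hardware mechanism: the class name `BankedStash`, the per-bank bookkeeping of which class group is present, and the first-fit choice of bank are all assumptions made here.

```python
class BankedStash:
    def __init__(self, num_banks, ways_per_bank):
        self.ways_per_bank = ways_per_bank
        # banks[b][set_index] -> entries stored in that set of bank b
        self.banks = [dict() for _ in range(num_banks)]
        # class groups ("left" or "right") present in each bank
        self.groups = [set() for _ in range(num_banks)]

    def insert(self, set_index, entry, cls):
        """Class 1 fills banks from the left, Classes 2 and 3 from the right."""
        group = "left" if cls == 1 else "right"
        order = range(len(self.banks))
        if group == "right":
            order = reversed(order)
        for b in order:
            ways = self.banks[b].setdefault(set_index, [])
            if len(ways) < self.ways_per_bank:
                ways.append(entry)
                self.groups[b].add(group)
                return b
        raise OverflowError("no free way in this set")

    def banks_to_search(self, cls):
        """Banks that must be powered for a lookup in the given class."""
        group = "left" if cls == 1 else "right"
        return [b for b, g in enumerate(self.groups) if group in g]
```

Banks that hold only one group are skipped during the other group's search, while the banks in the middle, where the groups intermix, are searched for every class.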
As we have discussed, the concept for longest prefix match in IPStash is to iteratively search prefix classes (usually three in our study) for progressively shorter prefixes until a match is found. For the analysis in Section V we assume worst case [...]
[...] take advantage of the effort of several TCAM vendors to reduce power consumption by providing mechanisms to enable and search only a part of a TCAM much smaller than the entire TCAM array. The authors propose a bit-selection architecture and a partitioning technique to design a power-efficient TCAM architecture. In [18], the authors propose to place TCAMs on separate buses for parallel accesses and introduce a paged-TCAM architecture to increase throughput and reduce power consumption. The idea of a "paging" TCAM architecture is further explored in [21, 30] in order to achieve new levels of power reduction and throughput. Our proposal is similar in spirit but distinctly different in implementation, since we advocate separation of storage (in an SRAM set-associative memory array) and search functionality (variable tag match and length arbitration).
We believe that this separation results in the most efficient implementations of the "blocking" or paging concept. Furthermore, our effort is centered on fitting a routing table in the most efficient manner in the least associative array possible.
Many researchers employ caches to speed up the translation of destination addresses to output port numbers [...] Our approach is different from all previous work. Instead of using a cache in combination with a general-purpose processor or an ASIC routing machine, we use a stand-alone set-associative architecture. IPStash offers unparalleled simplicity compared to all previous proposals while being fast and power-efficient at the same time.
VII. CONCLUSIONS
In this paper, we propose a set-associative architecture called IPStash which abandons TCAMs in IP-lookup applications. IPStash overcomes many problems faced by TCAM designs, such as the complexity needed to manage the routing table, power consumption, density, and cost. IPStash can be faster than TCAMs and more power efficient while still maintaining the simplicity of a content addressable memory.
The recent turn of the TCAM vendors to power-efficient blocked architectures, where the TCAM is divided up into independent blocks that can be addressed externally, justifies our approach. Blocked TCAMs resemble set-associative memories, and our own proposal in particular, but their blocks are too few, their associativity is too high, and their comparators are embedded in the storage array instead of being separate. In our mind, we see no reason to use a fully-associative, ternary, content-addressable memory to do the work of a set-associative memory.
What we show in this paper is that associativity is a function of the routing table size and therefore need not be inordinately high, as in blocked TCAMs, with respect to the current storage capacities of such devices. What we propose is to go all the way and, instead of having a blocked fully-associative architecture that inherits the deficiencies of TCAMs, start with a clean set-associative design and implement IP-lookup on it. We show how longest prefix match can be implemented by iteratively searching classes of (increasingly) shorter prefixes. Prefix classes allow us to hash the routing table multiple times (each time using an optimized index) for insertion in IPStash. Multiple hashing coupled with skewed associativity results in a required associativity for routing tables impressively close to optimal.
Using Cacti, we study IPStash using 8 routing table sizes and find that it can be more than twice as fast as the top-of-the-line TCAMs while offering up to 64% power savings (for the same throughput) over the announced minimum power consumption of commercial products. In addition, IPStash exceeds 230 Msps while the state-of-the-art performance for TCAMs (in the same technology) currently only reaches about 100 Msps.
We believe that IPStash is the natural evolutionary step for large-scale IP-lookup from TCAMs to associative memories. We are working on expanding IPStash to support many other networking applications such as IPv6, NAT, MPLS, and the handling of millions of "flows" (point-to-point Internet connections) by using techniques similar to those used in IP-lookup.
