For low-cost RFID systems, the power consumption of passive tags is a key issue. Because anti-collision protocols, together with their implementing circuits, play the most important role in the tag's baseband processing, they deserve considerable effort in power optimization. In this technical report, we combine power optimization at the protocol level and at the circuit level instead of carrying them out separately. We propose a new criterion, which takes into account both time and energy consumption, to evaluate anti-collision schemes. We put forward an improved anti-collision scheme and compare it with two existing, recommended anti-collision schemes.
introduction
In low-cost RFID systems, tags carry no batteries. Tags are passive, getting their energy from the electromagnetic waves emitted by the reader. For frequencies above 900 MHz, a working distance of 1 meter places tags in the far-field region of the reader's antenna. The power received by the tag is generally less than 100 µW. Such a supply requires the function of the tag to be as simple as possible so that its power consumption is minimized.
A passive tag is power limited, rather than energy limited. We can reasonably assume that the reader can always keep the energy supply so that a problem such as battery life doesn't exist. Thus the key concern is the tag's power consumption, which restricts the tag's maximum possible distance to the reader [1] .
The anti-collision circuit is the main part of the baseband processing circuits on the tag. Therefore, a low-power implementation of the anti-collision protocol is one of the key issues in tag design.
Low-power techniques have been studied at different levels, such as the protocol level, system level, RTL level, circuit level, layout level, and material level. Generally, the studies at different levels are carried out individually, and more often than not the system is optimized at each level independently. However, we found that in a passive RFID system, anti-collision protocols cannot be evaluated accurately without their detailed implementing circuits, because the restriction on power consumption has reached such an extent that both the protocol and the circuit have to be optimized to their physical limits. We therefore combine the protocol level and the circuit level when exploring performance. From now on, we use the term "anti-collision scheme" to refer to the combination of an anti-collision algorithm and its implementing circuit.
The remainder of this report is organized as follows: in Section 2, we give our cost function, which takes both power consumption and time complexity into account in order to evaluate anti-collision schemes; in Section 3, two practical schemes are discussed in detail and an improved scheme is put forth; in Section 4, we compare the three schemes; finally, in Section 5, we give our conclusion.
the cost function to evaluate anti-collision schemes
Generally, low-power design has two types of targets. The first is to minimize the total energy consumption in order to prolong battery lifetime, and the other is to minimize the average power in order to lower the heat emitted by the chip. In passive RFID systems, however, the key target of low-power design is to extend the distance between the reader and the tag. The tag works by receiving the electromagnetic power emitted by the reader. The power received by the tag is inversely proportional to the square of the distance between the reader and the tag [1]. It is the maximum instantaneous power consumption of the tag that determines the tag's longest working distance from the reader. So for passive tags, the optimization target is to minimize the maximum instantaneous power of the circuit.
In a practical tag, the capacitor in the rectification circuit (Figure 1) can supply energy when the received power is less than the circuit's instantaneous power consumption, effectively relaxing the requirement on the maximum instantaneous power of the internal circuits. On the other hand, a tag generally contains a controlled path (Figure 1, M1) between power and ground that bypasses surplus current when the consumed power is less than the input power: if the voltage on the capacitor reaches $V_{vdd}$, the surplus power (the difference between the input power and the consumed power) is bypassed, or "wasted".
If we arbitrarily select a time interval $(t_1, t_2)$ (counted in clock cycles), and suppose we make full use of the charge stored on the capacitor in order to lower the input power, then we have:
Here $P_{in}$ is the input power, $C$ is the capacitance of the capacitor, $T$ is the clock period and $E_{t_1,t_2}$ is the total energy consumed by the circuit between $t_1$ and $t_2$.
Equation (1) can be transformed into (2) under the assumption $\Delta V \ll V_{vdd}$:
$A_i$ is the number of inversions occurring at clock cycle $i$ and $C_{load}$ is the average load capacitance of the gates in the circuit. If, under some performance requirements, the maximum acceptable voltage drop on the power supply is $V_{drop}$, then we have:
So, for an anti-collision scheme, Here,
and $T_{inquiring}$ is the whole period of time in which the scheme is working, or put another way, the execution time of the anti-collision scheme. We use $P_{in,minimum}$, denoted simply as Cost, as our cost function to evaluate anti-collision schemes.
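The displayed equations of this derivation are not reproduced in this copy. As a reading aid only, the following is one reconstruction that is consistent with the symbol definitions above. It is a sketch, not the report's own typesetting, and it rests on two assumptions: each inversion dissipates $C_{load} V_{vdd}^2$, and the capacitor contributes $\tfrac{1}{2}C\left(V_{vdd}^2 - (V_{vdd}-\Delta V)^2\right) \approx C\,V_{vdd}\,\Delta V$ when $\Delta V \ll V_{vdd}$.

```latex
% Assumed reconstruction of the energy balance over a window of (t_2 - t_1) clock cycles.
\begin{align}
P_{in}\,(t_2 - t_1)\,T + C\,V_{vdd}\,\Delta V
  &\approx E_{t_1,t_2}
   = C_{load}\,V_{vdd}^{2} \sum_{i=t_1+1}^{t_2} A_i , \\
\Delta V \le V_{drop}
  \;\Longrightarrow\;
P_{in}
  &\ge \frac{C_{load}\,V_{vdd}^{2} \sum_{i=t_1+1}^{t_2} A_i \;-\; C\,V_{vdd}\,V_{drop}}
            {(t_2 - t_1)\,T} , \\
P_{in,minimum}
  &= \max_{t_1 < t_2}\;
     \frac{C_{load}\,V_{vdd}^{2} \sum_{i=t_1+1}^{t_2} A_i \;-\; C\,V_{vdd}\,V_{drop}}
          {(t_2 - t_1)\,T} .
\end{align}
```

Under this reading, the input power must cover the worst window of activity after the capacitor has been allowed to sag by at most $V_{drop}$.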
Although the time consumption of an anti-collision scheme is a relatively loose constraint in our application, it is still tightly related to the total number of tags that can be read within a limited accessible time. It is unfair to compare anti-collision schemes without considering $T_{inquiring}$. Otherwise, we can get an [...] Here, Cyc is the number of clock cycles needed to complete a scheme. Equation (3) can be further transformed into:
It is apparent that if $C$ equals 0 or $V_{drop}$ equals 0, then our Cost degenerates into the problem of finding the maximum instantaneous power consumption of the circuit.
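The degenerate case above can be checked numerically. The sketch below assumes a particular form of the cost, since the report's displayed equation is not available here: the minimum input power is taken as the maximum, over all windows of consecutive clock cycles, of (switching energy minus the energy the capacitor may release) divided by the window duration, with $C_{load} V_{vdd}^2$ per inversion and $C\,V_{vdd}\,V_{drop}$ for the capacitor. The function name and parameters are hypothetical.

```python
def min_input_power(activity, c_load, v_vdd, period, c_store=0.0, v_drop=0.0):
    """Assumed cost: worst-window average power after capacitor buffering.

    activity : list of inversion counts A_i, one entry per clock cycle
    c_load   : average gate load capacitance (F)
    v_vdd    : supply voltage (V)
    period   : clock period T (s)
    c_store  : capacitance C of the storage capacitor (F)
    v_drop   : maximum acceptable supply drop (V)
    """
    buffered = c_store * v_vdd * v_drop      # energy the capacitor may give up
    worst = float("-inf")
    for t1 in range(len(activity)):
        inversions = 0
        for t2 in range(t1 + 1, len(activity) + 1):
            inversions += activity[t2 - 1]   # running sum over the window
            energy = c_load * v_vdd ** 2 * inversions
            worst = max(worst, (energy - buffered) / ((t2 - t1) * period))
    return worst
```

With `c_store=0` or `v_drop=0` the buffered term vanishes, the worst window is a single clock cycle, and the function returns the peak per-cycle power, matching the statement above. A nonzero capacitor lets a longer window dominate instead.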
We compare various anti-collision schemes under the same $V_{vdd}$, $V_{drop}$, $C_{load}$, $T_{INQ}$ and $C$. So in practice, we adopt a simplified expression for P:
Hereafter, we use n to denote the length of IDs configured inside the tags and k the number of tags simultaneously appearing within the working zone of the reader.
the evaluation for some practical anti-collision schemes
In this section, we introduce two existing, recommended anti-collision schemes and evaluate their performance with our criterion. All circuit diagrams are omitted from this technical report.
Binary-Tree Scheme
The binary-tree scheme [3] uses a protocol that requires tags to remember previous inquiry results, which reduces the average inquiry time. However, with the binary-tree scheme, a tag has to finish an inquiry process completely before it can respond to another reader. Therefore, if more than one reader works near a tag, coordination among the readers becomes less flexible.
In this protocol, once a tag has been completely identified, it is killed. Every tag contains a pointer. Whenever the tag is reset, the pointer points to the highest bit of the tag's ID, and as the inquiry proceeds it moves toward the lowest bit. During an inquiry, the reader sends one inquiry bit at a time. Tags whose pointed bit equals the inquiry bit send their next bit back to the reader, while tags whose pointed bit differs enter the "quiet" state and do not answer the remaining inquiries in this round until one tag has been killed and all the remaining tags are reset. If the reader senses a collision-free answer, it uses that answer as its next inquiry bit; if a collision is sensed, it uses '0' as its next inquiry bit. Thus in every inquiry cycle, one and only one tag is identified, when its pointer finally reaches the lowest bit. The identified tag is then killed, all the other tags that have entered the "quiet" state are reset, and a new inquiry cycle begins from the highest bit. After k inquiry cycles, the IDs of all k tags have been identified. Table 1 illustrates the inquiry process for 3 tags whose IDs are 001, 011 and 100. Here a '*' denotes a collision sensed by the reader. The process is equivalent to searching a binary tree k times, from the root to each of the k leaves (Figure 2).
To identify one tag, (n-1) clock cycles for inquiring and (n-1) clock cycles for answering are needed, so k tags need 2k(n-1) clock cycles. In addition, 3 clock cycles are needed for the reader to learn that no tags whose IDs start with '0' remain alive (step 5 in Table 1). So we have
$$Cyc = 2k(n-1) + 3$$
Figure 3 gives the state diagram of the tag's state machine.
Figure 3: State diagram for the binary-tree protocol
If a tag is not in the "quiet" or "killed" state, it alternates between the "receiving" and "sending" states. The longest run of continuous "receiving" and "sending" occupies 2(n-1) clock cycles, when the tag is being read out. During this period the maximum $P(t_2, t_1)$ in (3) is obtained. Even though tags consume different amounts of energy over the whole inquiry process (the earlier a tag is identified and killed, the less energy it consumes), they all have the same Cost, because the operations needed to identify each of them are the same.
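The binary-tree protocol described in this subsection can be sketched as a small simulation. This is an illustration under stated assumptions, not the report's implementation: IDs are unique bit strings of equal length, and the reader is assumed to open each round by probing '0' and to fall back to '1' if no tag answers (the text does not state the opening bit). The function names are hypothetical.

```python
def binary_tree_inventory(tag_ids):
    """Return tag IDs in the order the reader identifies (and kills) them."""
    n = len(tag_ids[0])
    alive = list(tag_ids)              # tags that have not been killed yet
    identified = []
    while alive:
        # New round: every surviving tag is reset, pointer at the highest bit.
        # Assumption: the reader probes '0' first, then '1'.
        for inquiry in "01":
            active = [t for t in alive if t[0] == inquiry]
            if active:
                break
        read = inquiry
        for pos in range(1, n):
            # Active tags send back the bit at position pos.
            answers = {t[pos] for t in active}
            # Clean answer: reuse it. Collision: the reader picks '0'.
            inquiry = answers.pop() if len(answers) == 1 else "0"
            read += inquiry
            # Tags whose pointed bit differs from the inquiry bit go quiet.
            active = [t for t in active if t[pos] == inquiry]
        identified.append(read)
        alive.remove(read)             # the identified tag is killed
    return identified


def binary_tree_cycles(k, n):
    """Clock cycles from the text: Cyc = 2k(n-1) + 3."""
    return 2 * k * (n - 1) + 3
```

For the three tags 001, 011 and 100 used in Table 1, `binary_tree_inventory` identifies them in ascending order because the reader always resolves a collision toward '0', and `binary_tree_cycles(3, 3)` evaluates the cycle count stated in the text.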
Query-Tree Scheme
The query-tree scheme [4] adopts a memoryless protocol, meaning that tags need not remember their inquiry histories. Memoryless protocols have advantages and disadvantages opposite to those of memory protocols.
In this protocol, instead of sending only one bit per inquiry, the reader sends a prefix, which may be 1 to (n-1) bits long. Tags send the remaining bits of their IDs when the prefix matches the leading bits of their IDs (counting from the highest bit). The reader can tell from the tags' response at which bit a collision occurred. The reader then skips over the collision-free bits and extends the prefix directly to the collision bit. Once the reader receives collision-free data, it knows it has read an ID. After recording the ID, it revises its prefix (by changing the last bit, or by changing the second-to-last bit and dropping the last bit) and continues querying. Table 2 illustrates the inquiry process.
Here we have 4 tags: 0001, 0011, 1000 and 1100.
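The prefix-extension behaviour can be sketched as follows for the four example tags. This is a minimal simulation under assumptions that the text leaves open: the reader starts with the one-bit prefixes '0' and '1', explores the '0' branch first, and only revisits branches where it actually saw a collision, so the exact query order may differ from Table 2. For simplicity the sketch also lets the prefix reach full length when the collision falls on the last bit. All names are hypothetical.

```python
def query_tree_inventory(tag_ids):
    """Return (identified IDs in order, prefixes sent by the reader)."""
    identified, queries = [], []
    stack = ["1", "0"]                 # '0' is popped, and therefore queried, first
    while stack:
        prefix = stack.pop()
        queries.append(prefix)
        # Memoryless tags: any tag whose ID starts with the prefix answers.
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if not responders:
            continue
        tails = [t[len(prefix):] for t in responders]
        # First position in the answer where the responders disagree.
        collision = next(
            (i for i in range(len(tails[0])) if len({s[i] for s in tails}) > 1),
            None,
        )
        if collision is None:
            identified.append(responders[0])   # clean answer: one ID read out
        else:
            # Skip the collision-free bits and branch at the collision bit.
            common = prefix + tails[0][:collision]
            stack.append(common + "1")
            stack.append(common + "0")
    return identified, queries
```

For the tags 0001, 0011, 1000 and 1100, the reader in this sketch sends six prefixes and jumps from '0' directly to '000' because the second ID bit did not collide.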
The state diagram is shown in Figure 4. Again we assume that the ID numbers are evenly distributed. We represent the inquiry routine with a binary tree (Figure 5), where $m = [\log_2 k]$ and k is the number of tags. Once the prefix length reaches m there are no more collisions: the remaining part of the ID is sent back in one pass, and no further inquiry is needed. So we are effectively performing a depth-first traversal of a full binary tree of depth m.
Corresponding to every node on the tree is the prefix consisting of the node and all its ancestors.
The prefix sent by the reader is enclosed in a pair of "NULL" symbols to inform the tags of the start and the end of the received data. The reader knows when the data sent back has finished, after which it needs an extra "wait" operation to deal with that data. We divide the nodes of the query tree into two classes: type I nodes lie on layers with index less than m, and type II nodes lie on layer m. During the inquiry process, that is, the depth-first traversal of the query tree, every arrival at a type I node costs the tag n (receiving or sending data) + 2 (NULL) + 1 clock cycles. Arrival at a type II node means the identification of a tag, so in addition to the (n+3) clock cycles, the same as for type I nodes, 2 extra clock cycles are needed for the reader to record the read-out ID. Thus, we have
With this protocol, the tags no longer have the same Cost. It turns out that the first tag read out has the maximum Cost, or:
Published
FUD-AUTOID-WH-001 ©2003 Copyright
Improved Query-Tree Scheme
We put forward an improved query-tree scheme. It is similar to the query-tree scheme except that, while the tags are sending back the remaining parts of their IDs, the reader sends the tags a signal to stop sending as soon as it detects a collision bit. Thus, unless we face the worst case, the number of "sending" operations decreases. Table 3 illustrates the protocol. Figure 5 can also be used to represent the inquiry process of the improved query-tree scheme.
In fact, the "stop" signal reaches the tags 2 bits after the collision bit, because the reader spends one clock cycle detecting the collision and another clock cycle sending the stop signal. Thus the data being sent back actually stops 2 bits after the collision bit.
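The saving from the stop signal can be illustrated with a small helper. It is a sketch only: the function name, the zero-based `collision_index` convention and the `stop_delay` parameter are assumptions introduced here, with `stop_delay=2` taken from the two-cycle delay described above.

```python
def bits_sent_back(tail_len, collision_index=None, stop_delay=2):
    """Bits a tag sends back in answer to one prefix.

    tail_len        : number of ID bits remaining after the prefix
    collision_index : zero-based position of the first colliding bit in the
                      answer, or None if the answer is collision-free
    stop_delay      : bits sent after the collision bit before the stop
                      signal takes effect (2 in the text)
    """
    if collision_index is None:
        return tail_len                # clean answer: the whole tail is sent
    # Bits up to and including the collision bit, plus the stop delay,
    # but never more than the tag actually has left to send.
    return min(tail_len, collision_index + 1 + stop_delay)
```

In the plain query-tree scheme a tag always sends the whole tail, so `tail_len` is the baseline. With a long ID and an early collision the improved scheme sends only a few bits, while a collision near the end of the answer (the worst case) saves nothing.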
comparison among the anti-collision schemes
Generally speaking, the power performance of an anti-collision scheme is a function of n, m (or k), $C_{load}$, $C$, $V_{drop}$ and V. We restrict our discussion in order to draw some practical conclusions.
First, by omitting the lower-order terms, we get the following simplified expressions for $Cost_1$ and $Cost_2$. Next, if m = 15, we have $2^{15} = 32768$ tags to be read, which is large enough for today's technology, so we restrict the comparison to m ≤ 15. We also reasonably assume 32 < n < 128. Then, we can derive:
We call the term $\frac{C\,V_{drop}}{C_{load}\,V_{vdd}}$ the "redundancy factor", denoted R. There is a tradeoff between the cost of the chip (tag) and $C$: the lower the chip's cost, the smaller the value of $C$.
Comparison between Cost 3 and Cost 1
We conclude that if R ≤ 14.4n - 274.2, then the improved query-tree scheme is better than the binary-tree scheme. For example, if n = 96 and $V_{drop}/V_{vdd}$ = 0.1, then with a small $C$ ($C/C_{load}$ ≤ 11082) the improved query-tree scheme is better than the binary-tree scheme.
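The numerical example can be reproduced with a few lines of arithmetic. The sketch assumes that the redundancy factor is $R = (C/C_{load}) \cdot (V_{drop}/V_{vdd})$, a form inferred from the figures quoted above rather than read from the report's displayed equation. The function names are hypothetical.

```python
def max_redundancy_factor(n):
    """Largest R for which the improved query-tree scheme beats the
    binary-tree scheme, from the condition R <= 14.4n - 274.2."""
    return 14.4 * n - 274.2


def max_capacitance_ratio(n, vdrop_over_vdd):
    """Largest C / C_load satisfying the condition.

    Assumes R = (C / C_load) * (V_drop / V_vdd).
    """
    return max_redundancy_factor(n) / vdrop_over_vdd
```

Evaluating `max_capacitance_ratio(96, 0.1)` gives the bound on $C/C_{load}$ for a 96-bit ID and a 10% allowed supply drop. A longer ID or a tighter allowed drop raises the bound.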
Comparison between Cost 2 and Cost 1
First, we have the following deduction:
Thus, satisfying the first inequality guarantees that $Cost_2 < Cost_1$, which leads to:
For example, when n = 96, $V_{drop}/V_{vdd}$ = 0.1 and $C/C_{load}$ ≤ 3634 (a small $C$), the query-tree scheme is better than the binary-tree scheme. This conclusion also reveals that, for small R, our improved query-tree scheme is better than the query-tree scheme.
Comparison between
conclusion
In this paper we optimize the power consumption of anti-collision schemes by combining the protocol level and the circuit level. Based on this methodology, we propose a criterion that takes into account both time complexity and energy consumption. We put forth an improved anti-collision scheme and compare it with two existing, recommended anti-collision schemes. Our detailed analyses show that, with proper protocol improvement and circuit design, memoryless protocols can achieve better power performance than memory protocols.
