# Domino Logic With Variable Threshold Voltage Keeper

Volkan Kursun, Member, IEEE, and Eby G. Friedman, Fellow, IEEE

Abstract—A variable threshold voltage keeper circuit technique is proposed for simultaneous power reduction and speed enhancement of domino logic circuits. The threshold voltage of a keeper transistor is dynamically modified during circuit operation to reduce contention current without sacrificing noise immunity. The variable threshold voltage keeper circuit technique enhances circuit evaluation speed by up to 60% while reducing power dissipation by 35% as compared to a standard domino (SD) logic circuit. The keeper size can be increased with the proposed technique while preserving the same delay or power characteristics as compared to a SD circuit. The proposed domino logic circuit technique offers 14% higher noise immunity as compared to a SD circuit with the same evaluation delay characteristics. Forward body biasing the keeper transistor is also proposed for improved noise immunity as compared to a SD circuit with the same keeper size. It is shown that by applying forward and reverse body biased keeper circuit techniques, the noise immunity and evaluation speed of domino logic circuits are simultaneously enhanced.

*Index Terms*—Body biased keeper, contention current, domino logic, forward body bias, keeper, low-power and high-speed dynamic circuits, noise immunity, reliability, reverse body bias.

## I. INTRODUCTION

Domino logic circuit techniques are extensively applied in high-performance microprocessors due to the superior speed and area characteristics of domino CMOS circuits as compared to static CMOS circuits [1], [2]. High-speed operation of domino logic circuits is primarily due to the lower noise margins of domino circuits as compared to static gates. This desirable property of a lower noise margin, however, makes domino logic circuits highly sensitive to noise as compared to static gates. As on-chip noise becomes more severe with technology scaling and increasing operating frequencies, error free operation of domino logic circuits has become a major challenge [1], [3]–[5].

Threshold voltage reduction accompanies supply voltage scaling, providing enhanced speed while maintaining dynamic power consumption within acceptable levels in each new integrated circuit technology generation. Scaling the threshold voltage, however, degrades the noise immunity of domino logic gates [1]. Moreover, exponentially increasing subthreshold leakage currents with reduced threshold voltages have become an important issue threatening the reliable operation of deep submicrometer (DSM) dynamic circuits [1], [3]–[5], [13]–[15].

The authors are with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627-0231 USA.

Digital Object Identifier 10.1109/TVLSI.2003.817515

In a standard domino (SD) logic gate, a feedback keeper is employed to maintain the state of the dynamic node against coupling noise, charge sharing, and subthreshold leakage current. The keeper transistor is fully turned on at the beginning of the evaluation phase. Provided that the necessary input combination to discharge the dynamic node is applied, the keeper and pulldown network transistors compete to determine the logical state of the dynamic node. This contention between the keeper and the pulldown network transistors degrades the circuit speed and power characteristics. The keeper transistor is typically sized smaller than the pulldown network transistors in order to minimize the delay and power degradation caused by the keeper contention current. A small keeper, however, cannot provide the necessary noise immunity for reliable operation in an increasingly noisy and noise sensitive on-chip environment [3]–[5]. There is, therefore, a tradeoff between reliability and high-speed/energy-efficient operation in domino logic circuits.

A variable threshold voltage keeper circuit technique is proposed in this paper for simultaneous power reduction and speed enhancement of domino logic circuits. The current drive of the keeper transistor is dynamically adjusted with the proposed circuit technique. The threshold voltage of the keeper transistor is modified during circuit operation to reduce the contention current without sacrificing noise immunity. The variable threshold voltage keeper circuit technique is shown to enhance circuit evaluation speed by up to 60% while reducing power dissipation by 35% as compared to a SD logic circuit. The keeper size can be increased while preserving the same delay or power characteristics as compared to a SD circuit since the contention current is reduced with the proposed technique. The proposed domino logic circuit technique offers 14.1%, 8.9%, or 11.9% higher noise immunity under the same delay, power, or power-delay product conditions, respectively, as compared to a SD logic circuit technique. Forward body biasing the keeper transistor is also proposed for improved noise immunity as compared to a SD circuit with the same keeper size. It is shown that by applying forward and reverse body bias circuit techniques, the noise immunity and evaluation speed of domino logic circuits are both enhanced.

Challenges in the design of SD circuits are reviewed in Section II. The operation of the proposed domino logic with a variable threshold voltage keeper (DVTVK) circuit technique is described in Section III. Simulation results characterizing the delay, power, and noise immunity of the DVTVK technique as compared to SD are presented in Section IV. Dynamically forward body biasing the keeper transistor for enhanced noise immunity is proposed in Section V. Finally, some conclusions are offered in Section VI.

Manuscript received May 8, 2002; revised December 23, 2002. This work was supported in part by DARPA/ITO under AFRL Contract F29601-00-K-0182, in part by the New York State Office of Science, Technology, and Academic Research to the Center for Advanced Technology, in part by Electronic Imaging Systems and the Microelectronics Design Center, and in part by Xerox Corporation, IBM Corporation, Lucent Technologies Corporation, Eastman Kodak Company, and Photon Vision Systems, Inc.

Fig. 1. Domino gates with standard keeper transistors. (a) Standard footed domino gate. (b) Standard clock-delayed footless domino logic circuit.

## II. BACKGROUND

Performance critical paths in high-performance integrated circuits are often implemented with domino logic circuits. Although domino logic circuit techniques are preferable in high-speed circuits, the reliability of domino circuits is seriously degraded in DSM technologies. The operating principles of domino logic circuits are reviewed in this section. Reliability issues threatening the correct operation of domino logic circuits together with some promising solutions recently proposed in the literature are reviewed. The basic operation of a SD logic circuit is described in Section II-A. The noise immunity, signal delay, and energy dissipation tradeoffs in domino logic circuits are discussed in Section II-B.

### A. Operation of SD Logic Circuits

A standard footed domino gate is shown in Fig. 1(a). Domino circuits behave in the following manner. When the clock signal is low, the domino logic circuit is in the precharge phase. During this phase, the dynamic node is charged to $V_{DD1}$ by the pullup transistor. The output transitions low, turning on the keeper transistor. When the clock transitions high, the circuit enters the evaluation phase. In this phase, provided that the necessary input combination to discharge the dynamic node is applied, the circuit evaluates and the dynamic node is discharged to ground. If the circuit does not evaluate in the evaluation phase, the high state of the dynamic node is preserved against coupling noise, charge sharing, and subthreshold leakage current by the keeper transistor until the pullup transistor is turned on at the beginning of the following precharge phase.
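The precharge and evaluation behavior described above can be illustrated with a purely logical model. The following is a minimal sketch, assuming an ideal footed domino OR gate in which the pulldown network always wins the contention with the keeper; the class name `FootedDominoOR` and its interface are hypothetical and are not part of the circuits analyzed in this paper.

```python
class FootedDominoOR:
    """Logic-level model of a footed domino OR gate (no timing, no analog effects)."""

    def __init__(self):
        self.dynamic = True  # dynamic node state, True = charged high

    @property
    def output(self):
        # The static output inverter drives the complement of the dynamic node.
        return not self.dynamic

    @property
    def keeper_on(self):
        # The PMOS keeper is gated by the output and conducts while the output is low.
        return not self.output

    def step(self, clock, inputs):
        if not clock:
            # Precharge: pullup on, foot transistor off, inputs cannot discharge the node.
            self.dynamic = True
        elif any(inputs):
            # Evaluation with a conducting pulldown path (assumed stronger than the keeper).
            self.dynamic = False
        # Otherwise: evaluation without a pulldown path, the node keeps its state.
        return self.output


gate = FootedDominoOR()
print(gate.step(clock=False, inputs=[True, False]))  # precharge, output stays low
print(gate.step(clock=True, inputs=[False, True]))   # evaluation, output goes high
```

The model also reflects the monotonic nature of domino logic: once the dynamic node is discharged in an evaluation phase, it remains low until the next precharge phase.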

The foot transistor (see Fig. 1) controlled by the clock signal divides the operation of a domino logic circuit into two distinct phases independent of the timing of the input signals. The isolation of the pulldown network from ground in the precharge phase eases the relative timing of the input and clock signals in cascaded multistage footed domino circuits. If the necessary input combination to discharge the dynamic node is applied during the precharge phase, the pulldown transistors cannot alter the state of the dynamic node as the pulldown path to ground is blocked by the foot transistor.

The foot transistor has a nonzero resistance and parasitic capacitance that degrade the evaluation speed of a domino circuit. The foot transistor is typically sized significantly larger than the pulldown network transistors to minimize this speed degradation. Increasing the size of the foot transistor, however, increases the power dissipation since the foot transistor switches every clock cycle. Provided that the clock signal is appropriately delayed, the foot transistors can be omitted in a cascaded multistage domino circuit [as shown in Fig. 1(b)], reducing both the circuit evaluation delay and the power dissipation. The clock signal is intentionally delayed from one stage to the next stage in order to ensure that no short-circuit current path from the power supply to ground exists (formed by the pullup and pulldown network transistors being turned on simultaneously). The clock signal driving a footless domino gate is delayed to transition low only after the previous stage domino gates are all precharged and the inputs to the footless domino gate are all low. Similarly, the inputs to a footless domino gate should transition high only after the clock signal at the gate transitions high and the evaluation phase begins [2]. Although stricter timing of the input and clock signals is required, the overall delay and power characteristics of a footless domino circuit are enhanced as compared to a standard footed domino circuit. Footless domino circuits are, therefore, increasingly popular in high-speed integrated circuits [2]. Since the clock signal driving each domino gate is delayed, a multistage footless domino circuit is often categorized as a clock-delayed or delayed-reset domino circuit. Note that a first stage domino gate in a multistage clock-delayed domino circuit is typically footed as shown in Fig. 1(b).
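The two timing conditions for a footless gate stated above can be written as a simple check. This is a sketch under stated assumptions: the function name, the single-cycle view (one rising and one falling clock edge at the gate), and the example edge times in picoseconds are all hypothetical.

```python
def footless_timing_ok(clk_rise, clk_fall, input_rise_times, input_fall_times):
    """Check the two clock-delayed domino constraints within one cycle.

    clk_rise: time at which the local clock goes high (evaluation begins).
    clk_fall: time at which the local clock goes low (precharge begins).
    input_rise_times / input_fall_times: edge times of the gate inputs.
    All times share the same (arbitrary) unit.
    """
    # Inputs may go high only after the evaluation phase has begun.
    inputs_rise_after_clock = all(t >= clk_rise for t in input_rise_times)
    # The local clock may go low only after every input has returned low.
    inputs_low_before_precharge = all(t <= clk_fall for t in input_fall_times)
    return inputs_rise_after_clock and inputs_low_before_precharge


# Hypothetical edge times in picoseconds.
print(footless_timing_ok(100, 600, [120, 150], [560, 580]))
print(footless_timing_ok(100, 600, [90, 150], [560, 580]))
```

A violation of either condition means that the pullup and the pulldown network can conduct simultaneously, creating the short-circuit current path that the delayed clock is meant to prevent.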

### B. Noise Immunity, Delay, and Energy Tradeoffs in Domino Logic Circuits

As described in Section II-A, the keeper transistor is fully turned on as the output goes low during the precharge phase. When the clock signal transitions high, the pullup transistor turns off and the keeper transistor provides the only conductive path between the dynamic node and  $V_{DD1}$ , preserving the logical state of the dynamic node in the evaluation phase. Provided that the necessary input combination to discharge the dynamic node is applied during the evaluation phase, the keeper transistor opposes the evaluation of the input signals, degrading the speed and power characteristics of a SD logic circuit. The current provided by the keeper transistor to charge the dynamic node while the pulldown network transistors are attempting to discharge the dynamic node is called contention current.

The effect of the keeper transistor on the noise immunity, evaluation delay, and power characteristics of a domino logic circuit is evaluated assuming a 0.18- $\mu$ m CMOS technology. The low noise margin (NML) is the noise immunity metric used in this section. The NML is defined as

$$NML = V_{IL} - V_{OL} \tag{1}$$

where  $V_{IL}$  is the input low voltage defined as the smaller of the dc input voltages on the voltage transfer characteristic (VTC) at which the rate of change of the dynamic node voltage with respect to the input voltage is equal to one (the unity gain point on the VTC).  $V_{OL}$  is the output low voltage.
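As an illustration of definition (1), $V_{IL}$ can be extracted numerically from a sampled VTC by locating the first input voltage at which the magnitude of the slope reaches one. The sketch below is not the procedure used to produce the results in this paper; the logistic VTC, its parameters (`vdd`, `vm`, `k`), and the value $V_{OL} = 0$ are assumptions chosen only to make the example self-contained.

```python
import math


def input_low_voltage(vin, vdyn):
    """Return the smallest input voltage at which |dVdyn/dVin| reaches 1.

    Slopes are estimated on segment midpoints and linearly interpolated.
    """
    prev_gain = None
    prev_mid = None
    for i in range(1, len(vin)):
        gain = abs((vdyn[i] - vdyn[i - 1]) / (vin[i] - vin[i - 1]))
        mid = 0.5 * (vin[i] + vin[i - 1])
        if gain >= 1.0:
            if prev_gain is None:
                return mid
            frac = (1.0 - prev_gain) / (gain - prev_gain)
            return prev_mid + frac * (mid - prev_mid)
        prev_gain, prev_mid = gain, mid
    raise ValueError("VTC never reaches unity gain")


def noise_margin_low(v_il, v_ol):
    """NML = V_IL - V_OL, as in (1)."""
    return v_il - v_ol


# Assumed synthetic VTC: dynamic node falls from vdd to 0 around vm.
vdd, vm, k = 1.8, 0.6, 10.0
vin = [i * 0.001 for i in range(1801)]
vdyn = [vdd / (1.0 + math.exp(k * (v - vm))) for v in vin]

v_il = input_low_voltage(vin, vdyn)
nml = noise_margin_low(v_il, v_ol=0.0)
print(round(v_il, 3), round(nml, 3))
```

For a simulated or measured VTC, the same two functions apply once `vin` and `vdyn` are replaced by the swept input voltage and the recorded dynamic node voltage.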

Simulation results for four input standard footless domino AND and OR gates are shown in Fig. 2. For comparison, simulation results of domino logic circuits without a keeper are also included in Fig. 2. All of the transistors other than the keeper transistor are sized the same. The effect of the keeper transistor on the circuit delay and noise immunity characteristics varies depending upon the gate input excitation. The simulations of the first group of circuits (NML1, Delay1, and Power1 shown in Fig. 2) are based on the assumption that the input or noise signals couple only at a single gate input while the other gate inputs are connected either to ground (for the OR gates) or to $V_{DD}$ (for the AND gates). Additional simulations (NML2, Delay2, and Power2 shown in Fig. 2) are produced assuming that all of the gate inputs are excited simultaneously by the same input or noise signal.

Fig. 2. Comparison of the normalized noise immunity, evaluation delay, and power characteristics of standard footless domino logic circuits with different keeper sizes. (a) Effect of the increased keeper size on the circuit characteristics of a four input domino AND gate. (b) Effect of the increased keeper size on the circuit characteristics of a four input domino OR gate. NML 1, Delay 1, and Power 1: only one input is excited while the other input signals are either grounded (for the OR gates) or connected to $V_{DD}$ (for the AND gates). NML 2, Delay 2, and Power 2: All four input signals are excited with the same input or noise signal.

As shown in Fig. 2(a), when the input or noise signal is applied to only one input while the other gate inputs are connected to $V_{DD}$ [NML1, Delay1, and Power1 shown in Fig. 2(a)], the addition of a keeper whose size is a quarter of a pulldown transistor degrades the evaluation speed and power by 16% and 14%, respectively, as compared to a four input domino AND gate without a keeper. Increasing the keeper size from 0.25 to 1, the NML1 is increased by 163%. The increased keeper size, however, also increases the delay and power dissipation by 190% and 132%, respectively. When all of the gate inputs are excited [NML2, Delay2, and Power2 shown in Fig. 2(a)], the NML2, delay, and power are increased by 104%, 177%, and 125%, respectively, by increasing the keeper size from 0.25 to 1.

When only one input signal is excited while the other three input signals are grounded in a four input domino OR gate, the addition of a keeper half the size of a pulldown network transistor degrades the power and delay by 18% and 16%, respectively, as compared to a SD circuit without a keeper [as shown in Fig. 2(b)]. Increasing the keeper size from 0.5 to 2 increases the noise immunity, delay, and power by 119%, 104%, and 118%, respectively. When all of the gate inputs are excited by the same noise or input signal, the effect of the keeper current on both the circuit performance and reliability is reduced. Increasing the keeper size from 0.5 to 2, therefore, improves the NML by only 24%. The delay and power are increased by 40% and 67%, respectively.

As displayed in Fig. 2, from a circuit performance and energy efficiency point of view, the keeper should be sized as small as possible (or preferably omitted as in earlier domino logic circuits). On the contrary, from a noise immunity and operational reliability point of view, the keeper size should be as large as possible while guaranteeing functionality for a worst case delay input signal combination. There is, therefore, a tradeoff between high noise immunity and high-speed/energy-efficient operation of domino logic gates [3]–[5].

In order to manage these conflicting requirements (a strong keeper for high noise immunity and a weak keeper for high speed), a variable strength keeper scheme was first proposed by Alvandpour [3]. Two keeper transistors are employed in that scheme. One of the keeper transistors is sized small in order to reduce the contention current while the other keeper transistor is sized larger for high noise immunity. The larger keeper transistor is conditionally turned on if the dynamic node is not discharged during the evaluation phase. The weak keeper offers limited noise immunity, improving the evaluation speed during the worst case evaluation delay while the strong keeper offers good robustness to noise and leakage during the rest of the evaluation phase [4]. The primary drawback of this technique is that a delay element and a conditional keeper control circuit are required for each domino gate, increasing the area and energy overhead of the conditional keeper circuits. A similar technique with a single keeper transistor which is cut off at the beginning of the evaluation phase has been proposed in [5]. The dynamic node, without any conductive path to the power supply, floats at the beginning of the evaluation phase. Although the contention current is reduced with the technique proposed in [5], reliable operation cannot be maintained in an increasingly noisy and noise sensitive on-chip environment. It is assumed with the domino circuit techniques proposed in [3] and [5] that the timing of the clock and input signals driving the domino gates is well known, permitting the worst case evaluation delay to be accurately estimated. The effectiveness of both techniques in reducing the delay and power of domino logic circuits depends upon an accurate estimate of the worst case evaluation delay [4]. If the worst case evaluation delay is underestimated, the conditional keeper can be turned on before the evaluation is completed (before the dynamic node is fully discharged), producing a contention current on par with the current produced by a SD keeper transistor. Alternatively, if the worst case evaluation delay is overestimated, the circuit is exposed to noise with little noise immunity for an extended amount of time, thereby degrading the reliability of the circuit.

Fig. 3. A K input domino OR gate with a variable threshold voltage keeper.

A variable threshold voltage keeper circuit technique is proposed in this paper for simultaneously reducing power, enhancing speed, and improving noise immunity in domino logic circuits. The current drive of the keeper transistor is adjusted by dynamically body biasing the keeper. The threshold voltage of the keeper transistor is modified during circuit operation to reduce the contention current without sacrificing noise immunity. Similar to the conditional keeper and high-speed domino techniques, it is assumed that the worst case evaluation delay of the domino circuits can be accurately predicted. The operation of the proposed domino logic circuit technique with a variable threshold voltage keeper is described in Section III.

## III. DOMINO LOGIC WITH VARIABLE THRESHOLD VOLTAGE KEEPER

The DVTVK circuit technique is introduced in Section III-A. The threshold voltage of the keeper is dynamically modified during circuit operation by changing the body bias voltage of the keeper. Operation of the body bias generator is described in Section III-B.

### A. Variable Threshold Voltage Keeper

A K input domino OR gate based on the proposed circuit technique is shown in Fig. 3. A representative waveform that characterizes the operation of the circuit is shown in Fig. 4.

Fig. 4. Waveforms that characterize the operation of the proposed variable threshold voltage keeper circuit technique.

The DVTVK circuit operates in the following manner. When the clock is low, the pullup transistor is on and the dynamic node is charged to $V_{DD1}$. The substrate of the keeper is charged to $V_{DD2}$ ($V_{DD2} > V_{DD1}$) by the body bias generator, increasing the keeper threshold voltage. The value of the high threshold voltage (high-$V_t$) of the keeper is determined by the reverse body bias voltage ($V_{DD2} - V_{DD1}$) applied to the source-to-substrate p-n junction of the keeper. The current sourced by the high-$V_t$ keeper is reduced, lowering the contention current when the evaluation phase begins. A reduction in the current drive of the keeper does not degrade the noise immunity during precharge as the dynamic node voltage is maintained during this phase by the pullup transistor rather than by the keeper.

When the clock goes high (the evaluation phase), the pullup transistor is cut off and only the high-$V_t$ keeper current contends with the current from the evaluation path transistor(s). Provided that the appropriate input combination that discharges the dynamic node is applied in the evaluation phase, the contention current due to the high-$V_t$ keeper is significantly reduced as compared to SD logic. After a delay determined by the worst case evaluation delay of the domino gate, the body bias voltage of the keeper is reduced to $V_{DD1}$, zero biasing the source-to-substrate p-n junction of the keeper. The threshold voltage of the keeper is lowered to the zero body bias level, thereby increasing the keeper current. From this point on, the DVTVK keeper has the same threshold voltage as a SD keeper, offering the same noise immunity during the remaining portion of the evaluation phase (assuming the SD and DVTVK keepers are the same size).
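The dependence of the keeper threshold voltage on the reverse body bias can be approximated with the textbook first-order body effect expression, $|V_t| = |V_{t0}| + \gamma(\sqrt{2\phi_F + V_{rev}} - \sqrt{2\phi_F})$. The sketch below uses this generic model only as an illustration; the parameter values (`vt0`, `gamma`, `two_phi_f`) are assumed round numbers and are not taken from the 0.18-$\mu$m technology used in this paper.

```python
import math


def keeper_vt(reverse_bias, vt0=0.45, gamma=0.4, two_phi_f=0.8):
    """First-order body effect model for the magnitude of the PMOS keeper threshold.

    reverse_bias: V_DD2 - V_DD1 applied across the source-to-substrate junction (V).
    vt0, gamma, two_phi_f: assumed illustrative device parameters (V, V^0.5, V).
    """
    return vt0 + gamma * (math.sqrt(two_phi_f + reverse_bias) - math.sqrt(two_phi_f))


# Zero body bias (SD-like keeper) versus reverse body biased keeper.
for v_rev in (0.0, 0.3, 1.8):
    print(v_rev, round(keeper_vt(v_rev), 3))
```

Under these assumed parameters, the threshold voltage magnitude rises monotonically with the reverse body bias, which is the mechanism that weakens the keeper at the beginning of the evaluation phase.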

### B. Dynamic Body Bias Generator

The proposed dynamic body bias generator (DBBG) is shown in Fig. 5. The DBBG produces an output signal swinging between  $V_{DD1}$  and  $V_{DD2}$  from an input signal swinging between ground and  $V_{DD1}$ . The DBBG generates the proper body bias voltages for the keeper with an appropriate delay, ensuring that the contention current is reduced without sacrificing noise immunity.

Fig. 5. Body bias generator circuit.

The operation of the DBBG is controlled by the clock signal that also controls the operational phases of the domino logic circuit. When the clock goes low, Node<sub>2</sub> is discharged through $N_2$, turning on $P_1$ and $P_3$. $P_2$ and $P_4$ are cut off and the body bias voltage is increased to $V_{DD2}$. When the clock goes high, the domino circuit enters the evaluation phase. Node<sub>1</sub> is discharged through $N_1$, turning on $P_2$ and $P_4$. $P_1$ and $P_3$ are cut off. The voltage at Node<sub>3</sub> is maintained at $V_{DD1}$ through $P_4$. During this stage, the DBBG must ensure that the keeper current is increased to the low threshold voltage (low-$V_t$) current level to maintain higher noise immunity if the dynamic node is not discharged by the evaluation path transistors. After a delay determined by the worst case evaluation delay of the domino gate, the body bias voltage is reduced to $V_{DD1}$. Hence, with a time delay $t_d$ after the clock edge, the threshold voltage of the keeper is reduced to the zero body bias level, increasing the keeper current. During the remaining portion of the evaluation phase, the noise immunity characteristics of the SD and DVTVK circuit techniques are identical.
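The externally visible behavior of the DBBG can be summarized as a function of the clock phase and the time elapsed since the rising clock edge. The following is an idealized sketch that ignores transition times; the function name and the example delay value are hypothetical, while the supply values of 1.8 V and 3.6 V correspond to the $V_{DD1}$ and $V_{DD2}$ used later in the simulations.

```python
def keeper_body_bias(clock_high, t_since_rise, t_d, vdd1=1.8, vdd2=3.6):
    """Idealized keeper body voltage produced by the DBBG.

    clock_high: True during evaluation, False during precharge.
    t_since_rise: time elapsed since the rising clock edge (same unit as t_d).
    t_d: delay matched to the worst case evaluation delay of the gate.
    """
    if not clock_high:
        return vdd2  # precharge: keeper reverse body biased (high-Vt)
    if t_since_rise < t_d:
        return vdd2  # start of evaluation: keeper still weak, low contention
    return vdd1      # rest of evaluation: zero body bias, full keeper strength


# Hypothetical delay of 150 ps within a 500 ps evaluation phase.
for t in (0, 100, 150, 400):
    print(t, keeper_body_bias(True, t, t_d=150))
```

The reverse body bias seen by the keeper is the returned value minus $V_{DD1}$, so it is either $V_{DD2} - V_{DD1}$ or zero in this idealized view.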

The proposed dynamic body bias generator assumes two supply voltages, $V_{DD1}$ and $V_{DD2}$, where $V_{DD1} < V_{DD2}$. The delay and power savings can be improved by increasing $V_{DD2}$ as compared to $V_{DD1}$. This change, however, also degrades the noise immunity characteristics of a domino circuit at the beginning of the evaluation phase. The appropriate reverse body bias voltage applied to the keeper is determined by the target delay/power objectives while satisfying the lowest acceptable noise immunity requirements during the worst case evaluation delay of a domino gate. The highest bias voltages that can be applied across the source-to-substrate p-n junction and the gate oxide of a MOSFET for a specific technology are other factors that determine $V_{DD2}$.



Fig. 6. A four-bit multiple-output domino carry generator of a carry lookahead adder implemented with the proposed variable threshold voltage keeper circuit technique.  $W_{N2} = 2W_{N1}/3$ ,  $W_{N3} = 2W_{N1}/4$ ,  $W_{N4} = 2W_{N1}/5$ ,  $W_{N5}$ ,  $W_{N6}$ ,  $W_{N7}$ ,  $W_{N8}$ ,  $W_{N9} = 2W_{N1}$ .

## IV. SIMULATION RESULTS

As discussed in Section II, the worst case evaluation delay of a wide domino OR gate occurs when only one input is excited while the other inputs are grounded. Similarly, the worst case evaluation delay in a domino gate with stacked pulldown transistors (*e.g.*, an AND-OR or an AND gate) occurs when all of the inputs in the critical pulldown path are excited by the same input signal while all of the other inputs are grounded. The worst case evaluation delay determines the clock speed of a domino circuit while the target clock speed determines the size of a keeper. The speed and power characteristics of the domino logic circuits are evaluated for the set of worst case input vectors. While evaluating the noise immunity, the same noise signal is applied to all of the test circuit inputs as this situation represents the worst case noise condition.

The SD and DVTVK circuit techniques are evaluated for two different test circuits assuming a 0.18-$\mu$m CMOS technology. Simulation results of a multiple-output domino carry generator implemented with the proposed DVTVK circuit technique are presented in Section IV-A. The proposed DVTVK circuit technique is also applied to a chain of footless domino OR gates. Simulation results of the clock delayed domino OR gates (COR) with the proposed DVTVK circuit technique are presented in Section IV-B. The effect of gate sizing on the delay and power characteristics of the proposed DVTVK circuit technique is discussed in Section IV-C.

### A. Multiple Output Domino Carry Generator With Variable Threshold Voltage Keeper

A four-bit multiple-output domino carry generator (CG) implemented with the proposed variable threshold voltage keeper circuit technique (CG-DVTVK) is shown in Fig. 6. A description of the multiple-output domino circuit technique is presented in [11]. The CG circuit has four dynamic nodes. Each dynamic node of the CG can be discharged independently by asserting the generate (G) input of the corresponding node. The critical path of the CG circuit is along the $N_5$-$N_9$ path. The worst case evaluation delay of the CG occurs while discharging the fourth dynamic node (Dynamic<sub>4</sub>) through the critical path. During evaluation of the delay and power characteristics, the propagate inputs ($P_1$-$P_4$) and C<sub>in</sub> are asserted while the generate inputs ($G_1$-$G_4$) are grounded. While evaluating the noise immunity, all of the inputs are excited by the same noise signal. A 1-GHz clock with a 50% duty cycle is applied to the circuits. All of the common transistors in the SD and DVTVK test circuits are sized the same.

Fig. 7. Variation of the power-delay product (PDP), delay, power, and noise margin low (NML) characteristics of CG-DVTVK with $V_{DD2}$. Values are normalized to those of a SD carry generator circuit with the same size transistors (KPR = 2.2).

In order to determine an appropriate reverse body bias voltage to be applied to the keeper, the delay, power, power-delay product (PDP), and noise immunity characteristics of CG-DVTVK are evaluated by varying $V_{DD2}$ (for a keeper to critical path effective transistor width ratio (KPR) of 2.2). The normalized delay, power, PDP, and NML of CG-DVTVK as compared to the SD carry generator (CG-SD) are shown in Fig. 7. The evaluation delay and power dissipation are reduced by increasing $V_{DD2}$ as compared to $V_{DD1}$. Increasing $V_{DD2}$, however, also degrades the noise immunity characteristics of the domino circuit at the beginning of the evaluation phase. As shown in Fig. 7, the degradation in noise immunity is 2% for a reverse body bias voltage of 0.3 V while the delay and power savings are 4% and 1%, respectively. Increasing the reverse body bias voltage of the keeper transistor to 1.8 V ($V_{DD2} = 3.6$ V), the delay and power savings are increased to 60% and 35%, respectively, while the degradation in noise immunity at the beginning of the evaluation phase increases to 11%. It is assumed that applying a supply voltage of up to 3.6 V to the body bias generator does not create any MOSFET gate oxide related reliability problems in the target CMOS technology. It is also (arbitrarily) assumed that a degradation of the noise margin by 11% at the beginning of the evaluation phase is acceptable. In the following analysis, $V_{DD1}$ and $V_{DD2}$ are 1.8 and 3.6 V, respectively.
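The selection of the reverse body bias described above amounts to choosing the largest bias whose noise margin degradation stays within an acceptable budget. The sketch below encodes only the two operating points quoted in the text (0.3 V and 1.8 V); the data structure, the function name, and the budget values are hypothetical, and a practical sweep would contain the intermediate points plotted in Fig. 7.

```python
VDD1 = 1.8  # volts

# The two operating points quoted in the text for KPR = 2.2.
REPORTED_POINTS = [
    {"reverse_bias_v": 0.3, "delay_saving_pct": 4, "power_saving_pct": 1, "nml_loss_pct": 2},
    {"reverse_bias_v": 1.8, "delay_saving_pct": 60, "power_saving_pct": 35, "nml_loss_pct": 11},
]


def pick_operating_point(points, max_nml_loss_pct):
    """Return the point with the largest reverse bias that meets the NML budget.

    Relies on the trend stated in the text: delay and power savings grow
    with the reverse body bias, as does the NML degradation.
    """
    allowed = [p for p in points if p["nml_loss_pct"] <= max_nml_loss_pct]
    if not allowed:
        return None
    return max(allowed, key=lambda p: p["reverse_bias_v"])


best = pick_operating_point(REPORTED_POINTS, max_nml_loss_pct=11)
vdd2 = VDD1 + best["reverse_bias_v"]
print(best["reverse_bias_v"], round(vdd2, 1))
```

With an 11% budget, the sketch returns the 1.8 V point, matching the $V_{DD2}$ of 3.6 V adopted in the analysis; a tighter budget returns the smaller bias or no point at all.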

Simulation results characterizing the delay and power gains achievable with the DVTVK circuit technique for a same size keeper as compared to SD are analyzed in Section IV-A1. Since the contention current is significantly reduced with the proposed variable threshold voltage keeper circuit technique, the size of the keeper transistor can be increased to improve the noise immunity without degrading the delay and power characteristics as compared to a SD logic circuit. The improvement in noise immunity offered by the DVTVK technique under the same delay, power, or power-delay product conditions as compared to SD is presented in Section IV-A2.

TABLE I. A Comparison of the Evaluation Delay, Power Dissipation, Power-Delay Product (PDP), and NML (for Maximum Reverse Body Biased Keeper) of SD and DVTVK Circuit Techniques for KPR = 2.2

| Circuit   | Evaluation Delay (ps) | Power (µW) | PDP (fJ) | NML (mV) |
|-----------|-----------------------|------------|----------|----------|
| SD        | 291                   | 2625       | 764      | 478      |
| DVTVK     | 116                   | 1717       | 199      | 427      |
| Reduction | 60%                   | 35%        | 74%      | -11%     |
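The entries of Table I are mutually consistent, which can be confirmed with a few lines of arithmetic. The sketch below recomputes the PDP and the percentage changes from the tabulated delay, power, and NML values; the helper names are hypothetical. Note that a delay in picoseconds multiplied by a power in microwatts yields $10^{-18}$ J, so the product is scaled by $10^{-3}$ to obtain femtojoules.

```python
TABLE_I = {
    "SD":    {"delay_ps": 291, "power_uW": 2625, "nml_mV": 478},
    "DVTVK": {"delay_ps": 116, "power_uW": 1717, "nml_mV": 427},
}


def pdp_fJ(delay_ps, power_uW):
    """Power-delay product in femtojoules (1 ps * 1 uW = 1e-18 J = 1e-3 fJ)."""
    return delay_ps * power_uW * 1e-3


def reduction_pct(sd_value, dvtvk_value):
    """Percentage reduction of DVTVK relative to SD."""
    return 100.0 * (sd_value - dvtvk_value) / sd_value


sd, dv = TABLE_I["SD"], TABLE_I["DVTVK"]
pdp_sd = pdp_fJ(sd["delay_ps"], sd["power_uW"])
pdp_dv = pdp_fJ(dv["delay_ps"], dv["power_uW"])

print(round(pdp_sd), round(pdp_dv))
print(round(reduction_pct(sd["delay_ps"], dv["delay_ps"])),
      round(reduction_pct(sd["power_uW"], dv["power_uW"])),
      round(reduction_pct(pdp_sd, pdp_dv)),
      round(reduction_pct(sd["nml_mV"], dv["nml_mV"])))
```

The NML reduction is a loss rather than a gain, which is why it appears with a negative sign in the last row of Table I.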

1) Improved Delay and Power Characteristics With Comparable Noise Immunity: The keeper width is a multiple of the equivalent width of the pulldown critical path and is varied to evaluate the delay, power, and noise immunity characteristics. The evaluation delay, power, power-delay product (PDP), and NML of the SD and DVTVK circuits as a function of the keeper to critical path effective transistor width ratio (KPR) are shown in Fig. 8. Provided that the input vector combination that produces the worst case evaluation delay is applied, the fourth dynamic node of the SD circuit cannot be fully discharged during the entire evaluation phase for KPR values above 2.2 due to the high contention current in SD logic circuits. A KPR of 2.2 is, therefore, the largest value that is considered in this analysis. The gain in delay, power, and PDP achieved by the proposed technique is listed in Table I.
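To make the KPR definition concrete, the keeper width can be related to the equivalent width of the critical pulldown path. The sketch below assumes the common first-order approximation in which series-connected transistors behave like resistors whose conductance is proportional to their width; it further assumes that the critical path consists of the five transistors $N_5$ through $N_9$, each of width $2W_{N1}$ as listed in the caption of Fig. 6. The function names are hypothetical, and widths are expressed in units of $W_{N1}$.

```python
def series_equivalent_width(widths):
    """First-order equivalent width of a series transistor stack.

    Treats each device as a conductance proportional to its width, so the
    stack behaves like one device of width 1 / sum(1 / W_i).
    """
    return 1.0 / sum(1.0 / w for w in widths)


def keeper_width(kpr, critical_path_widths):
    """Keeper width for a given keeper to critical path width ratio (KPR)."""
    return kpr * series_equivalent_width(critical_path_widths)


# Assumed critical path: N5..N9, each 2 * W_N1 wide (widths in units of W_N1).
critical_path = [2.0] * 5
print(round(series_equivalent_width(critical_path), 3))
for kpr in (0.6, 2.2):
    print(kpr, round(keeper_width(kpr, critical_path), 3))
```

Under these assumptions, the equivalent width of the critical path is $2W_{N1}/5$, so the KPR range of 0.6 to 2.2 considered here corresponds to keeper widths between roughly $0.24W_{N1}$ and $0.88W_{N1}$.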

The proposed variable threshold voltage keeper circuit technique is effective for enhancing the evaluation speed of domino logic circuits. The enhancement in circuit speed of DVTVK as compared to SD is 8% for a KPR of 0.6. As shown in Fig. 8(a), the effectiveness of the proposed technique increases with larger keeper size as the degradation in circuit speed becomes more severe due to increased contention current. As listed in Table I, DVTVK improves the evaluation delay by 60% as compared to SD for a KPR of 2.2. As shown in Fig. 8(b), the proposed circuit technique also lowers the power consumption for a wide range of keeper sizes. As listed in Table I, DVTVK reduces the power by 35% as compared to SD (for a KPR of 2.2). As the keeper size is decreased, the effect of the keeper contention current on the power dissipation becomes smaller. The reduction in power, therefore, diminishes with decreasing keeper size. Due to the energy overhead of the dynamic body bias generator circuit, the power consumed by DVTVK is 13% greater than that of SD when the KPR is reduced to 0.6.

The power-delay product (PDP) of the circuits is also illustrated in Fig. 8 to better compare the effect of the proposed variable threshold voltage keeper circuit technique on circuit performance and energy dissipation. SD has a higher PDP as compared to DVTVK for values of KPR greater than 0.8. As listed in Table I, DVTVK lowers the PDP by 74% as compared to SD for a KPR of 2.2.

Another important metric for domino circuits is the noise immunity. The proposed circuit technique degrades the noise immunity as compared to SD, although only at the beginning of the evaluation phase. This degradation occurs for a brief amount of time until the threshold voltage of the keeper is lowered for increased noise immunity. The time delay $(t_D)$ at the beginning of the evaluation phase, after which the keeper current drive is increased to the low-$V_t$ level, is determined by the worst case evaluation delay of the domino gate. The degradation in noise immunity changes between 8% and 11% under maximum reverse body bias conditions as the KPR is increased from 0.6 to 2.2. As shown in Fig. 8(c), the noise immunity of DVTVK is identical to the noise immunity of SD whenever a zero body bias is applied to the keeper.

Fig. 8. SD and DVTVK simulation results for different keeper to critical path equivalent transistor width ratios (KPR). (a) Evaluation delay versus KPR. (b) Power dissipation versus KPR. (c) Noise margin versus KPR. (d) Power delay product versus KPR.

2) Improved Noise Immunity With Comparable Delay or Power Characteristics: The DVTVK circuit technique is shown to offer significant delay and power savings for the same size keeper as compared to SD. Because of the high contention current in SD logic circuits, the circuit evaluation delay and power increase significantly with larger keeper size. As explained in Section II, the significant speed and energy penalty incurred to increase the noise immunity in SD logic circuits is due to the static strength of the keeper current during the entire evaluation phase. As shown in Fig. 8, the NML of SD and zero body biased DVTVK increases by 34% as the KPR is increased from 0.6 to 2.2. The adverse effect of increased keeper size on the delay and power characteristics is significantly lower for DVTVK as compared to SD. As shown in Fig. 8, the evaluation delay and power dissipation of SD (DVTVK) are increased by 3.8 (1.6) times and 2.6 (1.5) times, respectively, for a 34% noise immunity improvement as the KPR is increased from 0.6 to 2.2. The PDP of SD (DVTVK) increases 10 (2.5) times for a KPR of 2.2 as compared to a KPR of 0.6.
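As a rough consistency check, and assuming only that the PDP is the product of the evaluation delay $t$ and the power $P$, the PDP growth factor between KPR = 0.6 and KPR = 2.2 follows from the quoted delay and power growth factors:

$$
\frac{\mathrm{PDP}(2.2)}{\mathrm{PDP}(0.6)} = \frac{t(2.2)}{t(0.6)} \cdot \frac{P(2.2)}{P(0.6)}, \qquad
\text{SD: } 3.8 \times 2.6 \approx 9.9, \qquad
\text{DVTVK: } 1.6 \times 1.5 = 2.4 .
$$

These products are close to the reported factors of 10 and 2.5. The small differences are presumably due to rounding of the quoted delay and power factors.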

TABLE II Achievable Improvement in NML With the DVTVK Circuit Technique as Compared to SD While Maintaining Equal Delay, Power Dissipation, or PDP (KPR of DVTVK is 2.2)

|            | SD-KPR | NML improvement as compared to SD (zero body bias) | NML improvement as compared to SD (reverse body bias) |
|------------|--------|-----------------------------------------------------|--------------------------------------------------------|
| Same Delay | 1.34   | 14.1%                                               | 1.9%                                                   |
| Same Power | 1.63   | 8.9%                                                | -2.7%                                                  |
| Same PDP   | 1.45   | 11.9%                                               | 0.0%                                                   |

Since the contention current is significantly reduced with the proposed variable threshold voltage keeper technique, the width of the keeper transistor in a DVTVK circuit can be increased without degrading the delay and power characteristics as compared to a SD logic circuit. DVTVK, therefore, offers higher noise immunity as compared to SD under the same delay, power, or power-delay product conditions. The KPR of DVTVK is fixed at 2.2 (the highest value considered during the analysis). The SD keeper size is reduced to lower the contention current, offering the same delay, power, or PDP as compared to DVTVK. The improvement in the NML of DVTVK as compared to SD (under both the maximum reverse body biased and zero body biased DVTVK keeper conditions) is listed in Table II. The KPR of SD required for the same delay, power dissipation, or PDP characteristics as compared to the DVTVK circuit technique is also listed in Table II.



Fig. 9. Clock delayed domino logic with the proposed variable threshold voltage keeper circuit technique.

As listed in Table II, the NML of DVTVK (zero body biased keeper) is 14.1% higher as compared to SD when the SD keeper is sized for comparable evaluation speed. Since the keeper transistor in the CG-DVTVK circuit is sized 64% larger than the keeper in CG-SD, the noise immunity of CG-DVTVK is higher as compared to CG-SD even at the beginning of the evaluation phase when the keeper threshold voltage is increased by reverse body biasing the keeper. Under the same power dissipation conditions, the NML of DVTVK with zero body biased keeper improves by 8.9% as compared to SD. When the power-delay products of DVTVK and SD are maintained the same, the DVTVK (with zero body biased keeper) offers an 11.9% higher NML as compared to SD.
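The 64% figure follows from the KPR values in Table II. A minimal sketch, assuming that both circuits share the same pulldown critical path width so that the ratio of KPR values equals the ratio of keeper widths (the helper name `keeper_upsizing` is illustrative):

```python
DVTVK_KPR = 2.2  # keeper to critical path width ratio used for DVTVK in Table II

# SD keeper ratios that match DVTVK in delay, power, or PDP (Table II).
SD_KPR = {"same delay": 1.34, "same power": 1.63, "same PDP": 1.45}


def keeper_upsizing(dvtvk_kpr, sd_kpr):
    """Fraction by which the DVTVK keeper is wider than the SD keeper."""
    return dvtvk_kpr / sd_kpr - 1.0


upsizing = {case: keeper_upsizing(DVTVK_KPR, kpr) for case, kpr in SD_KPR.items()}

for case, value in upsizing.items():
    print(f"{case}: DVTVK keeper is {value:.0%} larger")
```

For the same-delay case the ratio 2.2/1.34 gives a keeper that is roughly 64% larger, matching the text; the same calculation applies to the same-power and same-PDP rows.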

### B. Clock-Delayed DVTVK

As discussed in Section II, footless domino logic circuits have enhanced speed and power characteristics as compared to footed domino logic circuits. Cascaded footless domino logic circuits, however, require careful control of the relative timing of the clock and input signals. When the DVTVK circuit technique is applied to a clock-delayed footless domino circuit, the body bias signals should be delayed with respect to the input signals at each footless domino stage. Appropriate timing of the body bias signal is crucial for maximizing the savings in the delay and power without sacrificing noise immunity with the proposed circuit technique. The proposed DVTVK circuit technique is applied to cascaded footless domino OR gates as shown in Fig. 9. A three-stage chain of eight input domino OR gates with a fan-out of three (COR) is investigated.

A body bias signal that swings between $V_{DD1}$ and $V_{DD2}$ is generated in the first stage of a clock-delayed domino circuit from a clock signal that swings between ground and $V_{DD1}$. The substrates of the keepers within the domino gates in the following stages are driven by cascaded inverters supplied by $V_{DD1}$ and $V_{DD2}$ (as shown in Fig. 9). The delay and drive strength of these inverters are adjusted in each domino stage to maintain the correct timing of the body bias signals. The clock and body bias signals are delayed at each footless domino stage, maximizing the savings in the delay and power with the proposed variable threshold voltage keeper circuit technique.
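The timing requirement can be illustrated with a small scheduling sketch. It assumes that the keeper of each stage should leave the reverse body biased state no earlier than the arrival of that stage's delayed clock plus the worst case evaluation delay of the stage. The `Stage` class, the margin parameter, and all numerical values are hypothetical and are not taken from the simulated circuits.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    clock_arrival_ps: float    # arrival time of the delayed clock at this stage (hypothetical)
    worst_case_eval_ps: float  # worst case evaluation delay of this stage (hypothetical)


def body_bias_switch_times(stages, margin_ps=0.0):
    """Earliest time at which each stage's keeper may leave reverse body bias."""
    return [s.clock_arrival_ps + s.worst_case_eval_ps + margin_ps for s in stages]


# Three cascaded footless stages with illustrative (not simulated) delays.
stages = [Stage(0.0, 120.0), Stage(100.0, 130.0), Stage(200.0, 140.0)]
switch_times = body_bias_switch_times(stages, margin_ps=10.0)

for index, time_ps in enumerate(switch_times, start=1):
    print(f"stage {index}: body bias may switch after {time_ps:.0f} ps")
```

In this simplified view, the delay of the inverters driving the keeper substrates in each stage would be chosen to meet the corresponding switch time.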

The keeper width is a multiple of the width of a pulldown network transistor (all of the nMOS transistors in a pulldown path are sized the same) and is varied to evaluate the delay, power, and noise immunity characteristics of a chain of domino logic circuits with variable threshold voltage keepers (COR-DVTVK) and a chain of domino logic circuits with standard keepers (COR-SD). A 1-GHz clock with a 50% duty cycle is applied to the circuits. All of the common transistors in the SD and DVTVK test circuits are sized the same. Each domino gate at the third stage drives a 10 fF load. The savings in evaluation delay, power, and PDP of COR-DVTVK as compared to COR-SD for different keeper sizes are listed in Table III.

As listed in Table III, DVTVK improves the evaluation delay, power, and PDP by 6.9%, 0.6%, and 7.5%, respectively, as compared to SD for a KPR = 0.6. The effectiveness of the proposed technique increases with larger keeper size as the degradation in circuit speed and power characteristics becomes more severe due to increased keeper contention. The enhancement in circuit speed, power, and PDP of DVTVK as compared to SD are 43.4%, 37.2%, and 64.4%, respectively, for a KPR of 2.2. The degradation in noise immunity (NML) changes between 5.9% and 6.5% as the KPR is varied between 0.6 and 2.2.

TABLE III SAVINGS IN DELAY, POWER, AND PDP OF COR-DVTVK AS COMPARED TO COR-SD WITH DIFFERENT KEEPER SIZES

| KPR | Delay improvement (%) | Power improvement (%) | PDP improvement (%) | NML improvement (%) |
|-----|-----------------------|-----------------------|---------------------|---------------------|
| 0.6 | 6.9                   | 0.6                   | 7.5                 | -6.1                |
| 0.8 | 9.9                   | 3.2                   | 12.8                | -5.9                |
| 1.0 | 12.3                  | 5.7                   | 17.3                | -5.9                |
| 1.2 | 15.8                  | 8.8                   | 23.2                | -6.0                |
| 1.4 | 19.3                  | 12.7                  | 29.5                | -6.0                |
| 1.6 | 23.3                  | 16.8                  | 36.2                | -6.1                |
| 1.8 | 28.6                  | 21.9                  | 44.2                | -6.2                |
| 2.0 | 35.0                  | 28.5                  | 53.5                | -6.5                |
| 2.2 | 43.4                  | 37.2                  | 64.4                | -6.4                |

TABLE IV Achievable Improvement in NML With the DVTVK Circuit Technique as Compared to SD While Maintaining Equal Delay, Power Dissipation, or PDP (KPR of DVTVK is 2.2)

|            | SD-KPR | NML improvement as compared to SD (zero body bias) | NML improvement as compared to SD (reverse body bias) |
|------------|--------|-----------------------------------------------------|--------------------------------------------------------|
| Same Delay | 1.45   | 8.1%                                                | 0.0%                                                   |
| Same Power | 1.61   | 6.1%                                                | -1.8%                                                  |
| Same PDP   | 1.52   | 7.2%                                                | -0.8%                                                  |

Similar to CG-DVTVK, the keeper transistors in a COR-DVTVK circuit can be sized larger, offering higher noise immunity with the same delay and power characteristics as compared to a SD logic circuit. The keeper transistors of COR-DVTVK and COR-SD are sized for the same delay, power, or PDP characteristics. The improvement in the NML of COR-DVTVK as compared to COR-SD (under both the maximum reverse body biased and zero body biased keeper conditions) is listed in Table IV. COR-DVTVK offers 8.1% higher noise immunity as compared to SD with the same evaluation speed. The larger size of the COR-DVTVK keeper compensates for the reduced gate drive ($|V_{GS} - V_{tp}|$) of the keeper transistor at the beginning of the evaluation phase when the keeper is reverse body biased. The noise margins of COR-DVTVK with a reverse body biased keeper and COR-SD for the same evaluation delay are, therefore, equal.

## C. Impact of Gate Size on the Energy Overhead of the Dynamic Body Bias Generator

It is assumed that each of the carry generator outputs (in Section IV-A) and the third stage footless domino OR gate outputs (in Section IV-B) drive a 10-fF load. The transistors in the domino logic circuits are sized to operate with a 1-GHz clock with a 50% duty cycle. In Fig. 6, $W_{N1} = 25W_{min}$ and $W_{pullup} = 8W_{min}$. In Fig. 9, $W_{pulldown} = 10W_{min}$ and $W_{pullup} = 9W_{min}$. In the body bias generators, $P_1$, $P_2$, $P_3$, $P_4$, $N_1$, $N_2$, and the transistors within $I_1$ are minimum sized ($L = L_{min}$ and $W = W_{min}$) while the size and number of inverters have been adjusted to appropriately delay the body bias signals. The DVTVK circuit technique increases the required area by 2.3% to 2.8% and 2.6% to 3% as compared to CG-SD and COR-SD, respectively, for $0.6 \le \text{KPR} \le 2.2$. For increasing keeper size, the delay elements (the inverters) are resized to strengthen the body bias signal while most of the transistors forming the DBBG are minimum size. The energy savings due to the reduced contention current as compared to a SD circuit typically exceeds the additional energy dissipated by the body bias generator.

The effect of reducing the output load capacitance on the delay and power characteristics of the proposed DVTVK circuit technique is evaluated in this section for a four-bit multiple-output domino carry generator (CG) and a cascaded three-stage chain of eight-input clock-delayed domino OR gates (COR). The load capacitance is scaled from 2 fF to 10 fF while maintaining a clock frequency of 1 GHz. The savings in the delay, power, and PDP of the CG-DVTVK and COR-DVTVK circuits vary with the load capacitance as shown in Fig. 10 (KPR = 2.2).

The DBBG is used to only drive the substrate of the keeper transistors in the domino logic circuits. Most of the transistors in a DBBG are, therefore, sized minimum even for a high output load capacitance. The energy overhead of DBBG becomes more significant as the pullup, pulldown, and the output inverter transistors of the domino logic circuits are scaled together with the load capacitance. As shown in Fig. 10, the power savings are reduced for a smaller output load capacitance. The degradation in the power savings of the CG is more significant as compared to COR at small load capacitances. This behavior is explained by the same DBBG being shared by several OR gates in the second and third stages of COR-DVTVK, reducing the overall energy overhead of the DBBG circuits. At high loads, however, the power savings of CG-DVTVK and COR-DVTVK are similar. The speed enhancement by the proposed DVTVK technique is primarily dependent on the relative size of the pulldown network transistors and the keeper. The effectiveness of the DVTVK circuit technique for improving the delay characteristics as compared to SD is relatively insensitive to the load capacitance as shown in Fig. 10 (for the same keeper to pulldown network transistor width ratio).

## V. DOMINO LOGIC WITH FORWARD AND REVERSE BODY BIASED KEEPER

Reverse body biasing the keeper at the beginning of the evaluation phase is effective for simultaneously improving the speed and power characteristics of domino logic circuits. Zero body biasing the keeper transistor after the worst case evaluation delay is proposed in order to not sacrifice noise immunity with the proposed variable threshold voltage keeper circuit technique. Alternatively, forward body biasing (FBB) the keeper after the worst case evaluation delay is proposed to improve the noise immunity characteristics as compared to SD. The threshold voltage of a forward body biased MOSFET is reduced, increasing the conduction current as compared to a zero body biased transistor with the same physical dimensions. Forward body biasing the keeper, therefore, improves the noise immunity characteristics as compared to a standard domino logic circuit with the same keeper size. The proposed DVTVK circuit technique with a forward and reverse body biased keeper is applied to cascaded footless domino OR gates. Simulation results for the COR-DVTVK with a forward body biased keeper are presented in Section V-A. Technology scaling characteristics of the reverse and FBB techniques applied to a keeper transistor are discussed in Section V-B.

Fig. 10. Variation of the savings in delay, power, and PDP of the CG-DVTVK and COR-DVTVK circuits with output load capacitance as compared to CG-SD and COR-SD, respectively (KPR = 2.2).

## A. Clock-Delayed Domino Logic With Forward and Reverse Body Biased Keeper

A three-stage chain of eight-input domino OR gates with a fan-out of three (COR) is simulated assuming a 0.18-$\mu$m CMOS technology. The only difference in the dynamic body bias generator (DBBG) of the domino circuit with a forward biased keeper is that $V_{DD1}$ (as shown in Figs. 5 and 9) is replaced by a smaller supply voltage $V_{DD3}$ ($V_{DD3} < V_{DD1}$). A body bias signal that swings between $V_{DD3}$ and $V_{DD2}$ is generated in the first stage of the clock-delayed domino circuit from a clock signal that swings between ground and $V_{DD1}$. The substrates of the keepers within the domino logic gates in the following stages are driven by cascaded inverters supplied by $V_{DD3}$ and $V_{DD2}$. An eight-input footless domino OR gate with a FBB keeper is shown in Fig. 11.

When a keeper transistor is forward body biased, the source-to-body and drain-to-body p-n junctions produce diode currents as illustrated in Fig. 11. The forward body bias voltage that can be applied to a MOSFET is limited due to these diode currents. The diode current through the drain-to-body p-n junction ($I_{diode2}$) opposes the drain current ($I_{drain}$) of a keeper transistor. $I_{diode2}$ attempts to discharge the dynamic node while $I_{drain}$ is charging the node. The drain-to-substrate current, therefore, reduces the net current supplied by the keeper to maintain the state of the dynamic node. The noise margin is greater at forward body bias voltages where the improvement in the keeper drain current due to the reduced threshold voltage dominates the increased drain-to-body junction current. For strongly forward body biased keepers, $I_{diode2}$ lowers (clamps) the voltage of the dynamic node. At room temperature, the dc operating point of the dynamic node when all of the pulldown transistors are cutoff (ideal noiseless condition) is reduced by more than 5% for forward body bias voltages higher than 700 mV. The noise immunity can, therefore, be reduced when the body diode is strongly turned on at high FBB voltages.

Fig. 11. An eight-input footless domino OR gate with a forward body biased keeper.

Fig. 12. Variation of the noise immunity of an eight-input domino OR gate with forward body bias voltage for KPR = 1 and KPR = 2.2. The noise immunity values are normalized to the zero body biased keeper condition. NML-ONE: noise couples to one input while all of the other inputs are grounded. NML-ALL: noise couples to all of the inputs.

Fig. 13. Variation of the savings in delay, power, and PDP of COR-DVTVK as compared to COR-SD with the forward body bias voltage for two different keeper sizes. (Delay1, Power1, PDP1): KPR = 1. (Delay2.2, Power2.2, PDP2.2): KPR = 2.2.

The noise immunity criterion used in this section is similar to the criterion described in [4]. The variation in the noise immunity characteristics of an eight input footless domino OR gate with the body bias voltage applied to the keeper transistor is shown in Fig. 12, for two different noise coupling scenarios. All of the values are normalized to the standard zero body biased keeper case. As shown in Fig. 12, increasing the forward body bias voltage towards 700 mV enhances the noise immunity. For a forward body bias voltage of 700 mV, the enhancement in noise immunity varies between 3.8% (noise couples to all of the inputs) and 11.2% (noise couples to only one input) as compared to a standard domino logic circuit with the same size transistors (KPR = 2.2). As the forward body bias voltage is increased beyond 700 mV, the body diodes are strongly turned on, degrading the noise immunity.
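The trade-off between the larger keeper drain current and the larger junction diode current can be illustrated with a toy model. The sketch below is not the simulation setup of this paper: it assumes a square-law drain current with a threshold voltage that falls linearly with the forward body bias, and a Shockley diode for the drain-to-body junction. Every parameter value (`k`, `overdrive0`, `vt_per_volt`, `i_sat`) is hypothetical, so the location of the optimum is only qualitative.

```python
import math

THERMAL_VOLTAGE = 0.02585  # kT/q near room temperature, in volts


def keeper_drain_current(v_fbb, k=200e-6, overdrive0=0.8, vt_per_volt=0.2):
    """Toy square-law drain current; forward bias lowers |Vt| linearly (assumption)."""
    overdrive = overdrive0 + vt_per_volt * v_fbb
    return k * overdrive ** 2


def junction_diode_current(v_fbb, i_sat=1e-18):
    """Shockley current of the forward biased drain-to-body junction (hypothetical i_sat)."""
    return i_sat * (math.exp(v_fbb / THERMAL_VOLTAGE) - 1.0)


def net_keeper_current(v_fbb):
    """Net current available to hold the dynamic node: drain current minus diode current."""
    return keeper_drain_current(v_fbb) - junction_diode_current(v_fbb)


# Sweep the forward body bias in 50 mV steps and pick the bias with the largest net current.
sweep_mv = range(0, 901, 50)
best_mv = max(sweep_mv, key=lambda mv: net_keeper_current(mv / 1000.0))

print(f"largest net keeper current in this toy model at {best_mv} mV")
```

With these assumed parameters the net current first rises with the forward body bias and then collapses once the exponential diode current dominates, which mirrors the qualitative trend described for Fig. 12.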

A FBB voltage of 700 mV provides the highest enhancement in the noise immunity characteristics at room temperature. For FBB voltages above 600 mV, however, the power overhead of the DVTVK circuit technique significantly increases due to the high diode currents. The variation of the savings in delay, power, and PDP of COR-DVTVK as compared to COR-SD with 500 and 600 mV FBB for two different KPR values is illustrated in Fig. 13. The improvement in delay, power, PDP, and NML of the DVTVK circuit technique as compared to SD for a forward body bias voltage of 600 mV with two different keeper sizes is listed in Table V.

TABLE V SAVINGS IN DELAY, POWER, POWER-DELAY PRODUCT (PDP), AND NML OF COR-DVTVK AS COMPARED TO COR-SD (WITH THE FORWARD BODY BIAS VOLTAGE OF 0.6 V)

| KPR | Delay improvement (%) | Power improvement (%) | PDP improvement (%) | NML-ALL improvement (%) | NML-ONE improvement (%) |
|-----|-----------------------|-----------------------|---------------------|-------------------------|-------------------------|
| 1   | 12.3                  | -8.9                  | 4.5                 | 2.4                     | 6.8                     |
| 2.2 | 43.4                  | 28.3                  | 59.4                | 3.5                     | 10.2                    |

The speed enhancement of the DVTVK circuit technique is primarily dependent on the reverse body bias voltage applied to the keeper at the beginning of the evaluation phase. For a $V_{DD2}$ of 3.6 V, therefore, the delay savings of the proposed DVTVK circuit are similar to the delay savings reported in Section IV. As shown in Fig. 13, the improvement in the delay of the DVTVK circuit technique is approximately 43% under the 500 mV and 600 mV FBB conditions (KPR = 2.2).

The power overhead of the DVTVK circuit technique increases when the keeper is forward body biased due to the junction diode currents and the increased voltage swing of the DBBG and keeper substrate (from $V_{DD1} \rightarrow V_{DD2}$ to $V_{DD3} \rightarrow V_{DD2}$). As listed in Table V, the power savings of the DVTVK circuit technique is reduced to 28.3% as the forward body bias voltage is increased to 600 mV (KPR = 2.2 and load = 10 fF). Similar to the analysis described in Section IV, for smaller keeper sizes, the effect of the keeper contention current on the evaluation delay and power dissipation is less. The reduction in delay is, therefore, lower and the power savings is smaller with decreased keeper size. As the KPR is reduced to 1, the savings in delay and PDP are reduced to 12.3% and 4.5%, respectively. Since the energy overhead of the DVTVK circuit technique increases when the keeper is forward body biased, the power dissipation of DVTVK is 8.9% higher as compared to SD for a KPR = 1 when the keeper transistor is forward body biased by 600 mV.
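The PDP can still improve while the power dissipation increases, because the delay saving outweighs the power overhead. Assuming that the PDP is the product of power and delay and that the percentages in Table V are relative to SD, the PDP savings follow from the delay and power entries:

$$
\text{KPR} = 1:\quad 1 - (1 - 0.123)(1 + 0.089) = 1 - 0.877 \times 1.089 \approx 0.045,
$$

$$
\text{KPR} = 2.2:\quad 1 - (1 - 0.434)(1 - 0.283) = 1 - 0.566 \times 0.717 \approx 0.594 .
$$

Both results agree with the PDP column of Table V (4.5% and 59.4%).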

For a forward body bias voltage of 600 mV and KPR = 2.2, the enhancement in noise immunity varies between 3.5% (noise couples to all of the inputs) and 10.2% (noise couples to one input). For a KPR = 1, the range of the enhancement in the noise immunity for a 600 mV FBB is between 2.4% and 6.8%.

## B. Technology Scaling Characteristics of the Reverse and Forward Body Bias Techniques Applied to a Keeper Transistor

Dynamically adjusting the current drive of the keeper transistors in a domino logic circuit is proposed in this paper. The threshold voltage of a keeper transistor is modified during circuit operation by body biasing the keeper transistor. More general body bias schemes have been proposed in the literature in order to enhance speed (by lowering the threshold voltage of the transistors), to reduce active power (by lowering both the supply and threshold voltages while maintaining the same speed as compared to a high threshold voltage circuit), to decrease active and standby leakage current (by increasing the threshold voltage of the transistors in the idle portions of a circuit), or to control the within-die and die-to-die threshold voltage variations (by adaptive body biasing) [6]–[10]. In a circuit where the body bias voltages of all of the transistors are modified, the power and current demand of the body bias generator can become significant [6]. A dynamic body bias generator is proposed in this paper to drive only the keeper transistors in a domino logic circuit. The power and current demand of the body bias generator for the proposed variable threshold voltage keeper circuit technique is, therefore, small.

Reverse body biasing is typically applied to reduce the subthreshold leakage current  $(I_{\text{off}})$  when a circuit is idle [7], [8]. There is an exponential relationship between the subthreshold leakage current and threshold voltage of a MOSFET [13]. Reverse body biasing a transistor increases the threshold voltage, thereby reducing the subthreshold leakage current. Increasing the reverse body bias voltage, however, also increases the band-to-band tunnelling current in the source-to-substrate and drain-to-substrate p-n junctions. At high reverse body bias voltages, the increased band-to-band tunnelling current becomes comparable to the reduced subthreshold leakage current. There is, therefore, an optimum reverse body bias voltage (limited by the increased band-to-band tunnelling currents) that can be applied to a transistor to reduce the total leakage current [7], [8]. Reverse body biasing the keeper transistor is proposed in this paper in order to reduce the active mode conduction current  $(I_{\text{drain}}$  when the keeper is on) rather than the subthreshold leakage current ( $I_{\text{off}}$  when the keeper is off). The maximum reverse body bias that can be applied to a keeper transistor is, therefore, not limited by the increased band-to-band tunnelling current in the DVTVK circuit technique.
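The exponential dependence of the subthreshold leakage current on the threshold voltage is often expressed through the subthreshold swing $S$, with $I_{\text{off}} \propto 10^{-V_t/S}$. The sketch below uses this standard first-order relation; the default swing of 85 mV/decade is a hypothetical value chosen for illustration and is not a parameter of the technology used in this paper. The model deliberately ignores the band-to-band tunnelling current discussed above.

```python
def subthreshold_leakage_ratio(delta_vt_mv, swing_mv_per_decade=85.0):
    """Ratio of leakage after a threshold voltage increase to leakage before it.

    First-order model: I_off is proportional to 10 ** (-Vt / S).
    The default swing is an illustrative assumption.
    """
    return 10.0 ** (-delta_vt_mv / swing_mv_per_decade)


# Raising the threshold voltage by one swing cuts the leakage by a decade.
one_decade = subthreshold_leakage_ratio(85.0)

# A hypothetical 50 mV increase produced by reverse body biasing.
reduced = subthreshold_leakage_ratio(50.0)

print(f"leakage ratio for +85 mV: {one_decade:.3f}")
print(f"leakage ratio for +50 mV: {reduced:.3f}")
```

Because the keeper in the DVTVK circuit is reverse body biased to reduce its on-state conduction current rather than its off-state leakage, this relation describes the conventional use of reverse body biasing and not the mechanism exploited by the proposed technique.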

The maximum voltage that can be applied across the gate oxide of a MOSFET is another factor that limits the reverse body bias voltage. Due to the scaling of the gate oxide thickness, the maximum reverse body bias voltage that can be applied to a keeper can be reduced in future DSM technology generations. The savings in delay and power of the variable threshold voltage keeper circuit technique as compared to SD are reduced at lower keeper reverse body bias voltages as discussed in Section IV-A.

The effectiveness of reverse body biasing is reduced with technology scaling due to increasing short-channel and decreasing body effects [7], [8]. Forward body biasing has often been proposed as an alternative to reverse body biasing [8], [9]. FBB enhances body effect while reducing short-channel effects. FBB is expected to become more effective for controlling the threshold voltage of MOSFETs fabricated in future DSM process technologies as the supply to threshold voltage ratio decreases with technology scaling [6], [8]. FBB, however, produces diode currents through the source-to-substrate and drain-to-substrate p-n junctions. These diode currents can become comparable to the drain current of a keeper transistor at low drain-to-source voltages provided the forward body bias voltage is increased beyond a specific value dependent on the junction temperature (700 mV at room temperature). The diode currents degrade the dc operating voltage of the dynamic node even when all of the pulldown transistors are turned off. Moreover, the diode currents increase the power overhead of the DVTVK circuit technique. The increased diode currents, therefore, limit the maximum forward body bias voltage that can be applied to a keeper transistor for enhanced noise immunity.
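The statement that FBB enhances the body effect can be illustrated with the textbook long-channel body-effect expression, $\Delta V_t = \gamma(\sqrt{2\phi_F + V_{SB}} - \sqrt{2\phi_F})$, written here in terms of magnitudes so that a positive $V_{SB}$ denotes reverse bias and a negative $V_{SB}$ denotes forward bias. This is a first-order model rather than the device model used for the simulations in this paper, and the values of `gamma` and `two_phi_f` are hypothetical.

```python
import math


def threshold_shift(v_sb, gamma=0.4, two_phi_f=0.8):
    """Change in |Vt| for a source-to-body bias v_sb (positive: reverse, negative: forward)."""
    if two_phi_f + v_sb < 0:
        raise ValueError("forward bias exceeds the range of this first-order model")
    return gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def body_effect_sensitivity(v_sb, gamma=0.4, two_phi_f=0.8):
    """Slope d|Vt|/dV_SB; a larger slope means a stronger body effect."""
    if two_phi_f + v_sb <= 0:
        raise ValueError("forward bias exceeds the range of this first-order model")
    return gamma / (2.0 * math.sqrt(two_phi_f + v_sb))


for bias in (-0.5, 0.0, 1.0):
    print(
        f"V_SB = {bias:+.1f} V: shift = {threshold_shift(bias):+.3f} V, "
        f"sensitivity = {body_effect_sensitivity(bias):.3f} V/V"
    )
```

In this model the slope is larger under forward bias than under reverse bias, which is one simple way to see why forward body biasing offers stronger threshold voltage control per unit of bias voltage.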

## VI. CONCLUSION

A high-speed, low-power domino logic circuit technique is proposed. The proposed technique dynamically changes the threshold voltage of the keeper with a specific delay after the beginning of each operational phase (evaluation and precharge) of the domino circuit by varying the body bias voltage of the keeper transistor. The keeper contention current is reduced by increasing the keeper threshold voltage by applying a reverse body bias to the keeper at the beginning of the evaluation phase. Similarly, the degradation in noise immunity of DVTVK as compared to SD is avoided by reducing the keeper threshold voltage to the zero body bias level after a delay greater than the worst case evaluation delay of a domino logic circuit. Significant enhancements in speed and reductions in power are achieved when the keeper is sized for increased noise immunity.

The DVTVK and SD circuit techniques are compared in terms of the evaluation delay and power dissipation assuming the DVTVK and SD circuits have the same keeper size. The DVTVK technique operates at up to 60% higher speed while consuming 35% less power as compared to SD. DVTVK also reduces the PDP by up to 74% as compared to SD. A temporary degradation in the noise immunity of DVTVK of less than 11% as compared to SD is observed when the keeper of the DVTVK is reverse body biased.

Since the contention current is significantly reduced with the proposed variable threshold voltage keeper technique, the keeper transistor in a DVTVK circuit can be sized larger, offering greater noise immunity with the same delay and power characteristics as compared to a SD logic circuit. The DVTVK and SD circuit techniques are compared in terms of the noise immunity that the two circuit techniques offer with the same evaluation delay, power dissipation, or power-delay product characteristics. For the same evaluation delay characteristics, DVTVK (with a zero biased keeper) offers 14.1% higher noise immunity as compared to SD. Under the same power dissipation conditions, DVTVK (with a zero biased keeper) improves the noise immunity by 8.9% as compared to SD. Similarly, under the same PDP conditions, DVTVK (with a zero biased keeper) offers 11.9% higher noise immunity as compared to SD.

Forward body biasing the keeper transistor is also proposed to improve the noise immunity as compared to a SD circuit with the same keeper size. By applying a forward body bias of 600 mV to a keeper transistor, the noise immunity is enhanced by up to 10.2%. Dynamically forward and reverse body biasing the keeper transistor simultaneously enhances the noise immunity, evaluation speed, power dissipation, and PDP characteristics of a domino logic circuit.

#### REFERENCES

- [1] V. Kursun and E. G. Friedman, "Low swing dual threshold voltage domino logic," in *Proc. ACM/SIGDA Great Lakes Symp. VLSI*, Apr. 2002, pp. 47–52.
- [2] K. J. Nowka and T. Galambos, "Circuit design techniques for a Gigahertz integer microprocessor," in *Proc. IEEE Int. Conf. Computer Design VLSI Computers Processors*, Oct. 1998, pp. 11–16.
- [3] A. Alvandpour, P. Larsson-Edefors, and C. Svensson, "A leakage tolerant multi-phase keeper for wide domino circuits," in *Proc. IEEE Int. Conf. Electronics Circuits Systems*, Sept. 1999, pp. 209–212.
- [4] A. Alvandpour, R. K. Krishnamurthy, K. Soumyanath, and S. Y. Borkar, "A sub-130-nm conditional keeper technique," *IEEE J. Solid-State Circuits*, vol. 37, pp. 633–638, May 2002.
- [5] M. W. Allam, M. H. Anis, and M. I. Elmasry, "High-Speed dynamic logic styles for scaled-down CMOS and MTCMOS technologies," in *Proc. IEEE Int. Symp. Low-Power Electronics Design*, July 2000, pp. 155–160.
- [6] C. Wann *et al.*, "CMOS with active well bias for low-power and RF/analog applications," in *Proc. IEEE Int. Symp. VLSI Technology*, June 2000, pp. 158–159.
- [7] A. Keshavarzi, S. Narendra, S. Borkar, C. Hawkins, K. Roy, and V. De, "Technology scaling behavior of optimum reverse body bias for standby leakage power reduction in CMOS IC's," in *Proc. IEEE Int. Symp. Low-Power Electronics Design*, July 1999, pp. 252–254.
- [8] S. H. Huang *et al.*, "Scalability and biasing strategy for CMOS with active well bias," in *Proc. IEEE Int. Symp. VLSI Technology*, June 2001, pp. 107–108.
- [9] A. Keshavarzi, S. Narendra, B. Bloechel, S. Borkar, and V. De, "Forward body bias for microprocessors in 130 nm technology generation and beyond," in *Proc. IEEE Int. Symp. VLSI Circuits*, June 2002, pp. 312–315.
- [10] J. Tschanz, S. Narendra, R. Nair, and V. De, "Effectiveness of adaptive supply voltage and body bias for reducing the impact of parameter variations in low power and high-performance microprocessors," *Proc. IEEE Int. Symp. VLSI Circuits*, pp. 310–311, June 2002.
- [11] I. S. Hwang and A. L. Fisher, "Ultrafast compact 32-bit CMOS adders in multiple-output domino logic," *IEEE J. Solid-State Circuits*, vol. 24, pp. 358–369, Apr. 1989.
- [12] V. Kursun and E. G. Friedman, "Domino logic with variable threshold voltage keeper," U.S. Patent pending.
- [13] M. Anders, R. Krishnamurthy, R. Spotten, and K. Soumyanath, "Robustness of sub-70 nm dynamic circuits: Analytical techniques and scaling trends," *Proc. IEEE Int. Symp. VLSI Circuits*, pp. 23–24, June 2001.
- [14] V. Kursun and E. G. Friedman, "Variable threshold voltage keeper for contention reduction in dynamic circuits," *Proc. IEEE Int. ASIC/SOC Conf.*, pp. 314–318, Sept. 2002.
- [15] V. Kursun and E. G. Friedman, "Domino logic with dynamic body biased keeper," *Proc. IEEE Eur. Solid-State Circuits Conf.*, pp. 675–678, Sept. 2002.
- [16] V. Kursun and E. G. Friedman, "Sleep switch dual threshold voltage domino logic with reduced standby leakage," *IEEE Trans. VLSI Syst.*, to be published.

**Volkan Kursun** (S'01) received the B.S. degree in electrical and electronics engineering from the Middle East Technical University, Ankara, Turkey, in 1999, and the M.S. degree in electrical and computer engineering from the University of Rochester, Rochester, NY, in 2001. He is currently working toward the Ph.D. degree at the University of Rochester.

During summers 2001 and 2002, he was with Intel Labs., Hillsboro, OR, working as a Graduate Technical Intern, responsible for the modeling and design of high switching frequency monolithic dc–dc converters. His current research interests include high-performance and low-power integrated circuit design.

**Eby G. Friedman** (S'78–M'79–SM'90–F'00) received the B.S. degree from Lafayette College, Easton, PA, in 1979, and the M.S. and Ph.D. degrees from the University of California, Irvine, all in electrical engineering, in 1981 and 1989, respectively.

From 1979 to 1991, he was with Hughes Aircraft Company, Carlsbad, CA, where he was Manager of the Signal Processing Design and Test Department, responsible for the design and test of high-performance digital and analog ICs. In 1991, he joined the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, where he is currently a Distinguished Professor, Director of the High-Performance VLSI/IC Design and Analysis Laboratory, and Director of the Center for Electronic Imaging Systems. During the 2000–2001 academic year, he was on sabbatical at the Technion–Israel Institute of Technology. His current research and teaching interests include high-performance synchronous digital and mixed-signal microelectronic design and analysis with application to high-speed portable processors and low-power wireless communications. He is the author of more than 200 papers and book chapters, several patents, and the author or editor of seven books in the fields of high-speed and low-power CMOS design techniques, high-speed interconnect, and the theory and application of synchronous clock distribution networks.

Dr. Friedman is the Chair of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS Steering Committee, Regional Editor of the Journal of Circuits, Systems, and Computers, and a Member of the editorial boards of the PROCEEDINGS OF THE IEEE, Analog Integrated Circuits and Signal Processing, and Journal of VLSI Signal Processing. He is a Member of the Circuits and Systems (CAS) Society Board of Governors, and a Member of the Technical Program Committee of a number of conferences. He was previously Editor-in-Chief of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, a Member of the Editorial Board of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: ANALOG AND DIGITAL SIGNAL PROCESSING, IEEE CAS liaison to the Solid-State Circuits Society, Chair of the VLSI Systems and Applications IEEE CAS Technical Committee, Chair of the Electron Devices Chapter of the IEEE Rochester Section, Program and Technical Chair of several IEEE conferences, Guest Editor of several special issues in a variety of journals, and a recipient of the Howard Hughes Masters and Doctoral Fellowships, an IBM University Research Award, an Outstanding IEEE Chapter Chairman Award, and a University of Rochester College of Engineering Teaching Excellence Award. He is a Senior Fulbright Fellow.