# Resource Based Optimization for Simultaneous Shield and Repeater Insertion

Renatas Jakushokas, Student Member, IEEE, and Eby G. Friedman, Fellow, IEEE

Abstract—A new approach for resource based optimization for high performance integrated circuits is presented. The methodology is applied to simultaneous shield and repeater insertion, resulting in minimum coupling noise under power, delay, and area constraints. Design expressions exhibiting parabolic noise behavior are compared with SPICE simulations. Due to the parabolic coupled noise behavior, the minimum noise is established. A design case is compared with only shielding and only repeater insertion techniques, exhibiting enhanced performance for different resources.

Index Terms—Area, delay, noise, optimization, power, repeater insertion, resources, shielding, tradeoff surface.

## I. INTRODUCTION

Further increases in integrated circuit (IC) scaling require more efficient devices, circuits, and systems in terms of power, delay, noise, and area. Efficient optimization processes are therefore required. To achieve this capability, many different design techniques are used. In many cases, only one technique is implemented; however, two or more techniques applied simultaneously may provide higher performance. A methodology that considers multiple design objectives while satisfying system requirements typically requires fewer resources. Optimization processes and related design techniques applied to high performance ICs are the topic of this paper.

A standard optimization process is based on a *cost* function. There are two steps involved in this process, i.e., building a function and determining the optimal value of the function. The *cost* function is typically a sum of coefficients multiplied by the resources or a product of resources with power coefficients, such as

$$\text{cost} = \alpha_1 \cdot \text{power} + \alpha_2 \cdot \text{delay} + \alpha_3 \cdot \text{noise} + \alpha_4 \cdot \text{area} \tag{1}$$

$$\text{cost} = \text{power}^{\beta_1} \cdot \text{delay}^{\beta_2} \cdot \text{noise}^{\beta_3} \cdot \text{area}^{\beta_4} \tag{2}$$

where $\alpha$ and $\beta$ characterize the importance of a particular resource. In [1], the function with $\beta_1 = \beta_2 = 1$ and $\beta_3 = \beta_4 = 0$, referred to as a power-delay product, is used to optimize a system of tapered buffers. While normalization is required for the resources in (1), (2) is more complicated. The primary disadvantage of a standard optimization process is the requirement to select the values of $\alpha$ and $\beta$ prior to the optimization process.

The authors are with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627-0231 USA (e-mail: jakushok@ece.rochester.edu; friedman@ece.rochester.edu).

Digital Object Identifier 10.1109/TVLSI.2009.2015950

IC development can be functionally separated into two major layers, namely, the design layer and the supportive layer. The design layer includes the architecture, circuit, and interconnect. The power supply system, clock distribution network, and substrate are related to the supportive layer. In the literature, a number of local optimization techniques have been published for each separate group within these layers. For interconnect, low-swing interconnects [2], cascaded buffers [3], repeater insertion [4], shielding [5], differential signaling [6], [7], active regeneration [8], [9], intentional skewing [10], bus swizzling [11], [12], and tapered interconnects [13] are well known design techniques. Each technique trades off power, delay, noise, and area differently. Delay, bandwidth, and power for RC and RLC interconnects have been investigated in [14]; however, only one design technique, repeater insertion, is used. By combining some of these techniques, more efficient results may be achieved. In [15], two methods, i.e., shield and repeater insertion, have been combined to reduce noise within a standard optimization process.

In this paper, a general resource based optimization process is presented. Any design constraint may be characterized as a resource. Some constraints, such as power and area, are more commonly treated as a resource. Other design objectives, such as delay or noise, are less commonly referred to as a resource. A practical application is composed of a combination of optimization processes and multiple design techniques. A methodology that considers these issues in an integrated fashion is the focus of this paper. Two different techniques that provide immunity to coupled noise, namely, shield and repeater insertion, have been combined based on resource optimization to exemplify this process. Each of the techniques exhibits different power, delay, noise, and area resource characteristics.

This paper is organized as follows. Limitations to the standard optimization process that motivate resource based optimization processes are described in Section II. This process is applied to simultaneous shield and repeater insertion in Section III. Each resource model is also presented in this section. A practical case study is presented in Section IV. In Section V, simultaneous shield and repeater insertion techniques are compared with only shielding and only repeater insertion. Finally, this paper is concluded in Section VI.

## II. RESOURCE BASED OPTIMIZATION PROCESS

Limitations in standard optimization processes are described in Section II-A. The theory and limitations of resource based optimization processes are presented in Sections II-B and II-C, respectively. Different design techniques are introduced in Section II-D.

Manuscript received June 24, 2008. First published August 11, 2009; current version published April 23, 2010. This work was supported in part by the National Science Foundation under Grants CCF-0541206, CCF-0811317, and CCF-0829915, by the New York State Office of Science, Technology and Academic Research to the Center for Advanced Technology in Electronic Imaging Systems, by Intel Corporation, by Eastman Kodak Company, and by Freescale Semiconductor Corporation.

Fig. 1. Optimization flow diagram. (a) Standard and (b) resource based optimization processes.

#### A. Limitations in Standard Optimization Processes

A general flow for a standard optimization process is shown in Fig. 1(a). The primary disadvantage of this flow is the need for user involvement before the optimization process is initiated. The *cost* function and coefficients must be allocated for each resource. For the same system, two users may choose different coefficients and thereby produce different results. Additionally, some resources have changing importance. These aspects constrain the standard optimization process.
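The dependence of the result on the chosen coefficients can be illustrated with a minimal Python sketch. The three candidate designs and the coefficient sets below are hypothetical values chosen only for illustration (the resources are assumed to be normalized); none of them are taken from this paper.

```python
# Hypothetical, normalized resource values for three candidate designs.
designs = {
    "A": {"power": 0.2, "delay": 0.9, "noise": 0.5, "area": 0.5},
    "B": {"power": 0.4, "delay": 0.4, "noise": 0.4, "area": 0.5},
    "C": {"power": 1.0, "delay": 0.2, "noise": 0.6, "area": 0.5},
}
ORDER = ("power", "delay", "noise", "area")

def cost_sum(res, alpha):
    """Weighted-sum cost, in the form of (1)."""
    return sum(a * res[name] for a, name in zip(alpha, ORDER))

def cost_product(res, beta):
    """Product cost with power coefficients, in the form of (2)."""
    result = 1.0
    for b, name in zip(beta, ORDER):
        result *= res[name] ** b
    return result

def best(cost, coeffs):
    """Return the name of the design with the lowest cost."""
    return min(designs, key=lambda d: cost(designs[d], coeffs))

user_1 = (1.0, 0.2, 0.2, 0.2)   # hypothetical user emphasizing power
user_2 = (0.2, 1.0, 0.2, 0.2)   # hypothetical user emphasizing delay

print("user 1 selects", best(cost_sum, user_1))
print("user 2 selects", best(cost_sum, user_2))
print("power-delay product selects", best(cost_product, (1, 1, 0, 0)))
```

For the same set of candidate designs, each choice of coefficients selects a different design, which is the user dependence described above.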

#### B. Resource Based Optimization Processes

To overcome these limitations, a different resource based optimization process is proposed. The user involvement occurs at the end of this process. In Fig. 1(b), a flow diagram of this resource based optimization process is presented.

In order to provide insight into the resource based optimization flow, consider a system where

$$area = f_1(width) \tag{3}$$

$$noise = f_2(width). \tag{4}$$

A fundamental assumption in (3) and (4) is that the width determines the area and noise. Conversely, the area or noise may determine the width. By inverting (3), the same system is described by

$$width = f_1^{-1}(area) \tag{5}$$

$$noise = f_2(width). \tag{6}$$

Substituting (5) into (6), the same system can be characterized by

$$noise = f_2 \left[ f_1^{-1}(area) \right]. \tag{7}$$

This system representation describes the relationship between the two resources and can be presented as a tradeoff line.
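The construction of a tradeoff line from (3)–(7) can be sketched in Python as follows. The functional forms of `f1` and `f2` and all constants are hypothetical assumptions made only for illustration (an area that grows linearly with width and a noise that decays exponentially with width).

```python
import math

LENGTH = 1e-3      # hypothetical line length [m]
N0 = 0.05          # hypothetical noise at zero width (fraction of Vdd)
DECAY = 1.0e6      # hypothetical decay constant [1/m]

def f1(width):
    """(3): area as a function of width."""
    return LENGTH * width

def f2(width):
    """(4): noise as a function of width."""
    return N0 * math.exp(-DECAY * width)

def f1_inv(area):
    """(5): width as a function of area, the inverse of f1."""
    return area / LENGTH

def noise_from_area(area):
    """(7): noise = f2[f1^{-1}(area)], with the width eliminated."""
    return f2(f1_inv(area))

# Points along the tradeoff line between the two resources.
for width_um in (0.5, 1.0, 1.5):
    area = f1(width_um * 1e-6)
    print(f"area = {area:.2e} m^2 -> noise = {noise_from_area(area):.4f}")
```

The function `noise_from_area` no longer takes the width as an argument; the width appears only as an intermediate quantity.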

Authorized licensed use limited to: UNIVERSITY OF ROCHESTER. Downloaded on May 04,2010 at 16:53:41 UTC from IEEE Xplore. Restrictions apply.

Power, area, noise, and delay are four primary design criteria. The number of variables, e.g., line width, shield width, number of repeaters, and power supply, is typically high. Any system can be represented by n variables and n + 1 resources

$$\begin{array}{c}
res_{1} = f_{1}(a_{1}, a_{2}, a_{3}, \dots, a_{n}) \\
res_{2} = f_{2}(a_{1}, a_{2}, a_{3}, \dots, a_{n}) \\
\vdots \\
res_{n} = f_{n}(a_{1}, a_{2}, a_{3}, \dots, a_{n}) \\
res_{n+1} = f_{n+1}(a_{1}, a_{2}, a_{3}, \dots, a_{n})
\end{array} \tag{8}$$

where $res_1, res_2, \ldots, res_{n+1}$ are the resources, such as power, delay, noise, and area, and $a_1, a_2, \ldots, a_n$ are the variables, such as line width, shield width, and length. Inverting the first n equations in (8) yields

$$\begin{array}{c}
a_{1} = g_{1}(res_{1}, res_{2}, \dots, res_{n}) \\
a_{2} = g_{2}(res_{1}, res_{2}, \dots, res_{n}) \\
\vdots \\
a_{n} = g_{n}(res_{1}, res_{2}, \dots, res_{n}) \\
res_{n+1} = f_{n+1}(a_{1}, a_{2}, a_{3}, \dots, a_{n}).
\end{array} \tag{9}$$

If the first n equations in (8) are invertible, (9) describes the same system. The first n equations in (9) are substituted into the last equation in (9), resulting in

$$res_{n+1} = f_{n+1} \left[ g_1(res_1, res_2, \dots, res_n),\ g_2(res_1, res_2, \dots, res_n),\ \dots,\ g_n(res_1, res_2, \dots, res_n) \right]. \tag{10}$$

Representing the system by (10), the interaction is among the resources and not among the design variables. The function described in (10) represents a solution space. The behavior of each resource among the other resources is referred to here as a tradeoff surface.

#### C. Limitations in Resource Based Optimization Processes

Resource based optimization also exhibits limitations. These limitations can be categorized as follows:

- 1) model inaccuracies;
- 2) function invertibility.

In a standard optimization process, inaccuracy in the models produces quantization error. In resource based optimization, however, this error is cumulative. Due to these additive errors, the models used in this optimization process must be sufficiently accurate. Otherwise, only the fidelity of the final function may be useful.

Function invertibility is a different limitation in resource based optimization processes. For y = f(x), where x cannot be directly extracted, certain techniques are required to provide invertibility. Some of these techniques are truncation, Taylor expansion, and approximation, which can lead to greater model inaccuracy.
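When x cannot be extracted from y = f(x) in closed form, a numerical inversion is one alternative to the truncation and expansion techniques listed above. It is not a technique used in this paper; the following is only a minimal sketch, assuming that the function is continuous and monotonically increasing over the search interval. The function `f` is a hypothetical example.

```python
import math

def f(x):
    """Hypothetical monotonic function with no elementary closed-form inverse."""
    return x + math.exp(x)

def invert(func, y, lo, hi, tol=1e-12):
    """Find x in [lo, hi] such that func(x) = y by bisection.

    Assumes func is continuous and increasing on [lo, hi].
    """
    if not (func(lo) <= y <= func(hi)):
        raise ValueError("y is outside the range of func on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < y:
            lo = mid      # the solution lies in the upper half
        else:
            hi = mid      # the solution lies in the lower half
    return 0.5 * (lo + hi)

x = invert(f, 3.0, 0.0, 2.0)
print(x, f(x))
```

A numerical inverse of this kind avoids the additional model error introduced by truncation, at the cost of losing a closed-form expression.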

In Section III, a case study is presented where these resource based process limitations are demonstrated. The limitations are described, and strategies for overcoming these constraints are provided.

#### D. Local Optimization Techniques

Several techniques have been proposed in the literature to overcome interconnect noise, such as shielding, repeater insertion, differential signaling, active regeneration, intentional skewing, and bus swizzling. Each of these techniques protects the interconnect from coupled noise in a different way and requires different resources. The following section focuses on two commonly used techniques, namely, shield and repeater insertion.

## III. SHIELD AND REPEATER INSERTION

Placing a shield beside and inserting repeaters along a victim line are chosen to exemplify the resource based optimization process. The width of the shield line and the number and size of the repeaters are chosen to express noise on the victim line as a function of power, area, and delay resources. Repeater insertion, shielding, and basic resource expressions are summarized in the following section. As compared to [15], where a *cost* function is used, this paper is based on resource optimization. In [15], the noise is modeled based on the Devgan metric [16], while in this paper, the shielded noise model is based on [17].

#### A. Repeater Insertion

Repeater insertion is a well known design technique to reduce the delay required to propagate a signal along a line [4]. The objective is to divide the interconnect into smaller sections, reducing the quadratic delay dependence on length to a linear dependency, thereby reducing the overall delay [18]. If the number of repeaters is too small, the delay due to the interconnect will be dominant. If the number of repeaters is too large, the repeater delay dominates. The optimal number of repeaters that minimizes the overall delay has been presented in [4], [14], and [18].

An additional advantage of repeater insertion is reducing the coupled noise from adjacent interconnects. It is impractical, however, to insert excessive repeaters due to delay, power, and area constraints.

#### B. Shielding

Shielding inserts an additional line between a victim line and an aggressor line. This technique can be divided into two major categories: passive and active shielding [19]. The focus of this paper is on passive shielding. A passive shield line is connected to the power/ground network, filtering the noise from the aggressor away from the victim line. The technique is highly effective, although significant area is required.

#### C. Resources

Four primary resources for simultaneous shield and repeater insertion are considered: power, delay, noise, and area. In this paper, the resource models are based on a 0.18  $\mu$ m CMOS technology.

1) Power: Two primary power dissipation sources are considered. The first source, dynamic power, is used to charge and discharge the interconnect and transistor capacitances. The second source, short-circuit power, also occurs when the transistors switch. During the switching time, the current from the power to ground network passes through the NMOS and PMOS transistors. This power component is typically in the range of 5%–10% of the overall transient power. The total transient power is the summation of the dynamic and short-circuit power

$$power = power_{dyn} + power_{sc}. \tag{11}$$

The dynamic power is

$$power_{dyn} = \alpha C_{eff} V_{dd}^2 f \tag{12}$$

where  $V_{dd}$  and f are the power supply voltage and operating frequency, respectively.  $\alpha$  is a switching coefficient characterizing the switching behavior, and  $C_{eff}$  is the effective capacitance

$$C_{\text{eff}} = k \left( \frac{C_{\text{line}}}{k} + C_{\text{transistor}} \right) = C_{\text{line}} + c_o h k. \tag{13}$$

 $C_{\text{line}}$ ,  $c_o$ , h, and k are the line capacitance, minimum gate capacitance, ratio between the final and minimal transistor widths, and the number of inserted repeaters along the victim line, respectively. The short-circuit power for one transistor is [18]

$$\text{power}_{\text{sc}} = \left| \ln \left( \frac{v_{tn}}{V_{\text{dd}} + v_{tp}} \right) \right| \frac{C + \vartheta_{do} RC}{\vartheta_{do}} I_{\text{peak}} f V_{\text{dd}} \quad (14)$$

where  $v_{tn}$  and  $v_{tp}$  are the threshold voltages of the NMOS and PMOS transistors, respectively. R and C are the lumped load resistance and capacitance, respectively.  $\vartheta_{do}$  is the saturation velocity, also defined in [18], and  $I_{\text{peak}}$  is the maximum saturation current of the switching transistor and is expressed as

$$I_{\text{peak}} = \frac{\mu_n c_{\text{ox}}}{2} \frac{w}{l} \left(\frac{V_{\text{dd}}}{2} - v_{tn}\right)^2 \tag{15}$$

where  $\mu_n$ ,  $c_{ox}$ , w, and l are the N-type mobility, oxide capacitance, width, and length of the transistor, respectively. Expressing (14) in h and k, the following terms are substituted:

$$C = c_o h + \frac{c_{\rm int}}{k} \tag{16}$$

$$R = \frac{r_{\text{int}}}{k} \tag{17}$$

$$\vartheta_{do} = \vartheta_{do_o} h = \frac{h}{r_o} \tag{18}$$

$$w = w_o h \tag{19}$$

where  $r_o, w_o$ , and  $\vartheta_{do_o}$  represent the minimum resistance, minimum width, and minimum saturation velocity of the transistor, respectively.  $r_{\text{int}}$  and  $c_{\text{int}}$  are the resistance and capacitance of the victim line, respectively. The NMOS and PMOS threshold voltages are assumed to be equal, permitting the total short-circuit power to be expressed as

$$power_{sc} = k \left| \ln \left( \frac{v_t}{V_{dd} + v_t} \right) \right| \frac{\left( c_o h + \frac{c_{int}}{k} \right) \left( 1 + \frac{h}{r_o} \frac{r_{int}}{k} \right)}{\frac{h}{r_o}} \cdot \frac{\mu_n c_{ox}}{2} \frac{h w_o}{l} \left( \frac{V_{dd}}{2} - v_t \right)^2 f V_{dd}. \tag{20}$$
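The power model of (11)–(20) can be written as a short Python sketch. The supply voltage, threshold voltage, line parasitics, minimum width, and channel length are the values quoted in Section IV. The minimum gate capacitance `C_O`, minimum resistance `R_O`, the product `MU_COX`, the switching coefficient `ALPHA`, and the frequency `FREQ` are not given in this paper and are hypothetical values assumed only for illustration, so the printed numbers are not expected to reproduce the reported results.

```python
import math

# Values quoted in Section IV.
VDD, VT = 1.8, 0.5              # supply and threshold voltage [V]
C_INT, R_INT = 250e-15, 11.0    # line capacitance [F] and resistance [ohm]
W_O, L_GATE = 0.5e-6, 0.18e-6   # minimum width and channel length [m]

# Hypothetical values (NOT given in the paper), for illustration only.
C_O = 10e-15        # minimum gate capacitance [F]
R_O = 3e3           # minimum repeater resistance [ohm]
MU_COX = 200e-6     # mu_n * c_ox [A/V^2]
ALPHA = 1.0         # switching coefficient
FREQ = 30e6         # operating frequency [Hz]

def power_dyn(h, k):
    """Dynamic power, (12), with the effective capacitance of (13)."""
    c_eff = C_INT + C_O * h * k
    return ALPHA * c_eff * VDD**2 * FREQ

def power_sc(h, k):
    """Total short-circuit power of k repeaters, following (20)."""
    theta_do = h / R_O                              # (18)
    c_load = C_O * h + C_INT / k                    # (16)
    r_load = R_INT / k                              # (17)
    t_term = c_load * (1 + theta_do * r_load) / theta_do
    i_peak = 0.5 * MU_COX * (h * W_O / L_GATE) * (VDD / 2 - VT) ** 2  # (15), (19)
    log_term = abs(math.log(VT / (VDD + VT)))
    return k * log_term * t_term * i_peak * FREQ * VDD

def power_total(h, k):
    """Total transient power, (11)."""
    return power_dyn(h, k) + power_sc(h, k)

for h, k in ((1.63, 2), (2.33, 6), (3.04, 8)):
    print(f"h = {h}, k = {k}: dyn = {power_dyn(h, k):.3e} W, "
          f"sc = {power_sc(h, k):.3e} W, total = {power_total(h, k):.3e} W")
```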

2) Delay: Minimizing the overall interconnect delay in a repeater system has been investigated in [4]. In [18], a more accurate delay expression is presented based on the saturation velocity characteristic

$$delay = k\alpha_1 \frac{C + \vartheta_{do} RC}{\vartheta_{do}} \tag{21}$$

where  $\alpha_1$  is relative to the propagation delay and equal to 0.693 for 50% of the voltage waveform (or 2.3 for 90%). Substituting (16)–(18) into (21), the signal propagation delay is

$$delay = k\alpha_1 \frac{\left(c_o h + \frac{c_{int}}{k}\right) \left(1 + \frac{h}{r_o} \frac{r_{int}}{k}\right)}{\frac{h}{r_o}}. \tag{22}$$
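Multiplying out (22) gives the equivalent sum of four RC products, $\alpha_1 (k c_o r_o + c_o h r_{int} + c_{int} r_o / h + c_{int} r_{int} / k)$, which separates the repeater, line, and mixed contributions. The Python sketch below evaluates both forms. The line parasitics are those quoted in Section IV, while `C_O` and `R_O` are hypothetical values that are not given in this paper.

```python
ALPHA_1 = 0.693                 # coefficient for the 50% propagation delay
C_INT, R_INT = 250e-15, 11.0    # line capacitance [F] and resistance [ohm], Section IV
C_O, R_O = 10e-15, 3e3          # hypothetical repeater values (not from the paper)

def delay(h, k):
    """Propagation delay as written in (22)."""
    theta_do = h / R_O
    c_load = C_O * h + C_INT / k
    return k * ALPHA_1 * c_load * (1 + theta_do * R_INT / k) / theta_do

def delay_expanded(h, k):
    """The same expression written as a sum of four RC products."""
    return ALPHA_1 * (k * C_O * R_O          # repeater driving repeater
                      + C_O * h * R_INT      # line resistance driving repeater input
                      + C_INT * R_O / h      # repeater driving the line capacitance
                      + C_INT * R_INT / k)   # distributed line term

for h, k in ((1.0, 2), (2.0, 4), (4.0, 8)):
    print(f"h = {h}, k = {k}: delay = {delay(h, k) * 1e12:.1f} ps")
```

In the expanded form, increasing h reduces only the $c_{int} r_o / h$ term, while increasing k raises the $k c_o r_o$ term and lowers the $c_{int} r_{int} / k$ term.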



Fig. 2. Model of shielding effect with coupling noise [5].

Two resources, *power* and *delay*, depend only on the repeater insertion variables h and k. The other two resources, *noise* and *area*, depend on both shield and repeater insertion.

*3) Noise:* Noise modeling in shielded interconnect has been investigated in [5] and [17]. From the shield model used in [17] and shown in Fig. 2, the noise as a function of the shield line width is approximated by

$$noise_{sh} = C_1 e^{-C_2 w_{sh}} \tag{23}$$

where  $w_{\rm sh}$  is the width of the shield line, and  $C_1$  and  $C_2$  are constants extracted from the model. The noise voltage is normalized to  $V_{\rm dd}$ , beginning from  $C_1$  with no shield line present  $(w_{\rm sh} = 0)$  and exponentially decreasing with wider shield lines. The exponential term emphasizes the effectiveness of this technique. Repeater insertion divides the overall length of the line into smaller sections. Assuming a uniform distribution of the noise along the victim line, the total noise of the line is

$$noise_{rep} = \frac{1}{k} \tag{24}$$

dividing the noise by the number of inserted repeaters. The total effect of inserting a shield line and repeaters is expressed as a product

$$noise = noise_{sh} \cdot noise_{rep} = C_1 e^{-C_2 w_{\rm sh}} \frac{1}{k}. \tag{25}$$

4) Area: A schematic layout of a shielded line with repeaters is shown in Fig. 3. The width ratio between the PMOS and NMOS transistors is three. The PMOS transistor is designed in a stack structure to reduce the overall width. Half of the NMOS and PMOS transistors are under the signal line, resulting in a total repeater width of  $hw_o$ . Note that the power, ground, and aggressor lines are not shown and are not considered in the area expression. The area of the structure shown in Fig. 3 is

$$area = length \, (w_{\text{line}} + w_{\text{sh}} + s + hw_o) \tag{26}$$

where length,  $w_{\text{line}}$ , and s represent the total length, signal line width, and spacing between the signal line and shield line, respectively. Two terms in this equation, h and  $w_{\text{sh}}$ , are the design variables.




Fig. 3. Schematic layout of a signal line with shield line and repeaters to reduce coupling noise.

#### D. Coupling Noise With Resource Based Optimization

The resource models are summarized in (27)–(30) and expressed in terms of the resources and variables

$$power = f_1(h,k) \tag{27}$$

$$delay = f_2(h,k) \tag{28}$$

$$noise = f_3(w_{\rm sh}, k) \tag{29}$$

$$\operatorname{area} = f_4(w_{\rm sh}, h). \tag{30}$$

Due to the two common variables, a resource based optimization procedure is initiated with (27) and (28). The overall power equations are noninvertible, demonstrating the limitation of this procedure. The truncation method is therefore used, where the short-circuit power term is dropped, resulting in a successful inversion

$$h = g_1(\text{power}_{\text{dyn}}, \text{delay}) \tag{31}$$

$$k = g_2(\text{power}_{\text{dyn}}, \text{delay}). \tag{32}$$

The power becomes  $power_{dyn}$  to emphasize that only dynamic power is considered. The short-circuit power is added later in the procedure. Equations (31) and (32) are substituted into (30)

$$\text{area} = f_4(w_{\rm sh}, g_1(\text{power}_{\rm dyn}, \text{delay})). \tag{33}$$

Inverting (33), the width of the shield line is

$$w_{\rm sh} = g_4(\text{area}, \text{power}_{\rm dyn}, \text{delay}). \tag{34}$$



Fig. 4. Noise as a function of power and delay in a system with shields and repeaters.

Substituting (31), (32), and (34) into (29), the noise function is

$$\text{noise} = f_3(\text{area}, \text{power}_{\rm dyn}, \text{delay}). \tag{35}$$

Note that in (35) the noise is expressed only in terms of resources and is no longer an explicit function of the number or size of the repeaters or the width of the shield line.
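The procedure of (27)–(35) can be sketched in Python. The functions $g_1$ and $g_2$ are not given explicitly in this paper; the closed form used below is one possible inversion, obtained by setting $m = hk$ from (12) and (13) and substituting $h = m/k$ into (22), which leaves a quadratic in k. The larger root is assumed here to be the physical solution. The line, geometry, and noise constants are those quoted in Section IV, while `C_O`, `R_O`, `ALPHA`, and `FREQ` are hypothetical values.

```python
import math

# Values quoted in Section IV.
VDD = 1.8                                   # supply voltage [V]
C_INT, R_INT = 250e-15, 11.0                # line capacitance [F], resistance [ohm]
LENGTH, W_LINE, SPACE, W_O = 1e-3, 2e-6, 0.5e-6, 0.5e-6   # geometry [m]
C1, C2 = 0.0725, 1.33e6                     # shield noise constants of (23)
ALPHA_1 = 0.693                             # 50% delay coefficient

# Hypothetical values, NOT given in the paper.
C_O, R_O = 10e-15, 3e3                      # minimum gate capacitance [F], resistance [ohm]
ALPHA, FREQ = 1.0, 30e6                     # switching coefficient, frequency [Hz]

def invert_power_delay(p_dyn, delay):
    """(31), (32): recover (h, k) from the dynamic power and the delay."""
    m = (p_dyn / (ALPHA * VDD**2 * FREQ) - C_INT) / C_O   # m = h * k from (12), (13)
    if m <= 0:
        return None                          # power below that of the bare line
    # With h = m / k, (22) becomes a*k + b/k = delay / ALPHA_1.
    a = R_O * (C_O + C_INT / m)
    b = R_INT * (C_O * m + C_INT)
    d = delay / ALPHA_1
    disc = d * d - 4 * a * b
    if disc < 0:
        return None                          # the delay target cannot be met
    k = (d + math.sqrt(disc)) / (2 * a)      # larger root assumed physical
    return m / k, k

def noise_from_resources(area, p_dyn, delay):
    """(35): noise in terms of area, dynamic power, and delay only."""
    hk = invert_power_delay(p_dyn, delay)
    if hk is None:
        return None
    h, k = hk
    w_sh = area / LENGTH - W_LINE - SPACE - h * W_O       # (34), from (26)
    if w_sh < 0:
        return None                          # repeaters exceed the area budget
    return C1 * math.exp(-C2 * w_sh) / k                  # (25)

# Sweep the dynamic power at a fixed delay (350 ps) and area (4.15e-9 m^2).
for p_uw in (27, 34, 44, 53):
    n = noise_from_resources(4.15e-9, p_uw * 1e-6, 350e-12)
    print(f"power_dyn = {p_uw} uW -> noise = {100 * n:.3f}% of Vdd")
```

With these hypothetical parameters, the swept noise first decreases and then increases with power, which is qualitatively the behavior reported in Fig. 7; the numerical values are not those of the paper.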

## IV. SIMULATION RESULTS

A case study with inserted repeaters and a shielded victim line is considered. The area, power, delay, and noise are evaluated for this system. Several physical parameters are chosen to reflect practical design characteristics. Specifically, $s = 0.5 \ \mu m$, length = 1 mm, $V_{\rm dd} = 1.8 \ V$, $v_t = 0.5 \ V$, $l = 0.18 \ \mu m$, $c_{\rm int} = 250 \ {\rm fF}$, $r_{\rm int} = 11 \ \Omega$, $w_{\rm line} = 2 \ \mu m$, $w_o = 0.5 \ \mu m$, $C_1 = 7.25\%$, and $C_2 = 1.33 \cdot 10^6 \ {\rm m}^{-1}$. By increasing the area, the noise is reduced since wider shield lines and additional repeaters are possible. The noise monotonically decreases as a function of area; therefore, the area is set to a value of 4.15 nm$^2$, which is a practical design value. Here, nm$^2$ denotes $10^{-9} \ {\rm m}^2$; from (26), this area corresponds to the 1 mm line with a total width of 4.15 $\mu$m.

Each solution of (35) represents a specific h, k, and  $w_{\rm sh}$ , which determines the short-circuit power from (20). The short-circuit power is added to the dynamic power, permitting the overall power dissipation to be estimated.

A graph presenting noise as a function of power and delay is shown in Fig. 4. Note the relationship among power, delay, and noise, generating a tradeoff surface, permitting different tradeoffs to be made. The top view of the graph shown in Fig. 4 is shown in Fig. 5, where the lighter region indicates a higher noise. For this design case, a 180 ps delay is the minimum delay, as shown in Fig. 5. This delay is not the same as determined in [4], [14], and [18] since power, noise, and area are also considered. The lower edge of the power curve, shown in Fig. 5, saturates to a minimum power value. This curve does not reach zero due to the minimum power required to charge and discharge the line capacitance.

In Fig. 6, noise is presented as a function of delay at a constant power and maximum allowed area. An increase in delay will reduce the coupling noise since more repeaters or wider shield lines are available. The exponentially increasing curve, shown in Fig. 6, indicates the noise penalty from choosing a value close to the minimum delay. Note that by relaxing the delay constraint, the coupling noise is significantly smaller.

Fig. 5. Top view of Fig. 4. The lighter color represents a larger amount of noise.

Fig. 6. Noise as a function of delay at a constant power (50 $\mu$W) and maximum allowed area (4.15 nm$^2$).

Noise as a function of power at the maximum allowed delay and area is shown in Fig. 7. The graph consists of two different regions. At lower power levels, the noise is reduced by increasing the power, while at higher power levels, the noise increases with power. This parabolic noise behavior can be exploited to determine the minimum noise for this system. To motivate these results, three cases, shown in Fig. 7, have been evaluated. The first case, at a power of 29 $\mu$W, produces a 1.2% noise (normalized to $V_{dd}$). The noise voltage in this case is 21 mV. The noise for the second case located at a power of 49 $\mu$W is 0.65% (or 11.5 mV). The final case at a power of 70 $\mu$W produces 0.8% (or 14 mV) noise. The 9.5 mV noise difference between the first and second case exemplifies the tradeoff. The noise difference between the second and third case is smaller but significant.

The effects of k (number of repeaters), h (width of the repeater), and $w_{\rm sh}$ (width of the shield line) as a function of power are shown in Fig. 8. The area and delay are maintained at maximum values. With an increase in power, the number and width of repeaters increase at a different rate, maintaining a constant delay. Simultaneously, the width of the shield lines decreases, providing more space for larger repeaters while maintaining the area constant. The larger number of repeaters reduces the noise; however, the reduction in the shield width increases the noise. Adding repeaters at lower power levels reduces the noise more than adding repeaters at higher power levels. Hence, at lower power levels, the most efficient noise reduction technique is repeaters, while at higher power levels, the most efficient noise reduction technique is shield lines, as shown in Fig. 7. Both of these techniques reduce the noise, exhibiting a parabolic noise behavior, allowing the minimum noise design to be determined.

Fig. 7. Noise as a function of power at the maximum allowed delay (350 ps) and area (4.15 nm$^2$).

Fig. 8. k, h, and $w_{\rm sh}$ as a function of power at the maximum delay (350 ps) and area (4.15 nm$^2$).

TABLE I

Three Design Cases Shown in Fig. 7 and Evaluated in SPICE

|             | k (number of repeaters) | $h \cdot w_o$ (width of the repeaters) | $w_{\rm sh}$ (width of the shield line) |
|-------------|-------------------------|----------------------------------------|-----------------------------------------|
| First case  | 2                       | 0.8 µm                                 | 0.8 µm                                  |
| Second case | 6                       | 1.2 µm                                 | 0.5 µm                                  |
| Third case  | 8                       | 1.5 µm                                 | 0.1 µm                                  |

TABLE II

Analytic and SPICE Results for Three Design Cases From Table I and Fig. 7

|          | Case   | k    | h    | $w_{\rm sh}$ [µm] | Delay [ps] | Change in Delay [%] | Power [µW] | Change in Power [%] | Noise [mV] | Change in Noise [%] |
|----------|--------|------|------|-------------------|------------|---------------------|------------|---------------------|------------|---------------------|
| Analytic | First  | 2.04 | 1.63 | 0.83              | 350        | 0.0                 | 28.9       | 41.1                | 21.1       | 82.5                |
|          | Second | 5.91 | 2.33 | 0.48              | 350        |                     | 49.0       |                     | 11.6       |                     |
|          | Third  | 8.04 | 3.04 | 0.13              | 350        | 0.0                 | 69.6       | 42.1                | 13.7       | 18.2                |
| SPICE    | First  | 2    | 1.63 | 0.83              | 520        | 6.6                 | 44.9       | 22.0                | 15.7       |                     |
|          | Second | 6    | 2.33 | 0.48              | 557        |                     | 57.6       |                     | 9.1        |                     |
|          | Third  | 8    | 3.04 | 0.13              | 563        |                     | 76.6       | 33.1                | 14.0       | 54.5                |



Fig. 9. Delay, power, and noise for three different design cases. Analytic and SPICE results are compared.

In this case, the minimum noise occurs at 49  $\mu$ W of total power and contributes only 0.65% (or 11.5 mV) noise.

This concept is evaluated on a system composed of a victim interconnect with several repeaters and a shield line. Three design cases, listed in Table I, are considered.

The power, delay, and noise are determined from SPICE simulations. The analytic model and SPICE results are compared in Fig. 9 and Table II for three cases, listed in Table I, and shown in Fig. 7. In Table II, the change in delay, power, and noise is determined relative to the minimum noise design case (second case). In the analytic model, the delay is maintained constant; however, small changes in the delay are noted from SPICE. The power resulting from the analytic model and SPICE is similar. The noise evaluated from SPICE also exhibits good agreement with the analytic model. The SPICE results demonstrate the same parabolic noise behavior when simultaneously applying shield and repeater insertion. The noise is lower in the second design case than the first and third design cases, confirming the parabolic noise behavior. The minimum noise is achieved with simultaneous shield and repeater insertion while satisfying power, area, and delay constraints.
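The analytic noise values listed in Table II can be checked directly against (25), and the area constraint against (26), with the following Python sketch. All constants are those quoted in Section IV, and the (k, h, $w_{\rm sh}$) triples are the analytic rows of Table II; only the function and variable names are introduced here for illustration.

```python
import math

C1 = 0.0725     # noise without a shield, normalized to Vdd (Section IV)
C2 = 1.33e6     # decay constant [1/m] (Section IV)
VDD = 1.8       # supply voltage [V]
LENGTH, W_LINE, SPACE, W_O = 1e-3, 2e-6, 0.5e-6, 0.5e-6   # geometry [m]

def noise(w_sh, k):
    """Normalized coupling noise, (25)."""
    return C1 * math.exp(-C2 * w_sh) / k

def area(w_sh, h):
    """Area of the shielded line with repeaters, (26)."""
    return LENGTH * (W_LINE + w_sh + SPACE + h * W_O)

# Analytic (k, h, w_sh in um) for the three design cases in Table II.
cases = {
    "first": (2.04, 1.63, 0.83),
    "second": (5.91, 2.33, 0.48),
    "third": (8.04, 3.04, 0.13),
}

noise_mv = {}
for name, (k, h, w_sh_um) in cases.items():
    n = noise(w_sh_um * 1e-6, k)
    noise_mv[name] = 1e3 * n * VDD
    print(f"{name}: noise = {100 * n:.2f}% ({noise_mv[name]:.1f} mV), "
          f"area = {area(w_sh_um * 1e-6, h):.3e} m^2")
```

The computed values agree with the tabulated 21.1 mV, 11.6 mV, and 13.7 mV to within a few tenths of a millivolt, the residual being due to rounding of the listed k and $w_{\rm sh}$, and each case occupies approximately $4.15 \cdot 10^{-9} \ {\rm m}^2$.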

TABLE III

Comparison Among Shielding, Repeater Insertion, and Shield and Repeater Insertion Techniques

|                                                | Noise   | Area          | Power   | Delay  |
|------------------------------------------------|---------|---------------|---------|--------|
| Only Shielding                                 | 14.5 mV | 4.15 nm$^2$   | 22.2 µW | 515 ps |
| Only Repeaters                                 | 13.0 mV | 4.15 nm$^2$   | 86.7 µW | 354 ps |
| Simultaneous Shield and Repeater Insertion     | 11.5 mV | 4.15 nm$^2$   | 49.0 µW | 350 ps |

## V. COMPARISON OF SHIELD AND REPEATER INSERTION TECHNIQUES

A comparison of simultaneous shield and repeater insertion with only shielding (without repeater insertion) and only repeater insertion (without shielding) is discussed in this section. The same resources are compared: power, delay, area, and noise. A constant area of  $4.15 \text{ nm}^2$  is assumed.

In only shielding, all of the area except for the victim line and spacing is dedicated to the shield line. A 1.65  $\mu$ m shield line is inserted between the aggressor and victim lines. The reduction in coupled noise is only due to the shield line, and according to (25), when k = 1 (a single driver repeater), the coupled noise is 0.81% (or 14.5 mV). The power dissipation is minimal, only 22.2  $\mu$ W, since dynamic power is only dissipated for the line and driver repeater, and a small amount of short-circuit power to switch the driver repeater. The delay, however, increases to 515 ps.

In the repeater insertion case (without shielding), emphasis is placed on achieving a target delay of 350 ps, as in the simultaneous shield and repeater insertion case. Subject to this delay, minimum noise is targeted. To minimize the noise, the largest number of repeaters is required. To satisfy the target delay and area constraints, the highest number of repeaters is determined to be ten. In this case, all of the area is occupied by the repeaters. The noise is reduced from $C_1 = 7.25\%$ to $7.25\% \cdot 1/10 = 0.725\%$ or 13 mV. The power consumption for this system is comparatively high, i.e., 86.7 $\mu$W. The results are compared in Table III.

Note in Table III that the noise is similar among all of the cases. A noise advantage of 1.5–3 mV is determined for the simultaneous shield and repeater insertion case. If the delay is not constrained, the more appropriate technique is only shielding since minimal power is dissipated in this case. In those cases where the delay is also considered, the only repeater insertion technique achieves the target delay with comparable noise performance. The power dissipation, however, is almost twice that of the simultaneous shield and repeater insertion case.
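The noise entries of Table III follow from (25) once the free width within the area budget is assigned either to the shield line or to the repeaters. A minimal Python sketch of this arithmetic is given below, using only the constants quoted in Section IV and the design values stated in this section; the function and variable names are introduced here for illustration.

```python
import math

C1, C2, VDD = 0.0725, 1.33e6, 1.8            # values from Section IV
AREA, LENGTH = 4.15e-9, 1e-3                  # area budget [m^2], line length [m]
W_LINE, SPACE = 2e-6, 0.5e-6                  # signal line width and spacing [m]

def noise_mv(w_sh, k):
    """Coupling noise from (25), converted to millivolts."""
    return 1e3 * VDD * C1 * math.exp(-C2 * w_sh) / k

# Width remaining for the shield line or the repeaters, from (26).
free_width = AREA / LENGTH - W_LINE - SPACE

shield_only = noise_mv(free_width, 1)         # all free width given to the shield
repeaters_only = noise_mv(0.0, 10)            # no shield line, ten repeaters
combined = noise_mv(0.48e-6, 5.91)            # second design case of Table II

print(f"free width     = {free_width * 1e6:.2f} um")
print(f"only shielding = {shield_only:.1f} mV")
print(f"only repeaters = {repeaters_only:.1f} mV")
print(f"simultaneous   = {combined:.1f} mV")
```

The free width evaluates to 1.65 $\mu$m, matching the shield line width used in the only shielding case.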

## VI. CONCLUSION

Resource based optimization is described and compared to standard optimization processes in this paper. The resource based optimization process is evaluated for a system that simultaneously considers shield and repeater insertion. The methodology is used to investigate area, power, delay, and noise tradeoffs. The coupled noise as a function of power with maximum allowed delay and area is evaluated, demonstrating a parabolic noise behavior. This approach permits the minimum noise design to be determined. The analytic model exhibits good agreement with SPICE. Over 50% reduction in coupled noise is demonstrated as compared to three design cases by applying this resource based optimization process. To motivate simultaneous shield and repeater insertion, the following three cases have been evaluated and compared: shielding only, repeater insertion only, and simultaneous shield and repeater insertion. The noise performance is comparable among all of these techniques. With only shielding, however, the delay is higher, while in only repeater insertion, the power is higher. In practical cases where the delay, power, and area are constrained, simultaneous shield and repeater insertion exhibits the best performance.

#### References

- [1] J. S. Choi and K. Lee, "Design of CMOS tapered buffer for minimum power-delay product," *IEEE J. Solid-State Circuits*, vol. 29, no. 9, pp. 1142–1145, Sep. 1994.
- [2] V. Kursun, R. M. Secareanu, and E. G. Friedman, "CMOS voltage interface circuits for low power systems," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2002, pp. 3.667–3.670.
- [3] L. W. Linholm, "An optimized output stage for MOS integrated circuits," *IEEE J. Solid-State Circuits*, vol. SSC-10, no. 2, pp. 106–109, Apr. 1975.
- [4] H. B. Bakoglu and J. D. Meindl, "Optimal interconnection circuits for VLSI," *IEEE Trans. Electron Devices*, vol. ED-32, no. 5, pp. 903–909, May 1985.
- [5] J. Zhang and E. G. Friedman, "Crosstalk modeling for coupled *RLC* interconnects with application to shield insertion," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 6, pp. 641–646, Jun. 2006.
- [6] Y. Massoud, J. Kawa, D. MacMillen, and J. White, "Modeling and analysis of differential signaling for minimizing inductive crosstalk," in *Proc. ACM/IEEE Des. Autom. Conf.*, Jun. 2001, pp. 804–809.
- [7] A. Carusone, K. Farzan, and D. A. Johns, "Differential signaling with a reduced number of signal paths," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 48, no. 3, pp. 294–300, Mar. 2001.
- [8] R. M. Secareanu and E. G. Friedman, "Transparent repeaters," in *Proc. IEEE Great Lakes Symp. VLSI*, Mar. 2000, pp. 63–66.
- [9] A. Nalamalpu, S. Srinivasan, and W. Burleson, "Boosters for driving long on-chip interconnects: Design issues, interconnect synthesis, and comparison with repeaters," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 21, no. 1, pp. 50–62, Jan. 2002.
- [10] K. Hirose and H. Yasuura, "A bus delay reduction technique considering crosstalk," in *Proc. IEEE Des., Autom., Test Eur. Conf. Exhib.*, Mar. 2000, pp. 441–445.
- [11] B. Soudan, "Reducing mutual inductance of wide signal buses through swizzling," in *Proc. IEEE Conf. Electron., Circuits, Syst.*, Dec. 2003, vol. 2, pp. 870–873.
- [12] P. Gupta and A. Kahng, "Wire swizzling to reduce delay uncertainty due to capacitive coupling," in *Proc. IEEE Int. Conf. VLSI Des.*, Jan. 2004, pp. 431–436.
- [13] M. A. El-Moursy and E. G. Friedman, "Wire shaping of *RLC* interconnects," *Integr. VLSI J.*, vol. 40, no. 4, pp. 461–472, Jul. 2007.

- [14] G. Chen and E. G. Friedman, "Low-power repeaters driving *RC* and *RLC* interconnects with delay and bandwidth constraints," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 2, pp. 161–172, Feb. 2006.
- [15] T. Zhang and S. S. Sapatnekar, "Simultaneous shield and buffer insertion for crosstalk noise reduction in global routing," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 15, no. 6, pp. 624–636, Jun. 2007.
- [16] A. Devgan, "Efficient coupled noise estimation for on-chip interconnect," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Des.*, Nov. 1997, pp. 147–151.
- [17] J. Zhang and E. G. Friedman, "Effects of shield insertion on reducing crosstalk noise between coupled interconnects," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2004, vol. 2, pp. 529–532.
- [18] V. Adler and E. G. Friedman, "Repeater design to reduce delay and power in resistive interconnect," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 45, no. 5, pp. 607–616, May 1998.
- [19] M. Ghoneima and Y. Ismail, "Formal derivation of optimal active shielding for low-power on-chip buses," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Des.*, Nov. 2004, pp. 800–807.



**Renatas Jakushokas** (S'09) received the B.Sc. degree in electrical engineering from ORT Braude College, Karmiel, Israel, in 2005 and the M.S. degree in electrical and computer engineering from the University of Rochester, Rochester, NY, in 2007, where he is currently working toward the Ph.D. degree in electrical engineering.

He was previously an intern with Intrinsix Corporation, Fairport, NY, in 2006, working on sigma–delta ADCs; Eastman Kodak Company, Rochester, NY, in 2007, working on high performance comparators; and Freescale Semiconductor Corporation, Tempe, AZ, in 2008, where he worked on evaluating substrate isolation techniques. His research interests include power, noise, signal integrity, and optimization techniques in high performance integrated circuit design methodologies.



**Eby G. Friedman** (S'78–M'79–SM'90–F'00) received the B.S. degree in electrical engineering from Lafayette College, Easton, PA, in 1979 and the M.S. and Ph.D. degrees in electrical engineering from the University of California, Irvine, in 1981 and 1989, respectively.

From 1979 to 1991, he was with Hughes Aircraft Company, rising to the position of Manager of the Signal Processing Design and Test Department, where he was responsible for the design and test of high performance digital and analog ICs. Since 1991,
he has been with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, where he is currently a Distinguished Professor and the Director of the High Performance VLSI/IC Design and Analysis Laboratory. He is also a Visiting Professor with the Technion—Israel Institute of Technology, Haifa, Israel. His current research and teaching interests include high performance synchronous digital and mixed-signal microelectronic design and analysis with application to high speed portable processors and low power wireless communications. He is the author of about 350 papers and book chapters; the author or editor of ten books in the fields of high speed and low power CMOS design techniques, high speed interconnect, and the theory and application of synchronous clock and power distribution networks; and holds several patents.

Dr. Friedman is the Regional Editor of the Journal of Circuits, Systems and Computers, a member of the editorial boards of Analog Integrated Circuits and Signal Processing, the Microelectronics Journal, the Journal of Low Power Electronics, and the Journal of VLSI Signal Processing; the Chair of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS Steering Committee, and a Member of the Technical Program Committee of a number of conferences. He was previously the Editor-in-Chief of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, a member of the editorial boards of the PROCEEDINGS OF THE IEEE and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING, a member of the Circuits and Systems Society Board of Governors, the Program and Technical Chair of several IEEE conferences, a Guest Editor of several special issues in a variety of journals, and a recipient of the University of Rochester Graduate Teaching Award and the College of Engineering Teaching Excellence Award. He is a Senior Fulbright Fellow.