# Clock Tree Layout Design for Reduced Delay Uncertainty

Dimitrios Velenis, Illinois Institute of Technology, ECE Department, Chicago, IL 60616-3793. Email: velenis@ece.iit.edu

Marios C. Papaefthymiou, University of Michigan, EECS Department, Ann Arbor, MI 48109-2122. Email: marios@eecs.umich.edu

Eby G. Friedman, University of Rochester, ECE Department, Rochester, NY 14627-0231. Email: friedman@ece.rochester.edu

Abstract— The design of clock distribution networks in synchronous digital systems presents enormous challenges. Controlling the clock signal delay in the presence of various noise sources, process parameter variations, and environmental effects represents a fundamental problem in the design of high speed synchronous circuits. Two different approaches for enhancing the layout of the clock tree in order to reduce the uncertainty of the clock signal are presented in this paper. The application of these techniques on a set of benchmark circuits demonstrates interesting tradeoffs among the aggregate clock buffer size, the total wire length of the clock tree, and the power dissipation.

## I. INTRODUCTION

The continuous quest for higher circuit performance has pushed clock frequencies deep into the gigahertz range, reducing the period of the clock signal well below a nanosecond. Deviations of the clock signal from a target delay can cause incorrect data to be latched within a register, resulting in a system malfunction. These deviations of the delay of a signal from a target value are described as delay uncertainty.

The uncertainty of the clock signal delay is caused by a number of factors that affect a clock distribution network, examples of which include process and environmental parameter variations (PEPV). Effects such as the non-uniformity of the gate oxide thickness and imperfections in the polysilicon etching process [1] can cause variations in the current flow within a transistor, thereby introducing delay uncertainty. In addition, variations in the geometric parameters of the interconnect wires introduce uncertainty in the signal characteristics. Environmentally induced parameter variations caused by changes in the ambient temperature [2] and external radiation also introduce delay uncertainty. On-chip noise due to interconnect crosstalk [3] introduces additional delay uncertainty as the interconnect length increases and the wire-to-wire spacing becomes shorter. The sensitivity of a clock distribution network to these effects has become an issue of fundamental importance to the design of high performance synchronous systems.

In this paper, a methodology for reducing the uncertainty in the clock signal delay is presented. The objective of this methodology is to satisfy delay uncertainty constraints at the most critical data paths of a circuit. The primary design concepts of the proposed methodology are described in section II. These concepts are implemented in two different design strategies described in section III. These design strategies have been applied to a set of benchmark circuits, and the resulting tradeoffs between the power dissipated by a clock tree and the clock tree area are described in section IV. Finally, some conclusions are presented in section V.

## II. DESIGN METHODOLOGY CONCEPT

The most crucial effect of the uncertainty introduced in the clock signal delay is the increased delay uncertainty between the arrival times of different clock signals that drive sequentially-adjacent registers connected by a combinational path. The stricter the setup and hold time constraints of a combinational data path, the more sensitive the timing of that data path is to delay uncertainty. Reducing the delay uncertainty at the critical data paths is the primary objective of the two design techniques presented in this paper. In the first technique, this objective is achieved by increasing the common portion of the clock tree shared by the clock signals that drive the critical registers. The second technique leverages the size of clock buffers to reduce the clock signal delay uncertainty.

#### A. Common portion among clock paths

The clock signal is distributed to sequentially-adjacent registers along different paths within a clock tree. The topology of a clock tree that specifies the hierarchy of the branch nodes within a tree can greatly affect the delay uncertainty introduced along the clock paths. In particular, as the common portion of two paths in a clock tree increases, the delay uncertainty between the leaves of these paths is likely to decrease. The common portion of the two paths can be increased by separating these paths from a branch node *deeper* within the clock tree (closer to the leaf registers).
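The notion of a common portion can be made concrete with a minimal sketch. The tree below, its node names, and the helper functions `path_to_root` and `split_paths` are hypothetical and introduced only for illustration; the sketch simply splits two root-to-leaf clock paths into their shared prefix and their non-common remainders.

```python
from typing import Dict, List, Optional, Tuple


def path_to_root(parent: Dict[str, Optional[str]], leaf: str) -> List[str]:
    """Return the list of nodes from the root down to `leaf`."""
    path: List[str] = []
    node: Optional[str] = leaf
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def split_paths(
    parent: Dict[str, Optional[str]], leaf_a: str, leaf_b: str
) -> Tuple[List[str], List[str], List[str]]:
    """Split two clock paths into (common, only_a, only_b) node lists."""
    pa = path_to_root(parent, leaf_a)
    pb = path_to_root(parent, leaf_b)
    k = 0
    while k < min(len(pa), len(pb)) and pa[k] == pb[k]:
        k += 1
    return pa[:k], pa[k:], pb[k:]


# Hypothetical clock tree: root -> n1 -> {n2 -> {r1, r2}, r3}
parent = {"root": None, "n1": "root", "n2": "n1",
          "r1": "n2", "r2": "n2", "r3": "n1"}

# r1 and r2 separate at n2, deep in the tree; r1 and r3 separate earlier, at n1.
common_deep, a1, b1 = split_paths(parent, "r1", "r2")
common_shallow, a2, b2 = split_paths(parent, "r1", "r3")
```

In this example the pair that separates at the deeper branch node shares three nodes and has shorter non-common remainders than the pair that separates at `n1`, which is the situation the text associates with lower delay uncertainty.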

#### B. Clock buffer size

Inserting buffers along an interconnect line alleviates the quadratic dependence of the signal propagation delay on the line length, permitting a line to be modeled as a simple capacitive line rather than as an RC line. However, buffer insertion also introduces uncertainty in the signal delay. Device parameter variations change the current flow within a buffer, thereby introducing uncertainty in the buffer delay. Furthermore, crosstalk among interconnects causes variations in the effective load of a buffer, introducing additional uncertainty in the signal propagation delay. It has been shown in [4] that increasing the buffer size significantly reduces the delay uncertainty due to these effects.
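As a rough illustration of the quadratic dependence, consider a first-order Elmore estimate. This is a sketch under stated assumptions, not a model taken from this paper: the line is uniform with length $L$, resistance $r$ and capacitance $c$ per unit length, and it is split into $k$ equal segments by ideal buffers with intrinsic delay $t_b$. The symbols $r$, $c$, $k$, and $t_b$ are introduced here for illustration, and buffer output resistance and input capacitance are neglected.

$$
T_{\text{unbuffered}} \approx \frac{1}{2} r c L^{2},
\qquad
T_{\text{buffered}} \approx k \cdot \frac{1}{2} r c \left(\frac{L}{k}\right)^{2} + (k-1)\, t_b
= \frac{r c L^{2}}{2k} + (k-1)\, t_b
$$

Under these assumptions the wire term shrinks by a factor of $k$, and when the segment length $L/k$ is held fixed the total delay grows approximately linearly with $L$.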

## III. CLOCK TREE LAYOUT DESIGN

In this section, the proposed design approaches are applied to the clock tree layout design process. The primary focus is on reducing the clock signal delay uncertainty, particularly among those signals that drive the most critical data paths within a circuit. A strategy that reduces the delay uncertainty by increasing the size of the buffers along the most critical clock paths is described in section III-A. An alternative strategy that combines the buffer sizing approach with a dedicated clock tree for the most critical registers is presented in section III-B.

#### A. Buffer insertion and sizing

To investigate the effect of increasing buffer size on the delay uncertainty of a clock signal, a buffer insertion and sizing tool has been developed. The input to the buffer insertion tool is a minimal rectilinear Steiner tree that represents the clock tree layout.

TABLE I

TRADEOFF BETWEEN THE INCREASE IN CLOCK TREE AREA AND THE REDUCTION IN POWER DISSIPATION

| Circuit | Number of registers | Aggregate buffer size: buffer sizing | Aggregate buffer size: dedicated tree | Aggregate buffer size: reduction (%) | Clock tree area ($\mu m$): buffer sizing | Clock tree area ($\mu m$): dedicated tree | Clock tree area: increase (%) | Power dissipation ($\mu W$): buffer sizing | Power dissipation ($\mu W$): dedicated tree | Power dissipation: reduction (%) |
|---------|-----------|-----------------------|-----------|-----------|----------------------------|-----------|----------|-----------------------------|-----------|-----------|
| 1       | 11        | 21                    | 10        | 52        | 3065                       | 3930      | 28       | 499                         | 438       | 2         |
| 2       | 12        | 18                    | 5         | 72        | 2348                       | 2706      | 15       | 425                         | 356       | 16        |
| 3       | 17        | 29                    | 7         | 75        | 3382                       | 4167      | 23       | 638                         | 531       | 16        |
| 4       | 23        | 34                    | 8         | 76        | 3823                       | 5005      | 31       | 763                         | 656       | 14        |
| 5       | 28        | 30                    | 9         | 70        | 4490                       | 5403      | 20       | 825                         | 737       | 10        |
| 6       | 36        | 25                    | 10        | 60        | 4901                       | 5911      | 20       | 882                         | 851       | 3         |
| 7       | 42        | 15                    | 9         | 40        | 4901                       | 5167      | 5        | 923                         | 831       | 10        |

The first step of this tool is to insert buffers within the clock tree. Clock buffers are inserted in a bottom-up approach, starting from the tree leaves (*i.e.* the clocked elements) at the lowest level and advancing towards the root of the tree. When an intermediate node in the tree is reached, the total load from that node to the bottom of the tree is the summation of the capacitive load of the interconnect lines and the clocked elements. A clock buffer is inserted at a node when the downstream capacitive load of that node exceeds a particular threshold value. The magnitude of the downstream load determines the size of the inserted buffer.
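The bottom-up insertion step can be sketched as a post-order traversal. This is a minimal illustration rather than the authors' tool: the `Node` class, the constant `BUFFER_INPUT_CAP`, the proportional sizing rule, and the assumption that an inserted buffer shields its subtree from the upstream load are all hypothetical choices made here.

```python
from dataclasses import dataclass, field
from typing import List

BUFFER_INPUT_CAP = 1.0  # hypothetical load that a buffer presents upstream


@dataclass
class Node:
    name: str
    wire_cap: float = 0.0     # capacitance of the wire from this node to its parent
    sink_cap: float = 0.0     # input capacitance of a clocked element (leaves only)
    children: List["Node"] = field(default_factory=list)
    buffer_size: float = 0.0  # 0.0 means no buffer is inserted at this node


def insert_buffers(node: Node, threshold: float, cap_per_unit_size: float) -> float:
    """Visit children first, then decide whether `node` needs a buffer.

    Returns the capacitance seen by the parent when looking into `node`.
    """
    downstream = node.sink_cap + sum(
        insert_buffers(child, threshold, cap_per_unit_size)
        for child in node.children
    )
    if downstream > threshold:
        # Assumed rule: buffer size proportional to the load it must drive.
        node.buffer_size = downstream / cap_per_unit_size
        # Assumed: the buffer isolates the subtree from the upstream network.
        downstream = BUFFER_INPUT_CAP
    return downstream + node.wire_cap


# Hypothetical four-register tree with two intermediate branch nodes.
leaves = [Node(f"r{i}", wire_cap=3.0, sink_cap=2.0) for i in range(1, 5)]
n1 = Node("n1", wire_cap=4.0, children=leaves[:2])
n2 = Node("n2", wire_cap=4.0, children=leaves[2:])
root = Node("root", children=[n1, n2])

load_at_root = insert_buffers(root, threshold=8.0, cap_per_unit_size=5.0)
```

With these example values each branch node sees a downstream load of 10, which exceeds the threshold of 8, so a buffer is placed at `n1`, `n2`, and the root, while the leaves remain unbuffered.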

The second step of the buffer insertion tool is to determine the clock signal delay uncertainty between sequentially-adjacent registers of the critical data paths. The delay uncertainty between the arrival times of the clock signals at those registers is introduced by the variation effects at the non-common portions of the clock tree. The uncertainty in the signal delay is determined for each of the subtrees along the signal path. In each subtree, the following three components of delay uncertainty are considered:

- i) Interconnect delay uncertainty due to crosstalk.
- ii) Buffer delay uncertainty due to crosstalk.
- iii) Buffer delay uncertainty due to device parameter variations.

The total delay uncertainty of a signal propagating along a clock path is composed of the aforementioned terms, given the wire length of the path, the wire length of the subtrees, and the size of the buffers driving those subtrees. If the resulting delay uncertainty is greater than the delay uncertainty constraints for the particular clock registers, the size of the buffers located along the non-common clock paths is iteratively increased to reduce the delay uncertainty until the constraints are satisfied.
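The iterative sizing loop can be sketched as follows. The uncertainty model is a placeholder: the coefficients `K_WIRE_XTALK`, `K_BUF_XTALK`, and `K_DEVICE`, the inverse dependence of every component on buffer size, and the linear summation of the components are assumptions made for illustration only. The source states only that larger buffers reduce delay uncertainty [4].

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subtree:
    wire_length: float  # total wire length driven by the buffer
    buffer_size: float  # relative drive strength of the buffer


# Hypothetical coefficients; a real flow would use characterization data.
K_WIRE_XTALK, K_BUF_XTALK, K_DEVICE = 0.02, 0.01, 4.0


def subtree_uncertainty(s: Subtree) -> float:
    """Sum of the three components listed above, under the placeholder model."""
    wire_xtalk = K_WIRE_XTALK * s.wire_length / s.buffer_size  # component i
    buf_xtalk = K_BUF_XTALK * s.wire_length / s.buffer_size    # component ii
    device = K_DEVICE / s.buffer_size                          # component iii
    return wire_xtalk + buf_xtalk + device


def total_uncertainty(non_common: List[Subtree]) -> float:
    return sum(subtree_uncertainty(s) for s in non_common)


def size_until_met(non_common: List[Subtree], constraint: float,
                   step: float = 1.0, max_size: float = 64.0) -> float:
    """Upsize the buffers on the non-common paths until the constraint holds."""
    while total_uncertainty(non_common) > constraint:
        if all(s.buffer_size >= max_size for s in non_common):
            raise ValueError("constraint not reachable within max_size")
        for s in non_common:
            s.buffer_size = min(s.buffer_size + step, max_size)
    return total_uncertainty(non_common)


# Two hypothetical non-common subtrees between a pair of critical registers.
paths = [Subtree(wire_length=100.0, buffer_size=1.0),
         Subtree(wire_length=200.0, buffer_size=1.0)]
final_uncertainty = size_until_met(paths, constraint=5.0)
```

The `max_size` guard is a design choice of this sketch: it keeps the loop from running indefinitely when a constraint cannot be met by sizing alone.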

#### B. Dedicated minimal clock tree driving the critical path registers

An alternative approach for reducing delay uncertainty among the clock paths that drive the most critical registers focuses on increasing the common portion of these paths. This approach utilizes a dedicated minimal clock tree to distribute the clock signal to these critical registers. The delay uncertainty can be further reduced to satisfy the design constraints by increasing the size of the buffers that drive this dedicated clock tree.

The dedicated tree is developed through the following steps. Initially, the registers of the critical data paths are identified. A minimal rectilinear Steiner tree is used to distribute the clock signal to only those registers. The root of this dedicated tree is the point whose coordinates are the arithmetic mean of the corresponding coordinates of the critical registers. To satisfy the delay uncertainty constraints among the registers, a buffer is inserted at the root of the tree. The size of this buffer is iteratively increased until all of the delay uncertainty constraints are satisfied. Once the delay uncertainty at the critical nodes is determined, the clock signal is distributed to the remaining clock registers and the root of the dedicated tree through a minimal wire length clock tree.
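The placement of the dedicated tree root can be illustrated with a short sketch. The register coordinates and the function names `dedicated_tree_root` and `manhattan` are hypothetical; only the arithmetic-mean rule comes from the text above.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def dedicated_tree_root(critical: List[Point]) -> Point:
    """Root location: arithmetic mean of the critical register coordinates."""
    if not critical:
        raise ValueError("at least one critical register is required")
    xs, ys = zip(*critical)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def manhattan(a: Point, b: Point) -> float:
    """Rectilinear distance, the metric used by rectilinear Steiner trees."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical critical register locations (arbitrary layout units).
critical_registers = [(0.0, 0.0), (40.0, 0.0), (40.0, 30.0), (0.0, 30.0)]
root = dedicated_tree_root(critical_registers)
distances = [manhattan(root, r) for r in critical_registers]
```

For this symmetric example the root lands at the center of the four registers, so every register is the same rectilinear distance from the root buffer.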

## IV. POWER AND AREA TRADEOFFS

Two different design strategies have been proposed for reducing the delay uncertainty among the clock signals that drive the registers of the most critical data paths. The primary design cost for both of these strategies is an increase in the capacitive load of the clock distribution network. This larger load is the result of increasing either the size of the clock buffers or the total wire length of the clock tree. Both of these strategies therefore increase the power dissipated by a clock tree. This effect is demonstrated by the application of the proposed strategies on a set of benchmark circuits. The resulting aggregate buffer size, clock tree wire length and power dissipation are listed in Table I.

Note in Table I that the application of the dedicated clock tree strategy results in a lower power dissipation compared with the buffer sizing approach, although the wire length of the corresponding clock tree is longer. The reduction in power dissipation is due to the smaller aggregate buffer size in the dedicated clock tree approach, compared with the aggregate buffer size for buffer sizing.
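The percentage columns of Table I follow directly from the paired entries. As a worked check, the sketch below recomputes them for circuit 2 using the values listed in that row; the function name `percent_change` and the dictionary keys are introduced here for illustration.

```python
def percent_change(before: float, after: float) -> float:
    """Signed change from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (after - before) / before


# Circuit 2 in Table I: (buffer sizing, dedicated tree) for each quantity.
circuit_2 = {
    "buffer_size": (18, 5),
    "area": (2348, 2706),
    "power": (425, 356),
}

changes = {name: percent_change(*pair) for name, pair in circuit_2.items()}
```

A negative value is a reduction and a positive value is an increase, so for circuit 2 the dedicated tree trades a larger clock tree for a smaller aggregate buffer size and lower power dissipation.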

## V. CONCLUSIONS

A methodology that satisfies the timing constraints of the most critical data paths in a circuit is presented in this paper. Two different layout design techniques are developed to reduce the uncertainty of the clock signal. One technique increases the size of the clock buffers inserted in the clock distribution network in order to reduce the delay uncertainty of the clock signal. The second technique exploits the common portion among the clock paths that drive the registers of the critical data paths. Simulation results from the application of these techniques to a set of benchmark circuits demonstrate useful tradeoffs among the aggregate size of the clock buffers, the total wire length of the clock tree, and the power dissipated by a clock distribution network.

## REFERENCES

- [1] R. Sitte, S. Dimitrijev, and H. B. Harrison, "Device Parameter Changes Caused by Manufacturing Fluctuations of Deep Submicron MOSFET's," *IEEE Transactions on Electron Devices*, Vol. 41, No. 11, pp. 2210–2215, November 1994.
- [2] S. Sauter, D. Schmitt-Landsiedel, R. Thewes, and W. Weber, "Effect of Parameter Variations at Chip and Wafer Level on Clock Skews," *IEEE Transactions on Semiconductor Manufacturing*, Vol. 13, No. 4, pp. 395–400, November 2000.
- [3] A. Vittal, L. H. Chen, M. Marek-Sadowska, K.-P. Wang, and S. Yang, "Crosstalk in VLSI Interconnections," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 18, No. 12, pp. 1817–1824, December 1999.
- [4] D. Velenis, M. C. Papaefthymiou, and E. G. Friedman, "Reduced Delay Uncertainty in High Performance Clock Distribution Networks," *Proceedings of the IEEE Design Automation and Test in Europe Conference*, pp. 68–73, March 2003.