# Implementation of Arbitrary Circuits using Modified Constant Delay Technique

<sup>1</sup>Leela Krishna Thota, <sup>2</sup>Selva Kumar Jayakumar

<sup>1,2</sup>Department of ECE, SRM University, Chennai, INDIA Email: <sup>1</sup>thota.leelakrishna@gmail.com, <sup>2</sup>selva2802@gmail.com

Abstract -- Pre-evaluation of the output before the inputs from the preceding stage are ready is an added advantage of the Modified Constant Delay (MCD) logic style. Besides adjusting the width of the timing window, clock allocation and distribution are considered crucial design factors. Power consumption is drastically reduced, but the pre-charge propagation path delay affects speed performance and limits the energy-delay product (EDP) improvement. Using a 45-nm general-purpose CMOS technology, MCD logic is evaluated for single-cycle multistage circuit blocks. Simulation results reveal that MCD logic achieves better performance and is more energy efficient than the other logic styles for the implementation of arbitrary circuits.

Index Terms – Contention, glitch, Modified Constant Delay (MCD) logic, pre-evaluation.

# **I. INTRODUCTION**

The ever-growing demand for low-power, energy-efficient, high-performance Very Large Scale Integration (VLSI) can be addressed at different design levels: the architectural, algorithmic, circuit, layout, and process technology levels. At the circuit design level, power can be saved by choosing a proper logic style for implementing arbitrary circuits. This is because all the important parameters governing power dissipation, such as the switching capacitance of the entire circuit, transition activity, and short-circuit currents, are strongly influenced by the chosen logic style. Depending on the application, the implementation methodology of the circuit, and the design technique used, different performance aspects become vital, which prevents the formulation of universal rules for optimal logic styles.

However, the performance enhancement comes with various costs, including reduced noise margin, the charge-sharing problem, and increased power dissipation due to higher data activity. Several variations of the pass-transistor CMOS logic style [1] and dynamic domino logic, namely NP domino (NORA domino) [2], zipper domino [3], and data-driven dynamic logic (D3L) [4], [5], have been proposed, but none has become widespread in the VLSI industry [6], [7].

Arranging dynamic and static gates alternately leads to Compound Domino Logic (CDL), which is known for high-performance circuit blocks such as the 64-bit adder [8]–[11]. In this design, the output inverter is replaced with a more complex inverting static gate, e.g., NAND, in order to satisfy the monotonicity requirement [12] without wasting one inverter delay. Tremendous research effort has been dedicated to exploring new logic styles that go beyond dynamic domino and static logic. In particular, Source-Coupled Logic (SCL) [13] demonstrated superior performance that is difficult to achieve with any other logic style. However, it suffers from high power dissipation due to its constant current requirement, and it needs complementary signals because of its differential nature.

The rest of the paper is organised as follows. Section II introduces the most important existing static and dynamic logic styles and compares them qualitatively. Section III explains the importance of FTL and introduces MCD logic. Section IV presents the design of MCD logic and the measurement of the unwanted glitch voltage using a first-order Taylor series approximation. Section V examines the impact of the timing window width on MCD logic and compares it with other logic styles using different logic expressions. Section VI examines the performance advantage of MCD logic for various circuits in terms of simulation results. Section VII concludes with some final comments.

## **II. LOGIC STYLES**

Consider dynamic domino logic (Fig. 1), in which the critical path consists of NMOS logic transistors. In FTL, the roles of the clock and logic transistors are interchanged, and the clock transistor itself is included in the critical path.



Fig. 1. Schematic of dynamic domino Logic with Footer transistor.

Feed Through Logic (FTL), a logic style capable of high performance, was subsequently introduced. The first generation of FTL revealed several defects, including excessive power dissipation and reduced noise margin.

# III. EXPLOITATION OF MCD LOGIC

#### 3.1. Instauration of FTL Logic

The basic operation of FTL [16], [17] (Fig. 2) in CMOS technology is as follows:

When CLK is high, the predischarge period occurs and Out is pulled down to GND through M2. When CLK is low, M1 is on, M2 is off, and the gate moves into the evaluation period.



Fig. 2. Basic FTL

If the inputs [I0, I1, ..., In] (IN) are at logic "1", Out enters the contention mode, in which M1 and the transistors in the NMOS pull-down network (PDN) conduct current at the same time. If the PDN is off, the output quickly rises to logic "1". In this case, FTL's critical path is always a single PMOS transistor.
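The three cases described above can be summarised in a small behavioural sketch. It is only a qualitative model under stated assumptions: the function name `ftl_state`, the string labels, and the representation of the PDN as a Python predicate are illustrative choices that do not appear in the original design, and no voltages or delays are modelled.

```python
from typing import Callable, Sequence


def ftl_state(clk: int, inputs: Sequence[int],
              pdn_conducts: Callable[[Sequence[int]], bool]) -> str:
    """Qualitative state of Out for one FTL gate (hypothetical helper).

    clk          -- 1 selects the predischarge period, 0 the evaluation period
    inputs       -- logic values applied to the NMOS pull-down network (PDN)
    pdn_conducts -- assumed predicate: True when the PDN forms a path to GND
    """
    if clk == 1:
        # M2 is on and pulls Out down to GND.
        return "predischarge"
    if pdn_conducts(inputs):
        # M1 and the PDN conduct at the same time.
        return "contention"
    # PDN is off: only the PMOS transistor M1 drives Out, which rises to logic 1.
    return "high"


# Example PDN (an assumption): a series NMOS stack conducts only if every input is 1.
series_pdn = all
```

For instance, `ftl_state(0, [1, 1], series_pdn)` falls into the contention case, while `ftl_state(0, [1, 0], series_pdn)` corresponds to the single-PMOS critical path.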

## 3.2. Proposed MCD logic

In order to resolve the above problems, MCD logic is proposed with a schematic shown in Fig. 3.



Fig. 3. CDL Block Diagram



Fig. 4. CDL Buffer

A buffer implemented in MCD logic, with schematics of the TB and LB, is shown in Fig. 4. Fig. 5 shows the waveform of the CDL basic buffer obtained from Cadence Virtuoso.



Fig. 5. Timing waveform indicating Glitch value

# IV. DESIGN CONSIDERATION FOR PROPOSED MCD LOGIC

#### 4.1. Output Glitch Measurement

Fig. 6 provides a simplified schematic of MCD logic in contention mode, where transistors P1 and N1 are ON simultaneously and induce a glitch voltage $\Delta V_1$, which in turn generates another, smaller glitch $\Delta V_2$.

Fig. 6. CDL during Contention Mode

By design, $\Delta V_1$ should be small ($< V_T$). Hence, P1 operates in the saturation region while N1 is in the linear region. Equating the two currents gives

$$\frac{1}{2}\mu_{p}C_{ox}\frac{W_{p1}}{L_{p1}}(V_{gsp1}-V_{tp})^{2} = \mu_{n}C_{ox}\frac{W_{n1}}{L_{n1}}\left[(V_{gsn1}-V_{tn})V_{dsn1}-\frac{(V_{dsn1})^{2}}{2}\right] \qquad (1)$$

where $\mu_p$ and $\mu_n$ are the hole and electron mobilities of the PMOS and NMOS transistors, respectively, $L$ and $W$ are the transistor length and width, respectively, $C_{ox}$ is the oxide capacitance, and $V_{ds}$ and $V_{gs}$ are the transistor drain-to-source and gate-to-source voltages, respectively. Assuming devices of the same length and $\mu_p \approx 0.5 \mu_n$, and substituting $V_{gsp1} = V_{gsn1} = V_{DD}$ and $V_{dsn1} = \Delta V_1$, rearranging (1) gives

$$\frac{W_{p1}}{4W_{n1}} \left( V_{DD} - V_{tp} \right)^2 = \left( V_{DD} - V_{tn} \right) \Delta V_1 - \frac{\left( \Delta V_1 \right)^2}{2} \qquad (2)$$

$\Delta V_1$ can be found by solving the quadratic equation (2):

$$\Delta V_{1} = (V_{DD} - V_{tn}) - \sqrt{(V_{DD} - V_{tn})^{2} - \frac{W_{p1}}{2W_{n1}}(V_{DD} - V_{tp})^{2}} \qquad (3)$$

By Taylor expansion, the square root is approximated to first order as

$$\sqrt{N^2 + d} = N + \frac{d}{2N} - \frac{d^2}{8N^3} + \frac{d^3}{16N^5} - \dots \approx N + \frac{d}{2N} \qquad (4)$$

Assuming $V_{tp} \approx V_{tn}$, (3) can be approximated as

$$\Delta V_{1} \approx (V_{DD} - V_{tn}) - (V_{DD} - V_{tn}) + \frac{W_{p1}(V_{DD} - V_{tp})^{2}}{4W_{n1}(V_{DD} - V_{tn})} \approx \frac{W_{p1}(V_{DD} - V_{tp})}{4W_{n1}} \qquad (5)$$
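To get a feel for how close the first-order result is to the exact root, the two expressions can be compared numerically. The sketch below is illustrative only: the function names, the supply voltage, the threshold voltages, and the width ratio are assumed values chosen for the example, not figures taken from the simulations in this paper.

```python
import math


def glitch_exact(wp_over_wn: float, vdd: float, vtn: float, vtp: float) -> float:
    """Delta V1 from the closed-form root in (3)."""
    a = vdd - vtn
    b = vdd - vtp
    disc = a * a - 0.5 * wp_over_wn * b * b
    if disc < 0:
        raise ValueError("equation (3) has no real root for these parameters")
    return a - math.sqrt(disc)


def glitch_first_order(wp_over_wn: float, vdd: float, vtp: float) -> float:
    """Delta V1 from the first-order result (5), which assumes Vtp ~ Vtn."""
    return wp_over_wn * (vdd - vtp) / 4.0


# Hypothetical operating point (assumptions, not values from the paper).
VDD, VTN, VTP = 1.0, 0.3, 0.3
RATIO = 0.5  # Wp1 / Wn1

exact = glitch_exact(RATIO, VDD, VTN, VTP)
approx = glitch_first_order(RATIO, VDD, VTP)
print(f"exact = {exact:.4f} V, first order = {approx:.4f} V")
```

With $V_{tp} = V_{tn}$ the first-order value is a lower bound on the exact root, because every term dropped from (4) adds to the glitch when $d$ is negative.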

$\Delta V_2$ is found through a similar approach. Consider Fig. 6 again: transistor N2 operates in the subthreshold region while P2 works in the linear region. Equating the two currents yields

$$\frac{W_{n2}}{L_{n2}} I_t\, e^{\frac{V_{gsn2} - V_{tn}}{\eta V_T}} \left(1 - e^{\frac{-V_{dsn2}}{V_T}}\right) = \mu_p C_{ox} \frac{W_{p2}}{L_{p2}} \left[(V_{gsp2} - V_{tp})V_{dsp2} - \frac{(V_{dsp2})^2}{2}\right] \qquad (6)$$

Referring to [18], [19], we have

$$I_t = \mu_0 C_{ox} (V_T)^2 e^{1.8}, \quad \eta = 1 + \frac{3T_{ox}}{W_{dm}}, \quad V_T = \frac{KT}{q} \qquad (7)$$

where $\mu_0$ is the zero-bias mobility, $\eta$ is the subthreshold swing coefficient, $T_{ox}$ is the oxide thickness, $W_{dm}$ is the maximum depletion layer width, $V_T$ is the thermal voltage, $K$ is the Boltzmann constant, $q$ is the electron charge, and $T$ is the temperature in kelvin. For NMOS transistors, $\mu_0$ is simply $\mu_n$.

$$\Delta V_{2} \approx \frac{A\, e^{\frac{\Delta V_{1} - V_{tn}}{\eta V_{T}}}}{2(V_{DD} - \Delta V_{1} - V_{tp})} \qquad (8)$$
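The exponential in (8) makes $\Delta V_2$ very sensitive to $\Delta V_1$, which a short numerical sketch can make concrete. The constant $A$ is not defined in the text, so it is passed in as a plain parameter here; the function names and all numeric arguments in the example are assumptions used for illustration only.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_k: float) -> float:
    """Thermal voltage V_T = K*T/q from (7), in volts."""
    return BOLTZMANN * temp_k / ELECTRON_CHARGE


def glitch_dv2(a_const: float, dv1: float, vdd: float, vtn: float,
               vtp: float, eta: float, temp_k: float = 300.0) -> float:
    """Delta V2 according to (8); a_const stands for the constant A."""
    vt = thermal_voltage(temp_k)
    exponent = (dv1 - vtn) / (eta * vt)
    return a_const * math.exp(exponent) / (2.0 * (vdd - dv1 - vtp))


# Hypothetical sweep of Delta V1 with A = 1 (values are illustrative assumptions).
for dv1 in (0.05, 0.10, 0.15):
    print(dv1, glitch_dv2(1.0, dv1, vdd=1.0, vtn=0.3, vtp=0.3, eta=1.5))
```

Because the result scales linearly with `a_const`, the sweep shows only the relative growth of $\Delta V_2$ as $\Delta V_1$ approaches $V_{tn}$, not absolute glitch voltages.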

#### 4.2. Estimation of Power Consumption

Data activity measures how frequently signals switch and is defined as

$$\text{Data Activity} = \frac{\#\,\text{signal transitions}}{\#\,\text{signals} \times \#\,\text{clock cycles}} \qquad (9)$$
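As a quick illustration of (9), the ratio can be computed from sampled logic values. The sketch below assumes a particular sampling convention that is not specified in the text: each signal is given as its initial value followed by one sample per clock cycle, so a list of length n + 1 covers n cycles. The function name `data_activity` is likewise hypothetical.

```python
from typing import Sequence


def data_activity(waveforms: Sequence[Sequence[int]]) -> float:
    """Data activity per (9) for equally long sampled waveforms.

    waveforms[i][0] is the initial value of signal i and waveforms[i][k]
    its value after clock cycle k (assumed convention).
    """
    n_signals = len(waveforms)
    n_cycles = len(waveforms[0]) - 1
    if n_signals == 0 or n_cycles <= 0:
        raise ValueError("need at least one signal and one clock cycle")
    transitions = 0
    for wave in waveforms:
        if len(wave) != n_cycles + 1:
            raise ValueError("all waveforms must have the same length")
        # Count every cycle in which the signal value changes.
        transitions += sum(1 for prev, cur in zip(wave, wave[1:]) if prev != cur)
    return transitions / (n_signals * n_cycles)


# One signal toggling every cycle and one constant signal, over four cycles.
print(data_activity([[0, 1, 0, 1, 0], [1, 1, 1, 1, 1]]))
```

Under this convention, a signal that toggles in every cycle has 100% data activity and a constant signal has 0%.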

## V. MCD LOGIC DEPICTION

All logic transistors have a 120 nm effective NMOS width. For MCD logic, the PMOS clock transistor's width is set to 240 nm. The data and clock frequencies are set to 2 MHz. The transistor sizes are optimized primarily for delay, because the main objective is to explore MCD logic's performance advantage.

### 5.1. MCD logic Performance

MCD logic demonstrates superior performance for complicated logic expressions in the D-Q mode due to the pre-evaluated characteristic. MCD logic is approximately two times faster than dynamic domino logic. This is attributed to:

- 1) the pre-evaluated characteristic;
- 2) the smaller number of transistors in the critical path.

# VI. PERFORMANCE ANALYSIS

#### 6.1. 8-bit Ripple Carry Adders (RCAs)



Fig. 7. RCA Block Diagram

The main intent of this 8-bit RCA is to demonstrate the performance advantage of MCD logic and to discuss design considerations that should be taken into account when MCD logic is used. A more energy-efficient pass-transistor FA design [15] is used in the subsequent analysis to provide a more realistic comparison. Fig. 7 depicts the RCA block diagram and FA schematic. The worst-case timing for MCD logic occurs when $A[0:7] = 0$, $B[0:7] = 1$, $C_{in} = 0$.

#### 6.2. 32-bit Carry Look-Ahead Adder (CLA)

A 32-bit CLA is implemented to further analyse MCD logic performance. The detailed operation of the CLA is described in [6], and the schematic is displayed in Fig. 8. The 32-bit CLA uses eight 4-bit FAs with dedicated circuitry to facilitate carry generation. The energy-efficient FA used in this analysis uses the pass-transistor logic style with only 24 transistors for sum generation [15].
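A purely functional sketch of such an adder, built from eight 4-bit carry look-ahead blocks, is given below. It models arithmetic behaviour only and says nothing about the MCD circuit implementation. Two points are assumptions rather than details from this paper: the carries inside each block use the standard generate/propagate look-ahead expressions, and the carry between blocks is simply rippled, since the dedicated carry circuitry is not specified in the text. The names `cla4` and `add32` are hypothetical.

```python
from typing import Tuple


def cla4(a: int, b: int, cin: int) -> Tuple[int, int]:
    """4-bit carry look-ahead block: returns (4-bit sum, carry out)."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(4)]  # generate bits
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # propagate bits
    c0 = cin & 1
    # Every carry is written directly in terms of g, p and c0 (look-ahead).
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = (c0, c1, c2, c3)
    total = 0
    for i in range(4):
        total |= (p[i] ^ carries[i]) << i
    return total, c4


def add32(a: int, b: int, cin: int = 0) -> Tuple[int, int]:
    """32-bit addition from eight cla4 blocks; block carries ripple (assumed)."""
    result = 0
    carry = cin
    for k in range(8):
        nibble_sum, carry = cla4((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF, carry)
        result |= nibble_sum << (4 * k)
    return result, carry
```

The model is useful mainly as a reference for checking simulated sum and carry outputs against expected values.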



Fig. 8. 32-bit CLA Block Diagram

#### 6.3. Simulation Results & Discussion

Static, dynamic, and MCD logic RCAs are compared using various figures of merit at different data activity factors. The MCD logic based RCA is approximately 17% and 37% faster than the static and dynamic counterparts, respectively. On the other hand, the power consumption of MCD logic ranges from 0.5 to 2.8 times higher than that of static logic. In terms of the PDP, MCD logic is 0.8 and 1.1 less than dynamic and static logic, respectively, at 100% data activity. MCD logic provides a speed advantage that logic styles such as static and dynamic find difficult to reach [24].
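The figures of merit used in this comparison can be derived from measured average power and delay. The sketch below uses the usual definitions, PDP as power times delay and EDP as PDP times delay, which are assumed here because the text does not spell them out. The class name, the helper, and the numbers in the example are hypothetical and are not simulation results from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Average power (W) and critical-path delay (s) of one design."""
    power_w: float
    delay_s: float

    @property
    def pdp(self) -> float:
        """Power-delay product in joules."""
        return self.power_w * self.delay_s

    @property
    def edp(self) -> float:
        """Energy-delay product in joule-seconds."""
        return self.pdp * self.delay_s


def relative_change(new: float, reference: float) -> float:
    """Signed fractional change of `new` with respect to `reference`."""
    return (new - reference) / reference


# Made-up operating points, for illustration only.
design_a = Measurement(power_w=2.0e-6, delay_s=1.0e-9)
design_b = Measurement(power_w=3.0e-6, delay_s=0.5e-9)
print(relative_change(design_b.edp, design_a.edp))
```

In this made-up case the second design draws more power but still has the lower EDP, because delay enters the EDP quadratically.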

Simulations are performed in Cadence with the 45-nm Generic Process Design Kit (GPDK) model card, which includes all the process corners. The layout is implemented in Cadence Virtuoso using the ADEXL tool.

Fig. 9 shows the CDL basic buffer layout.



Fig. 9. CDL buffer layout.

# VII. CONCLUSION

A novel high-performance logic style based on a self-reset circuit that makes use of the pre-evaluation technique was presented. MCD logic was used to evaluate the performance of single-cycle multistage circuit blocks in which a unique critical path exists. Implementing an 8-bit RCA in MCD logic provides an area improvement of 22.3% and a speed improvement of 16% compared with the static logic style. Performance analysis of 32-bit CLAs reveals that MCD logic is 15% faster than dynamic domino logic. Compared with the 8-bit adders, MCD logic achieves a better delay improvement and an even better EDP reduction when implemented in the 32-bit adder. PDP and EDP are measured for the arbitrary circuits, where a drastic improvement is observed. MCD logic's advantages in terms of delay and EDP can also be extended to the design of multipliers, which would enable an efficient ALU design.

## REFERENCES

- [1] R. Zimmermann and W. Fichtner, "Low-power logic styles: CMOS versus pass-transistor logic," IEEE J. Solid-State Circuits, vol. 32, no. 7, pp. 1079–1090, Jul. 1997.
- [2] N. Goncalves and H. De Man, "NORA: A racefree dynamic CMOS technique for pipelined logic structures," IEEE J. Solid-State Circuits, vol. 18, no. 3, pp. 261–266, Jun. 1983.
- [3] C. Lee and E. Szeto, "Zipper CMOS," IEEE Circuits Syst. Mag., vol. 2, no. 3, pp. 10–16, May 1986.
- [4] R. Rafati, S. Fakhraie, and K. Smith, "A 16-bit barrel-shifter implemented in data-driven dynamic logic (D3L)," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 53, no. 10, pp. 2194–2202, Oct. 2006.
- [5] F. Frustaci, M. Lanuzza, P. Zicari, S. Perri, and P. Corsonello, "Low-power split-path data-driven dynamic logic," Circuits Dev. Syst. IET, vol. 3, no. 6, pp. 303–312, Dec. 2009.
- [6] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, 4th ed. Reading, MA: Addison Wesley, Mar. 2010.
- [7] K. Bernstein, High Speed CMOS Design Styles, 1st ed. New York: Springer-Verlag, Aug. 1998.
- [8] S. Mathew, R. K. Krishnamurthy, M. A. Anders, R. Rios, K. R. Mistry, and K. Soumyanath, "Sub-500-ps 64-b ALUs in 0.18μm SOI/bulk CMOS: Design and scaling trends," IEEE J. Solid-State Circuits, vol. 36, no. 11, pp. 318–319, Nov. 2001.
- [9] S. Mathew, M. Anders, R. Krishnamurthy, and S. Borkar, "A 4 GHz 130 nm address generation unit with 32-bit sparse-tree adder core," in VLSI Circuits Dig. Tech. Papers Symp., 2002, pp. 126–127.
- [10] S. K. Mathew, M. A. Anders, B. Bloechel, T. N. Krishnamurthy, and S. Borkar, "A 4-GHz 300-mW 64-bit integer execution ALU with dual supply voltages in 90-nm CMOS," IEEE J. Solid-State Circuits, vol. 40, no. 1, pp. 44–51, Jan. 2005.
- [11] S. Wijeratne, N. Siddaiah, S. Mathew, M. Anders, R. Krishnamurthy, J. Anderson, S. Hwang, M. Ernest, and M. Nardin, "A 9 GHz 65 nm Intel Pentium 4 processor integer execution core," in IEEE Int. Solid-State Circuits Conf. ISSCC Dig. Tech. Papers, San Francisco, CA, Feb. 2006, pp. 353–365.
- [12] I. Sutherland, R. F. Sproull, and D. Harris. (Feb. 1999). Logical Effort: Designing Fast CMOS Circuits [Online]. Available: http://amazon.com/o/ASIN/1558605576/
- [13] S. Kiaei, S.-H. Chee, and D. Allstot, "CMOS source-coupled logic for mixed-mode VLSI," in Proc. IEEE Int. Circuits Syst. Symp., New Orleans, LA, May 1990, pp. 1608–1611.
- [14] L. McMurchie, S. Kio, G. Yee, T. Thorp, and C. Sechen, "Output prediction logic: A high-performance CMOS design technique," in Proc. Comput. Des. Int. Conf., Austin, TX, 2000, pp. 247–254.
- [15] M. Aguirre-Hernandez and M. Linares-Aranda, "CMOS full-adders for energy-efficient arithmetic applications," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 99, pp. 1–5, Apr. 2010.
- [16] V. Navarro-Botello, J. A. Montiel-Nelson, and S. Nooshabadi, "Low power arithmetic circuit in feedthrough dynamic CMOS logic," in Proc. IEEE Int. 49th Midw. Symp. Circuits Syst., Aug. 2006, pp. 709–712.
- [17] V. Navarro-Botello, J. A. Montiel-Nelson, and S. Nooshabadi, "Analysis of high-performance fast feedthrough logic families in CMOS," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 54, no. 6, pp. 489–493, Jun. 2007.
- [18] Y. Taur and T. Ning, Fundamentals of Modern VLSI Devices. Cambridge, U.K.: Cambridge Univ. Press, 1998.
- [19] K. Roy, S. Mukhopadhyay, and H. Mahmoodi-Meimand, "Leakage current mechanisms and leakage reduction techniques in deep submicrometer CMOS circuits," Proc. IEEE, vol. 91, no. 2, pp. 305–327, Feb. 2003.
- [22] P. Chuang, D. Li, and M. Sachdev, "Design of a 64-bit low-energy high-performance adder using dynamic feedthrough logic," in Proc. IEEE Int. Circuits Syst. Symp., May 2009, pp. 3038–3041.
