

## Università degli Studi di Pavia Facoltà di Ingegneria Dipartimento di Elettronica

## Ph.D. Program in Microelectronics, XXIV Cycle

## Wideband Low-Power CMOS Analog Building Blocks for Millimeter-Wave Phased-Array Receivers

Advisor: Prof. Francesco Svelto

Coordinator: Prof. Franco Maloberti

> Ph.D. thesis by Andrea Ghilioni

To Tamigio

"Research is what I'm doing when I don't know what I'm doing." Wernher Von Braun

"No great discovery was ever made without a bold guess." Isaac Newton

*"If you can not saw with a file or file with a saw, then you will be no good as experimentalist."* Augustin Fresnel

"Before anything else, preparation is the key to success." Alexander Graham Bell

"No man should escape our universities without knowing how little he knows." Julius Robert Oppenheimer

"The first principle is that you must not fool yourself, and you are the easiest person to fool." Richard Feynman

"When there is no hope in the future, there is no power in the present." James Clerk Maxwell

"Scientists investigate that which already is; Engineers create that which has never been." Albert Einstein

"Engineers like to solve problems. If there are no problems handily available, they will create their own problems." Scott Adams

## Contents

| Section | Title |
|---------|-------|
|         | Preface |
| 1       | CMOS millimeter-waves: applications and opportunities |
| 1.1     | The Moore's Law: the roadmap toward the high frequencies |
| 1.2     | The advent of millimeter-waves on CMOS |
| 1.3     | Applications at millimeter-waves |
| 1.4     | IEEE 802.15.3c WPAN standard |
| 2       | Overview of phased-array systems |
| 2.1     | Basic operative principle and performances |
| 2.1.1   | Interference of EM waves |
| 2.1.2   | Receiver operation |
| 2.1.3   | Transmitter operation |
| 2.2     | Receiver architecture |
| 2.3     | Overview of the state of the art |
| 3       | An a… |
| 3.1     | Circuit description |
| 3.1.1   | Transconductor |
| 3.1.2   | Input matching network |
| 3.1.3   | Switching pair |
| 3.1.4   | Image rejection filter |
| 3.1.5   | Noise analysis |
| 3.2     | Experimental results |
| 3.3     | Conclusions |
| 4       | Quadrature frequency generation |
| 4.1     | Quadrature VCO |
| 4.1.1   | Flicker noise issue in quadrature generation |
| 4.1.2   | Basic idea |
| 4.1.3   | Circuit design |
| 4.1.4   | Experimental results |
| 4.2     | Frequency divider architecture based on dynamic latches |
| 4.2.1   | Overview of the state of the art |
| 4.2.2   | Basic idea |
| 4.2.3   | Circuit design |
| 4.2.4   | Experimental results |
| 4.2.5   | Performance improvement |
| 4.3     | Conclusions |
|         | General conclusions |
| A       | Phase rotators |
|         | Bibliography |

## Preface

Since the potential of wireless telecommunications was unveiled in 1901 by Guglielmo Marconi with his first transoceanic wireless telegraph experiment, the demand for higher data rates has grown steadily. Over time, wireless technology has spread deeply into our society, providing more and more applications and contributing to its progress. In turn, strong market demand has encouraged a huge development effort in the field, and has consequently led to its exponential growth.

A topic of great interest for today's research, both academic and industrial, is the phased-array architecture, which was initially introduced in the 1950s as a method to combine multiple transmitters or receivers into a single high-performance system. Several identical radiators are physically placed in an ordered matrix and driven by phase-shifted replicas of the same signal. The interference of the individual EM fields builds an electronically steerable directive antenna that also offers interference rejection, an enhanced signal-to-noise ratio (for the receiver) and an increased output power (for the transmitter). Because of the formerly high cost and critical complexity of the assembled structures, this approach was originally employed in military and space fields only; thanks to the constant performance improvement and cost reduction of technologies over time, phased-arrays are becoming feasible even for consumer applications.

The higher the operating frequency of the system, the larger the achievable bandwidth and thus the speed of the communication. Thanks to the steady scaling of the minimum device size of silicon technologies, today's cutting-edge topic is the millimeter-wave field, formally located between 30 and 300 GHz but habitually identified with the buzzword "60 GHz". Such a high frequency promises data rates in excess of 10 Gbps and, thanks to the short wavelength, a System On Chip (SOC) approach with embedded antennas becomes feasible too. The subject has been studied for over a decade, initially on expensive compound semiconductors and subsequently on cheap bulk CMOS technology, which truly opens the door to the widespread adoption of mm-wave phased-array systems in the consumer market.

This Ph.D. thesis is focused on the development of novel and high-performance basic building blocks for mm-wave phased-array receivers in nm-scaled bulk CMOS technology, and it is organized as follows:

**Chapter 1** outlines the motivation of the work. It gives an overview of the field of millimeter-waves on silicon CMOS: how and why it has come about, which perspectives it has introduced and what applications it has made feasible and interesting to investigate.

**Chapter 2** is focused on phased-arrays. First it introduces the basic physical principle and the architecture, then it proposes a possible solution. The chapter examines the key design points and provides a brief summary of the state of the art.

**Chapter 3** describes an area- and power-efficient merged LNA and mixer suitable for a heterodyne architecture. Circuit solutions are introduced to minimize noise, leading to an overall noise figure and power consumption comparable with state-of-the-art CMOS LNAs.

**Chapter 4** addresses the implementation of basic PLL blocks for frequency generation. Firstly, it introduces a quadrature VCO based on magnetically coupled resonators; the oscillation frequency is set by inter-stage passive components only, thus greatly reducing the conversion of flicker noise into phase noise, and demonstrating accurate quadrature phases. Secondly, it proposes a new architecture for high-speed frequency dividers; through the use of clocked differential amplifiers working as dynamic CML latches, high speed and low power consumption are achieved simultaneously without the use of inductors, hence occupying a very small area. Several prototypes of dividers by 2 and 4 have been realized in two different technology nodes and characterized.

Pavia, November 8, 2011.

Andrea Ghilioni

## Chapter 1

# CMOS millimeter-waves: applications and opportunities

Microelectronics is widespread today. Most of our actions lead us to interact with some microelectronic device. Sometimes the contact is evident, as when using a computer, calling with a mobile phone, or listening to an iPod. At other times it is hidden, as when accelerating a modern car: pushing the throttle no longer pulls a wire as it once did, but a sensor in the pedal tells the Electronic Control Unit how to drive the engine on our behalf. These are just a few examples that illustrate how deeply microelectronics has permeated almost all the markets of today's economy, and how it has become one of the basic building blocks of today's society.



**Figure 1.1:** The pervasiveness of microelectronics.

## 1.1 The Moore's Law: the roadmap toward the high frequencies

Microelectronics employs very sophisticated and expensive technologies, so its development requires huge capital. For instance, building a cutting-edge silicon factory today requires an investment of several billion dollars. This is hardly a problem, though: consumers are always eager to buy the latest smartphone, the updated laptop and the newest electronic plaything, so plenty of money is always at hand. This is the basis of a virtuous cycle that was predicted about fifty years ago by a far-sighted man named Gordon Moore, who proposed the most quoted law in the history of semiconductors. In 1965 the future co-founder of Intel Corporation postulated that the number of transistors crammed onto a single integrated circuit doubles at regular intervals [1], a period he later settled at approximately two years. The development of commercially available microprocessors over the last four decades, shown in Figure 1.2, has proven him right.
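As a rough numerical illustration of the two-year doubling rule, the short Python sketch below projects a transistor count forward in time. It is only a sketch under stated assumptions: the function name is hypothetical, and the starting point of about 2300 transistors in 1971 is an assumed figure for an early microprocessor, not a value taken from this text.

```python
def projected_transistors(initial_count, start_year, year, doubling_period_years=2.0):
    """Transistor count predicted by a fixed doubling period."""
    doublings = (year - start_year) / doubling_period_years
    return initial_count * 2 ** doublings


if __name__ == "__main__":
    # Assumed starting point: roughly 2300 transistors in 1971.
    for year in (1971, 1981, 1991, 2001, 2011):
        count = projected_transistors(2300, 1971, year)
        print(f"{year}: {count:.3g} transistors")
```

Forty years correspond to twenty doublings, i.e. a factor of about one million, which is the order of magnitude spanned by the vertical axis of a plot such as Figure 1.2.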

Central Processing Units have always driven the development of integrated semiconductor technologies because of their general-purpose capabilities; digital applications hold the largest share of the market today thanks to their clear superiority over the analog field in signal processing, data storage and communication. The technologists' ability to shrink integrated devices over time allows designers to squeeze more and more functions onto the same chip, reducing the cost of the system while increasing its potential. This widens the pool of potential buyers, increases sales and raises more and more capital for further development.

Several semiconductor technologies are available to make transistors today; to name just a few: Si, SiGe, Ge, GaAs, InP, pHEMT, HBT, etc. Silicon is the poorest performer: its carrier mobility is among the lowest of these materials, and thus it is usually considered suitable just for "low-frequency" applications. In its 2003 report, the International Technology Roadmap for Semiconductors [3] rated silicon only up to 5 GHz, while indicating the more expensive SiGe, GaAs and InP for analog applications ranging from 10 to 100 GHz. Despite its limited performance, silicon has always been the most widely employed semiconductor for several reasons: the raw material is relatively cheap and readily available, its dioxide is an excellent dielectric that can be easily grown with a high yield, and digital consumer applications work perfectly well on silicon CMOS. This is the reason behind the huge worldwide effort in the research and development of silicon technology, which has led to a dramatic reduction in the minimum achievable gate length and, in turn, to an extraordinary increase in the maximum operating frequency, as shown in Figure 1.3.

Though mostly oriented to digital applications, deep sub-micron silicon CMOS lends itself to implementing analog functions at high frequencies. In the RF field this has been of great benefit, substantially reducing costs and allowing the System On Chip (SOC) approach [4]. In almost all of today's commercial mobile devices, GSM, UMTS, GPRS and HSPA transceivers, whose carriers lie approximately between 2 and 8 GHz, are made on a single chip<sup>1</sup> in 90 nm or 65 nm, clearly demonstrating the affordability of the technology in those fields. This strongly justifies and prepares the ground for a further increase in the operating frequency up to the mm-waves, allowing data rates of several Gbps for wireless systems. Today commercial systems on silicon CMOS operating beyond 10 GHz are still rare, but the research in this field is very aggressive, and the situation will change shortly.

**Figure 1.2:** Transistor count in commercial microprocessors from 1971 to 2011 [2]. Moore's prediction is shown as a dashed line.

<sup>1</sup>With one exception: the power amplifier. Due to the typically low output power and efficiency achievable on CMOS, this block is usually realized on a separate chip made of a compound semiconductor.



**Figure 1.3:** Evolution of minimum gate length and maximum operative frequency of silicon CMOS transistors. Past data and future prediction are based on 2003-2010 ITRS reports [3].

## 1.2 The advent of millimeter-waves on CMOS

Each step forward in the available technologies opens the door to interesting new possibilities and innovative applications. Hand in hand with the scaling of the minimum gate length, more than ten metal levels have become available for interconnections, adding several degrees of freedom for the fabrication of monolithic passive components. On-chip inductors and transmission lines, both indispensable for high-frequency design, have steadily improved in performance, enhancing the feasibility of analog RF and millimeter-wave circuits. One of the first attempts to go beyond 10 GHz with distributed devices on silicon CMOS dates back to 2001 [5]; shortly afterwards the millimeter-wave field became the subject of intense research [6]-[10], leading to systematic design methodologies [11]-[13] ready for industrial production.

The continuous quest for higher carrier frequencies is not pursued just for the sake of overcoming existing limits; its strong motivation can be understood at a glance from Figure 1.4. The higher the operating frequency, the easier it is to broaden the bandwidth for a given quality factor of the passive components: e.g. the -3 dB bandwidth of a parallel LC resonator with resonance frequency $f_0$ and quality factor $Q$ is equal to:

$$BW = \frac{f_0}{Q} \tag{1.1}$$



**Figure 1.4:** Datarate versus bandwidth of various wireless standards. Spectral efficiencies are highlighted [14].

In the millimeter-wave range it is then possible to achieve good gain over several GHz of bandwidth employing simple second-order resonant networks<sup>2</sup>. The available spectrum is broad and still sparsely populated, so no complex modulation techniques are required to maximize the spectral efficiency. Plain system architectures can be successfully employed, reducing the development effort and enhancing the reliability of the structure, while data rates in excess of 1 Gbps can still be achieved even with a low spectral efficiency. The data rate C over a noisy transmission channel is directly proportional to the allocated bandwidth BW, as demonstrated in Shannon's well-known work [15]:

$$C = BW \cdot \log_2(1 + SNR) \tag{1.2}$$

where SNR is the signal-to-noise ratio. As an example, a 2 GHz bandwidth around 60 GHz straightforwardly yields a bit rate of up to 1 Gbps by means of a trivial OOK modulation. From (1.1), this sets an upper bound of 30 on the quality factor<sup>3</sup>, so only a few tuned gain stages are needed and the overall power consumption of the system is contained. Obviously, several design challenges are present in this field, mainly related to the high-frequency modeling of transistors [10] and the accurate electromagnetic characterization of all the passive components, interconnections and the ground plane [16].

<sup>2</sup>This is thanks to the good quality factors achievable at mm-waves by on-chip inductors on modern silicon CMOS substrates.
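The numbers in this example can be checked with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and Shannon's capacity is written as the bandwidth multiplied by $\log_2(1 + SNR)$, consistent with the proportionality to BW stated in the text.

```python
import math


def max_q_for_bandwidth(f0_hz, bw_hz):
    """Largest resonator Q compatible with a -3 dB bandwidth, from BW = f0 / Q."""
    return f0_hz / bw_hz


def shannon_capacity_bps(bw_hz, snr_linear):
    """Channel capacity C = BW * log2(1 + SNR), with SNR as a linear ratio."""
    return bw_hz * math.log2(1 + snr_linear)


def min_snr_db(bw_hz, rate_bps):
    """Smallest SNR (in dB) for which the capacity reaches rate_bps."""
    snr_linear = 2 ** (rate_bps / bw_hz) - 1
    return 10 * math.log10(snr_linear)


if __name__ == "__main__":
    f0, bw, rate = 60e9, 2e9, 1e9
    print("Q upper bound:", max_q_for_bandwidth(f0, bw))            # 30.0
    print("Spectral efficiency [bit/s/Hz]:", rate / bw)             # 0.5
    print("Minimum SNR for 1 Gbps [dB]:", round(min_snr_db(bw, rate), 2))
```

A 1 Gbps link in 2 GHz of bandwidth needs a spectral efficiency of only 0.5 bit/s/Hz, which is why a simple modulation such as OOK is adequate in this example.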

## 1.3 Applications at millimeter-waves

The 60 GHz range offers a wide bandwidth over short distances, so it is well suited for operation in indoor environments. Walls and floors attenuate millimeter waves by up to 40 dB depending on material properties and thickness, so a network cell will typically be confined within a single room. This feature is an advantage for avoiding interference between adjoining cells: since no spectral diversity is required, the whole bandwidth depicted in Figure 1.8 can be used, allowing data transfer at full speed. A possible home entertainment application is shown in Figure 1.5: wireless peer-to-peer communication can be established between smartphones, laptops and home cinema systems for sharing and streaming HD videos, photos and music; mass storage devices can be part of the network for backup purposes, and a hotspot provides a connection to the internet. Within this category, many services oriented to the fast distribution of information in public places can be implemented: e.g. data kiosks inside museums can rapidly push a multimedia guide to the visitor's tablet PC, or urgent flight information can be sent directly to travelers' smartphones in airports.

<sup>3</sup>The maximum quality factor actually achievable around 60 GHz for on-chip inductors on current bulk CMOS substrates is about 15 to 20.

**Figure 1.5:** Home entertainment holds a large share of the market. Indoor wireless multimedia streaming and syncing is a desirable application.

Another interesting application is a short-range automotive radar system to create a safer and more comfortable driving environment. Adding sight capabilities to cars enables new kinds of driving aids, such as pre-braking for collision avoidance, steering correction for lane following and throttle modulation for adaptive cruise control, as shown in Figure 1.6. As an example, at 110 km/h (about 30.6 m/s) saving one second in the driver's response time is equivalent to a reduction of more than 30 meters in the distance covered while braking. Although such devices are already on the market, they are currently realized on expensive compound semiconductors, which confines them to a niche use on luxury cars. The chance to integrate such functions on cheaper technologies has recently led several governments worldwide to plan to make them a standard accessory in the coming years.

Up to now, quite a few standards have been defined for different applications. Short-range (<10 m) radars require a wide azimuth angle $(>70^{\circ})$ and fine resolution (<10 cm), and thus a wide bandwidth of 7 GHz is provided together with a pulsed modulation scheme. Mid- and long-range radars use frequency-modulated continuous-wave (FMCW) modulation with a narrower bandwidth, since a lower spatial resolution is required due to the greater distance of the target. A summary of those standards is presented in Table 1.1 [17].
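The link between bandwidth and spatial resolution can be made concrete with the textbook range-resolution bound $\Delta R = c / (2\,BW)$. This relation is not stated in the text and is introduced here as an assumption; the Python sketch below, with hypothetical names, simply evaluates it for the three bandwidths quoted for the automotive radar classes.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def range_resolution_m(bandwidth_hz):
    """Theoretical range resolution of a radar, assuming delta_R = c / (2 * BW)."""
    return SPEED_OF_LIGHT / (2.0 * bandwidth_hz)


if __name__ == "__main__":
    # Bandwidths quoted in the text for short-, mid- and long-range radars.
    for label, bw in (("short", 7e9), ("mid", 250e6), ("long", 1e9)):
        print(f"{label:5s}: {range_resolution_m(bw) * 100:.1f} cm")
```

Under this assumption a 7 GHz bandwidth gives about 2 cm, comfortably within the <10 cm requirement, while 250 MHz gives about 60 cm. The resolutions specified by the standards are looser than this theoretical bound.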

| Class | $f_0$  | BW      | Modulation | Azimuth angle               | Range | Resolution |
|-------|--------|---------|------------|-----------------------------|-------|------------|
| Short | 24 GHz | 7 GHz   | Pulsed     | 70°                         | 10 m  | < 10 cm    |
| Mid   | 24 GHz | 250 MHz | FMCW       | $30^{\circ}\sim 60^{\circ}$ | 40 m  | $\sim 1$ m |
| Long  | 77 GHz | 1 GHz   | FMCW       | 16°                         | 150 m | $\sim 1$ m |

**Table 1.1:** Classification of automotive radar systems.

**Figure 1.6:** The typical scenario envisaged today considers 24 GHz and 77 GHz as the most likely carriers for short- and mid-range automotive radar.

A third interesting application is in the imaging area, i.e. forming images through the detection of millimeter-wave radiation from a scene. Passive millimeter-wave (PMMW)<sup>4</sup> static imaging has been performed for decades, but today's technology has made real-time imaging systems at video rates feasible, renewing the interest in this area. This technology has a huge potential, since several materials that are opaque to visible light become transparent when observed at millimeter-waves, such as haze, fog, clouds, smoke or sandstorms, and even clothing. This ability to see under conditions of low visibility that would ordinarily blind visible or infrared (IR) sensors has the potential to transform the way low-visibility conditions are dealt with. For the military, low visibility can become an asset rather than a liability; in the commercial realm, fog-bound airports could be eliminated as a cause of flight delays or diversions; for security concerns, imaging of concealed weapons could be accomplished in a fast and non-intrusive manner. For the last purpose, the short wavelength of millimeter-waves is beneficial: the resolution<sup>5</sup> of an imaging system that employs an EM wave as a probe is defined by

$$\sin\theta \approx 1.22 \frac{\lambda}{S} \tag{1.3}$$

where $\theta$ and S are defined in Figure 1.7. A short wavelength thus allows small objects to be detected even with small antennas. The millimeter-wave range is so called because, for frequencies between 30 GHz and 300 GHz, the wavelength lies between 10 mm and 1 mm. As an example, a 10 cm wide antenna working at 60 GHz ($\lambda = 5$ mm) can resolve an object with a radius as small as 3 mm from a distance of 5 cm. Combining this feature with the fact that millimeter-waves are much less harmful to the human body than X-rays, such imaging systems are well suited to non-invasive medical screening applications, such as early detection of breast and skin tumors, and to security applications such as body scanners.

**Figure 1.7:** Resolution of an imaging system. S is the size of the antenna, R is the distance of the target and $\theta$ is the minimum detectable angle.

<sup>4</sup>The system is passive if the detection is based only on the naturally occurring radiation from the target.

<sup>5</sup>I.e. the size of the smallest detectable object.
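The 60 GHz imaging example can be reproduced numerically from (1.3). The Python sketch below is a minimal illustration with hypothetical function names; it assumes that the smallest resolvable feature at a distance R is approximately $R \tan\theta$, which is one plausible reading of the geometry of Figure 1.7 rather than a formula given in the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def wavelength_m(frequency_hz):
    """Free-space wavelength."""
    return SPEED_OF_LIGHT / frequency_hz


def min_angle_rad(frequency_hz, aperture_m):
    """Minimum detectable angle from sin(theta) = 1.22 * lambda / S."""
    return math.asin(1.22 * wavelength_m(frequency_hz) / aperture_m)


def smallest_feature_m(frequency_hz, aperture_m, distance_m):
    """Smallest resolvable size at a given distance, assumed to be R * tan(theta)."""
    return distance_m * math.tan(min_angle_rad(frequency_hz, aperture_m))


if __name__ == "__main__":
    print(f"lambda at 30 GHz : {wavelength_m(30e9) * 1e3:.2f} mm")
    print(f"lambda at 300 GHz: {wavelength_m(300e9) * 1e3:.2f} mm")
    size = smallest_feature_m(60e9, aperture_m=0.10, distance_m=0.05)
    print(f"10 cm antenna, 60 GHz, 5 cm away: {size * 1e3:.2f} mm")
```

With these inputs the result is close to 3 mm, in agreement with the figure quoted above.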

## 1.4 IEEE 802.15.3c WPAN standard

Since the IEEE 802.15.3 Task Group 3c (TG3c) was formed in 2005 [18] to develop a millimeter-wave-based Medium Access Control (MAC) and Physical Layer (PHY) for Wireless Personal Area Networks (WPANs), all the research described so far has turned abruptly from mere speculation into a milestone of tomorrow's market. The standardized spectrum depicted in Figure 1.8 is designed to support high data rates of at least 1 Gbps for applications such as high-speed internet access and streaming content download (video on demand, home theater, etc.). Optional very high data rates in excess of 2 Gbps are provided for simultaneous time-dependent applications such as real-time multiple HDTV video streaming and a wireless data bus for cable replacement.

The standard defines three PHY modes: Single Carrier (SC), High Speed Interface (HSI) and Audio/Visual (AV). The SC PHY is divided into three classes of modulation and coding schemes targeting different wireless connectivity applications. Class 1 is specific to the low-power, low-cost mobile market, employing simple modulation techniques such as OOK while maintaining a relatively high data rate of up to 1.5 Gbps; class 2 is intermediate, for data rates up to 3 Gbps, while class 3 is dedicated to high-performance applications in excess of 5 Gbps. All of these classes can use only one carrier, so two or more of the adjacent channels shown in Figure 1.8 can be merged together. The HSI PHY is specific to low-latency, bidirectional high-speed links, and uses orthogonal frequency-division multiplexing (OFDM) and several coding schemes to maximize the spectral efficiency. The AV PHY is similar to HSI but is mainly targeted at unidirectional high-speed communication such as HDTV streaming. Table 1.2 summarizes the modulation schemes and the achievable data rates of the different PHY modes.

**Figure 1.8:** The spectrum of the IEEE 802.15.3c millimeter-wave WPAN standard.

| PHY | Class | Modulation          | Data rate [Mbps]       |
|-----|-------|---------------------|------------------------|
| SC  | 1     | OOK / BPSK / (G)MSK | 26 $\rightarrow$ 1650  |
| SC  | 2     | QPSK                | 1760 $\rightarrow$ 3300 |
| SC  | 3     | 8-PSK               | 3960                   |
| SC  | 3     | 16-QAM              | 5280                   |
| HSI |       | QPSK                | 32 $\rightarrow$ 2695  |
| HSI |       | 16-QAM              | 3080 $\rightarrow$ 5390 |
| HSI |       | 64-QAM              | 5775                   |
| AV  |       | QPSK                | 952 $\rightarrow$ 1904 |
| AV  |       | 16-QAM              | 3807                   |

**Table 1.2:** Summary of the modulation schemes and achievable data rates of different configurations of the IEEE 802.15.3c standard.

The standard also embeds a beam-forming protocol for the implementation of directive antennas. In fact, one of the most important aspects to be considered in a wireless link is the attenuation of the electromagnetic waves. Even assuming propagation through a lossless medium, the intensity of a wave radiated by a generic antenna suffers from the free space path loss (FSPL), due to the increase of the area of the wavefront traveling away from the source<sup>6</sup>:

$$FSPL = \left(\frac{4\pi d f}{c}\right)^2 \frac{1}{G_{tx}G_{rx}} \tag{1.4}$$

where d is the distance of the link, f is the operating frequency, c is the speed of light, and $G_{tx}$ and $G_{rx}$ are the antenna gains of the transmitter and the receiver. Directive antennas have a high gain in the direction of transmission, thus greatly reducing the impact of (1.4) and allowing the distance and the data rate to be increased while maintaining a reasonable transmitter power.
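To give a feeling for the magnitudes involved, (1.4) can be evaluated in decibels. The Python sketch below is illustrative only: the function name, the 10 m link distance and the 15 dB antenna gains are assumptions chosen for the example, not values from the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fspl_db(distance_m, frequency_hz, g_tx_db=0.0, g_rx_db=0.0):
    """Free space path loss of (1.4) in dB; antenna gains are given in dB."""
    isotropic_loss_db = 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT
    )
    return isotropic_loss_db - g_tx_db - g_rx_db


if __name__ == "__main__":
    # Assumed scenario: 10 m indoor link at 60 GHz.
    print(f"isotropic antennas : {fspl_db(10.0, 60e9):.1f} dB")  # about 88 dB
    print(f"15 dB gain per side: {fspl_db(10.0, 60e9, 15.0, 15.0):.1f} dB")
```

In this example each decibel of antenna gain at either end reduces the path loss by the same amount, which is the benefit that directive antennas bring to the link budget.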

The standard defines several protocols for active beam tracking and switching: the highly collimated beam demands a good alignment between TX and RX in order to ensure the robustness of the link. Scanning capabilities let the system search for the best antenna orientation before the data flow begins, and continuous tracking is performed so that the link is not lost if the position of one of the two parties changes. This protocol also contemplates non line of sight (NLOS) communication, exploiting the reflection of the beam on the nearest suitable surface.

The most efficient way to implement all of the beam-forming features defined by the standard is the phased-array architecture, which builds an electronically steerable directive antenna by combining several properly driven radiators that need not be individually directive. An extended discussion of such a solution is presented in chapter 2.

<sup>6</sup>The FSPL is also known as the Friis formula.



**Figure 1.9:** Non-Line-of-Sight (NLOS) communication in an indoor environment. The nearest suitable surface is exploited to reflect the beam in order to establish an indirect path when the direct one is obstructed.

## Chapter 2

# Overview of phased-array systems

A major issue in designing high data rate 60 GHz radios is the limited link budget, even over indoor distances. Propagation over a few meters at millimeter-waves corresponds to a path thousands of wavelengths long, which is responsible for a huge loss, and non line of sight (NLOS) operation exacerbates the problem, since the reflection of the waves on a generic surface is lossy too. Moreover, a system targeted at the consumer market must not employ expensive technologies, so a relatively high receiver noise figure and a low transmitter output power should be expected. All of these factors combine to limit the power budget and hence the feasible link distance. Thanks to the relatively small size of 60 GHz antennas, the phased-array technique is an attractive solution to create directive radiators with a high gain in the direction of transmission, thus greatly reducing the path loss (1.4). In addition, the phased-array architecture creates an electronically steerable beam that enables self-aligning systems, and it introduces a signal-to-noise ratio enhancement for the receiver and an output power increase for the transmitter, further relaxing the link budget. These features at millimeter-waves arouse great interest today, both academic and industrial, as demonstrated by plenty of research papers (many of them in CMOS) [19]–[30] and by the recent patent of SiBEAM (now part of Silicon Image) [31].

## 2.1 Basic operative principle and performances

This section presents a brief description of the physical operation of phased-arrays. Equations for the overall gain and noise figure of the receiver and for the output power of the transmitter are derived to demonstrate the performance enhancement with respect to the single element. Since the result is obtained by paralleling multiple nominally identical elements, the evident cost is an increase in area and power consumption.

#### 2.1.1 Interference of EM waves

**Figure 2.1:** Wavefronts radiated by: (a) a single isotropic radiator, (b) an array of eight isotropic radiators vertically spaced by $\lambda/2$, all driven by the same signal.

The operation of phased-arrays is based on the interference of electromagnetic waves. Several antennas placed close together can attenuate or enhance the overall transmitted field in certain directions, leading to a radiation pattern that differs from that of the single radiator. As an example, Figure 2.1 (a) shows the spherical wavefronts radiated by a single isotropic antenna; in Figure 2.1 (b), an array of eight such antennas, vertically spaced by $\lambda/2$ and all driven by the same signal, sends the wavefronts mainly in the horizontal direction, while transmitting almost nothing along the vertical axis. This particular case illustrates a general result: a phased-array can build a directive antenna out of non-directive blocks. The expression of the overall electric field radiated by M identical elements is

$$\mathbf{E}_{\text{tot}}(\mathbf{r},t) = \sum_{i=1}^{M} V_i \, \mathbf{e}(\mathbf{r} - \mathbf{x}_i) \, e^{-\alpha |\mathbf{r} - \mathbf{x}_i|} \, e^{-j[\omega t + \mathbf{k} \cdot (\mathbf{r} - \mathbf{x}_i) + \varphi_i]} \qquad \left[\frac{V}{m}\right] \qquad (2.1)$$

where $V_i$, $\mathbf{e}(\mathbf{r})$ and $\varphi_i$ are the amplitude, the normalized vector and the phase of the field transmitted by the *i*-th element, which is located at position $\mathbf{x}_i$. Looking at the last exponential term in (2.1), it is easy to see that the position of an antenna can be traded for the phase of its driving signal. This is the most powerful feature of phased-arrays: the conveyed beam can be *electrically* steered only by changing the phase of the input signal, without any mechanical movement. Continuing with the previous example, the driving phases in Figure 2.1 (b) are $\varphi_i = 0 \quad \forall i$; in Figure 2.2 the same array is driven with two different sets of phases: $\varphi_i = (i-1)\pi/2$ in case (a) and $\varphi_i = (i-1)\pi$ in case (b), showing that the beam can be steered over an angle up to 360 degrees. Lastly, due to the



Figure 2.2: A change in the phase of the driving signals turns into a steered beam: (a)  $\varphi_i = (i-1)\pi/2$ , (b)  $\varphi_i = (i-1)\pi$ 

reciprocity of antennas, the whole discussion is valid for the reception exactly as for the transmission.
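The steering behaviour described in this subsection can be checked numerically with a minimal sketch. It assumes M ideal isotropic elements on a line, uniformly spaced, with a progressive phase between adjacent elements, and it measures the angle from the broadside direction. The function name `array_factor`, its parameters and the sign convention of the phase term are assumptions made for illustration; they are not taken term by term from (2.1).

```python
import numpy as np

def array_factor(theta, m=8, d_over_lambda=0.5, dphi=0.0):
    """Magnitude of the array factor of m isotropic elements.

    theta: angles from broadside, in radians (1-D array).
    dphi:  progressive phase between adjacent elements, in radians.
    """
    i = np.arange(m)[:, None]                        # element index as a column
    psi = 2 * np.pi * d_over_lambda * np.sin(theta) - dphi
    return np.abs(np.exp(1j * i * psi).sum(axis=0))  # sum over the elements

theta = np.radians(np.linspace(-90.0, 90.0, 721))    # 0.25 degree grid

# Direction of the main lobe for uniform drive and for a pi/2 progressive phase
peak_uniform = np.degrees(theta[np.argmax(array_factor(theta, dphi=0.0))])
peak_steered = np.degrees(theta[np.argmax(array_factor(theta, dphi=np.pi / 2))])
print(peak_uniform, peak_steered)
```

Under these assumptions the main lobe sits at broadside for equal phases and moves to about 30 degrees for a progressive phase of $\pi/2$ with $\lambda/2$ spacing, since the peak satisfies $\sin\theta = \Delta\varphi/\pi$. The sign of the steering angle depends on the chosen convention.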

#### 2.1.2 Receiver operation

The fundamental structure of a phased-array receiver is depicted in Figure 2.3. Only two elements are represented to simplify the discussion, but a general result for an arbitrary number of parallel paths will be derived. The incident wavefronts are those of a plane wave, a valid approximation if the receiver is far enough from the transmitter<sup>1</sup>. The wave travels along a direction inclined at an angle $\theta$ with respect to the axis perpendicular to the array, so different wavefronts impinge on different antennas at the same time. Front 1 has traveled a distance $t = d \sin \theta$ more than front 0, where $d$ is the spacing between the antennas, thus accumulating a phase

<sup>1</sup>Given $S_{tx}$ and $S_{rx}$, the sizes of the transmitter and of the receiver, and R, the distance between them, the wavefronts can be assumed flat if $R \gg S_{tx}, S_{rx}$.



Figure 2.3: Basic architecture of a phased-array receiver.

shift $\Delta \varphi_1 = \varphi_1 - \varphi_0 = \frac{2\pi}{\lambda} d \sin \theta$. Furthermore, the attenuation between adjacent wavefronts can usually be neglected. Assuming $A_0 = 1$ and $\varphi_0 = 0$ without loss of generality, the electric signals at the output of the antennas are:

$$u_0 = \sin(\omega_{s}t)$$
$$u_1 = \sin(\omega_{s}t + \Delta\varphi_1)$$

The local oscillator is distributed through phase shifters, so that:

$$LO_i = A_{LO} \cos\left(\omega_{LO}t + \varphi_{LO} + \Delta\psi_i\right)$$

To simplify the following calculations without losing generality, we can assume here $A_{\rm LO} = 2$ and $\varphi_{\rm LO} = 0$. If the low-pass filter (LPF) is ideal, well-known trigonometric identities allow the output signals of the mixers to be simplified to:

$$v_0 = \sin[(\omega_s - \omega_{LO})t - \Delta\psi_0]$$

$$v_1 = \sin[(\omega_s - \omega_{LO})t + (\Delta\varphi_1 - \Delta\psi_1)]$$

If the phase shifters provide $\Delta \psi_1 - \Delta \psi_0 = \Delta \varphi_1$, the downconverted signals add up *coherently* and z is simply twice $v_0$. This result extends immediately to an array of M elements: setting $\Delta \psi_i = \Delta \varphi_i = i \frac{2\pi}{\lambda} d \sin \theta$ we have:

$$z = M A_0 G_0 \sin[(\omega_{\rm s} - \omega_{\rm LO})t]$$

where  $G_0$  is the conversion gain of each branch of the receiving chain. The overall conversion gain of the receiver is thus:

$$G_{\rm tot} = M G_0 \tag{2.2}$$
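The coherent addition behind (2.2) can be illustrated in the phasor domain with a minimal sketch. It assumes unit amplitude and unit conversion gain in every branch, a uniform linear array and a residual baseband phase equal to $\Delta\varphi_i - \Delta\psi_i$ in each branch; the function name `combined_amplitude` and the 20 degree arrival angle are hypothetical choices for illustration only.

```python
import numpy as np

def combined_amplitude(theta_deg, psi, d_over_lambda=0.5):
    """Amplitude of the summed baseband phasors for a plane wave from theta_deg.

    psi: LO phase shifts in radians, one per element.
    Every branch is assumed to have unit amplitude and unit conversion gain.
    """
    psi = np.asarray(psi, dtype=float)
    i = np.arange(psi.size)
    dphi = i * 2 * np.pi * d_over_lambda * np.sin(np.radians(theta_deg))
    return np.abs(np.exp(1j * (dphi - psi)).sum())

m, theta = 4, 20.0
matched = np.arange(m) * 2 * np.pi * 0.5 * np.sin(np.radians(theta))
print(combined_amplitude(theta, matched))        # LO phases matched to the arrival angle
print(combined_amplitude(theta, np.zeros(m)))    # no phase compensation
```

With matched LO phases the output amplitude equals the number of elements, while without compensation the partial cancellation between branches gives a smaller value.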

The overall noise figure also benefits. The *i*-th antenna receives an input signal $u_i$ already corrupted by some stochastic noise $\mathbf{n}_{ui}$; since all of them come from the same wave, we have:

$$\left. \begin{array}{ll} C(u_i, u_j) &= 1 \\ C(\mathbf{n}_{u_i}, \mathbf{n}_{u_j}) &= 1 \end{array} \right\} \forall \ i, j$$

where C is the correlation. The signal and noise powers  $S_u$  and  $N_u$  can be assumed equal  $\forall i$ . The  $v_i$  at the input of the adder will thus show:

$$\begin{split} S_{\rm v} &= G_0^2\,S_{\rm u} \\ N_{\rm v} &= G_0^2\,N_{\rm u} + N_i = F_0\,G_0^2\,N_{\rm u} \end{split}$$

where  $F_0$  and  $N_i$  are the noise factor and the added noise power of each branch. The noise power is the same  $\forall i$  again, so  $N_i = N_0$ , but since each receiver is a separate circuit, we can assume:

$$C(\mathbf{n}_i, \mathbf{n}_j) = 0 \qquad \forall \ i \neq j$$

Passing through the sum operation, the correlated signals and noises add in amplitude, while the uncorrelated ones add only in power, obtaining:

$$\begin{split} S_{\rm z} &= M^2 \, G_0^2 \, S_{\rm u} \\ N_{\rm z} &= M^2 \, G_0^2 \, N_{\rm u} + M \, N_0 \end{split}$$

The received signal to noise ratio is thus:

$$SNR_z = \frac{S_{\rm u}}{N_{\rm u} + \frac{N_0}{G_0^2 M}}$$

The ratio of the input SNR over the output one gives the overall noise factor of the phased-array receiver:

$$F_{\rm tot} = \frac{F_0 - 1}{M} + 1 \tag{2.3}$$

It is evident that a large number of elements tends to make the noise added from the single branches negligible, and the overall receiver can be almost noiseless.
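The trend stated by (2.3) can be tabulated with a minimal sketch that works directly in dB. The function name `array_noise_figure_db` and the 6 dB branch noise figure used in the loop are hypothetical values chosen only for illustration.

```python
import math

def array_noise_figure_db(nf_branch_db, m):
    """Noise figure in dB of an m-element array, from (2.3)."""
    f0 = 10 ** (nf_branch_db / 10)      # branch noise factor, linear
    f_tot = (f0 - 1) / m + 1            # equation (2.3)
    return 10 * math.log10(f_tot)

for m in (1, 2, 4, 8, 16):
    print(m, round(array_noise_figure_db(6.0, m), 2))
```

For a single element the function returns the branch noise figure, and the result decreases monotonically toward 0 dB as the number of elements grows.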

#### 2.1.3 Transmitter operation

The structure of the transmitter is dual to that of the receiver, and is shown in Figure 2.4.

Figure 2.4: Basic architecture of a phased-array transmitter.

The same source information u is copied to the input of each up-converting mixer:

$$u = \sin(\omega_{s}t)$$

The resulting signals $v_i$ that drive the antennas are:

$$v_i = \sin[(\omega_s + \omega_{LO})t + \Delta \psi_i]$$

and the transmitted field is analogous to the one shown in Figure 2.2 (a). In the same figure it is also possible to note that the wavefronts tend to become flat beyond a certain distance<sup>2</sup> from the transmitter. In the region where the interference between the individual waves is constructive, assuming $V_i = V_0 \ \forall i$ without loss of generality, the amplitude of the total field (2.1) is maximum and approximately equal to

$$|\mathbf{E}_{tot}| = V_{tot} = M V_0 \qquad [V]$$

The magnetic field can be derived from the electrical one as:

$$\mathbf{H} = \frac{\mathbf{u} \times \mathbf{E}}{\eta} \qquad \left[\frac{A}{m}\right] \tag{2.4}$$

where  $\mathbf{u}$  is the unit vector of the propagation and  $\eta$  is the wave impedance. The power density carried by the EM wave is defined by the Poynting Vector:

$$\mathbf{S} = \frac{\mathbf{E} \times \mathbf{H}^*}{2} \qquad \left[\frac{W}{m^2}\right] \tag{2.5}$$

<sup>2</sup>See footnote 1.

From (2.4),  $|\mathbf{H}_{tot}| \propto |\mathbf{E}_{tot}|$ , and combining this with (2.5) the total power transmitted from a phased-array of M elements is

$$P_{\rm tot} = M^2 P_0 \qquad [W] \tag{2.6}$$

where $P_0$ is the output power of a single branch.
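Equation (2.6) can be restated in logarithmic units with a minimal sketch. The function name `array_tx_power_dbm` and the 0 dBm branch power used in the loop are hypothetical and serve only as an illustration.

```python
import math

def array_tx_power_dbm(p0_dbm, m):
    """Total power in dBm of m coherently combined branches, from (2.6)."""
    return p0_dbm + 20 * math.log10(m)      # 10*log10(M^2) added to P_0

for m in (2, 4, 8, 16):
    print(m, round(array_tx_power_dbm(0.0, m), 1))
```

Each doubling of the number of elements adds about 6 dB in the direction of constructive interference, compared with about 3 dB for a simple incoherent sum of the branch powers.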

## 2.2 Receiver architecture

The phased-array receiver performs coherent detection, so the basic structure described in subsection 2.1.2 can be implemented with either a homodyne or a heterodyne architecture. Each of these solutions can in turn be realized in different ways, and their advantages and drawbacks in view of a phased-array implementation are compared in this section.

The block diagrams of two equivalent variants of a homodyne phased-array receiver are depicted in Figure 2.5. The RF input signal is down-converted to baseband in a single step, thus minimizing the overall complexity of the system. The bandpass filter removes the out-of-band interferers, while the low-pass filters perform channel filtering at baseband. In (a), the phase shifting is provided in series with the RF path. A typical phase shifter that can be employed for this purpose is shown in Figure 2.6 [25]. It is formed by two parallel paths, one of which includes a delay cell implementing a fixed phase shift of $90^{\circ}$. This solution is simple to implement, but has several drawbacks: the delay cell is usually realized by means of a series transmission line, so it occupies a large area and is unsuitable for phased-arrays with a large number of elements; moreover, its loss in series with the signal path reduces the gain and raises the noise figure of the receiver. The second possibility, depicted in Figure 2.5 (b), avoids these problems because the phase shift is applied to the local oscillator; however, each branch then needs a quadrature mixer and two phase shifters, which critically raises the area occupancy and the power consumption of the system. In addition, regardless of how the phase shifting is provided, direct conversion has several well-known issues: DC offsets, flicker noise, self-mixing of the LO leakage and of strong interferers, and a high LO frequency (equal to the RF carrier).

The heterodyne architecture avoids all the issues listed for direct conversion, while introducing the problem of image rejection. However, this drawback can be tolerated because the heterodyne architecture allows a critical reduction of the component count of the whole system: providing the phase shifting in series with the LO path, as shown in Figure 2.7, requires only a single-phase mixer driven by one phase shifter in each branch, thus greatly reducing the area occupancy and the power consumption with respect to the



**Figure 2.5:** Two equivalent declinations of a homodyne phased-array receiver: (a) phase shifting of the RF path; (b) phase shifting of the LO path.



Figure 2.6: Phase shifter for the RF path as proposed in [25].


scheme of Figure 2.5 (b). A phase shifter architecture well suited to LO path shifting is described in Appendix A. Such a phase rotator needs a resonant load, but its area is typically smaller than that of the $\lambda/4$ line<sup>3</sup> used by the solution depicted in Figure 2.6; the heterodyne architecture therefore allows a smaller area and power consumption even with respect to the direct-conversion implementation of Figure 2.5 (a).
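The length of the $\lambda/4$ line mentioned in footnote 3 can be reproduced with a minimal sketch, assuming the equivalent refractive index of 2.4 quoted there. The function name `quarter_wave_um` is a hypothetical helper introduced only for this check.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def quarter_wave_um(freq_hz, n_eff):
    """Quarter wavelength in micrometres for an effective refractive index n_eff."""
    return C0 / (freq_hz * n_eff) / 4 * 1e6

print(round(quarter_wave_um(60e9, 2.4)))
```

The result is slightly above 500 µm, in line with the rounded figure given in the footnote and well above the roughly 200 µm footprint quoted for a millimeter-wave inductor.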



Figure 2.7: Sliding-IF (heterodyne) receiver.

Specifically, the system depicted in Figure 2.7 implements a sliding-IF architecture, a particular case of heterodyne architecture where the LO frequency is set to 2/3 of the RF frequency, so that the IF falls at 1/2 of the LO frequency. The half-LO waveform is always available in the PLL that locks the VCO, often even in quadrature (depending on the structure of the first frequency divider), so the heterodyne architecture can be implemented with only one PLL.
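The frequency plan of the sliding-IF scheme can be written down as a minimal sketch. The function name `sliding_if_plan` is hypothetical, and the image frequency is computed with the standard low-side-injection relation (LO minus IF), which is an assumption not spelled out in the text.

```python
from fractions import Fraction

def sliding_if_plan(f_rf_ghz):
    """LO, IF and image frequencies in GHz for LO = 2/3 RF (low-side injection)."""
    f_rf = Fraction(f_rf_ghz)
    f_lo = f_rf * Fraction(2, 3)
    f_if = f_rf - f_lo              # equals f_lo / 2
    f_image = f_lo - f_if           # image band on the other side of the LO
    return f_lo, f_if, f_image

f_lo, f_if, f_image = sliding_if_plan(60)
print(float(f_lo), float(f_if), float(f_image))
```

For a 60 GHz carrier this gives a 40 GHz LO and a 20 GHz IF, with the image band also falling at 20 GHz under the stated assumption.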

## 2.3 Overview of the state of the art

Several phased-array solutions have already been proposed in the literature, and a brief review of four representative solutions is presented in this section.

<sup>3</sup>Assuming an equivalent refractive index of 2.4 for the silicon dioxide, $\lambda/4$ is equal to about 500 $\mu$m at 60 GHz, while typical millimeter-wave inductors have dimensions within about 200 $\mu$m.

An 8-element 24 GHz receiver in 0.18 $\mu$m HBT SiGe BiCMOS is presented in [7], and the system architecture is reported in Figure 2.8. It adopts a sliding-IF down-conversion with 19.2 GHz LO and 4.8 GHz IF. The shifting is performed on the LO path through the digital selection of the available phases of the 16-phase VCO. The RF paths are formed by the cascade of two-stage single-ended LNAs and Gilbert cells with one input connected to AC ground. All the drains of the RF mixers are connected to the same resonant load to perform passive signal combining at IF through the sum of the down-converted currents. No antennas are described in this work. This solution achieves a gain of 43 dB and a NF of 7.4 dB for each signal path, while the overall performance of the phased-array receiver is a total gain of 61 dB with an SNR improvement of 9 dB with respect to the single branch. Total power consumption is 910 mW and chip area is $11.55 \,\mathrm{mm}^2$.

Figure 2.8: 8-element phased-array receiver presented in [7].

A 4-element 77 GHz receiver in 0.13 $\mu$m SiGe BiCMOS is presented in [21], and the system architecture is reported in Figure 2.9. A sliding-IF architecture with 52 GHz LO and 26 GHz IF is adopted, with local phase shifting on the LO path. The VCO is single-phase, and the quadrature signals required by the phase interpolators are generated inside them by means of $\lambda/4$ T-lines, thus occupying a very large area. Differential dipole antennas are embedded on-chip and directly connected to the cascade of a two-stage cascoded fully-differential LNA and a Gilbert cell. The VGAs are implemented by means of a single-stage pseudo-differential cascode with programmable bias voltage. The IF signal combining is active and implemented by means of pseudo-differential cascoded transconductors. This solution achieves a gain of 37 dB and a NF of 8 dB for each signal path, while the overall performance of the phased-array receiver is a total gain of 49 dB with a NF of 8 dB. Total power consumption is 1.05 W and chip area is about $12.9 \,\mathrm{mm}^2$.



**Figure 2.9:** 4-element phased-array receiver with on-chip antennas presented in [21].

A 16-element 60 GHz receiver in $0.12 \,\mu$m SiGe BiCMOS is presented in [28], and the system architecture is reported in Figure 2.10. This solution also adopts a sliding-IF architecture, with 17 GHz $\times$ 3 LO and 8.5 GHz IF, but the phase shifting is provided in series with the RF path. Aperture-coupled patch antennas are packaged together with the RX IC. Each branch is formed by a three-stage single-ended LNA, a reflection-type phase shifter (RTPS), a balun and a fully-differential VGA. The RTPS is based on a Lange coupler, and the balun is formed by two Lange couplers connected together, so a large silicon area is occupied. Signal combining is performed in several steps by means of passive Gysel couplers, which also implement the routing between the individual receivers, and active current combiners. This solution achieves a programmable gain from 10 to 58 dB and a NF of 6.8 dB for each RF path, with a total power consumption of 1.8 W and a chip area of $37.7 \text{ mm}^2$.

**Figure 2.10:** 16-element phased-array receiver with in-package antennas presented in [28].

A 4-element 60 GHz direct-conversion receiver in 65 nm CMOS is presented in [30], and the system architecture is reported in Figure 2.11. Each branch is formed by a three-stage single-ended LNA, I&Q mixers and baseband phase shifting implemented by a weighted sum of the I and Q down-converted signal paths. Signal combining is performed by current-mode summing, and the result is fed into the baseband buffers. This solution achieves a gain of 24 dB and a NF of 6.8 dB for each signal path, with a total power consumption of 137 mW and a chip area of $4.38 \text{ mm}^2$.

Figure 2.11: 4-element phased-array receiver presented in [30].

All the described solutions employ multi-stage LNAs. Since each stage has at least one inductor or T-line and a bias current, the replication of the amplifiers in each signal branch leads to significant area occupancy and power consumption. Moreover, the local generation of the quadrature signals in each branch, starting from a single-phase VCO by means of 90° phase shifters, leads to further area occupancy. In the next chapters a power- and area-efficient single-stage merged LNA and mixer, and a quadrature VCO targeted at the heterodyne architecture, will be proposed to overcome the listed problems. The QVCO has been realized at 60 GHz to demonstrate the validity of its structure for both the homodyne and the heterodyne architecture.
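The array-level gains quoted in this section for [7] and [21] can be cross-checked against (2.2) with a minimal sketch. The function name `array_gain_db` is hypothetical; the per-path gains, element counts and reported totals are the ones given in the descriptions of the two receivers.

```python
import math

def array_gain_db(branch_gain_db, m):
    """Array conversion gain in dB from (2.2): G_tot = M * G_0 (voltage gain)."""
    return branch_gain_db + 20 * math.log10(m)

# (per-path gain in dB, number of elements, reported array gain in dB)
reported = {"[7]": (43, 8, 61), "[21]": (37, 4, 49)}
for ref, (g0, m, g_tot) in reported.items():
    print(ref, round(array_gain_db(g0, m), 1), "dB expected,", g_tot, "dB reported")
```

Both reported totals agree with the $20\log_{10}M$ increase predicted by (2.2) to within rounding.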

## Chapter 3

## An area and power efficient merged LNA and mixer

The most significant parameter of an active device for high-frequency operation is the maximum frequency of oscillation $f_{max}$ [10], i.e. the frequency where the intrinsic power gain of the device drops to 0 dB. The 65 nm technology node exhibits an $f_{max}$ of about 150 GHz, as shown in Figure 1.3, so the available gain around 60 GHz is rather poor. Several reported CMOS LNAs employ multiple stages in order to minimize the noise impact of the mixer and of the following stages [32]-[35]. This approach has several drawbacks: the power consumption grows almost linearly with the number of stages; each stage requires at least one passive component, and inductors and T-lines occupy a huge area compared to transistors; accurate EM modeling of ground paths and wires is required for long interconnections; and several cascaded resonant stages lead to an overall narrow bandwidth. In view of a phased-array implementation, equations (2.2) and (2.3) show that the gain and noise figure of each individual receiver are not critical, while their area and power consumption are key aspects, as shown in Figure 2.7. In this chapter a single processing stage with LNA, mixer and image-rejection filter merged together is presented, showing noise performance comparable to stand-alone LNAs, while power dissipation, silicon area and signal bandwidth are much improved compared to state-of-the-art CMOS LNAs in the same band.

## 3.1 Circuit description

The complete low-noise down-converter is shown in Figure 3.1, and it has been realized in 65 nm CMOS. It is tailored to the heterodyne architecture depicted in Figure 2.7, with the input RF signal at 60 GHz, the LO at 40 GHz and the down-converted IF at 20 GHz. Transistor $M_1$, $L_G$ and $C_{pad}$ realize the matching network, transistors $M_2$ and $M'_2$ form the mixer, inductor $L_{\rm L}$ loads the circuit and also implements the IF filtering, $L_{\rm F}$ and $C_{\rm F}$ realize the image-rejection filter and the pmos $M_3$ is a current source used to reduce both the switching time and the noise of the mixer. All of these functions are described in detail in the following sections, which are kept separate for the sake of clarity even though the design flow consists of several iterative steps where all the components must be taken into account at the same time: they are all connected to each other through electric ($C_{\rm GD}$ and $g_{\rm DS}$ above all) and magnetic (the mutual inductance between nearby spirals) parasitic couplings.

**Figure 3.1:** Schematic of the merged LNA and mixer. Intrinsic components in light grey.

#### 3.1.1 Transconductor

Three alternative classical topologies can be adopted to implement a low-noise input stage: 1) inductive degeneration (LD), 2) common-gate (CG), 3) common-source (CS). LD has the lowest noise figure together with a good transconductance gain thanks to the series resonance of the input network, but requires two inductors and thus occupies a large area; CG has the widest input-matching bandwidth and in principle does not require inductors, thus saving area, but it needs a current source that reduces the voltage headroom of the circuit; CS together with the input matching network has a gain comparable to that of LD but with the use of one inductor only<sup>1</sup>. This receiver is tailored to a phased-array system, so the minimization of the area takes priority over gain and noise figure, as shown by (2.2) and (2.3); since voltage headroom is also scarce in nm-scaled CMOS technologies, the common-source topology has been chosen as the optimum compromise.

The well-known trade-off in the design of high-frequency amplifiers is shown in Figure 3.2: $f_{\rm t}$ is proportional to the square root of the current density, while the ratio $g_{\rm m}/g_{\rm DS}$ is inversely proportional to the same quantity. Although this trend is based on the quadratic model of the mosfet, which is well known to be quite inaccurate for today's short-channel devices, it constitutes a good starting point for the design flow. The final choice has been made based on reference [36], which demonstrates that nanometer-scaled nmos devices show an optimum current density between 200 and $300 \,\mu\text{A}/\mu\text{m}$ for maximum $f_{\rm t}$ and $f_{\rm max}$, almost *independent* of the technology node. On this basis, a current density of $250 \,\mu\text{A}/\mu\text{m}$ has been set for $M_1$.

**Figure 3.2:** Indicative trade-off between gain and $f_t$ versus the current density in mos transistors.

The noise contributed by the transistor must be taken into account too. The main noise source of the device is the one related to the drain current<sup>2</sup>. Its equivalent noise voltage reported in series to the gate has a spectral density

<sup>1</sup>The impedance transformation performed by the matching network provides a substantial voltage gain. This aspect is described in detail at the end of subsection 3.1.2, together with the gain comparison to the LD.

<sup>2</sup>The induced-gate noise has been neglected since the transistor models provided with the design kit do not include it.

equal to:

$$\frac{\partial \overline{e_{n}^{2}}}{\partial f} = \frac{4KT\gamma}{g_{m}} \qquad \left[\frac{\mathsf{V}^{2}}{\mathsf{Hz}}\right]$$

and this is *independent* of $g_{DS}$ and of the load impedance seen at the drain. Since $g_{\rm m} \propto \sqrt{WI} = W\sqrt{I/W}$ in the quadratic model, this noise scales as:

$$\frac{\partial \overline{e_{n}^{2}}}{\partial f} \propto \frac{1}{W\sqrt{\frac{I}{W}}}$$

Since the current density is set, the noise is inversely proportional to the width of $M_1$, i.e. to the power consumption. Another trade-off emerges because the input capacitance is directly proportional to the width of $M_1$:

$$C_{\rm GS} = \frac{2}{3} C_{\rm ox} W L$$

Concerning the gain, both $g_{\rm m}$ and $g_{\rm DS}$ are directly proportional to the drain current at a fixed current density. It is wise to increase the width of $M_1$ only as long as its $g_{\rm DS}$ remains negligible with respect to the input admittance of the switching pair, otherwise a significant part of the signal will be wasted. Based on computer simulations, a width of $20\,\mu$m over the minimum length of 60 nm has been chosen for $M_1$ as the optimum compromise between gain, noise, input capacitance and power consumption, leading to a total dissipation of 5 mW from a 1 V supply voltage. The finger width is $1\,\mu$m, short enough to make the degradation of $f_{\rm max}$ due to the series gate resistance negligible.
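The bias figures of the transconductor follow directly from the chosen current density and width, as the following minimal sketch shows. The function name `transconductor_bias` and its argument names are hypothetical; the numerical inputs are the ones quoted in this subsection.

```python
def transconductor_bias(width_um, density_ua_per_um, vdd_v, finger_um):
    """Bias current (mA), dissipated power (mW) and number of fingers of a device."""
    current_ma = width_um * density_ua_per_um / 1000.0
    power_mw = current_ma * vdd_v
    fingers = round(width_um / finger_um)
    return current_ma, power_mw, fingers

print(transconductor_bias(20, 250, 1.0, 1.0))
```

A 20 µm device biased at 250 µA/µm draws 5 mA, which corresponds to the 5 mW dissipation quoted for a 1 V supply, and is laid out as 20 fingers of 1 µm each.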

#### 3.1.2 Input matching network

Regarding the input impedance only, the small signal model of transistor  $M_1$  can be approximated around  $f_0$  by a simple parallel RC circuit, as shown in Figure 3.3. Because of the feedback operated by  $C_{\rm GD}$ ,  $R_{\rm G,eq}$  depends not only on  $R_{\rm G}$  but also on  $C_{\rm GD}$ ,  $g_{\rm m}$  and  $Z_{\rm L}$  in a quite complex way; the same for  $C_{\rm G,eq}$ .  $R_{\rm G,eq}$  and  $C_{\rm G,eq}$  are shown in light grey in Figure 3.1, and their resulting values in this design are:

$$R_{G,eq} = 450 \ \Omega$$

$$C_{G,eq} = 35 \ \text{fF}$$

These parameters evaluated at 60 GHz are represented on the Smith chart in Figure 3.5. A single-stub matching network can be used to match the input, because it is able to transform any load impedance into the $Z_0$ of the employed transmission lines. There are two disadvantages in this case: the wavelength in silicon dioxide is about 2 mm at 60 GHz, so the lines usually occupy quite a large area, and the unavoidable capacitance of the RF pad modifies the architecture of the network, so that it is no longer general-purpose. For this particular case, a single-stub matching network can be emulated by a series inductor together with the parasitic capacitance of the pad, as shown in Figure 3.4.

Figure 3.3: Equivalent input impedance of a common-source.

**Figure 3.4:** The single-stub matching network under particular conditions can be emulated by a series inductor and a parallel capacitor.

Starting from $Z_{G,eq}$, a series inductance makes the impedance move along a circular segment toward the open-circuit point. The radius of the circle is reduced when the quality factor of the inductor is lowered, as shown in Figure 3.5. If the path crosses the constant-conductance circle G = 20 mS, a parallel capacitance can bring the impedance to the center of the Smith chart along that circle<sup>3</sup>. The movement is clockwise exactly as for the series inductance, so the intersection must be in the upper half of the chart. Generalizing, this architecture based on a series inductance and a parallel capacitance is able to match any point within the area highlighted in Figure 3.6.

In the design of the spiral, the smallest area has been preferred to the highest Q because the input RF current passes through the network shown in Figure 3.7, formed by the series connection of $L_{\rm G}$, $C_{\rm GS}$ and the parasitic inductance of the ground path.

<sup>3</sup>The measured Q of the RF pad is more than 50 at 60 GHz, which can be approximated as infinite, since the losses are dominated by the Q of the integrated inductor, which is about 10.



**Figure 3.5:** Impedance transformation operated by a series inductance with different values of Q at 60 GHz. The value of the inductance increases along the direction of the arrows.

Such inductance degenerates the transconductor, thus modifying the architecture of the matching network. The area occupied by $L_{\rm G}$ has been minimized by the use of narrow metal strips ($W = 4 \,\mu$m). In this way the core of the circuit has been laid out very close to the ground pad, as shown in Figure 3.15, and $L_{\rm gnd}$ is negligible. The problem of the ground path exists only for the input transconductor, which is in a single-ended configuration, since the signal is converted to differential form immediately after the switching pair.

### Voltage gain

The matching network, whose equivalent circuit is represented in Figure 3.8, performs an impedance transformation. Because of the conservation of the energy, the voltage is also transformed across the adapter. Assuming a perfect matching at the input port, the source delivers its maximum available power:

$$P_{\rm D}=\frac{V_{\rm s}^2}{8\,R_{\rm s}}=\frac{V_{\rm in}^2}{2\,R_{\rm s}}$$




**Figure 3.6:** Locus of the matchable impedances by the use of a series inductance and a parallel capacitance starting from the load.



Figure 3.7: RF input current path. Pad capacitance is omitted.



**Figure 3.8:** Equivalent circuit for the evaluation of the voltage gain of the matching network.

The signal propagates through the adapter, and the resulting incident power on the  $R_{\rm G,eq}$  can be expressed as:

$$P_{\rm G} = \frac{V_{\rm G}^2}{2\,R_{\rm G,eq}}$$

Since the adapter is lossy, it can be described as  $|S_{21}| = \alpha \in [0, 1]$ , thus the relationship between the output and the input power is  $P_{\rm G} = \alpha^2 P_{\rm D}$  and so the voltage gain between  $V_{\rm in}$  and  $V_{\rm G}$  is:

$$G_{\text{match}}(f_0) = \alpha \sqrt{\frac{R_{\text{G,eq}}}{R_{\text{s}}}} \tag{3.1}$$

In this design the attenuation of the matching network is about 3.5 dB, almost completely due to the series resistance of  $L_{\rm G}$ . This value can be equivalently expressed as  $\alpha = 0.67$ , thus the voltage gain provided by the adapter (at 60 GHz) is equal to

$$G_{\text{match}}(f_0) = 6 \text{ dB}$$

This justifies what has been said at the beginning of subsection 3.1.1: CS together with the input matching network has a gain comparable to that of LD but with the use of one inductor only. This is summarized in Figure 3.9: the overall transconductance gain of LD is $Q_{\rm in}$ times the $g_{\rm m}$ of the transistor, where $Q_{\rm in}$ is the quality factor of the resonance of the input network. Considering $f_0 = 60$ GHz and $R_{\rm s} = 50 \Omega$, and assuming $C_{\rm GS} = 35$ fF, $Q_{\rm in}$ turns out to be 3.6 dB, so in this realization the gain of the CS with input matching is even higher, thus further justifying the choice.
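The two gain figures compared in this subsection can be reproduced with a minimal sketch. The first function evaluates (3.1) with the quoted 3.5 dB loss, 450 Ω equivalent gate resistance and 50 Ω source. The second uses $Q_{\rm in} = 1/(\omega_0 R_{\rm s} C_{\rm GS})$, which is an assumption: the expression is not stated in the text, but it reproduces the quoted 3.6 dB. Both function names are hypothetical.

```python
import math

def match_voltage_gain_db(loss_db, r_load, r_source):
    """Voltage gain in dB of a lossy matching network, from (3.1)."""
    alpha = 10 ** (-loss_db / 20)                 # |S21| of the network
    return 20 * math.log10(alpha * math.sqrt(r_load / r_source))

def ld_input_q_db(f0_hz, r_source, c_gs):
    """Assumed input-network quality factor Q_in = 1 / (w0 * Rs * Cgs), in dB."""
    q_in = 1 / (2 * math.pi * f0_hz * r_source * c_gs)
    return 20 * math.log10(q_in)

g_cs = match_voltage_gain_db(3.5, 450, 50)
g_ld = ld_input_q_db(60e9, 50, 35e-15)
print(round(g_cs, 1), round(g_ld, 1))
```

Under these assumptions the matching network of the CS stage provides about 6 dB of voltage gain, compared with about 3.6 dB for the input resonance of the LD stage.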



**Figure 3.9:** Overall transconductance gain of inductive degenerated (LD) and common source stage with input matching network (CS).

In this design, starting from a pad capacitance of 70 fF and the equivalent input impedance of the common source reported at the beginning of this subsection, $L_{\rm G}$ turns out to be 330 pH.


#### 3.1.3 Switching pair

The current-voltage characteristic of the differential pair is shown in Figure 3.10.

Figure 3.10: Current-voltage characteristic of a differential pair.

Good operation of the mixer requires hard switching, so the threshold of the commutation should be minimized. $V_0$ is directly proportional to the overdrive voltage of the transistors, and the latter is proportional to the square root of the current density:

$$V_0 \propto V_{\rm ov} \propto \sqrt{\frac{I}{W}} \tag{3.2}$$

When the pair is completely switched, the active transistor works as a current buffer; the input impedance seen looking into the source is equal to:

$$Z_{\rm in} = \frac{1+g_{\rm DS}Z_{\rm L}}{g_{\rm m}+g_{\rm DS}}$$

where  $Z_{\rm L}$  is the load impedance connected to the drain.  $Z_{\rm in}$  can be minimized by increasing  $g_{\rm m}$  and reducing  $g_{\rm DS}$ :

$$g_{\rm m} \propto \sqrt{WI} \qquad g_{\rm DS} \propto I \tag{3.3}$$
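The proportionalities (3.2) and (3.3) can be explored with a minimal sketch that returns the relative change of each quantity for a given change in bias current and width. It assumes the square-law model behind those expressions; the function name `square_law_scaling` and the example ratios are hypothetical.

```python
import math

def square_law_scaling(i_ratio, w_ratio):
    """Relative change of V_ov, g_m and g_DS under the square-law model.

    i_ratio, w_ratio: new/old bias current and new/old device width.
    """
    return {
        "v_ov": math.sqrt(i_ratio / w_ratio),    # from (3.2)
        "g_m": math.sqrt(w_ratio * i_ratio),     # from (3.3)
        "g_ds": i_ratio,                         # from (3.3)
    }

# Illustrative case: bias current reduced to one fifth at constant width
print(square_law_scaling(i_ratio=0.2, w_ratio=1.0))
```

In this illustrative case the overdrive voltage and the transconductance both scale by about 0.45, while the output conductance scales by 0.2.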

The noise contribution of the switching pair is directly proportional to the bias current I [37], and this together with equations (3.2) and (3.3) clearly shows that the best design strategy for the switching pair is to increase the width of the transistors while minimizing the bias current. The latter can be made independent of that of the transconductor by means of an additional current source, as shown in Figure 3.11, while the size of the devices cannot be increased too much because of the parasitic capacitance, which is directly proportional to the area of the device. The pmos transistor $M_3$ in Figure 3.1 works as current source: a non-minimum channel length has been selected for it in order to minimize its $g_{DS}$ and its noise contribution.

Figure 3.11: Differential pair with reduced bias current.

The reduction of the bias current in the pair reduces the amplitude of the cyclostationary voltage swing at the output (the LO feedthrough), and this in combination with the lowering of $V_{\rm ov}$ allows the LO amplitude to be increased, thus hardening the switching and further reducing the noise contribution of the pair. Transistors $M_2$ have $W = 20 \,\mu$m over minimum length, while $M_3$ has $W = 40 \,\mu$m, L = 120 nm and draws 4 mA, thus leaving 1 mA for the biasing of the switching pair. The load inductance of the mixer is equal to 770 pH, to resonate with the drain capacitance of the switching pair and the gate capacitance of the output buffer at the chosen IF frequency of 20 GHz.
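The total capacitance implied by the 770 pH load at 20 GHz, and the current left for the switching pair, can be obtained with a minimal sketch. The function name `resonant_capacitance_ff` is hypothetical, and the resulting capacitance is a value derived here from the ideal LC resonance, not a figure reported in the text.

```python
import math

def resonant_capacitance_ff(l_henry, f_hz):
    """Capacitance in fF that resonates with l_henry at f_hz (ideal LC tank)."""
    return 1 / ((2 * math.pi * f_hz) ** 2 * l_henry) * 1e15

c_load = resonant_capacitance_ff(770e-12, 20e9)
i_pair_ma = 5.0 - 4.0     # transconductor current minus the current drawn by M3
print(round(c_load, 1), i_pair_ma)
```

The ideal resonance corresponds to a total capacitance of roughly 82 fF at the output node, shared between the drains of the switching pair and the gate of the output buffer.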

#### 3.1.4 Image rejection filter

The input matching network is narrowband, so it provides some filtering of the interferers in the image band. On the other hand, the connection between the transconductor and the switching pair is broadband, so all the image noise of the former is down-converted to IF, greatly increasing the overall noise figure of the receiver. The passive components  $L_{\rm F}$  and  $C_{\rm F}$ , together with the intrinsic parasitic capacitance  $C_{\rm p}$ , realize the image rejection filter. The resonant network is shown in detail in Figure 3.12. The main resistive components that set the overall quality factor are the series resistance of inductor  $L_{\rm F}$  and the  $g_{\rm DS}$  of transistor  $M_3$ , while the contribution of the capacitance losses is negligible. The  $g_{\rm DS}$  of  $M_1$  and the input impedance of the switching pair are not shown since they can be considered part of the active mixer, the latter with the image rejection filter as a separate component connected in parallel. At first glance, this network shows two singularities: a short due to the series resonance of  $L_{\rm F}$  with  $C_{\rm p}C_{\rm F}/(C_{\rm p} + C_{\rm F})$ , where the former occurs at a lower frequency with respect to the latter. In this design, the



**Figure 3.12:** Architecture of image rejection filter. The quality factor of the capacitances is neglected.

following approximations can be made:

$$g_{\rm DS3} \ll s(C_{\rm F}+C_{\rm p})$$
 
$$g_{\rm DS3}R_{\rm LF} \ll 1$$

where the first one has to be verified at least at the frequency of the pole. Under these conditions, the total impedance of the filter can be simplified as:

$$Z_{\rm F} = \frac{1}{s(C_{\rm F} + C_{\rm p})} \frac{s^2 C_{\rm F} L_{\rm F} + s(g_{\rm DS3} L_{\rm F} + R_{\rm LF} C_{\rm F}) + 1}{s^2 \frac{C_{\rm F} C_{\rm p}}{C_{\rm F} + C_{\rm p}} L_{\rm F} + s \frac{C_{\rm p}}{C_{\rm F} + C_{\rm p}} (g_{\rm DS3} L_{\rm F} + R_{\rm LF} C_{\rm F}) + 1} \tag{3.4}$$

The series resonance is at $\omega_z = 1/\sqrt{L_F C_F}$, while the parallel resonance occurs at $\omega_p = 1/\sqrt{L_F C_s}$, where $C_s = C_p C_F/(C_p + C_F)$, with $g_{\text{DS3}}$ and $R_{\text{LF}}$ reducing the Q of both resonances. Once the value of the parasitic capacitance $C_p$ and the center frequencies of the signal and the image are known, it is always possible to find a pair of $L_F$ and $C_F$ that solves the following system:

$$\begin{cases} C_{\rm F}L_{\rm F} = \dfrac{1}{\omega_{\rm IM}^2} \\ \dfrac{C_{\rm F}C_{\rm p}}{C_{\rm F}+C_{\rm p}}L_{\rm F} = \dfrac{1}{\omega_{\rm RF}^2} \end{cases} \tag{3.5}$$

Alongside the rejection of the image, the filter maximizes the gain of the RF signal, since the parallel resonance cancels out the parasitic capacitance $C_{\rm p}$. In this design the inductance and capacitance of the filter have been set to 225 pH and 290 fF respectively.
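System (3.5) can be solved in closed form: dividing the first equation by the second gives $(C_{\rm F} + C_{\rm p})/C_{\rm p} = (\omega_{\rm RF}/\omega_{\rm IM})^2$, which fixes $C_{\rm F}$, and $L_{\rm F}$ then follows from the first equation. The Python sketch below illustrates this. The value of $C_{\rm p}$ is not stated in the text: the 36 fF used here is an assumption, chosen only because it returns components close to the 225 pH and 290 fF quoted above for a 20 GHz image and a 60 GHz signal.

```python
import math


def image_filter_components(c_p, f_image_hz, f_rf_hz):
    """Solve system (3.5) for (L_F, C_F), given C_p and the two target frequencies."""
    w_im = 2.0 * math.pi * f_image_hz
    w_rf = 2.0 * math.pi * f_rf_hz
    ratio = (w_rf / w_im) ** 2      # equals (C_F + C_p) / C_p
    c_f = c_p * (ratio - 1.0)
    l_f = 1.0 / (w_im ** 2 * c_f)   # first equation of (3.5)
    return l_f, c_f


C_P = 36e-15  # ASSUMED parasitic capacitance [F], not given in the text

l_f, c_f = image_filter_components(C_P, 20e9, 60e9)
c_s = c_f * C_P / (c_f + C_P)  # series combination seen by the parallel resonance

f_series = 1.0 / (2.0 * math.pi * math.sqrt(l_f * c_f))    # notch (image)
f_parallel = 1.0 / (2.0 * math.pi * math.sqrt(l_f * c_s))  # peak (signal)

print(f"L_F = {l_f * 1e12:.0f} pH, C_F = {c_f * 1e15:.0f} fF")
print(f"notch at {f_series / 1e9:.1f} GHz, peak at {f_parallel / 1e9:.1f} GHz")
```

With these assumptions the solution is about 220 pH and 288 fF, consistent with the implemented values within the uncertainty on $C_{\rm p}$.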

## 3.1.5 Noise analysis

A cascode gain stage is formed by the cascade of a transconductor and a load impedance, with a current buffer in between. An active mixer has exactly the same structure: the tail transistor is the transconductor, while the actual mixer is the switching pair only. It is named *active* because in periodic steady state the active transistor works in the saturation region, behaving like a switched current buffer<sup>4</sup>. The active mixer is thus a "switching cascode", as shown in Figure 3.13,



Figure 3.13: Basic structure of cascode gain stage and active mixer.

and it has been designed to be low-noise similarly to the simple cascode [37]. The transconductor, together with its matching network, works as the sole LNA, providing enough gain to ensure good operation of the entire front-end while occupying a small area and consuming little power [38]. The noise performance of the proposed circuit has thus been compared against the LNA shown in Figure 3.14, characterized by the same transistor sizes, while the load is designed to resonate at 60 GHz. In this way the noise increase due to the switching operation and the impact of the image rejection filter have been characterized. Table 3.1 shows the simulated noise figure and the noise contribution of the single components, with the impact of the image rejection filter highlighted. Without filtering, the merged LNA/mixer would lead to a degradation of more than 2 dB against the cascode, primarily due to the conversion of the noise of the source resistance and $M_1$ at the image frequency, that is folded

<sup>&</sup>lt;sup>4</sup>The same for passive mixers, with the exception that the active transistor works in deep triode region.



**Figure 3.14:** Standard cascode used for comparison. The size of the transistors and the matching network are the same as in the proposed circuit, while the load resonates at 60 GHz.

|                   |          | LNA   | Merged LNA/mixer, no image rejection | Merged LNA/mixer, image rejection |
|-------------------|----------|-------|--------------------------------------|-----------------------------------|
| Source resistance | [nV²/Hz] | 0.2   | 0.329                                | 0.2                               |
|                   | %        | 29.8  | 17.2                                 | 25.3                              |
| Matching losses   | [nV²/Hz] | 0.106 | 0.11                                 | 0.0961                            |
|                   | %        | 15.8  | 10.1                                 | 12.1                              |
| $M_1$             | [nV²/Hz] | 0.138 | 0.338                                | 0.141                             |
|                   | %        | 20.6  | 29.1                                 | 17.8                              |
| $M_2$             | [nV²/Hz] | 0.149 | 0.264                                | 0.255                             |
|                   | %        | 27.3  | 26                                   | 33.8                              |
| Load losses       | [nV²/Hz] | 0.078 | 0.11                                 | 0.1                               |
|                   | %        | 11.6  | 9.5                                  | 12.6                              |
| NF                | [dB]     | 5.26  | 7.64                                 | 5.96                              |

**Table 3.1:** Comparison of the noise performances of the proposed front-end against a traditional mm-wave LNA gain stage.

over the desired signal at IF. When the image filter is introduced, these noise contributions are reduced and become very close to the corresponding ones in the LNA. The noise of the switching pair does not depend on the filtering of the image [37]; it is higher than the noise of the common gate in the cascode, but it cannot be directly compared to the latter because it also depends on the amplitude of the LO, which is not present in the simple LNA.

There is another important aspect for the comparison between the cascode and the active mixer: the current to voltage conversion is provided at IF only, where the typical quality factor of inductors is higher than at RF, thus increasing the voltage gain. Lastly, the signal is immediately converted from single-ended to differential, greatly relaxing the problem of the ground paths that is critical at millimeter-wave.

## 3.2 Experimental results

Figure 3.15: Chip microphotograph.

The microphotograph of the test chip is shown in Figure 3.15. It has been realized in a 65 nm CMOS technology provided by STMicroelectronics, using general purpose transistors with 1 V supply voltage. The core area is only 170×320 $\mu$m<sup>2</sup> including the input pads, with a total power consumption of 5 mW. The LO is external and it is converted to differential by an on-chip spiral balun of 80 $\mu$m diameter, while the parasitic capacitance of the output pads and the 50 $\Omega$ impedance of the measurement setup are driven by a single-stage output buffer. The last two elements are not included in the area and power consumption reported above. The input reflection coefficient has been measured with an Agilent N5250C PNA, and the result is shown in Figure 3.16. The $S_{11}$ is lower than -10 dB from



**Figure 3.16:** Measured  $S_{11}$ . Simulation in dotted line.

approximately 57 to 67 GHz, thus validating the proposed architecture for the input matching network. Gain and noise figure have been characterized with an Agilent N9030A PXA spectrum analyzer equipped with the noise figure measurement utility. The results versus IF frequency with a fixed LO of 37 GHz are reported in Figure 3.17, where it can be observed that the optimum IF frequency is 18.5 GHz, with a noise figure close to the minimum value of 6.5 dB and a gain around 17 dB. Figure 3.18 shows the gain and noise figure versus the RF frequency when downconverted at a fixed IF of 18.5 GHz. Under this condition the front-end covers a very wide RF bandwidth of more than 14 GHz. The measured input 1 dB compression point is -9 dBm. Measurements point out a center operating frequency lower than expected from simulations, and this is very likely due to an underestimation of the parasitic post-layout capacitances.

The proposed single-stage merged LNA and mixer has comparable or better performance than the stand-alone LNAs published in the cited references, as can be derived from Table 3.2, where experimental results are summarized and compared against state-of-the-art mm-wave CMOS standalone amplifiers. The different performance metrics have also been normalized by means of the following Figure of Merit<sup>5</sup>, defined in the ITRS report on System Drivers [3]:

$$FoM = \frac{G \cdot CP_{1dB} \cdot f_0}{(F-1)P_{diss}}$$

The proposed solution has gain, noise figure and linearity comparable to single LNAs, with the lowest power dissipation, while combining amplification and frequency translation. The avoidance of resonant tuned loads at RF also leads to a very wide covered RF bandwidth.

<sup>5</sup>ITRS defines the FoM with IIP3. Since most of the cited LNAs do not declare the IIP3, the 1 dB compression point has been used instead.

**Figure 3.17:** Measured gain and noise figure versus IF frequency at a fixed 37 GHz LO. Simulations in dotted lines.

**Figure 3.18:** Measured gain and noise figure versus RF frequency at a fixed 18.5 GHz IF. Simulations in dotted lines.

## 3.3 Conclusions

A stacked LNA and mixer for mm-wave applications has been reported in this chapter. Notch filtering at the image frequency before frequency translation, together with the optimization of the circuit, leads to a low front-end noise figure without requiring a multi-stage LNA. The solution leads to a very wide RF bandwidth and a remarkable power and area saving, which are very desirable properties especially in view of dense phased-array systems made of multiple transceivers on the same chip, operating in parallel.

|            |       | [33]    | [11]  | [34]    | [10]    | [35]      | This work |
|------------|-------|---------|-------|---------|---------|-----------|-----------|
| CMOS       | tech. | 90nm    | 90nm  | 65nm    | 130nm   | 90nm      | 65nm      |
| $f_0$      | [GHz] | 64      | 58    | 60      | 60      | 63\*      | 54        |
| RF BW      | [GHz] | 8.0     | 8.2\* | 7.7     | 14      | 4.5\*     | 14        |
| Gain       | [dB]  | 15.5    | 14.6  | 22.3    | 12.0    | 12.2      | 17.0      |
| NF         | [dB]  | 6.5÷6.7 | 5.0   | 6.1÷7.0 | 8.8÷9.0 | 5.5÷6.5\* | 6.5÷8.0   |
| $CP_{1dB}$ | [dBm] | -11.7   | -16.3 | -19.6   | -10.0   | -8.2      | -9.0      |
| Power      | [mW]  | 86      | 24    | 35      | 54      | 11        | 5.0       |
| FoM        | [GHz] | 0.09    | 0.14  | 0.08    | 0.07    | 1.24      | 2.78      |

\*Graphically estimated

**Table 3.2:** Performance summary and comparison with state-of-the-art standalone LNAs
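The FoM row of Table 3.2 can be reproduced from the other rows with the short Python sketch below. The unit conventions are not spelled out in the text, so the ones used here are an assumption inferred from the tabulated values: G is taken as a linear voltage ratio ($10^{G_{\rm dB}/20}$), $CP_{1dB}$ and $P_{diss}$ in mW, $f_0$ in GHz, and F is the linear noise factor computed from the minimum NF of each range. Reference [35] is left out because its $f_0$ and NF entries are graphically estimated, so the exact inputs behind its tabulated FoM are not known.

```python
def lna_fom_ghz(gain_db, cp1db_dbm, f0_ghz, nf_min_db, p_diss_mw):
    """FoM = G * CP1dB * f0 / ((F - 1) * Pdiss), with the ASSUMED unit conventions."""
    gain = 10.0 ** (gain_db / 20.0)            # linear voltage gain (assumption)
    cp1db_mw = 10.0 ** (cp1db_dbm / 10.0)      # dBm -> mW
    noise_factor = 10.0 ** (nf_min_db / 10.0)  # dB -> linear
    return gain * cp1db_mw * f0_ghz / ((noise_factor - 1.0) * p_diss_mw)


# (gain [dB], CP1dB [dBm], f0 [GHz], minimum NF [dB], power [mW]) from Table 3.2
TABLE_3_2 = {
    "[33]": (15.5, -11.7, 64.0, 6.5, 86.0),
    "[11]": (14.6, -16.3, 58.0, 5.0, 24.0),
    "[34]": (22.3, -19.6, 60.0, 6.1, 35.0),
    "[10]": (12.0, -10.0, 60.0, 8.8, 54.0),
    "This work": (17.0, -9.0, 54.0, 6.5, 5.0),
}

fom = {name: lna_fom_ghz(*row) for name, row in TABLE_3_2.items()}
for name, value in fom.items():
    print(f"{name:>9}: FoM = {value:.2f} GHz")
```

Under these conventions the computed values round to the entries of the FoM row for the five designs listed.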

# Chapter 4

# **Quadrature frequency generation**

Quadrature frequency generation is a key aspect for phased-array systems, because it allows an arbitrary phase to be built through a phase interpolator<sup>1</sup>. Frequency division is equally important, since the oscillator must be closed into a PLL for proper operation. This chapter addresses the realization of these two fundamental blocks.

## 4.1 Quadrature VCO

Several architectures can be used to generate quadrature phases. Single-phase VCOs followed by transmission lines or hybrids are suited for quadrature signal generation at high frequency [22], [39], but the drawback is the relatively large power consumed by the buffers interfacing the VCOs to distributed passive components. Alternatives limiting power consumption have thus been investigated. Conventional cross-coupled LC VCOs constitute the most suitable topology borrowed from RF solutions [24], but the dependence of the oscillation frequency on the biasing current makes it susceptible to phase noise, close-in in particular [40]. On the contrary, the proposed solution relies on a ring of two tuned VCOs, where the oscillation frequency depends on inter-stage passive components only, demonstrating low noise and accurate quadrature phases.

## 4.1.1 Flicker noise issue in quadrature generation

The traditional LC-coupled quadrature VCO shown in Figure 4.1 presents an upconversion of the flicker noise into phase noise that is not present in its single-phase counterpart, thus leading to a lower Figure of Merit (FoM). This mechanism can be understood looking at the phasor diagram of voltages and

 $<sup>^{1}</sup>$ See Appendix A



Figure 4.1: Traditional cross-coupled LC quadrature VCO schematic.



**Figure 4.2:** *Phasor diagram of voltages and currents in the tanks of the coupled oscillators.* 

currents in the two LC tanks shown in Figure 4.2: the current $I_{\text{tank}}$ is the sum of the two quadrature components $I_{\text{I}}$ and $I_{\text{Q}}$, thus being shifted by $\psi_{\text{tank}}$ from the voltage $V_{\text{tank}}$. The circuit is thus oscillating at a frequency that is different from the resonance of the load, where the equivalent impedance of the latter is purely real and maximum in modulus. The amplitude of the oscillation is consequently lower than the largest achievable and so even more susceptible to phase noise. From Figure 4.2 it results:

$$\psi_{\rm tank} = \arctan \frac{I_{\rm Q}}{I_{\rm I}}$$

The phase response of a parallel LC resonator can be approximated around its resonance frequency  $\omega_0$  as:

$$\psi_{\rm tank}(\Delta\omega) \approx \arctan\left(2Q\frac{\Delta\omega}{\omega_0}\right)$$




Figure 4.3: Ring of two magnetically coupled VCOs.

where  $\omega_0 = 1/\sqrt{LC}$  and Q is the quality factor. On this basis, the oscillation frequency of the QVCO can be expressed as [41]:

$$\omega_{\rm osc} = \omega_0 (1 \pm \Delta \omega) = \omega_0 \left( 1 \pm \frac{1}{2 Q} \frac{I_{\rm Q}}{I_{\rm I}} \right) \tag{4.1}$$

This result reveals that fluctuations in the amplitude of the currents  $I_{\rm Q}$  and  $I_{\rm I}$  are directly converted into phase noise, in contrast with classical single-phase VCOs. Moreover, the implementation at millimeter-waves asks for small devices in order to minimize the parasitic capacitances. Since the flicker noise is inversely proportional to the  $C_{\rm GS}$ , the  $1/f^3$  noise corner is usually encountered at several MHz [24], [42], too high for the application of interest.
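To give a feeling for the sensitivity expressed by (4.1), the Python sketch below evaluates the frequency shift produced by a small fluctuation of the current ratio $I_{\rm Q}/I_{\rm I}$. The numbers are illustrative assumptions only (a 60 GHz tank resonance, Q = 5, a nominal ratio of 0.25 and a 1% fluctuation of that ratio); they are not taken from a specific design.

```python
def qvco_frequency(f0_hz, q, current_ratio, sign=+1):
    """Oscillation frequency from (4.1): f0 * (1 +/- (1 / (2Q)) * I_Q / I_I)."""
    return f0_hz * (1.0 + sign * current_ratio / (2.0 * q))


F0 = 60e9     # ASSUMED tank resonance [Hz]
Q_TANK = 5.0  # ASSUMED tank quality factor
RATIO = 0.25  # ASSUMED nominal I_Q / I_I

f_nominal = qvco_frequency(F0, Q_TANK, RATIO)
f_perturbed = qvco_frequency(F0, Q_TANK, RATIO * 1.01)  # 1% fluctuation of the ratio
shift_hz = f_perturbed - f_nominal

print(f"nominal oscillation frequency: {f_nominal / 1e9:.2f} GHz")
print(f"shift for a 1% ratio change:   {shift_hz / 1e6:.1f} MHz")
```

Under these assumptions a 1% change of the current ratio moves the oscillation frequency by about 15 MHz, which shows how low-frequency current noise translates directly into frequency (and hence phase) fluctuations.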

## 4.1.2 Basic idea

All the phase shift necessary to sustain the oscillation can be provided using passive components only. In this way $\omega_{osc}$ depends only on inductances and capacitances, which are free of flicker noise. This result can be achieved using the structure shown in Figure 4.3, where each stage is made of a gain block $(g_m)$ and a reactive load coupled to the cascaded stage. The two-port passive inter-stage network can be described by the admittance matrix $\mathbb{Y}$. By inspection

of Figure 4.3, and considering the simplification  $R_{p1} = R_{p2} = R$ , the following expressions can be derived:

$$y_{11} = y_{22} = \frac{1}{R} + sC + \frac{1}{sL(1-k^2)} \tag{4.2}$$

$$y_{12} = y_{21} = \frac{k}{s \, L(1-k^2)} \tag{4.3}$$

The principal parameter of the coupled resonator is the transimpedance that converts the output current of the previous gain stage into the input voltage of the following one. The $z_{21}$ can be derived from the previous equations as:

$$z_{21} = -\frac{y_{21}}{y_{11}y_{22} - y_{12}y_{21}} = \frac{jk}{(1-k^2)\omega L \left\{ \left(\frac{k}{(1-k^2)\omega L}\right)^2 + \left[\frac{1}{R} + j\left(\omega C - \frac{1}{(1-k^2)\omega L}\right)\right]^2 \right\}}$$
(4.4)

The real part of  $1/z_{21}$  becomes null at resonance, showing that the phase shift gained by the signal traveling across the network is equal to  $\pi/2$  at  $\omega_0$ . The former can be calculated from (4.4) as:

$$Re\left[\frac{1}{z_{21}}\right] = \frac{-2 - 2C(k^2 - 1)\omega^2 L}{kR}$$

By putting the previous equation equal to zero, the resonance frequency results:

$$\omega_0 = \frac{1}{\sqrt{LC(1-k^2)}}$$
(4.5)

This oscillation frequency now depends on passive components only, so there is no direct conversion of flicker noise into phase noise as happens in (4.1).

In order to start-up and sustain the oscillation, a loop gain greater than one must be provided, so the condition  $(g_m z_{21})^2 > 1$  must be satisfied. The magnitude of the transimpedance of the coupled resonators at  $\omega_0$  can be derived by substituting (4.5) into (4.4):

$$|z_{21}(\omega_0)| = R \, \frac{kQ}{1 + (kQ)^2}$$

where  $Q = \omega_0 RC$ .
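The closed-form results above can be cross-checked numerically by building $z_{21}$ directly from the admittance parameters. In the Python sketch below the mutual admittance is taken as $k/(sL(1-k^2))$, which is the form that reproduces the numerator of (4.4). The component values (60 GHz, 100 pH, k = 0.2, Q = 5) are assumptions chosen for illustration, with C and R derived from (4.5) and from $Q = \omega_0 RC$.

```python
import cmath
import math


def z21_coupled_tanks(omega, l, c, r, k):
    """Transimpedance of two identical parallel RLC tanks magnetically coupled by k."""
    s = 1j * omega
    y_ind = 1.0 / (s * l * (1.0 - k ** 2))
    y11 = 1.0 / r + s * c + y_ind  # self admittance, as in (4.2)
    y21 = k * y_ind                # mutual admittance, form consistent with (4.4)
    return -y21 / (y11 * y11 - y21 * y21)


# ASSUMED illustrative values
F0 = 60e9
L_TANK = 100e-12
K = 0.2
Q = 5.0

W0 = 2.0 * math.pi * F0
C_TANK = 1.0 / (W0 ** 2 * L_TANK * (1.0 - K ** 2))  # from (4.5)
R_TANK = Q / (W0 * C_TANK)                           # from Q = w0 * R * C

z = z21_coupled_tanks(W0, L_TANK, C_TANK, R_TANK, K)
phase_deg = math.degrees(cmath.phase(z))
closed_form = R_TANK * K * Q / (1.0 + (K * Q) ** 2)

print(f"phase of z21 at w0: {phase_deg:.3f} deg")
print(f"|z21(w0)| numeric: {abs(z):.2f} ohm, closed form: {closed_form:.2f} ohm")
```

At $\omega_0$ the phase is 90° and the magnitude matches $R\,kQ/(1+(kQ)^2)$, which for $kQ = 1$ equals $R/2$.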



**Figure 4.4:** Transimpedance phase and magnitude for magnetically coupled resonators versus frequency for different k values.

## 4.1.3 Circuit design

Magnitude and phase response of $z_{21}$ (4.4) versus frequency for different k values at a fixed Q = 5 are shown in Figure 4.4. The phase variation of the transimpedance becomes steeper as k decreases, so a small coupling is useful to reduce the phase noise because it narrows the frequency range where the Barkhausen criterion is satisfied. Since the amplitude at resonance is not monotonic with the coupling, an optimum value k = 0.2 has been found to minimize the power consumption.
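The optimum coupling can be recovered from the expression of $|z_{21}(\omega_0)|$: the start-up condition $g_m |z_{21}| > 1$ is easiest to meet, and therefore requires the least transconductance, where $kQ/(1+(kQ)^2)$ is maximum. The Python sketch below is a simple sweep over k for Q = 5; the grid of candidate values is an arbitrary choice made for illustration.

```python
def normalized_transimpedance(k, q):
    """|z21(w0)| / R = kQ / (1 + (kQ)^2)."""
    x = k * q
    return x / (1.0 + x * x)


Q = 5.0
candidates = [i / 100.0 for i in range(1, 100)]  # k from 0.01 to 0.99

best_k = max(candidates, key=lambda k: normalized_transimpedance(k, Q))
# Minimum g_m * R that gives unity loop gain per stage at the optimum coupling
min_gm_times_r = 1.0 / normalized_transimpedance(best_k, Q)

print(f"optimum k for Q = {Q:g}: {best_k:.2f}")
print(f"required g_m * R at the optimum: > {min_gm_times_r:.1f}")
```

The maximum falls at $kQ = 1$, i.e. k = 0.2 for Q = 5, where $|z_{21}| = R/2$ and the start-up condition reduces to $g_m R > 2$.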

In order to prove the concept of the flicker noise reduction and to establish a quantitative comparison, the two circuits shown in Figure 4.1 and Figure 4.3 have been simulated, and the results are shown in Figure 4.5. Both circuits draw 22 mA from 1 V supply, have the same Q of 5 for each LC tank, and have been tailored to provide a comparable I & Q error of about 2°. No varactors have been included in order to avoid 1/f phase noise due to AM to PM conversion. The proposed solution shows a significant improvement, since the 1/f noise corner is



**Figure 4.5:** Simulated phase noise for 60 GHz tones generated by quadrature oscillators based on active (black line) and passive (gray line) coupling.

moved downward by almost two decades, from $\sim 10$ MHz to $\sim 20$ kHz, as shown by the dashed lines that highlight the slopes.

The transfer function of the coupled resonators is the same if the coupling is implemented in a capacitive or a magnetic fashion. In the first solution, four separate inductors are needed, and they should be placed relatively far from each other in order to keep the capacitive coupling as the main one [43]. The magnetic version has been chosen since it allows a $2\times$ area saving, with the additional benefit of simplifying the routing of the quadrature signals out of the core. A 3-D view of the implemented transformer is shown in Figure 4.6, along with a simple lumped model. In order to realize a comparable inductance for the two interleaved windings, they should be very close to each other, thus leading to a coupling factor $k_1$ close to 1. As derived in the previous section, the optimum coupling is rather small, about 0.1 to 0.2 for an overall resonator Q of 5, so a closed-loop shield has been introduced surrounding the outside inductor. The effect can be demonstrated as follows: the complete system for the three coupled inductors is

$$\begin{cases} V_1 = sL_{\rm int}\, I_1 + sk_1\sqrt{L_{\rm int}L_{\rm ext}}\, I_2 + sk_2\sqrt{L_{\rm int}L_{\rm sh}}\, I_{\rm x} \\ V_2 = sk_1\sqrt{L_{\rm int}L_{\rm ext}}\, I_1 + sL_{\rm ext}\, I_2 + sk_3\sqrt{L_{\rm ext}L_{\rm sh}}\, I_{\rm x} \\ 0 = sk_2\sqrt{L_{\rm int}L_{\rm sh}}\, I_1 + sk_3\sqrt{L_{\rm ext}L_{\rm sh}}\, I_2 + sL_{\rm sh}\, I_{\rm x} \end{cases}$$

where the third equation already takes into account that the shield is short-circuited.



Figure 4.6: Low-k transformer: 3-D view and simplified lumped model.

By putting all the inductances equal to L, the current $I_{\rm x}$ can be removed from the system, leading to:

$$\begin{cases} V_1 = s(1-k_2^2)L\, I_1 + s(k_1-k_2k_3)L\, I_2 \\ V_2 = s(k_1-k_2k_3)L\, I_1 + s(1-k_3^2)L\, I_2 \end{cases} \tag{4.6}$$

The equivalent coupling between the inner and outer inductors is therefore

$$k_{\rm eq} = k_1 - k_2 k_3 \tag{4.7}$$
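The elimination of the shield current can also be carried out numerically on the 3×3 inductance matrix, which is convenient when the three inductances are not equal. The Python sketch below removes the short-circuited winding with a Schur complement and checks the result against (4.6) and (4.7) in the equal-inductance case. The coupling values $k_1 = 0.8$, $k_2 = 0.7$ and $k_3 = 0.9$ are hypothetical and are not those of the implemented transformer.

```python
import math


def inductance_matrix(l_int, l_ext, l_sh, k1, k2, k3):
    """3x3 inductance matrix of the inner winding, outer winding and shield."""
    m12 = k1 * math.sqrt(l_int * l_ext)
    m13 = k2 * math.sqrt(l_int * l_sh)
    m23 = k3 * math.sqrt(l_ext * l_sh)
    return [[l_int, m12, m13],
            [m12, l_ext, m23],
            [m13, m23, l_sh]]


def eliminate_shorted_winding(l_matrix, shorted=2):
    """Remove a winding whose terminal voltage is zero (Schur complement)."""
    keep = [i for i in range(len(l_matrix)) if i != shorted]
    pivot = l_matrix[shorted][shorted]
    return [[l_matrix[i][j] - l_matrix[i][shorted] * l_matrix[shorted][j] / pivot
             for j in keep] for i in keep]


# HYPOTHETICAL values, chosen only to exercise the formulas
L = 100e-12
K1, K2, K3 = 0.8, 0.7, 0.9

reduced = eliminate_shorted_winding(inductance_matrix(L, L, L, K1, K2, K3))
k_eq = reduced[0][1] / L  # mutual term normalized to L, as in (4.7)

print(f"L_int' = {reduced[0][0] * 1e12:.1f} pH, L_ext' = {reduced[1][1] * 1e12:.1f} pH")
print(f"k_eq = {k_eq:.2f}")
```

With these numbers the mutual term drops from 0.8 to 0.17, and the outer winding, being more tightly coupled to the shield, loses more self inductance than the inner one.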

The shield has been realized with two turns, embedding the secondary winding in order to maximize $k_3$ and thus reduce $k_{eq}$ down to the desired value of $\approx 0.15$. Accurate electromagnetic simulations have been performed over the entire structure of the coupled resonator, leading to the results shown in Figure 4.7. The internal inductance is 98 pH with a Q of 19, while the external one is 113 pH with a Q of 15. Since $k_3$ is higher than $k_2$, the self inductance of $L_{ext}$ is reduced more than the internal one, as shown by (4.6), and the resulting quality factor is lower. Since AMOS varactors show a quality factor around 7 at 60 GHz, the overall quality factor of the tank depends mainly on them, so the Q reduction introduced by the shield has a negligible impact on the performance of the oscillator.

The complete schematic of the implemented QVCO is depicted in Figure 4.8. Three digitally controlled varactors implement the coarse tuning of each LC tank in eight different bands, with an AMOS varactor for fine analog tuning within the band. Since varactors are responsible for 1/f noise conversion into phase noise, the circuit has been biased by means of digitally controlled resistors $R_{\text{bias}}$ in order to minimize the sources of flicker noise. Biasing resistors have been



**Figure 4.7:** Simulated parameters of the low-k transformer: (a) inductance and quality factor for internal (black) and external (gray) windings; (b) coupling coefficient.



Figure 4.8: Complete schematic of the realized quadrature VCO.


placed between the supply voltage and the center taps of the transformers, in order to nominally set the output nodes around  $V_{\rm DD}/2$  and thus explore the tuning characteristic of the AMOS varactors in their region of steepest variation.

Transconductors have been set to an aspect ratio of $30 \,\mu\text{m} \,/ 60 \,\text{nm}$, oversized with respect to the minimum required to start the oscillation. This ensures a wide margin, because we were not completely confident in the modeling of the transformers, at the cost of a limited tuning range in this prototype. Active devices work in a pseudodifferential way, increasing the risk of common mode oscillations. In fact, the Barkhausen condition on the phase holds not only when the coupled resonators provide a phase shift of 90°, but also for 0° and 180°. The latter two cases lead to common mode oscillations and are to be avoided. Intuitively, resistors $R_{\rm cm}$ drastically reduce the quality factor of the resonator under common mode excitation, while the loop gain is not affected when a differential signal propagates. The resistance has been set to $6 \,\mathrm{k}\Omega$, which simulations have shown to be sufficient to avoid common mode oscillations in any case.

## 4.1.4 Experimental results

The complete quadrature oscillator of Figure 4.8 has been realized in a standard 65 nm CMOS bulk node from STMicroelectronics, featuring seven copper layers plus an aluminum layer on top. The supply voltage is 1 V and general purpose transistors have been employed. The chip microphotograph is reported in Figure 4.9. The VCO occupies an area of $0.075 \text{ mm}^2$ and draws 22 mA. The quadrature output signals can be tested directly at 60 GHz or downconverted to a lower frequency by means of a quadrature mixer driven by an external frequency reference, to better determine the phase accuracy. The mixer is based on a conventional Gilbert cell, and since it is used for testing purposes only it is not discussed in further detail for the sake of brevity. The VCO can be tuned between 56 and 60.4 GHz in eight bands 500 MHz wide. The total tuning range is limited to 7.6% only, mainly due to the oversizing of the transistors to assure a wide start-up margin, as discussed before. From later estimations, the device width can be halved, approximately doubling the tuning range while still ensuring a safe start-up condition. The control voltage of the AMOS varactors is tuned between 0 and 1.2 V, and the resulting analog tuning allows a 500 MHz overlap between adjacent bands.

Figure 4.10 (a) shows the measured phase noise for a 58.6 GHz tone: the corner frequency is highlighted by the dashed lines and is less than 1 MHz, demonstrating a very low conversion of the 1/f noise into phase noise, and the spot value at 10 MHz offset is -117 dBc/Hz. Phase noise values at 1 MHz offset from the carrier within the entire tuning range are reported in Figure 4.10 (b), revealing



Figure 4.9: Chip microphotograph.



**Figure 4.10:** (a) Measured phase noise for a 58.6 GHz output frequency. (b) Phase noise spot values at 1 MHz offset from the carrier within the tuning range.



Figure 4.11: VCO quadrature output signals down-converted to 200 MHz

a maximum variation of 2 dB. Phase accuracy measurements are summarized in Figure 4.11, where a scope screenshot with quadrature signals down-converted to 200 MHz is reported. The same measurement has been repeated several times at different VCO frequencies, always obtaining a phase error $< 1.5^{\circ}$ and an amplitude mismatch < 1 dB. The performance of the oscillator is summarized in Table 4.1 and compared with recent state-of-the-art millimeter-wave QVCOs. The obtained figure of merit for the phase noise, calculated as

$$FOM = L(\Delta\omega) - 20\log_{10}\left(\frac{\omega_0}{\Delta\omega}\right) + 10\log_{10}\left(\frac{P_{\text{diss}}}{1\,\text{mW}}\right)$$

ranges from -177 to -179 dBc/Hz (worst and best cases in band respectively) when calculated at 1 MHz offset from the carrier, the best published to the authors' knowledge. The presented oscillator also shows good performance in terms of phase accuracy and occupies a relatively small area.

| Ref         |          | [24]  | [44]  | [42]  | This work   |
|-------------|----------|-------|-------|-------|-------------|
| Tech        | CMOS     | 90 nm | 65 nm | 45 nm | 65 nm       |
| $f_0$       | [GHz]    | 48.0  | 93.1  | 61.6  | 58.2        |
| TR          | [GHz]    | 8.0   | 4.0   | 9.0   | 4.4         |
| PN @ 1 MHz  | [dBc/Hz] | -85   | -90   | -75   | -95 / -97   |
| FOM         | [dBc/Hz] | -165  | -173  | -156  | -177 / -179 |
| Phase error | [°]      | n.a.  | n.a.  | n.a.  | < 1.5       |
| Area        | [mm²]    | n.a.  | n.a.  | n.a.  | 0.075       |
| Power       | [mW]     | 22.7  | 43.2  | 28.0  | 22.0        |


 Table 4.1: QVCO performances summary and comparison with the state of the art.
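The FOM row of Table 4.1 follows directly from the phase noise, center frequency and power rows through the figure of merit defined above. The Python sketch below recomputes it, taking the offset as 1 MHz and the power in mW as in the table; for this work the worst and best phase noise values in band are both evaluated at the 58.2 GHz center frequency, which is a simplifying assumption.

```python
import math


def vco_fom_dbc_hz(phase_noise_dbc_hz, f0_hz, offset_hz, p_diss_mw):
    """FOM = L(df) - 20*log10(f0/df) + 10*log10(Pdiss / 1 mW)."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f0_hz / offset_hz)
            + 10.0 * math.log10(p_diss_mw))


OFFSET_HZ = 1e6

# (phase noise at 1 MHz [dBc/Hz], f0 [Hz], power [mW]) from Table 4.1
TABLE_4_1 = {
    "[24]": (-85.0, 48.0e9, 22.7),
    "[44]": (-90.0, 93.1e9, 43.2),
    "[42]": (-75.0, 61.6e9, 28.0),
    "This work (worst)": (-95.0, 58.2e9, 22.0),
    "This work (best)": (-97.0, 58.2e9, 22.0),
}

vco_fom = {name: vco_fom_dbc_hz(pn, f0, OFFSET_HZ, p)
           for name, (pn, f0, p) in TABLE_4_1.items()}
for name, value in vco_fom.items():
    print(f"{name:>18}: FOM = {value:.1f} dBc/Hz")
```

Rounded to the nearest dB, the computed values coincide with the tabulated ones.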

## 4.2 Frequency divider architecture based on dynamic latches

A key block of a millimeter-wave synthesizer is the first divider, which is required to cover a relatively wide band in order to compensate for spreads due to process variations while saving area and power, usually conflicting requirements. Frequency dividers for PLLs based on traditional static CML latches work over a wide band, but their power dissipation at mm-wave is extremely large. Injection-locked LC dividers save power, but have limited tunability and occupy a large silicon area [45]–[48]. Dividers based on injection-locked ring oscillators are compact, low power, and can be tuned over a wide frequency range. Many CMOS realizations have been proposed with an operating frequency up to 20 GHz [49]–[51]. A few realizations working at mm-wave have been reported [42], [52], but the frequency locking range is extremely narrow (less than 4%), mandating fine and frequent calibrations. Clocked differential amplifiers, working as dynamic CML latches, are introduced in this section to realize high-speed and low-power mm-wave frequency dividers.

## 4.2.1 Overview of the state of the art

Several millimeter-wave frequency dividers have already been proposed in the literature, and a brief overview of four representative solutions is presented in this paragraph.

In [53] a harmonic injection-locked frequency divider-by-four in 90 nm CMOS is presented. The architecture and the schematic of the circuit are reported in Figure 4.12. The nmos connected in parallel with the LC tank is biased in sub-threshold, thus a variation of the  $V_{\rm DS}$  induces a strong third-order harmonic component in the  $I_{\rm DS}$ . When an input signal at  $4f_{\rm osc}$  is applied at the gate of



**Figure 4.12:** Harmonic injection-locked frequency divider-by-four proposed in [53].

the mos, it induces a component in the $I_{\rm DS}$ at the same frequency that beats with the third harmonic generated by the non-linearity, thus an output signal at $4f_{\rm osc}-3f_{\rm osc}=f_{\rm osc}$ results and sustains the oscillation of the ring. The variable capacitance in the load allows the oscillation frequency to be tuned and thus the divider to be reconfigured. The proposed solution achieves a locking range of 3.2% for a fixed oscillation frequency, and it can be locked within 62.9 and 71.6 GHz by tuning the load. Power consumption is 2.8 mW with an area occupancy of $110 \times 130 \,\mu {\rm m}^2$.

In [48] a multimode frequency divider-by-2-or-3 in  $0.13 \,\mu$ m CMOS is presented. The schematic of the circuit is reported in Figure 4.13. For the divide-by-2



Figure 4.13: Multimode frequency divider-by-2-or-3 proposed in [48].

operation, the input IN3 is connected to an AC ground. When the circuit self-oscillates, the cross coupled pair switches the bias current between the two branches two times each period, thus injecting a second-order harmonic current


in the drain of the tail transistor. If the latter is injected by an input signal at $2f_{\rm osc}$, the circuit locks and the oscillation is sustained. For the divide-by-3 operation, the input IN2 is connected to an AC ground. The single-ended input signal is converted to differential by means of a T-line balun, and the locking is performed exploiting the third-order non-linearity of the cross-coupled pair. The variable capacitance in the load allows the oscillation frequency to be tuned and thus the divider to be reconfigured. The proposed solution achieves a locking range of 3.9% for a fixed oscillation frequency, and it can be locked within 53.9 and 57.8 GHz in the $\div$3 configuration and within 35.6 and 39.3 GHz in the $\div$2 configuration. Power consumption is 3.1 mW with an area occupancy of $130 \times 180 \,\mu\text{m}^2$.

In [52] an injection-locked divider-by-four as a part of a frequency synthesizer in 90 nm CMOS is presented. The block diagram and the schematic of each block are reported in Figure 4.14. The ILFD is a two-stage ring oscillator based



Figure 4.14: Injection-locked frequency divider-by-four proposed in [52].

on modified CML latches. To maximize the locking range and device overdrive, the current source is replaced with a $\lambda/4$ CPW that resonates at the VCO frequency. Since the ILFD locks on the fourth harmonic of its output, the input signals have to be injected in-phase in the two stages. Two pmos devices are used as voltage-controlled resistors to adjust the self-resonance frequency of the ILFD. The proposed solution achieves a locking range of 2.4% for a fixed oscillation frequency, and it can be locked within 38 and 44.5 GHz. Power consumption is 10 mW with an area occupancy of 220×350 $\mu m^2$.

In [42] an injection-locked divider-by-four as a part of a frequency synthesizer in 45 nm CMOS is presented. The block diagram and the schematic of each block are reported in Figure 4.15. The ILFD is a three-stage differential ring oscillator. The differential LO signal is injected into the first two stages and locks the ring on the fourth harmonic of the self oscillation. The third stage is only biased and acts like a buffer. A bank of binary-weighted pmos transistors implements a variable-resistance load to adjust the self-resonance frequency of the ILFD. The proposed solution achieves a locking range of 3.9% for a fixed oscillation



**Figure 4.15:** Injection-locked frequency divider-by-four proposed in [42].

frequency, and it can be locked between 56 and 67.5 GHz. Power consumption is 6.3 mW with an area occupancy of  $50 \times 80 \,\mu\text{m}^2$ .

The use of an LC resonance to set the oscillation frequency leads to poor reconfigurability of the circuit and a large area occupancy, while the exploitation of high-order harmonic locking based on device non-linearities provides a very narrow locking range for a fixed oscillation frequency. Moreover, the injection-locking description of the divider operation is complex, which increases the development time of the circuit. In the next paragraph, a simple behavioral model for differential amplifiers used as dynamic CML latches is proposed to develop a highly reconfigurable low-power millimeter-wave frequency divider architecture.

## 4.2.2 Basic idea

The schematic of a standard CML latch is depicted in Figure 4.16 (a), together with the corresponding voltage waveforms in the read and hold modes. In the read phase the tail signal E is high and the differential pair senses the input, while in the hold mode E is low and the cross-coupled pair stores the sampled data indefinitely. If the latch is used in a millimeter-wave frequency divider, the output of each latch is refreshed every period of the input signal. This observation suggests that the cross-coupled pair can be removed, since the output state can be momentarily stored on the parasitic load capacitors, as is usual in dynamic logic circuits. The schematic of the latch is modified accordingly, as shown in Figure 4.16 (b), where signal waveforms in the read and hold phases are also reported. As expected, the parasitic capacitors tend to discharge through the load resistors during the hold phase. When the latch is used for frequency division, this mandates a minimum refresh rate of the output, i.e. a lower bound for the input signal frequency, but the removal of the cross-coupled transistor pair provides remarkable benefits, halving the input and output parasitic



**Figure 4.16:** (a) Traditional static CML latch and (b) proposed dynamic CML latch, together with a sample read and hold cycle. Initial conditions at time  $t_0$ : input is low (D < Dn), output is low (Q < Qn), E switches from 0 to 1.



**Figure 4.17:** Block diagram of the proposed divider by 4 based on dynamic CML latches (top); differential input and output waveforms of a latch in periodic steady state, assuming a square wave 60 GHz input clock (bottom).

capacitances and thus increasing the maximum operating frequency while reducing the average power consumption.
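The lower bound on the input frequency can be estimated with a first-order sketch. This is not a result from the measurements, and it rests on assumptions introduced here: the output starts the hold phase at a hypothetical level $V_0$, it discharges exponentially with the load time constant $R_L C_L$, the hold phase lasts half an input period, and the state is lost once the output falls below a hypothetical threshold $V_{\rm hold}$.

$$v(t) = V_0\, e^{-t/(R_L C_L)}, \qquad v\!\left(\frac{1}{2 f_{\rm in}}\right) \geq V_{\rm hold} \;\;\Longrightarrow\;\; f_{\rm in} \geq \frac{1}{2 R_L C_L \ln\left(V_0 / V_{\rm hold}\right)}$$

Under these assumptions, a larger stored swing $V_0$ or a lower threshold $V_{\rm hold}$ relaxes the minimum input frequency only logarithmically, whereas the time constant $R_L C_L$ scales it directly.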

## 4.2.3 Circuit design

Four latches properly closed in a feedback loop and driven by complementary signals, as shown at the top of Figure 4.17, realize a millimeter-wave frequency divider by 4. To gain insight into the behavior of the divider and derive guidelines for optimization, the input and output differential waveforms of one latch in periodic steady state are reported in the figure. A 60 GHz square-wave input clock is assumed for clarity, but the behavior with a sinusoidal input is qualitatively the same. To simplify waveform inspection, the square waves produced by ideal static latches are also reported. The operation of the latch in the divider can be described by dividing the period of the output signal into eight distinct time slots. Operations in  $T_i$  and  $T'_i$  are the same, but with complementary voltages. Before  $T_1$  the output is low. During  $T_1$ , when the clock is high, the latch samples the




**Figure 4.18:** Maximum operating frequency and locking range versus differential pair width for the dynamic latch in the inset.

input and the output differential signal tends asymptotically to  $R_L I_B$  ( $R_L$  and  $I_B$  being the load resistance and bias current respectively). During  $T_2$ , Ck is low, the latch enters the hold mode, and the output evolves toward zero. The logic state is maintained provided the signal does not fall below  $V_{\min} \propto V_{ov}$ , which is the minimum voltage needed to switch the biasing current of the cascaded pair ( $V_{ov}$  is the overdrive voltage of  $M_{1,2}$  in Figure 4.16 (b)). During  $T_3$ , where the clock is high and the latch samples the input, still in the high state, the output is refreshed. During  $T_4$  the latch is in the hold mode again.

Correct divider operation requires the output signal to cross  $V_{\min}$  during the rising transient in  $T_1$  and not to fall below it in the hold mode. The following key dependences emerge: a small  $R_L C_L$  time constant is required to maximize the operating frequency, while a large voltage swing and a low  $V_{ov}$  are desirable to achieve a wide bandwidth. A larger voltage swing requires a larger current, i.e. higher power consumption. For a given biasing current, increasing the transistor width reduces  $V_{ov}$ , but increases the parasitic capacitance, limiting the maximum operating frequency. The width of the differential pair therefore sets a trade-off between maximum frequency and bandwidth. To gain insight, Figure 4.18 shows the results of simulations performed taking into account post-layout parasitics. The latch schematic with device dimensions is shown in the inset. Load resistances and size of the tail transistor do not change, thus keeping the average power consumption roughly constant. Average current consumption for each latch is




**Figure 4.19:** Simulated sensitivity curves for the divider by 4 based on dynamic (solid line) and static (dashed line) CML latches.

1.6 mA while the simulated differential voltage swing is 470 mV 0-pk. The maximum frequency of operation is 80 GHz, with a fractional bandwidth of 14%, achieved with a transistor width of 6  $\mu$ m. At the other extreme, a fractional bandwidth as large as 22% is achieved for a transistor width of 14  $\mu$ m, for which the operating frequency is 50 GHz. In this design, a width of 8  $\mu$ m has been selected as the best compromise, allowing a bandwidth around 20%, close to the maximum achievable, with a maximum input frequency of 70 GHz.
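The time-slot description of the latch can be turned into a small first-order behavioural sketch. It is only an illustration under assumptions made here, not the simulation behind Figure 4.18: each slot lasts half an input period, the differential output follows a single-pole response with a hypothetical time constant `tau_s`, it settles toward $\pm R_L I_B$ in the read slots and toward zero in the hold slots, and `v_min` is a hypothetical value for $V_{\min}$.

```python
import math


def steady_state_slots(f_in_hz, tau_s, v_final):
    """Differential output at the end of T1, T2, T3, T4 in periodic steady state."""
    # Decay factor of a single-pole response over one slot (half an input period).
    a = math.exp(-1.0 / (2.0 * f_in_hz * tau_s))
    v = -v_final  # arbitrary starting point: output low before T1
    for _ in range(200):  # iterate until the waveform repeats
        v1 = v_final + (v - v_final) * a   # T1: read, output rises toward +R_L*I_B
        v2 = v1 * a                        # T2: hold, output decays toward zero
        v3 = v_final + (v2 - v_final) * a  # T3: read, output is refreshed
        v4 = v3 * a                        # T4: hold again
        v = -v4  # T1'..T4' mirror T1..T4 with complementary voltages
    return v1, v2, v3, v4


def divides_correctly(f_in_hz, tau_s, v_final, v_min):
    """Check the two conditions: cross v_min in T1, stay above it while holding."""
    v1, v2, _, v4 = steady_state_slots(f_in_hz, tau_s, v_final)
    return v1 >= v_min and min(v2, v4) >= v_min


if __name__ == "__main__":
    # Hypothetical values: 10 ps time constant, 470 mV asymptote, 50 mV threshold.
    for f_ghz in (10, 60, 300):
        ok = divides_correctly(f_ghz * 1e9, 10e-12, 0.47, 0.05)
        print(f"{f_ghz} GHz input: {'divides' if ok else 'fails'}")
```

With these illustrative numbers the model divides correctly at 60 GHz but fails both at 10 GHz, where the output decays too far in the hold slots, and at 300 GHz, where the rise in $T_1$ is too small. This reproduces qualitatively the existence of a lower and an upper frequency limit.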

It is interesting to compare the performance of the dynamic divider against a solution made of static latches. The simulated sensitivity curves for the two circuits are shown in Figure 4.19. Schematics of the corresponding latches with component sizes are reported in the insets of the figure. To establish a fair comparison, the static latches use the same load resistances as the dynamic latches, but half the width for all transistors. In this way the average power consumption and the RC time constant of the load are the same for the two circuits. The dynamic version of the divider approximately doubles the maximum operating frequency. To gain insight, Figure 4.20 shows the simulated waveforms for the two circuits with a normalized time scale, so that the rising and falling edges for the two latches are aligned even though they work at different frequencies. Inspection of the two waveforms shows that the higher speed of the dynamic divider has two causes: 1) the current of the dynamic latches



**Figure 4.20:** Comparison of simulated differential waveforms, normalized to the same frequency, for the dynamic (dark) and static (light gray) dividers.

in the read mode, when switching the outputs, is twice that of the static latch, leading to a faster capacitance charge and discharge, and 2) the self-discharge of the output, during the second hold phase of each semi-period, allows a faster commutation in the following read phase.

### 4.2.4 Experimental results

A prototype of the frequency divider has been realized in a 65 nm CMOS technology provided by STMicroelectronics. The final schematic of the integrated dynamic latch is shown in Figure 4.21. The load resistors have been realized as pMOS transistors biased in the triode region. The bias voltage of the pMOS sets the equivalent load resistance and allows tuning the operating center frequency. The divider is followed by a three-stage buffer driving the 50  $\Omega$  impedance presented by the measurement setup, while the differential input signals are generated by an on-chip transformer made of two concentric spirals of 80  $\mu$ m average diameter. A photomicrograph of the test chip is shown in Figure 4.22. Without the input signal, if the tail transistors of each latch are properly biased, the circuit behaves like a ring oscillator. Its self-oscillation frequency can be tuned by changing the bias voltage of the pMOS. The measured tuning curve is shown in Figure 4.23, with a free-running frequency tunable from 5 to 18 GHz. Figure 4.24 shows the measured sensitivity curves at different values of the pMOS bias voltage. The input operating frequency spans from 20 to 70 GHz. With an estimated input signal power of 0 dBm, the operating fractional bandwidth of the divider, for each



Figure 4.21: Schematic of the realized CML latch.




Figure 4.22: Divider chip microphotograph.



**Figure 4.23:** Self-oscillation frequency of the divider versus biasing of pMOS devices.



**Figure 4.24:** Measured sensitivity curves and power consumption for different  $M_L$  bias voltages.



**Figure 4.25:** Phase noise measured at the input and the output of the divider.

value of the pMOS bias voltage, varies from a minimum of 10% to a maximum of 17%. The power consumption is roughly linear with the input frequency, and spans from 1.7 mW at 20 GHz to 6.4 mW at 70 GHz.

The phase noise measured at the input and output of the divider when driven by a commercial low-noise millimeter-wave source (Agilent E8257D) is shown in Figure 4.25. The approximately 12 dB difference demonstrates a negligible noise degradation introduced by the divider by 4, considering each division by 2 introduces a 6 dB phase noise reduction. The plot is significant up to  $\sim$ 7 MHz frequency offset from the carrier where the -140 dBm/Hz noise floor of the spectrum analyzer inhibits resolution of the phase noise of the device-under-test.
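The 6 dB per division by 2 follows from the standard relation that an ideal divide-by-$N$ scales phase noise by $20\log_{10} N$. The short sketch below evaluates it and the residual degradation. The function names and the two phase-noise readings in the usage example are hypothetical and are not taken from Figure 4.25.

```python
import math


def ideal_phase_noise_reduction_db(division_ratio):
    """Phase-noise reduction of an ideal, noiseless frequency divider."""
    return 20.0 * math.log10(division_ratio)


def excess_noise_db(pn_in_dbc_hz, pn_out_dbc_hz, division_ratio):
    """Output phase noise in excess of what an ideal divider would give."""
    ideal_out = pn_in_dbc_hz - ideal_phase_noise_reduction_db(division_ratio)
    return pn_out_dbc_hz - ideal_out


if __name__ == "__main__":
    print(round(ideal_phase_noise_reduction_db(2), 2))  # one division by 2
    print(round(ideal_phase_noise_reduction_db(4), 2))  # divider by 4
    # Hypothetical readings at the same offset: -100 dBc/Hz in, -112 dBc/Hz out.
    print(round(excess_noise_db(-100.0, -112.0, 4), 2))
```

An excess close to 0 dB, as in the hypothetical example, corresponds to the negligible degradation described above.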

The experimental results are summarized in Figure 4.26 and compared against recently reported mm-wave injection locked dividers, all with division factors larger than 2. The proposed technique leads to the highest operating frequency without the use of inductors, the largest input frequency range, with the widest fractional bandwidth in each sub-band. Silicon area is also the smallest reported to date.

### 4.2.5 Performance improvement

The work described so far demonstrates the validity and feasibility of the proposed architecture for high-speed frequency dividers. The circuit exhibits a very wide

| Ref | Div ratio | Frequency f<sub>min</sub> / f<sub>max</sub> [GHz] | Frequency range in sub-bands | Power [mW] | Area [μm x μm] | Tech CMOS |
|------|-----------|---------------|-------------|------|-----------|-------|
| This work | 4 | 20 / 70 | 10 - 17 % | 6.5 | 15 x 30 | 65nm |
| Injection Locked LC dividers | | | | | | |
| [15] | 3 | 53.9 / 57.8 | 3.9 % | 3.1 | 130 x 180 | 130nm |
| [12] | 4 | 79.7 / 81.6 | 2.4 % | 12 | 106 x 330 | 65nm |
| [28] | 4 | 62.9 / 71.6 | 3.2 % | 2.8 | 110 x 130 | 90nm |
| Injection Locked RC dividers | | | | | | |
| [19] | 4 | 56.0 / 67.5 | 3.9 % | 6.3 | ~ 50 x 80 | 45nm |
| [20] | 4 | >38 / 44.5 | 2.4 % | 10 | 350 x 220 | 65nm |


**Figure 4.26:** Divider performance summary and comparison with the state of the art.

tuning range: it can be locked to an input signal ranging from 20 to 70 GHz. This feature is due to the very high  $K_{\rm VCO}$  shown by the ring while acting as an oscillator. The estimated  $K_{\rm VCO}$  from Figure 4.23 is about 42.5 GHz/V, which becomes 170 GHz/V when referred to the input. Such a high value, indispensable for a wide tunability, mandates an accurate control of the bias voltage, and makes the divider highly susceptible to the noise on  $V_{\rm tune}$ . Moreover, such noise reduces the locking range of each sub-band: the fluctuations of  $V_{\rm tune}$  are converted into a variation of the center frequency of the sub-band through the relation depicted in Figure 4.23. As shown in Figure 4.27, if the center frequency of the sub-band moves randomly between  $f_1$  and  $f_2$ , locking is ensured only in the highlighted range, which is covered by the sub-band in every case. The equivalent locking range therefore becomes narrower as the fluctuation of  $V_{\rm tune}$  increases.
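The two effects described here can be put into numbers with a minimal sketch. It assumes that the input-referred gain is the output $K_{\rm VCO}$ multiplied by the division ratio, and that the guaranteed locking range is the sub-band width minus the spread $f_2 - f_1$ of its center frequency. The sub-band width and noise amplitudes in the usage example are hypothetical.

```python
def input_referred_kvco(kvco_out_ghz_per_v, division_ratio):
    """Tuning gain seen at the divider input, in GHz/V."""
    return kvco_out_ghz_per_v * division_ratio


def equivalent_locking_range_ghz(locking_range_ghz, kvco_in_ghz_per_v, vtune_noise_pp_v):
    """Range covered by the sub-band wherever its center sits between f1 and f2."""
    centre_spread_ghz = kvco_in_ghz_per_v * vtune_noise_pp_v  # f2 - f1
    return max(0.0, locking_range_ghz - centre_spread_ghz)


if __name__ == "__main__":
    k_in = input_referred_kvco(42.5, 4)  # 42.5 GHz/V estimated from Figure 4.23
    print(k_in)
    # Hypothetical 6 GHz sub-band with 10 mV and 50 mV peak-to-peak noise on V_tune.
    print(equivalent_locking_range_ghz(6.0, k_in, 0.010))
    print(equivalent_locking_range_ghz(6.0, k_in, 0.050))
```

In this example 10 mV of peak-to-peak noise already removes 1.7 GHz from the sub-band, and 50 mV leaves no frequency at which locking is guaranteed.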

Although heavy filtering between  $V_{\text{tune}}$  and  $V_{\text{dd}}$  would fix the problem, the relatively small capacitors that can be integrated on chip are not large enough. The issue can be addressed through the following steps. The pMOS transistors  $M_{\text{L}}$  shown in Figure 4.21 are nominally biased in the triode region at DC, so their equivalent output conductance can be calculated as follows:

$$g_{\rm DS} \propto \beta V_{\rm ov} \tag{4.8}$$

The frequency of a ring oscillator is inversely proportional to the RC time constant of the load, thus:

$$f_{\rm osc} \propto \frac{1}{R_{\rm L,eq}} \propto \beta V_{\rm ov} \tag{4.9}$$



**Figure 4.27:** Reduction of the locking range due to the noise over  $V_{tune}$ .

From Figure 4.21, we have:

$$V_{\rm ov} = V_{\rm DD} - V_{\rm tune} + V_{\rm th,p} \tag{4.10}$$

so substituting (4.10) into (4.9) gives:

$$f_{\rm osc} \propto \beta \left( V_{\rm DD} + V_{\rm th,p} - V_{\rm tune} \right) \tag{4.11}$$

and this equation demonstrates the almost linear relationship between the self oscillation frequency and the bias voltage of the pmos shown in Figure 4.23. Going back to (4.9), if there is some additive noise on the supply or bias voltages, we have:

$$f_{\rm osc} \propto \beta \left( V_{\rm ov} - V_{\rm noise} \right) \tag{4.12}$$

so if the overdrive is small, the behavior of the circuit is very sensitive to the considered noise. The best way to reduce this sensitivity is to maximize and keep constant  $V_{\rm GS}$  of the pmos transistors while modulating  $\beta$ . This can be easily achieved by the digital biasing approach shown in Figure 4.28.
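The dependence on the overdrive can be made explicit by normalizing (4.12) to the noiseless frequency of (4.9). This is a one-line sketch that assumes $\beta$ is unaffected by the noise:

$$\frac{\Delta f_{\rm osc}}{f_{\rm osc}} = \frac{\beta \left( V_{\rm ov} - V_{\rm noise} \right) - \beta V_{\rm ov}}{\beta V_{\rm ov}} = -\frac{V_{\rm noise}}{V_{\rm ov}}$$

As a purely illustrative example with hypothetical values, 10 mV of noise shifts the frequency by 10% when $V_{\rm ov} = 100$ mV, but only by 2% when $V_{\rm ov} = 500$ mV. Changing $\beta$ instead of $V_{\rm ov}$ tunes $f_{\rm osc}$ without altering this ratio.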

To prove the concept, two new prototypes of dividers by 2 and 4 have been realized in 32 nm CMOS provided by STMicroelectronics. The structure of the modified latch, together with the dimensions of the transistors, is shown in Figure 4.29. The architecture of the divider by 4 is identical to the one depicted in Figure 4.17, while the divider by 2 is slightly different. The cascade of two latches depicted in Figure 4.30 (b) implements a divider by two, but since each latch has only one dominant pole, the loop cannot satisfy the Barkhausen criterion. Without self-oscillation, the divider cannot be locked with a small input signal, and a very large input signal is required. To overcome this problem, a third latch with a bias at the input and acting as a buffer has been



Figure 4.28: Analog versus digital tuning of the pmos transistors.

|    | 001  | 010  | 011  | 100  | 101  | 110  | 111  |
|----|------|------|------|------|------|------|------|
| ÷2 | 10.5 | 17.7 | 25.0 | 32.2 | 39.3 | n.o. | n.o. |
| ÷4 | 8.8  | 13.8 | 17.5 | 21.0 | 24.7 | 28.2 | 31.2 |

**Table 4.2:** Self-oscillation frequency of the dividers versus biasing of the pmos devices.

included in the ring to allow the self-oscillation, as depicted in Figure 4.30 (a). The photomicrograph of the test chip is shown in Figure 4.31. The differential input signal is generated, and the input impedance of the measurement setup is driven, in the same way as in the 65 nm work. The measured self-oscillation frequencies are reported in Table 4.2, while the measured sensitivity curves are depicted in Figure 4.32. While a direct comparison between the dividers with analog and digital tuning is not possible (they are made in two different technology nodes), the results shown in Figure 4.32 clearly demonstrate the effectiveness of the approach: the locking range is about 60% in each sub-band for each divider. Both dividers cover an input frequency range of 50 GHz in only three sub-bands.
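As a rough reading aid for Table 4.2, the sketch below maps each digital code to the input frequency around which the divider would be expected to lock. It takes that frequency as the division ratio times the self-oscillation frequency, which is an assumption consistent with the input referral of $K_{\rm VCO}$ used earlier. It also assumes that the table entries are in GHz, since the table does not state the unit. Entries marked "n.o." are left out.

```python
# Self-oscillation frequencies from Table 4.2, keyed by division ratio and code.
# The unit (GHz) is an assumption; "n.o." entries are omitted.
SELF_OSC_GHZ = {
    2: {"001": 10.5, "010": 17.7, "011": 25.0, "100": 32.2, "101": 39.3},
    4: {"001": 8.8, "010": 13.8, "011": 17.5, "100": 21.0,
        "101": 24.7, "110": 28.2, "111": 31.2},
}


def input_centre_frequencies(division_ratio):
    """Estimated input frequency at the center of each sub-band."""
    table = SELF_OSC_GHZ[division_ratio]
    return {code: round(f * division_ratio, 1) for code, f in table.items()}


def code_steps(division_ratio):
    """Self-oscillation frequency step between consecutive codes."""
    freqs = list(SELF_OSC_GHZ[division_ratio].values())
    return [round(b - a, 1) for a, b in zip(freqs, freqs[1:])]


if __name__ == "__main__":
    for ratio in (2, 4):
        print(ratio, input_centre_frequencies(ratio), code_steps(ratio))
```

For the divider by 2 the steps between codes are nearly constant (about 7.2 per code), as expected when $\beta$ is changed in equal increments at constant overdrive.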

### 4.3 Conclusions

Two architectures for low-noise quadrature generation and high-speed frequency division have been presented in this chapter. The quadrature voltage-controlled oscillator (VCO) relies on a ring of two tuned VCOs, where the oscillation frequency depends on inter-stage passive components only, demonstrating low noise and accurate quadrature phases. The frequency divider architecture is based on clocked differential amplifiers, working as dynamic CML latches, achieving high speed and low power simultaneously.

Figure 4.29: Schematic of the modified latch with digital tuning.




**Figure 4.30:** Two architectures of divider by two. (a) not self-oscillating, (b) self-oscillating.



Figure 4.31: Chip microphotograph.



**Figure 4.32:** Measured sensitivity curves and power consumption of the dividers by 2 (grey curves) and 4 (black curves) in 32 nm CMOS.

## **General conclusions**

This Ph.D. thesis addresses the development and implementation of novel high-performance building blocks for millimeter-wave phased-array receivers. A single-stage merged LNA and mixer is proposed to minimize area and power consumption, the two most critical aspects of the RF branches, which are replicated for each element of the array. Through careful design and optimization, gain and noise figure comparable with state-of-the-art LNAs have been obtained. A quadrature VCO based on inter-stage passive coupling has been designed to greatly reduce the conversion of flicker noise into phase noise. The QVCO is suitable for both direct-conversion and heterodyne receiver architectures, and demonstrates low noise and accurate quadrature phases. A frequency divider architecture based on clocked differential amplifiers is introduced to achieve high operating speed, wide tunability and low power consumption. Prototypes of all the blocks have been realized in nm-scaled bulk CMOS technologies, and measurements demonstrate the validity of the proposed ideas.

# Appendix A

# **Phase rotators**

The unit circle is a subset of a two-dimensional Cartesian space; therefore two orthogonal vectors constitute a complete basis for it.

$$\mathbf{u}_{\theta} = \mathbf{u}_0 \cos \theta + \mathbf{u}_{90} \sin \theta \qquad \forall \theta \in [0, 2\pi) \tag{A.1}$$



Figure A.1: The unit circle

A valid architecture for phase interpolation has already been presented in [22], and it is reported in Figure A.2 for quick reference (the bias voltages are omitted). Each differential pair works as an analog multiplier, and the circuit directly implements (A.1) in a differential fashion. Double-balanced mixers are essential to avoid the LO feed-through, which introduces an equivalent amplitude error in the two quadrature components.
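The interpolation in (A.1) can be checked numerically with a minimal sketch. It models the quadrature LO components as ideal $\cos$ and $\sin$ samples, which is an assumption, and it does not model the circuit of Figure A.2. The function names are hypothetical.

```python
import math


def interpolation_weights(theta_rad):
    """Weights applied to the 0-degree and 90-degree components, as in (A.1)."""
    return math.cos(theta_rad), math.sin(theta_rad)


def interpolated_sample(lo_phase_rad, theta_rad):
    """Weighted sum of ideal in-phase and quadrature LO samples."""
    w_i, w_q = interpolation_weights(theta_rad)
    i_component = math.cos(lo_phase_rad)  # 0-degree LO
    q_component = math.sin(lo_phase_rad)  # 90-degree LO
    return w_i * i_component + w_q * q_component


if __name__ == "__main__":
    theta = math.radians(30.0)
    for lo_phase_deg in (0.0, 45.0, 90.0):
        phase = math.radians(lo_phase_deg)
        # The weighted sum equals the LO shifted by theta: cos(phase - theta).
        print(interpolated_sample(phase, theta), math.cos(phase - theta))
```

With ideal weights the output amplitude stays at unity for every $\theta$. An error in either weight changes both amplitude and phase, which is why the amplitude error caused by LO feed-through matters.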



Figure A.2: Basic phase interpolator architecture

## Bibliography

- [1] G. E. Moore, "Cramming more components onto integrated circuits, Reprinted from Electronics, volume 38, number 8, April 19, 1965, pp.114 ff.," *Solid-State Circuits Newsletter, IEEE*, vol. 20, no. 3, pp. 33 –35, Sept. 2006.
- [2] "Transistor count and moore's law 2011.svg Wikipedia, the free encyclopedia," 2011, [accessed 09-August-2011]. [Online]. Available: http://en.wikipedia.org/wiki/File:Transistor\_Count\_and\_Moore%27s\_Law\_-\_2011.svg
- [3] [Online]. Available: http://www.itrs.net/reports.html
- [4] A. Abidi, "RF CMOS comes of age," Solid-State Circuits, IEEE Journal of, vol. 39, no. 4, pp. 549 – 561, Apr. 2004.
- [5] B. Kleveland, C. Diaz, D. Vook, L. Madden, T. Lee, and S. Wong, "Exploiting CMOS reverse interconnect scaling in multigigahertz amplifier and oscillator design," *Solid-State Circuits, IEEE Journal of*, vol. 36, no. 10, pp. 1480 –1488, oct 2001.
- [6] P. Smulders, "Exploiting the 60 GHz band for local wireless multimedia access: prospects and future directions," *Communications Magazine, IEEE*, vol. 40, no. 1, pp. 140 –147, Jan. 2002.
- [7] X. Guan and A. Hajimiri, "A 24-GHz CMOS front-end," Solid-State Circuits, IEEE Journal of, vol. 39, no. 2, pp. 368 – 373, feb. 2004.
- [8] L. Franca-Neto, R. Bishop, and B. Bloechel, "64 GHz and 100 GHz VCOs in 90 nm CMOS using optimum pumping method," in *Solid-State Circuits Conference, 2004. Digest of Technical Papers. ISSCC. 2004 IEEE International*, feb. 2004, pp. 444 – 538 Vol.1.
- [9] K.-W. Yu, Y.-L. Lu, D.-C. Chang, V. Liang, and M. Chang, "K-band low-noise amplifiers using 0.18μm CMOS technology," *Microwave and Wireless Components Letters, IEEE*, vol. 14, no. 3, pp. 106 – 108, march 2004.

- [10] C. Doan, S. Emami, A. Niknejad, and R. Brodersen, "Millimeter-wave CMOS design," *Solid-State Circuits, IEEE Journal of*, vol. 40, no. 1, pp. 144 – 155, jan. 2005.
- [11] T. Yao, M. Gordon, K. Tang, K. Yau, M.-T. Yang, P. Schvan, and S. Voinigescu, "Algorithmic Design of CMOS LNAs and PAs for 60-GHz Radio," *Solid-State Circuits, IEEE Journal of*, vol. 42, no. 5, pp. 1044 – 1057, may 2007.
- [12] B. Razavi, "A Millimeter-Wave CMOS Heterodyne Receiver With On-Chip LO and Divider," *Solid-State Circuits, IEEE Journal of*, vol. 43, no. 2, pp. 477 –485, feb. 2008.
- [13] C. Liang and B. Razavi, "Systematic Transistor and Inductor Modeling for Millimeter-Wave Design," *Solid-State Circuits, IEEE Journal of*, vol. 44, no. 2, pp. 450 –457, feb. 2009.
- [14] Y. Jin, M. Sanduleanu, and J. Long, "A wideband millimeter-wave power amplifier with 20 dB linear power gain and +8 dBm maximum saturated output power," *Solid-State Circuits, IEEE Journal of*, vol. 43, no. 7, pp. 1553-1562, july 2008.
- [15] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379–423, July 1948.
- [16] A. Mazzanti, M. Sosio, M. Repossi, and F. Svelto, "A 24 GHz subharmonic direct conversion receiver in 65 nm CMOS," *Circuits and Systems I: Regular Papers, IEEE Transactions on*, vol. 58, no. 1, pp. 88–97, jan. 2011.
- [17] J. Lee, Y. A. Li, M. H. Hung, and S. J. Huang, "A fully-integrated 77-GHz FMCW radar transceiver in 65-nm CMOS technology," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 12, pp. 2746 –2756, Dec. 2010.
- [18] [Online]. Available: http://ieee802.org/15/pub/TG3c.html
- [19] X. Guan, H. Hashemi, and A. Hajimiri, "A fully integrated 24-GHz eight-element phased-array receiver in silicon," *Solid-State Circuits, IEEE Journal of*, vol. 39, no. 12, pp. 2311 – 2320, dec. 2004.
- [20] A. Natarajan, A. Komijani, and A. Hajimiri, "A fully integrated 24-GHz phased-array transmitter in CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 40, no. 12, pp. 2502 – 2514, dec. 2005.


- [21] A. Babakhani, X. Guan, A. Komijani, A. Natarajan, and A. Hajimiri, "A 77-GHz phased-array transceiver with on-chip antennas in silicon: receiver and antennas," *Solid-State Circuits, IEEE Journal of*, vol. 41, no. 12, pp. 2795 –2806, dec. 2006.
- [22] A. Natarajan, A. Komijani, X. Guan, A. Babakhani, and A. Hajimiri, "A 77-GHz phased-array transceiver with on-chip antennas in silicon: transmitter and local LO-path phase shifting," *Solid-State Circuits, IEEE Journal of*, vol. 41, no. 12, pp. 2807 –2819, dec. 2006.
- [23] H. Krishnaswamy and H. Hashemi, "A variable-phase ring oscillator and PLL architecture for integrated phased array transceivers," *Solid-State Circuits, IEEE Journal of*, vol. 43, no. 11, pp. 2446 –2463, nov. 2008.
- [24] K. Scheir, S. Bronckers, J. Borremans, P. Wambacq, and Y. Rolain, "A 52 GHz phased-array receiver front-end in 90 nm digital CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 43, no. 12, pp. 2651 –2659, dec. 2008.
- [25] Y. Yu, P. Baltus, A. de Graauw, E. van der Heijden, C. Vaucher, and A. van Roermund, "A 60 GHz phase shifter integrated with LNA and PA in 65 nm CMOS for phased array systems," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 9, pp. 1697 –1709, sept. 2010.
- [26] W. Chan and J. Long, "A 60-GHz band 2x2 phased-array transmitter in 65-nm CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 12, pp. 2682 –2695, dec. 2010.
- [27] A. Valdes-Garcia, S. Nicolson, J.-W. Lai, A. Natarajan, P.-Y. Chen, S. Reynolds, J.-H. C. Zhan, D. Kam, D. Liu, and B. Floyd, "A fully integrated 16-element phased-array transmitter in SiGe BiCMOS for 60-GHz communications," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 12, pp. 2757 –2773, dec. 2010.
- [28] A. Natarajan, S. Reynolds, M.-D. Tsai, S. Nicolson, J.-H. Zhan, D. G. Kam, D. Liu, Y.-L. Huang, A. Valdes-Garcia, and B. Floyd, "A fully-integrated 16-element phased-array receiver in SiGe BiCMOS for 60-GHz communications," *Solid-State Circuits, IEEE Journal of*, vol. 46, no. 5, pp. 1059–1075, may 2011.
- [29] S. Emami, R. Wiser, E. Ali, M. Forbes, M. Gordon, X. Guan, S. Lo, P. McElwee, J. Parker, J. Tani, J. Gilbert, and C. Doan, "A 60GHz CMOS phased-array transceiver pair for multi-Gb/s wireless communications," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International*, feb. 2011, pp. 164 –166.

- [30] M. Tabesh, J. Chen, C. Marcu, L. Kong, S. Kang, E. Alon, and A. Niknejad, "A 65nm CMOS 4-element Sub-34mW/element 60GHz phased-array transceiver," in *Solid-State Circuits Conference Digest of Technical Papers* (ISSCC), 2011 IEEE International, feb. 2011, pp. 166 –168.
- [31] C. Doan, S. Emami, J. Marshall, C. Shung, T. Williams, R. Brodersen, J. Gilbert, and A. Poon, "Wireless communication device using adaptive beamforming," Patent US 7 904 117, March 8, 2008.
- [32] F. Vecchi, S. Bozzola, E. Temporiti, D. Guermandi, M. Pozzoni, M. Repossi, M. Cusmai, U. Decanis, A. Mazzanti, and F. Svelto, "A Wideband Receiver for Multi-Gbit/s Communications in 65 nm CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 46, no. 3, pp. 551–561, March 2011.
- [33] S. Pellerano, Y. Palaskas, and K. Soumyanath, "A 64 GHz LNA with 15.5 dB Gain and 6.5 dB NF in 90 nm CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 43, no. 7, pp. 1542 –1552, July 2008.
- [34] C. Weyers, P. Mayr, J. Kunze, and U. Langmann, "A 22.3dB Voltage Gain 6.1dB NF 60GHz LNA in 65nm CMOS with Differential Output," in *Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International*, Feb. 2008, pp. 192–606.
- [35] B. Heydari, M. Bohsali, E. Adabi, and A. Niknejad, "Millimeter-Wave Devices and Circuit Blocks up to 104 GHz in 90 nm CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 42, no. 12, pp. 2893 –2903, Dec. 2007.
- [36] T. O. Dickson, K. H. K. Yau, T. Chalvatzis, A. M. Mangan, E. Laskin, R. Beerkens, P. Westergaard, M. Tazlauanu, M.-T. Yang, and S. P. Voinigescu, "The Invariance of Characteristic Current Densities in Nanoscale MOSFETs and Its Impact on Algorithmic Design Methodologies and Design Porting of Si(Ge) (Bi)CMOS High-Speed Building Blocks," *Solid-State Circuits, IEEE Journal of*, vol. 41, no. 8, pp. 1830–1845, Aug. 2006.
- [37] H. Darabi and A. Abidi, "Noise in RF-CMOS Mixers: A Simple Physical Model," *Solid-State Circuits, IEEE Journal of*, vol. 35, no. 1, pp. 15–25, Jan. 2000.
- [38] H. Sjöland, A. Karimi-Sanjaani, and A. A. Abidi, "A Merged CMOS LNA and Mixer for a WCDMA Receiver," *Solid-State Circuits, IEEE Journal of*, vol. 38, no. 6, pp. 1045–1050, June 2003.
- [39] C. Marcu, D. Chowdhury, C. Thakkar, J.-D. Park, L.-K. Kong, M. Tabesh, Y. Wang, B. Afshar, A. Gupta, A. Arbabian, S. Gambini, R. Zamani, E. Alon, and A. Niknejad, "A 90 nm CMOS Low-Power 60 GHz Transceiver With Integrated Baseband Circuitry," *Solid-State Circuits, IEEE Journal of*, vol. 44, no. 12, pp. 3434 –3447, Dec. 2009.

- [40] A. Mazzanti and P. Andreani, "A Time-Variant Analysis of Fundamental Phase Noise in CMOS Parallel -Tank Quadrature Oscillators," *Circuits and Systems I: Regular Papers, IEEE Transactions on*, vol. 56, no. 10, pp. 2173 –2180, oct. 2009.
- [41] A. Mazzanti, F. Svelto, and P. Andreani, "On the amplitude and phase errors of quadrature LC-tank CMOS oscillators," *Solid-State Circuits, IEEE Journal of*, vol. 41, no. 6, pp. 1305 – 1313, june 2006.
- [42] K. Scheir, G. Vandersteen, Y. Rolain, and P. Wambacq, "A 57-to-66GHz quadrature PLL in 45nm digital CMOS," in *Solid-State Circuits Conference Digest of Technical Papers, 2009. ISSCC 2009. IEEE International*, feb. 2009, pp. 494 –495,495a.
- [43] A. ElSayed and M. Elmasry, "Low-phase-noise LC quadrature VCO using coupled tank resonators in a ring structure," *Solid-State Circuits, IEEE Journal of*, vol. 36, no. 4, pp. 701 –705, Apr. 2001.
- [44] E. Laskin, M. Khanpour, S. Nicolson, A. Tomkins, P. Garcia, A. Cathelin, D. Belot, and S. Voinigescu, "Nanoscale CMOS Transceiver Design in the 90–170-GHz Range," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 57, no. 12, pp. 3477–3490, Dec. 2009.
- [45] P. Mayr, C. Weyers, and U. Langmann, "A 90GHz 65nm CMOS Injection-Locked Frequency Divider," in *Solid-State Circuits Conference, 2007. ISSCC 2007. Digest of Technical Papers. IEEE International*, feb. 2007, pp. 198 –596.
- [46] K.-H. Tsai and S.-I. Liu, "A 43.7mW 96GHz PLL in 65nm CMOS," in Solid-State Circuits Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International, feb. 2009, pp. 276 –277,277a.
- [47] H.-K. Chen, T. Wang, and S.-S. Lu, "A Millimeter-Wave CMOS Triple-Band Phase-Locked Loop With A Multimode LC-Based ILFD," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 59, no. 5, pp. 1327 –1338, May 2011.
- [48] H.-K. Chen, H.-J. Chen, D.-C. Chang, Y.-Z. Juang, Y.-C. Yang, and S.-S. Lu, "A mm-wave CMOS multimode frequency divider," in *Solid-State Circuits Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International*, Feb. 2009, pp. 280–281,281a.

- [49] A. Bonfanti, A. Tedesco, C. Samori, and A. Lacaita, "A 15-GHz broadband ÷2 frequency divider in 0.13-μm CMOS for quadrature generation," *Microwave and Wireless Components Letters, IEEE*, vol. 15, no. 11, pp. 724 – 726, Nov. 2005.
- [50] S. Cheng, H. Tong, J. Silva-Martinez, and A. Karsilayan, "A Fully Differential Low-Power Divide-by-8 Injection-Locked Frequency Divider Up to 18 GHz," *Solid-State Circuits, IEEE Journal of*, vol. 42, no. 3, pp. 583 –591, March 2007.
- [51] S. Toso, A. Bevilacqua, M. Tiebout, N. Da Dalt, A. Gerosa, and A. Neviani, "A 0.06 mm<sup>2</sup> 11 mW Local Oscillator for the GSM Standard in 65 nm CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 7, pp. 1295 –1304, July 2010.
- [52] S. Pellerano, R. Mukhopadhyay, A. Ravi, J. Laskar, and Y. Palaskas, "A 39.1-to-41.6GHz ΔΣ Fractional-N Frequency Synthesizer in 90nm CMOS," in Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International, Feb. 2008, pp. 484–630.
- [53] K. Yamamoto and M. Fujishima, "70GHz CMOS Harmonic Injection-Locked Divider," in Solid-State Circuits Conference, 2006. ISSCC 2006. Digest of Technical Papers. IEEE International, Feb. 2006, pp. 2472 –2481.