

## UNIVERSITÀ DEGLI STUDI DI PAVIA Dipartimento di Ingegneria Industriale e dell'Informazione

# Insights into Wideband PAs for High Speed mm-Wave Transceivers

by: Junlei Zhao Cycle: XXVII

Supervisor: Prof. Francesco Svelto

A dissertation submitted in partial satisfaction of the requirements for the degree of

Doctor of Philosophy

2014

To my parents and my sister To Angela

There is only one heroism in the world: to see the world as it is, and to love it. — Romain Rolland

## Acknowledgements

Time passes fast! I can still clearly remember the first day I came to Italy, and now my PhD is coming to an end. During the past three years, I have suffered and enjoyed, hated and loved. To all the people I have met and all the experiences I have had, I want to say: Thanks! You make me love life more!

First, thanks to Frank for giving me the opportunity to come to Pavia and work with so many talented and hard-working people.

Thanks to Mazza for his attitude to work and his way of analyzing problems.

Thanks to Ghilinos and Matteo for helping me with IC design, software usage and thesis writing.

Thanks to Dan, Kambiz, Fabrizio, Lorenzo and the other people in our lab; working with you is really enjoyable.

Thanks to Zhi Chong, Yao, Tie and LingLing; you give me the feeling of home.

Thanks to Angela, Evis, Sanaz, Camilla, Gozden, Hanna, Timoteo, Murathan and the other friends in Biomedica. You made these three years easier.

Thanks to my parents and sister; your love and support encouraged me to travel so far.

## Abstract

The last decade has witnessed exponential growth in the demand for high-speed mobile data capacity, and the trend is expected to continue in the coming years. However, because of the limited channel bandwidth below 10 GHz, traditional wireless communication systems can hardly satisfy the increasing demand with reasonable system complexity and power consumption. The millimeter-wave band, on the other hand, offers enormous bandwidth and therefore provides the opportunity to achieve multi-Gbps communication with simple modulation schemes and low power consumption. Advances in CMOS technology make it possible to significantly improve system integration and minimize cost. However, realizing mm-wave transceivers in CMOS technology still faces many challenges, such as low supply voltage and high loss in passive components.

In this regard, this work focuses on the design of power amplifiers (PAs), as they are the most power-hungry blocks of RF transceivers. In particular, wideband design techniques are studied to take full advantage of the large bandwidth available at mm-wave frequencies. A design methodology for wideband, compact PA matching networks is proposed to achieve high gain over a large frequency range while minimizing insertion loss. Furthermore, a novel power splitter is introduced to suppress the potential oscillation problem in traditional non-isolated power-combining PAs. Two prototypes have been realized in advanced CMOS technologies, demonstrating state-of-the-art performance.

As a second major contribution, a wideband mm-wave OOK transceiver has been realized in 28 nm CMOS technology. Optimizations are performed from the architecture level down to the transistor level to minimize power consumption. The implemented transceiver achieves error-free transmission up to 5 Gbps over 13 cm while consuming only 130 mW, demonstrating the feasibility of realizing high-speed, low-power links in bulk CMOS technology.

# Contents

| Section | Title |
|---------|-------|
|  | Acknowledgements |
|  | Abstract |
|  | List of figures |
|  | List of tables |
| 1 | Introduction |
| 1.1 | Mm-Wave Applications |
| 1.2 | Opportunities and Challenges of CMOS |
| 1.3 | Organization of this Thesis |
| 2 | Millimeter Wave Transceivers for Wireless Communication |
| 2.1 | Data Communication Basics |
| 2.2 | Review of mm-Wave Transceivers |
| 2.3 | … |
| 2.4 | … |
| 3 | A 4… |

| Section | Title |
|---------|-------|
| 3.2 | System Block Diagram |
| 3.3 | Matching Networks Design |
| 3.3.1 | Inductively Coupled Resonator |
| 3.3.2 | Impedance Transformation Techniques |
| 3.3.3 | Output Matching Network |
| 3.3.4 | Interstage Matching Network |
| 3.4 | Active Stages Design |
| 3.4.1 | Capacitive Neutralization |
| 3.4.2 | Output Stage |
| 3.4.3 | Input Stage |
| 3.5 | Complete Circuit |
| 3.6 | Measurements |
| 3.6.1 | Measurement Setup |
| 3.6.2 | Measurement Results |
| 3.6.3 | Comparisons |
| 3.7 | Conclusions |
| 4 | A Two-Path Power Combining CMOS Power Amplifier |
| 4.1 | Output Power Limitations |
| 4.2 | On-Chip Passive Power Combiner and Splitter |
| 4.2.1 | Isolated Power Combiners |
| 4.2.2 | Non-Isolated Power Combiners |
| 4.2.3 | Stability Issue of Traditional Non-Isolated Power Combiners and Power Splitter |
| 4.2.4 | Proposed Power Splitter |
| 4.3 | Circuit Design |
| 4.3.1 | Active Stages |
| 4.3.2 | Matching Networks |
| 4.4 | Implementation and Measurements |


| Section | Title |
|---------|-------|
| 4.4.2 | Measurement Results |
| 4.4.3 | Comparison |
| 4.5 | Conclusions |
| 5 | Transceiver Building Blocks |
| 5.1 | Transmitter |
| 5.1.1 | Voltage Controlled Oscillator |
| 5.1.2 | Power Amplifier |
| 5.1.3 | Antenna |
| 5.2 | Receiver |
| 5.2.1 | Low Noise Amplifier |
| 5.2.2 | Envelope Detector |
| 5.2.3 | Limiting Amplifier and Output Buffer |
| 5.3 | Measurement |
| 5.3.1 | Measurement Setup |
| 5.3.2 | Measurement Results |
| 5.3.3 | Comparisons |
| 5.4 | Conclusion |
| 6 | Conclusion |

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1  | E-band backhaul network |
| 1.2  | 60 GHz band applications |
| 1.3  | Automotive radar |
| 1.4  | $f_T$ and $f_{MAX}$ of CMOS technologies |
| 1.5  | PA performance roadmap |
| 1.6  | Trade-offs in PA design |
| 2.1  | BER as a function of $E_b/N_0$ |
| 2.2  | BER as a function of SNR |
| 2.3  | An OOK transceiver |
| 2.4  | A BPSK transceiver |
| 2.5  | Transceiver architectures |
| 2.6  | Power efficient transceiver |
| 2.7  | Cartesian transmitter with quadrature spatial combining |
| 2.8  | Different realizations of OOK modulation |
| 2.9  | OOK receiver architecture |
| 2.10 | A generalized wireless link |
| 2.11 | OOK power spectral density |
| 2.12 | Equivalent model of the receiver with noise sources |
| 2.13 | Communication distance as a function of transmitter output power |
| 2.14 | Communication distance as a function of LNA gain |
| 2.15 | Communication distance as a function of LNA noise figure |


| 3.1  | Block diagram of a two-stage PA                                                         | 28 |
|------|-----------------------------------------------------------------------------------------|----|
| 3.2  | System block diagram of a one-path PA                                                   | 30 |
| 3.3  | Inductively coupled resonator                                                           | 30 |
| 3.4  | Transimpedance of coupled resonator                                                     | 32 |
| 3.5  | The effects of $P$ and $Q_{Ind}$ on transimpedance                                      | 33 |
| 3.6  | The input impedance of coupled resonator                                                | 34 |
| 3.7  | Impedance transformation techniques                                                     | 35 |
| 3.8  | Output matching network                                                                 | 37 |
| 3.9  | Loss of output matching network                                                         | 39 |
| 3.10 | Impedance and efficiency of output matching network                                     | 41 |
| 3.11 | Transimpedance of output matching network                                               | 42 |
| 3.12 | Coupled resonator as interstage matching network                                        | 42 |
| 3.13 | Interstage matching network                                                             | 44 |
| 3.14 | Schematic of a differential neutralized amplifier                                       | 47 |
| 3.15 | Power gain of 3 kinds of amplifier                                                      | 48 |
| 3.16 | Q factors of input and output impedances with and without neutralization                | 48 |
| 3.17 | Output stage and the driver of modulating signal                                        | 49 |
| 3.18 | Schematic of input stage                                                                | 51 |
| 3.19 | Input matching network                                                                  | 52 |
| 3.20 | Input matching layout                                                                   | 52 |
| 3.21 | Complete schematic of the PA                                                            | 53 |
| 3.22 | Micrograph of the chip                                                                  | 54 |
| 3.23 | Measurement setup for S-parameters                                                      | 55 |
| 3.24 | Measurement setup for large signal performance                                          | 56 |
| 3.25 | Measured S-parameters                                                                   | 57 |
| 3.26 | K-factor and group delay                                                                | 57 |
| 3.27 | Performance vs. input power                                                             | 58 |
| 3.28 | Performance vs. frequency                                                               | 58 |


| 4.1  | A general PA diagram                                                               | 62 |
|------|------------------------------------------------------------------------------------|----|
| 4.2  | Schematic of a stacking PA                                                         | 63 |
| 4.3  | General block diagram of a power combining PA                                      | 64 |
| 4.4  | Schematic of Wilkinson combiner and quadrature coupler                             | 66 |
| 4.5  | Schematic of transmission line based combiner and transformer based combiner       | 67 |
| 4.6  | Simplified diagram of a one-path PA and a traditional two-path power combining PA  | 69 |
| 4.7  | The effects of neutralization on feedback                                          | 70 |
| 4.8  | Simplified input matching network of a one-path PA and a traditional two-path PA   | 71 |
| 4.9  | Simplified schematic of proposed power splitter                                    | 72 |
| 4.10 | Block diagram of three-stage two-path power combining PA                           | 73 |
| 4.11 | Schematic of an active stage                                                       | 74 |
| 4.12 | Schematic of matching networks                                                     | 75 |
| 4.13 | Equivalent schematic of the matching network between input stage and driving stage | 76 |
| 4.14 | Simplified layout from input stage to power combiner                               | 78 |
| 4.15 | Chip photo of power combining PA                                                   | 78 |
| 4.16 | Measured S-parameters                                                              | 79 |
| 4.17 | Large signal performances vs. input power                                          | 80 |
| 4.18 | Large signal performances vs. frequency                                            | 80 |
| 5.1  | Block diagram of the OOK transmitter                                               | 84 |
| 5.2  | Schematic of the proposed VCO                                                      | 85 |
| 5.3  | The output stage and the inverter-based driver                                     | 86 |
| 5.4  | Two kinds of switching PA                                                          | 88 |
| 5.5  | The driving stage and its input and output matching networks                       | 88 |
| 5.6  | GSG connection and GSSG connection                                                 | 90 |
| 5.7  | The planar monopole antenna and the bonding wires                                  | 90 |
| 5.8  | The input matching of the antenna and the bonding wires                            | 91 |


| 5.9  | The radiation pattern of the antenna at 50 GHz    | 91  |
|------|---------------------------------------------------|-----|
| 5.10 | Block diagram of the OOK receiver                 |     |
| 5.11 | Interstage matching network of the LNA            |     |
| 5.12 | Schematic of the current-reuse LNA                |     |
| 5.13 | Three topologies of squarer                       |     |
| 5.14 | Equivalent noise model of envelope detector       |     |
| 5.15 | Schematic of the ED                               |     |
| 5.16 | Schematics of the LA                              |     |
| 5.17 | Schematic of the buffer                           |     |
| 5.18 | Micrographs of the transceiver                    |     |
| 5.19 | Connection between the transmitter and the board  | 102 |
| 5.20 | Measurement setup of the OOK transceiver          |     |
| 5.21 | Eye Diagram at the output of the receiver         |     |
| 5.22 | BER performance of the link                       |     |

# List of Tables

| 2.1 | Specifications of the proposed OOK transceiver                | 25  |
|-----|---------------------------------------------------------------|-----|
| 3.1 | Specifications of the PA                                      | 29  |
| 3.2 | Comparison table of mm-wave CMOS PAs without power combining  | 59  |
| 4.1 | Comparison table of mm-wave CMOS PAs with power combining     | 81  |
| 5.1 | Summary of the VCO                                            | 85  |
| 5.2 | Summary of the PA                                             | 89  |
| 5.3 | Summary of the antenna                                        | 91  |
| 5.4 | Summary of the LNA                                            | 94  |
| 5.5 | Summary of the ED                                             | 98  |
| 5.6 | Summary of the LA and output buffer                           | 101 |
| 5.7 | Comparison table of mm-wave links                             | 105 |

## Chapter 1

## Introduction

With the widespread use of smartphones and tablets, the demand for wireless data capacity has increased exponentially in the last decade [1]. This growing demand poses great challenges for traditional wireless transceivers, whose operating frequencies are relatively low and whose bandwidths are limited by the spectrum crunch. To achieve high data rates, complex modulations are used to improve spectral efficiency at the cost of system complexity and power consumption. To circumvent this problem, the mm-wave frequency band, which offers a large spectrum bandwidth, has recently been exploited to achieve multi-gigabit wireless communication with simple modulations [2, 3].

Historically, mm-wave electronics have been implemented in III-V compound semiconductor technologies [4]. Nowadays, complementary metal oxide semiconductor (CMOS) circuits are also capable of operating at frequencies beyond 300 GHz [5], which makes it possible to design low-cost, high-efficiency and highly integrated transceivers. However, due to low power gain and low transistor breakdown voltage, designing high-performance mm-wave circuits, especially power amplifiers (PAs), is still challenging in CMOS technologies [6]. The two major difficulties in designing CMOS mm-wave transceivers are achieving large output power to extend the link span and obtaining wide bandwidth to increase data capacity.

This thesis explores the feasibility of realizing high-speed, low-power wireless transceivers in bulk CMOS technology. A 10 Gbps on-off keying (OOK) transceiver has been realized to take full advantage of the enormous bandwidth available in the mm-wave frequency band. It demonstrates the capability to achieve a high data rate with minimal system complexity and power consumption. Special attention is given to the PA, since it is the most challenging and power-hungry block of a transceiver. Two wideband CMOS PAs have been implemented, in which novel design techniques are proposed to achieve large output power and high efficiency over a wide frequency range.

### 1.1 Mm-Wave Applications

The global mm-wave market is growing rapidly in many segments and applications, with market revenue expected to double by 2020 [7]. New applications are being devised for this technology, which will further propel the market over the coming five to seven years. Revenue for the global millimeter wave technology market is estimated to reach \$208.12 million by the end of 2014 and is expected to exceed \$1.9 billion in 2020, at a compound annual growth rate (CAGR) of 45% [7].

One of the major applications of mm-wave technology is wireless communication. Thanks to the ultra-wide bandwidth, mm-wave wireless links can achieve capacities of more than 10 Gbps [8], which is unlikely to be matched by any lower-frequency RF wireless technology. In fact, the 10 GHz of bandwidth available in the E-band is more than the sum of all other licensed spectrum for wireless communication. In addition, mm-wave transmissions form very narrow beams, allowing multiple independent links to be deployed in close proximity [9]. As a result, the E-band, which has negligible atmospheric attenuation, is beginning to be used for wireless backhaul to transfer data over a few kilometers, as shown in Fig. 1.1.



Figure 1.1: E-band backhaul network [10]

Besides the E-band, the unlicensed 60 GHz band can also provide multi-gigabit throughput. In contrast with the E-band, the 60 GHz band has significant atmospheric attenuation (larger than 10 dB/km). This atmospheric attenuation, along with the high free-space propagation loss, limits undesired propagation over long distances, helps to minimize inter-system interference and allows the same frequency band to be reused. Therefore, as shown in Fig. 1.2, the 60 GHz band is well suited for indoor applications that transfer lightly compressed or uncompressed high-definition video, audio and data signals.



Figure 1.2: 60 GHz band applications [11]

In addition to high-speed wireless communication, mm-wave technology can also be used in automotive radars and imaging sensors. The 77/79 GHz automotive radars can perform adaptive cruise control, collision warning, blind-spot detection and braking intervention, improving the comfort and safety of driving, as shown in Fig. 1.3. Mm-wave body scanners can detect concealed weapons and are suitable for homeland security. Furthermore, mm-wave imaging can be used for medical diagnosis and treatment, achieving higher spatial resolution than its low-frequency counterparts.

### 1.2 Opportunities and Challenges of CMOS

Due to their superior performance at mm-wave frequencies, III-V compound semiconductor technologies, such as gallium arsenide (GaAs) and indium phosphide (InP), are traditionally used for mm-wave transceivers [4]. With the development of silicon technologies, silicon germanium (SiGe) technology is also starting to be used in mature commercial products [13]. On the other hand, although CMOS technologies may offer both low cost in volume production and RF/baseband co-integration, they face many challenges in the mm-wave region. However, advances in process and design techniques give bulk CMOS technologies the potential to implement high-performance mm-wave transceivers [14, 15].

Figure 1.3: Automotive radar [12]

Thanks to technology scaling, the cut-off frequency $f_T$ and the maximum oscillation frequency $f_{MAX}$ of MOS devices have reached 400 GHz and are expected to double in the coming five years [16]. Fig. 1.4 shows the prediction of $f_T$ and $f_{MAX}$ by the International Technology Roadmap for Semiconductors (ITRS)<sup>1</sup>. These values are comparable to or even higher than those of other technologies, making CMOS technology a good candidate for mm-wave transceivers. In fact, low noise amplifiers (LNAs), the most critical building block of an RF transceiver, implemented in bulk CMOS technologies can achieve roughly the same performance as those implemented in SiGe and III-V technologies [6].



Figure 1.4:  $f_T$  and  $f_{MAX}$  of CMOS technologies

<sup>1</sup>The results between 2014 and 2017 are for planar bulk CMOS technology, while those from 2018 to 2020 are for multi-gate MOSFETs.

On the other hand, due to the scaled dimensions, the breakdown voltage of MOS transistors continually diminishes. In fact, the product of breakdown voltage and $f_T$ of silicon devices is almost constant [17]. As a result, the supply voltage, and thus the power handling capability, is limited in advanced CMOS technologies. Compared to other technologies, where larger supply voltages can be used, MOS transistors must be much larger to achieve the same output power. The large layout results in significant parasitics and degrades both power gain and efficiency [18]. Moreover, the low supply voltage necessitates a small load impedance for high output power, which means a large transformation ratio from the antenna impedance. Matching networks with a large impedance transformation ratio introduce high insertion loss and further degrade the efficiency. Fig. 1.5 shows the figure of merit (FOM) of PAs implemented in different technologies<sup>2</sup>. It can be seen that pure silicon technologies, including bulk CMOS, ultrathin-body (UTB) fully depleted (FD) devices and multi-gate (MG) MOSFETs, have a very poor PA FOM. Therefore, novel circuit topologies and design techniques are required to enhance PA performance. For example, on-chip power combining is widely used to increase the output power [19]. Non-isolated combiners and splitters are usually adopted due to their compact dimensions and low insertion losses. However, the interaction between different PA cells, together with the limited output-to-input isolation at mm-wave frequencies, may cause oscillation problems.
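To make the impedance argument concrete, the following minimal sketch estimates the load resistance required for a given output power and the resulting transformation ratio from a 50 Ω antenna. It assumes an idealized stage whose sinusoidal output voltage amplitude equals $V_{DD}$, so that $P_{out} = V_{DD}^2/(2R_L)$. The supply voltages, the 20 dBm target, the 50 Ω antenna and the function names are illustrative assumptions, not values taken from the designs in this thesis.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000


def required_load(p_out_dbm, v_dd):
    """Load resistance in ohms for an ideal stage with voltage amplitude v_dd.

    Assumption: P_out = v_dd**2 / (2 * R_L), i.e. full-swing sinusoid, no knee voltage.
    """
    return v_dd ** 2 / (2 * dbm_to_watt(p_out_dbm))


def transformation_ratio(p_out_dbm, v_dd, r_antenna=50.0):
    """Ratio between the antenna impedance and the required load resistance."""
    return r_antenna / required_load(p_out_dbm, v_dd)


if __name__ == "__main__":
    for v_dd in (1.0, 2.5):  # hypothetical supply voltages
        r_l = required_load(20.0, v_dd)
        ratio = transformation_ratio(20.0, v_dd)
        print(f"V_DD = {v_dd} V: R_L = {r_l:.2f} ohm, ratio = {ratio:.1f}")
```

Under these assumptions, lowering the supply from 2.5 V to 1 V raises the required transformation ratio from below 2 to about 10, which is the mechanism behind the higher matching loss described above.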



Figure 1.5: PA performance roadmap [6]

<sup>&</sup>lt;sup>2</sup>The FOM takes into account of saturated output power and power gain. For MOS PAs, it is defined as  $\frac{I_{ON}}{4}min\{V_{DD}, \frac{BV_{GD}}{2}\}min\{MSG, \frac{f_{MAX}^2}{f^2}\}0.5f^2$ , where  $I_{ON}$  is the on-state current,  $V_{DD}$  supply voltage,  $BV_{GD}$  gate-drain breakdown voltage, MSG the maximum stable gain, f the operating frequency.

Another detrimental effect of technology scaling is the decreasing thickness of the individual metal layers and of the overall stack height. This increases resistive losses and vertical parasitic capacitances, and therefore impairs the quality factors of on-chip integrated inductors, transformers and capacitors. This effect is particularly harmful for PAs, where large power is handled and any loss significantly degrades the achievable output power and system efficiency.

In addition, as discussed in Sec. 1.1, large bandwidth is required for many mm-wave applications. For example, to cover the entire 60 GHz band and account for process variations, the transceiver needs to be designed with a bandwidth of 12 GHz or more [20]. Traditionally, bandwidth is increased at the cost of lower gain. However, at mm-wave frequencies, gain is scarce and can hardly be traded for bandwidth. Furthermore, cascading multiple stages cannot guarantee a large gain-bandwidth product (GBW), because the gain increase is limited while the bandwidth reduction is severe. On top of that, cascading significantly increases power consumption and therefore degrades the efficiency. Fortunately, high-order networks can enhance the bandwidth without compromising gain or efficiency, and therefore make it possible to achieve high performance over a large frequency range. However, due to the limited quality factors at mm-wave frequencies, the number of passive components should be small and the layout should be compact to minimize the insertion loss.
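The bandwidth penalty of cascading can be illustrated with a textbook approximation. The sketch below assumes $N$ identical, non-interacting single-pole stages (an assumption made here for illustration, not the amplifier model used in this work), for which the overall 3 dB bandwidth shrinks by $\sqrt{2^{1/N}-1}$ while the gain in dB grows only linearly with $N$. The per-stage gain and bandwidth are hypothetical values.

```python
import math


def cascade_bandwidth(bw_stage_ghz, n_stages):
    """3 dB bandwidth of n identical, non-interacting single-pole stages."""
    return bw_stage_ghz * math.sqrt(2 ** (1 / n_stages) - 1)


def cascade_gain_db(gain_stage_db, n_stages):
    """Total gain in dB: the per-stage gains simply add."""
    return gain_stage_db * n_stages


if __name__ == "__main__":
    gain_db, bw_ghz = 5.0, 20.0  # hypothetical per-stage gain and bandwidth
    for n in (1, 2, 3, 4):
        print(f"{n} stage(s): gain = {cascade_gain_db(gain_db, n):4.1f} dB, "
              f"bandwidth = {cascade_bandwidth(bw_ghz, n):4.1f} GHz")
```

With only a few dB of gain available per stage, each added stage buys little gain but removes a large fraction of the bandwidth, which is why higher-order passive networks are the more attractive route.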

To sum up, due to the low supply voltage, the high losses of passive components, and the large required bandwidth, designing mm-wave transceivers in CMOS technology faces many challenges. Specifically, there are various trade-offs between different performance metrics. Fig. 1.6 shows some trade-offs in PA design. The key to realizing high-performance transceivers is to achieve a good balance among all desired metrics. To accomplish this, novel circuit architectures and design techniques are required.



Figure 1.6: Trade-offs in PA design

### 1.3 Organization of this Thesis

To address the aforementioned challenges of mm-wave IC design in CMOS technologies, this thesis focuses on the design of novel passive networks, including wideband matching networks and stable, efficient power combiners/splitters, to enhance the bandwidth and output power of PAs. Two PA prototypes are implemented to verify the proposed techniques. Furthermore, using the proposed techniques, a wideband mm-wave transceiver is realized to explore the feasibility of achieving high-speed, low-power wireless communication in advanced CMOS technology.

Chapter 2 begins with some basic communication theory, followed by a review of state-of-the-art mm-wave transceivers for wireless communication. After comparing different transceiver architectures, a wideband OOK architecture is chosen in this work to achieve a high data rate with minimal complexity and low power consumption. Finally, the link budget is calculated and specifications are chosen for the different blocks.

Chapter 3 proposes a methodology for designing wideband matching networks in PAs. The design technique leverages inductively coupled resonators and impedance transformations to achieve high performance over a broad frequency range. A two-stage prototype has been implemented in ST 28 nm bulk CMOS technology. The PA achieves a bandwidth of 27 GHz, a saturated output power $P_{SAT}$ of 13 dBm and a power-added efficiency (PAE) of 16%.

Chapter 4 discusses different approaches to increasing the output power of PAs. Particular attention is given to on-chip passive power combining and splitting techniques. A novel power splitter is proposed to suppress the oscillation problem of traditional non-isolated power splitters. A three-stage two-path differential PA prototype has been realized in ST 65 nm CMOS. The PA achieves 30 dB power gain, 20 dBm $P_{SAT}$ and 22% peak PAE over an operating band from 58.5 GHz to 73.5 GHz.

Chapter 5 presents the implementation and characterization of a wideband OOK transceiver in ST 28 nm CMOS LP. A modulating PA is adopted to obviate the need for an explicit OOK modulator. The PA can be switched off when transmitting '0' to minimize power consumption. A current-reuse topology is adopted for the LNA to further reduce the power consumption. The amplified signal is then demodulated by a squarer-based envelope detector. A wideband LA and a buffer amplify the demodulated signal to achieve a large output amplitude, and a feedback path is inserted in the LA chain to cancel the DC offset. The fabricated prototype achieves error-free transmission up to 13 cm at 5 Gbps, while consuming only 130 mW.

Chapter 6 summarizes the major research contributions of this work and concludes the thesis.

## Chapter 2

## Millimeter Wave Transceivers for Wireless Communication

As described in the previous chapter, to satisfy the increasing demand for wireless data capacity, wireless transceivers are required to operate in the millimeter wave band to enhance data throughput. Furthermore, the limited battery capacity of portable devices poses stringent requirements on power consumption. In order to meet these goals, a wideband OOK transceiver is proposed in this chapter to achieve both a high data rate and low power consumption.

This chapter is organized as follows. Sec. 2.1 introduces some elements of communication theory, including the channel capacity and bit error rate. Sec. 2.2 reviews state-of-the-art mm-wave transceivers designed for wireless communication. Finally, a wideband OOK transceiver is described in Sec. 2.3, focusing on the architecture and link budget analysis.

### 2.1 Data Communication Basics

One major performance metric of a wireless communication system is the data capacity. In the previous chapter, we implied that the large bandwidth available in the mm-wave frequency band leads to a high data rate. To gain a quantitative understanding, we refer to the Shannon-Hartley theorem, which sets an upper bound on the capacity of a communication link [21]. The upper limit in terms of data rate is given as:

$$C = BW \log_2(1 + \frac{S}{N}) \tag{2.1}$$

where $C$ is the channel capacity in bits per second (bps), $BW$ the bandwidth of the channel in Hz, $S$ the received signal power in watts, $N$ the in-band noise power in watts, and $\frac{S}{N}$ the signal-to-noise ratio (SNR).

It can be seen that the maximum data rate $C$ of a link is proportional to the bandwidth $BW$ of the channel and to the logarithm of $1 + SNR$. Therefore, a large $BW$ and a high SNR both enable a high data rate. Furthermore, for a given data rate, $BW$ can be traded against SNR. In other words, a large $BW$ can decrease the required signal power and thus improve the power efficiency<sup>1</sup>. This feature makes it possible to realize highly power-efficient links at mm-wave frequencies.
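As a quick numerical illustration of Eq. (2.1), the sketch below evaluates the capacity and inverts the equation to obtain the minimum SNR for a target data rate. The bandwidths used in the example are assumptions chosen only to show the trade-off between bandwidth and SNR; the function names are likewise hypothetical.

```python
import math


def shannon_capacity(bw_hz, snr_db):
    """Channel capacity in bit/s from Eq. (2.1), with the SNR given in dB."""
    snr = 10 ** (snr_db / 10)
    return bw_hz * math.log2(1 + snr)


def required_snr_db(capacity_bps, bw_hz):
    """Invert Eq. (2.1): minimum SNR in dB needed to reach capacity_bps."""
    return 10 * math.log10(2 ** (capacity_bps / bw_hz) - 1)


if __name__ == "__main__":
    target = 10e9  # 10 Gbps
    for bw in (1e9, 10e9):  # illustrative channel bandwidths
        print(f"BW = {bw / 1e9:4.0f} GHz -> minimum SNR = {required_snr_db(target, bw):5.1f} dB")
```

For the same 10 Gbps target, a tenfold increase in bandwidth lowers the minimum SNR by roughly 30 dB in this example, which is the bandwidth-for-power trade described above.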

Another important metric of a wireless link is the bit error rate (BER), which is defined as the number of bit errors divided by the total number of transferred bits during a studied time interval. In a noisy channel, the BER can be expressed as a function of the normalized energy per bit to noise power spectral density ratio $\frac{E_b}{N_0}$. For example, in an additive white Gaussian noise (AWGN) channel, the BERs of some common modulation schemes can be calculated as [22, 23]:

$$\begin{aligned}
BER_{OOK} &= \frac{1}{2} \exp\left(\frac{-E_b}{2N_0}\right) + \frac{1}{4}\, \mathrm{erfc}\left(\sqrt{\frac{E_b}{2N_0}}\right) \\
BER_{BPSK} &= BER_{QPSK} = \frac{1}{2}\, \mathrm{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right) \\
BER_{M-QAM} &\approx \frac{2}{\log_2 M} \left(1 - \frac{1}{\sqrt{M}}\right) \mathrm{erfc}\left(\sqrt{\frac{3\log_2 M}{2(M-1)}\frac{E_b}{N_0}}\right)
\end{aligned} \tag{2.2}$$

where erfc denotes the complementary error function, and OOK, BPSK, QPSK and M-QAM denote non-coherent on-off keying, binary phase-shift keying, quadrature phase-shift keying, and M-ary quadrature amplitude modulation, respectively.
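The expressions in Eq. (2.2) are straightforward to evaluate numerically. The following minimal sketch implements them with Python's standard `math` module; the function names and the $E_b/N_0$ operating points are illustrative assumptions, and the M-QAM expression takes the square root over the full argument, consistent with Eq. (2.5).

```python
import math


def ber_ook(ebn0):
    """Non-coherent OOK, first line of Eq. (2.2); ebn0 is a linear ratio."""
    return 0.5 * math.exp(-ebn0 / 2) + 0.25 * math.erfc(math.sqrt(ebn0 / 2))


def ber_bpsk_qpsk(ebn0):
    """BPSK and QPSK share the same BER as a function of Eb/N0."""
    return 0.5 * math.erfc(math.sqrt(ebn0))


def ber_mqam(ebn0, m):
    """Approximate BER of M-QAM, third line of Eq. (2.2)."""
    k = math.log2(m)  # bits per symbol
    arg = 3 * k / (2 * (m - 1)) * ebn0
    return (2 / k) * (1 - 1 / math.sqrt(m)) * math.erfc(math.sqrt(arg))


if __name__ == "__main__":
    for ebn0_db in (6, 10, 14):  # illustrative operating points
        x = 10 ** (ebn0_db / 10)
        print(f"{ebn0_db:>2} dB  OOK={ber_ook(x):.2e}  BPSK/QPSK={ber_bpsk_qpsk(x):.2e}  "
              f"16-QAM={ber_mqam(x, 16):.2e}  64-QAM={ber_mqam(x, 64):.2e}")
```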

Fig. 2.1 plots the BER as a function of $\frac{E_b}{N_0}$ for the OOK, BPSK/QPSK, 16-QAM and 64-QAM modulation schemes. It can be seen that, to achieve the same BER, BPSK/QPSK requires the lowest $\frac{E_b}{N_0}$, while 64-QAM requires the highest among the four modulation schemes.

Compared to $\frac{E_b}{N_0}$, a more commonly used measure of signal quality in the presence of noise is the SNR in Eq. (2.1). The SNR is proportional to $\frac{E_b}{N_0}$ through the following relationship:

$$SNR = \frac{S}{N} = \frac{E_b f_s k}{N_0 BW} \tag{2.3}$$

where  $f_s$  is the symbol rate, k the number of bits per symbol, and BW the channel bandwidth.

Since the channel bandwidth BW <sup>2</sup> is usually chosen to be roughly equal to the symbol rate  $f_s$  [23], Eq.(2.3) can be simplified to

$$SNR = \frac{E_b f_s k}{N_0 BW} = k \frac{E_b}{N_0} \tag{2.4}$$
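In decibels, Eq. (2.3) becomes an additive offset between $E_b/N_0$ and SNR. The helper below is a minimal sketch of that conversion; the function name, the default arguments and the 10 dB operating point are illustrative assumptions. With the symbol rate equal to the bandwidth it reduces to Eq. (2.4).

```python
import math


def snr_db_from_ebn0_db(ebn0_db, bits_per_symbol, symbol_rate=1.0, bandwidth=1.0):
    """Eq. (2.3) in dB; with symbol_rate == bandwidth this is Eq. (2.4)."""
    return ebn0_db + 10 * math.log10(bits_per_symbol * symbol_rate / bandwidth)


if __name__ == "__main__":
    for name, k in (("OOK/BPSK", 1), ("QPSK", 2), ("16-QAM", 4), ("64-QAM", 6)):
        print(f"{name:>8}: Eb/N0 = 10 dB -> SNR = {snr_db_from_ebn0_db(10.0, k):.2f} dB")
```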

<sup>&</sup>lt;sup>1</sup>This is virtually the trade-off between spectral efficiency and power efficiency

<sup>&</sup>lt;sup>2</sup>Note that here the BW is the baseband bandwidth, while the RF bandwidth may be different.



Figure 2.1: BER as a function of  $E_b/N_0$ 

Therefore, Eq.(2.2) can be rewritten as

$$\begin{aligned}
BER_{OOK} &= \frac{1}{2}\exp\left(\frac{-SNR}{2}\right) + \frac{1}{4}\,\mathrm{erfc}\left(\sqrt{\frac{SNR}{2}}\right) \\
BER_{BPSK} &= \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{SNR}\right) \\
BER_{QPSK} &= \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{SNR}{2}}\right) \\
BER_{M-QAM} &\approx \frac{2}{\log_2 M}\left(1 - \frac{1}{\sqrt{M}}\right)\mathrm{erfc}\left(\sqrt{\frac{3}{2(M-1)}SNR}\right)
\end{aligned} \tag{2.5}$$

where $k = \log_2 M$ for M-QAM modulation.

Fig. 2.2 plots the BER as a function of SNR for the OOK, BPSK, QPSK, 16-QAM and 64-QAM modulation schemes. To achieve the same BER, BPSK requires the lowest SNR, while 64-QAM requires the highest. In other words, with the same receiver, higher-order modulation schemes require higher transmitter power, and non-coherent modulation, such as OOK, requires higher transmitter power than its coherent counterpart.
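The ordering described above can be checked by numerically inverting Eq. (2.5). The sketch below finds the SNR required for a target BER by bisection, relying on the BER decreasing monotonically with SNR. The target BER of $10^{-6}$, the search interval and the function names are assumptions made for illustration.

```python
import math


def ber_ook(snr):
    return 0.5 * math.exp(-snr / 2) + 0.25 * math.erfc(math.sqrt(snr / 2))


def ber_bpsk(snr):
    return 0.5 * math.erfc(math.sqrt(snr))


def ber_qpsk(snr):
    return 0.5 * math.erfc(math.sqrt(snr / 2))


def ber_mqam(snr, m):
    k = math.log2(m)
    return (2 / k) * (1 - 1 / math.sqrt(m)) * math.erfc(math.sqrt(3 * snr / (2 * (m - 1))))


def required_snr_db(ber_func, target_ber, lo_db=-10.0, hi_db=50.0):
    """Bisection on the SNR in dB; ber_func takes a linear SNR."""
    for _ in range(60):
        mid_db = 0.5 * (lo_db + hi_db)
        if ber_func(10 ** (mid_db / 10)) > target_ber:
            lo_db = mid_db  # BER still too high: more SNR is needed
        else:
            hi_db = mid_db
    return hi_db


if __name__ == "__main__":
    schemes = {
        "BPSK": ber_bpsk,
        "QPSK": ber_qpsk,
        "OOK": ber_ook,
        "16-QAM": lambda snr: ber_mqam(snr, 16),
        "64-QAM": lambda snr: ber_mqam(snr, 64),
    }
    for name, func in schemes.items():
        print(f"{name:>6}: SNR for BER 1e-6 = {required_snr_db(func, 1e-6):.2f} dB")
```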



Figure 2.2: BER as a function of SNR

### 2.2 Review of mm-Wave Transceivers

To serve different applications, various modulation schemes and system architectures are employed at mm-wave frequencies. In this section, some state-of-the-art work is briefly reviewed.

Non-coherent OOK is the simplest form of amplitude-shift keying. It minimizes the system complexity and power consumption. [22] presents a 60 GHz OOK transceiver that is capable of error-free operation up to 6 cm at a data rate of 1.5 Gbps. The adopted architecture is illustrated in Fig. 2.3. On the transmitter side, the binary input data directly modulates the 60 GHz output signal of a voltage controlled oscillator (VCO), and the output of the modulator is fed to a PA to increase the output power. On the receiver side, the RF signal from the on-board antenna is amplified by a low noise amplifier (LNA) and then down-converted by a mixer to about 10 GHz. The output of the mixer is amplified by an intermediate frequency (IF) amplifier and then delivered to a demodulator. Note that the OOK modulator/demodulator obviates the need for complicated interfacing digitizers and subsequent DSPs, which may consume a large amount of power at multi-gigabit data rates. Furthermore, the non-coherent OOK link needs no frequency alignment, since the frequency shift between the two VCOs does not affect the BER of the transceiver, which further reduces the complexity and power consumption of the system.



Figure 2.3: An OOK transceiver [22]



Figure 2.4: A BPSK transceiver [24]

As the simplest form of phase-shift keying (PSK), BPSK has the same spectral efficiency as OOK, but it is more robust to noise and interference, as can be seen from Fig. 2.2. The robustness stems from the precise phase alignment, which necessitates the use of a phase-locked loop (PLL) and increases the complexity of the transceiver. [24] describes a 60 GHz BPSK transceiver with data rates exceeding 6 Gbps. Fig. 2.4 shows the simplified architecture, where the PLL is omitted from the schematic for simplicity. Compared with the transmitter in Fig. 2.3, here the modulation is performed directly on the PA. In other words, the modulator also serves as the PA. This eases the linearity and bandwidth requirements of the PA, and therefore improves power efficiency. On the other hand, the modulator cannot deliver large output power. For example, the transmitter output power in [24] is only 2.4 dBm, which is too small for many applications.

Compared with OOK and BPSK, more spectrally efficient modulation schemes, such as QPSK, 8-PSK, 16-QAM and 64-QAM, can be employed to increase the data rate using the same bandwidth [2, 27, 28, 29]. Most mm-wave transceivers used to implement these modulations can be categorized into the heterodyne architecture and the direct-conversion architecture. Two typical examples are illustrated in Fig. 2.5. Both architectures are made up of four major sub-systems in the front-end: RF amplifiers (i.e. LNA and PA), modulation and frequency conversion, frequency generation, and analog baseband. The direct-conversion architecture has been commonly used, especially below 5 GHz, because it requires fewer components and no SAW filter, which is advantageous in terms of layout area and power consumption [26]. However, at mm-wave frequencies, it is challenging to implement the direct-conversion architecture due to the trade-off between frequency tuning range and phase noise in the frequency synthesizer. Furthermore, leakage from the PA may pull the VCO oscillation frequency and degrade the phase noise [23].

As can be seen from Fig. 2.5, the traditional transceiver architectures for spectrally efficient modulations are generally very complex and consume considerable power. As a result, they are not suitable for portable devices, where power consumption is a key concern. To improve the power efficiency, [30, 31] propose a novel transceiver, which is shown in Fig. 2.6<sup>3</sup>. On the transmitter side, a fast start-up oscillator directly generates a QPSK-modulated RF signal, and therefore eliminates the need for a power-hungry local oscillator (LO) buffer and mixer. On the receiver side, the mixer is stacked on the LO to reuse the current and reduce power consumption. This transceiver achieves 10.4 Gbps over a range of > 40 cm in all directions while consuming only 115 mW.

The transceiver in [31] achieves good power efficiency by eliminating the LO buffer and mixer. However, these two blocks typically do not account for a significant percentage of the overall transceiver power consumption, which is generally dominated by the PA. Therefore, to further improve the efficiency, new architectures and design techniques need to be exploited to save power in the PA. Direct digital-to-RF conversion, power-DAC, and free-space power combining can be used to reduce the power consumption by switching off PA cells at back-off [32, 33, 34, 35]. Fig. 2.7 shows a direct digital-to-RF transmitter [23], where each of the I/Q PAs consists of an array of PA cells. The output amplitudes of the I/Q paths are digitally controlled by the number of on-state PA cells. In other words, some PA cells are switched off at back-off to reduce power consumption. Furthermore, since each PA cell amplifies a constant-envelope signal, a nonlinear amplifier can be used to maximize the efficiency of each PA cell. Moreover, the quadrature output powers of the I/Q paths are combined in space to minimize the combining loss and maximize the isolation between the two paths. Thanks to these techniques, this transmitter achieves a peak efficiency of 17.4% and an average efficiency of 7% at 6 dB back-off, representing a ~1.5X improvement over prior art [30].

(a) A heterodyne transceiver [25]

(b) A direct-conversion transceiver [26]

Figure 2.5: Transceiver architectures

Figure 2.6: Power efficient transceiver

Figure 2.7: Cartesian transmitter with quadrature spatial combining [23]

<sup>3</sup>The transceiver implemented in [31] is a 4-element phased-array transceiver. Here only one path is shown for clarity.

## 2.3 A Wideband OOK Transceiver

This section describes a mm-wave transceiver for high-speed short-range wireless communication. The work is within the European Project *Mirandela*. The target of this transceiver is to achieve a 10 Gbps data rate over a distance of several centimeters. It can be used for chip-to-chip communication, high-definition video download, and other emerging applications.

#### 2.3.1 Transceiver Architecture

OOK modulation is adopted in this design because of its simple architecture, which minimizes the power consumption and the time to market. There are mainly three different ways to implement OOK modulation, as illustrated in Fig. 2.8.

The first architecture, shown in Fig. 2.8(a), uses an explicit modulator to perform OOK modulation, while the last two architectures directly modulate the PAs, i.e. switching on the PAs when transmitting one and switching off the PAs when transmitting zero. By switching off the PAs and eliminating the modulator, the last two can achieve better power efficiency<sup>4</sup>. The transmitter shown in Fig. 2.8(c) can save the largest amount of power by switching off both PA and VCO. However, it poses several challenges. First, the VCO usually has a load with a high quality factor Q to maximize the output amplitude, while the high-Q load slows down the switching speed of the VCO<sup>5</sup>. The substantial switching time limits the on/off power ratio

<sup>4</sup>The on-state performance of the PA may drop slightly due to the switching transistor. However, this penalty is negligible compared to the power saving when transmitting zero.

<sup>5</sup>The switching on/off time is approximately Q times the oscillation period.



(a) A conventional OOK transmitter [22]



(b) An OOK transmitter with a switching PA [26]



(c) An OOK transmitter with a switching PA and switching VCO [36]

Figure 2.8: Different realizations of OOK modulation

and achievable data rate. Second, the output of a switching VCO occupies large bandwidth and requires a wideband PA, which generally leads to low gain and limited efficiency. On the other hand, the power consumption of the VCO is much less than that of the PA. Therefore, the transmitter shown in Fig. 2.8(b) is adopted in this design. It is worth noting that, the adopted OOK transmitter is essentially a 1-bit power DAC, and a good power efficiency can be achieved by switching off the PA at back-off, i.e. transmitting zero in this case. By combining several such OOK transmitters together (either on chip or in free space), multi-bit DACs and complex modulation transmitters can be realized.

The receiver architecture is shown in Fig. 2.9, where the RF signal from the antenna is amplified by an LNA and then directly demodulated by a squarer-based envelope detector (ED). The demodulated signal is amplified by a limiting amplifier (LA) to obtain a large output amplitude. Compared to the receiver in Fig. 2.3, this architecture eliminates the mixer and VCO, and is therefore able to minimize the chip area and maximize the power efficiency.



Figure 2.9: OOK receiver architecture

#### 2.3.2 Link Budget

To determine the specifications for individual blocks, it is necessary to perform the link budget analysis. Fig. 2.10 shows a generalized wireless link, where the transmitter delivers an output power of  $P_{TX}$ , and the antenna gains of the transmitting and receiving antennas are  $G_{TX}$  and  $G_{RX}$  respectively.



Figure 2.10: A generalized wireless link

According to the Friis transmission equation, the received signal power  $P_{RX}$  can be expressed in dBm as:

$$\begin{aligned}
P_{RX} &= P_{TX} + G_{TX} + G_{RX} - 20\log\left(\frac{4\pi R}{\lambda}\right) \\
&= P_{TX} + G_{TX} + G_{RX} - 20\log\left(\frac{4\pi f_c R}{c}\right)
\end{aligned} \tag{2.6}$$

where R is the distance between the antennas, and  $\lambda$ , c and  $f_c$  the wavelength, speed and frequency of the traveling signal, respectively. The last item is the so-called free-space path loss, which is proportional to the square of the signal frequency. As a result, mm-wave band link has much larger free-space loss than that of low frequencies, and therefore is more challenging to reach a large distance. The robustness of the link can be measured by link margin, which is defined as the difference between the minimum required signal level  $P_{sen}$ , i.e. receiver sensitivity, and the actual received power  $P_{RX}$ . The sensitivity  $P_{sen}$  in dBm is given as:

$$\begin{aligned}
P_{sen} &= P_{noise} + F_{RX} + SNR_{out} \\
&= 10\log(KT) + 30 + 10\log(BW) + F_{RX} + SNR_{out}
\end{aligned} \tag{2.7}$$

where $P_{noise}$ is the in-band input noise power<sup>6</sup>, $K$ Boltzmann's constant, $T$ the absolute temperature of the receiver input, $BW$ the receiver bandwidth, $F_{RX}$ the receiver noise figure, and $SNR_{out}$ the required SNR of the receiver output signal, which is determined by the modulation scheme and the desired BER.

Therefore, the link margin LM is:

$$\begin{aligned}
LM &= P_{RX} - P_{sen} \\
&= P_{TX} + G_{TX} + G_{RX} - 20\log\left(\frac{4\pi f_c R}{c}\right) - 10\log(KT) - 30 - 10\log(BW) - F_{RX} - SNR_{out}
\end{aligned} \tag{2.8}$$

In this design, the antenna gains on both transmit and receive sides are 3 dBi. The carrier frequency  $f_c$  is chosen to be 50 GHz due to the limited operating range of measurement instruments. The  $SNR_{out}$  is chosen to be 17 dB to achieve a BER less than  $10^{-12}$ , which can be seen from Fig. 2.2. In addition, as illustrated in Fig. 2.11, the OOK power spectral density is a  $sinc^2$  function with  $2B_r$  main lobe width around the carrier  $f_c$ , where  $B_r$  denotes the bit rate [22]. As a consequence, to achieve a data rate of 10 Gbps, 20 GHz RF bandwidth is required.



Figure 2.11: OOK power spectral density

<sup>6</sup>The 30 is added because $P_{sen}$ is expressed in dBm.

Therefore, the link margin LM of this OOK transceiver is:

$$\begin{aligned}
LM &= P_{TX} + 3 + 3 - 20\log\left(\frac{4\pi R \cdot 50 \times 10^{9}}{c}\right) - 10\log(KT) - 30 - 10\log(20 \times 10^{9}) - F_{RX} - 17 \\
&= P_{TX} + 3 + 3 - 66.5 - 20\log R + 174 - 103 - F_{RX} - 17 \\
&= P_{TX} - F_{RX} - 20\log R - 6.5
\end{aligned} \tag{2.9}$$

The communication distance R can be calculated from Eq.(2.9):

$$R = 10^{(P_{TX} - F_{RX} - LM - 6.5)/20}.$$
(2.10)
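The arithmetic behind Eqs. (2.6) to (2.10) can be checked numerically. The following is a minimal Python sketch and not part of the original design flow: the function names are illustrative, R is taken in metres, and a noise temperature of 290 K is assumed (the text only implies it through the 174 dB term). The receiver noise figure passed in the usage lines is a placeholder, since $F_{RX}$ is derived later in this section.

```python
import math

C_LIGHT = 299_792_458.0   # speed of light, m/s
K_BOLTZ = 1.380649e-23    # Boltzmann constant, J/K


def path_loss_db(f_hz, r_m):
    """Free-space path loss in dB, the last term of Eq. (2.6)."""
    return 20 * math.log10(4 * math.pi * f_hz * r_m / C_LIGHT)


def sensitivity_dbm(bw_hz, nf_db, snr_out_db, temp_k=290.0):
    """Receiver sensitivity in dBm, Eq. (2.7); temp_k = 290 K is an assumption."""
    noise_dbm = 10 * math.log10(K_BOLTZ * temp_k) + 30 + 10 * math.log10(bw_hz)
    return noise_dbm + nf_db + snr_out_db


def link_margin_db(p_tx_dbm, r_m, nf_db, g_tx_dbi=3.0, g_rx_dbi=3.0,
                   f_hz=50e9, bw_hz=20e9, snr_out_db=17.0):
    """Link margin of Eq. (2.8) with the design values of this section as defaults."""
    p_rx_dbm = p_tx_dbm + g_tx_dbi + g_rx_dbi - path_loss_db(f_hz, r_m)
    return p_rx_dbm - sensitivity_dbm(bw_hz, nf_db, snr_out_db)


def distance_m(p_tx_dbm, nf_db, lm_db):
    """Closed form of Eq. (2.10), using the rounded -6.5 dB constant."""
    return 10 ** ((p_tx_dbm - nf_db - lm_db - 6.5) / 20)


# Usage: nf_db = 16 dB is a placeholder value, not a result from the thesis.
r = distance_m(p_tx_dbm=10.0, nf_db=16.0, lm_db=4.0)
print(f"R = {100 * r:.1f} cm, margin recomputed from Eq. (2.8) = "
      f"{link_margin_db(10.0, r, 16.0):.2f} dB")
```

Recomputing the margin from Eq. (2.8) at the distance returned by Eq. (2.10) should give back the requested margin to within the rounding of the 66.5, 174 and 103 dB terms.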

From Eq.(2.10), it can be seen that to estimate the communication distance R, we first need to calculate the receiver noise figure $F_{RX}$. Since the proposed OOK receiver is based on nonlinear energy detection, the classic Friis formula for noise factor does not apply, and therefore the computation is not straightforward [37]. To calculate the overall receiver noise figure $F_{RX}$, an equivalent model with noise sources is shown in Fig. 2.12, where $s_{in}$ is the desired input signal, $n_s$ the channel noise, $n_{amp}$ the input-referred LNA noise and $n_{int}$ the aggregated noise of the envelope detector and following stages, while $G_{LNA}$ and $a_2$ are the LNA gain and squarer gain, respectively.



Figure 2.12: Equivalent model of the receiver with noise sources

The noise figure of the receiver  $F_{RX}$  is defined as:

$$F_{RX} = \frac{SNR_{in}}{SNR_{out}}.$$
(2.11)

The input SNR is given as:

$$SNR_{in} = \frac{E[s_{in}^2]}{E[n_s^2]} = \frac{E_b B_r}{N_0 BW} = \frac{P_b}{P_{N0}}$$
(2.12)

where  $E_b$  is the energy of the bit,  $B_r$  the bit rate,  $N_0$  the power spectral density of the channel noise.

The output SNR can be calculated as:

$$SNR_{out} = \frac{E[s_{out}^2]}{E[n_{out}^2]} = \frac{E[(a_2(G_{LNA}s_{in})^2)^2]}{E[(a_2G_{LNA}^2(2s_{in}(n_s + n_{amp}) + (n_s + n_{amp})^2) + n_{int})^2]}$$
(2.13)

where $E[\bullet]$ denotes the expected value. Note that $n_{int}$ is uncorrelated with $n_s + n_{amp}$, and therefore the expectation of their product is zero. Furthermore, since $n_s$ and $n_{amp}$ are zero-mean, normally distributed variables, their sum is also a zero-mean, normally distributed variable, and thus the odd-order moments of their sum are zero. Accordingly, the output SNR can be simplified as:

$$\begin{aligned}
SNR_{out} &= \frac{G_{LNA}^{4}a_{2}^{2}E[s_{in}^{4}]}{4G_{LNA}^{4}a_{2}^{2}E[s_{in}^{2}]E[(n_{s}+n_{amp})^{2}] + G_{LNA}^{4}a_{2}^{2}E[(n_{s}+n_{amp})^{4}] + E[n_{int}^{2}]} \\
&= \frac{G_{LNA}^{4}a_{2}^{2}P_{b}^{2}}{4G_{LNA}^{4}a_{2}^{2}P_{b}E[(n_{s}+n_{amp})^{2}] + 3G_{LNA}^{4}a_{2}^{2}(E[(n_{s}+n_{amp})^{2}])^{2} + \sigma_{n_{int}}^{2}} \\
&= \frac{G_{LNA}^{4}a_{2}^{2}P_{b}^{2}}{4G_{LNA}^{4}a_{2}^{2}P_{b}P_{N0}F_{LNA} + 3G_{LNA}^{4}a_{2}^{2}P_{N0}^{2}F_{LNA}^{2} + \sigma_{n_{int}}^{2}}
\end{aligned} \tag{2.14}$$

where  $F_{LNA}$  is the noise figure of the LNA, defined as:

$$F_{LNA} = \frac{SNR_{in,LNA}}{SNR_{out,LNA}} = \frac{P_b/P_{N0}}{P_b/E[(n_s + n_{amp})^2]} = \frac{E[(n_s + n_{amp})^2]}{P_{N0}}$$
(2.15)

Substituting Eq.(2.12) and Eq.(2.14) into Eq.(2.11), we have

$$F_{RX} = \frac{SNR_{in}}{SNR_{out}}$$

$$= \frac{P_b}{P_{N0}} \frac{4G_{LNA}^4 a_2^2 P_b P_{N0} F_{LNA} + 3G_{LNA}^4 a_2^2 P_{N0}^2 F_{LNA}^2 + \sigma_{n_{int}}^2}{G_{LNA}^4 a_2^2 P_b^2}$$

$$= \frac{4G_{LNA}^4 a_2^2 P_b F_{LNA} + 3G_{LNA}^4 a_2^2 P_{N0} F_{LNA}^2 + \sigma_{n_{int}}^2 / P_{N0}}{G_{LNA}^4 a_2^2 P_b}$$

$$= 4F_{LNA} \left(1 + \frac{3F_{LNA}}{4SNR_{in}} + \frac{\sigma_{n_{int}}^2}{4F_{LNA} G_{LNA}^4 a_2^2 P_b P_{N0}}\right)$$
(2.16)
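Eq. (2.16) can be evaluated directly. The sketch below is illustrative Python; the function and argument names are assumptions introduced here. All quantities are linear rather than in dB: $P_b$, $P_{N0}$ and $\sigma_{n_{int}}^2$ are treated as mean-square voltages, and $G_{LNA}$ as a voltage gain, which is how they enter Eq. (2.13).

```python
import math


def receiver_noise_factor(f_lna, g_lna, a2, p_b, p_n0, sigma_int_sq):
    """Linear noise factor of the squarer-based receiver, Eq. (2.16).

    f_lna        : LNA noise factor (linear)
    g_lna        : LNA voltage gain (linear)
    a2           : squarer gain
    p_b, p_n0    : signal and channel-noise power at the input (same units)
    sigma_int_sq : integrated noise power of the ED and following stages
    """
    snr_in = p_b / p_n0                                        # Eq. (2.12)
    detector_term = sigma_int_sq / (4 * f_lna * g_lna**4 * a2**2 * p_b * p_n0)
    return 4 * f_lna * (1 + 3 * f_lna / (4 * snr_in) + detector_term)


def to_db(x):
    """Convert a linear power ratio to dB."""
    return 10 * math.log10(x)


# Limiting case: noiseless LNA and detector, very high input SNR.
f_ideal = receiver_noise_factor(f_lna=1.0, g_lna=1.0, a2=1.0,
                                p_b=1.0, p_n0=1e-12, sigma_int_sq=0.0)
print(f"{to_db(f_ideal):.2f} dB")  # approaches 10*log10(4), about 6.02 dB
```

The limiting case reproduces the factor of 4 in front of Eq. (2.16): even with ideal blocks the squarer costs about 6 dB, and any finite input SNR adds to it.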

Two important insights can be pointed out. First, even if the receiver is completely noiseless, the SNR degrades by more than 6 dB. Second, unlike the common linear case, the equivalent receiver noise figure $F_{RX}$ depends not only on the gain and noise figure of its blocks, but also on the input SNR. This is due to the squaring action of the energy detector, which translates to the output an amount of noise proportional to the power of the input signal.

From Eq.(2.10) and Eq.(2.16), we can estimate the effects of different blocks on the communication distance R. For example, assuming 25 dB LNA gain, 10 dB LNA noise figure, $\sigma_{n_{int}}^2 = 500\ nV^2$, $a_2 = 1$ and 4 dB link margin, the communication distance R is a function of the output power $P_{TX}$ of the transmitter, as depicted in Fig. 2.13. It can be seen that larger transmitter output power leads to longer communication distance. On the other hand, large transmitter output power requires a large power amplifier, which may degrade the transmitter efficiency due to the loss of layout parasitics. Therefore, 10 dBm output power, which results in a distance of more than 14 cm, is chosen in this design.



Figure 2.13: Communication distance as a function of transmitter output power

In the same way, assuming 10 dBm transmitter output power, 10 dB LNA noise figure, $\sigma_{n_{int}}^2 = 500\ nV^2$, $a_2 = 1$ and 4 dB link margin, the communication distance R is a function of the LNA gain $G_{LNA}$, as depicted in Fig. 2.14. It is not surprising that higher LNA gain leads to longer communication distance. Interestingly, the distance begins to saturate when the LNA gain is higher than 25 dB. This is because when the gain of the LNA is very high, the noise of the ED and LA becomes negligible when referred to the receiver input, and thus the LNA dominates the noise figure of the whole receiver. On the other hand, high gain complicates the design and increases the power consumption. Therefore, 25 dB gain, which is the starting point of the saturation, is chosen for the LNA.



Figure 2.14: Communication distance as a function of LNA gain



Figure 2.15: Communication distance as a function of LNA noise figure


As analyzed before, when the LNA has very high gain, the noise figure of the LNA determines the communication distance. To estimate this effect, we can simulate the communication distance R as a function of the LNA noise figure $F_{LNA}$. Assuming 10 dBm transmitter output power, 25 dB LNA gain, $\sigma_{n_{int}}^2 = 500\ nV^2$, $a_2 = 1$ and 4 dB link margin, we have Fig. 2.15. It can be seen that a smaller LNA noise figure results in a larger communication distance. A noise figure of 10 dB is enough to achieve a reasonably long distance.

In the same way, we can estimate the effect of other parameters such as $\sigma_{n_{int}}^2$ and $a_2$; the results are omitted here for brevity.

#### 2.3.3 OOK Transceiver Specifications

Based on the analysis and simulations in Sec. 2.3.2, to achieve a 10 Gbps data rate over a distance of 10 cm, the transmitter, i.e. the PA, needs to deliver 10 dBm output power, and the LNA needs to have 25 dB gain and 10 dB noise figure. Furthermore, the RF amplifiers, i.e. the PA and LNA, need to have 20 GHz bandwidth, while the baseband blocks, i.e. the ED and LA, need 10 GHz bandwidth<sup>7</sup>. In addition, the VCO oscillates at 50 GHz with 10% tuning range to cover process, voltage and temperature (PVT) variations. The specifications are summarized in Tab. 2.1.

| Block     | Parameter                                  | Value     |
|-----------|--------------------------------------------|-----------|
| VCO       | Center Frequency                           | 50 GHz    |
| VCO       | Tuning Range                               | 10%       |
| PA        | Center Frequency                           | 50 GHz    |
| PA        | Bandwidth                                  | 20 GHz    |
| LNA       | Center Frequency                           | 50 GHz    |
| LNA       | Bandwidth                                  | 20 GHz    |
| LNA       | Gain                                       | 25 dB     |
| LNA       | Noise Figure                               | 10 dB     |
| ED and LA | Bandwidth                                  | 10 GHz    |
| ED and LA | Integrated Noise Power at the output of ED | $500\ nV^2$ |

Table 2.1: Specifications of the proposed OOK transceiver

Tab. 2.1 only summarizes the mandatory requirements calculated from the link budget analysis. Many other important parameters need to be considered in order to achieve good performance. For example, the power efficiency of the PA is critical to realize a low-power transceiver. Moreover, the gains of the ED and LA need to be high enough to obtain a wide-open eye at the receiver output. Since these figures of merit are related to circuit design, they are discussed in the following chapters.

<sup>7</sup>In fact, 7 GHz bandwidth, i.e. $0.7B_r$, may be a better value to minimize the overall effect of noise and inter-symbol interference (ISI). Here we set 10 GHz bandwidth to cover PVT variation. Bandwidth tuning techniques can be used to adjust the actual bandwidth to the optimal value.

## 2.4 Conclusions

Thanks to the large available spectral resources at mm-wave band, simple modulation schemes can be adopted to minimize the power consumption of wireless links, while achieving high data rate. In addition, digital-to-RF techniques, such as directly modulating PAs, can be exploited to further improve the transceiver efficiency. On the other hand, mm-wave signals experience significant free-space path loss. Therefore mm-wave transceivers are suitable for low-power high-speed short-range communications especially when non-directional antennas are used.

A wideband OOK transceiver is proposed in this chapter to target 10 Gbps data rate over a distance of 10 cm. An in-depth analysis indicates that the nonlinear energy detector has a substantial impact on the receiver noise figure. This impact is properly taken into account in the link budget analysis, and specifications are calculated accordingly for different building blocks.

## Chapter 3

# A 40-67 GHz CMOS Power Amplifier

This chapter describes a 50 GHz two-stage wideband PA for the OOK transceiver. A methodology for designing wideband matching networks in PAs is proposed. Based on inductively coupled resonators, the design technique leverages impedance transformations and topological rearrangements to achieve high performance over a broad frequency range.

A two-stage differential PA has been realized in ST 28 nm CMOS using low-power devices to verify the proposed techniques. The PA has a bandwidth from 40 GHz to 67 GHz, and delivers 13 dBm saturated output power with a peak power-added efficiency (PAE) of 16% without power combining. To the best of the author's knowledge, the proposed PA achieves the largest fractional bandwidth (51%) with state of the art performance among mm-wave PAs reported so far.

This chapter is organized as follows. Sec. 3.1 introduces the design challenges of achieving wideband performance for mm-wave PAs. Sec. 3.2 describes the system specifications and block diagram. In Sec. 3.3, a design methodology for wideband matching networks is presented and two examples are given to illustrate the procedure. In Sec. 3.4, the design of active stages is discussed. Measurements and conclusions close the chapter.

## 3.1 Broadband Power Amplifier Design Challenges

CMOS technologies may offer both low cost in volume production and RF/baseband co-integration for mm-wave transceivers. However, the power amplifier, which greatly affects the entire transmitter's power efficiency and output signal quality, still remains one of the most challenging blocks. Besides conventional PA metrics, such as output power and efficiency, various emerging applications pose further requirements on the operating bandwidth. High data rate short-range wireless links may require wideband operation to allow the use of simple modulation schemes [2], which can considerably reduce the complexity of the system and cut time to market. Along with wireless communications, remote sensing and imaging systems are continuously evolving towards mm-wave frequencies, pushing for very large fractional bandwidths to achieve high spatial resolution [38].

However, achieving good performance over a large frequency range is very challenging for mm-wave PAs. Fig. 3.1 shows the block diagram of a two-stage PA. The transistor $M_{PA}$ is chosen with a high form factor to deliver the desired output power. However, large devices cause sizable parasitics, which not only impair the limited power gain of the transistor, but also introduce large parasitic capacitance. As a result, the gain-bandwidth product (GBW) is limited, especially at mm-wave frequency. Furthermore, a trade-off between power efficiency and gain-bandwidth product exists: first, the output transistor $M_{PA}$ is biased at a low level to maintain high efficiency at the cost of low gain; second, a small input transistor $M_{In}$ is desirable to achieve high efficiency, while a large $M_{In}$ is preferable to increase the transconductance and GBW. Since the PA is the most power-hungry block of a transceiver, maximizing its efficiency is usually of primary importance, which on the other hand limits the GBW.



Figure 3.1: Block diagram of a two-stage PA

Many prior studies proposed CMOS PAs at mm-wave frequencies [39, 40, 41, 19]. However, only a few of them show truly wideband operation, and generally few insights are given on the matching network design techniques. In this chapter, we focus on the analysis and design of wideband matching networks for mm-wave PAs. As a result of this study, a two-stage wideband PA has been fabricated in 28 nm CMOS LP technology [42]. As will be detailed in Sec. 3.6, the PA achieves good performance over a fractional bandwidth greater than 51% around 55 GHz.

## 3.2 System Block Diagram

As discussed in Chapter 2, to transmit data at 10 Gbps over 10 cm, the PA has to deliver at least 10 dBm output power while achieving a bandwidth of around 20 GHz. Moreover, since the PA is the most power-consuming block of a transceiver, achieving high power efficiency is critical for implementing a low-power system. In addition, to facilitate the design of the preceding stage, i.e. easing the output power requirement of the VCO in our case, the power gain of the PA needs to be high. Based on these considerations, the specifications of this PA are reported in Tab. 3.1, where $P_{SAT}$ is the saturated output power, $f_c$ the center frequency, *BW* the bandwidth, *Gain* the power gain, and *PAE* the power-added efficiency.

| Parameter | Value  |
|-----------|--------|
| $P_{SAT}$ | 10 dBm |
| $f_c$     | 50 GHz |
| BW        | 20 GHz |
| PAE       | 10%    |
| Gain      | 10 dB  |

Table 3.1: Specifications of the PA

Due to the moderately low required power gain and reasonably high transistor gain of this process (larger than 10 dB for each stage), the one-path two-stage architecture in Fig. 3.2 is chosen for this design. Differential topology, which can double the maximal output swing and achieve 40 mW, i.e. 16 dBm output power with the nominal 1 V supply, is adopted for both active stages to achieve the required output power without power combining.

Furthermore, impedance transformation techniques are exploited in the networks to perform the desired impedance matching, i.e. transforming standard 50  $\Omega$  antenna to the optimal load  $Z_{OPT}$  of output stage to maximize the output power, matching the input impedance  $Z_{PA,in}$  of output stage to the output impedance  $Z_{In,out}$  of input stage to minimize the size and power consumption of the input stage, and matching the input impedance  $Z_{In,in}$  of input stage to standard 50  $\Omega$  to minimize S11. To achieve wideband performance, inductively coupled resonators, which will be discussed in Sec. 3.3.1, are employed as matching network candidate to achieve the required bandwidth without impairing the gain.

## 3.3 Matching Networks Design

This section presents a design methodology for wideband matching networks. A high-order low-complexity network is presented and adopted as matching networks



Figure 3.2: System block diagram of a one-path PA

to achieve wideband performance. Impedance transformation techniques are exploited to obtain the required impedance matching and rearrange the network topology to enable a compact layout. Finally, applying the methodology, the design procedures of the output and interstage matching networks are discussed in detail.

#### 3.3.1 Inductively Coupled Resonator

High-order networks are able to accommodate large capacitors and thus achieve wideband performance at low frequency [43]. However, due to the low quality factor of passive components at mm-wave frequency, the number of passive components should be small to minimize power losses. Furthermore, the topology of the network and the component values should enable a compact layout to minimize parasitics. In this scenario, the inductively coupled resonator in Fig. 3.3 is promising. The capacitance $C_1$, which denotes the parasitic of the preceding stage, is resonated out by inductor $L_1$, while $L_3$ resonates out the input capacitance $C_3$. The two resonators are coupled through $L_2$ for wideband operation. This network has a very simple topology, and thus the insertion loss can be small.



Figure 3.3: Inductively coupled resonator

From Fig. 3.3, the transimpedance $Z_T$ and the input impedance $Z_{in}$ of the network can be calculated as<sup>1</sup>:

$$\begin{aligned}
Z_T &= \frac{V_{out}}{I_{in}} \approx \frac{Z_1 Z_3}{Z_1 + Z_2 + Z_3} \\
Z_{in} &= \frac{Z_1 \left(Z_2 + Z_3\right)}{Z_1 + Z_2 + Z_3}
\end{aligned} \tag{3.1}$$

where  $Z_1$ ,  $Z_2$ , and  $Z_3$  are the three impedances marked with dashed boxes:

$$\begin{aligned}
Z_{1} &= \frac{L_{1}s}{1 + L_{1}C_{1}s^{2}} \\
Z_{2} &= L_{2}s \\
Z_{3} &= \frac{1}{\frac{1}{R_{3}} + C_{3}s + \frac{1}{L_{3}s}}.
\end{aligned} \tag{3.2}$$

As illustrated in Fig. 3.4, the frequency response of the transimpedance $|Z_T|$ exhibits two peaking frequencies. When the quality factor $Q = R_3 C_3 \omega_0$ is high enough, the two frequencies become:

$$\begin{aligned}
\omega_L &\approx \omega_0 \\
\omega_H &\approx \omega_0 \sqrt{1 + L_1/L_2 + L_3/L_2}
\end{aligned} \tag{3.3}$$

where  $\omega_0$  is the resonating frequency of the two resonators:

$$\omega_0 = 1/\sqrt{L_1 C_1} = 1/\sqrt{L_3 C_3}.$$
(3.4)

From Fig. 3.4 and Eq.(3.3), we can see that the lower boundary of the bandwidth is set by the resonating frequency of the two resonators, and the higher boundary is set by the ratio  $T = \frac{L_1+L_3}{L_2}$ . The bandwidth can be increased by decreasing  $L_2$ at the cost of larger in-band ripple. Moreover, the Q needs to be small to minimize the in-band ripple.
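The relations in Eqs. (3.1) to (3.4) can be explored numerically. The sketch below is illustrative Python with hypothetical component values (40 GHz resonance, 100 fF capacitors, T = 2) that are not taken from the design; the function names are also assumptions. $Z_T$ is written in terms of admittances so that it stays finite exactly at the resonance of the lossless $L_1$-$C_1$ tank. For this model the expression works out to a magnitude of $R_3$ at $\omega_0$ and $R_3 L_1/L_3$ at $\omega_H$, which the last line prints.

```python
import math


def z_t(w, l1, c1, l2, l3, c3, r3):
    """Transimpedance of Eq. (3.1) at angular frequency w (rad/s)."""
    s = 1j * w
    y1 = s * c1 + 1 / (s * l1)             # 1 / Z_1, Eq. (3.2)
    y3 = 1 / r3 + s * c3 + 1 / (s * l3)    # 1 / Z_3, Eq. (3.2)
    z2 = s * l2
    # Z1*Z3 / (Z1 + Z2 + Z3), with numerator and denominator divided by Z1*Z3
    return 1 / (y1 + y3 + z2 * y1 * y3)


def peak_frequencies(l1, c1, l2, l3):
    """Approximate peaking frequencies (rad/s) from Eqs. (3.3) and (3.4)."""
    w0 = 1 / math.sqrt(l1 * c1)
    return w0, w0 * math.sqrt(1 + l1 / l2 + l3 / l2)


# Hypothetical balanced resonators: f0 = 40 GHz, C1 = C3 = 100 fF, T = 2.
c1 = c3 = 100e-15
w0 = 2 * math.pi * 40e9
l1 = l3 = 1 / (w0**2 * c1)
l2 = (l1 + l3) / 2.0
r3 = 50.0

w_l, w_h = peak_frequencies(l1, c1, l2, l3)
print(f"f_L = {w_l / (2e9 * math.pi):.1f} GHz, f_H = {w_h / (2e9 * math.pi):.1f} GHz")
print(abs(z_t(w_l, l1, c1, l2, l3, c3, r3)), abs(z_t(w_h, l1, c1, l2, l3, c3, r3)))
```

Sweeping `w` between `w_l` and `w_h` and varying `l2` or `r3` gives a quick view of the bandwidth versus ripple trade-off described above.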

To have a symmetric frequency response, the two resonators should have the same quality factor. However, in most real cases the two quality factors are very different and sometimes the larger one is assumed infinite [44]. In this case, balanced resonators, i.e.  $C_1 \approx C_3$  and  $L_1 \approx L_3$ , can be used to achieve a quasi-symmetric response. On the other hand, layout parasitics such as limited quality factor  $Q_{Ind}$  of inductors and unwanted coupling capacitors may change the symmetry of an ideal network. Fortunately, unbalanced resonators can be exploited to accommodate potential layout parasitics and obtain a flat response. For example, as illustrated in Fig. 3.5(a), the limited  $Q_{Ind}$  decreases  $|Z_T|$  within the operating band, especially at higher frequency. As a result, the gain at higher frequency would be much smaller than the one at lower frequency. To achieve a flat response across the bandwidth,

<sup>1</sup>To calculate $Z_T$, we may need to consider the output resistance of the preceding stage, which lowers the quality factor of the network and makes the frequency response flatter.



Figure 3.4: Transimpedance of coupled resonator

$P = \frac{L_1}{L_3}$ can be increased to increase the gain at higher frequency. An example is shown in Fig. 3.5(b), where $Q_{Ind} = 10$ is assumed and $T = \frac{L_1 + L_3}{L_2}$ is kept unchanged to maintain a constant bandwidth.

The input impedance Zin of the coupled resonator remains constant across large bandwidth. This feature is helpful for performing impedance matching, which is



Figure 3.5: The effects of P and  $Q_{Ind}$  on transimpedance

particularly important for the output matching network to maximize the output power from the active device. As shown in Fig. 3.6, the real part of the input impedance Zin is around  $R_3 = 50 \ \Omega$  within the operating band in the case of high Q and balanced resonator, while the imaginary part is around zero.

It is worth noting that, as the dual of inductively coupled resonator, capacitively



Figure 3.6: The input impedance of coupled resonator

coupled resonators are also able to achieve wideband performance [20, 45]. However, capacitors have smaller quality factors and thus more losses than inductors at mm-wave frequency. Moreover, through topological transformation, inductively coupled resonators can integrate transformers to perform differential to single-ended transformation and absorb potential layout parasitics to enable compact layout, a feature that cannot be leveraged in capacitively coupled resonators.

#### 3.3.2 Impedance Transformation Techniques

Inductively coupled resonator is promising to achieve wideband performance, but it poses some limitations on the resonators. For example,  $C_1 \approx C_3$  is required to obtain a symmetric network response. Considering the interstage matching network,  $C_1$  is the output capacitor of the input stage, and  $C_3$  is the input capacitor of the output stage. To make  $C_1 \approx C_3$ , the input stage needs to be very large, thus consuming a large amount of power, which will degrade the power efficiency of the system. To avoid this problem, we can insert an explicit capacitor  $C_E$  while using a small input stage to make  $C_1 + C_E \approx C_3$ . However, the explicit capacitor will increase the Q factor of the network and the in-band ripple, as illustrated in Fig. 3.4(b). To circumvent this problem, design techniques are required to transform impedance while preserving the network response.

A simple impedance transformation technique is to use a transformer. As depicted in Fig. 3.7(a), an ideal transformer with a ratio of 1 : N can change by N times the impedance at one side of the circuit, without affecting the impedance at the other side. As a result, this technique can allow a large scaling factor between the input stage and output stage while achieving wideband matching. Furthermore, the transformer can be used to decouple two stages and provide supply and bias voltages.



Figure 3.7: Impedance transformation techniques

The Norton transform can also be used for impedance transformation [46]. The transformation is performed on an appropriate pair of adjacent branches of a ladder network. In the simplest forms, the branches may contain only a pair of single capacitors or inductors. An example is illustrated in Fig. 3.7(b), where the two inductors $L_S$ and $L_P$ in the left network are flipped horizontally and scaled down by M times. At the same time, the impedance $Z_S$ is scaled down by $M^2$ times, where M depends on the circuit elements:

$$M = \frac{L_S + L_P}{L_P}.$$
(3.5)

Note that, Norton transform can also be applied to the right network to scale up the impedance by  $M^2$  times. In other words, depending on the configuration of the component pair, this technique can increase or decrease the impedance.

It is worth noting that both techniques do not change the order of the network or the frequency response. So they can be conveniently used in wideband networks without impairing the bandwidth. Moreover, compared to inserting a transformer, Norton transform not only changes the impedance, but also rearranges circuit topology, which may be used to enable a compact layout and minimize the losses. In the following sections, detailed discussions will show how these impedance transformation techniques are used in coupled resonators to achieve wideband matching.

#### 3.3.3 Output Matching Network

For a given power device and technology, the maximum output voltage $V_{max}$ is constrained by the device's breakdown voltage, while the maximum output current $I_{max}$ is limited by the device size and the input voltage drive. To maximize the output power from a given device, a specific load impedance should be presented at the device output [47]. The desired impedance is normally determined through large-signal load-pull simulation or measurement. To a first-order approximation, the optimal load can be expressed as an inductance $L_{opt}$, which resonates with the device output capacitance $C_{device}$ at the operation frequency $\omega$, and a parallel resistor $R_{opt}$, given by the following equations:

$$L_{opt} \approx \frac{1}{\omega^2 C_{device}}$$

$$R_{opt} \approx \frac{V_{max} - V_{knee}}{I_{max}}$$
(3.6)

where $V_{knee}$ represents the knee voltage of the power device. It is worth noting that $C_{device}$ and $R_{opt}$ can be considered constant across the operation frequency range [48].

Therefore, to achieve a high power broadband PA, the output matching network needs to provide a desired load over a wide bandwidth with minimum insertion loss. A design procedure, illustrated in Fig. 3.8, which starts from the coupled resonator of Fig. 3.3 and leads to a transformer-based PA output matching network is here presented. The procedure leverages impedance transformation techniques to transform the standard 50  $\Omega$  to the desired load impedance across a wide frequency range.

Firstly, a Norton transformation is performed on inductors  $L_2$  and  $L_3$ , which get topologically swapped in Fig. 3.8(b). This action scales down the input impedance Zin by a ratio of

$$N^2 = \left(\frac{L_2 + L_3}{L_3}\right)^2. \tag{3.7}$$

Secondly, the two shunt inductors $\frac{L_1}{N^2}$ and $\frac{L_3}{N}$ are combined as $L_A$, and the inductor $\frac{L_2}{N}$ is split into $L_{B1}$ and $L_{B2}$, with the following relationships:

$$L_{A} = \frac{L_{1}L_{3}}{NL_{1} + N^{2}L_{3}}$$

$$L_{B1} = (\frac{1}{k^{2}} - 1)L_{A}$$

$$L_{B2} = \frac{L_{2}}{N} - L_{B1}$$
(3.8)

where k is a positive number smaller than 1.



Figure 3.8: Output matching network

Thirdly, an ideal transformer with a turn ratio of  $1 : k\sqrt{r}$  is inserted between  $L_{B1}$  and  $L_{B2}$ , as shown in Fig. 3.8(d). This action further downscales the impedance to the left of the transformer by a factor of  $k^2r$ .

Finally, the network highlighted in the dashed box represents a non-ideal transformer model with both leakage inductances shifted to the secondary side [49], and can be replaced by a practical transformer with an actual turn ratio of r and coupling coefficient of k. The design equations of the transformer's inductances $L_P$ and $L_S$ are given as

$$L_P = \frac{L_A}{k^2 r}$$

$$L_S = \frac{L_A}{k^2}.$$
(3.9)

To sum up, through a Norton transformation and insertion of a transformer, the inductively coupled resonator can be used to achieve a third-order bandpass function with both impedance conversion and differential to single-ended signal combining. The proposed network achieves a total impedance down transformation of  $k^2 r N^2$ , thus the optimal load now can be calculated as

$$R_{opt} = \frac{R_L}{k^2 r N^2} \tag{3.10}$$

where $R_L = 50 \ \Omega$, N is the ratio $\frac{L_2 + L_3}{L_3}$, and k and r are the coupling coefficient and turn ratio of the transformer.

Compared to the canonical third-order bandpass filter based network in [43], the above design method naturally utilizes the parasitic inductor  $L_{trace}$  of the trace between the transformer and GSG pads. This feature allows the network to be easily implemented as a compact layout to minimize the insertion loss.

As can be seen from Eq.(3.10), to achieve a large output power, i.e. to obtain a small $R_{opt}$, k, r and N need to be large. However, the maximum achievable k is around 0.85 in practice, and r is usually set to 1 to minimize the loss of the transformer. Furthermore, a large N means a small $\frac{L_3}{L_2}$, which, as discussed in Sec. 3.3.1, reduces the bandwidth of the matching network. This implies a trade-off between output power and bandwidth. Moreover, there is another limitation: the inductor $L_{B2}$ should be equal to or larger than the parasitic inductor $L_{trace}$. In this design, $k \approx 0.65$ and $N \approx 1.8$ are chosen to make $L_{B2}$ equal to $L_{trace} \approx 30$ pH.
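The design equations (3.7) to (3.10) can be evaluated with a short script. The following is a minimal sketch, not the sizing flow used for the chip: the function name `output_network` and the inductor values $L_1 = L_3 = 173$ pH and $L_2 = 138.4$ pH are hypothetical, picked only so that $N = 1.8$ and $L_{B2}$ comes out close to the 30 pH trace inductance mentioned above.

```python
def output_network(L1, L2, L3, k, r, R_L=50.0):
    """Evaluate Eqs. (3.7)-(3.10); inductances in henries, resistances in ohms."""
    N = (L2 + L3) / L3                      # Norton ratio, Eq. (3.7)
    L_A = L1 * L3 / (N * L1 + N**2 * L3)    # combined shunt inductor, Eq. (3.8)
    L_B1 = (1.0 / k**2 - 1.0) * L_A         # leakage part absorbed by the transformer
    L_B2 = L2 / N - L_B1                    # remaining series inductor (must be >= L_trace)
    L_P = L_A / (k**2 * r)                  # transformer primary, Eq. (3.9)
    L_S = L_A / k**2                        # transformer secondary, Eq. (3.9)
    R_opt = R_L / (k**2 * r * N**2)         # load seen by the output stage, Eq. (3.10)
    return {"N": N, "L_A": L_A, "L_B1": L_B1, "L_B2": L_B2,
            "L_P": L_P, "L_S": L_S, "R_opt": R_opt}


pH = 1e-12
# Hypothetical starting values (not taken from the thesis).
net = output_network(L1=173 * pH, L2=138.4 * pH, L3=173 * pH, k=0.65, r=1.0)
for name, value in net.items():
    print(name, value)
```

With these rounded inputs, Eq. (3.10) gives an $R_{opt}$ in the high 30s of ohms, in the neighborhood of the value quoted for this design. Sweeping `k` and `N` in such a script makes the trade-off between $R_{opt}$ and the constraint $L_{B2} \geq L_{trace}$ easy to see.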

It is worth noting that, using the $\pi$ model of a transformer, the inductively coupled resonator in Fig. 3.8(a) can be replaced by a non-ideal transformer [50]. However, compared to a simple transformer, inductively coupled resonators with Norton transformations lead to a larger output power and allow the network topology to be rearranged, while conveniently including layout parasitics.

Besides the bandwidth and impedance transformation, the power efficiency is also critical for the output matching network. For example, if the network has 3 dB insertion loss, the output power will drop by 3 dB and the PAE will drop by more than 50%. To estimate the insertion loss, we need to consider the parasitic resistance of the inductors. The output network in Fig. 3.8(d) can be redrawn as in Fig. 3.9(a), where $R_A$ is the parasitic resistance of the primary inductor $\frac{L_A}{k^2 r}$, and $R_B$ is the parasitic resistance of the secondary inductor $\frac{L_A}{k^2}$ and the series inductor $L_{B2}$. Assuming all inductors have the same quality factor $Q_{ind}$ at the operating frequency $\omega_C$, the two parasitic resistors can be expressed as:

$$R_A = \frac{L_A}{k^2 r} \frac{\omega_C}{Q_{ind}}$$

$$R_B = \left(\frac{L_A}{k^2} + L_{B2}\right) \frac{\omega_C}{Q_{ind}}.$$
(3.11)



(a) output matching network with parasitic resistors



(b) equivalent output matching network with parasitic resistors

Figure 3.9: Loss of output matching network

To facilitate the efficiency calculation, we can replace the transformer with its T model [51], as shown in Fig. 3.9(b), where the inductor  $L_C$  is the primary leakage inductance of the transformer, the inductor  $L_M$  the mutual inductance, and the inductor  $L_D$  is the sum of the secondary leakage inductance of the transformer and the serial inductance  $L_{B2}$ , expressed as:

$$L_{C} = \frac{(1 - k\sqrt{r})L_{A}}{k^{2}r}$$

$$L_{M} = \frac{L_{A}}{k\sqrt{r}}$$

$$L_{D} = \frac{(\sqrt{r} - k)L_{A}}{k^{2}\sqrt{r}} + L_{B2}.$$
(3.12)

Considering the power loss due to  $R_A$  and  $R_B$  in Fig. 3.9(b), we have

$$\eta = \frac{P_L}{P_A + P_B + P_L} = \frac{|I_L|^2 R_L}{|I_A|^2 R_A + |I_B|^2 R_B + |I_L|^2 R_L}$$

$$\frac{I_B}{I_A} = \frac{j\omega L_M}{R_B + \frac{R_L}{1 + Q_r^2} + j[(L_M + L_D)\omega - \frac{1}{\omega C_3(1 + \frac{1}{Q_r^2})}]}$$

$$\frac{I_L}{I_B} = \frac{1}{1 + j\omega C_3 R_L} = \frac{1}{1 + jQ_r}$$

$$Q_r = \omega C_3 R_L$$
(3.13)

where $I_A$, $I_B$ and $I_L$ are respectively the currents through the resistors $R_A$, $R_B$ and $R_L$, and $\omega$ is the operating frequency.

Substituting the expressions of  $L_M$ ,  $L_D$ ,  $R_A$  and  $R_B$  into Eq.(3.13), the efficiency  $\eta$  can be expressed as <sup>2</sup>:

$$\eta = \frac{P_L}{P_A + P_B + P_L} = \frac{|I_L|^2 R_L}{|I_A|^2 R_A + |I_B|^2 R_B + |I_L|^2 R_L}$$

$$= \frac{1}{1 + (1 + Q_r^2) \frac{R_B}{R_L} + \frac{(1 + Q_r^2) R_A}{\omega^2 L_M^2 R_L} \{ (R_B + \frac{R_L}{1 + Q_r^2})^2 + [(L_M + L_D)\omega - \frac{1}{\omega C_3(1 + \frac{1}{Q_r^2})}]^2 \}}$$

$$= \frac{1}{1 + \frac{(1 + Q_r^2) N L_3 \omega}{(N + 1) Q_{ind} R_L} + \frac{(N^2 + N)(1 + Q_r^2)}{Q_{ind} \omega L_3 R_L} \{ (\frac{N L_3 \omega}{(N + 1) Q_{ind}} + \frac{R_L}{1 + Q_r^2})^2 + [\frac{N}{N + 1} L_3 \omega - \frac{1}{\omega C_3(1 + \frac{1}{Q_r^2})}]^2 \}}.$$
(3.14)

From Eq.(3.14), we can see that, to enhance the efficiency, the inductor quality factor $Q_{ind}$ should be large. In addition, in narrowband PAs, the values of $Q_r$ and N can be chosen to maximize the efficiency [52, 53]. However, in wideband PAs, $Q_r$ and N are constrained by the required bandwidth. This implies a trade-off between the bandwidth and the efficiency of the output network. Another interesting insight can be found from Eq.(3.10) and Eq.(3.14): the coupling coefficient k only affects the optimal load $R_{opt}$ but not the efficiency $\eta$<sup>3</sup>. As a result, k can be used to tune $R_{opt}$ for different output power levels. In this work, $k \approx 0.65$ and $R_{opt} \approx 40 \ \Omega$ are chosen to satisfy the power requirement and facilitate the design.
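The claim that k does not affect $\eta$ can be checked numerically by evaluating the current ratios of Eq.(3.13) directly. The sketch below is only illustrative: the function name `network_efficiency` and the values $L_A = 34.3$ pH, $\frac{L_2}{N} = 76.9$ pH, $C_3 = 40$ fF and $Q_{ind} = 15$ are assumptions, not values reported for the chip, and $L_D$ follows the text's definition as the secondary leakage inductance plus $L_{B2}$.

```python
from math import pi, sqrt


def network_efficiency(k, r, L_A, L2_over_N, C3, Q_ind, f, R_L=50.0):
    """Passive efficiency from Eqs. (3.11)-(3.13) at frequency f (Hz)."""
    w = 2 * pi * f
    L_B1 = (1 / k**2 - 1) * L_A
    L_B2 = L2_over_N - L_B1
    R_A = L_A / (k**2 * r) * w / Q_ind             # primary loss, Eq. (3.11)
    R_B = (L_A / k**2 + L_B2) * w / Q_ind          # secondary + series loss, Eq. (3.11)
    L_M = L_A / (k * sqrt(r))                      # mutual inductance, Eq. (3.12)
    L_D = (sqrt(r) - k) * L_A / (k**2 * sqrt(r)) + L_B2  # leakage plus L_B2 (per the text)
    Q_r = w * C3 * R_L
    I_A = 1.0                                      # reference current through R_A
    I_B = I_A * 1j * w * L_M / (
        R_B + R_L / (1 + Q_r**2)
        + 1j * ((L_M + L_D) * w - 1 / (w * C3 * (1 + 1 / Q_r**2))))
    I_L = I_B / (1 + 1j * Q_r)
    P_A = abs(I_A)**2 * R_A
    P_B = abs(I_B)**2 * R_B
    P_L = abs(I_L)**2 * R_L
    return P_L / (P_A + P_B + P_L)


# Hypothetical component values; only k is swept.
etas = [network_efficiency(k, 1.0, 34.3e-12, 76.9e-12, 40e-15, 15.0, 50e9)
        for k in (0.6, 0.65, 0.85)]
print(etas)
```

Under these assumptions the three efficiencies coincide to within floating-point rounding, because $R_A / (\omega^2 L_M^2)$ and $L_M + L_D$ do not depend on k, while changing `Q_ind` moves all of them together.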

The finite $Q_{ind}$ and interwinding capacitance of the transformer may change the behavior of the matching network, so the values of the passive components may need fine tuning using simulation software such as Agilent Advanced Design System (ADS). After several optimizations, the input impedance $Z_{in}$ and power efficiency $\eta$ of the final output network are shown in Fig. 3.10.

<sup>2</sup>In Eq.(3.14), $L_1 = L_3$, which is close to the real case, is assumed to simplify the calculation with little penalty on accuracy.

<sup>3</sup>This is different from [52, 53], where k affects both $R_{opt}$ and $\eta$. The divergence stems from the fact that, according to Eq. (3.11), the parasitic resistor $R_A$ in our design is inversely proportional to $k^2$, while in the analysis of [52, 53] $R_A$ is assumed to be independent of k.

Figure 3.10: Impedance and efficiency of output matching network

As can be seen from Fig. 3.10, the real part of the input impedance $Zin_{re}$ is very close to $R_{opt} = 40 \ \Omega$ in the frequency band from 40 GHz to 62 GHz. The imaginary part $Zin_{im}$ is around zero. The efficiency of the network is larger than 60% between 30 GHz and 62 GHz, with a peak of around 73% at 42 GHz. The passive efficiency drops at higher frequencies mainly due to metal and substrate losses.

The transimpedance $Z_T$ of the output matching network is simulated and shown in Fig. 3.11<sup>4</sup>. $Z_T$ has only 0.5 dB ripple between 40 GHz and 60 GHz, which guarantees a flat gain response.

## 3.3.4 Interstage Matching Network

The design of the interstage matching network is also an important step in the PA design process. Together with the input stage, the interstage network needs to deliver enough power to drive and even saturate the output stage over the whole operation bandwidth. To enhance the power efficiency of the system, the interstage network needs to maximize the power delivered to the output stage so as to allow the usage of a small input stage. In addition, the interstage matching network needs to be layout-friendly to minimize the insertion loss.

<sup>4</sup>In Fig. 3.11 the output resistance of the output stage is also taken into account.

Figure 3.11: Transimpedance of output matching network

Thanks to its wideband performance, the inductively coupled resonator can be used as the interstage matching network, as shown in Fig. 3.12, where $R_1$ and $C_1$ represent the output impedance of the input stage, while $R_3$ and $C_3$ represent the input impedance of the output stage.



Figure 3.12: Coupled resonator as interstage matching network

Compared to Fig. 3.3, Fig. 3.12 considers the output resistance $R_1$ of the input stage. In fact, the ripple of the interstage matching network is mainly determined by the Q factors of the two resonators, i.e. $R_1C_1\omega_C$ and $R_3C_3\omega_C$, where $\omega_C$ is the center operating frequency. Since in this process (ST 28 nm LP CMOS) $Q_1 = R_1C_1\omega_C \approx 2$ is much smaller than $Q_3 = R_3C_3\omega_C \approx 12$, the ripple is mainly determined by $R_1C_1\omega_C$. As discussed in Sec. 3.3.1, the frequency response of the network in Fig. 3.12 has two peaking frequencies, which are defined in Eq.(3.3). A large $T = \frac{L_1+L_3}{L_2}$ leads to a large bandwidth, and $Q_1 = Q_3$ is required to achieve a symmetric frequency response. However, $Q_3$ is about 6 times as large as $Q_1$. As a result, explicit resistors or capacitors are required to compensate for the difference. An explicit resistor $R_E$ can be added in parallel with $R_3$ to decrease $Q_3$, but it severely degrades the gain and efficiency. For example, $R_E \approx \frac{R_3}{5}$ is required to decrease $Q_3$ from 12 to 2, which also reduces the gain by 83%. Alternatively, an explicit capacitor $C_E$ can be added in parallel with $C_1$ to make $Q_1 \approx Q_3$. However, the increased $Q_1$ deteriorates the in-band ripple and limits the bandwidth of the network. Other approaches are therefore needed.
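The numbers in the resistor example follow from the fact that $Q_3 = R_3C_3\omega_C$ is proportional to the total shunt resistance. A small sketch of that arithmetic is given below; the helper name `shunt_resistor_for_q` is hypothetical, and the gain is assumed to scale linearly with the total shunt resistance at that node.

```python
def shunt_resistor_for_q(q_old, q_new, r3=1.0):
    """Parallel resistor that lowers Q = R*C*w_C from q_old to q_new.

    Returns (R_E, fractional gain reduction), with R_E in the units of r3.
    Assumes the gain is proportional to the total shunt resistance.
    """
    r_total = r3 * q_new / q_old              # required R3 || R_E
    r_e = r3 * r_total / (r3 - r_total)       # solve 1/r_total = 1/r3 + 1/r_e
    gain_drop = 1.0 - r_total / r3
    return r_e, gain_drop


r_e, drop = shunt_resistor_for_q(12, 2)       # R_E in units of R_3
print(r_e, drop)
```

With $Q_3$ going from 12 to 2, this gives $R_E = R_3/5$ and a gain reduction of $5/6$, i.e. the 83% quoted in the text.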

Fortunately, when $Q_1 \ll Q_3$, choosing $C_1 \approx C_3$ also yields a quasi-symmetrical response. Therefore, in this design, an explicit capacitor $C_{Add}$ is added in parallel with $C_1$ to make $C_1 + C_{Add} \approx C_3$ and ensure a symmetrical response, while an explicit resistor $R_{Add}$ is added in parallel with $R_1$ to lower the quality factor of the network and achieve the required bandwidth. Moreover, impedance transformation techniques are utilized to scale down the size of the input stage and improve power efficiency. The detailed procedure is explained in Fig. 3.13.

Firstly, the interstage matching network starts from a balanced coupled resonator in Fig. 3.13(a) with $C_{1,t} \approx C_3$ and $L_{1,t} \approx L_3$, where $C_{1,t}$ is the sum of the output capacitance $C_1$ and the explicit capacitor $C_{Add}$, and $R_{1,t}$ is the parallel combination of the output resistance $R_1$ and the explicit resistor $R_{Add}$, given as:

$$C_{1,t} = C_1 + C_{Add}$$

$$R_{1,t} = \frac{R_1 R_{Add}}{R_1 + R_{Add}}.$$
(3.15)

In this design, $C_1 \approx \frac{C_3}{2}$, i.e. $C_{Add} \approx C_1$, is chosen to provide enough power for the output stage. $R_{1,t}$ and thus $R_{Add}$ are designed to obtain a reasonable in-band ripple, while $L_2$ is set to achieve the desired bandwidth. Note that the output stage is represented by $R_3$ and $C_3$, while the symbol of the MOS transistor is omitted for simplicity. *Vin* and *Vout* are the gate-source voltages of the input and output stage respectively.

Secondly, the inductor  $L_2$  is split into  $L_{2a}$  and  $L_{2b}$ , with the following relationships:

$$L_{2a} = \left(\frac{1}{k^2} - 1\right)L_1$$

$$L_{2b} = L_2 - L_{2a}$$
(3.16)

where k is a positive number smaller than 1.

Figure 3.13: Interstage matching network

Thirdly, an ideal transformer with a turn ratio of $1 : k\sqrt{r}$ is inserted between $L_{2a}$ and $L_{2b}$, as shown in Fig. 3.13(c), where $k\sqrt{r}$ is chosen to be smaller than 1. This action upscales all the impedances at the input stage side by a factor of $\frac{1}{k^2r}$. Since the ideal transformer has a voltage gain of $k\sqrt{r} < 1$, the transconductance and thus the size of the input stage need to be downscaled by $\frac{1}{k\sqrt{r}}$ rather than $\frac{1}{k^2r}$ to achieve the same voltage gain $\frac{Vout}{Vin}$. The output impedance of the scaled input stage, which is inversely proportional to the new size, is

$$C'_{1} = k\sqrt{r}C_{1}$$

$$R'_{1} = \frac{R_{1}}{k\sqrt{r}}.$$
(3.17)

Therefore, the output impedance scales up by a factor of $\frac{1}{k\sqrt{r}}$, which is different from the scaling factor $\frac{1}{k^2r}$ of the overall impedance at the input stage side. As a result, the explicit capacitor and resistor need to be modified accordingly. The new explicit capacitor $C'_{E}$ and resistor $R'_{add}$, which replace $C_{Add}$ and $R_{Add}$ (written $C_E$ and $R_{add}$ in the following equations), can be calculated as

$$C'_{E} = k^{2}rC_{1,t} - C'_{1}$$

$$R'_{add} = \frac{1}{\frac{k^{2}r}{R_{1,t}} - \frac{1}{R'_{1}}}.$$
(3.18)

Substituting Eq. (3.15) and Eq. (3.17) into Eq. (3.18), we have

$$C'_{E} = k^{2} r C_{1,t} - C'_{1} = k^{2} r [C_{E} - (\frac{1}{k\sqrt{r}} - 1)C_{1}]$$

$$R'_{add} = \frac{1}{\frac{k^{2}r}{R_{1,t}} - \frac{1}{R'_{1}}} = \frac{1}{k^{2}r} \frac{R_{1}R_{add}}{R_{1} - (\frac{1}{k\sqrt{r}} - 1)R_{add}}.$$
(3.19)

Note that the parallel explicit impedances scale by a factor larger than $\frac{1}{k^2r}$, and thus their detrimental effects are reduced by inserting the transformer. In fact, the explicit $C'_E$ and $R'_{add}$ can be removed entirely if the following equations are satisfied:

$$C_E = \left(\frac{1}{k\sqrt{r}} - 1\right)C_1$$

$$R_{add} = \frac{k\sqrt{r}}{1 - k\sqrt{r}}R_1$$
(3.20)

which is the case of this design.
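The cancellation condition of Eq.(3.20) can be verified numerically from Eqs. (3.17) and (3.18). The sketch below is illustrative only: the function name `residual_explicit`, the values $C_1 = 20$ fF and $R_1 = 200 \ \Omega$, and the use of a single parameter `a` for $k\sqrt{r}$ are assumptions. The residual resistor is returned as a conductance so that the case $R'_{add} \to \infty$ does not divide by zero.

```python
def residual_explicit(a, C1, R1, C_E, R_add):
    """Residual explicit capacitor and conductance after inserting the transformer.

    a = k*sqrt(r). Implements Eqs. (3.15), (3.17) and (3.18), with the
    resistor expressed as a conductance G'_add = 1 / R'_add.
    """
    C_1t = C1 + C_E                        # Eq. (3.15)
    G_1t = 1.0 / R1 + 1.0 / R_add          # 1 / R_{1,t}, Eq. (3.15)
    C1_scaled = a * C1                     # C'_1, Eq. (3.17)
    G1_scaled = a / R1                     # 1 / R'_1, Eq. (3.17)
    C_E_new = a**2 * C_1t - C1_scaled      # C'_E, Eq. (3.18)
    G_add_new = a**2 * G_1t - G1_scaled    # 1 / R'_add, Eq. (3.18)
    return C_E_new, G_add_new


a = 0.5                                    # k*sqrt(r); size reduction factor 1/a = 2
C1, R1 = 20e-15, 200.0                     # hypothetical input-stage output impedance
C_E = (1.0 / a - 1.0) * C1                 # Eq. (3.20)
R_add = a / (1.0 - a) * R1                 # Eq. (3.20)
C_E_new, G_add_new = residual_explicit(a, C1, R1, C_E, R_add)
print(C_E_new, G_add_new)
```

For $k\sqrt{r} = 0.5$, Eq.(3.20) reduces to $C_E = C_1$ and $R_{add} = R_1$, and both residuals vanish, so no explicit component remains on the primary side.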

Finally, the network highlighted in dashed box in Fig. 3.13(c) represents a nonideal transformer model with both leakage inductances shifted to the secondary side [49], and can be replaced by a practical transformer with an actual turn ratio of r and coupling coefficient of k. The design equation of the transformer's primary inductance  $L_{P,inter}$  is given as

$$L_{P,inter} = \frac{L_1}{k^2 r}.$$
(3.21)

To sum up, a balanced coupled resonator is employed as the interstage matching network to obtain a symmetric frequency response. Furthermore, an ideal transformer is inserted to reduce the size and power consumption of the input stage by a factor of $\frac{1}{k\sqrt{r}}$, i.e. 2 in this design, while achieving the same voltage gain and network response. As a result, an input stage which is only one fourth of the output stage is used in this design to enhance power efficiency. In this regard, it is worth considering the effect of reducing the size of the input stage on the power gain $G_P$:

$$G_P = \frac{Pout}{Pin} = \frac{Vout^2}{R_3} \frac{Rin}{Vin^2} = \frac{Vout^2}{Vin^2} \frac{Rin}{R_3}$$
(3.22)

where Rin is the input resistance of the input stage. Since the voltage gain $\frac{Vout}{Vin}$ and the input resistance $R_3$ of the output stage are constant, the power gain $G_P$ is proportional to Rin. Therefore, reducing the size of the input stage increases its input impedance Rin and leads to a larger power gain $G_P$.

Compared to the final output matching network in Fig. 3.8(e), the final interstage matching network in Fig. 3.13(d) has an extra inductor $L_3$ at the input of the output stage. The difference stems from the fact that a Norton transform is performed on $L_3$ when designing the output matching network. This Norton transform scales down the impedance at the left side and allows a larger output power to be obtained. However, the goal of the interstage matching network is to upscale the impedance at the left side so as to reduce the size of the input stage for high power efficiency. As a result, no Norton transform is performed and $L_3$ remains.

## 3.4 Active Stages Design

One of the biggest design issues for mm-wave PAs is the limited power gain at frequencies close to the cut-off frequency of the technology. For example, even in an advanced technology such as a 28 nm CMOS process, the power gain of a common source amplifier is only around 12 dB at 50 GHz. On top of that, due to the feedback from output to input via the gate-drain capacitor $C_{GD}$, the stability of the amplifier is impaired. In fact, many mm-wave PAs tend to oscillate at frequencies lower than the operating frequency [54], where the transistor gain increases much more quickly than the losses of the matching networks. As a result, explicit damping or loss is required to suppress potential oscillation [39, 50]. However, the explicit loss degrades the gain and limits the power efficiency.

In this design, capacitive neutralization, which can cancel out the feedback due to  $C_{GD}$  [40, 55], is adopted for both active stages to enhance the stability and power gain. Furthermore, inductive degeneration is employed on the input stage to lower the quality factor of input impedance and achieve wideband input matching.

## 3.4.1 Capacitive Neutralization

A differential neutralized amplifier is shown in Fig. 3.14, where two cross-coupled capacitors $C_N$ are added to minimize the detrimental effect of the intrinsic gate-drain capacitor $C_{GD}$. Specifically, thanks to the opposite phase of the differential outputs, the feedback current from $Out_N$ to $In_P$ via $C_{GD}$ is opposite to the feedback current from $Out_P$ to $In_P$ via $C_N$. Therefore, the overall feedback current from the differential outputs to $In_P$ can be zero. The same holds for $In_N$.

It can be proven that the feedback from output to the input can be completely removed when the following relationship is satisfied [55]:

$$C_N \approx C_{GD}.\tag{3.23}$$
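The cancellation can be illustrated with a few lines of arithmetic. The sketch below is a simplified model, not the proof in [55]: it considers only the two capacitive paths into $In_P$, assumes perfectly differential outputs ($Out_N = -Out_P$), ignores all other parasitics, and uses hypothetical capacitor values and a hypothetical function name `feedback_current`.

```python
from math import pi


def feedback_current(v_out_p, c_gd, c_n, f):
    """Output-induced current into In_P for a neutralized differential pair.

    C_GD couples Out_N to In_P and C_N couples Out_P to In_P; the outputs
    are assumed perfectly differential, so v_out_n = -v_out_p.
    """
    w = 2 * pi * f
    v_out_n = -v_out_p
    return 1j * w * (c_gd * v_out_n + c_n * v_out_p)


# Hypothetical capacitances at 50 GHz with a 1 V output amplitude.
i_matched = feedback_current(1.0, 30e-15, 30e-15, 50e9)
i_unneutralized = feedback_current(1.0, 30e-15, 0.0, 50e9)
print(abs(i_matched), abs(i_unneutralized))
```

In this idealized model the residual feedback is proportional to $C_N - C_{GD}$, which is why the neutralizing capacitor should track $C_{GD}$ as closely as possible.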



Figure 3.14: Schematic of a differential neutralized amplifier

As illustrated in Fig. 3.14, capacitive neutralization can be conveniently implemented in a differential topology to achieve a nearly unilateral behavior with no penalty in power consumption. As a result, both stability and reverse isolation are improved. A good reverse isolation minimizes the interaction between matching networks and simplifies their design, which is critical to achieve a wideband performance. Furthermore, since the feedback due to $C_{GD}$ is negative, annulling the feedback can also improve the power gain.

The cascode structure is traditionally used to improve the stability and gain at low frequencies. However, since the parasitic capacitance has a high admittance at mm-wave frequencies, the improvements in stability and gain are limited, especially after considering the layout parasitics [55]. Moreover, due to the voltage drop on the common gate transistor, the output swing and thus the output power and efficiency are impaired.

To get some quantitative understanding, we compare the power gain of 3 different amplifier topologies: common source amplifier with and without neutralization and cascode amplifier, where all the transistors of all the amplifiers have the same size, all the common source transistors have the same bias voltage 0.5 V and the common gate ones are biased at supply voltage 1 V. The simulated power gains are shown in Fig. 3.15, where CS,  $CS_{Neu}$  and Cas represent the power gain of common source amplifier without and with neutralization and cascode amplifier, respectively.

Compared to the simple common source amplifier, the neutralized one offers 12 dB more gain at 10 GHz, and higher gain up to 200 GHz. In addition, the neutralized common source amplifier achieves absolute stability across all frequencies, while the common source one is only conditionally stable below 200 GHz. Compared to the cascode amplifier, the neutralized common source amplifier has both higher gain and better stability.

Figure 3.16: Q factors of input and output impedances with and without neutralization

Capacitive neutralization is very attractive with respect to stability and gain. However, neutralization increases the Q factors of both input and output impedances and poses difficulties on wideband matching. Fig. 3.16 compares the Q factors of a differential amplifier with and without neutralization, where Qin and Qout are the Q factors of input and output impedance without neutralization, and  $Qin_{Neu}$  and  $Qout_{Neu}$  the Q factors of input and output impedance with neutralization<sup>5</sup>.

As illustrated in Fig. 3.16, both input and output Q factors rise with capacitive neutralization. The output Q factor remains small even after neutralization, and thus causes few problems for matching. However, the input Q factor increases to an unacceptable value and poses difficulty for wideband impedance matching [56]. The matching challenge and the technique used to overcome it are discussed in Sec. 3.4.3.

### 3.4.2 Output Stage

The schematic of the output stage is shown in Fig. 3.17, where  $M_{1,2}$  are the input transistors,  $M_{3,4}$  the drain-source shorted transistors serving as neutralizing capacitors. Note that compared to MOM capacitors, transistor-based capacitors cost less area and can better track the  $C_{GD}$  of input transistors [57].



Figure 3.17: Output stage and the driver of modulating signal

<sup>5</sup>Q is defined as the ratio between the imaginary part of Y11 and its real part, i.e. $Q = \frac{Im\{Y11\}}{Re\{Y11\}}$.

The output stage is co-designed with the output matching network to minimize the loss of the network while delivering enough output power. After several optimizations, the size of the input transistors is chosen to be 200 µm/28 nm, and the size of the neutralizing transistors is set to 100 µm/28 nm to maximize the stability.

#### 3.4.3 Input Stage

To achieve a wideband performance, the input stage not only needs to provide enough power to drive the output stage over the desired bandwidth, but also needs to facilitate the design of the input matching network. The latter proved to be more difficult in this design. To gain some insight, we can refer to the Bode-Fano criterion [56], given as

$$\int_0^\infty \ln \frac{1}{|\Gamma|} \,\mathrm{d}\omega \le \frac{\pi\omega_C}{Q} \tag{3.24}$$

where $\Gamma$ is the reflection coefficient, $\omega_C$ the center frequency, and Q the quality factor of the input impedance of the input stage. In the ideal case, where $|\Gamma|$ is constant within the matching band and close to 1 out of band, Eq.(3.24) can be simplified as

$$\Delta \omega \ln \frac{1}{|\Gamma|} \le \frac{\pi \omega_C}{Q} \tag{3.25}$$

where $\Delta \omega$ is the matching bandwidth. Eq.(3.25) shows that an impedance with a low Q factor is necessary to achieve a good matching over a large bandwidth. For example, to achieve $S11 \leq -10$ dB, i.e. $|\Gamma| \leq 10^{-0.5}$, within a 20 GHz frequency band around 50 GHz, the Q factor of the input impedance of the input stage needs to be smaller than 7.
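The bound on Q in this example follows directly from Eq.(3.25). A minimal sketch of the calculation is shown below; the function name `max_q_bode_fano` is hypothetical, and frequencies can be passed in hertz because only the ratio $\omega_C / \Delta\omega$ enters the bound.

```python
from math import log, pi


def max_q_bode_fano(f_center, bandwidth, s11_db):
    """Largest load Q allowed by Eq. (3.25) for a flat in-band |S11| (in dB)."""
    gamma = 10 ** (s11_db / 20.0)           # |Gamma| from S11 in dB
    return pi * f_center / (bandwidth * log(1.0 / gamma))


q_max = max_q_bode_fano(f_center=50e9, bandwidth=20e9, s11_db=-10.0)
print(q_max)
```

For a 20 GHz band around 50 GHz and $S11 \leq -10$ dB, the bound evaluates to roughly 6.8, consistent with the requirement that Q be smaller than 7.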

However, in this technology, the Q factor of the input impedance of a common source amplifier is very high. Furthermore, the capacitive neutralization, which is used to ensure stability and improve power gain, further increases the Q factor. After considering the layout parasitics, the Q factor of the input impedance of a neutralized common source amplifier is around 16, which makes it impossible to achieve the required 20GHz bandwidth. As a result, it is necessary to use explicit techniques to lower the Q factor. A straightforward method is to add resistors [43], but the explicit resistors will consume power and degrade the gain and efficiency severely. For example, to decrease the Q factor from 16 to 4, the gain of the matching network will drop by 6 dB.

To circumvent the aforementioned drawbacks, this design utilizes inductive degeneration, which is commonly used for input matching in LNAs [17], to generate a real part and lower the Q factor of the input impedance. The schematic of the input stage is shown in Fig. 3.18, where two degenerating inductors $L_{Deg}$ are added at the sources of the transistors to decrease the Q factor.



Figure 3.18: Schematic of input stage

To get some quantitative understanding, we can compare the Q factors of the input impedance of the amplifier with and without inductive degeneration. The input impedance $Zin_{wo}$ of a common source amplifier without degeneration and its quality factor $Qin_{wo}$ can be calculated as:

$$Zin_{wo} \approx R_G + \frac{1}{C_{GS}s}$$

$$Qin_{wo} \approx \frac{1}{R_G C_{GS} \omega_C}$$
(3.26)

where  $R_G$  is the gate resistance and  $C_{GS}$  the gate-source capacitor of the input transistor.

For an inductively degenerated amplifier, we have

$$Zin_w \approx R_G + \frac{g_m L_{Deg}}{C_{GS}} + \frac{1}{C_{GS}s} + L_{Deg}s$$

$$Qin_w \approx \frac{1}{(R_G + \frac{g_m L_{Deg}}{C_{GS}})C_{GS}\omega_C}.$$
(3.27)

Comparing Eq.(3.27) with Eq.(3.26), we can see that the degenerating inductor generates a real part $\frac{g_m L_{Deg}}{C_{GS}}$ in the input impedance and decreases *Qin*.
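The effect of the extra real part can be made concrete with a small numerical sketch of Eqs. (3.26) and (3.27). The device values below ($R_G = 10 \ \Omega$, $C_{GS} = 19.9$ fF, $g_m = 19.9$ mS) are hypothetical, back-solved so that Q drops from about 16 to about 4 with a 30 pH inductor; they are not extracted from the fabricated transistor. As in Eq.(3.27), the reactance of $L_{Deg}$ is neglected in the Q estimate.

```python
from math import pi


def input_q(R_G, C_GS, f, g_m=0.0, L_deg=0.0):
    """Input Q of a common source stage, Eq. (3.26), or Eq. (3.27) if L_deg > 0."""
    w = 2 * pi * f
    R_eq = R_G + g_m * L_deg / C_GS        # series real part seen at the gate
    return 1.0 / (R_eq * C_GS * w)


# Hypothetical device parameters at 50 GHz.
R_G, C_GS, g_m = 10.0, 19.9e-15, 19.9e-3
q_without = input_q(R_G, C_GS, 50e9)
q_with = input_q(R_G, C_GS, 50e9, g_m=g_m, L_deg=30e-12)
print(q_without, q_with)
```

With these assumed values, $\frac{g_m L_{Deg}}{C_{GS}} = 30 \ \Omega$, three times $R_G$, which is what lowers Q by a factor of four.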

On the other hand, the gain of a degenerated amplifier is related to the quality factor of its input impedance [51], and a smaller Qin leads to a smaller gain. As a result, Qin cannot be too small. In this design, the degenerating inductor is set to around 30 pH to decrease Qin from 16 to 4, while the maximum available gain drops from 19 dB to 15 dB at 50 GHz. Compared to adding a resistor at the transistor gate, this approach achieves 2 dB higher gain for the same Qin. In addition, inductive degeneration can improve the linearity of the amplifier. Therefore, despite its lower gain, a degenerated amplifier has a larger 1 dB compression output power than an undegenerated one. This feature implies that inductive degeneration allows a better input matching while increasing the linear output power of the input stage.

The design procedure of the input matching network is similar to that of the other two matching networks. Starting from an inductively coupled resonator, impedance transformation techniques are used to match the input stage to 50 $\Omega$ over a large frequency range. The final input matching network is shown in Fig. 3.19, where $L_1$ is the parasitic inductance between the input GSG pads and the transformer, $L_2$ the inductance between the transformer and the input stage, and $L_{Deg}$ the degenerating inductor. $L_{Deg}$ is coupled to $L_2$ to form "nested inductors" [58].



Figure 3.19: Input matching network



Figure 3.20: Input matching layout

The simplified layout of the input matching network is shown in Fig. 3.20, where $L_{Deg}$ is placed inside $L_2$ to simplify the layout routing. Since the gate current $I_{in}$ through $L_2$ and the current $I_{in}+I_{out}$ through $L_{Deg}$ are not in phase, some calculations are required to understand the effect of the coupling. From Fig. 3.19, we have

$$V_{L_{2}} = sL_{2}I_{in} + sM (I_{in} + I_{out})$$

$$V_{L_{Deg}} = sMI_{in} + sL_{Deg} (I_{in} + I_{out})$$

$$V_{GS} = \frac{1}{sC_{GS}}I_{in}$$

$$I_{out} = g_{m}V_{GS} = \frac{g_{m}}{sC_{GS}}I_{in}$$

$$V_{in} = V_{L_{2}} + V_{GS} + V_{L_{Deg}}$$

$$= I_{in} \left(s (L_{2} + 2M + L_{Deg}) + \frac{1}{sC_{GS}} + \frac{g_{m}}{C_{GS}} (L_{Deg} + M)\right)$$

$$Z_{in} = \frac{V_{in}}{I_{in}} = s (L_{2} + 2M + L_{Deg}) + \frac{1}{sC_{GS}} + \frac{g_{m}}{C_{GS}} (L_{Deg} + M)$$
(3.28)

where  $g_m$  is the transconductance of the input stage, M the mutual inductance of  $L_2$  and  $L_{Deg}$ , given as:

$$M = k_2 \sqrt{L_2 L_{Deg}} \tag{3.29}$$

where  $k_2$  is the coupling coefficient of  $L_2$  and  $L_{Deg}$ .

As can be seen from Eq.(3.28), although the currents over  $L_2$  and  $L_{Deg}$  are out of phase, the effect of their coupling on the input impedance  $Z_{in}$  is the same as normal coupling: both inductors are increased by the mutual inductance M. Therefore the physical lengths and ohmic losses of both inductors can be reduced.
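The closed form of Eq.(3.28) can be cross-checked by summing the branch voltages numerically. The following sketch uses hypothetical values ($L_2 = 100$ pH, $L_{Deg} = 30$ pH, $k_2 = 0.3$, $g_m = 20$ mS, $C_{GS} = 20$ fF) and a hypothetical function name; it only confirms the algebra, not the layout-extracted behavior.

```python
from math import pi, sqrt


def nested_input_impedance(L2, L_deg, k2, g_m, C_GS, f):
    """Return Z_in from the branch voltages and from the closed form of Eq. (3.28)."""
    s = 1j * 2 * pi * f
    M = k2 * sqrt(L2 * L_deg)                        # Eq. (3.29)
    i_in = 1.0
    i_out = g_m / (s * C_GS) * i_in                  # I_out = g_m * V_GS
    v_L2 = s * L2 * i_in + s * M * (i_in + i_out)
    v_Ldeg = s * M * i_in + s * L_deg * (i_in + i_out)
    v_gs = i_in / (s * C_GS)
    z_direct = (v_L2 + v_gs + v_Ldeg) / i_in
    z_closed = (s * (L2 + 2 * M + L_deg) + 1 / (s * C_GS)
                + g_m / C_GS * (L_deg + M))
    return z_direct, z_closed, M


# Hypothetical component values at 50 GHz.
z_direct, z_closed, M = nested_input_impedance(
    L2=100e-12, L_deg=30e-12, k2=0.3, g_m=20e-3, C_GS=20e-15, f=50e9)
print(z_direct, z_closed)
```

Both evaluations agree, and the real part equals $\frac{g_m}{C_{GS}}(L_{Deg} + M)$, so the coupling raises the effective degeneration as well as the effective series inductance.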

## 3.5 Complete Circuit

The simplified schematic of the implemented PA is shown in Fig. 3.21. The amplifying transistors of the output stage have a size of $200\mu m/28nm$, while the size of the input stage is only $40\mu m/28nm$. Compared to other PAs in the literature, this design has a relatively small input stage, which is made possible by the large GBW of the proposed network. The small input stage reduces the power consumption and thus improves the overall power efficiency.



Figure 3.21: Complete schematic of the PA

As discussed before, both stages employ capacitive neutralization to enhance stability and increase power gain, while the sizes of the neutralizing transistors are chosen to minimize the feedback from output to input<sup>6</sup>. All the bias and supply voltages are provided at the center taps of the transformers, which facilitates the layout routing and provides good electrostatic discharge (ESD) protection [45]. Furthermore, large resistors are added in the biasing path to suppress common-mode oscillation. All the matching networks are based on inductively coupled resonators to obtain a wideband performance. Impedance transformation techniques are also exploited to maximize the transferred power and enable a compact layout.

The PA has been fabricated in ST 28 nm bulk CMOS LP process. The micrograph of the chip is shown in Fig. 3.22, where all matching networks are marked with white boxes. It can be seen that the output matching network is very compact. The overall chip occupies an area of 620x540  $\mu m^2$ . It is pad limited, and the core area is only 470x120  $\mu m^2$ .



Figure 3.22: Micrograph of the chip

 $<sup>^6\</sup>mathrm{The}$  neutralizing capacitors shown in Fig. 3.21 are implemented by transistors.

## 3.6 Measurements

#### 3.6.1 Measurement Setup

The measurement setup for small signal performance is shown in Fig. 3.23. A vector network analyzer is used to provide the input and measure the output signal via GSG pads. The bias and supply voltages are provided by DC signal generators, which can measure the DC power consumption.



Figure 3.23: Measurement setup for S-parameters

The measurement setup for large signal performance is shown in Fig. 3.24. The input signal is provided by a signal generator, and the output signal is measured by a spectrum analyzer whose operating range extends only up to 50 GHz, so the measurement is limited to 50 GHz. De-embedding the loss of the signal path is critical for PA measurements. For this purpose, we connected the signal generator directly to the spectrum analyzer with a cable and measured the loss from the signal generator to the spectrum analyzer. In the measurement setup, two cables identical to the one used for de-embedding connect the input and the output. After de-embedding, at small input power the PA gain measured in the large signal setup agrees with the gain obtained from the S-parameter setup, which indicates that the de-embedding is accurate.



Figure 3.24: Measurement setup for large signal performance

#### 3.6.2 Measurement Results

The measured S-parameters are shown in Fig. 3.25. The peak gain is 13.3 dB<sup>7</sup>, and the 3 dB bandwidth is from 40 GHz to 67 GHz. S11 is less than -8 dB from 40 GHz to 65 GHz, and S22 is less than -10 dB from 43 GHz to 67 GHz. S12 is less than -40 dB over the whole measured frequency range.

The measured K-factor and group delay are shown in Fig. 3.26. K is larger than 10 and $|\delta| = |S11S22 - S12S21|$ is smaller than 1, ensuring unconditional stability of the PA. In addition, the group delay has an average value of 40 ps with a maximum variation of $\pm 4$ ps from 47 GHz to 67 GHz.
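The stability test used here is the standard Rollett criterion, which requires $K > 1$ together with $|\delta| < 1$. A minimal sketch of the computation from S-parameters follows; the sample magnitudes are taken from the bounds quoted for Fig. 3.25, while the zero phases and the function names are assumptions made only for illustration.

```python
def db_to_mag(db):
    """Convert a magnitude in dB to a linear voltage ratio."""
    return 10 ** (db / 20.0)


def stability(s11, s12, s21, s22):
    """Return (K, |delta|) for a two-port described by complex S-parameters."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    return k, abs(delta)


# Illustrative point: magnitudes from the quoted bounds, phases assumed zero.
k_factor, delta_mag = stability(
    s11=db_to_mag(-8.0), s12=db_to_mag(-40.0),
    s21=db_to_mag(13.3), s22=db_to_mag(-10.0))
print(k_factor, delta_mag)
```

Because K is inversely proportional to $|S12 \cdot S21|$, the low reverse transmission obtained through neutralization is what pushes K well above 1.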

The large signal performance at 50 GHz versus input power is shown in Fig. 3.27. When the input power Pin is small, the gain is almost the same as S21. As Pin increases, the PAE increases and reaches its peak of 16% around Pin = 3 dBm. Pout also increases with Pin and starts to saturate when Pin is large. $P_{SAT}$ is around 13.3 dBm and $P_{1dB}$ is about 12.1 dBm.

The large signal performance versus frequency is shown in Fig. 3.28. Since the measurement instrument can only operate up to 50 GHz, the measurement is limited to the range between 40 GHz and 50 GHz. In this frequency band, $P_{SAT}$ and $P_{1dB}$ are almost constant. The *PAE* is larger than 15% from 47 GHz to 50 GHz. It is worth noting that the gain S21 peaks around 42 GHz while $P_{SAT}$ and *PAE* peak around 50 GHz. The inconsistency is due to the fact that the gain, which is a small signal feature, mainly depends on the transimpedance of the network, while $P_{SAT}$ and *PAE*, which are large signal features, also depend on the input impedance of the output matching network.

<sup>7</sup>The relatively low gain is mainly caused by the fact that the chip works in the SS corner; the gain would be around 20 dB if the chip worked in the TT corner.



Figure 3.25: Measured S-parameters



Figure 3.26: K-factor and group delay



Figure 3.27: Performance vs. input power



Figure 3.28: Performance vs. frequency

#### 3.6.3 Comparisons

The performance is summarized and compared with published mm-wave CMOS PAs without power combining in Tab. 3.2. The implemented PA shows the highest fractional bandwidth with state-of-the-art efficiency and output power. The achieved GBW is comparable to [59, 60], which, however, require a higher supply voltage.

|                  | This Work | [59]            | [55]  | [60]  | [61]            |
|------------------|-----------|-----------------|-------|-------|-----------------|
| Tech.            | 28 nm     | 65 nm           | 65 nm | 45 nm | 65 nm           |
| Vdd [V]          | 1         | 1.8             | 1     | 2     | 1.2             |
| Gain [dB]        | 13        | 16              | 16    | 20    | 18              |
| BW [GHz]         | 27        | 21              | 9     | 13    | 12.5            |
| GBW [GHz]        | 121       | 133             | 57    | 130   | 99              |
| $P_{SAT}$ [dBm]  | 13        | 13              | 11.5  | 14.5  | 9.6             |
| $P_{1dB}$ [dBm]  | 12        | 8               | -     | 11.2  | -               |
| PAE [%]          | 16        | 8               | 15.2  | 14.4  | 13.6            |
| Frac. BW [%]     | 51        | 35              | 15    | 21.7  | 20.8            |

Table 3.2: Comparison table of mm-wave CMOS PAs without power combining
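The GBW row of Tab. 3.2 appears to be the product of the linear voltage gain and the bandwidth. This is an inference from the tabulated numbers rather than a definition stated in the text, and the function name below is hypothetical. A short sketch that reproduces the row under this assumption:

```python
def gbw_ghz(gain_db, bw_ghz):
    """Gain-bandwidth product, assuming GBW = 10^(gain_dB / 20) * BW."""
    return 10 ** (gain_db / 20) * bw_ghz


# (gain in dB, bandwidth in GHz) taken from Tab. 3.2
entries = {
    "This Work": (13, 27),
    "[59]": (16, 21),
    "[55]": (16, 9),
    "[60]": (20, 13),
    "[61]": (18, 12.5),
}

for name, (gain_db, bw) in entries.items():
    print(f"{name}: GBW = {gbw_ghz(gain_db, bw):.1f} GHz")
```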

## 3.7 Conclusions

For mm-wave PAs, due to large layout parasitics and limited gain, it is challenging to achieve good performance over a large frequency range. To reach this goal, this chapter presents a design methodology for high-order matching networks to obtain wideband operation, while enabling a compact layout and minimizing the insertion loss. The fabricated prototype demonstrates 13 dBm  $P_{SAT}$ , 16% peak PAE and a 27 GHz bandwidth, which is the largest among the compared mm-wave CMOS PAs in Tab. 3.2. The wide bandwidth enables the transceiver to achieve a large data rate with simple modulation methods.

## Chapter 4

# A Two-Path Power Combining CMOS Power Amplifier

Large PA output power is generally desirable to increase the link span and minimize the bit error rate (BER) of a wireless transceiver. However, compared with compound semiconductors such as *GaAs*, CMOS technology has a much smaller breakdown voltage, which severely limits the supply voltage. Furthermore, the low quality factors of passive components at mm-wave frequencies limit the impedance transformation ratio between the fixed antenna impedance and the PA load impedance. Therefore, the achievable output power of CMOS PAs is generally low and explicit techniques are required to obtain large output power.

This chapter discusses different approaches to increase PA output power. Particular attention is given to on-chip passive power combining and splitting techniques. A thorough analysis is conducted on the stability of non-isolated power splitters, which reveals the oscillation problem of the traditional non-isolated power splitter. A novel power splitter is thus proposed to suppress the potential oscillation. Using the proposed structure, a three-stage two-path differential PA has been realized in ST 65 nm CMOS GP. The PA shows no stability issue and achieves 30 dB power gain, 20 dBm  $P_{SAT}$  and 22% peak PAE with a bandwidth from 58.5 GHz to 73.5 GHz.

## 4.1 Output Power Limitations

Fig. 4.1 shows a general PA diagram, where *Vout* and *Vin* denote the output voltage over the load and the input voltage at the gate of power device  $M_{PA}$ , Vdd and  $V_D$  the supply voltage and the drain voltage of  $M_{PA}$ ,  $R_L$  and  $R_{opt}$  the antenna resistor and the optimal load resistance of  $M_{PA}$ , respectively.

Figure 4.1: A general PA diagram

Assuming a lossless matching network, the power delivered to  $R_L$  equals the power absorbed by  $R_{opt}$ , given as:

$$Pout = \frac{Vout^2}{2R_L} = \frac{V_D^2}{2R_{opt}} \tag{4.1}$$

Considering class-AB operation, which is the case for most mm-wave PAs, the maximal drain voltage  $V_D$  approximates the supply voltage Vdd. So the maximal output power is

$$Pout, max = \frac{Vdd^2}{2R_{opt}} \tag{4.2}$$

From Eq. 4.2, it is easy to see that there are two different ways to increase the maximal output power *Pout*, max: increasing Vdd and decreasing  $R_{opt}$ .

Vdd is normally constrained by the breakdown voltage of CMOS technology, which is much smaller than that of other processes. For example, the nominal supply voltage Vdd, nom is only 1 V in 65 nm CMOS technologies, while it can be 10 V in GaAs MESFET technology [62]. For the same  $R_{opt}$ , Eq. 4.2 then implies 100 times smaller output power for CMOS PAs. To increase the output power, several techniques have been proposed to push the actual supply voltage Vdd higher than the nominal Vdd, nom, while avoiding damage to the transistors and ensuring long-term reliability. A widely used technique is the cascode topology, which allows the use of a supply higher than Vdd, nom by dividing Vdd between the common-source and common-gate transistors [50]. The stacking topology extends the idea of dividing the supply voltage between multiple transistors [63, 64]. Fig. 4.2 shows a simplified schematic of a PA with N stacked transistors.

Figure 4.2: Schematic of a stacking PA

Theoretically, stacking N transistors can increase the maximal allowable supply voltage by N times, which means increasing the output power by  $N^2$  times without causing serious reliability problems<sup>1</sup>. Although stacking looks effective in enlarging output power, it has several practical limitations. Firstly, the input network, which is used to generate N inputs, i.e.  $Vin_1, Vin_2 \dots Vin_N$ , needs to be designed carefully to ensure that all the inputs have the same phase and all the transistors have similar drain-source voltages. Secondly, it requires a supply voltage much larger than the nominal value, which poses difficulty for integration and probably increases system complexity and cost. Furthermore, substrate breakdown in CMOS technology further limits the use of stacking. As a result, the stacking structure is rarely used in CMOS PAs.

The second way to increase Pout, max is decreasing  $R_{opt}$ . However, since in most applications PAs need to drive a 50  $\Omega$  antenna (i.e.  $R_L = 50\ \Omega$  in Fig. 4.1), using a very small  $R_{opt}$  implies an output matching network with a large impedance transformation ratio  $IR = \frac{R_L}{R_{opt}}$ , which probably limits the bandwidth and introduces non-negligible insertion loss. Moreover, a small  $R_{opt}$  requires a large transistor size, causing significant layout parasitics and degrading the gain and efficiency. As a result,  $R_{opt}$  is usually kept larger than 20  $\Omega$  at mm-wave frequencies [41, 19]. Assuming Vdd = 1 V and  $R_{opt} = 20\ \Omega$ , Eq. 4.2 gives a maximal output power *Pout*, max of only 25 mW, i.e. 14 dBm, which mandates new techniques to improve output power.
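The numbers in this section follow directly from Eq. 4.2. The sketch below evaluates it and includes the ideal stacking case, where the supply is scaled by N; the function and parameter names are hypothetical, and the stacking result ignores all practical limitations discussed above.

```python
import math


def pout_max_dbm(vdd, r_opt, n_stack=1):
    """Eq. 4.2 in dBm, with the supply ideally scaled by n_stack devices."""
    p_watt = (n_stack * vdd) ** 2 / (2 * r_opt)
    return 10 * math.log10(p_watt / 1e-3)


single = pout_max_dbm(1.0, 20.0)              # Vdd = 1 V, R_opt = 20 ohm
stacked = pout_max_dbm(1.0, 20.0, n_stack=2)  # ideal two-transistor stack
gaas_like = pout_max_dbm(10.0, 20.0)          # 10 V supply, same R_opt

print(f"single device : {single:.2f} dBm")
print(f"2-stack (ideal): {stacked:.2f} dBm")
print(f"10 V supply    : {gaas_like:.2f} dBm")
```

The 10 V case is 20 dB above the 1 V case, which is the factor of 100 mentioned in the comparison with GaAs, and the ideal two-transistor stack gains about 6 dB, i.e. a factor of $N^2 = 4$.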

<sup>1</sup>Stacked transistors may experience substrate breakdown or substrate leakage, or both, in bulk CMOS or BiCMOS processes, but not in silicon-on-insulator (SOI) technology or compound semiconductor FETs on semi-insulating substrates [64].

## 4.2 On-Chip Passive Power Combiner and Splitter

In addition to the aforementioned two methods, there is another commonly used technique to increase output power, which is called power combining. The basic idea is to combine together the output power of multiple PA cells. Ideally, a PA with M cells can achieve M times larger power than a single cell, given as

$$Pout, cob = M \frac{Vdd^2}{2R_{opt}} \tag{4.3}$$

Power combining can be realized both on-chip and in free space [65, 66, 67]. On-chip combining can achieve good power efficiency and small area when the number of PA cells is small, while free-space combining can maintain good combining efficiency even with a considerable number of cells, at the expense of increased system complexity and cost [68]. Note that the two methods can be used simultaneously to maximize the output power. Since this design requires moderate output power, this section only focuses on on-chip combining.

Fig. 4.3 shows a general block diagram of an M-path on-chip power combining PA, where the input power is split into M parts to drive the PA cells, and the M output powers from all the cells are then combined into one final output power.



Figure 4.3: General block diagram of a power combining PA

On-chip power combiners can be active or passive. Active designs provide gain and better isolation, but have poor power efficiency due to their extra power consumption, and so are not preferred for PAs [69]. As a result, most power combiners are made up of passive components. Depending on the interaction among different PA cells, passive power combiners can be categorized as isolated combiners and non-isolated combiners. The former isolate different cells from each other, while in the latter different cells can interact with each other.

A passive power combiner is reciprocal and can be used as a power splitter by swapping the input and output [70]. In this section, the term "combiners" is used for brevity; however, the analysis and design techniques apply to power splitters as well.

#### 4.2.1 Isolated Power Combiners

Isolated power combiners are widely used on printed circuit boards (PCBs). The most popular examples are 3 dB couplers, including the Wilkinson power combiner and the quadrature coupler [71, 72]. The schematics of two 3 dB couplers are shown in Fig. 4.4. The Wilkinson power combiner comprises two quarter-wavelength transmission lines with a characteristic impedance of  $\sqrt{2}Z_0$  between the input and output ports, and a resistor  $2Z_0$  between the two input ports, where  $Z_0$  is the load impedance of the three ports. The quadrature coupler has two inputs with a 90° phase shift, one output port summing the two inputs, and one isolated port which has zero gain from the inputs.

The design of 3 dB combining networks is straightforward since the couplers sum the output power of two PA cells and increase the overall output power by 3 dB. Moreover, the configuration provides a high degree of isolation between the two input ports and achieves excellent return loss, which is useful for system integration. However, a 3 dB coupler is based on quarter-wavelength transmission lines, so limited bandwidth and large area are expected, especially when the number of combined transistors increases [73]. As a result, such couplers are suitable for designs on PCBs, but are not preferred on chip, where area is of great concern.

### 4.2.2 Non-Isolated Power Combiners

Most modern mm-wave PAs use non-isolated combiners due to their compact dimensions and low insertion losses. The non-isolated combiners can be classified as transmission line based combiners [73, 74], transformer based combiners [52, 41], and hybrid combiners which employ both transmission lines and transformers [19, 75]. Fig. 4.5 shows a simplified diagram of two kinds of combiner.

Figure 4.4: Schematic of Wilkinson combiner and quadrature coupler

Transmission line based combiners are usually used to sum the output current of multiple PA cells, as shown in Fig. 4.5(a), where the transmission lines are designed much shorter than a quarter wavelength to enable a compact and low loss layout. Since the parallel current summing increases the load impedance of each PA cell by a factor of the number of PA cells, i.e. P in Fig. 4.5(a), each PA cell requires a size which is P times smaller. As a result, the layout parasitics and losses of active transistors will decrease and the gain will increase. However, due to the increased load impedance, the output power of each cell diminishes and the overall output power remains the same as that of the one-path PA shown in Fig. 4.1, as follows

$$Pout_{TL} = P \frac{Vdd^2}{2PR_{opt}} = \frac{Vdd^2}{2R_{opt}} \tag{4.4}$$

Although the transmission line can perform impedance transformation and reduce the increased load for each cell, in practice, transmission line based combiners usually cannot obtain large output power [76, 64].

Figure 4.5: Schematic of (a) transmission line based combiner and (b) transformer based combiner

On the other hand, transformer based combiners can overcome this limit and achieve high output power [52]. Depending on the combining topology, transformer based combiners can be categorized into parallel combining transformers (PCTs), series combining transformers (SCTs), and series-parallel combining transformers (SPCTs) [77]. PCTs have a parallel topology, which is similar to transmission line based combiners. As a result, PCTs allow the use of small PA cells but cannot obtain large output power. In contrast, SCTs with N cells decrease the load impedance of each cell by N times and can enlarge the overall output power by  $N^2$  times. SPCTs exploit both parallel and serial combining techniques, and are therefore able to increase the overall output power while using small and high gain PA cells. Fig. 4.5(b) shows an SPCT with  $M \times N$  PA cells, where N PA cells are combined in parallel as a unit, and M units are combined in series to form an  $M \times N$  array of PA cells. Because of the series-parallel configuration, the overall output power can be calculated as

$$Pout_{SPCT} = MN \frac{Vdd^2}{2\frac{N}{M}R_L} = M^2 \frac{Vdd^2}{2R_L} \tag{4.5}$$

Note that the output power of an  $M \times N$  SPCT PA does not depend on the parallel coefficient N, which conforms to the conclusion that parallel combining does not change the overall output power. However, thanks to parallel combining, the load seen by each PA cell can be kept at a reasonably large value to maintain good active gain, given as:

$$R_{L,SPCT} = \frac{N}{M} R_L. \tag{4.6}$$
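Equations 4.5 and 4.6 can be checked numerically with a short sketch. The function name and the example values (Vdd = 1 V, $R_L$ = 50 Ω) are illustrative assumptions, and the model is the ideal lossless one used in the text.

```python
import math


def spct(m_series, n_parallel, vdd, r_load):
    """Return (per-cell load in ohm, total output power in W) for an ideal SPCT."""
    cell_load = n_parallel / m_series * r_load   # Eq. 4.6
    p_cell = vdd ** 2 / (2 * cell_load)          # power delivered by one cell
    p_total = m_series * n_parallel * p_cell     # Eq. 4.5
    return cell_load, p_total


for m, n in [(1, 1), (2, 1), (2, 2)]:
    load, p_tot = spct(m, n, vdd=1.0, r_load=50.0)
    print(f"M={m}, N={n}: cell load = {load:.1f} ohm, Pout = {1e3 * p_tot:.1f} mW")
```

Going from (M, N) = (2, 1) to (2, 2) leaves the total output power unchanged but doubles the load seen by each cell, which is the benefit of the parallel coefficient noted above.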

Despite the aforementioned advantages, SPCTs have some limitations in practice. One limitation is the undesired coupling between adjacent inductors, which may induce substantial amplitude and phase imbalance on the output of different PA cells, resulting in degraded combining efficiency [64]. Furthermore, SPCTs usually require complex layout, which will introduce large parasitics and passive losses. This problem is especially severe for mm-wave PAs, where the Q factors of passive components are limited. As a result, M and N are rarely larger than 2 [19, 75]. It is worth noting that all aforementioned combining methods can ensure the reliability of PA, because each cell operates under a safe voltage.

### 4.2.3 Stability Issue of Traditional Non-Isolated Power Combiners and Power Splitter

Due to their compact dimensions and low losses, non-isolated combiners are widely used in integrated PAs. However, undesired interactions between different PA cells may change the behavior of each cell, degrading the overall performance and even causing oscillation. Fig. 4.6 shows a simplified diagram of a one-path PA and a traditional two-path power combining PA [41]. Note that in Fig. 4.6(b) two serial transformers are merged into a 2-input 1-output transformer for compact layout, where  $Out_{1p}$  and  $Out_{1n}$  are the outputs of  $PA_1$ , and  $Out_{2p}$  and  $Out_{2n}$  the outputs of  $PA_2$ , respectively.

Figure 4.6: Simplified diagram of (a) a one-path PA and (b) a traditional two-path power combining PA

Assuming an ideal 1:1 transformer, the one-path PA sees a load of  $R_L$ . For the two-path PA, due to serial voltage summing, ideally each PA cell should see a load of  $\frac{R_L}{2}$ . This is the case when  $Out_{1p}$  and  $Out_{2p}$  are equal in amplitude and phase, while out of phase with  $Out_{1n}$  and  $Out_{2n}$ . However, undesired interactions may change the amplitudes and phases of the outputs of the two PA cells. For example, the outputs of  $PA_1$ ,  $Out_{1p}$  and  $Out_{1n}$ , may have the same amplitude and phase, i.e. common-mode (CM) signals, while the two outputs of  $PA_2$  may have the same amplitude but an opposite phase with the outputs of  $PA_1$ , i.e. a differential-mode (DM) signal. In this CM DM condition, the magnetic fluxes of the left half and right half of the primary winding in Fig. 4.6(b) have opposite directions and thus cancel each other. As a result, the output of the secondary winding is zero. Furthermore, each PA cell sees an infinite load impedance<sup>2</sup>, which leads to an infinite feedforward gain. In addition, the capacitive neutralization technique, which is widely used to minimize the feedback from output to input for differential signals and ensure DM stability, increases the feedback for CM signals, as shown in Fig. 4.7.



Figure 4.7: The effects of neutralization on feedback: (a) neutralization cancels the feedback for DM signals; (b) neutralization increases the feedback for CM signals

To sum up, in the CM DM condition, the feedforward gain is infinite and the feedback gain is also high. As a result, the possibility of oscillation is high. Therefore, special techniques are required to ensure CM DM stability.

To suppress CM oscillation of a one-path PA, big resistors can be conveniently added in the bias path, as shown in Fig. 4.8(a). This technique can decrease the gain of the CM signal with no penalty on the DM signal [39]. Due to its simplicity and effectiveness, this technique is widely used in both one-path and multi-path PAs [78, 79, 50, 41].

<sup>2</sup>Here we neglect the parasitic resistance of the transformer and the finite output resistance of active devices.



Figure 4.8: Simplified input matching network of (a) a one-path PA and (b) a traditional two-path PA

However, in multi-path PAs, adding big resistors on the bias path may not suppress the oscillation, due to the interaction between different paths. As illustrated in Fig. 4.8(b), when CM DM oscillation happens, the two inputs of  $PA_1$ ,  $In_{1p}$  and  $In_{1n}$ , have the same amplitude and phase, while the two inputs of  $PA_2$  are of opposite phase with  $In_{1p}$  and  $In_{1n}$ . As a result, the two center taps of the secondary winding, i.e.  $ct_1$  and  $ct_2$  in Fig. 4.8(b), are virtual grounds. Therefore, the big resistors  $R_{Big}$  are bypassed and the oscillation cannot be suppressed in this configuration.

#### 4.2.4 Proposed Power Splitter

To suppress the common-mode differential-mode oscillation and ensure stability, a novel power splitter is proposed to minimize the interaction between different paths, which is shown in Fig. 4.9.



Figure 4.9: Simplified schematic of proposed power splitter

In the proposed power splitter, the inputs of  $PA_1$  and  $PA_2$  are isolated with a big resistor  $R_{Big}$ . As a result, the quality factor of the oscillation path is sufficiently lowered and the common-mode differential-mode oscillation can be quenched. Compared with the traditional splitter in Fig. 4.8(b), the proposed topology slightly increases the length of the secondary winding, which causes larger ohmic loss and less gain. Simulation shows that the proposed splitter has 0.6 dB less gain than the traditional configuration. However, this gain penalty is far less important than the benefit of stability.

The power combiner used in this design is the traditional combiner shown in Fig. 4.6(b). Although this topology by itself can lead to oscillation, the whole system is stable when it is used with the proposed power splitter. This is because the oscillating loop is made up of both the power combiner and the splitter, so the big resistor in the splitter can suppress potential common-mode differential-mode oscillation. Furthermore, the traditional power combiner has compact dimensions and low losses, which are critical to achieve large output power and high efficiency.

## 4.3 Circuit Design

Using the proposed power splitter and the wideband design methodology introduced in the previous chapter, a stable wideband PA is designed in ST 65 nm CMOS LP. The three-stage two-path serial combining architecture in Fig. 4.10 is adopted to achieve high gain and large output power. Note that the power splitter can be also inserted between driving stage and output stage [41]. However, since the power splitter is more lossy than normal interstage matching network, placing it in a less critical position, i.e. far from output stage, allows to achieve a higher efficiency. Locating the splitter preceding the input stage can minimize the losses, but will pose difficulties for input matching and increase the mismatches between two paths. Therefore, the splitter is inserted between input stage and driving stage to obtain a good trade-off among different factors.



Figure 4.10: Block diagram of three-stage two-path power combining PA

#### 4.3.1 Active Stages

The three-stage PA has one input amplifier, two driving amplifiers and two output amplifiers. As depicted in Fig. 4.11, all active stages use differential common source topology with capacitive neutralization to increase the power gain and stability. The neutralizing capacitors are implemented by drain-source connected MOS transistors, and the size of neutralizing transistor  $M_{Neu}$  is chosen as half the size of amplifying transistor  $M_{Amp}$  to minimize the feedback and ensure stability. Note that the inductive degeneration, which was used in the previous chapter to lower the quality factor of the input impedance and enable wideband input matching, is not adopted in this design, because the input impedance quality factor of the design is around 9, which is adequately low to obtain an acceptable input matching over the 60 GHz band.



Figure 4.11: Schematic of an active stage

The sizes of the input stage, driving stage and output stage are set with a ratio of 1 : 2x1 : 2x2 to ensure that preceding stages have enough power to saturate succeeding stages in the worst case<sup>3</sup>. Furthermore, the input stage has a larger bias (600 mV) to enhance the power gain, while the output stage has a smaller bias (500 mV) to minimize the power consumption and increase the power efficiency.

<sup>3</sup>The "2x" means two paths, i.e. two driving amplifiers and two output amplifiers.





(c) Input stage to driving stage matching network



Figure 4.12: Schematic of matching networks

#### 4.3.2 Matching Networks

The matching networks are designed following the methodology proposed in the previous chapter. The design procedure is omitted for brevity. The final matching networks are shown in Fig. 4.12.

As illustrated in Fig. 4.12(a), the power combiner is naturally integrated into the matching network to enable a compact layout.  $Out_1$ ,  $Out_2$  and Out are the outputs of two paths and the overall output of the PA, respectively. Furthermore,  $C_{Out}$  and  $C_L$  denote the output capacitance of output stage and the PAD capacitance, while  $L_{S1}$  and  $L_{S2}$  include the parasitic inductance between the transformer and GSG pads.

Fig. 4.12(b) shows the matching network between driving stage and output stage, which is exactly the same as the interstage matching network in the previous chapter. The  $R_{Out,in}$  and  $C_{Out,in}$  represent the input impedance of output stage, and  $R_{Dr,out}$  and  $C_{Dr,out}$  the output impedance of driving stage.

Fig. 4.12(c) shows the matching network between input stage and driving stage, which absorbs the power splitter. In represents the output of the input stage,  $In_1$  and  $In_2$  the inputs of the two driving stages, and  $R_{Dr,in}$  and  $C_{Dr,in}$  the input impedance of the driving stage. It is worth noting that, although the two driving stages are distributed in parallel along the centerline of the chip, the serial nature of the power splitter means that the input stage sees them connected in series. The overall load impedance seen by the input stage is therefore  $(2R_{Dr,in}) \parallel (\frac{C_{Dr,in}}{2})$  rather than  $\frac{R_{Dr,in}}{2} \parallel (2C_{Dr,in})$ . As a result, the network in Fig. 4.12(c) can be simplified as Fig. 4.13. Compared to Fig. 4.12(b), where the preceding stage also drives a succeeding stage of twice the size, the ratio between load and source impedance IR in Fig. 4.13 is 4 times larger. Therefore, an impedance transformation with a factor of 4 is required to obtain the network in Fig. 4.13. In this design, both a Norton transform and an inserted transformer are exploited to obtain the required impedance transformation. The Norton transform can also eliminate the inductor  $L_p$  in Fig. 4.12(b) to enable a simpler topology, while a transformer is inserted to further increase the transformation ratio to 4.
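The factor of 4 between the series and parallel views of the two driving stages can be verified numerically. The sketch below models each driving-stage input as a parallel RC; the component values and the frequency are hypothetical and chosen only for illustration.

```python
import cmath
import math


def parallel_rc(r, c, freq_hz):
    """Impedance of a resistor r in parallel with a capacitor c."""
    admittance = 1 / r + 1j * 2 * math.pi * freq_hz * c
    return 1 / admittance


# Hypothetical driving-stage input: 200 ohm in parallel with 20 fF at 65 GHz.
r_in, c_in, freq = 200.0, 20e-15, 65e9
z_stage = parallel_rc(r_in, c_in, freq)

z_series = 2 * z_stage      # two stages seen in series by the input stage
z_parallel = z_stage / 2    # two stages seen in parallel

# Series connection is equivalent to (2R) || (C/2),
# parallel connection to (R/2) || (2C).
series_ok = cmath.isclose(z_series, parallel_rc(2 * r_in, c_in / 2, freq))
parallel_ok = cmath.isclose(z_parallel, parallel_rc(r_in / 2, 2 * c_in, freq))
ratio = z_series / z_parallel

print(series_ok, parallel_ok, abs(ratio))
```

The ratio is independent of the chosen component values, since both impedances are scaled versions of the same single-stage impedance.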



Figure 4.13: Equivalent schematic of the matching network between input stage and driving stage

To achieve an overall flat response, the input matching network is designed to compensate the ripples due to the other matching networks, which unsurprisingly poses difficulties for input matching. To overcome this problem, a third-order bandpass filter is used because it has one more peaking frequency than a coupled resonator and thus achieves better matching. The final input matching network is depicted in Fig. 4.12(d).

## 4.4 Implementation and Measurements

#### 4.4.1 Layout Considerations

The layout is of extreme importance at mm-wave frequencies. Parasitic ohmic losses need to be minimized for large gain and high efficiency. Undesired electrical and magnetic coupling should be suppressed to avoid changing the behavior of the matching networks. Moreover, the outputs of all PA cells in a combining topology need to have the same amplitude and phase to maximize output power and combining efficiency. However, due to the compact layout and high operating frequency, the coupling cannot be disregarded at mm-wave frequencies [80]. A good way to minimize the effect of coupling is to make the layout symmetrical about a centerline, so that the electromagnetic (EM) fields on the two sides of the centerline have opposite phase and their coupling to other blocks is minimized.

Symmetrical layout can be easily implemented in a one-path PA, where the differential signals are distributed symmetrically on the two sides of a centerline [39, 55]. Two-path PA can also have a symmetrical layout, where two paths can obtain a good match in both amplitude and phase. However, within each path, it is difficult, or maybe impossible, to achieve a perfect match between two differential signals [41, 81]. This is because the paths of the differential signal have different distances from the centerline. To circumvent this problem, a crossed topology is adopted to minimize the imbalance between the two differential signals. Fig. 4.14 shows the simplified layout from input stage to power combiner, where the red, blue and yellow traces are different layers of metal, and the gray boxes MOS transistors.

As illustrated in Fig. 4.14, the traces from the power splitter to the inputs of the driving stages are cross connected so that their lengths and parasitic inductances are almost the same, which minimizes the imbalance between the differential signals. The same applies to the connections from the driving stages to the power combiner. Note that the potential coupling between the power splitter and combiner may be used to improve the gain [19], but it increases the feedback from output to input and may cause instability. In this design, a thick metal bar, which is used to provide the supply voltage for the output stages, is inserted between the splitter and combiner to diminish the coupling and ensure stability.



Figure 4.14: Simplified layout from input stage to power combiner



Figure 4.15: Chip photo of power combining PA

The chip photo is shown in Fig. 4.15. It occupies an area of 590x970  $\mu m^2$ . The chip is pad limited, and the core area is 450x250  $\mu m^2$ .

#### 4.4.2 Measurement Results

The measured S-parameters are shown in Fig. 4.16. The peak gain is 30 dB at 65 GHz with a 3 dB bandwidth from 58.5 GHz to 73.5 GHz. The S11 is about -5 dB within band, which is not an issue thanks to the high gain, because the required input power is negligible. The S12 is less than -80 dB over the simulated frequency range, which ensures the stability of the PA.



Figure 4.16: Measured S-parameters

The large signal performances at 65 GHz versus input power are shown in Fig. 4.17. When the input power Pin is small, the gain is around 30 dB. As Pin increases, PAE increases and reaches its peak 22% around  $Pin = 2 \ dBm$ . Pout also increases with Pin and starts to saturate when Pin is large. The  $P_{SAT}$  is around 20.2 dBm and the  $P_{1dB}$  is about 16 dBm.

The large signal performances versus frequency are shown in Fig. 4.18. The  $P_{SAT}$  and  $P_{1dB}$  are about 20 dBm and 16 dBm with maximum 1 dB variation from 57 GHz to 72 GHz. The *PAE* is larger than 15% from 57 GHz to 73 GHz with a peak 22% at 64 GHz.



Figure 4.17: Large signal performances vs. input power



Figure 4.18: Large signal performances vs. frequency

#### 4.4.3 Comparison

The performance is summarized and compared with state-of-the-art mm-wave CMOS PAs with power combining in Tab. 4.1. The implemented PA shows the highest gain with state-of-the-art efficiency, output power and bandwidth.

|                  | This Work       | [50]  | [81]  | [82]  | [19]  |
|------------------|-----------------|-------|-------|-------|-------|
| Tech.            | 65 nm           | 28 nm | 40 nm | 40 nm | 40 nm |
| Vdd [V]          | 1               | 1     | 1     | 1.8   | 0.9   |
| Freq. [GHz]      | 65              | 60    | 60    | 60    | 78    |
| Gain [dB]        | 30              | 24    | 21    | 22.4  | 18    |
| BW [GHz]         | 15              | 11    | 6     | -     | 15    |
| $P_{SAT}$ [dBm]  | 20              | 16.5  | 17.4  | 16.4  | 20.9  |
| $P_{1dB}$ [dBm]  | 16              | 11.7  | 14    | 13.9  | 17.8  |
| PAE [%]          | 22              | 13    | 28.5  | 23    | 22.3  |

Table 4.1: Comparison table of mm-wave CMOS PAs with power combining

## 4.5 Conclusions

This chapter discusses several techniques to improve the output power of PAs. Transformer based power splitters and combiners are capable of offering impedance transformation with low insertion loss over a wide bandwidth, and thus are well suited to mm-wave multi-path PAs. However, due to the interaction between different paths, PAs with traditional power splitters and combiners may suffer from common-mode differential-mode oscillation. To overcome this problem and ensure stability, a novel power splitter is proposed to suppress the potential oscillation. The prototype demonstrates 30 dB power gain, 20 dBm  $P_{SAT}$ , 22% peak PAE, and a bandwidth from 58.5 GHz to 73.5 GHz.

## Chapter 5

# Transceiver Building Blocks

Chapter 2 proposes a wideband OOK transceiver targeting a 10 Gbps data rate over a 10 cm distance. Chapter 3 elaborates the design of a 50 GHz wideband PA. In this chapter, all building blocks of the proposed transceiver are discussed in detail. A prototype is fabricated in ST 28 nm CMOS LP. It demonstrates error-free transmission up to 5 Gbps (limited by measurement instruments) over 13 cm, while consuming only 130 mW.

This chapter is organized as follows. Sec. 5.1 discusses the design of the transmitter, including a 50 GHz VCO, a modulating PA and an off-chip planar monopole antenna. Sec. 5.2 describes the detailed architecture and building blocks of the receiver. A novel envelope detector is proposed to achieve high conversion gain and low noise figure over a large frequency range. Sec. 5.3 summarizes the measurement results and the comparison with state-of-the-art links.

## 5.1 Transmitter

As discussed in Chapter 2, combining the OOK modulator and power amplifier can minimize the system complexity and power consumption. The detailed block diagram of the realized transmitter is shown in Fig. 5.1. It mainly contains three parts: a 50 GHz VCO, a modulating PA and an off-chip antenna. The PA consists of a driving stage (DR), a modulating output stage (OP) and an inverter-based buffer for the 10 Gbps input binary signal. The modules in the dotted box, i.e. the VCO and PA, are implemented on chip, and the output of the PA is connected to the off-chip antenna through bonding wires.



Figure 5.1: Block diagram of the OOK transmitter

#### 5.1.1 Voltage Controlled Oscillator

Since a non-coherent OOK transceiver is immune to frequency shifts of the carrier signal, the phase noise of the VCO is not important. Therefore, this VCO targets low power consumption to enhance link efficiency, and a large tuning range to cover PVT variations. In this scenario, a fundamental oscillator is preferable to the combination of a subharmonic oscillator and a multiplier due to its lower complexity and lower power consumption.

The push-pull topology (employing both NMOS and PMOS cross-coupled pairs) is chosen due to several advantages over a single differential pair oscillator. First, it provides twice the output voltage swing for the same current consumption and resonator, provided the oscillator works in the current-limited regime. This feature helps to minimize the power consumption of the VCO. Furthermore, the DC voltage at the drain and source of the MOS varactor can be set to half of the supply voltage, and thus the whole tuning range of the varactor can be used, resulting in a larger VCO tuning range. In addition, the loop gain of a push-pull oscillator is almost twice as large due to the contribution of both the NMOS and PMOS transconductances. Finally, the tank voltage always stays within the supply rails, thereby avoiding reliability issues [83].

In order to minimize the power consumption of the VCO, the class-C configuration is adopted due to its superior DC-to-RF conversion efficiency. The schematic of the proposed VCO is shown in Fig. 5.2. Compared to the traditional push-pull class-C VCO [83], this architecture eliminates the tail current source, and is therefore capable of achieving a larger output voltage swing. To control the bias current, the bias voltage *Vbias* is programmed by an 8-bit R-2R ladder DAC. This tunable *Vbias* can be set high at first to ensure start-up, and then set low to minimize the power consumption.

Figure 5.2: Schematic of the proposed VCO [84]

The VCO employs one switched-capacitor bank and one varactor to achieve a wide tuning range. The capacitor bank has four identical branches. Each branch consists of one MOS switch and two MOM capacitors. Compared to placing only one capacitor in each branch, using two halves the effect of the parasitic on-state resistance of the switch. Therefore, this configuration is capable of achieving a higher tank quality factor and a larger output swing. An accumulation MOS varactor is used to finely tune the oscillating frequency <sup>1</sup>. The circuit parameters and simulation results of the VCO are summarized in Tab. 5.1.

Table 5.1: Summary of the VCO

| Parameter             | Value                                |
|-----------------------|--------------------------------------|
| All Transistor Size   | $25\,\mu\mathrm{m}/28\,\mathrm{nm}$  |
| Switching Capacitors  | $4 \times 26/2$ fF <sup>a</sup>      |
| Varactor Range        | 22 fF                                |
| Coupled Capacitors Cc | 100 fF                               |
| Oscillating Frequency | 47.4-52.7 GHz                        |
| Power Consumption     | 4 mW                                 |

<sup>a</sup>Multiplied by 4 because of the 4 branches, and divided by 2 because the 2 capacitors in each branch are in series.

<sup>1</sup>Although fine frequency tuning is not mandatory for a non-coherent OOK transceiver, the varactor is employed here to extend the frequency tuning range.
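As a rough cross-check of Tab. 5.1, the following sketch computes the total switched capacitance of the bank, the fractional tuning range, and the tank capacitance ratio implied by the simulated 47.4-52.7 GHz range. It assumes an ideal LC tank with $f = 1/(2\pi\sqrt{LC})$ and ignores switch and wiring parasitics; the function names are illustrative and are not taken from the design.

```python
def bank_capacitance_ff(branches=4, cap_ff=26.0, caps_in_series=2):
    """Total switched capacitance (fF): each branch holds equal caps in series."""
    return branches * cap_ff / caps_in_series


def fractional_tuning_range(f_min_ghz, f_max_ghz):
    """Tuning range relative to the center frequency."""
    return 2.0 * (f_max_ghz - f_min_ghz) / (f_max_ghz + f_min_ghz)


def required_cap_ratio(f_min_ghz, f_max_ghz):
    """C_max / C_min needed by an ideal LC tank, since f is proportional to 1/sqrt(C)."""
    return (f_max_ghz / f_min_ghz) ** 2


# Values from Tab. 5.1
c_bank = bank_capacitance_ff()
ftr = fractional_tuning_range(47.4, 52.7)
ratio = required_cap_ratio(47.4, 52.7)

print(f"Switched capacitance: {c_bank:.1f} fF")
print(f"Fractional tuning range: {100 * ftr:.1f} %")
print(f"Required C_max/C_min (ideal tank): {ratio:.3f}")
```

Under these assumptions the simulated range corresponds to a fractional tuning range of roughly 10.6% and a total tank capacitance variation of about 24%.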

#### 5.1.2 Power Amplifier

The PA employed in this transmitter is based on the one described in Chapter 3. Both consist of a driving stage and an output stage, and all stages use capacitive neutralization to enhance stability and gain. However, the two PAs have some significant differences.

One major difference is that the output stage of the PA in the transmitter needs to be switchable to realize OOK modulation. To meet this requirement, a switch is inserted into the output stage. The schematic is shown in Fig. 5.3, where $M_{1,2}$ are the input transistors, $M_{3,4}$ the drain-source shorted transistors serving as neutralizing capacitors, and $M_5$ the switching transistor that implements OOK modulation. *Bitstream* is the transmitted binary signal, which drives the switching transistor $M_5$ via 4-stage tapered inverters.



Figure 5.3: The output stage and the inverter-based driver

When the modulating signal Vmod is high, $M_5$ is on and its large parasitic capacitance creates a virtual ground at the common source of $M_1$ and $M_2$. As a result, the output stage works as a normal neutralized common-source amplifier. When Vmod goes low, $M_5$ turns off, and the differential output decays to zero at a speed inversely proportional to the quality factor of the output matching network [85]. Since the output impedance of the transistor has a quality factor of 2, the output voltage decreases to zero rapidly, which is critical to obtain a large ratio between on and off power. $M_{3,4}$ are half the size of $M_{1,2}$ to minimize the feedback from output to input and therefore maximize the stability.

A significant advantage of the modulating output stage is that it does not consume any power when transmitting '0', i.e. when Vmod is low, which minimizes the power consumption of the PA. However, the tail transistor $M_5$ reduces the maximum achievable output power due to the voltage drop across it. To limit this drop, $M_5$ needs to be large. On the other hand, a large $M_5$ requires a large inverter-based buffer, which consumes considerable power at 10 Gbps. After several iterations, $M_5$ is sized at twice the width of $M_{1,2}$, and the voltage drop across it is only 50 mV. Note that the inverter chain scales with a ratio of 2, and its power consumption is approximately one fifth of that of the output stage at 10 Gbps.

Switching the output stage can significantly improve the power efficiency. However, also switching the driving stage brings more trouble than benefit. First, as illustrated in Fig. 5.4(a), keeping the DR always on makes both the input and output signals of the DR narrow band, and thus allows the use of narrow-band matching networks (NB MNs). Switching the DR, in contrast, makes its output occupy a wide frequency band, and therefore requires a wideband matching network (WB MN). Since a WB MN is usually more complex and introduces higher losses than an NB MN, the topology in Fig. 5.4(b) leads to a larger chip area and requires a larger DR to compensate for the higher losses <sup>2</sup>, which may cancel out the benefit of switching off the DR. Furthermore, modulating both DR and OP slows down the switching-on speed and degrades the on/off power ratio of the transmitter. Therefore, the architecture in Fig. 5.4(a) is adopted in this design.

The schematic of the DR and its matching networks is shown in Fig. 5.5. It is a capacitively neutralized common-source amplifier. Compared to the input stage of the PA in Chapter 3, since the input of the DR does not need to be matched to $50\,\Omega$ over a large frequency range, the degenerating inductors are eliminated, which results in a larger gain. Furthermore, since both the input and output signals of the DR are narrow band, the input is directly connected to the VCO through two large capacitors <sup>3</sup>, and the output is coupled to the OP through a transformer. These simple matching networks have very low losses and thus allow the use of a smaller DR stage while achieving the same gain and output power as the wideband input stage in Chapter 3. Furthermore, since the transmitter employs OOK modulation, the DR can work in the saturation region. Therefore, a small DR, which is only one tenth the size of the OP, is sufficient to drive the OP.

The design of the output matching network is similar to the one described in Chapter 3. However, as will be discussed in the following section, here the load of the PA, i.e. the antenna impedance, is 110  $\Omega$  rather than 50  $\Omega$ . This difference can be easily accommodated by adjusting the coupling coefficient of the transformer, which connects the output stage and the pads. The circuit parameters and the simulation results of the PA are summarized in Tab. 5.2.

<sup>2</sup>At the same time, the topology in Fig. 5.4(b) requires a larger VCO to drive the larger DR, and a larger inverter-based buffer to drive two modulating transistors.

<sup>3</sup>The input bias of the DR is provided by two large resistors, which are omitted in Fig. 5.5.



Figure 5.4: Two kinds of switching PA



Figure 5.5: The driving stage and its input and output matching networks

| Parameter                                     | Value                                                                                                   |
|-----------------------------------------------|---------------------------------------------------------------------------------------------------------|
| DR Input Transistor Size                      | $20\,\mu\mathrm{m}/28\,\mathrm{nm}$                                                                     |
| OP Input Transistor Size                      | $200\,\mu\mathrm{m}/28\,\mathrm{nm}$                                                                    |
| OP Modulating Transistor Size                 | $400\,\mu\mathrm{m}/28\,\mathrm{nm}$                                                                    |
| Tapered Inverter Transistor Size <sup>a</sup> | $25\,\mu\mathrm{m}$, $50\,\mu\mathrm{m}$, $100\,\mu\mathrm{m}$, $200\,\mu\mathrm{m}$ <sup>b</sup>       |
| Continuous Wave Output Power                  | 14.1 dBm                                                                                                |
| Continuous Wave Power Gain                    | 16 dB                                                                                                   |
| Continuous Wave Peak PAE                      | 24 %                                                                                                    |
| On-State Output Power @ 10 Gbps               | 12.1 dBm                                                                                                |
| On/Off Output Power Ratio                     | 39 dB                                                                                                   |
| DR Power Consumption                          | 15 mW                                                                                                   |
| OP Power Consumption                          | $60/2$ mW <sup>c</sup>                                                                                  |
| Inverter Power Consumption @ 10 Gbps          | 7 mW                                                                                                    |

Table 5.2: Summary of the PA

<sup>a</sup>NMOS and PMOS have the same size.

<sup>b</sup>All transistors have the minimal channel length, i.e. 28 nm.

<sup>c</sup>Divided by 2 because the OP is off half of the time.
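The simulated figures in Tab. 5.1 and Tab. 5.2 can be combined into a simple transmitter power budget. The sketch below assumes equiprobable '1' and '0' bits (so the OP conducts half of the time) and keeps the inverter power at its 10 Gbps table value; the variable and function names are illustrative.

```python
# Simulated block powers from Tab. 5.1 and Tab. 5.2, in mW.
VCO_MW = 4.0
DR_MW = 15.0
OP_ON_MW = 60.0      # output stage while transmitting '1'
INVERTER_MW = 7.0    # tapered inverter chain at 10 Gbps


def tx_average_power_mw(p_one=0.5):
    """Average transmitter power; the OP draws current only during '1' bits.

    Assumption: the inverter power is taken as the 10 Gbps value from Tab. 5.2
    regardless of p_one.
    """
    return VCO_MW + DR_MW + p_one * OP_ON_MW + INVERTER_MW


op_avg = 0.5 * OP_ON_MW
tx_total = tx_average_power_mw()
inverter_to_op = INVERTER_MW / op_avg

print(f"Average OP power: {op_avg:.0f} mW")
print(f"Average transmitter power: {tx_total:.0f} mW")
print(f"Inverter / OP power ratio: {inverter_to_op:.2f}")
```

With these simulated values the transmitter averages 56 mW, and the inverter chain consumes about 0.23 of the average OP power, in line with the "approximately one fifth" quoted above.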

#### 5.1.3 Antenna

The output of the PA is connected to an off-chip antenna through bonding wires. In order to maximize the power transfer and minimize the reflection over a 20 GHz bandwidth, the bonding wires need to be conjugately matched to the antenna. The traditional bonding connection is shown in Fig. 5.6(a), which uses three bonding wires, one for each of the GSG pads. The characteristic impedance of the bonding wires, i.e. the PA load impedance, is about 180 $\Omega$, and so the output power of the PA is limited. To circumvent this problem, this design employs GSSG pads and four bonding wires to reduce the characteristic impedance to 110 $\Omega$, as shown in Fig. 5.6(b). Furthermore, the spacing between the pads on the chip is close to the spacing of the on-board coplanar waveguide with the same characteristic impedance. Therefore, the bonding wires can be placed straight and parallel to each other <sup>4</sup>.

The antenna has a planar monopole topology, which has the advantages of intrinsic wideband operation and ease of fabrication. It is optimized together with the four bonding wires. The whole structure, shown in Fig. 5.7, is simulated in the commercial electromagnetic simulator Ansoft HFSS, which is based on the finite element method (FEM). The geometrical dimensions of the antenna are chosen for optimal performance in the 40-60 GHz frequency band.

The return loss of the structure is shown in Fig. 5.8. As can be seen, the input matching is better than -10 dB over the 40-70 GHz band.

<sup>4</sup>Non-parallel bonding wires do not have a unique characteristic impedance across a wide frequency band, which makes wideband matching difficult to obtain.



Figure 5.6: GSG connection and GSSG connection



Figure 5.7: The planar monopole antenna and the bonding wires

The radiation pattern at 50 GHz is shown in Fig. 5.9. In the plane of $\phi = 0^{\circ}$, the gain is quite uniform in all directions, but it is only -5 dBi. In the plane of $\phi = 90^{\circ}$, the gain has a peak value of 2.5 dBi, but varies significantly with direction. Note that the low gain of the antenna limits the communication distance of the link, and the directional gain causes difficulties in the measurement setup.

The antenna is realized on Rogers RT/Duroid 5880. Its performance is summarized in Tab. 5.3.



Figure 5.8: The input matching of the antenna and the bonding wires



Figure 5.9: The radiation pattern of the antenna at 50 GHz (Red curve:  $\phi = 0^{\circ}$ ; purple curve:  $\phi = 90^{\circ}$ )

Table 5.3: Summary of the antenna

| Parameter       | Value     |
|-----------------|-----------|
| Input Impedance | 110 Ω     |
| Operation Band  | 40-70 GHz |
| Peak Gain       | 2.5 dBi   |
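To illustrate how the low antenna gain limits the communication distance, the sketch below applies the Friis free-space equation to the simulated on-state output power (12.1 dBm, Tab. 5.2) and the peak antenna gain (2.5 dBi) at 50 GHz over the 13 cm distance quoted at the beginning of this chapter. It is only a first-order estimate: it assumes far-field free-space propagation, identical antennas pointed at their peak gain on both sides, and no bonding-wire or board losses. The function names are illustrative.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def free_space_path_loss_db(distance_m, freq_hz):
    """Friis free-space path loss in dB."""
    wavelength = C0 / freq_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength)


def received_power_dbm(p_tx_dbm, g_tx_dbi, g_rx_dbi, distance_m, freq_hz):
    """Received power for ideal, aligned antennas in free space."""
    return p_tx_dbm + g_tx_dbi + g_rx_dbi - free_space_path_loss_db(distance_m, freq_hz)


fspl = free_space_path_loss_db(0.13, 50e9)
p_rx = received_power_dbm(12.1, 2.5, 2.5, 0.13, 50e9)

print(f"Free-space path loss at 13 cm, 50 GHz: {fspl:.1f} dB")
print(f"Estimated received power: {p_rx:.1f} dBm")
```

Under these idealized assumptions the path loss is close to 49 dB, so every dB of antenna gain lost to misalignment translates directly into reduced link margin.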

## 5.2 Receiver

The block diagram of the realized receiver is shown in Fig. 5.10. The RF signal from the antenna is first amplified by a wideband LNA, and the LNA output is then demodulated by a squarer-based envelope detector. A dummy squarer is used to provide a voltage reference. The difference between the outputs of the two squarers is amplified by a 5-stage limiting amplifier and an output buffer to obtain a large output swing. A negative feedback loop is employed in the LA to cancel the offset of the two squarers and the LA. Since the feedback path has a low-pass characteristic, it only suppresses the low-frequency components and has little effect on the high-frequency components. In other words, the low-pass pole in the feedback path sets a lower bound on the signal frequency <sup>5</sup>.



Figure 5.10: Block diagram of the OOK receiver

#### 5.2.1 Low Noise Amplifier

The LNA is the most important building block of this receiver. As discussed in Chapter 2, its gain, noise figure and bandwidth determine the receiver performance. Since the intrinsic gain of CMOS devices is limited at operating frequencies close to their cut-off frequency, many stages are required to obtain high gain. However, cascading multiple stages not only increases power consumption, but also reduces the overall bandwidth. Therefore, dedicated techniques are required to achieve high gain over a large bandwidth with low power consumption.

In order to achieve enough gain, a 7-stage cascaded common-source LNA is employed to compensate for the low transistor gain and the high losses of the passive matching networks. The first six stages are divided into three pairs, and the two amplifiers in each pair are stacked to share DC current and reduce power consumption [86]. The current-reuse topology looks like a cascode amplifier, but the two are intrinsically different. The upper transistor in the current-reuse topology operates as a common-source amplifier: its source is a virtual ground and the RF signal feeds its gate. On the other hand, the upper transistor in the cascode topology operates as a common-gate amplifier: its gate is a virtual ground and the RF signal feeds its source. Furthermore, the current-reuse topology can provide higher gain and better noise performance than its cascode counterpart [87].

<sup>5</sup>Note that this lower bound is higher than the frequency of the low-pass pole.

Since there are seven stages, it is really challenging to obtain a 20 GHz bandwidth. In this design, second-order LC matching networks, together with staggered tuning, are used to achieve the desired bandwidth. The matching network is shown in Fig. 5.11, where $R_1$ and $C_1$ are the output impedance of the preceding stage, $R_3$ and $C_3$ the input impedance of the succeeding stage, and $L_2$ separates $C_1$ and $C_3$ for broadband operation. This network has two peaking frequencies, which can be exploited to obtain a large bandwidth. Moreover, this topology can be conveniently implemented in layout, which helps to minimize the insertion loss <sup>6</sup>.



Figure 5.11: Interstage matching network of the LNA

Unlike the other blocks in this transceiver, the LNA employs transmission lines (TLs) to implement inductors because of their several advantages. First, TLs have a better defined current return path than spiral inductors, which is particularly important for our single-ended LNA [88]. Second, although a single transmission line generally occupies a larger area than a spiral inductor, TLs can be routed very flexibly and compactly, and may therefore save area when the number of inductors is large. Third, TLs are more immune to nearby TLs, while unwanted coupling between adjacent inductors may change the behavior of each inductor. Fourth, TLs can be designed much faster thanks to the scalable model, while several iterations of time-consuming EM simulations are required for spiral inductors. To sum up, TLs can achieve higher modeling accuracy and minimize the discrepancy between simulation and measurement results. Considering the large number of inductors required in this LNA, coplanar waveguides (CPWs) are used to implement them to increase the possibility of first-pass success.

The schematic of the integrated LNA is shown in Fig. 5.12, where all capacitors are very large and can be considered short circuits at the operating frequency, and the gray lines are CPWs. The LNA consists of three current-reuse stages, one simple common-source stage, and a balun to convert the single-ended output of the LNA to the differential inputs of the following squarer. Note that the last active stage does not adopt the current-reuse technique because its output amplitude can be large, which may drive the transistors into the triode region in a current-reuse configuration, degrading the gain and noise performance. The circuit parameters and the simulation results of the LNA are summarized in Tab. 5.4.

<sup>6</sup>Higher-order networks can be used to achieve a higher GBW, but the complexity and losses of the layout make them unrealistic.



Figure 5.12: Schematic of the current-reuse LNA [89]

| Parameter                      | Value                               |
|--------------------------------|-------------------------------------|
| First 3 Stages Transistor Size | $40\,\mu\mathrm{m}/28\,\mathrm{nm}$ |
| Last Stage Transistor Size     | $20\,\mu\mathrm{m}/28\,\mathrm{nm}$ |
| Voltage Gain                   | 34 dB                               |
| Operating Band                 | 37.3-66.4 GHz                       |
| In-band Noise Figure           | 6.4-8.6 dB                          |
| In-band S11                    | < -10 dB                            |
| Power Consumption              | 24 mW                               |

Table 5.4: Summary of the LNA

#### 5.2.2 Envelope Detector

To demodulate the received signal, an envelope detector is required after the LNA. Since the transceiver adopts OOK modulation, the received signal carries only binary (1 and 0) information, and therefore linearity is not important in this work. As a result, a squarer is employed as the envelope detector in order to minimize power consumption. Due to the stringent receiver noise requirement, the ED gain and NF are two critical parameters for maintaining a high SNR and relaxing the gain required from the LA, which drastically reduces the overall power consumption.



(a) Follower based ED [90] (b) Amplifier based ED [22] (c) Single-ended ED [91]

Figure 5.13: Three topologies of squarer

Fig. 5.13(a) shows the source-follower based ED, where the push-push connection of the pair also nulls the first-order component of the output signal. The main drawback of this circuit is its limited gain. To increase the gain, [22] proposes an amplifier based ED, shown in Fig. 5.13(b). Assuming the input transistors operate in the saturation region, the output amplitude of the source-follower based ED $Aout_{SF}$ and that of the amplifier based one $Aout_{Amp}$ can be calculated as:

$$\begin{aligned}
Aout_{SF} &\approx \frac{1}{16} \frac{1}{V_{ov}} Ain^{2} \\
Aout_{Amp} &\approx \frac{1}{8} \frac{g_{m} R_{L}}{V_{ov}} Ain^{2}
\end{aligned} \tag{5.1}$$

where $g_m$ is the MOS transconductance, $R_L$ the load resistor, $V_{ov}$ the overdrive voltage and $Ain$ the input amplitude. Note that the gain of the amplifier based ED is $2 g_m R_L$ times that of the source-follower based one, i.e. more than 2 times higher whenever $g_m R_L > 1$, which relaxes the requirements on the gain and noise performance of the following stages.

Since the output of the LNA is single-ended, it is tempting to use a single-ended ED to obviate the lossy balun between the LNA and ED. A common source amplifier as shown in Fig. 5.13(c) satisfies such requirement. Its output amplitude is:

$$Aout_{SE} \approx \frac{1}{2} \frac{g_m R_L}{V_{ov}} Ain^2. \tag{5.2}$$

Comparing Eq. (5.2) with Eq. (5.1), the single-ended ED has 4 times higher gain than the amplifier based ED, which stems from twice the input amplitude. However, the input impedance of the single-ended ED is also 4 times lower than that of its differential counterpart, which may reduce the output amplitude of the LNA, i.e. the input amplitude of the ED. Therefore, the gain difference would be less than 4 times if the loss of the balun is neglected.
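The gain relations in Eq. (5.1) and Eq. (5.2) can be checked numerically with a short sketch. The input amplitude, overdrive voltage and $g_m R_L$ product used below are purely illustrative values, not design parameters of this work, and the function names are hypothetical.

```python
def aout_source_follower(a_in, v_ov):
    """Eq. (5.1), source-follower based ED."""
    return a_in ** 2 / (16.0 * v_ov)


def aout_amplifier(a_in, v_ov, gm_rl):
    """Eq. (5.1), amplifier based (differential) ED."""
    return gm_rl * a_in ** 2 / (8.0 * v_ov)


def aout_single_ended(a_in, v_ov, gm_rl):
    """Eq. (5.2), single-ended ED."""
    return gm_rl * a_in ** 2 / (2.0 * v_ov)


# Illustrative values only (assumed, not taken from the design).
A_IN, V_OV, GM_RL = 0.1, 0.2, 2.0

sf = aout_source_follower(A_IN, V_OV)
amp = aout_amplifier(A_IN, V_OV, GM_RL)
se = aout_single_ended(A_IN, V_OV, GM_RL)

print(f"Amplifier / source-follower gain ratio: {amp / sf:.2f}")
print(f"Single-ended / amplifier gain ratio:    {se / amp:.2f}")
```

The first ratio equals $2 g_m R_L$ and the second is always 4, independent of the chosen values, since all three expressions share the same $Ain^2 / V_{ov}$ dependence.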

Although the single-ended ED has the highest gain among the three topologies, it has many disadvantages due to its non-zero gain at the fundamental frequency. First, the supply noise of the single-ended LNA would be amplified by the ED and the LA, and then coupled back to the LNA. Due to the poor power supply rejection ratio (PSRR) of the LNA and the high gain of the LA, this loop may cause oscillation problems. Furthermore, the output of the single-ended ED contains a large interferer at the carrier frequency. Although this interferer can be filtered out by the limited bandwidth of the ED and the following stages, it may desensitize the following stages. Worse still, this component at the carrier frequency would significantly degrade the noise performance. To elaborate on this issue, we can calculate the NF of the ED. Fig. 5.14 shows a simplified noise model of the ED, where $s_{in}$ and $n_s$ are the input signal and noise, $s_{out}$ and $n_{out}$ the output signal and noise, $n_{int}$ the noise contributed by the ED and the following stages, and $a_1$ and $a_2$ the gain at the fundamental frequency and at the 2nd harmonic. For Fig. 5.13(a, b), $a_1$ is zero and $a_2$ is non-zero; while for Fig. 5.13(c), both $a_1$ and $a_2$ are non-zero, and $a_1$ is much larger than $a_2$.



Figure 5.14: Equivalent noise model of envelope detector

From Fig. 5.14, it is easy to get:

$$\begin{aligned}
SNR_{in} &= \frac{E[s_{in}^{2}]}{E[n_{s}^{2}]} = \frac{P_{s,in}}{P_{n,in}} \\
SNR_{out} &\approx \frac{E[s_{out}^{2}]}{E[n_{out}^{2}]} = \frac{P_{s,in}^{2}}{4P_{s,in}P_{n,in} + (a_{1}^{2}P_{n,in} + P_{int})/a_{2}^{2}} \\
F_{ED} &= \frac{SNR_{in}}{SNR_{out}} \approx 4 + \frac{a_{1}^{2}P_{n,in} + P_{int}}{a_{2}^{2}P_{s,in}P_{n,in}}
\end{aligned} \tag{5.3}$$

where  $P_{s,in}$ ,  $P_{n,in}$  and  $P_{int}$  are the power of input signal, input noise, and the noises contributed by ED and following stages.

As shown in Eq. (5.3), to minimize the NF, we need not only to maximize $a_2$, but also to minimize $a_1$. In particular, $a_1^2 P_{n,in}$ is much larger than $P_{int}$ for the single-ended ED. Simulation shows that the differential ED in Fig. 5.13(b) achieves much better noise performance than the single-ended one, and it is therefore chosen in this design.
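The effect of the fundamental-frequency gain $a_1$ on the noise factor in Eq. (5.3) can be illustrated with a short sketch. All numerical values below are assumed purely for illustration (powers expressed in V², $a_2$ in 1/V) and are not simulation results of this design; $a_2$ is kept identical in both cases to isolate the contribution of $a_1$.

```python
import math


def ed_noise_factor(a1, a2, p_s, p_n, p_int):
    """Noise factor of the ED according to Eq. (5.3)."""
    return 4.0 + (a1 ** 2 * p_n + p_int) / (a2 ** 2 * p_s * p_n)


def to_db(x):
    return 10.0 * math.log10(x)


# Illustrative values only (assumed, not taken from the design).
P_S, P_N, P_INT = 1e-3, 1e-5, 1e-8
A2 = 5.0

f_diff = ed_noise_factor(a1=0.0, a2=A2, p_s=P_S, p_n=P_N, p_int=P_INT)  # differential ED
f_se = ed_noise_factor(a1=2.0, a2=A2, p_s=P_S, p_n=P_N, p_int=P_INT)    # single-ended ED
f_floor = ed_noise_factor(a1=0.0, a2=A2, p_s=P_S, p_n=P_N, p_int=0.0)

print(f"Differential ED: NF = {to_db(f_diff):.2f} dB")
print(f"Single-ended ED: NF = {to_db(f_se):.2f} dB")
print(f"Floor (a1 = 0, no internal noise): NF = {to_db(f_floor):.2f} dB")
```

Even with a noiseless detector and $a_1 = 0$, Eq. (5.3) predicts a noise factor of 4 (about 6 dB), which is inherent to the squaring operation; any non-zero $a_1$ adds to this floor.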

Since this transceiver targets data rates up to 10 Gbps, the ED needs to have enough bandwidth to reduce the effect of inter-symbol interference (ISI) and ensure a wide eye opening. To increase the bandwidth, several techniques are employed in this design. First, a shunt-peaking inductor is added between the power supply and the load resistor. Second, a small cascode transistor is used to reduce the output capacitance. Third, the load resistor is implemented with a PMOS, whose gate voltage can be tuned to vary the equivalent load resistance, thus trading gain for bandwidth.

The final structure of the ED is shown in Fig. 5.15. Since the output of a squarer is single-ended, a dummy is added to provide a voltage reference and form a "differential" output. Although a low-pass filter could be used to extract the DC component of the squarer output and provide a voltage reference, the dummy approach achieves much better PSRR and occupies much less area at the cost of a negligible increase in power consumption. Note that the output of the dummy is always higher than or equal to the output of the squarer. In other words, the two voltages of the differential output have a systematic DC offset, which necessitates the use of a DC offset canceling circuit <sup>7</sup>. However, this offset canceling circuit is also required by the following LA, so this systematic offset cannot be treated as a penalty of this ED.



Figure 5.15: Schematic of the ED <sup>8</sup>

The circuit parameters and the simulation results of the ED are summarized in Tab. 5.5.

| Parameter               | Value                               |
|-------------------------|-------------------------------------|
| Input Transistor Size   | $40\,\mu\mathrm{m}/28\,\mathrm{nm}$ |
| Cascode Transistor Size | $16\,\mu\mathrm{m}/28\,\mathrm{nm}$ |
| Loading PMOS Size       | $6\,\mu\mathrm{m}/28\,\mathrm{nm}$  |
| Gain $a_2$              | 2-10 <sup>a</sup>                   |
| Bandwidth               | 3-15 GHz                            |
| Power Consumption       | 1.5 mW                              |

Table 5.5: Summary of the ED

<sup>a</sup>The gain can be tuned through the PMOS bias voltage, and the bandwidth varies with it.

<sup>7</sup>This is because the differential output of the ED is directly connected to the input of the LA. AC coupling can be used to avoid this issue at the cost of a huge capacitor and a large chip area.

<sup>8</sup>The input transistors can operate in the sub-threshold region and the load PMOS operates in the triode region, so stacking three transistors is feasible under a 1 V supply.

#### 5.2.3 Limiting Amplifier and Output Buffer

The output of the ED is fed to a chain of baseband amplifiers to achieve a large output swing. Given the receiver sensitivity and the gain of the LNA and ED, a gain of 33 dB is required for the LA to obtain a 400 mV output amplitude. Besides the gain, the bandwidth is also critical for maximizing SNR and minimizing ISI. As discussed in Chapter 2, a 7 GHz bandwidth is optimal for 10 Gbps OOK transceivers.

Cascading multiple stages generally increases the gain but decreases the bandwidth. Assuming a first-order loading network, the total gain-bandwidth product $GBW_{Tot}$ of a chain of n identical amplifier cells is [92]:

$$GBW_{Tot} = A_S^n BW_S \sqrt{2^{1/n} - 1} = GBW_S A_{Tot}^{1 - 1/n} \sqrt{2^{1/n} - 1} \tag{5.4}$$

where  $A_S$ ,  $BW_S$  and  $GBW_S$  are the gain, bandwidth and GBW of each stage, respectively.

As shown in Eq. (5.4), for a certain desired  $GBW_{Tot}$ , smaller  $GBW_S$  requires larger n and thus larger power consumption<sup>9</sup>. Therefore, it is necessary to maximize  $GBW_S$  in order to minimize n and the power consumption.
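Eq. (5.4) can be inverted to estimate the per-stage GBW required for a given overall gain and bandwidth. The sketch below does this for the 33 dB gain and 7 GHz bandwidth targeted for the LA, under the idealized assumption of n identical first-order stages (the actual LA uses neutralization and an $f_T$-doubler in the fifth stage, which this model ignores). The function names are illustrative.

```python
import math


def bandwidth_shrinkage(n):
    """Bandwidth reduction factor of n cascaded identical first-order stages."""
    return math.sqrt(2.0 ** (1.0 / n) - 1.0)


def required_stage_gbw_ghz(a_tot, bw_tot_ghz, n):
    """Per-stage GBW (GHz) needed to reach a_tot and bw_tot_ghz, from Eq. (5.4)."""
    a_stage = a_tot ** (1.0 / n)
    bw_stage = bw_tot_ghz / bandwidth_shrinkage(n)
    return a_stage * bw_stage


A_TOT = 10.0 ** (33.0 / 20.0)  # 33 dB voltage gain
BW_TOT_GHZ = 7.0

gbw_by_n = {n: required_stage_gbw_ghz(A_TOT, BW_TOT_GHZ, n) for n in range(1, 9)}
for n, gbw in gbw_by_n.items():
    print(f"n = {n}: required GBW_S = {gbw:6.1f} GHz")
```

For a single stage the required GBW exceeds 300 GHz, while for five stages it drops to about 39 GHz; the improvement flattens as n grows further, which is why maximizing $GBW_S$ matters more than adding stages.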

Due to the relatively high gain, the Miller effect greatly increases the input capacitance of each stage, severely limiting $BW_S$ and $GBW_S$. To circumvent this problem, a negative Miller capacitor, i.e. the capacitive neutralization used in the PA, is adopted in the LA to increase $GBW_S$ without any penalty in power consumption. No inductive peaking is used in order to minimize the chip area. The first four stages of the LA are identical to simplify the design. The fifth stage, which needs to drive the large output buffer, employs an $f_T$-doubler architecture to double its driving ability while presenting the same load impedance to the fourth stage <sup>10</sup>. Both schematics are shown in Fig. 5.16.

<sup>9</sup>Although a larger n does not necessarily lead to a larger $GBW_{Tot}$, when n is small, $GBW_{Tot}$ increases as n increases.

<sup>10</sup>The $f_T$-doubler also doubles the power dissipation, and thus is not used for the first four stages.

<sup>11</sup>The bias voltage Vb is obtained by feeding the inputs $In_p$ and $In_n$ into a low-pass filter.

Figure 5.16: Schematics of the LA

To achieve a single-ended output amplitude of 400 mV, the output buffer needs to deliver 8 mA into the 50 $\Omega$ load of the measurement instruments. This large current necessitates the use of large transistors. In order to minimize the load capacitance for the previous stage, the buffer also adopts the $f_T$-doubler architecture. Furthermore, the open-drain configuration has been selected to avoid splitting the current with an on-chip load resistance of the buffer, thus minimizing the buffer size for the given output swing. The drain bias is provided by two off-chip bias tees. The schematic of the buffer is shown in Fig. 5.17.



Figure 5.17: Schematic of the buffer

In order to cancel the systematic offset of the ED output and the random offset of the LA due to PVT variations, a DC offset cancellation loop is employed. The offset is sensed at the input of the buffer rather than at its output, since the buffer is open drain and thus its DC gain is zero. The feedback is closed at the output of the first LA stage in order not to decrease the load impedance seen by the ED, which would severely degrade its gain. The feedback path consists of an RC low-pass filter and a differential amplifier, which is similar to the one shown in Fig. 5.16(a) but without negative Miller capacitors. The pole of the low-pass filter has been set to 450 kHz, low enough to avoid significant eye closure due to the droop over the longest expected run of identical bits.
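The impact of the offset cancellation loop on long runs of identical bits can be estimated with a first-order high-pass model. The sketch below is only an approximation under stated assumptions: the closed-loop response is treated as a single-pole high-pass filter whose corner is the 23 MHz lower band edge reported in Tab. 5.6, and the longest run is assumed to be 7 bits (as in a PRBS7 pattern), which is not specified in the text. The function name is illustrative.

```python
import math


def run_droop_fraction(f_hp_hz, run_bits, bit_rate_bps):
    """Fractional droop of a first-order high-pass response after a run of identical bits."""
    run_time = run_bits / bit_rate_bps
    return 1.0 - math.exp(-2.0 * math.pi * f_hp_hz * run_time)


F_HP_HZ = 23e6   # lower band edge from Tab. 5.6 (assumed to act as a single pole)
RUN_BITS = 7     # assumed longest run (PRBS7)

droop_10g = run_droop_fraction(F_HP_HZ, RUN_BITS, 10e9)
droop_5g = run_droop_fraction(F_HP_HZ, RUN_BITS, 5e9)

print(f"Droop after {RUN_BITS} identical bits at 10 Gbps: {100 * droop_10g:.1f} %")
print(f"Droop after {RUN_BITS} identical bits at 5 Gbps:  {100 * droop_5g:.1f} %")
```

The droop grows with the run duration, so for a fixed high-pass corner, lower data rates and longer pattern lengths are the more demanding cases.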



(a) Chip micrograph of the transmitter



(b) Chip micrograph of the receiver

Figure 5.18: Micrographs of the transceiver

The circuit parameters and the simulation results of the LA and output buffer are summarized in Tab. 5.6.

| Parameter                    | Value                                 |
|------------------------------|---------------------------------------|
| LA Input Transistor Size     | $20\,\mu\mathrm{m}/28\,\mathrm{nm}$   |
| LA Tail Transistor Size      | $100\,\mu\mathrm{m}/100\,\mathrm{nm}$ |
| Buffer Input Transistor Size | $40\,\mu\mathrm{m}/28\,\mathrm{nm}$   |
| Buffer Tail Transistor Size  | $200\,\mu\mathrm{m}/28\,\mathrm{nm}$  |
| Gain                         | 35 dB                                 |
| Operation Band               | 23 MHz - 8.9 GHz <sup>a</sup>         |
| Output Swing                 | 410 mV                                |
| Power Consumption            | 25 mW                                 |

Table 5.6: Summary of the LA and output buffer

<sup>a</sup>The low-frequency components are removed by the offset canceling loop.

## 5.3 Measurement

A prototype of the proposed OOK transceiver is fabricated in ST 28 nm bulk CMOS LP technology. The chip micrographs of both the transmitter and the receiver are shown in Fig. 5.18. The transmitter occupies an area of $950 \times 630\ \mu\mathrm{m}^2$. It is pad limited, and the core area is only $360 \times 150\ \mu\mathrm{m}^2$. The receiver occupies an area of $1450 \times 1050\ \mu\mathrm{m}^2$; this large area is caused by the transmission lines of the LNA.

#### 5.3.1 Measurement Setup

Both the transmitter and the receiver are assembled on custom designed boards, which contain the planar antennas. The connection of the transmitter with its board is illustrated in Fig. 5.19.

The measurement of the transceiver is much more complicated than that of the PA. The setup is shown in Fig. 5.20. First, the transmitted pseudorandom binary sequence (PRBS) is generated by a BER tester (Anritsu MP 1763B pulse generator). The PRBS signal modulates the PA, and the PA output is radiated into free space by the antenna. The receiver antenna picks up the RF signal, which is then amplified and demodulated by the receiver. The two differential output signals of the receiver are sent to the BER tester and to an oscilloscope, respectively. The BER tester calculates the BER and the oscilloscope shows the eye diagram. All the biasing voltages and the oscillating frequency of the VCO are programmed from a PC via an Arduino.



Figure 5.19: Connection between the transmitter and the board

#### 5.3.2 Measurement Results

Fig. 5.21 shows the eye diagrams at the output of the receiver when transmitting 1 Gbps and 3 Gbps respectively. The vertical eye opening is larger than 400 mV in both cases. The horizontal eye opening is larger than 700 ps for 1 Gbps, and larger than 200 ps for 3 Gbps, i.e. larger than 60% of data period in both cases.
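The relative horizontal eye opening quoted above follows directly from the bit period. The short sketch below reproduces the arithmetic for the two measured cases, using the lower bounds of 700 ps and 200 ps given in the text; the function name is illustrative.

```python
def horizontal_opening_fraction(opening_ps, data_rate_gbps):
    """Horizontal eye opening as a fraction of the bit period."""
    bit_period_ps = 1000.0 / data_rate_gbps
    return opening_ps / bit_period_ps


frac_1g = horizontal_opening_fraction(700.0, 1.0)
frac_3g = horizontal_opening_fraction(200.0, 3.0)

print(f"1 Gbps: opening > {100 * frac_1g:.0f} % of the bit period")
print(f"3 Gbps: opening > {100 * frac_3g:.0f} % of the bit period")
```

The 1 Gbps case corresponds to at least 70% of the 1000 ps bit period and the 3 Gbps case to at least 60% of the 333 ps bit period.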

The measured BER versus transmission distance is shown in Fig. 5.22. The transceiver achieves error-free transmission up to 13 cm at a data rate of 5 Gbps, which is the upper limit of the measurement instruments.

#### 5.3.3 Comparisons

The performance is summarized and compared with published mm-wave links in Tab. 5.7. Among the links employing non-directive antennas, the implemented transceiver shows the best combination of data rate and communication distance while keeping power consumption low.



(a) Setup of the measurement



(b) Close-up picture of the transceiver front-end

Figure 5.20: Measurement setup of the OOK transceiver




(a) Transmitting 1 Gbps over 5 cm



(b) Transmitting 3 Gbps over 2 cm

Figure 5.21: Eye Diagram at the output of the receiver



Figure 5.22: BER performance of the link


|                  | This Work     | [93]       | [94]            | [25]            | [95]            |
|------------------|---------------|------------|-----------------|-----------------|-----------------|
| Tech.            | 28 nm         | 40 nm      | $65\mathrm{nm}$ | $65\mathrm{nm}$ | $90\mathrm{nm}$ |
| Mod.             | OOK           | ASK        | 16-QAM          | QPSK            | QPSK            |
| $G_{Ant}$ [dBi]  | 2.5           | 4          | 2               | 0               | 6.5             |
| Data Rate [Gbps] | $5^{a}$       | 11         | 11              | 2.6             | 1.5             |
| Distance [mm]    | 130           | 14         | 50              | 40              | 1000            |
| BER              | $10^{-12}$    | $10^{-11}$ | $10^{-3}$       | n.d.            | n.d.            |
| Pdiss [mW]       | 130           | 70         | 230             | 358             | 1772            |

Table 5.7: Comparison table of mm-wave links

 $^{a}$ Limited by measurement instruments.
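One way to read the power and data-rate rows of Tab. 5.7 together is as energy per bit, $P_{diss}$ divided by the data rate (1 mW per Gbps equals 1 pJ/bit). The Python sketch below derives this figure from the table entries. It is a derived metric that is not reported in the table itself, and it ignores the differences in distance, antenna gain and modulation between the links, so it should not be taken as a like-for-like ranking.

```python
# (data rate in Gbps, dissipated power in mW), copied from Tab. 5.7
LINKS = {
    "This work": (5.0, 130.0),
    "[93]": (11.0, 70.0),
    "[94]": (11.0, 230.0),
    "[25]": (2.6, 358.0),
    "[95]": (1.5, 1772.0),
}


def energy_per_bit_pj(rate_gbps: float, power_mw: float) -> float:
    """Energy per transmitted bit in pJ (mW divided by Gbps gives pJ/bit)."""
    return power_mw / rate_gbps


for name, (rate_gbps, power_mw) in LINKS.items():
    print(f"{name:>9}: {energy_per_bit_pj(rate_gbps, power_mw):7.1f} pJ/bit")
```

For this work the result is 26 pJ/bit. Since the 5 Gbps data rate is limited by the measurement instruments, this value is an upper bound on the energy per bit of the link.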

# 5.4 Conclusion

Following the system-level analysis in Chapter 2, this chapter describes the implementation of a wideband OOK transceiver. A modulating PA is employed to realize OOK modulation and improve power efficiency. An off-chip planar antenna radiates the PA output into free space. On the receiver side, a current-reuse topology is adopted for the LNA to minimize power consumption. The amplified signal is then demodulated by a squarer-based envelope detector. A wideband LA and buffer amplify the demodulated signal, and a feedback path is inserted in the LA chain to cancel the offset. The fabricated prototype achieves error-free transmission up to 13 cm at 5 Gbps, while consuming only 130 mW.


# Chapter 6

# Conclusion

Thanks to the large available bandwidth, the mm-wave frequency band offers an opportunity to satisfy the increasing demand for wireless data capacity. However, due to limited transistor gain and the low quality factors of passive components, it is challenging to realize high performance transceivers. This thesis focuses on the power amplifier, a key block of the transceiver. Several design techniques are proposed in order to generate high output power over a wide mm-wave band. Furthermore, a wideband OOK transceiver has been implemented to demonstrate the feasibility of realizing high speed, low power links in bulk CMOS technology. The major contributions of this thesis are summarized below.

A design methodology is proposed for wideband and compact matching networks in PAs. The proposed technique leverages the high gain-bandwidth product (GBW) of inductively coupled resonators to achieve high gain over a wide frequency range. Furthermore, impedance transformation and topology rearrangement techniques are exploited to maximize the power transfer and minimize the layout losses. Using the proposed methodology, a two-stage prototype has been realized in 28 nm CMOS technology [42]. It shows 27 GHz bandwidth with 13.3 dBm $P_{SAT}$ and 16% PAE.

Besides obtaining large bandwidth, achieving high output power is also desirable for enhancing link performance. The limitations on the output power of CMOS PAs are analyzed, and some approaches to overcome them are discussed in this thesis. Particular attention is given to the widely used non-isolated power combining and splitting techniques. Stability analysis reveals the common-mode differential-mode oscillation problem in the traditional non-isolated splitter. A novel power splitter, which is capable of isolating the common-mode signals in different PA paths, is proposed to suppress the oscillation. Using the proposed structure, a three-stage two-path PA has been implemented in 65 nm CMOS technology. The PA shows no stability issue and achieves 30 dB power gain, 20 dBm $P_{SAT}$ and 22% peak PAE with a bandwidth from 58.5 GHz to 73.5 GHz.

A wideband OOK transceiver has been realized in 28 nm CMOS technology [37]. Various techniques, including a modulating PA, a current-reuse LNA and a squarer-based envelope detector, are employed to simplify the transceiver architecture and minimize the power consumption. The large bandwidth has been exploited to achieve high data rate with low power consumption. With 2.5 dBi antenna gain, the realized transceiver achieves error-free transmission at up to 5 Gbps over 13 cm, while consuming only 130 mW.

# Bibliography

- [1] ERICSSON, "Ericsson mobility report," 2014. [Online]. Available: http://www.ericsson.com/res/docs/2014/ericsson-mobility-report-august-2014-interim.pdf
- [2] C. Marcu, D. Chowdhury, C. Thakkar, L.-K. Kong, M. Tabesh, J.-D. Park, Y. Wang, B. Afshar, A. Gupta, A. Arbabian, S. Gambini, R. Zamani, A. Niknejad, and E. Alon, "A 90nm CMOS low-power 60GHz transceiver with integrated baseband circuitry," in *Solid-State Circuits Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International*, Feb 2009, pp. 314–315,315a.
- [3] J. Lee, Y. Huang, Y. Chen, H. Lu, and C. Chang, "A low-power fully integrated 60GHz transceiver system with OOK modulation and on-board antenna assembly," in *Solid-State Circuits Conference - Digest of Technical Papers*, 2009. *ISSCC 2009. IEEE International*, Feb 2009, pp. 316–317,317a.
- [4] H. Wang, K.-W. Chang, L. Tran, J. Cowles, T. R. Block, E. Lin, G. Dow, A. Oki, D. Streit, and B. Allen, "Low phase noise millimeter-wave frequency sources using InP-based HBT MMIC technology," *Solid-State Circuits, IEEE Journal of*, vol. 31, no. 10, pp. 1419–1425, Oct 1996.
- [5] B. Razavi, "A 300-GHz Fundamental Oscillator in 65-nm CMOS Technology," Solid-State Circuits, IEEE Journal of, vol. 46, no. 4, pp. 894–903, April 2011.
- [6] ITRS, "Radio frequency and analog/mixed-signal technologies summary," 2013.
   [Online]. Available: http://www.itrs.net/Links/2013ITRS/2013Chapters/2013RFAMS\_Summary.pdf
- [7] marketsandmarkets.com, "Millimeter Wave Technology Market by Components, Products, Applications /- Analysis Forecast to 2020," 2014.
   [Online]. Available: http://www.marketsandmarkets.com/Market-Reports/millimeter-wave-technology-market-981.html
- [8] Broadcom, "Broadcom Announces Industry's First 10 Gbps Millimeter Wave SoC," 2014. [Online]. Available: http://www.broadcom.com/press/release.php?id=s826575

- [9] P. Adhikari, "Understanding Millimeter Wave Wireless Communication," 2008.
   [Online]. Available: http://www.loeacom.com/L1104-WP\_Understanding%20MMWCom.pdf
- [10] [Online]. Available: http://www.exaltcom.com/Mobile-Operators.aspx
- [11] D. Murph, "Heavily-backed WiGig Alliance to stream everything over 60GHz," 2009. [Online]. Available: http://www.engadget.com/2009/05/06/heavily-backed-wigig-alliance-to-stream-everything-over-60ghz/
- [12] U. of Stuttgart, "Statistical Signal Processing Automotive Radar." [Online]. Available: http://www.iss.uni-stuttgart.de/lehre/masterLabRadar/
- [13] Bosch, "LRR3: 3rd generation Long-Range Radar Sensor," 2009.
   [Online]. Available: http://www.bosch-automotivetechnology.com/media/db\_application/downloads/pdf/safety\_1/en\_4/lrr3\_datenblatt\_de\_2009.pdf
- [14] Y.-A. Li, M.-H. Hung, S.-J. Huang, and J. Lee, "A fully integrated 77GHz FMCW radar system in 65nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2010 IEEE International*, Feb 2010, pp. 216–217.
- [15] A. Tang, G. Virbila, D. Murphy, F. Hsiao, Y. Wang, Q. Gu, Z. Xu, Y. Wu, M. Zhu, and M.-C. Chang, "A 144GHz 0.76cm-resolution sub-carrier SAR phase radar for 3D imaging in 65nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International*, Feb 2012, pp. 264–266.
- [16] ITRS, "RF and Analog/Mixed-signal Technologies (RFAMS)," 2012. [Online]. Available: http://www.itrs.net/Links/2012ITRS/Home2012.htm
- [17] B. Razavi, *RF Microelectronics*, 2nd ed. NJ, USA: Prentice Hall Press, 2011.
- [18] B. Heydari, M. Bohsali, E. Adabi, and A. Niknejad, "A 60 GHz Power Amplifier in 90nm CMOS Technology," in *Custom Integrated Circuits Conference*, 2007. *CICC '07. IEEE*, Sept 2007, pp. 769–772.
- [19] D. Zhao and P. Reynaert, "A 0.9V 20.9dBm 22.3%-PAE E-band power amplifier with broadband parallel-series power combiner in 40nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International*, Feb 2014, pp. 248–249.
- [20] F. Vecchi, S. Bozzola, M. Pozzoni, D. Guermandi, E. Temporiti, M. Repossi, U. Decanis, A. Mazzanti, and F. Svelto, "A wideband mm-Wave CMOS receiver for Gb/s communications employing interstage coupled resonators," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2010 IEEE International*, Feb 2010, pp. 220–221.

- [21] C. E. Shannon, "The Mathematical Theory of Communication," in *The Bell System Technical Journal*, Oct 1948.
- [22] J. Lee, Y. Chen, and Y. Huang, "A Low-Power Low-Cost Fully-Integrated 60-GHz Transceiver System With OOK Modulation and On-Board Antenna Assembly," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 2, pp. 264–275, Feb 2010.
- [23] J. Chen, "Advanced Architectures for Efficient mm-Wave CMOS Wireless Transmitters," Ph.D. dissertation, EECS Department, University of California, Berkeley, May 2014. [Online]. Available: http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-42.html
- [24] A. Tomkins, R. Aroca, T. Yamamoto, S. Nicolson, Y. Doi, and S. Voinigescu, "A Zero-IF 60 GHz 65 nm CMOS Transceiver With Direct BPSK Modulation Demonstrating up to 6 Gb/s Data Rates Over a 2 m Wireless Link," *Solid-State Circuits, IEEE Journal of*, vol. 44, no. 8, pp. 2085–2099, Aug 2009.
- [25] T. Mitomo, Y. Tsutsumi, H. Hoshino, M. Hosoya, T. Wang, Y. Tsubouchi, R. Tachibana, A. Sai, Y. Kobayashi, D. Kurose, T. Ito, K. Ban, T. Tandai, and T. Tomizawa, "A 2Gb/s-throughput CMOS transceiver chipset with in-package antenna for 60GHz short-range wireless communication," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International*, Feb 2012, pp. 266–268.
- [26] K. Okada, K. Kondou, M. Miyahara, M. Shinagawa, H. Asada, R. Minami, T. Yamaguchi, A. Musa, Y. Tsukui, Y. Asakura, S. Tamonoki, H. Yamagishi, Y. Hino, T. Sato, H. Sakaguchi, N. Shimasaki, T. Ito, Y. Takeuchi, N. Li, Q. Bu, R. Murakami, K. Bunsen, K. Matsushita, M. Noda, and A. Matsuzawa, "Full Four-Channel 6.3-Gb/s 60-GHz CMOS Transceiver With Low-Power Analog and Digital Baseband Circuitry," *Solid-State Circuits, IEEE Journal of*, vol. 48, no. 1, pp. 46–65, Jan 2013.
- [27] I. Sarkas, S. Nicolson, A. Tomkins, E. Laskin, P. Chevalier, B. Sautreuil, and S. Voinigescu, "An 18-Gb/s, Direct QPSK Modulation SiGe BiCMOS Transceiver for Last Mile Links in the 70-80 GHz Band," *Solid-State Circuits*, *IEEE Journal of*, vol. 45, no. 10, pp. 1968–1980, Oct 2010.
- [28] V. Vidojkovic, V. Szortyka, K. Khalaf, G. Mangraviti, S. Brebels, W. Van Thillo, K. Vaesen, B. Parvais, V. Issakov, M. Libois, M. Matsuo, J. Long, C. Soens, and P. Wambacq, "A low-power radio chipset in 40nm LP CMOS with beamforming for 60GHz high-data-rate wireless communication," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International*, Feb 2013, pp. 236–237.

- [29] K. Okada, R. Minami, Y. Tsukui, S. Kawai, Y. Seo, S. Sato, S. Kondo, T. Ueno, Y. Takeuchi, T. Yamaguchi, A. Musa, R. Wu, M. Miyahara, and A. Matsuzawa, "A 64-QAM 60GHz CMOS transceiver with 4-channel bonding," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International*, Feb 2014, pp. 346–347.
- [30] L. Kong and E. Alon, "A 21.5mW 10+Gb/s mm-Wave phased-array transmitter in 65nm CMOS," in VLSI Circuits (VLSIC), 2012 Symposium on, June 2012, pp. 52–53.
- [31] L. Kong, D. Seo, and E. Alon, "A 50mW-TX 65mW-RX 60GHz 4-element phased-array transceiver with integrated antennas in 65nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International*, Feb 2013, pp. 234–235.
- [32] J. Chen, L. Ye, D. Titz, F. Gianesello, R. Pilard, A. Cathelin, F. Ferrero, C. Luxey, and A. Niknejad, "A digitally modulated mm-Wave cartesian beamforming transmitter with quadrature spatial combining," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International*, Feb 2013, pp. 232–233.
- [33] A. Balteanu, I. Sarkas, E. Dacquay, A. Tomkins, G. Rebeiz, P. Asbeck, and S. Voinigescu, "A 2-Bit, 24 dBm, Millimeter-Wave SOI CMOS Power-DAC Cell for Watt-Level High-Efficiency, Fully Digital m-ary QAM Transmitters," *Solid-State Circuits, IEEE Journal of*, vol. 48, no. 5, pp. 1126–1137, May 2013.
- [34] K. Khalaf, V. Vidojkovic, K. Vaesen, J. Long, W. Van Thillo, and P. Wambacq, "A digitally modulated 60GHz polar transmitter in 40nm CMOS," in *Radio Frequency Integrated Circuits Symposium, 2014 IEEE*, June 2014, pp. 159–162.
- [35] S. Shopov, A. Balteanu, and S. Voinigescu, "A 19 dBm, 15 Gbaud, 9 bit SOI CMOS Power-DAC Cell for High-Order QAM W-Band Transmitters," *Solid-State Circuits, IEEE Journal of*, vol. 49, no. 7, pp. 1653–1664, July 2014.
- [36] T.-A. Phan, J. Lee, V. Krizhanovskii, S.-K. Han, and S.-G. Lee, "A 18-pJ/Pulse OOK CMOS Transmitter for Multiband UWB Impulse Radio," *Microwave and Wireless Components Letters, IEEE*, vol. 17, no. 9, pp. 688–690, Sept 2007.
- [37] J. Zhao, K. Hadipour, A. Ghilioni, M. Bassi, A. Mazzanti, and F. Svelto, "A Highly-Integrated Low-Power 10Gbps OOK Receiver for mm-Wave Short-Haul Wireless Link in CMOS 28nm," *International Journal of Electronics and Electrical Engineering*, vol. 2, no. 1, pp. 60–64, March 2014.
- [38] J. Nanzer, Microwave and Millimeter-Wave Remote Sensing for Security Applications. Artech House, 2012.


- [39] D. Chowdhury, P. Reynaert, and A. Niknejad, "A 60GHz 1V + 12.3dBm Transformer-Coupled Wideband PA in 90nm CMOS," in *Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International*, Feb 2008, pp. 560–635.
- [40] W. Chan, J. Long, M. Spirito, and J. Pekarik, "A 60GHz-band 1V 11.5dBm power amplifier with 11% PAE in 65nm CMOS," in *Solid-State Circuits Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International*, Feb 2009, pp. 380–381,381a.
- [41] J. Chen and A. Niknejad, "A compact 1V 18.6dBm 60GHz power amplifier in 65nm CMOS," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, Feb 2011, pp. 432–433.
- [42] J. Zhao, M. Bassi, A. Bevilacqua, A. Ghilioni, A. Mazzanti, and F. Svelto, "A 40-67GHz power amplifier with 13dBm PSAT and 16% PAE in 28 nm CMOS LP," in *European Solid State Circuits Conference (ESSCIRC), ESSCIRC 2014 - 40th*, Sept 2014, pp. 179–182.
- [43] H. Wang, C. Sideris, and A. Hajimiri, "A CMOS Broadband Power Amplifier With a Transformer-Based High-Order Output Matching Network," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 12, pp. 2709–2722, Dec 2010.
- [44] S. Thyagarajan, A. Niknejad, and C. Hull, "A 60 GHz Drain-Source Neutralized Wideband Linear Power Amplifier in 28 nm CMOS," *Circuits and Systems I: Regular Papers, IEEE Transactions on*, vol. 61, no. 8, pp. 2253–2262, Aug 2014.
- [45] K. Raczkowski, S. Thijs, W. De Raedt, B. Nauwelaers, and P. Wambacq, "50-to-67GHz ESD-protected power amplifiers in digital 45nm LP CMOS," in Solid-State Circuits Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International, Feb 2009, pp. 382–383,383a.
- [46] L. Besser and R. Gilmore, Practical RF Circuit Design for Modern Wireless Systems, Volume I: Passive Circuits and Systems. Artech House, 2003.
- [47] S. Cripps, RF Power Amplifiers for Wireless Communications, 2nd ed. Artech House, 2006.
- [48] S. C. Cripps, Advanced Techniques in RF Power Amplifier Design. Artech House, 2002.
- [49] J. Long, "Monolithic transformers for silicon RF IC design," Solid-State Circuits, IEEE Journal of, vol. 35, no. 9, pp. 1368–1382, Sept 2000.
- [50] S. Thyagarajan, A. Niknejad, and C. Hull, "A 60 GHz linear wideband power amplifier using cascode neutralization in 28 nm CMOS," in *Custom Integrated Circuits Conference (CICC)*, 2013 IEEE, Sept 2013, pp. 1–4.

- [51] T. Lee, The Design of CMOS Radio-Frequency Integrated Circuits. Cambridge University Press, 2004.
- [52] I. Aoki, S. Kee, D. Rutledge, and A. Hajimiri, "Distributed active transformer - a new power-combining and impedance-transformation technique," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 50, no. 1, pp. 316–331, Jan 2002.
- [53] S. Kim, K. Lee, J. Lee, B. Kim, S. Kee, I. Aoki, and D. Rutledge, "An optimized design of distributed active transformer," *Microwave Theory and Techniques*, *IEEE Transactions on*, vol. 53, no. 1, pp. 380–388, Jan 2005.
- [54] L. Samoska, K.-Y. Lin, H. Wang, Y.-H. Chung, M. Aust, S. Weinreb, and D. Dawson, "On the stability of millimeter-wave power amplifiers," in *Microwave Symposium Digest, 2002 IEEE MTT-S International*, vol. 1, June 2002, pp. 429–432 vol.1.
- [55] W. Chan and J. Long, "A 58-65 GHz Neutralized CMOS Power Amplifier With PAE Above 10% at 1-V Supply," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 3, pp. 554–564, March 2010.
- [56] R. M. Fano, "Theoretical limitations on the broadband matching of arbitrary impedances," Jan 1948.
- [57] N. Deferm, J. Osorio, A. de Graauw, and P. Reynaert, "A 94GHz differential power amplifier in 45nm LP CMOS," in *Radio Frequency Integrated Circuits Symposium (RFIC), 2011 IEEE*, June 2011, pp. 1–4.
- [58] B. Razavi, "A Millimeter-Wave CMOS Heterodyne Receiver With On-Chip LO and Divider," *Solid-State Circuits, IEEE Journal of*, vol. 43, no. 2, pp. 477–485, Feb 2008.
- [59] A. Siligaris, O. Richard, B. Martineau, C. Mounet, F. Chaix, R. Ferragut, C. Dehos, J. Lanteri, L. Dussopt, S. Yamamoto, R. Pilard, P. Busson, A. Cathelin, D. Belot, and P. Vincent, "A 65-nm CMOS Fully Integrated Transceiver Module for 60-GHz Wireless HD Applications," *Solid-State Circuits, IEEE Journal of*, vol. 46, no. 12, pp. 3005–3017, Dec 2011.
- [60] M. Abbasi, T. Kjellberg, A. de Graauw, E. van der Heijden, R. Roovers, and H. Zirath, "A broadband differential cascode power amplifier in 45 nm CMOS for high-speed 60 GHz system-on-chip," in *Radio Frequency Integrated Circuits Symposium (RFIC), 2010 IEEE*, May 2010, pp. 533–536.
- [61] T. Wang, T. Mitomo, N. Ono, and O. Watanabe, "A 55-67GHz power amplifier with 13.6% PAE in 65 nm standard CMOS," in *Radio Frequency Integrated Circuits Symposium (RFIC), 2011 IEEE*, June 2011, pp. 1–4.

- [62] G. Hanington, P.-F. Chen, P. Asbeck, and L. Larson, "High-efficiency power amplifier using dynamic power-supply voltage for CDMA applications," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 47, no. 8, pp. 1471–1476, Aug 1999.
- [63] D. Fritsche, R. Wolf, and F. Ellinger, "Analysis and Design of a Stacked Power Amplifier With Very High Bandwidth," *Microwave Theory and Techniques*, *IEEE Transactions on*, vol. 60, no. 10, pp. 3223–3231, Oct 2012.
- [64] J.-H. Chen, S. Helmi, R. Azadegan, F. Aryanfar, and S. Mohammadi, "A Broadband Stacked Power Amplifier in 45-nm CMOS SOI Technology," *Solid-State Circuits, IEEE Journal of*, vol. 48, no. 11, pp. 2775–2784, Nov 2013.
- [65] P. Reynaert and A. Niknejad, "Power combining techniques for RF and mm-wave CMOS power amplifiers," in *Solid State Circuits Conference, 2007. ESSCIRC 2007. 33rd European*, Sept 2007, pp. 272–275.
- [66] K. H. An, O. Lee, H. Kim, D. H. Lee, J. Han, K. S. Yang, Y. Kim, J. J. Chang, W. Woo, C.-H. Lee, H. Kim, and J. Laskar, "Power-Combining Transformer Techniques for Fully-Integrated CMOS Power Amplifiers," *Solid-State Circuits*, *IEEE Journal of*, vol. 43, no. 5, pp. 1064–1075, May 2008.
- [67] C. Liang and B. Razavi, "Transmitter Linearization by Beamforming," Solid-State Circuits, IEEE Journal of, vol. 46, no. 9, pp. 1956–1969, Sept 2011.
- [68] J. Harvey, E. Brown, D. Rutledge, and R. York, "Spatial power combining for high-power transmitters," *Microwave Magazine*, *IEEE*, vol. 1, no. 4, pp. 48–59, Dec 2000.
- [69] A. Safarian, L. Zhou, and P. Heydari, "CMOS Distributed Active Power Combiners and Splitters for Multi-Antenna UWB Beamforming Transceivers," *Solid-State Circuits, IEEE Journal of*, vol. 42, no. 7, pp. 1481–1491, July 2007.
- [70] Y. Zhao, "High-Performance mm-Wave and Wideband Large-Signal Amplifiers," PhD dissertation, Delft University of Technology, October 2013.
- [71] C. Law and A.-V. Pham, "A high-gain 60GHz power amplifier with 20dBm output power in 90nm CMOS," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2010 IEEE International, Feb 2010, pp. 426–427.
- [72] Y.-S. Jiang, J.-H. Tsai, and H. Wang, "A W-Band Medium Power Amplifier in 90 nm CMOS," *Microwave and Wireless Components Letters, IEEE*, vol. 18, no. 12, pp. 818–820, Dec 2008.
- [73] Y.-H. Hsiao, Z.-M. Tsai, H.-C. Liao, J.-C. Kao, and H. Wang, "Millimeter-Wave CMOS Power Amplifiers With High Output Power and Wideband Performances," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 61, no. 12, pp. 4520–4533, Dec 2013.

- [74] A.-K. Chen, Y. Baeyens, Y.-K. Chen, and J. Lin, "An 83-GHz High-Gain SiGe BiCMOS Power Amplifier Using Transmission-Line Current-Combining Technique," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 61, no. 4, pp. 1557–1569, April 2013.
- [75] K.-Y. Wang, T.-Y. Chang, and C.-K. Wang, "A 1V 19.3dBm 79GHz power amplifier in 65nm CMOS," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, Feb 2012, pp. 260–262.
- [76] M. Bohsali and A. Niknejad, "Current combining 60GHz CMOS power amplifiers," in *Radio Frequency Integrated Circuits Symposium*, 2009. *RFIC 2009*. *IEEE*, June 2009, pp. 31–34.
- [77] K. H. An, O. Lee, H. Kim, D. H. Lee, J. Han, K. S. Yang, Y. Kim, J. J. Chang, W. Woo, C.-H. Lee, H. Kim, and J. Laskar, "Power-Combining Transformer Techniques for Fully-Integrated CMOS Power Amplifiers," *Solid-State Circuits*, *IEEE Journal of*, vol. 43, no. 5, pp. 1064–1075, May 2008.
- [78] M. Boers, "A 60GHz transformer coupled amplifier in 65nm digital CMOS," in *Radio Frequency Integrated Circuits Symposium (RFIC)*, 2010 IEEE, May 2010, pp. 343–346.
- [79] D. Chowdhury, P. Reynaert, and A. Niknejad, "Design Considerations for 60 GHz Transformer-Coupled CMOS Power Amplifiers," *Solid-State Circuits*, *IEEE Journal of*, vol. 44, no. 10, pp. 2733–2744, Oct 2009.
- [80] M. Beattie and L. Pileggi, "Modeling magnetic coupling for on-chip interconnect," in *Design Automation Conference*, 2001. Proceedings, 2001, pp. 335–340.
- [81] D. Zhao and P. Reynaert, "A 60-GHz Dual-Mode Class AB Power Amplifier in 40-nm CMOS," *Solid-State Circuits, IEEE Journal of*, vol. 48, no. 10, pp. 2323–2337, Oct 2013.
- [82] S. Kulkarni and P. Reynaert, "14.3 A Push-Pull mm-Wave power amplifier with < 0.8 ° AM-PM distortion in 40nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International*, Feb 2014, pp. 252–253.
- [83] A. Mazzanti and P. Andreani, "A Push-Pull Class-C CMOS VCO," Solid-State Circuits, IEEE Journal of, vol. 48, no. 3, pp. 724–732, March 2013.
- [84] K. Hadipour, A. Ghilioni, J. Zhao, and A. Mazzanti, "A Highly-Integrated Low-Power 10Gbps OOK Receiver for mm-Wave Short-Haul Wireless Link in CMOS 28nm," A Wide Tuning Range mm-Wave LC VCO, vol. 2, no. 1, pp. 70–74, March 2014.
- [85] J. W. Nilsson and S. Riedel, *Electric Circuits*, 9th ed. Prentice Hall, 2010.


- [86] C.-H. Liao and H.-R. Chuang, "A 5.7-GHz 0.18-um CMOS gain-controlled differential LNA with current reuse for WLAN receiver," *Microwave and Wireless Components Letters, IEEE*, vol. 13, no. 12, pp. 526–528, Dec 2003.
- [87] C.-Y. Cha and S.-G. Lee, "A low power, high gain LNA topology," in *Microwave and Millimeter Wave Technology*, 2000, 2nd International Conference on. ICMMT 2000, 2000, pp. 420–423.
- [88] T. Rappaport, J. Murdock, and F. Gutierrez, "State of the Art in 60-GHz Integrated Circuits and Systems for Wireless Communications," *Proceedings of the IEEE*, vol. 99, no. 8, pp. 1390–1436, Aug 2011.
- [89] K. Hadipour, "Design of building blocks of a high rate wireless transceiver for short range communications," Ph.D. dissertation, University of Pavia, 2014.
- [90] Z. Wang, "Full-wave precision rectification that is performed in current domain and very suitable for CMOS implementation," *Circuits and Systems I: Fundamental Theory and Applications, IEEE Transactions on*, vol. 39, no. 6, pp. 456–462, Jun 1992.
- [91] C. W. Byeon, C. H. Yoon, and C. S. Park, "A 67-mW 10.7-Gb/s 60-GHz OOK CMOS Transceiver for Short-Range Wireless Communications," *Microwave Theory and Techniques, IEEE Transactions on*, vol. 61, no. 9, pp. 3391–3401, Sept 2013.
- [92] S. Galal and B. Razavi, "10-Gb/s limiting amplifier and laser/modulator driver in 0.18-um CMOS technology," *Solid-State Circuits, IEEE Journal of*, vol. 38, no. 12, pp. 2138–2146, Dec 2003.
- [93] K. Kawasaki, Y. Akiyama, K. Komori, M. Uno, H. Takeuchi, T. Itagaki, Y. Hino, Y. Kawasaki, K. Ito, and A. Hajimiri, "A Millimeter-Wave Intra-Connect Solution," *Solid-State Circuits, IEEE Journal of*, vol. 45, no. 12, pp. 2655–2666, Dec 2010.
- [94] K. Okada, N. Li, K. Matsushita, K. Bunsen, R. Murakami, A. Musa, T. Sato, H. Asada, N. Takayama, S. Ito, W. Chaivipas, R. Minami, T. Yamaguchi, Y. Takeuchi, H. Yamagishi, M. Noda, and A. Matsuzawa, "A 60-GHz 16QAM/8PSK/QPSK/BPSK Direct-Conversion Transceiver for IEEE802.15.3c," *Solid-State Circuits, IEEE Journal of*, vol. 46, no. 12, pp. 2988–3004, Dec 2011.
- [95] T. Tsukizawa, N. Shirakata, T. Morita, K. Tanaka, J. Sato, Y. Morishita, M. Kanemaru, R. Kitamura, T. Shima, T. Nakatani, K. Miyanaga, T. Urushihara, H. Yoshikawa, T. Sakamoto, H. Motozuka, Y. Shirakawa, N. Yosoku, A. Yamamoto, R. Shiozaki, and N. Saito, "A fully integrated 60GHz CMOS transceiver chipset based on WiGig/IEEE802.11ad with built-in self calibration for mobile applications," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International*, Feb 2013, pp. 230–231.