A 1.76 mW, 100 Mbps Impulse Radio UWB Receiver with Multiple Sampling Correlators Eliminating Need for Phase Synchronization in 65-nm CMOS

Lechang LIU†a), Zhiwei ZHOU†, Nonmembers, Takayasu SAKURAI†, Fellow, and Makoto TAKAMIYA††, Member

SUMMARY A low-power impulse radio ultra-wideband (IR-UWB) receiver for the DC-960 MHz band is proposed in this paper. The proposed receiver employs multiple DC power-free charge-domain sampling correlators to eliminate the need for phase synchronization. In a sampling correlator, a subtraction operation causes more charge injection than an addition operation and therefore degrades the BER. To alleviate this degradation, a comparator with a variable threshold (=offset) voltage is used, which enables an addition-only operation. The developed receiver, fabricated in 1.2 V 65 nm CMOS, achieves the lowest energy consumption among state-of-the-art correlation-based UWB receivers: 17.6 pJ/bit at 100 Mbps.

key words: impulse radio ultra-wideband (IR-UWB), charge-domain sampling correlators, charge injection, variable threshold comparator, low power

1. Introduction

Impulse radio ultra-wideband (IR-UWB) has been investigated as a radio technology that can achieve lower power than carrier-based UWB (e.g., OFDM-UWB [1]). IR-UWB receivers can be classified into threshold detection-based receivers [2],[3] and correlation-based receivers [4],[5]. A correlation-based UWB receiver attains superior noise performance and robust narrowband interference suppression, at the expense of higher power consumption and circuit complexity than a threshold detection-based receiver. This work aims to minimize the power dissipation of the correlation-based receiver by employing multiple DC power-free charge-domain sampling correlators that eliminate the need for phase synchronization. In addition, a variable threshold voltage comparator removes the subtraction operation from the conventional sampling correlator, which alleviates the BER degradation caused by the increased charge injection in the subtraction circuits.

An overview of the proposed receiver architecture is given in Sect. 2. Section 3 presents the circuit implementations of the sampling correlators and the variable threshold voltage comparator. Experimental results are presented in Sect. 4, and Sect. 5 concludes the paper.

2. Impulse Radio UWB Receiver Architecture

In the conventional IR-UWB receiver, the correlation operation is implemented in the continuous-time voltage domain [5], as shown in Fig. 1. Both the incoming signal \(v_s\) and the template \(v_t\) of the correlator are represented by voltages. To avoid the power consumption of the analog voltage multiplier, the correlator in the proposed receiver is designed in the discrete-time charge domain, where the incoming signal and the template are represented by the voltage \(v_s\) and the capacitance \(c_t\), respectively. The multiplication of \(v_s\) and \(c_t\) is performed by the charge stored in the sampling correlator, so the analog voltage multiplier is not required and the power is lower than that of the conventional continuous-time voltage correlator.
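The charge-domain multiplication described above can be illustrated with a minimal Python sketch. It assumes ideal switches and capacitors (no charge injection, no parasitics); the function name, the sample voltages, and the capacitance values are hypothetical and are not taken from the fabricated circuit. Each capacitor stores the charge \(c_t v_s\), and connecting the capacitors together yields the capacitance-weighted average of the sampled voltages.

```python
def charge_domain_correlate(v_samples, c_template):
    """Ideal charge-sharing correlator (behavioral sketch, not the actual circuit).

    v_samples  : sampled input voltages in volts
    c_template : template capacitances in farads, one per sample
    Returns (total_charge, shared_voltage).
    """
    if len(v_samples) != len(c_template):
        raise ValueError("one capacitor is needed per sample")
    # Each capacitor stores Q_i = C_i * V_i: the multiplication costs no DC power.
    total_charge = sum(c * v for c, v in zip(c_template, v_samples))
    # Shorting the capacitors together sums the charges and averages the voltage.
    total_cap = sum(c_template)
    return total_charge, total_charge / total_cap


# Hypothetical values for illustration only.
q, v_out = charge_domain_correlate(
    [0.1, 0.2, -0.1, 0.0], [1e-15, 2e-15, 2e-15, 1e-15]
)
```

Because a capacitance is always positive, a template with negative weights cannot be realized by charge addition alone, which is consistent with the conventional correlator needing a subtraction path.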

![Fig. 1](conventional_continuous_time_voltage_domain.png) Conventional continuous-time voltage correlator and proposed discrete-time charge domain correlator.

In Figs. 2(a) and (b), the conventional [4] and the proposed receiver architectures with the sampling correlator are compared. The conventional receiver in Fig. 2(a) has a single sampling correlator and a phase synchronization circuit. The single sampling correlator has both addition and subtraction circuits, as shown in Fig. 3, because the charge-domain correlation is performed by both addition and subtraction of charges. \(\phi_A\) and \(\phi_E\) denote sampling clocks with different phases. The new receiver architecture, with multiple sampling correlators that eliminate the need for phase synchronization and for the subtraction circuits, is shown in Fig. 2(b) (the front-end amplifier was not implemented in this work). The received single-ended input is first converted to differential outputs by the front-end amplifier and then correlated with the discrete templates in the multiple sampling correlators. In the multiple sampling correlators, 4 sets of the sampling correlator always perform the correlation operation with 1-ns phase difference, thereby eliminating the power-consuming phase synchronization circuit. 4-input comparators with built-in digitally tunable threshold voltages perform the subtraction function instead of the sampling correlator in the conventional architecture, thereby eliminating the subtraction circuits in the sampling correlator and preventing the BER degradation.

Figure 2(c) shows the correlator output waveforms of the proposed receiver for data “1” and data “0.” The two 4-input comparators in Fig. 2(b) detect data “1” and data “0,” respectively. When the upper correlator output (V1 − V2) is higher than the lower correlator output (V1b − V2b) by the threshold (offset) voltage \(V_{\text{TH1}}\) of comparator 1, comparator 1 decides the output data as “1.” Conversely, when the lower correlator output (V1b − V2b) is higher than the upper correlator output (V1 − V2) by the threshold \(V_{\text{TH2}}\) of comparator 2, comparator 2 decides the output data as “0.” \(V_{\text{TH1}}\) should be set to correctly detect data “1,” and \(V_{\text{TH2}}\) should be set to correctly detect data “0.” The tuning circuit for \(V_{\text{TH1}}\) and \(V_{\text{TH2}}\) was not implemented in this work.
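The decision rule of the two comparators can be written as a short behavioral sketch in Python. This only models the inequalities stated in the text; the function name, the argument names, and the choice to return `None` when neither comparator fires are assumptions made for illustration.

```python
def decide_bit(v1, v2, v1b, v2b, vth1, vth2):
    """Behavioral model of the two 4-input comparators (illustrative only).

    Returns 1 or 0 when one comparator fires, and None when the difference
    between the correlator outputs stays inside both thresholds.
    """
    upper = v1 - v2      # upper correlator output
    lower = v1b - v2b    # lower correlator output
    if upper - lower > vth1:   # comparator 1 detects data "1"
        return 1
    if lower - upper > vth2:   # comparator 2 detects data "0"
        return 0
    return None                # no pulse detected


# Hypothetical voltages in volts.
bit = decide_bit(0.6, 0.4, 0.45, 0.55, vth1=0.1, vth2=0.1)
```

Raising `vth1` and `vth2` widens the band in which no decision is made, which is how a small spurious correlation peak can be rejected.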

3. Circuit Implementation

3.1 DC Power-Free Pulse Discriminator

Figures 4(a) and (b) show the circuit and the timing chart of the proposed multiple sampling correlators, respectively. The multiple sampling correlators consist of 4 single sampling correlators, each composed of four capacitors and eight switches. In the conventional sampling correlator, the subtraction operation causes more charge injection than the addition operation and degrades the BER; the proposed multiple sampling correlators therefore use only the addition operation. The received 100-Mbps Gaussian first-order derivative pulse with 4-ns width is sampled at 1 GSa/s and correlated with the discrete templates in the multiple sampling correlators. The discrete templates are determined by the capacitance values of the sampling correlators. The sampling operation is implemented by sequentially turning on the switches controlled by clocks φ1, φ3, φ5, and φ7, and the sampled values are stored in the corresponding capacitors. The summing and averaging results of the four sampling correlators are then computed and dumped to the following comparators by sequentially turning on the switches controlled by clocks φ2, φ4, φ6, and φ8. In this way, the 4 single sampling correlators always perform the correlation operation with 1-ns phase difference, thereby eliminating the power-consuming phase synchronization circuit.
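The effect of running four correlators with 1-ns phase difference can be sketched behaviorally in Python. This is not a model of the switch-level circuit in Fig. 4: it assumes one sample per nanosecond (1 GSa/s), a 4-sample template, and ideal charge sharing, and the template weights and the function name are hypothetical. Correlator `k` handles every window whose start index is congruent to `k` modulo 4, so one of the four always sees the whole pulse regardless of its arrival time.

```python
def interleaved_correlators(samples, template, n_corr=4):
    """Sliding correlation split across n_corr phase-shifted correlators.

    Returns a list of (dump_index, correlator_id, value), where dump_index is
    the index of the last sample in the window (illustrative model only).
    """
    n = len(template)
    total_cap = sum(template)
    outputs = []
    for start in range(len(samples) - n + 1):
        corr_id = start % n_corr              # which correlator owns this phase
        window = samples[start:start + n]
        # Summing and averaging by charge sharing (ideal, addition only).
        value = sum(c * v for c, v in zip(template, window)) / total_cap
        outputs.append((start + n - 1, corr_id, value))
    return outputs


# Hypothetical pulse arriving at an arbitrary offset of 3 samples.
samples = [0, 0, 0, 1, 2, 2, 1, 0, 0, 0]
results = interleaved_correlators(samples, [1, 2, 2, 1])
best = max(results, key=lambda r: r[2])   # correlator aligned with the pulse
```

In this example the largest output comes from the correlator whose phase matches the pulse arrival, without any circuit searching for that phase beforehand.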

In Fig. 5, the conventional [4] and the proposed implementation of the correlation calculation are compared in or-

The design of the sampling rate of the proposed multiple sampling correlators is discussed next. The sampling rate determines the trade-off between the BER and the power consumption. A lower BER can be achieved with a higher sampling rate. However, the number of capacitors is proportional to the square of the sampling rate, and therefore the power consumption of the correlator is proportional to the cube of the sampling rate. For example, 1 GSa/s requires 4 correlators and $4 \times 4$ capacitors, while 2 GSa/s requires 8 correlators and $8 \times 8$ capacitors. Figure 6(a) shows the simulated BER dependence on $E_\text{b}/N_0$ for various sampling rates under the perfect phase synchronization condition ($\Delta T/T = 0\%$). A higher sampling rate achieves a lower BER at the cost of higher power. The BER is also determined by the timing mismatch $\Delta T$ between the incoming signal and the templates of the sampling correlators. Figure 6(b) shows the simulated BER dependence on $\Delta T/T$ for various $E_\text{b}/N_0$ at 1 GSa/s. When $E_\text{b}/N_0$ is higher than 2.5 dB, the BER is less than $3 \times 10^{-3}$. In this way, 1 GSa/s can tolerate any $\Delta T$ and eliminates the need for phase synchronization. In this work, therefore, 1 GSa/s is adopted to minimize the power consumption of the multiple sampling correlators with a reasonable BER.
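The scaling argument above can be checked with a small Python sketch. It assumes that the number of correlators equals the number of samples taken over the 4-ns pulse, which is consistent with the two examples quoted in the text but is not stated as a general rule; the function name is hypothetical, and the power figure is only relative to the 1 GSa/s design.

```python
def correlator_cost(sample_rate_gsps, pulse_width_ns=4.0):
    """Return (correlators, capacitors, relative_power) for a sampling rate.

    Assumption: one correlator per sample of the pulse, one capacitor per
    sample in each correlator, and switching power proportional to
    (capacitor count) x (sampling rate), normalized to 1 GSa/s.
    """
    n_corr = round(sample_rate_gsps * pulse_width_ns)
    n_caps = n_corr * n_corr                    # grows with the square of the rate
    relative_power = sample_rate_gsps ** 3      # grows with the cube of the rate
    return n_corr, n_caps, relative_power


one_gsps = correlator_cost(1.0)   # 4 correlators, 16 capacitors
two_gsps = correlator_cost(2.0)   # 8 correlators, 64 capacitors
```

Under these assumptions, doubling the sampling rate costs roughly eight times the correlator power, which is the motivation for choosing the lowest rate that still gives an acceptable BER.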

3.2 Variable Threshold Voltage Comparator

![Fig. 4: Proposed multiple sampling correlators. (a) Circuit schematic. (b) Timing chart.](image1)

![Fig. 5: Implementation of correlation calculation. (a) Conventional implementation [4]. (b) Proposed implementation.](image2)

The developed transceiver is intended for ad hoc and sensor networks, where groups of wireless terminals are located in a limited area and communicate in an infrastructure-free fashion without any central coordinating unit or base station. The received signal strength is affected by path loss and multipath effects, which depend largely on the distance between the transmitter and receiver antennas. A variable gain amplifier (VGA) is required in a real application. In this work, however, the VGA is not implemented, and therefore the variable threshold voltage comparator is used.

This section describes the circuit implementation of the variable threshold (=offset) voltage comparator, which eliminates the subtraction operation from the conventional sampling correlator. Figure 7 shows the four-input comparator with built-in digitally tunable thresholds based on the variable resistance method. The comparator is based on [6], [7], with digital tunability added to [7]. The four inputs are connected to \( V_1 \), \( V_2 \), \( V_{\text{ref}1} \), and \( V_{\text{ref}2} \) in Fig. 2(b). In Fig. 7, the lower set of nMOSFETs operates in the triode region and is connected to the input and the reference voltages (\( V_{\text{ref}1} \) and \( V_{\text{ref}2} \)). The threshold voltage (\( V_{\text{TH}} \)) of the comparator is determined by Eq. (1) [7]:

\[
V_{\text{TH}} = (V_{\text{ref}2} - V_{\text{ref}1}) \frac{W_{\text{total}}}{L_L} \frac{1}{W_1/L_1} \tag{1}
\]

The \( W_{\text{total}} \) in Eq. (1) can be calculated as:

\[
W_{\text{total}} = (a_0 + 2 \times a_1 + 4 \times a_2 + 8 \times a_3)W
\]

where \( a_i = 0 \) (\( i = 0, \ldots, 3 \)) when the corresponding switch is turned to ground and \( a_i = 1 \) when the switch is turned to \( V_{\text{ref}1}/V_{\text{ref}2} \).

Therefore, \( W_{\text{total}} \) is changed by switching the gate voltages of the lower set of nMOSFETs, and the threshold voltage of the comparator is varied digitally with good linearity. The comparator has 4 bits for the variable threshold voltage.
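The relation between the 4-bit control code and the comparator threshold can be sketched in Python directly from Eq. (1) and the expression for \( W_{\text{total}} \). The device dimensions and reference voltages in the example are hypothetical placeholders, not the values used on the chip, and the function names are introduced only for this illustration.

```python
def w_total(code, unit_width):
    """Binary-weighted effective width: (a0 + 2*a1 + 4*a2 + 8*a3) * W."""
    if not 0 <= code <= 15:
        raise ValueError("code must be a 4-bit value")
    bits = [(code >> i) & 1 for i in range(4)]   # a_i = 1 when tied to Vref
    return sum(a * 2 ** i for i, a in enumerate(bits)) * unit_width


def comparator_threshold(code, vref1, vref2, unit_width, l_l, w1, l1):
    """Threshold voltage from Eq. (1); all sizes share the same length unit."""
    return (vref2 - vref1) * (w_total(code, unit_width) / l_l) / (w1 / l1)


# Hypothetical sizing: these numbers are placeholders, not measured data.
step = comparator_threshold(1, vref1=0.5, vref2=0.6,
                            unit_width=0.2, l_l=1.0, w1=2.0, l1=1.0)
```

In this ideal model the threshold is exactly proportional to the code; the measured characteristic deviates from this ideal, as discussed in Sect. 4.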

4. Experimental Results

The proposed UWB receiver shown in Fig. 2(b), except for the LNA, was designed and fabricated in a 1.2 V 65 nm CMOS process. The chip micrograph is shown in Fig. 8. The die size is 1050 \( \times \) 670 \( \mu \)m\(^2\) and the core area is 110 \( \times \) 30 \( \mu \)m\(^2\).

Figure 9 shows the measured threshold (=offset) voltage dependence on the 4-bit digital control of the proposed comparator. Three dies were measured. The die-to-die variation can be compensated by digital calibration using the digital control bits. The deviation and the notch in the measured comparator threshold voltage are due to the size of the lower set of transistors operating in the triode region. In this work, small transistors are used to minimize the power consumption of the proposed comparator. The linearity of the comparator threshold can be improved by increasing the size of the transistors. The impact of the nonlinearity on the system is compensated by off-chip calibration. The calibration must monitor the input noise signal and adaptively set a threshold such that only a small percentage of false detections occurs.

Figure 10 shows the measured output of the receiver with different comparator thresholds (\( V_{\text{TH1}} \) and \( V_{\text{TH2}} \)). The input BPSK signal was generated with an arbitrary waveform generator. As shown in Fig. 10(a), when \( V_{\text{TH1}} \) and \( V_{\text{TH2}} \) are too small, the output of the receiver is not correct, because the auxiliary pulse of the correlation result is also detected and corrupts the receiver output. In contrast, as shown in Fig. 10(b), when \( V_{\text{TH1}} \) and \( V_{\text{TH2}} \) are increased, the receiver returns to correct operation. The adaptive tuning circuit for \( V_{\text{TH1}} \) and \( V_{\text{TH2}} \) was not implemented in this work.

Fig. 9 Measured threshold (=offset) voltage dependence on 4-bit digital control of proposed variable threshold comparator.

Fig. 10 Measured output of the receiver with different comparator thresholds (\( V_{\text{TH1}} \) and \( V_{\text{TH2}} \)). (a) Incorrect operation with inappropriate \( V_{\text{TH1}} \) and \( V_{\text{TH2}} \). (b) Correct operation with appropriate \( V_{\text{TH1}} \) and \( V_{\text{TH2}} \).

The chip performance is summarized in Table 1. The power consumption without the amplifier is 0.96 mW at 100 Mbps. In a correlation-based UWB receiver, noise figure (NF) can be traded off for power consumption in the front-end gain circuits without degrading performance [13]. The simulated power consumption of the front-end amplifier is 0.8 mW, so the estimated total power consumption of the receiver is 1.76 mW, which corresponds to 17.6 pJ/bit. Figure 11 shows an energy and power comparison with state-of-the-art correlation-based UWB receivers. The proposed receiver, with multiple sampling correlators that eliminate phase synchronization, achieves the lowest energy consumption among the correlation-based UWB receivers, 17.6 pJ/bit at 100 Mbps.
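As a quick check of the figures quoted above, the energy per bit is the power divided by the data rate. The symbols \( E_{\text{bit}} \), \( P \), and \( R_{\text{b}} \) are introduced only for this check:

\[
E_{\text{bit}} = \frac{P}{R_{\text{b}}}, \qquad
\frac{(0.96 + 0.8)\ \text{mW}}{100\ \text{Mbps}} = \frac{1.76 \times 10^{-3}\ \text{W}}{10^{8}\ \text{bit/s}} = 17.6\ \text{pJ/bit}, \qquad
\frac{0.96\ \text{mW}}{100\ \text{Mbps}} = 9.6\ \text{pJ/bit}.
\]

The second value is the energy per bit without the front-end amplifier, as listed in Table 1.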

Table 1 Performance summary of fabricated receiver.

<table>
<thead>
<tr>
<th>Technology</th>
<th>65nm CMOS</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply Voltage</td>
<td>1.2V</td>
</tr>
<tr>
<td>Input Signal</td>
<td>0–960MHz</td>
</tr>
<tr>
<td>Pulse Width</td>
<td>4ns</td>
</tr>
<tr>
<td>Sampling Rate</td>
<td>1GHz</td>
</tr>
<tr>
<td>Comparator $V_{TH}$</td>
<td>23mV/bit</td>
</tr>
<tr>
<td>Data Rate</td>
<td>100Mbps</td>
</tr>
<tr>
<td>Power</td>
<td>0.96mW (w/o Amp)</td>
</tr>
<tr>
<td>Energy per bit</td>
<td>9.6pJ/bit (w/o Amp)</td>
</tr>
<tr>
<td>Core Area</td>
<td>3309 $\mu$m$^2$</td>
</tr>
</tbody>
</table>

Fig. 11 Energy and power comparison with state-of-the-art correlation-based UWB receivers.

5. Conclusions

A 1.2 V 100 Mbps IR-UWB receiver in 65 nm CMOS for the DC-960 MHz band has been developed. The proposed receiver features multiple DC power-free charge-domain sampling correlators that eliminate the need for phase synchronization, thereby achieving the lowest energy, 17.6 pJ/bit, among state-of-the-art correlation-based UWB receivers.

Acknowledgments

This work is partially supported by CREST/JST. The fabrication of VLSI chips is supported by VLSI Design and Education Center (VDEC), the University of Tokyo in collaboration with STARC, e-Shuttle, and Fujitsu.

References


Lechang Liu received the B.S. degree from Shandong University, China in 2000, the M.S. degree from Harbin Institute of Technology, China in 2002, and the Ph.D. degree in electronic engineering from Shanghai Jiao Tong University, China in 2006. From 2007 to 2009, he was with VLSI Design and Education Center, The University of Tokyo, Japan, where he conducted research on low-power and high-performance impulse radio ultra-wideband (UWB) transceivers. Since 2009, he has been a project researcher with the Institute of Industrial Science, The University of Tokyo, Japan. His current research interests include signal processing and mixed-signal circuit design for low-power wireless communications.

Zhiwei Zhou received the M.S. degree in electronic engineering from the University of Tokyo, Japan in 2008. His research interests include the circuit design of low-power RF receiver circuits. He is now with Sony Corporation.

Takayasu Sakurai received the Ph.D. degree in EE from the University of Tokyo in 1981. In 1981 he joined Toshiba Corporation, where he designed CMOS DRAM, SRAM, RISC processors, DSPs, and SoC solutions. He has worked extensively on interconnect delay and capacitance modeling known as the Sakurai model and on the alpha power-law MOS model. From 1988 through 1990, he was a visiting researcher at the University of California, Berkeley, where he conducted research in the field of VLSI CAD. Since 1996, he has been a professor at the University of Tokyo, working on low-power high-speed VLSI, memory design, interconnects, ubiquitous electronics, organic ICs and large-area electronics. He has published more than 400 technical publications including 100 invited presentations and several books and filed more than 200 patents. He served as a conference chair for the Symp. on VLSI Circuits and ICICDT, a vice chair for ASPDAC, a TPC chair for the first A-SSCC and VLSI Symp., and a program committee member for ISSCC, CICC, A-SSCC, DAC, ESSCIRC, ICCAD, ISLPED, and other international conferences. He will be an executive committee chair for VLSI Symposia and a steering committee chair for A-SSCC from 2010. He is a recipient of the 2010 IEEE Donald O. Pederson Award in Solid-State Circuits, the 2009 achievement award of IEICE, the 2005 IEEE ICICDT award, the 2004 IEEE Takuo Sugano award, the 2005 P&I patent of the year award, and four product awards. He has given keynote speeches at more than 50 conferences including ISSCC, ESSCIRC and ISLPED. He was an elected AdCom member for the IEEE Solid-State Circuits Society and an IEEE CAS and SSCS distinguished lecturer. He is a STARC Fellow and an IEEE Fellow.

Makoto Takamiya received the B.S., M.S., and Ph.D. degrees in electronic engineering from the University of Tokyo, Japan, in 1995, 1997, and 2000, respectively. In 2000, he joined NEC Corporation, Japan, where he was engaged in the circuit design of high-speed digital LSIs. In 2005, he joined the University of Tokyo, Japan, where he is an associate professor of VLSI Design and Education Center. His research interests include the circuit design of low-power RF circuits, ultra-low-voltage digital circuits, and large-area electronics with organic transistors. He is a member of the technical program committee for the IEEE Symposium on VLSI Circuits and the IEEE Custom Integrated Circuits Conference (CICC).