

Received 2 September 2024, accepted 20 September 2024, date of publication 30 September 2024, date of current version 10 October 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3471168

## **RESEARCH ARTICLE**

# A 4 Gb/s Multi-Dot PIN-Photodiode-Based CMOS Optical Receiver Using a Single to Differential TIA-Equalizer

## BASET MESGARI<sup>1</sup>, SEYED SAMAN KOHNEH POUSHI<sup>1,2</sup>, (Member, IEEE), AND HORST ZIMMERMANN<sup>1</sup>

<sup>1</sup>Institute of Electrodynamics, Microwave and Circuit Engineering, TU Wien, 1040 Vienna, Austria
<sup>2</sup>Electronic Sensors Unit, Silicon Austria Labs, 8010 Graz, Austria

Corresponding author: Baset Mesgari (baset.mesgari@tuwien.ac.at)

This work was supported by the Vienna University of Technology (TU Wien) Bibliothek through the Open Access Funding Program.

**ABSTRACT** This paper presents the development of an optical receiver capable of handling data at a rate of 4 Gb/s. The receiver makes use of a multi-dot PIN CMOS photodiode with a bandwidth of 930 MHz (capacitance of 48.8 fF) and a responsivity of 0.294 A/W at a wavelength of 675 nm. It also features a single-to-differential (SDT) noise-suppressed transimpedance amplifier (TIA) equalizer. By synthesizing a low-frequency zero within the SDT TIA feedback path, the 3-dB roll-off frequency of the photodiode is extended by a factor of 2.63, resulting in an overall front-end bandwidth of 2.45 GHz with a transimpedance gain of 84 dB$\Omega$. The SDT TIA eliminates the need for a dummy TIA, which improves the noise performance of the receiver: the integrated input-referred current noise of the entire front-end is less than 717 nA rms. Additionally, a detailed theoretical analysis of equalization methods, as well as of the impact of inter-symbol interference (ISI) and noise on bit error rate (BER) degradation, is presented. The receiver has been successfully tested for data transmission at rates of 4 Gb/s, 3 Gb/s, and 2.5 Gb/s, achieving bit-error ratios of less than 10<sup>-9</sup> at minimum average optical powers of -16.2 dBm, -17.2 dBm, and -18 dBm, respectively. Furthermore, the receiver consumes 28 mA from a 3.3 V power supply and occupies a core area of 1.4 mm $\times$ 0.7 mm.

**INDEX TERMS** Analog equalizer, CMOS optical receiver, intersymbol interference, noise-suppressed transimpedance amplifier, optoelectronic integrated circuit, PIN photodiode, single-to-differential TIA.

## **I. INTRODUCTION**

As the demand for high-speed and cost-effective data transmission continues to grow, electrical links are increasingly being replaced by optical interconnects due to the frequency-dependent limitations of channel loss [1], [2], [3], [4], [5]. The shift is especially noticeable in short-distance applications such as chip-to-chip interconnects. Other growing optoelectronic applications are light detection and ranging (LiDAR), optical sensors, visible light communication (VLC), and wireless optics [6], [7], [8], [9], [10], [11], [12], [13].

In recent years, there has been significant growth in optical communication systems operating at gigabit-per-second (Gb/s) speeds. For these applications, fully integrated optical receivers operating in the 600-850 nm wavelength range, implemented using standard complementary metal-oxide-semiconductor (CMOS) or BiCMOS technology, offer substantial advantages in terms of cost-effective fabrication and manufacturability [14], [15], [16], [17], [18], [19], [20], [21]. Fully integrated optical detectors provide a significant advantage by mitigating issues such as common-mode ringing, electrostatic discharge (ESD) challenges, bond pad capacitances, and bond-wire-induced complications. These challenges are frequently encountered when dealing with electrical input signals, particularly in the context of multi-die optical detectors and front-end circuits.

The associate editor coordinating the review of this manuscript and approving it for publication was Tianhua Xu.

However, one of the fundamental challenges in developing optoelectronic integrated circuits (OEICs) within silicon technology lies in the interaction of light with the silicon substrate. The inherent properties of standard CMOS technologies can lead to a diminished photodiode (PD) response, primarily due to carrier diffusion. This results in slow diffusion currents restricting the PDs' operational speed to the tens of megahertz (MHz) range [20], [21].

To address these limitations, considerable efforts have been made to enhance the PD's speed. Notably, several high-speed monolithically integrated optical receivers have been realized in CMOS technology, employing approaches such as specially designed PD structures like spatially-modulated PDs (SMPDs) [12], [13], [14], multi-dot structures [22], and avalanche PDs (APDs) [17], [23], or incorporating electronic equalizers [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [23].

The equalizer's design and functionality are crucial for addressing the bandwidth limitations of the integrated PD. To effectively mitigate these limitations, the equalizer must possess characteristics that precisely counterbalance the PD's intrinsic response. This is typically achieved through the implementation of weighted parallel branches of first-order filters. For optimal performance, the equalizer's magnitude response should be the inverse of the PD's response, ensuring a consistent magnitude across all frequencies. This approach ensures that all frequency components of the received light pulse are uniformly amplified. Additionally, to maintain phase integrity across the frequency spectrum, the equalizer must counteract the PD's phase response, resulting in a flat overall phase delay response in the frequency domain [20], [21].

A further design decision concerns the placement of the equalizer: within the transimpedance amplifier (TIA), within the limiting amplifier (LA), or between the two. Each option affects system performance differently. Integrating the equalizer within the TIA addresses bandwidth limitations early, while placing it in the LA allows for refined signal correction. Positioning it between the TIA and LA offers a compromise, balancing early compensation with subsequent amplification. The choice hinges on specific design goals, including noise, linearity, and system complexity.

In [14], the speed of the PD was enhanced using an SMPD technique, resulting in a 6.9 GHz 3-dB bandwidth. This design achieved a data rate of 10 Gb/s with an optical sensitivity of -6 dBm at a bit-error rate (BER) of $10^{-11}$ in 180 nm CMOS, without the need for an electrical equalizer. Conversely, in [13], a CMOS N-well/P-sub PD was followed by a two-stage continuous-time linear equalizer, which utilized multiple active shunt-shunt feedback networks combined with a TIA, implemented in 65 nm CMOS. In [12], a 0.18 $\mu$m CMOS optical receiver is described. It features a monolithically integrated spatially modulated light (SML) PD and an analog equalizer placed after the TIA. The receiver achieves up to 5 Gb/s with an optical sensitivity of -3 dBm at a BER of $10^{-11}$ and a PD responsivity of 0.052 A/W.

In [20] and [21], a standard CMOS process is used to create an N-well/P-substrate (PN) photodiode. By using an electrical equalizer after the TIA, these receivers achieve data rates of 3 Gb/s and 4.5 Gb/s, with a BER of $10^{-11}$ (at an optical sensitivity of -19 dBm) and $10^{-12}$ (at an optical sensitivity of -3.4 dBm), respectively. Some designs [18], [19] have implemented an OEIC in a BiCMOS process, exploiting the higher speed and lower noise of bipolar transistors to reach data rates of 5 and 11 Gb/s. Additionally, they have demonstrated a higher 3-dB bandwidth of the PIN PD.

Another approach is to create a peaking effect in the frequency response of the TIA without adding any first-order filter, resulting in an equalized bandwidth for the TIA and PD combination, though at the cost of increased noise [23], [24], [25]. In [23], a CMOS APD with a 3-dB bandwidth of 4.7 GHz, combined with an under-damped shunt feedback (SF) TIA, was employed to enhance the APD bandwidth by a factor of 1.27.

In this paper, we present a 4 Gb/s optical receiver that utilizes a multi-dot PIN photodiode (MD PIN) with a 930 MHz bandwidth and a capacitance of approximately 50 fF, fabricated in 350 nm CMOS technology [26]. We perform a thorough circuit analysis to explore how a TIA-equalizer can improve the overall bandwidth of the PD receiver, the associated trade-offs, and the potential bandwidth enhancement achievable. Furthermore, we examine the optimal placement strategy for the equalizer, as well as the relationship between bandwidth enhancement, bit error rate (BER), and the sensitivity of the entire receiver.

The design achieves a 2.63x bandwidth enhancement through the implementation of a single-to-differential, noise-suppressed TIA. In this design, we introduce a feedback capacitor and a transconductance-control current source within the feedback loop of the single-to-differential TIA (SDT TIA) structure proposed in [27]. This modification creates a zero near the 3-dB bandwidth of the MD PIN photodiode. The zero compensates for the inherent speed limitation of the photodiode by equalizing the frequency response, thereby extending the bandwidth and improving the overall performance.

In this paper, we present a detailed approach to designing the TIA in terms of its bandwidth $(\omega_{3dB_{TIA}})$ and quality factor $(Q_{TIA})$. The analysis is formalized in terms of the bandwidth enhancement ratio x, defined as the receiver's total bandwidth divided by the PD's bandwidth $(x = \omega_{3dB,RX}/\omega_{3dB,PD})$. We also discuss the limitations and trade-offs involved in this process, particularly between noise performance and $Q_{TIA}$.

The paper is organized as follows. The analysis of the single-to-differential (SDT) receiver is presented in Section II. Section III is devoted to the measurement results of the entire receiver. Finally, conclusions are provided in Section IV.

## II. CMOS INTEGRATED SINGLE-TO-DIFFERENTIAL OPTICAL RECEIVER

### A. CMOS MD-PIN-PD

In optical wireless communication (OWC) receivers, a photodiode with a large photodetection area is advantageous to relieve alignment issues and improve the received signal-to-noise ratio (SNR) by capturing more optical power from the incident light. This is particularly crucial since the received light spot in OWC is often larger than the photo-sensitive area of the receiver. Various optical receivers employing large PDs have been demonstrated in the literature [28], [29], [30], [31]. However, enlarging the photodetection area of PDs typically leads to increased parasitic capacitance. To address the inherent trade-off between light-sensitive area and capacitance in conventional planar photodiodes, a novel design approach based on a multi-dot structure has been introduced [22]. This approach aims to enlarge the sensitive area while maintaining a small capacitance suitable for use in optical communication systems. Using this methodology, we developed a standard-CMOS-based $5 \times 5$ multi-dot PIN photodiode (MD-PIN-PD) featuring an active area of 70 $\mu$m × 70 $\mu$m [26].



FIGURE 1. 3D schematic drawing (not to scale) of the MD-PIN-PD.

Fig. 1 depicts a 3D schematic view of the  $5 \times 5$  array MD-PIN-PD structure. In this configuration, the MD-PIN structure consists of a  $5 \times 5$  array of semi-hemispherical highly-doped n+ regions, each with a radius of 2  $\mu$ m, functioning as cathode dots. These dots are interconnected by traces in metal layer 4, each with a minimum width of 0.6  $\mu$ m, as illustrated in Fig. 1. The cathode dots are embedded within a lightly p-doped epitaxial layer (p- epi) with a doping concentration of approximately  $2 \times 10^{13}$  cm<sup>-3</sup> and a thickness of approximately 12  $\mu$ m. Surrounding this cathode array is a surface anode ring, delineating the overall diode size, and the p+ substrate serves as a common backside anode. It is noteworthy that the MD-PIN photodiode is fabricated using the 0.35  $\mu$ m CMOS modular optical sensor technology platform (XO035) from X-FAB semiconductor foundries, requiring no process modifications.

Unlike planar structures, the MD-PIN-PD exhibits a radial electric field distribution, facilitating both vertical and peripheral charge capture. Under reverse biasing, each cathode dot generates a spherical high electric field, while a less intense electric field extends radially across the diode, guiding charge carriers towards the cathode dots. As a result, the area beneath and between all cathodes functions as a detection zone. This feature significantly contributes to achieving a large light-sensitive area and efficient peripheral charge collection. Furthermore, the MD-PIN-PD exhibits low capacitance due to the limited size of the p/n junctions. Specifically, the total capacitance comprises the cumulative capacitance of all cathode dots within the array, along with any parasitic capacitance stemming from metal tracks. Experimental results indicate that a 5 $\times$ 5 multi-dot PIN photodiode with a pitch of 15 $\mu$m, corresponding to an active area of 70 $\mu$m $\times$ 70 $\mu$m, achieves a capacitance of 48.8 fF and a responsivity of 0.294 A/W at a wavelength of 675 nm under an operating voltage of 10 V [26].

Fig. 2 presents the frequency and transient responses of the MD-PIN-PD. These measurements were performed using a modulated 675 nm laser source, which was coupled to the PD via a 50 $\mu$m / 125 $\mu$m multimode fiber. For AC measurements, the resulting photocurrents were directly measured on-chip using a 50 $\Omega$ ground-signal probe connected via a bias-tee to the 50 $\Omega$ terminations of the vector network analyzer. The step response was measured using a ground-signal probe connected via the bias-tee to a Tektronix TDSC6124C 12 GHz analog oscilloscope. A 5530B bias-tee from Picosecond (20 kHz to 12.5 GHz) was employed to apply a DC reverse voltage to the device. To compensate for line losses during AC measurements, a thorough calibration was performed beforehand using a New Focus 1580B (DC to 12 GHz) active photodetector. The measurement results show a 3-dB bandwidth of 930 MHz and a rise time of 170 ps at an operating voltage of 10 V.



**FIGURE 2.** (a) Frequency response [26] and (b) transient response of the fabricated  $5 \times 5$  array MD-PIN-PD.

It should be mentioned that the advantage of this PD lies in expanding the light-sensitive area through the enlargement of the cathode dot array while maintaining low capacitance. It is important to note that the responsivity and frequency response of the MD-PIN-PD remain unaffected by the array size but are influenced by the distance between the cathode dots, known as the array pitch. This is because the distribution of the electric field remains consistent regardless of the number of cathodes in the array, yet it changes with the spacing between the cathodes.

The ability to scale this photodiode up or down represents an exciting avenue that highlights the versatility of the multi-dot design for various applications, tailored to specific needs. In [26], we provided more details on the design approach and performance trade-offs associated with such photodiodes, providing valuable insights for future implementations.

### **B. PD BANDWIDTH ENHANCEMENT APPROACH**

As previously mentioned, the speed of the optical receiver is constrained by the bandwidth of the photodiode. The speed limitation of a CMOS PD primarily arises from two factors [23]: the large transit time ( $\tau_{tr}$ ) and the presence of parasitic capacitance ( $C_{pd}$ ) along with its series resistance ( $R_S$ ), as highlighted in the equivalent circuit model of the PD shown in Fig. 3.



FIGURE 3. Simplified model of the PD connected to a TIA.

The speed limitation of a photodiode (PD), particularly the transit time ( $\tau_{tr}$ ), poses a notable challenge. However, this constraint can be effectively addressed by integrating a circuit frequency equalizer in either the TIA, the LA, or a combination of both stages. One potential solution is to use an inductor in series, such as a bond wire ( $L_{bw}$ ), as shown in Fig. 3. However, we will show that although a bond wire can induce a frequency equalization, the required large inductance value makes it an inadequate solution for data rates below 5 Gb/s.

Fig. 3 presents a simplified model illustrating the conversion of incident light power ($P_{opt}$) into output voltage ($V_{out}$) through the PD connected to a TIA. The optical power ($P_{opt}$) is converted to electrical current ($I_{pd}$) by the PD's responsivity ($R(\lambda)$). This current is first subject to the transit time ($\tau_{tr}$), modeled by a first-order Laplace-domain transfer function in Fig. 3. The resulting current then flows out of the PD as $i_{in}$, encountering a current divider between the PD's parasitic elements and the input impedance of the TIA ($Z_{in}$). By evaluating the current divider at the input port and assuming that $Z_{in}$ is the parallel combination of the input capacitance ($C_{in}$) and resistance ($R_{in}$) of the TIA, $i_{in}$ is expressed by (1):

$$i_{in} = \frac{R(\lambda)P_{opt} \left(1 + \tau_{in}S\right) \left(1 + \tau_{p}S\right)}{\left[1 + \left(R_{in}C_{pd} + \tau_{p} + \tau_{in}\right)S + \left(L_{bw}C_{pd} + \tau_{p}\tau_{in}\right)S^{2} + L_{bw}C_{pd}\tau_{in}S^{3}\right] \left(1 + \tau_{tr}S\right)} \tag{1}$$

In (1), $\tau_{in} = C_{in} \times R_{in}$ and $\tau_p = C_{pd} \times R_S$, and S is the complex frequency $2\pi f i$, where $i = \sqrt{-1}$ and f is the operating frequency in Hz.

From the previous section, we know that this PD achieves $\tau_{tr} = 170$ ps, $C_{pd} = 48.8$ fF, and R = 0.294 A/W. The series resistance $(R_S)$ of the MD-PIN-PD results from the parallel combination of many single-dot PDs, which lowers it to 15 $\Omega$. As discussed in [32], the input impedance of a TIA should be low enough to capture all the current generated by the PD. As typical values, we consider $R_{in} = 70$ $\Omega$ and $C_{in} = 100$ fF, including the input pad capacitance. Using (1) and applying an incident power of $100 \,\mu$W, $V_{in}$ and $i_{in}$ are plotted in Fig. 4 for values of $L_{bw}$ varying from 0 to 5 nH. Substituting the parameters shows that $\tau_{tr}$ is significantly greater than $\tau_{in}$, which in turn is significantly greater than $\tau_p$ $(\tau_{tr} \gg \tau_{in} \gg \tau_p)$. Accordingly, the dominant time constant in (1) is $\tau_{tr}$, so the system frequency behavior is essentially that of a single-pole system, and the resonance frequency $(f_{\rm rs})$ of both transfer functions $(V_{\rm in}, i_{\rm in})$ can be found using (2):

$$f_{rs} \simeq \frac{1}{2\pi \sqrt{L_{bw} C_{pd}}} \tag{2}$$
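The ordering of the three time constants can be checked directly from the component values quoted above. The following is a minimal sketch; the helper name `corner_frequency` is introduced here only for illustration, and `R_IN` and `C_IN` are the typical TIA input values assumed in the text rather than measured quantities.

```python
import math

# Values quoted in the text; R_IN and C_IN are the assumed "typical" TIA input values.
TAU_TR = 170e-12   # PD transit-time constant, s
C_PD = 48.8e-15    # PD capacitance, F
R_S = 15.0         # PD series resistance, ohm
R_IN = 70.0        # TIA input resistance, ohm
C_IN = 100e-15     # TIA input capacitance including pad, F

tau_in = C_IN * R_IN   # input time constant, tau_in = C_in * R_in
tau_p = C_PD * R_S     # PD parasitic time constant, tau_p = C_pd * R_S


def corner_frequency(tau):
    """Corner frequency in Hz of a single real pole with time constant tau."""
    return 1.0 / (2.0 * math.pi * tau)


for name, tau in (("tau_tr", TAU_TR), ("tau_in", tau_in), ("tau_p", tau_p)):
    print(f"{name}: {tau * 1e12:7.3f} ps, corner at {corner_frequency(tau) / 1e9:7.2f} GHz")
```

With these values, $\tau_{in}$ evaluates to 7 ps and $\tau_p$ to about 0.73 ps, both well below $\tau_{tr} = 170$ ps, whose corner frequency of roughly 936 MHz is close to the measured 930 MHz PD bandwidth.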

For our case, achieving peaking at 2 GHz would require a bond-wire inductance of 129.7 nH. This value is impractically large. Even if it could be realized, it would not be suitable for an ultra-wideband system requiring low group delay (GD) variation. To address this problem, a T-coil is used in [17] instead of relying on just a bond wire, and a 53 $\Omega$ resistor is added to enhance GD performance. This enhancement comes at the expense of degrading the sensitivity of the optical receiver. In conclusion, a bond wire alone does not provide a robust solution.
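The inductance figure follows from inverting (2). A short sketch of that arithmetic is given below; the function names are illustrative and not taken from the paper.

```python
import math

C_PD = 48.8e-15  # PD capacitance from the text, F


def resonance_frequency(l_bw, c_pd):
    """Resonance frequency from (2): f_rs = 1 / (2*pi*sqrt(L_bw * C_pd))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_bw * c_pd))


def required_inductance(f_rs, c_pd):
    """(2) solved for L_bw: L_bw = 1 / ((2*pi*f_rs)^2 * C_pd)."""
    return 1.0 / ((2.0 * math.pi * f_rs) ** 2 * c_pd)


l_needed = required_inductance(2e9, C_PD)   # inductance for peaking at 2 GHz
f_1nh = resonance_frequency(1e-9, C_PD)     # resonance of a 1 nH bond wire

print(f"L_bw for 2 GHz peaking: {l_needed * 1e9:.1f} nH")
print(f"f_rs for a 1 nH bond wire: {f_1nh / 1e9:.1f} GHz")
```

The result is about 130 nH, in line with the value quoted above, whereas a bond wire of around 1 nH resonates with $C_{pd}$ above 20 GHz, far outside the band of interest.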

There are two further approaches that can compensate for a large $\tau_{tr}$ of the PD. The first is to use a well-designed TIA/LA with a quality factor (Q) greater than $\frac{\sqrt{2}}{2}$ instead of an RLC network for frequency peaking. In a second-order transfer function, gain peaking can be introduced by creating complex conjugate poles. This gain peaking compensates for the drop in responsivity of the photodiode (PD). By carefully adjusting the amount of gain peaking and the frequency at which it occurs, the frequency response of the PD and receiver can be made flat up to the desired operating frequency. However, it is important to note that this approach can lead to a high Q, which may result in increased noise and larger variations in group delay [33].

Another approach involves utilizing pole-zero cancellation. By adding a zero to the all-pole system, we can reduce the impact of the significant transit time $(\tau_{tr})$ of the photodiode (PD). This method is more effective because it allows the transfer function's Q to be chosen independently of the required gain peaking. Consequently, this technique helps to balance frequency equalization with the level of circuit noise. We will delve into the specifics of both methods and examine their advantages and disadvantages.

**FIGURE 4.** (a) TIA input voltage, and (b) ratio of TIA input current to PD current as a function of $L_{bw}$ (0 to 5 nH).

We first examine the approach of using a well-designed TIA with a quality factor (Q) greater than $\frac{\sqrt{2}}{2}$ instead of an RLC network for frequency peaking. The frequency response of the PD cascaded with the TIA can be written in general form as (3):

$$\frac{V_{out}(S)}{P_{opt}R(\lambda)} = \frac{1}{\left(1 + \frac{S}{\omega_{pd}}\right)} \underbrace{\frac{Z_0}{\left(1 + \frac{S}{Q\omega_0} + \frac{S^2}{\omega_0^2}\right)}}_{H(S)} \tag{3}$$

Here, H(S) represents the transfer function of the TIA, defined as $\frac{V_{\text{out}}}{i_{\text{in}}}$, $\omega_{\text{pd}} = \frac{1}{\tau_{\text{tr}}}$, and $\omega_0$ is the natural frequency of the TIA's transfer function. To find the 3-dB bandwidth ($\omega_{3\text{dB}}$) of the TIA and PD together, we set the magnitude of (3) to $\frac{Z_0}{\sqrt{2}}$. Through algebraic simplification, we obtain a straightforward expression for Q, as shown in (4):

$$Q = \frac{y \times \sqrt{1 + x^2}}{\sqrt{2 - (1 + x^2)(1 - y^2)^2}} \tag{4}$$

The bandwidth enhancement ratio, represented by $x = \frac{\omega_{3dB}}{\omega_{pd}}$, is an essential factor in the design of an optical receiver, especially when a slow PD is connected to the input port. Additionally, the variable $y = \frac{\omega_{3dB}}{\omega_0}$ relates the overall 3-dB bandwidth to the natural frequency of the TIA. Considering the bandwidth of the PD, appropriate values for Q and $\omega_0$ can be selected by balancing the trade-off between noise, gain, and speed of the receiver. Understanding the limitations and constraints of (4) requires exploring the valid relationship between x and y, as defined in (5). For instance, when the bandwidth enhancement ratio is 2, y must lie within the range $0.606 \le y \le 1.27$.

$$\sqrt{1 - \sqrt{\frac{2}{(1+x^2)}}} \le y \le \sqrt{1 + \sqrt{\frac{2}{(1+x^2)}}} \tag{5}$$
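Expressions (4) and (5) are easy to evaluate numerically. The sketch below is one possible way to do so; the function names `y_range` and `q_required` are hypothetical helpers, and the check for a non-positive radicand simply mirrors the validity condition that leads to (5).

```python
import math


def y_range(x):
    """Bounds on y = w_3dB / w_0 from (5) for a bandwidth enhancement ratio x."""
    k = math.sqrt(2.0 / (1.0 + x * x))
    lower = math.sqrt(max(0.0, 1.0 - k))  # clamp: for x < 1 the lower bound is 0
    upper = math.sqrt(1.0 + k)
    return lower, upper


def q_required(x, y):
    """Quality factor from (4); only defined strictly inside the range given by (5)."""
    radicand = 2.0 - (1.0 + x * x) * (1.0 - y * y) ** 2
    if radicand <= 0.0:
        raise ValueError("y lies outside the range allowed by (5)")
    return y * math.sqrt(1.0 + x * x) / math.sqrt(radicand)


y_low, y_high = y_range(2.0)
print(f"x = 2: {y_low:.3f} <= y <= {y_high:.3f}")
print(f"x = 2, y = 1: Q = {q_required(2.0, 1.0):.3f}")
```

At the edges of the range the radicand in (4) goes to zero and the required Q grows without bound, which is why practical designs sit well inside the interval.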



**FIGURE 5.** 3D plot illustrating the relationship between *Q* and the ratios x and y.

It is important to focus on reducing noise when choosing the best value for y. Typically, a higher Q value results in more noise at the TIA output. We will later demonstrate these concepts using the total integrated noise power output. Therefore, when selecting y, it is crucial to ensure that the minimum required Q value is achieved. By using the partial derivative of Q with respect to y for a given  $x \ge 1$ , we can determine the lowest positive value of y at which Q is minimized, as shown in (6).

$$\frac{\partial Q}{\partial y} = 0 \rightarrow y_{Q-\min} = \sqrt[4]{\frac{x^2 - 1}{x^2 + 1}} \tag{6}$$

Fig. 5 depicts (4) plotted against x and the valid range of y. For a given bandwidth extension ratio, there exists a corresponding y value, defined in (6), at which the minimum Q value is observed. In light of the preceding discussion, and considering our photodiode's bandwidth of 930 MHz and the target bit rate B of 4 Gb/s, we have selected the parameters x = 2.86, y = 0.94, and Q = 2. It should be noted that we employed the criterion $\omega_{3dB} = 2\pi(0.6 - 0.8) \times B$, as recommended in [32], to optimize the receiver's performance in terms of noise, speed, and intersymbol interference (ISI).
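The selected operating point can be reproduced from (4) and (6). The following sketch assumes only the values stated in the text (930 MHz PD bandwidth, 4 Gb/s bit rate, x = 2.86); the function names are illustrative.

```python
import math


def y_at_min_q(x):
    """y that minimises Q for a given x >= 1, from (6)."""
    return ((x * x - 1.0) / (x * x + 1.0)) ** 0.25


def q_from_xy(x, y):
    """Quality factor from (4)."""
    return y * math.sqrt(1.0 + x * x) / math.sqrt(2.0 - (1.0 + x * x) * (1.0 - y * y) ** 2)


F_PD = 930e6      # PD 3-dB bandwidth, Hz
BIT_RATE = 4e9    # target bit rate B, b/s
x = 2.86          # bandwidth enhancement ratio selected in the text

y = y_at_min_q(x)
q = q_from_xy(x, y)
f_3db = x * F_PD          # overall PD + TIA bandwidth
f_0 = f_3db / y           # TIA natural frequency implied by y

print(f"y at minimum Q: {y:.3f}, minimum Q: {q:.3f}")
print(f"f_3dB = {f_3db / 1e9:.2f} GHz = {f_3db / BIT_RATE:.3f} x B, f_0 = {f_0 / 1e9:.2f} GHz")
```

This gives y of about 0.94 and Q of about 2.08, consistent with the rounded values selected above, and a bandwidth of roughly 2.66 GHz, which is 0.665 times the bit rate and therefore inside the 0.6 to 0.8 window.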

Fig. 6 shows the normalized ($Z_0 = 1$) transimpedance gain for the PD alone, the TIA alone, and the PD combined with the TIA, based on (3). Each case is plotted to start at the same point (0 dB at DC), facilitating easy comparison. Due to the high Q required for frequency equalization, the frequency response of the TIA is no longer of the low-pass type and behaves like a band-pass. A significant increase in Q not only raises integrated noise but also, as shown in Fig. 6, leads to passband gain variation. This variation distorts the data pulse shape and introduces another form of ISI, which ultimately limits the receiver's accuracy.
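The three normalized curves can be regenerated from (3) with a few lines of code. This is a sketch under the assumption that the PD is a single pole at its 930 MHz bandwidth and that Q is taken from (4) for the selected x and y; it is not the authors' plotting script.

```python
import math

F_PD = 930e6        # PD 3-dB bandwidth, Hz
X, Y = 2.86, 0.94   # ratios selected in the text

# Q required by (4) for this (x, y) pair
Q = Y * math.sqrt(1 + X**2) / math.sqrt(2 - (1 + X**2) * (1 - Y**2) ** 2)
F_3DB = X * F_PD    # target bandwidth of PD + TIA
F_0 = F_3DB / Y     # natural frequency of the TIA


def pd_gain(f):
    """Magnitude of the single-pole PD response, normalised to 1 at DC."""
    return 1.0 / abs(1 + 1j * f / F_PD)


def tia_gain(f):
    """Magnitude of the second-order TIA response H(S) with Z_0 = 1."""
    s = 1j * f / F_0
    return 1.0 / abs(1 + s / Q + s * s)


def total_gain(f):
    """Magnitude of (3): PD cascaded with the TIA."""
    return pd_gain(f) * tia_gain(f)


def db(mag):
    return 20.0 * math.log10(mag)


for f in (0.1e9, F_PD, 2.0e9, F_3DB, 4.0e9):
    print(f"{f / 1e9:5.2f} GHz: PD {db(pd_gain(f)):6.2f} dB, "
          f"TIA {db(tia_gain(f)):6.2f} dB, total {db(total_gain(f)):6.2f} dB")
```

By construction the combined response is 3 dB down at $x \cdot f_{pd}$, while the TIA alone shows a gain above 0 dB around its natural frequency, which is the band-pass-like peaking discussed above.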



FIGURE 6. Normalized transimpedance gains.

To mitigate the effects of a large transit time of the photodiode (PD), another approach is to introduce a zero with a time constant of $\tau_{tr}$ in the transfer function, so that it cancels the PD pole in (3). This should not impact the bandwidth of the TIA. In the next section, we will first discuss how creating a zero in the TIA's transfer function can help alleviate the impact of the significant PD transit time. Subsequently, we will provide a comprehensive description of our complete single-to-differential optical receiver design to achieve the desired performance.
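The effect of such a zero can be illustrated with the single-pole PD model of (3). In the sketch below, the TIA natural frequency `F_0` and the Butterworth quality factor are illustrative assumptions chosen only to make the cancellation visible; they are not the design values of the receiver.

```python
import math

TAU_TR = 170e-12           # PD transit-time constant from the text, s
F_0 = 2.45e9               # illustrative TIA natural frequency, Hz (assumption)
Q = math.sqrt(2.0) / 2.0   # Butterworth response without gain peaking (assumption)


def receiver_gain(f, tau_f):
    """|PD pole x TIA with a zero of time constant tau_f|, normalised to 1 at DC."""
    s = 2j * math.pi * f
    w0 = 2.0 * math.pi * F_0
    pd = 1.0 / (1.0 + TAU_TR * s)
    tia = (1.0 + tau_f * s) / (1.0 + s / (Q * w0) + (s / w0) ** 2)
    return abs(pd * tia)


matched = receiver_gain(F_0, TAU_TR)   # zero placed on the PD pole
no_zero = receiver_gain(F_0, 0.0)      # plain second-order TIA, no zero

print(f"gain at f_0 with matched zero: {20 * math.log10(matched):6.2f} dB")
print(f"gain at f_0 without zero:      {20 * math.log10(no_zero):6.2f} dB")
```

With the zero matched to $\tau_{tr}$, the PD pole drops out and the overall response is 3 dB down at the TIA's own natural frequency, even though Q stays at $\frac{\sqrt{2}}{2}$. Without the zero, the same TIA leaves the cascade far more attenuated at that frequency.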

## C. CIRCUIT DESIGN APPROACH

Fig. 7a illustrates the proposed single-to-differential optical receiver architecture. It consists of a single-ended MD-PIN photodiode, a single-to-differential TIA, an optimized differential amplifier, a 4-stage limiting amplifier with a current-combining offset compensation loop, and a 50 $\Omega$ output buffer. All these building blocks, and also the photodiode, are designed and fabricated using the 0.35 $\mu$m CMOS process from X-FAB Foundry. The MD-PIN photodiode is connected to the receiver chip through a 1 nH bond wire as shown in Fig. 15. Wire bonding was chosen to remain flexible in the choice of the PD. However, the MD-PIN PD can be integrated on the same chip together with the TIA/RX.

#### 1) TRANSIMPEDANCE AMPLIFIER

The TIA is the first amplification stage of the receiver chain. Its bandwidth, noise, and gain have a significant impact on the overall performance of the optical receiver. In this section, we will redesign the single-to-differential (SDT) TIA, originally introduced by [27] (shown in Fig. 7b), first to generate a zero in the feedback path, and then to achieve the desired values of Q and  $\omega_0$ , for optimal noise and speed performance.

The calculation performed here is more advanced than that in [27], as it considers the possibility of complex conjugate poles and includes the components $C_f$ and $M_f$, which place the zero at the frequency of the PD's dominant pole. The zero time constant, $\tau_f \simeq \frac{C_f}{g_{m5}}$, is set according to the MD-PIN PD's transit time: as shown in Fig. 7b, $M_f$ and $C_f$ can be adjusted such that $\tau_f \simeq \tau_{tr}$.

Another observation is that in this TIA, the feedback path behaves like an active inductor. The feedback capacitor $C_f$ can be viewed at the input of the TIA as an equivalent inductor $L_{eq} \simeq \frac{C_f}{g_{122}}$, as conceptually illustrated in Fig. 7c. A large value of $L_{eq}$ can be achieved, which, in turn, can appropriately increase the MD-PIN PD's speed. In this analysis, it is assumed that $g_{mf} \ll g_{m5}$. Following the calculations outlined in Appendix A, we obtain the simplified transfer function $\frac{V_{out}}{I_{in}}$ (highlighted in Fig. 7b) as well as Q and $\omega_0$ of the SDT TIA, defined in (7).

$$\frac{V_{\text{out}}}{I_{\text{in}}} \simeq \frac{Z_0 \left(1 + \tau_f S\right)}{\left(1 + \underbrace{\left(\tau_f + \tau_o\right)}_{1/Q\omega_0} S + \underbrace{\frac{\tau_f \tau_o}{\left(A_f + 1\right)}}_{1/\omega_0^2} S^2\right) \left(1 + \tau_{\text{in}} S\right)}$$

$$\omega_0 = \sqrt{\frac{\left(A_f + 1\right)}{\tau_f \tau_o}}, Q = \frac{\sqrt{\tau_f \tau_o \left(A_f + 1\right)}}{\left(\tau_o + \tau_f\right)} \tag{7}$$

We assume that the SDT TIA transfer function features a pair of complex conjugate poles and a real pole with a time constant of $\tau_{in}$. This assumption implies that $\frac{1}{\tau_{in}} \gg \frac{\omega_0}{2Q}$. The TIA must provide differential signaling, which requires $A_1 = A_2 = A$ and $\tau_o = \tau_1 = \tau_2$. The input time constant is $\tau_{in} = \frac{C_{in}}{g_{m1}+g_{o3}}$, and the DC transimpedance gain $Z_0$ is equal to $\frac{2\times A}{(g_{m1}+g_{o3})(A_f+1)}$. $A_f$ is a feedback factor that represents a transconductance boosting factor for the $M_1$ transistor (see Fig. 7b), similar to the regulated cascode (RGC) TIA [34]; it is equal to $A_f = \frac{AA_ng_{m3}}{g_{m1}+g_{o3}}$, where $A_n = \frac{g_{m4}}{g_{m5}}$.

The relationship between the time constants $\tau_f$ and $\tau_o$ must be chosen specifically to achieve the desired Q value. Solving the Q formula in (7) for the ratio $\frac{\tau_o}{\tau_f}$ establishes the relationship between Q, $A_f$, and $\frac{\tau_o}{\tau_f}$ given in (8).

$$\frac{\tau_o}{\tau_f} = \frac{1}{\left(-\left(1 - \frac{(A_f + 1)}{2Q^2}\right) \pm \sqrt{\left(1 - \frac{(A_f + 1)}{2Q^2}\right)^2 - 1}\right)} \tag{8}$$

For example, if $Q = \frac{\sqrt{2}}{2}$ then $\frac{\tau_o}{\tau_f} \simeq \frac{1}{2A_f}$, a result similar to the one in [32] for the shunt-shunt feedback TIA. Equation (8) also shows that real solutions exist only for $0 \le Q \le \frac{\sqrt{(A_f+1)}}{2}$, so $A_f$ must be selected such that the target Q lies within this range. As a result, the TIA feedback factor $A_f$ plays a critical role in achieving the desired bandwidth extension. To relate the TIA design parameters discussed above to the physical dimensions of the transistors and the biasing conditions, a DC analysis is performed to find the branch currents ($I_2$, $I_3$, and $I_5$ in Fig. 7b). The SDT TIA is self-biased, eliminating the need for a biasing current source in the design and reducing noise generation. The only variable to consider in this design is $V_b$.

FIGURE 7. (a) Architecture of the proposed single-to-differential optical receiver. (b) Circuit diagram of the single-to-differential transimpedance amplifier (SDT). (c) Single-to-differential TIA functions as an active inductor.
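The link between Q, $A_f$, and the time-constant ratio in (7) and (8) can be explored numerically. The sketch below writes the Q expression of (7) in terms of the ratio $r$ of $\tau_o$ to $\tau_f$ and solves the resulting quadratic $r^2 + 2br + 1 = 0$ with $b = 1 - \frac{A_f+1}{2Q^2}$; the value $A_f = 10$ is a hypothetical example, not a design value from the paper.

```python
import math


def q_from_ratio(r, a_f):
    """Q from (7), rewritten in terms of r = tau_o / tau_f."""
    return math.sqrt(r * (a_f + 1.0)) / (1.0 + r)


def ratios_for_q(q, a_f):
    """Both ratios r = tau_o / tau_f that yield the requested Q (they are reciprocal)."""
    if not 0.0 < q <= math.sqrt(a_f + 1.0) / 2.0:
        raise ValueError("Q is outside the achievable range for this A_f")
    b = 1.0 - (a_f + 1.0) / (2.0 * q * q)
    root = math.sqrt(max(0.0, b * b - 1.0))  # clamp guards against rounding at the limit
    return -b - root, -b + root


A_F = 10.0  # hypothetical feedback factor, for illustration only
small, large = ratios_for_q(math.sqrt(0.5), A_F)

print(f"Butterworth Q, A_f = {A_F:g}: tau_o/tau_f = {small:.4f} or {large:.2f}")
print(f"approximation 1/(2*A_f) = {1.0 / (2.0 * A_F):.4f}")
```

For this example the smaller ratio comes out close to $\frac{1}{2A_f}$, matching the approximation stated in the text, and both solutions reproduce the requested Q when substituted back into (7).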

Consequently, we characterize the DC bias current of the transistor through  $V_b$ . Based on the square law of the MOSFET's current equation, and neglecting channel-length modulation and the body effect, we can express the current value and gate-source voltage as shown in (9).

$$I_i = \frac{1}{2} \underbrace{\mu_n C_{ox} \left(\frac{W}{L}\right)_i}_{\beta_i} (V_{GSi} - V_{th})^2, \quad V_{GSi} = \sqrt{\frac{2I_i}{\beta_i}} + V_{th} \tag{9}$$
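The two forms of (9) are inverses of each other, which is convenient when moving between bias currents and gate-source voltages. The sketch below uses hypothetical values for $\beta$ and $V_{th}$ (they are not parameters of the XO035 process) and also includes the standard square-law transconductance $g_m = \sqrt{2\beta I}$, which is not stated in the text.

```python
import math


def drain_current(beta, v_gs, v_th):
    """Square-law drain current from (9); zero below threshold."""
    v_ov = v_gs - v_th
    return 0.5 * beta * v_ov * v_ov if v_ov > 0.0 else 0.0


def gate_source_voltage(beta, i_d, v_th):
    """Gate-source voltage from (9) for a given drain current."""
    return math.sqrt(2.0 * i_d / beta) + v_th


def transconductance(beta, i_d):
    """Standard square-law small-signal transconductance, g_m = sqrt(2 * beta * I)."""
    return math.sqrt(2.0 * beta * i_d)


BETA = 2e-3   # hypothetical mu_n * C_ox * W / L, A/V^2
V_TH = 0.5    # hypothetical threshold voltage, V

i_d = drain_current(BETA, 1.0, V_TH)
print(f"I = {i_d * 1e6:.0f} uA, V_GS = {gate_source_voltage(BETA, i_d, V_TH):.3f} V, "
      f"g_m = {transconductance(BETA, i_d) * 1e3:.2f} mS")
```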

Considering  $V_b = V_{GS1} + V_{GS2}$  and  $V_{GS4} = R_2 \times I_2$ , along with  $I_3 = \beta_5 \times I_5$ , after performing some algebraic simplification,  $I_2$  and  $I_5$  are calculated by

$$I_2 \simeq \frac{V_B}{\sqrt{\alpha}R_2} \to I_5 = \frac{\beta_5}{2} \frac{\left(V_b - 2V_{tn}\right)^2}{\alpha} \tag{10}$$

where $\alpha = \frac{\beta_3}{\beta_5} \times \frac{\beta_4}{\beta_1}$, and $V_{tn}$ and $V_{tp}$ are the threshold voltages of the NMOS and PMOS transistors, respectively. In addition, $V_B = V_b + \sqrt{\alpha} |V_{tp}| - 2V_{tn}$ is a design parameter that controls the bias current of the transistors. For simplicity, (10) gives a reasonable approximation for $I_2$: the exact solution has two non-zero values that are relatively close to each other, and we use the smaller one. In this calculation, we assumed that $I_5$ is much larger than $I_{Mf}$. In this design, $V_b = 2.1$ V is used for biasing the TIA.

In our design, we used capacitive coupling between the TIA and the differential pair, as illustrated in Fig. 7a, to control the DC bias of the TIA independently. To prevent "DC wander" or "baseline wander" [33], which can cause ISI, the time constant of the high-pass filter ( $\tau_{hf} = R_h \times C_h$ ) needs to be significantly larger than the duration of the longest allowable run of identical bits, as indicated in (11). The equation gives the maximum number of consecutive identical bits ($m$) that results in a voltage droop of $\Delta V_{dB}$ (in dB) for a bit period of $T_b$.

$$m = -\ln\left(10^{-\left|\frac{\Delta V_{dB}}{20}\right|}\right) \times \frac{(R_h \times C_h)}{T_b} \tag{11}$$
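As a minimal numerical sketch of (11), the snippet below evaluates the longest tolerable run of identical bits for a given droop. The component values ($R_h$, $C_h$) and the 0.5 dB droop target are hypothetical and chosen only for illustration; they are not taken from the design.

```python
import math

def max_identical_bits(delta_v_db, r_h, c_h, t_b):
    """Evaluate (11): run length m that causes a droop of |delta_v_db| dB."""
    droop_ratio = 10 ** (-abs(delta_v_db) / 20)  # linear amplitude ratio after droop
    return -math.log(droop_ratio) * (r_h * c_h) / t_b

# Hypothetical coupling network and droop budget (assumptions, not design values)
R_H = 100e3      # ohm
C_H = 1e-12      # farad
T_B = 250e-12    # bit period at 4 Gb/s
m = max_identical_bits(0.5, R_H, C_H, T_B)
print(f"longest run of identical bits for 0.5 dB droop: {m:.1f}")
```

Under these assumed values, a larger $R_h C_h$ product or a looser droop budget increases $m$ proportionally.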

## 2) DIFFERENTIAL PAIR, LIMITING AMPLIFIER AND OUTPUT BUFFER

Although the TIA significantly amplifies the photocurrent, the signal swing at the TIA output remains inadequate for digital data interpretation. The limiting amplifier (LA) addresses this by amplifying the signal swing to a rail-to-rail level, completing the final analog stage in the optical receiver. The LA functions as the intermediate stage linking a TIA with a clock and data recovery (CDR) circuit, and it must meet several requirements. The LA must detect signal levels above a few mV, which requires a relatively high gain to generate sufficiently large voltage swings for the subsequent CDR and decision circuits. Furthermore, it should provide enough bandwidth to reduce ISI [35].

In our design, after the TIA, we utilized a differential pair with the capacitive degeneration network ( $R_{eq}$ ,  $C_{eq}$ ) at the source of the  $M_6$  and  $M_7$  transistors, as shown in Fig. 8a. This stage serves two main purposes: first, it collects the differential signal generated by the TIA and achieves fine equalization using the mentioned RC network; second, it provides primary low-noise amplification for subsequent stages. By satisfying the conditions  $R_{eq}C_{eq} = R_DC_{od}$  and  $g_{m6}R_{eq} = 1$ , we can derive the input-output voltage transfer function of the differential amplifier using (12). It is worth noting that  $C_{od}$  represents the single-ended output capacitance, while  $A_{dp} = \frac{g_{m6}R_D}{2}$ .

$$\frac{V_o}{V_i} = -\frac{A_{dp}}{\left(1 + \frac{R_{eq}C_{eq}}{2}S\right)} \to \omega_{3dB} = \frac{2}{R_{eq}C_{eq}} \qquad (12)$$

As discussed in [33], the zero generated at the sources of transistors $M_6$ and $M_7$ cancels the first dominant pole, and the second pole then constrains the bandwidth according to (12). Our calculations are based on the assumptions presented in [33].

To suppress the DC-offset generated by the high-gain LA stages, the receiver needs an offset cancellation mechanism, as shown in Fig. 8b. To minimize the offset voltage caused by mismatches and process variations in the LAs, a low-pass RC filter $(R_o, C_o)$ is used in conjunction with current-combining differential amplifiers, as shown in Fig. 8d, which serve as a feedback path error amplifier (EA). The error current signal is fed back into the output port $(V_{od})$ of the differential pair via transistors $M_{19}$ and $M_{20}$, effectively suppressing the offset voltage. To design the DC-offset current-combining loop, we used the simplified equivalent small-signal half-circuit shown in Fig. 9. $V_{if-LA}$ and $V_{if-EA}$ represent the voltage offset at the input port of the limiting amplifier and error amplifier, respectively. The DC gain of the cascaded LA is represented by $A_L$, and $C_X$ is a combination of the gate-drain capacitance of $M_{19,20}$, along with a frequency compensation capacitor to ensure the stability of the offset compensation loop.

**FIGURE 8.** (a) Differential pair with a capacitive degeneration network. (b) The circuitry of the limiting amplifier. (c) 50 $\Omega$ output buffer. (d) Gain-enhanced error amplifier employing a negative resistor.



FIGURE 9. Simplified equivalent circuit model for offset cancellation in a single-ended configuration.

The voltage transfer functions for both offset voltages are given in (13) and (14), which show the influence of each path, where $\tau_{oc} = R_{oc} \times C_{oc}$, $\tau_x = \frac{C_x}{g_{m15}-g_{m16}}$, and $\tau_z = \frac{C_x}{g_{m20}}$. By using Monte Carlo simulations and performing simulations at different process corners and temperatures, we ensured that $g_{m15}$ is greater than $g_{m16}$ through precise sizing. In this calculation, for simplicity, the $\lambda$ and $\gamma$ parameters of each transistor are neglected. The most important result of this calculation is that the offset voltage of the error amplifier directly appears at the output, while the LA's offset can be suppressed by the gain of the two-stage error amplifier ( $A_{oc} = A_{ea1} \times A_{ea2}$ ).

$$V_{o-LA} \simeq \frac{(1 + \tau_{oc}S) V_{if-LA}}{A_{ea1}A_{ea2} \left(1 + \frac{\tau_{oc}S}{A_{L}A_{ea1}A_{ea2}}\right)} \tag{13}$$

$$V_{o-EA} \simeq \frac{(1 - \tau_z S) (1 + \tau_{oc} S) V_{if-EA}}{\left(1 + \frac{\tau_{oc} S}{A_L A_{ea1} A_{ea2}}\right) (1 + \tau_x (A_{ea2} + 1) S)} \tag{14}$$

 $A_{ea1}$  represents the DC gain of the error amplifier's first stage, given by  $A_{ea1} = \frac{g_{m18}}{g_{m15}-g_{m16}}$ , while  $A_{ea2}$  is the gain of the second stage, equal to  $g_{m20}R_D$ . As demonstrated by these equations, increasing the gain of the EA effectively suppresses the offset of the LAs. However, this introduces additional challenges, including the offset of the EA itself, potential stability issues within the feedback loop, and high-pass filtering effects due to the use of  $R_o$  and  $C_o$ .

In this design, we utilized a cross-coupled pair  $(M_{23}, M_{24})$  at the first stage instead of employing a high-sheet-resistance integrated resistor with relatively high tolerance. This choice allows us to provide a more accurate high-resistance load per occupied area, potentially resulting in a more compact layout. When put into practice, various mismatches in the circuit can cause the output DC voltage to deviate from the ideal value. To correct this, a differential voltage must be applied to the input. This helps bring the output back to the ideal scenario. The differential input voltage applied in this situation is referred to as the input-referred DC-offset voltage [36].

To determine the EA's offset and identify its most significant contributors, $V_{if-EA}$ is calculated using (15) for the first-stage circuit of the error amplifier shown in Fig. 8d, following the method in [36], [37], and [38]. The detailed parameter definitions for this calculation are provided in Appendix C.

$$V_{if-EA} = \Delta V_{Tn} - \left( \sqrt{\frac{\beta_{p1}I_{p1}}{\beta_n I_{D_n}}} \Delta V_{Tp1} + \sqrt{\frac{\beta_{p2}I_{p2}}{\beta_n I_{D_n}}} \Delta V_{Tp2} \right) + \frac{1}{2} \sqrt{\frac{2I_{D_n}}{\beta_n}} \left( \frac{I_{p1}}{I_{D_n}} \frac{\Delta\beta_{p1}}{\beta_{p1}} + \frac{I_{p2}}{I_{D_n}} \frac{\Delta\beta_{p2}}{\beta_{p2}} - \frac{\Delta\beta_n}{\beta_n} \right) \tag{15}$$

The mismatch in the dimensions of NMOS and PMOS pairs, as well as variations in gate oxide thickness, leads to a mismatch in the oxide capacitance ($C_{ox}$). These effects are described by two terms $(\frac{\Delta\beta_n}{\beta_n}, \frac{\Delta\beta_{p1,2}}{\beta_{p1,2}})$ in (15). Moreover, any threshold voltage mismatch between the input transistors $M_{17}$ and $M_{18}$ directly influences the input-referred DC-offset voltage. This emphasizes the importance of having symmetrical devices at the input stage. To achieve maximum symmetry in the layout, each transistor is split into two identical parts and connected diagonally, which is known as the common-centroid layout technique. This layout technique minimizes the impact of cross-chip gradients in oxide thickness and doping, thereby improving the matching performance of the circuit [38].

As mentioned, due to the use of the offset cancellation loop, another challenge is the stability of the feedback loop and high-pass filtering effects. In order to address baseline wander in high-pass filtering, we can determine the values of  $R_o$  and  $C_o$  using (11). Typically, these components require significant values. Instead of using a large capacitor, as mentioned in [39], connecting  $C_o$  between the input and output of the EA can induce the Miller effect. This effect allows for the creation of large effective capacitances with small on-chip capacitors. However, this method can introduce complexities due to nested loops, resulting in intricate stability criteria [38].

Accordingly, in our design (refer to Fig. 7), we chose to use a simple RC network to resolve this issue. To maintain the stability of the offset cancellation loop, we calculate the loop gain in the circuit depicted in Fig. 9. Offset voltages are excluded from this calculation. The loop is opened at the gate of  $M_{18}$ , and a test signal is applied. By tracing the loop and multiplying the cascade gains, the loop gain (LG) is determined, as shown in (16).

$$LG = -\frac{A_L A_{oc} (1 - \tau_z S)}{(1 + \tau_x (A_{ea2} + 1) S) (1 + \tau_{oc} S)} \tag{16}$$

The dominant pole of this transfer function is set by the low-pass filter and is well separated from the second pole, while the right-half-plane zero at $\frac{1}{\tau_z} = \frac{g_{m20}}{C_x}$ has little influence on LG since $C_x \ll C_o$. This ensures that $|\omega_{p1}| \ll \left(\omega_u = \frac{A_L A_{oc}}{\tau_{oc}}\right) \ll |\omega_{p2}|$ allowing a phase margin of

The offset cancellation concern is summarized in Fig. 10, which shows the magnitude plot of the transfer function (Equations 13, 14, and 16) versus angular frequency where  $\omega_x = \frac{1}{\tau_x}$ . This plot highlights the offset contribution of each component and the trade-offs in the stability of the loop. For example, to minimize the offset of the LA, the value of  $A_{oc}$  should be chosen to be quite large. This results in the second pole being closer to  $\omega_u$  and a phase margin getting closer to 45 degrees.
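The stability trade-off described by (16) can be explored numerically. The sketch below finds the unity-gain frequency of the loop by bisection and evaluates the phase margin, counting the phase lag of both poles and of the right-half-plane zero. All gains and time constants are hypothetical placeholders, and the bisection assumes the loop magnitude falls monotonically through unity.

```python
import math

def loop_magnitude(w, a_l, a_oc, a_ea2, tau_z, tau_x, tau_oc):
    """|LG(jw)| from (16); the leading minus sign is the negative feedback."""
    num = a_l * a_oc * math.hypot(1.0, w * tau_z)
    den = math.hypot(1.0, w * tau_x * (a_ea2 + 1)) * math.hypot(1.0, w * tau_oc)
    return num / den

def unity_gain_frequency(params, w_lo=1.0, w_hi=1e15):
    """Bisection on a logarithmic frequency axis for |LG| = 1."""
    for _ in range(200):
        w_mid = math.sqrt(w_lo * w_hi)
        if loop_magnitude(w_mid, **params) > 1.0:
            w_lo = w_mid
        else:
            w_hi = w_mid
    return math.sqrt(w_lo * w_hi)

def phase_margin_deg(params):
    """Phase margin in degrees; the RHP zero adds lag like a pole."""
    w_u = unity_gain_frequency(params)
    lag = (math.atan(w_u * params["tau_oc"])
           + math.atan(w_u * params["tau_x"] * (params["a_ea2"] + 1))
           + math.atan(w_u * params["tau_z"]))
    return 180.0 - math.degrees(lag), w_u

# Hypothetical loop parameters (assumptions for illustration only)
params = dict(a_l=100.0, a_oc=10.0, a_ea2=4.0,
              tau_z=20e-12, tau_x=100e-12, tau_oc=10e-6)
pm, w_u = phase_margin_deg(params)
print(f"w_u = {w_u:.3e} rad/s, phase margin = {pm:.1f} deg")
```

Raising `a_oc` in this sketch moves the unity-gain frequency toward the second pole and lowers the computed phase margin, which mirrors the trade-off discussed above.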



**FIGURE 10.** Closed-loop magnitude versus frequency plot of the transfer functions corresponding to Fig. 9.

The last two components of the chain are LA stages and the 50  $\Omega$  buffer illustrated in Fig. 8b and Fig. 8c, respectively. Buffers driving off-chip loads often face a bandwidth limitation due to the large input transistors required for high current drive capability [35]. To ensure the high-speed voltage swing capability of the buffer, we consider a single-ended output swing of  $\Delta V_{so}$ . This swing corresponds to an equivalent resistance of 25  $\Omega$ , which results from the parallel combination of the buffer's load resistance and the measurement equipment's input resistance.

Due to the gain of the preceding LA stages, the differential input signal of the buffer is relatively high. As a result, complete current switching may occur, steering the entire tail current to one side as highlighted in Fig. 11a. When the buffer senses a differential input signal with an amplitude $V_{in2} - V_{in1} = 2V_{i0}$ greater than $V_{od-13,14} = \sqrt{\frac{2I_{0b}}{\beta_{13,14}}}$, the overdrive voltage of $M_{13}$ and $M_{14}$, $M_{13}$ is cut off and $M_{14}$ is in saturation. As a result, the output exhibits large-signal behavior, which is illustrated in Fig. 11b. When a random amplified binary signal is applied to the input of the buffer with a bit period $T_b$, and disregarding the limited rise and fall times of the input signal, the output can reach $\kappa V_{DD}$ at half of the bit period $(0.5 T_b)$. This condition is met if the output capacitance $C_{ob}$ is below the threshold value determined by (17).

$$C_{ob} = -\frac{T_b}{R_{ob} \ln\left(\frac{V_{DD}(1-\kappa)}{\Delta V_{so}}\right)} \tag{17}$$

If the capacitance  $C_{ob}$  exceeds this value, the topology must be modified using inductive peaking to mitigate the effects of the increased capacitance. Generating a swing of  $\Delta V_{so}$  also implies that the width of the input devices  $M_{13}$  and  $M_{14}$  must be greater than the value calculated in (18), which results in a large input capacitance on the order of hundreds of femtofarads.

$$W_{13,14} \ge \frac{2I_{0b} \times L_{13,14}}{\mu_n C_{ox} (V_{od-13,14})^2} \tag{18}$$
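To give a feel for the magnitudes involved, the following sketch evaluates (17) and (18) exactly as printed. Every numeric value (swing, $\kappa$, tail current, channel length, $\mu_n C_{ox}$, overdrive) is a hypothetical assumption, not a parameter of the fabricated buffer.

```python
import math

def max_output_capacitance(t_b, r_ob, v_dd, kappa, dv_so):
    """Threshold capacitance from (17), evaluated as printed."""
    ratio = v_dd * (1.0 - kappa) / dv_so
    if not 0.0 < ratio < 1.0:
        raise ValueError("(17) needs 0 < V_DD(1 - kappa) / dV_so < 1")
    return -t_b / (r_ob * math.log(ratio))

def min_input_width(i_0b, length, mu_n_cox, v_od):
    """Lower bound on the input device width from (18)."""
    return 2.0 * i_0b * length / (mu_n_cox * v_od ** 2)

# Hypothetical buffer parameters (assumptions for illustration only)
c_ob_max = max_output_capacitance(t_b=250e-12, r_ob=25.0, v_dd=3.3,
                                  kappa=0.95, dv_so=0.4)
w_min = min_input_width(i_0b=16e-3, length=0.35e-6,
                        mu_n_cox=170e-6, v_od=0.4)
print(f"C_ob threshold: {c_ob_max * 1e12:.2f} pF")
print(f"minimum W_13,14: {w_min * 1e6:.0f} um")
```

With these assumed numbers the required width is several hundred micrometers, which illustrates why the buffer input capacitance becomes large.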

The final design step uses a cascade of N identical gain cells to enhance the gain-bandwidth product (GBW) beyond what is achievable with a single high-gain amplifier. As previously discussed, the primary function of the LA is to amplify the small signal from the TIA to a level that ensures reliable operation of the CDR circuit.

In this design, we utilized the topology depicted in Fig. 8b. This topology, inspired by [35], is an inductor-less version of the LA with a second-order transfer function that allows for gain peaking with  $Q_{la} = \frac{\sqrt{2}}{2}$ . This approach can offer a more efficient GBW compared to first-stage amplifiers, as discussed by [40].



**FIGURE 11.** (a) Large-signal equivalent circuit of the buffer with input voltage treated as  $\sqrt{2}$  times the overdrive of the input transistor. (b) Output large-signal swing of the buffer. (c) Equivalent circuit of the limiting amplifier (LA).

Using the small-signal equivalent circuit shown in Fig. 11c and neglecting channel-length modulation, the body effect, and the Miller effect, the voltage transfer function of a single-stage LA is expressed in (19).

$$\frac{V_{out}}{V_{in}} = \frac{A_{0-LA}}{1 + \underbrace{\frac{(\tau_{lx1} + \tau_{lx2})}{(A_F A_{L2} + 1)}}_{\frac{1}{Q_{la} \omega_{la}}} S + \underbrace{\frac{\tau_{lx1}\tau_{lx2}}{(A_F A_{L2} + 1)}}_{\frac{1}{\omega_{la}^2}} S^2} \tag{19}$$

In this equation, the DC gain of a single LA is given by  $A_{0-LA} = \frac{A_{L2}A_{L1}}{(A_FA_{L2}+1)}$ , where  $A_{L1} = g_{m8}R_{L1}$ ,  $A_{L2} = g_{m12}R_{L2}$ , and  $A_F = g_{m10}R_{L1}$ . To enhance the bandwidth of the LA, transistors  $M_{10}$  and  $M_{11}$  are employed to introduce local negative feedback, which generates complex conjugate poles. This negative feedback creates two dominant time constants associated with nodes  $V_{x1}$  and  $V_{out}$ , as depicted in Fig. 11c. The time constants  $\tau_{lx1} = R_{L1}C_{L1}$  and  $\tau_{lx2} = R_{L2}C_{L2}$  can form a complex conjugate pair when the quality factor  $Q_{la}$  is  $\frac{\sqrt{2}}{2}$ , or if the condition expressed in (20) is satisfied.

$$\frac{C_{L2}}{C_{L1}} \simeq \frac{1}{2g_{m10}g_{m12}R_{L2}^2} \tag{20}$$

In this equation, $g_{m10}$ and $g_{m12}$ are the transconductances of transistors $M_{10}$ and $M_{12}$, respectively. (8) is manipulated and simplified under the condition that the quality factor $Q_{la}$ is chosen as $\frac{\sqrt{2}}{2}$.
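The relation between (19) and (20) can be checked numerically. The sketch below assumes the standard second-order form $1 + \frac{S}{Q_{la}\omega_{la}} + \frac{S^2}{\omega_{la}^2}$ for the denominator of (19), sizes $C_{L2}$ from (20), and then computes the exact $Q_{la}$. The transconductances, resistors, and $C_{L1}$ are hypothetical values.

```python
import math

def la_second_order(gm8, gm10, gm12, r_l1, r_l2, c_l1, c_l2):
    """Return (A_0-LA, w_la, Q_la) of one LA cell from the terms of (19)."""
    a_l1, a_l2, a_f = gm8 * r_l1, gm12 * r_l2, gm10 * r_l1
    loop = a_f * a_l2 + 1.0                  # feedback factor A_F * A_L2 + 1
    tau1, tau2 = r_l1 * c_l1, r_l2 * c_l2    # time constants at V_x1 and V_out
    w_la = math.sqrt(loop / (tau1 * tau2))
    q_la = math.sqrt(loop * tau1 * tau2) / (tau1 + tau2)
    return a_l1 * a_l2 / loop, w_la, q_la

# Hypothetical device values (assumptions for illustration only)
GM8, GM10, GM12 = 8e-3, 2e-3, 4e-3           # siemens
R_L1 = R_L2 = 500.0                          # ohm
C_L1 = 200e-15                               # farad
C_L2 = C_L1 / (2.0 * GM10 * GM12 * R_L2 ** 2)  # capacitor ratio from (20)

a0, w_la, q_la = la_second_order(GM8, GM10, GM12, R_L1, R_L2, C_L1, C_L2)
print(f"A0 = {a0:.2f}, f_la = {w_la / (2 * math.pi) / 1e9:.2f} GHz, Q = {q_la:.3f}")
```

Because (20) is an approximation, the exact quality factor in this sketch lands close to, but not exactly at, $\frac{\sqrt{2}}{2}$; the gap shrinks as $A_F A_{L2}$ grows.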

A cascade of identical LA cells, each with a natural frequency of  $\omega_{la}$  and a quality factor  $Q_{la}$ , results in a combined bandwidth that depends on the properties of the individual stages and the overall system configuration. If we consider N identical second-order stages, the combined bandwidth  $\omega_t$  of the cascade can be precisely calculated by (21).

$$\omega_{t} = \frac{\omega_{la}}{\sqrt{2}Q_{la}} \times \sqrt{\sqrt{\left(1 - 2Q_{la}^{2}\right)^{2} + 4Q_{la}^{4}\left(2^{\frac{1}{N}} - 1\right)} - \left(1 - 2Q_{la}^{2}\right)} \tag{21}$$

If $Q_{la}$ is $\frac{\sqrt{2}}{2}$, (21) simplifies to $\omega_t = \omega_{la} \sqrt[4]{2^{\frac{1}{N}} - 1}$, and our calculation is similar to the one in [40] for a second-order system. As a further check of this equation, selecting $N = 1$ yields the same 3-dB bandwidth as reported for all-pole second-order systems in [41]. In this case, the optimum number of stages can be approximated as $N_{opt} \simeq 4 \times \ln(A_L)$, where $A_L$ is the total gain of N identical LAs. The optimum number of stages is a starting value for our design, which is then reduced considering the LAs' power consumption and noise budget.
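The cascade bandwidth can be cross-checked against the defining condition $|H(j\omega_t)|^N = \frac{1}{\sqrt{2}}$ for N identical all-pole second-order stages. The sketch below uses the closed form obtained by solving that condition; the stage parameters are hypothetical.

```python
import math

def cascade_bandwidth(w_la, q_la, n):
    """3-dB bandwidth of n identical all-pole second-order stages."""
    a = 1.0 - 2.0 * q_la ** 2
    inner = math.sqrt(a * a + 4.0 * q_la ** 4 * (2.0 ** (1.0 / n) - 1.0))
    return w_la / (math.sqrt(2.0) * q_la) * math.sqrt(inner - a)

def cascade_magnitude(w, w_la, q_la, n):
    """Normalized magnitude of the n-stage cascade at angular frequency w."""
    x = (w / w_la) ** 2
    return ((1.0 - x) ** 2 + x / q_la ** 2) ** (-n / 2.0)

# Hypothetical stage: natural frequency 2*pi*5 GHz, Butterworth Q, 3 stages
W_LA = 2.0 * math.pi * 5e9
w_t = cascade_bandwidth(W_LA, 1.0 / math.sqrt(2.0), 3)
print(f"bandwidth shrinkage for N = 3: {w_t / W_LA:.3f}")
print(f"|H| at w_t: {cascade_magnitude(w_t, W_LA, 1.0 / math.sqrt(2.0), 3):.4f}")
```

For $Q_{la} = \frac{\sqrt{2}}{2}$ the ratio $\omega_t / \omega_{la}$ printed by this sketch equals the fourth root of $2^{1/N} - 1$, and for $N = 1$ it returns $\omega_{la}$ itself.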

When the differential input voltage at the input of the second or third stage becomes sufficiently large, these stages begin to switch and operate in the large-signal mode, which is described by a nonlinear differential equation with memory effects. Under these conditions, the calculations presented in (19), (20), and (21) are no longer applicable. Instead, we should consider the rise and fall times of the large signal to characterize their speed, similar to the approach we have already used for the buffer.

## 3) NOISE ANALYSIS

Both noise and ISI have a significant impact on the vertical and horizontal openings of the eye diagram. Our comprehensive analysis will delve into how these factors influence the eye diagram, and we will offer a BER estimation that takes into account the combined effects of noise, ISI, and offset.

This section focuses on a detailed characterization of circuit noise while disregarding photodiode noise, which is assumed to be much lower than circuit noise. The primary source of noise in the receiver is the TIA, followed by the differential amplifier and the LAs as the second and third contributors, respectively. In this analysis, we compute the output power spectral densities (PSD) of each block individually (i.e., TIA, differential pair, LA). Subsequently, using the integral formula outlined in Appendix B, we accurately determine the output variance (integrated noise). Dividing the total output noise variance by the total DC gain of the receiver allows us to ascertain the integrated input-referred current noise at the receiver's input.

In light of Fig. 10, we can examine the similarities between the noise and offset calculations when $V_{if-LA}$ represents the total amplified TIA and differential amplifier noise reaching the input of the LA, and $V_{if-EA}$ represents the total input-referred noise of the EA. The two noise contributions experience different frequency shaping, as shown in Fig. 10. For instance, the overall integrated noise of the EA is filtered by $R_o$ and $C_o$. The squared area underneath the offset transfer function is negligible compared to the area underneath the LA path transfer function. A precise calculation would take into account the shaping caused by the DC-offset feedback loop, but as a conservative approach, we used integral limits of zero to infinity.

TIA's noise: the noise model of the TIA is shown in Fig. 23 (Appendix B), which includes the noise sources from all components. The model only considers the thermal noise from the resistors and MOSFETs, while neglecting flicker noise for simplicity. We calculate the noise transfer function  $H_{x-n}(f)$  for each noise current source to the output. Using the simplified integral formulas provided in Appendix B, we determine the integrated output PSD. This calculation yields the voltage noise output variance for each noise current source,  $\sigma_{V_{onx}}^2$ . By dividing  $\sigma_{V_{onx}}^2$  by  $Z_0^2$ , the input-referred current noise variance,  $\sigma_{I_{inx}}^2$ , is calculated. Since all noise sources are statistically independent, their variances can be summed (see Appendix B for more detail).

Similar to the methodology outlined in Appendix A for determining the output voltage across frequencies induced by the photodiode input current, this process can be replicated to ascertain the output voltage contributions from each individual noise source. For instance, the current noise of $M_3$ ( $I_{n3}$ ) follows the same path as the photodiode current, resulting in a similar transfer function. Thus, considering (7) and (39) with N(S) = 1, $\sigma_{inM_3}^2$ can be expressed as follows:

$$\sigma_{inM_3}^{2} \simeq \frac{kT}{C_1} \times \frac{\gamma g_{m3}}{R_1} \times \left(A_f + 1\right) \left(A_f + 0.5\right) \tag{22}$$

where $k = 1.38 \times 10^{-23}$ J/K denotes the Boltzmann constant, T represents the absolute temperature in Kelvin, and $\gamma$ is a coefficient set to $\frac{2}{3}$ for long-channel devices. $g_{m3}$ is the transconductance of $M_3$ and $C_1$ is the output capacitance of the SDT TIA. The current noise from $M_4$ and $M_5$ is also transferred to the output using a slightly different transfer function compared to $M_3$, with an additional multiplication factor from the current mirror, given by $A_{cm} = \left(\frac{g_{m3}}{g_{m4}}\right)$, and without a zero at $\tau_f$. Thus, the sum of both variances can be expressed as follows:

$$\sigma_{inM_5}^{2} + \sigma_{inM_4}^{2} \simeq \frac{kT}{C_1} \times \frac{A_{cm}^{2}\gamma \left(g_{m4} + g_{m5}\right)\left(A_f + 1\right)}{R_1 \left(2A_f + 1\right)} \tag{23}$$

The noise from $R_1$ and $R_2$ appears at the output via a first-order RC filter. For $R_1$, this follows the standard behavior of such a filter. However, as discussed in Appendix B, the noise contribution from $R_2$ deviates slightly from the fundamental variance of a first-order RC filter $\left(\frac{kT}{C_1}\right)$. As a result, the input-referred current variance of $R_1$ and $R_2$ can be found using (24).

$$\sigma_{inR_1}^2 + \sigma_{inR_2}^2 \simeq \frac{kT}{C_1} \times \frac{1}{Z_0^2} (1+1.3) \tag{24}$$

The noise contributions from  $M_1, M_2$  are expressed in (25).

$$\sigma_{inM_1}^2 + \sigma_{inM_2}^2 \simeq \frac{kT}{C_1} \times \frac{1.3\gamma}{Z_0^2} (A_1 + A_2) \tag{25}$$

Equations (22) to (25) give the integrated input-referred current noise variance of each component. Summing these variances yields the total input-referred RMS current noise of the TIA ( $I_{intia}$ ) in (26).

$$I_{intia} \simeq \sqrt{\frac{kT}{C_1} \left( \frac{2.6}{Z_0^2} (1 + \gamma A) + \frac{\gamma g_{m5}(A_f + 1)A_{cm}}{R_1} \left( \frac{A_{cm}(A_f + 1)}{(2A_f + 1)} + (A_f + 0.5) \right) \right)} \tag{26}$$

Differential Pair: to compute the total PSD of the differential pair, we used the half-circuit shown in Fig. 24a (refer to Appendix B) and, due to the circuit symmetry, doubled the calculated PSD, following the discussion in the section on the differential amplifier. Recall the pole-zero cancellation mentioned in the analysis of the differential pair. Using the noise calculation method provided in Appendix B, we can find the total noise current generated by the differential pair at the input port of the receiver. We took into account the noise contributions from $M_6$, $R_{eq}$, and $R_D$, while disregarding the tail current noise, as it appears as a common-mode signal and is eliminated in the differential output. More details can be found in Appendix B.

Based on the above discussion, the input-referred RMS current noise of the differential amplifier  $(I_{indp})$  is given in Eq. 27.

$$I_{indp} \simeq \sqrt{\frac{kT}{C_{od}} \times \frac{2}{Z_0^2} \left(\frac{1}{A_{dp}^2} + \frac{\gamma}{A_{dp}} + \frac{1}{3A_{dp}}\right)} \tag{27}$$

LA: the noise equivalent circuit model for the LA is depicted in Fig. 24b. This half-circuit can represent the LA's noise contribution to the receiver output, revealing which component is the dominant contributor. As in previous noise analyses, we first calculate the output PSD of each noise source. Then, using the integral formula provided in Appendix B, we calculate the variance of each noise source and sum them up. This value is then referred to the input of the receiver by dividing it by the gain of the TIA, differential pair, and LAs. The detailed calculation, including the noise transfer function of each noise source, is provided in Appendix B. The total input-referred RMS current noise of a single-stage LA ( $I_{inla}$ ) is given in (28).

$$I_{inla} \simeq \sqrt{\frac{2kTA_F^2}{C_{L2}Z_0^2 A_{L1}^2 A_{dp}^2} \left( \frac{C_{L2}}{C_{L1}} \left(\gamma A_{L2} \left(1 + \frac{g_{m8}}{g_{m10}}\right) + \frac{A_{L2}}{A_F}\right) + (1 + \gamma A_{L2}) \right)} \tag{28}$$

Although Equations (26), (27), and (28) may appear complex and influenced by many parameters, it is interesting to note that they are fundamentally based on the noise limit of a simple RC integrator,  $\frac{kT}{C}$ . This shows how the selection of the receiver capacitor can significantly affect the noise performance of the receiver. For further simplification, we can manipulate these three formulas into a general form under a square root,  $\sqrt{\frac{kT \cdot n}{C \cdot Z_0^2}}$ , to express the integrated input-referred noise current of the optical receiver,  $I_{irn-tot}$ , as follows:

$$I_{irn-tot} = \sqrt{\frac{kT}{C_1} \times \frac{n_{tia}}{Z_0^2} \times \left(1 + \frac{C_1}{C_{od}} \frac{n_{dp}}{n_{tia}} + \frac{C_1}{C_{L2}} \frac{n_{la}}{n_{tia}}\right)} \tag{29}$$

Providing a holistic design example, consider a receiver with an equal capacitor value of 100 fF for all stages and an impedance  $Z_0$  greater than 1000 ohms. With noise amplification factors of  $n_{\text{tia}} = 7$ ,  $n_{\text{dp}} = 4$ , and  $n_{\text{la}} = 2$  at a temperature of 27 °C, the input-referred current noise of the entire receiver can be estimated using the previous equation to be approximately 735 nA-rms.
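The design example above can be reproduced with a few lines of arithmetic. The sketch assumes that the three stage contributions add as variances of the general form $\frac{kT \cdot n}{C \cdot Z_0^2}$, and it takes $Z_0 = 1000\ \Omega$ as the bounding value of the example.

```python
import math

K_B = 1.38e-23  # Boltzmann constant in J/K

def total_input_noise(c1, c_od, c_l2, z0, n_tia, n_dp, n_la, temp_k):
    """Integrated input-referred RMS noise current, summing stage variances."""
    variance = K_B * temp_k / z0 ** 2 * (n_tia / c1 + n_dp / c_od + n_la / c_l2)
    return math.sqrt(variance)

# Values of the design example: 100 fF per stage, Z0 = 1 kOhm, 27 degrees C
i_irn = total_input_noise(c1=100e-15, c_od=100e-15, c_l2=100e-15, z0=1000.0,
                          n_tia=7, n_dp=4, n_la=2, temp_k=300.15)
print(f"input-referred noise: {i_irn * 1e9:.0f} nA rms")
```

The result is within about 1 nA of the quoted 735 nA-rms estimate, and a larger $Z_0$ lowers it in inverse proportion.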

## 4) SENSITIVITY AND BER

In evaluating the power budget of an optical transmission system, it's essential to consider the sensitivity of the optical receiver, which refers to the minimum average optical power needed to maintain a specific BER. At the data decision point, the signal may suffer significant degradation due to the accumulation of random noise and ISI, leading to erroneous decisions resulting from eye closure [42].

As mentioned earlier, the vertical eye opening is limited by ISI and amplitude noise, while the horizontal eye opening is also at risk, especially at very high data rates where this issue becomes more pronounced. In practical receiver implementation, ISI can occur due to factors such as limitations in receiver bandwidth, high-pass filtering, unequal rise and fall times, and noise perturbations at data zero crossings. By analyzing the signal eye diagram before making a data decision, it is apparent that, in addition to random noise, the signal also undergoes bounded amplitude fluctuations caused by ISI. These fluctuations are significantly influenced by the specific signal pattern, as depicted in Fig. 12.



**FIGURE 12.** Effect of additive noise on a random data pattern, impacting both amplitude and midpoint crossing time. Comparison of the data pattern and its corresponding probability density function (PDF) in the absence of noise, where the receiver consistently makes correct decisions if  $V_{th}$  is between  $V_1$  and  $V_0$ .

The figure illustrates an NRZ signal bit pattern  $V_s(t)$ (010010) combined with Additive White Gaussian Noise (AWGN)  $V_n(t)$ . The system's limited bandwidth affects both the pulse settling time and the rise and fall times, especially when isolated ones or zeros are input to the receiver. This situation frequently results in the smallest signal swing. The received optical power levels corresponding to logical one and zero,  $P_{1-in}$  and  $P_{0-in}$ , are first converted into electrical current by the PD's responsivity  $R(\lambda)$ , and then amplified by the receiver's overall transimpedance gain  $Z_t$ . However, these levels are affected by the settling time limitations of the optical receiver.

As a result, the expected analog voltage levels for logic one  $(V_{1-a})$  and logic zero  $(V_{0-a})$  deviate from their intended values due to ISI, resulting in deviations denoted as  $V_{ISI1}$  and  $V_{ISI0}$ , as illustrated in Fig. 12. The distance from the decision voltage point  $(V_{th})$  can worsen when amplitude noise is taken into account. To estimate the vertical BER degradation, a straightforward method is to consider a worst-case scenario where the amplitude drop is modeled as an impulse function. By convolving this dropped voltage, represented as an impulse, with the probability density function (PDF) of the additive noise at both the one and zero levels, we can assess the impact of the degradation.

Assuming that the convolved result preserves a Gaussian distribution, the BER can be expressed using the following formula [32], [33], [42]:

$$\begin{aligned} P_{Er} &= \frac{1}{2} (P(0 \to 1) + P(1 \to 0)) \to P_{Er} = \frac{1}{2} \operatorname{erfc} \left( \frac{Q_{ber}}{\sqrt{2}} \right) \\ Q_{ber} &\triangleq \frac{V_{1-a} - V_{0-a} - (V_{ISI1} + V_{ISI0})}{\sigma_1 + \sigma_0} \end{aligned} \tag{30}$$

In this equation,  $\sigma_1$  and  $\sigma_0$  represent the standard deviations of the noise at the logic one and zero levels, respectively.

Here, $P_{Er}$ denotes the probability of error, $P(0 \to 1)$ and $P(1 \to 0)$ are the probabilities of receiving a one when a zero was transmitted and of receiving a zero when a one was transmitted, respectively, and erfc is the complementary error function.
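A short numerical sketch of (30) shows how ISI reduces $Q_{ber}$ and raises the error probability. The voltage levels, ISI drops, and noise standard deviations are hypothetical numbers chosen so that $Q_{ber} = 6$.

```python
import math

def q_factor(v1, v0, v_isi1, v_isi0, sigma1, sigma0):
    """Q_ber from (30): ISI-reduced eye height over the summed noise sigmas."""
    return (v1 - v0 - (v_isi1 + v_isi0)) / (sigma1 + sigma0)

def bit_error_rate(q_ber):
    """P_Er from (30) under the Gaussian-noise assumption."""
    return 0.5 * math.erfc(q_ber / math.sqrt(2.0))

# Hypothetical levels in volts (assumptions for illustration only)
q_isi = q_factor(v1=0.13, v0=0.0, v_isi1=0.005, v_isi0=0.005,
                 sigma1=0.01, sigma0=0.01)
q_ideal = q_factor(v1=0.13, v0=0.0, v_isi1=0.0, v_isi0=0.0,
                   sigma1=0.01, sigma0=0.01)
ber_isi = bit_error_rate(q_isi)
ber_ideal = bit_error_rate(q_ideal)
print(f"with ISI:    Q = {q_isi:.2f}, BER = {ber_isi:.2e}")
print(f"without ISI: Q = {q_ideal:.2f}, BER = {ber_ideal:.2e}")
```

In this example a $Q_{ber}$ of 6 corresponds to a BER of roughly $10^{-9}$, the target used for the sensitivity measurements.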

In (30), assuming perfect synchronization between the bit stream and the bit clock, the primary challenges to making accurate decisions are noise and vertical ISI in the received data. However, amplitude noise also contributes to timing jitter at the data midpoint crossing, reducing the horizontal eye-opening as it is highlighted in Fig. 12. Timing uncertainty at the midpoint crossing ( $\sigma_{t_0}$ ) can be expressed in terms of the noise voltage  $V_n(t_0)$  and the derivative of the signal  $V_s(t_0)$ , as highlighted in the figure. A higher derivative of the signal at the midpoint results in less timing error at the sampling moment, reducing the likelihood of errors.



**FIGURE 13.** Second-order system subjected to white noise and a rectangular pulse with a bit period of  $T_b$ .

(30) illustrates the system-level trade-offs of an optical receiver by accounting for noise and ISI. To analyze the contribution of each factor, we conduct a simplified examination based on the optical receiver model depicted in Fig. 13. An ideal input signal bit stream of (0010) is converted to electrical current, alternating between 0 (with the laser extinction ratio of infinity for the leading bit) and  $I_{s1}$ , with a bit period of  $T_b$ . A similar discussion on the worst-case eye-opening can be found in [43] and [44]. However, in this paper, we focus on an isolated one-bit sequence to highlight the trade-offs between noise, bandwidth, and ISI.

As discussed in the SDT-TIA section, the limitations of the MD-PIN PD are mitigated by incorporating a zero in the feedback loop. The resulting frequency response from input current to output voltage can be modeled as a second-order low-pass system characterized by a quality factor Q, a natural frequency $\omega_0$, and a DC gain of $Z_t$. The Laplace transform of the first two bits is calculated in this figure. It is assumed that the bandwidth of the limiting amplifiers (LAs) and differential pair is larger than that of the TIA and that all blocks operate within the linear regime.

To simulate the effect of noise on both the signal amplitude and the midpoint zero crossing, we assume white current noise with a flat PSD of  $4kTg_{neq}$ , where  $g_{neq}$  is the equivalent noise conductance. This noise conductance is selected to match the total noise variance generated by the various noise sources within the receiver. For example, to analyze the effect of noise from transistor  $M_3$  in SDT TIA,  $g_{neq}$  would be set to  $g_{m3} \cdot \gamma$ . The resulting output voltage from the input current  $I_s(t)$  and its derivative at the positive rising edge are calculated in (31) and (32), respectively.

$$V_{o}(t) = Z_{t} \times I_{s1} \times \left( \left( 1 - \frac{\omega_{0}}{\omega_{d}} e^{-\frac{\omega_{0}}{2Q}t} \sin \left(\omega_{d} t + \varphi\right) \right) u(t) + \left( \frac{\omega_{0}}{\omega_{d}} e^{-\frac{\omega_{0}}{2Q}\left(t - T_{b}\right)} \sin \left(\omega_{d} \left(t - T_{b}\right) + \varphi\right) - 1 \right) u(t - T_{b}) \right) \tag{31}$$

$$V_{or}(t)' = Z_t \times I_{s1} \times \frac{\omega_0^2}{\omega_d} e^{-\frac{\omega_0}{2Q}t} \sin(\omega_d t)$$
(32)

The calculations are performed for $Q \ge \frac{1}{2}$, where the parameters are defined as follows: $\omega_d = \frac{\omega_0}{2Q}\sqrt{4Q^2 - 1}$ is the damped natural frequency, and $\varphi = \tan^{-1}\left(\sqrt{4Q^2 - 1}\right)$ is the phase angle associated with the damping. Equations (31) and (32) quantitatively describe the system's settling behavior and the derivative of the output voltage, respectively. Both equations depend on the quality factor Q, the natural frequency $\omega_0$, the transimpedance gain $Z_t$, and the converted input current.
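As a rough numerical illustration of (31) and (32), the following Python sketch evaluates the normalized pulse response ($Z_t I_{s1} = 1$) and the rising-edge slope for the strictly underdamped case $Q > \frac{1}{2}$. The function names, the normalized bit period `T_B = 1`, and the choice $Q = \frac{\sqrt{2}}{2}$ are assumptions made for illustration only; they are not taken from the measured design.

```python
import math


def damped_params(w0, q):
    """Return (w_d, phi) as defined in the text; this sketch needs Q > 1/2."""
    if q <= 0.5:
        raise ValueError("sketch covers the underdamped case Q > 1/2 only")
    root = math.sqrt(4.0 * q * q - 1.0)
    return w0 * root / (2.0 * q), math.atan(root)


def step_response(t, w0, q):
    """Normalized u(t) term of (31), i.e. with Z_t * I_s1 = 1."""
    if t < 0.0:
        return 0.0
    w_d, phi = damped_params(w0, q)
    decay = math.exp(-w0 * t / (2.0 * q))
    return 1.0 - (w0 / w_d) * decay * math.sin(w_d * t + phi)


def pulse_response(t, w0, q, t_b):
    """Normalized response to an isolated one-bit pulse of width T_b, as in (31)."""
    return step_response(t, w0, q) - step_response(t - t_b, w0, q)


def rising_edge_slope(t, w0, q):
    """Normalized derivative of the output on the rising edge, as in (32)."""
    w_d, _ = damped_params(w0, q)
    return (w0 ** 2 / w_d) * math.exp(-w0 * t / (2.0 * q)) * math.sin(w_d * t)


# Hypothetical normalized settings: T_b = 1 and Q = sqrt(2)/2.
T_B = 1.0
Q_EXAMPLE = math.sqrt(2.0) / 2.0
for scale in (0.75, 1.0, 1.5):
    w0 = scale * 2.0 * math.pi / T_B
    v_half = pulse_response(T_B / 2.0, w0, Q_EXAMPLE, T_B)
    print(f"w0 = {scale:.2f} x 2*pi/T_b: V_o(T_b/2) = {v_half:.4f}, "
          f"shortfall = {1.0 - v_half:.4f}")
```

The shortfall printed for each case is the amount by which the normalized output misses its final value at $T_b/2$, which is one simple way to quantify the settling-related ISI discussed in the text.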

To ensure minimal ISI degradation in the vertical BER calculation, it is imperative to minimize both  $V_{ISI1}$  and  $V_{ISI0}$ . This can be accomplished by widening the system's bandwidth and ensuring that at  $T_b/2$ , the output voltage  $V_o(t)$  reaches  $I_{s1}Z_t$ , resulting in an ISI-free eye-opening. If this condition is not met and the settling time causes a voltage drop, the optical power must be increased to compensate, leading to a power penalty due to ISI. However, increasing the system's bandwidth, as expressed in (39) (Appendix B), induces a greater variance in noise attributed to  $i_n(f)$ . This effect precipitates a reduction in  $Q_{ber}$ , as outlined in (30), thereby elevating the probability of error. Similarly, in horizontal BER calculation, widening the system's bandwidth leads to a steeper rise time and larger derivative, resulting in less uncertainty in midpoint crossing time conversion. Moreover, as illustrated in Fig. 12, an enhancement in the derivative of  $V_o(t)$  at the signal midpoints effectively suppresses amplitude noise relative to time uncertainty. This suppression is achieved through a steeper signal rise and fall, which mitigates time uncertainty. Furthermore, (32) demonstrates that increasing  $\omega_0$  results in a near-linear enhancement of the derivative, thereby reducing time uncertainty at midpoints. Nevertheless, a continued increase in  $\omega_0$  also leads to a corresponding linear increase in amplitude noise.

To illustrate the aforementioned discussion, three distinct scenarios for the natural frequency  $\omega_0$  have been considered:  $\omega_0 = 0.75 \times \frac{2\pi}{T_b}, \ \omega_0 = \frac{2\pi}{T_b}, \ \text{and} \ \omega_0 = 1.5 \times \frac{2\pi}{T_b}.$  For



**FIGURE 14.** Time-domain response of the second-order system to a rectangular pulse with a bit period of  $T_b$ , illustrating the impact of additive noise on signal amplitude and midpoint crossing. The effects are calculated and analyzed for varying Q factors and natural frequencies  $\omega_0$ .

each case, the normalized output voltage $V_o(t)$ and the ISI at the vertical axis, $V_{ISI1}$, are calculated at $T_b/2$ and $\frac{4}{5} \times \frac{T_b}{2}$ for different values of the quality factor Q, as shown in Fig. 14. Additionally, the derivative of the output voltage at the midpoint $(t_0)$ is calculated and analyzed for varying values of Q. In this figure, for each $\omega_0$, there exists a corresponding $Q^*$ where $V_{ISI}$ at $\frac{T_b}{2}$ is effectively suppressed, ensuring no voltage drop due to limited settling time. Within this analysis, at the point $(\frac{T_b}{2}, \omega_0)$, the input current noise $(\sigma_{li})$ and the normalized horizontal timing uncertainty $\frac{\sigma_{t_0}}{T_b}$ are calculated and summarized in a table for each scenario. In these calculations, $i_{ins}$ represents the peak-to-peak input signal current required to achieve the desired BER. To avoid solving a nonlinear equation, the derivative of the output signal at $t_0$ is approximated as $\omega_0 I_{s1} Z_t$. Furthermore, in calculating $V_n(t_0)$, which denotes the sampled noise at the midpoint crossing, it is assumed that aliasing does not affect the standard deviation of the output noise. Therefore, this standard deviation is directly applied to $V_n(t_0)$ [45]. Fig. 14 summarizes the trade-offs among the noise, bandwidth, and gain budgets for receiver sensitivity using a second-order all-pole transfer function. The tables on the left side of the figure illustrate how the vertical and horizontal dimensions are constrained by ISI and noise, based on the selection of $\omega_0$ and Q for the transfer function. According to the calculations in the tables, the optimal random timing jitter is achieved when Q = 0.6 and $\omega_0 = \frac{2\pi \times 1.5}{T_b}$, while the best amplitude noise performance occurs when $Q = \frac{\sqrt{2}}{2}$ and $\omega_0 = \frac{2\pi}{T_b}$.
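The opposing trends of amplitude noise and timing uncertainty can be sketched numerically. The Python fragment below assumes white input noise with PSD $4kTg_{neq}$, the one-sided noise integral $\frac{Q\omega_0}{4}$ of an all-pole second-order response, the slope approximation $\omega_0 I_{s1} Z_t$ from the text, and a first-order conversion $\sigma_t \approx \sigma_v / \text{slope}$. The numerical values of `G_NEQ`, `I_S1`, and the temperature are hypothetical and chosen only to produce plausible magnitudes.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def input_noise_rms(g_neq, w0, q, temp_k=300.0):
    """RMS input-referred noise current for white PSD 4kT*g_neq.

    Variance = PSD * (Q * w0 / 4), the one-sided noise integral of an
    all-pole second-order low-pass response (assumption of this sketch).
    """
    return math.sqrt(4.0 * K_BOLTZMANN * temp_k * g_neq * q * w0 / 4.0)


def normalized_jitter(i_noise_rms, i_s1, w0, t_b):
    """sigma_t / T_b with sigma_t ~ sigma_v / slope and slope ~ w0 * I_s1 * Z_t.

    Z_t cancels because sigma_v = Z_t * i_noise_rms.
    """
    return i_noise_rms / (w0 * i_s1) / t_b


# Hypothetical illustration values (not from the measured design).
T_B = 250e-12   # bit period at 4 Gb/s
G_NEQ = 1e-3    # equivalent noise conductance in siemens
I_S1 = 10e-6    # signal current in amperes
Q = math.sqrt(2.0) / 2.0

for scale in (0.75, 1.0, 1.5):
    w0 = scale * 2.0 * math.pi / T_B
    i_n = input_noise_rms(G_NEQ, w0, Q)
    jit = normalized_jitter(i_n, I_S1, w0, T_B)
    print(f"w0 = {scale:.2f} x 2*pi/T_b: noise = {i_n * 1e9:.1f} nA rms, "
          f"sigma_t/T_b = {jit:.4f}")
```

Under these assumptions the RMS noise grows as $\sqrt{\omega_0}$ while the normalized timing uncertainty falls as $1/\sqrt{\omega_0}$, which is the qualitative trade-off described above.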

### **III. MEASUREMENT RESULTS**

The SDT receiver, shown in Fig. 7, was fabricated in a 0.35 $\mu$m CMOS technology with four metal layers. It operates from a 3.3 V supply and draws a total current of 28 mA (excluding the output buffer). The die photograph of the receiver, including the MD-PIN PD, is shown in Fig. 15. The input pad of the receiver is bonded to the cathode pad of the photodiode using a gold wire with a diameter of approximately $26 \mu m$. Both pads measure $95 \mu m \times 95 \mu m$, and each introduces a parasitic capacitance of about 90 fF to the chip ground. The chip ground is connected to the PCB ground through a parallel combination of bond wires, resulting in an equivalent inductance of 100 pH.

As shown in the figure, the PD chip is positioned on the PCB 1 mm away from the receiver chip. This distance adds approximately 1 nH of inductance, neglecting the bond curvature. However, as explained in Section II-B, this level of inductance does not compensate for the limited frequency response of the PD because of the receiver's low input impedance. The layout implementation of the receiver, depicted in Fig. 15, includes bias pads, supply capacitor $C_{dd}$, two RC filters ($(R_h, C_h)$ and $(R_o, C_o)$), and all amplifying stages, occupying an area of approximately 0.7 mm × 1.4 mm $\approx$ 0.98 mm<sup>2</sup>.



FIGURE 15. Photograph of the SDT receiver chip wire-bonded to a MD-PIN PD.

The receiver's optical frequency response was examined using a Rohde & Schwarz ZNB8 vector network analyzer (VNA), as shown in Fig. 16. Port 1 ($P_1$) of the VNA modulates a 675 nm laser source, delivering light to a programmable attenuator followed by an optical splitter. One path goes to a power meter to monitor the input power level, while the other path leads to an XYZ fiber positioner used to align the light onto the active area of the MD-PIN PD. Once the XYZ fiber positioner is aligned, the average converted current is measured using a high-sensitivity ammeter connected in series with an ultra-low-noise power supply.



FIGURE 16. Optical gain measurement setup for the SDT receiver with an attached MD-PIN PD.

The MD-PIN PD converts light into an electrical current, while the SDT receiver generates differential signals at $P_2$ and $P_3$. For signal protection, these signals are routed to the VNA through a variable electrical attenuator. Calibration was performed using a standard 17 Gb/s optical receiver to remove the effects of the laser source, coaxial cables, and PCB. The differential transimpedance of the entire receiver including the MD-PIN PD, $Z_{RX}$, is shown in Fig. 17 for further comparison. Additionally, the results of the post-layout simulation of the proposed receiver combined with the MD-PIN PD are included. The transimpedance gain achieved is 84 dB$\Omega$, with a 3-dB bandwidth of 2.45 GHz, when an average optical input power of -15 dBm reaches the active area of the MD-PIN PD at a wavelength of 675 nm, generating an $I_{av} \sin(\omega_{VNA}t)$ signal, where $I_{av} = 9.17 \mu A$ and $\omega_{VNA}$ is the instantaneous frequency generated by the VNA.



FIGURE 17. The complete receiver differential transimpedance gain, incorporating the MD-PIN PD.



FIGURE 18. The test setup for time domain noise, eye diagram, and bit error measurement.

This input current produces a differential voltage of 0.145 V at the output of the receiver, which is small enough for the receiver to operate within the small-signal regime. Dividing the receiver bandwidth by that of the MD-PIN PD gives x = 2.63, demonstrating the effectiveness of our method in enhancing the CMOS PD frequency response through circuit techniques. The gain imbalance (GI) and phase imbalance (PI) of the receiver were also measured, with the maximum GI and PI being less than 0.65 dB and 4.2 degrees, respectively.
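The figures quoted above can be cross-checked with a few lines of arithmetic. The Python sketch below converts the 84 dB$\Omega$ gain to ohms, multiplies by $I_{av}$, and forms the bandwidth extension ratio from the 2.45 GHz and 930 MHz values stated in this paper; the variable and function names are illustrative only.

```python
def dbohm_to_ohm(gain_db_ohm):
    """Convert a transimpedance gain in dB-ohm to ohms (20*log10 convention)."""
    return 10.0 ** (gain_db_ohm / 20.0)


Z_T_OHM = dbohm_to_ohm(84.0)        # measured transimpedance gain
I_AV_A = 9.17e-6                    # input current amplitude from the text
V_OUT_V = Z_T_OHM * I_AV_A          # expected differential output voltage

BW_RX_HZ = 2.45e9                   # receiver 3-dB bandwidth
BW_PD_HZ = 930e6                    # MD-PIN PD bandwidth
X_RATIO = BW_RX_HZ / BW_PD_HZ       # bandwidth extension ratio

print(f"Z_t = {Z_T_OHM / 1e3:.2f} kOhm")
print(f"V_out = {V_OUT_V:.3f} V")
print(f"x = {X_RATIO:.2f}")
```

The results agree with the 15.85 k$\Omega$ DC gain, the 0.145 V output, and x = 2.63 reported in the text.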

To evaluate the receiver's time-domain performance, we employed the measurement setup illustrated in Fig. 18. A pseudo-random bit sequence (PRBS) generator supplied the modulating signal for a 675 nm laser. The bit period of the non-return-to-zero (NRZ) random data was defined by an arbitrary frequency synthesizer, and a synchronized clock produced within the bit generator was connected to the trigger port of the Tektronix DSA8200 digital serial analyzer. The pseudo-random NRZ pattern travels through a single-mode fiber to the attenuator and then through an XYZ fiber positioner that feeds the MD-PIN PD, as in the optical gain measurement. The differential output of the receiver is connected to channel 3 ($C_{h3}$) and channel 4 ($C_{h4}$) of the DSA8200 using high-frequency coaxial cables and an electrical attenuator. Configuring these channels as a differential port through the Math setup of the signal analyzer allows the differential output to be analyzed directly.

Using this setup, we can measure the time-domain noise and the eye diagram and, with the addition of a bit error analyzer, obtain an accurate estimate of the receiver BER. In the noise analysis section, we discussed that noise suppression can occur in the differential output of the SDT receiver. Consequently, measuring the differential output noise is crucial. This can be achieved either by assessing the total power spectral density (PSD) using a spectrum analyzer or by conducting time-domain measurements.

To interface our receiver with a single-input port spectrum analyzer, a passive balun with sufficient bandwidth to observe noise suppression is necessary. However, this method introduces complexity. Therefore, we opted to measure the receiver noise in the time domain. Given that we are dealing with an ergodic random process, where ensemble averages are equal to the corresponding time averages, a sufficiently long time-domain observation of the noise at the receiver's output port is adequate to characterize the receiver's noise performance [46].

As a result, Fig. 19 has been included, which is divided into two sections: (a) the post-layout result of the input-referred current noise density, demonstrating an average current density of 13.35  $\frac{pA}{\sqrt{Hz}}$  across the entire receiver, and (b) a time-domain noise measurement described in detail below.

In the setup shown in Fig. 18, we measured the noise performance of the receiver in two steps. First, we unplugged the cables at  $C_{h3}$  and  $C_{h4}$  and measured the differential output voltage histogram of  $(V_{C_{h3}} - V_{C_{h4}})$  using the DSA8200 signal analyzer. The standard deviation of this histogram in this step is approximately 884  $\mu$ V-rms. In the second step, the receiver is connected to the signal analyzer, but only the MD-PIN PD power supply is active, supplying sufficient reverse bias to the PD. There is no input incident light, all other setup components are disconnected, and the PD is kept in a dark condition with no illumination from the laser source. The histogram obtained in this step, depicted in Fig. 19b, represents the noise profile of the receiver along with the DSA8200 signal analyzer noise, which was measured to be 11.486 mV. Excluding signal analyzer noise, the differential voltage RMS noise of the receiver was calculated to be 11.40 mV-rms. By dividing this value by the receiver's DC gain of 15.85  $k\Omega$ , the total integrated input-referred current noise was determined to be approximately 717 nA-rms.



**FIGURE 19.** Input-referred current density and histogram-based noise measurement of the SDT receiver.

Dividing this result by $\sqrt{BW}$ yields a measured input-referred current density of 14.52 $\frac{pA}{\sqrt{Hz}}$, which agrees reasonably well with the post-layout simulation results.
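The input-referred noise arithmetic can be reproduced as follows. This Python sketch uses the values reported above and assumes that BW is the 2.45 GHz 3-dB bandwidth. It also shows a quadrature subtraction of the analyzer noise, which presumes the two noise contributions are uncorrelated; that step is an assumption of this sketch and yields a value slightly different from the 11.40 mV reported, so the reported correction may have been obtained differently.

```python
import math

V_TOTAL_RMS = 11.486e-3    # receiver + analyzer noise (V rms), from the text
V_ANALYZER_RMS = 884e-6    # analyzer-only noise (V rms), from the text
V_RX_REPORTED = 11.40e-3   # receiver-only noise as reported (V rms)
Z_DC_OHM = 15.85e3         # receiver DC gain (ohm)
BW_HZ = 2.45e9             # assumed to be the 3-dB bandwidth


def quadrature_subtract(total_rms, instrument_rms):
    """Remove an uncorrelated instrument contribution from an RMS value."""
    return math.sqrt(total_rms ** 2 - instrument_rms ** 2)


def input_referred_current(v_rms, z_ohm):
    """Refer an output RMS noise voltage to the input through the DC gain."""
    return v_rms / z_ohm


def average_density(i_rms, bw_hz):
    """Average current noise density assuming a flat spectrum over bw_hz."""
    return i_rms / math.sqrt(bw_hz)


v_rx_quadrature = quadrature_subtract(V_TOTAL_RMS, V_ANALYZER_RMS)
i_n_rms = input_referred_current(V_RX_REPORTED, Z_DC_OHM)
density = average_density(i_n_rms, BW_HZ)

print(f"quadrature estimate: {v_rx_quadrature * 1e3:.2f} mV rms")
print(f"input-referred noise: {i_n_rms * 1e9:.0f} nA rms")
print(f"average density: {density * 1e12:.2f} pA/sqrt(Hz)")
```

The computed values are close to the approximately 717 nA rms and 14.52 pA/$\sqrt{Hz}$ quoted in the text, with small differences attributable to rounding of the inputs.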

Three different bit rates (2, 3, and 4 Gb/s) were tested with average input optical powers of -18, -17.2, and -16.2 dBm, respectively, using a PRBS $2^{23} - 1$ pattern. The resulting eye diagrams are shown in Fig. 20. The average power calculations assume a very high extinction ratio for our laser. At 2 Gb/s, a sensitivity of -18.2 dBm is achieved with a BER of $10^{-9}$, without any power penalty from vertical or horizontal ISI. However, at 3 Gb/s and 4 Gb/s, the input power needs to be increased by 1 dB and 2 dB, respectively, to maintain a BER of $10^{-9}$. At 4 Gb/s, as the input current nearly doubles, horizontal ISI becomes more pronounced, though the vertical eye opening remains unaffected.

In order to accommodate the single-ended input requirement of our bit error tester and take advantage of the improved noise suppression offered by the SDT receiver in its differential mode, we performed BER measurements using the configuration shown in Fig. 18, with some minor modifications. We evaluated the BER of the SDT receiver in conjunction with the MD-PIN PD at two different bit rates (2 Gb/s and 4 Gb/s), as depicted in Fig. 21 relative to the input average optical power (sensitivity).

The BER measurements were conducted using the single-ended output of the bit error tester and then repeated using the built-in Q function of the sampling oscilloscope set up for the differential port. By consolidating these outcomes, we were able to accurately determine the receiver's BER,

## TABLE 1. Performance comparison of fully integrated CMOS and BiCMOS optoelectronic integrated circuits (OEICs).

| Parameters                                                                                                                                                     | [23]       | [14]      | [17]       | [21]       | [18]      | [19]      | [20]       | [13]       | [12]       | This work |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|------------|------------|-----------|-----------|------------|------------|------------|-----------|
| Technology                                                                                                                                                     | CMOS       | CMOS      | CMOS       | CMOS       | BiCMOS    | BiCMOS    | CMOS       | CMOS       | CMOS       | CMOS      |
| Node length (nm)                                                                                                                                               | 65         | 180       | 130        | 130        | 600       | 500       | 180        | 65         | 180        | 350       |
| PD Type                                                                                                                                                        | APD        | SMPD      | APD        | PN         | PIN       | PIN       | PN         | PN         | SML-PD     | MD-PIN    |
| Chip area $(mm^2)$                                                                                                                                             | 0.024      | 0.76      | 0.26       | 1.71       | 0.96      | 0.6       | N/A        | 0.24       | 0.72       | 0.98      |
| Wavelength (nm)                                                                                                                                                | 850        | 850       | 850        | 850        | 630       | 850       | 850        | 670        | 850        | 675       |
| Power dissipation (mW)                                                                                                                                         | 13.5       | 145       | 66.8       | 74.16      | 185       | 310       | 34         | 46         | 183        | 92        |
| Bit rate (Gb/s)                                                                                                                                                | 12.5       | 10        | 10         | 4.5        | 5         | 11        | 3          | 4          | 5          | 4         |
| BER                                                                                                                                                            | $10^{-12}$ | $10^{-9}$ | $10^{-12}$ | $10^{-12}$ | $10^{-9}$ | $10^{-9}$ | $10^{-11}$ | $10^{-12}$ | $10^{-12}$ | $10^{-9}$ |
| Sensitivity (dBm)                                                                                                                                              | -2         | -6        | -4         | -3.4       | -20.2     | -8.9      | -19        | -6.2       | -3         | -16.2     |
| PD+Receiver BW (GHz) | 6 | 6.9 | 6 | 2.7 | 2.4 | 7.7 | N/A | 2.6-4 | N/A | 2.45 |
| BW extension ratio                                                                                                                                             | 1.27       | 1         | 1          | 5.4        | 1         | 3.5       | N/A        | N/A        | N/A        | 2.63      |
| Gain (dB $\Omega$ )                                                                                                                                            | 60         | 88        | 100        | 105        | 97.2      | 85        | N/A        | N/A        | N/A        | 84        |
| Integrated input current noise (nA-rms)                                                                                                                        | N/A        | 1270      | N/A        | 665.8      | N/A       | N/A       | N/A        | 5900-7800  | N/A        | 717       |
| $\text{FOM} \triangleq \frac{\lvert \log(BER) \rvert \times Z_{0,dB\Omega} \times \frac{BR}{Gb/s} \times x}{\frac{P_{diss}}{mW} \times 10^{\frac{Sen_{dBm}}{10}}}$ | 1341.9 | 217.5 | 451.23 | 903.24 | 2475.8 | 1622.5 | N/A | N/A | N/A | 3603.7 |



**FIGURE 20.** Eye diagram measurements for three different bit rates (2, 3, and 4 Gb/s) using a PRBS  $2^{23} - 1$  pattern of the SDT receiver with the MD-PIN PD.

with the SDT achieving a BER of less than  $10^{-9}$  and average sensitivities of -18 dBm and -16.2 dBm at 2 Gb/s and 4 Gb/s, respectively.



**FIGURE 21.** Measured BER versus sensitivity at 2 and 4 Gb/s for PRBS  $2^{23} - 1$  input pattern.

Table 1 compares this work with various methods for enhancing the speed of CMOS/BiCMOS photodiodes. Our focus was on evaluating bandwidth efficiency and the trade-offs involved, such as increased noise, area requirements, the equalization method (bandwidth efficiency technique), and power consumption, all of which are detailed in the table. To facilitate a fair comparison, inspired by the figure-of-merit (FOM) in [21], this paper proposes a modified FOM, presented at the bottom of Table 1, to highlight the performance of the optical receiver equalizer. This modified FOM incorporates the key parameters from a system-level perspective. The numerator contains the absolute value of the logarithm of the BER, the transimpedance gain in dB$\Omega$, the normalized data rate, and the bandwidth extension ratio, while the denominator contains the normalized sensitivity and power dissipation. Based on the comparison table, it can be concluded that receivers with lower-speed PDs, particularly those using a PN junction, exhibited poorer sensitivity performance. In our design, thanks to the relatively high bandwidth of the MD-PIN PD and our analysis, we demonstrate a single-to-differential TIA equalizer with performance comparable to that of more advanced technologies, including those with superior transistors such as bipolar devices.
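The modified FOM can be evaluated directly from the rows of Table 1. The Python sketch below assumes the FOM is the product of $|\log_{10}(BER)|$, the gain in dB$\Omega$, the bit rate in Gb/s, and the bandwidth extension ratio, divided by the product of the power dissipation in mW and the sensitivity converted from dBm to mW. This reading reproduces the tabulated values for the rows checked here; the function name and data layout are illustrative.

```python
import math


def figure_of_merit(ber, gain_db_ohm, bit_rate_gbps, bw_extension,
                    sensitivity_dbm, power_mw):
    """Modified FOM as read from the last row of Table 1 (assumed form)."""
    sensitivity_mw = 10.0 ** (sensitivity_dbm / 10.0)
    numerator = abs(math.log10(ber)) * gain_db_ohm * bit_rate_gbps * bw_extension
    return numerator / (power_mw * sensitivity_mw)


# (BER, gain dB-ohm, bit rate Gb/s, BW extension, sensitivity dBm, power mW)
ROWS = {
    "[23]": (1e-12, 60.0, 12.5, 1.27, -2.0, 13.5),
    "[14]": (1e-9, 88.0, 10.0, 1.0, -6.0, 145.0),
    "[18]": (1e-9, 97.2, 5.0, 1.0, -20.2, 185.0),
    "This work": (1e-9, 84.0, 4.0, 2.63, -16.2, 92.0),
}

for name, row in ROWS.items():
    print(f"{name}: FOM = {figure_of_merit(*row):.1f}")
```

Because the sensitivity enters the denominator in linear units, each 10 dB improvement in sensitivity raises the FOM by a factor of ten, which explains why the high-sensitivity designs dominate the comparison.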

## **IV. CONCLUSION**

In this paper, we presented the design and implementation of a high-speed 4 Gb/s optical receiver utilizing a multi-dot PIN CMOS photodiode with a 930 MHz bandwidth, extended to 2.45 GHz through a noise-suppressed single-to-differential transimpedance amplifier (SDT-TIA). By employing low-frequency zero synthesis in the SDT-TIA feedback path, we achieved a bandwidth enhancement of 2.63×, enabling the receiver to operate at data rates of 4 Gb/s, 3 Gb/s, and 2.5 Gb/s with bit-error ratios below $10^{-9}$ at minimum optical powers of -16.2 dBm, -17.2 dBm, and -18 dBm, respectively.

Eliminating the need for a dummy TIA improved the front-end's noise performance, resulting in an integrated input-referred current noise below 717 nA rms. The proposed receiver design demonstrates the effectiveness of integrating equalization techniques to extend bandwidth while maintaining a low bit error rate (BER) and enhancing noise performance. A comprehensive comparison with existing advanced CMOS/BiCMOS receiver designs, in which photodiodes with limited bandwidth are connected to the receiver inputs, shows that our approach achieves competitive performance. This is particularly evident when evaluated using a modified figure of merit (FOM), which considers key system-level parameters such as normalized data rate, BER, bandwidth extension ratio, sensitivity, and power dissipation.

## **APPENDIX A SDT DETAILED FREQUENCY RESPONSE ANALYSIS**

Fig. 22 presents the small-signal equivalent circuit of SDT used for frequency response calculations. To simplify the analysis, channel length modulation is disregarded for all transistors except  $M_3$ . For transistor  $M_3$ , the channel length modulation effect is represented by  $r_{ds3} = \frac{1}{g_{o3}}$ . The body effect and all gate-drain capacitors are neglected in this analysis. The capacitance  $C_{in}$  arises from the parasitic capacitances of the transistors connected to the  $V_{in}$  node, as well as the capacitance of the photodiode. The capacitances  $C_1$  and  $C_2$  are associated with the output nodes  $V_{o1}$  and  $V_{o2}$ , respectively, and they contribute to the two output time constants,  $\tau_1 = R_1 \times C_1$  and  $\tau_2 = R_2 \times C_2$ . The voltage DC gain at each output with respect to  $V_{in}$  is given by  $A_1 = g_{m1} \times R_1$  and  $A_2 = g_{m2} \times R_2$ , respectively.

The effect of active feedback on the input is represented by a dependent current source, as illustrated in Fig. 22, with a factor of  $A_n(S) \times g_{m3}$ , where  $g_{m3}$  is the transconductance of the  $M_3$  transistor. The feedback capacitor,  $C_f$ , consists of a parasitic capacitor associated with the node  $V_f$  and an additional variable capacitor. By applying Kirchhoff's Current Law (KCL) at both output nodes, and considering the above descriptions, we can derive the voltage frequency transfer function of both outputs with respect to the input node ( $V_{in}$ ). This transfer function is given in (33), where S is the complex frequency, defined as  $S = j2\pi f$  with  $j = \sqrt{-1}$  and f representing the frequency in Hertz.



FIGURE 22. Small-signal equivalent circuit of SDT for frequency response.

$$V_{o1} - V_{o2} = \left(\frac{A_1}{1 + \tau_1 S} + \frac{A_2}{1 + \tau_2 S}\right) V_{in}$$
(33)

By applying KCL at the input node and performing algebraic simplifications, we can express  $V_{in}$  in terms of  $I_{in}$ , which also reveals the input impedance of the SDT, as shown in the equation below:

$$Z_{in} = \frac{(1 + \tau_2 S) (1 + \tau_f S)}{G_0 \left(1 + \frac{(\tau_{in} + \tau_f + \tau_2)S}{(1 + A_f)} + \frac{(\tau_{in} (\tau_f + \tau_2) + \tau_f \tau_2)S^2}{(1 + A_f)} + \frac{\tau_{in} \tau_f \tau_2 S^3}{(1 + A_f)}\right)}$$
(34)

In this equation,  $G_0 = (g_{m1} + g_{o3})(1 + A_f)$ , which represents the gm-boosting technique, where the transconductance of  $M_1$  is increased due to using an active feedback path [47]. By inserting (34) into (33), while considering the gain and phase imbalance conditions, and performing some manipulation, we can derive (7).

## **APPENDIX B NOISE ANALYSIS**

In the noise analysis of our optical receiver, we primarily consider a zero-mean wide-sense stationary random process x(t) passing through a linear time-invariant (LTI) system characterized by an impulse response h(t). The autocorrelation of the output y(t) is given by [46]:

$$R_{y}(\tau) = h(-\tau) * h(\tau) * R_{x}(\tau)$$
(35)

Therefore, as a result of the Wiener-Khinchin theorem, the above equation can be expressed in terms of the power spectral density (PSD, $S_i(f)$) as follows:

$$S_{y}(f) = |H(f)|^{2} \times S_{x}(f)$$
 (36)

As previously mentioned, the random process has zero mean, allowing the output variance $(\sigma_y^2)$ to be expressed as shown in (37).

$$\sigma_{y}^{2} = R_{y}(0) = \int_{-\infty}^{\infty} |H(f)|^{2} \times S_{x}(f) df \qquad (37)$$

In our calculation, we primarily focus on white noise, which is frequency-independent. As a result, we only need to evaluate  $I = \int_{-\infty}^{\infty} |H(f)|^2 df$ . Consequently, we first need to calculate H(f) for each source of noise. By integrating and multiplying with the PSD of each associated noise source, we can obtain the variance of each noise source. Since these noise sources are statistically independent, their variances can be added together.

Now, the poles and zeros profile of H(f) determines the amount of noise that reaches the output. To simplify the noise calculation, we utilized the integral results summarized in (38), (39), and (40).

$$H_{1} = \frac{1}{\underbrace{1 + \frac{S}{\omega_{p}}}_{D_{2}(S)}}, \quad I = \frac{\omega_{p}}{4}$$
(38)

$$H_{2} = \frac{\overbrace{1 + \frac{S}{\omega_{z}}}^{N_{1}(S)}}{\underbrace{1 + \frac{S}{Q\omega_{0}} + \frac{S^{2}}{\omega_{0}^{2}}}_{D_{1}(S)}}, \quad I = \frac{Q\omega_{0}}{4} \left(1 + \frac{\omega_{0}^{2}}{\omega_{z}^{2}}\right)$$
(39)

$$H_{3} = \frac{N_{1}(S)}{D_{1}(S) \times D_{2}(S)}, I = \frac{Q\omega_{0}}{4} \left( \frac{1 + \frac{\omega_{0}}{Q\omega_{p}} + \left(\frac{\omega_{0}}{\omega_{z}}\right)^{2}}{1 + \frac{\omega_{0}}{Q\omega_{p}} + \left(\frac{\omega_{0}}{\omega_{p}}\right)^{2}} \right)$$
(40)

The calculations in this paper for first- and second-order transfer functions are similar to closed-form integrals computed in [48] and [49]. In contrast, our calculation provides a general form for the third-order system, which was not reported in the two previous references.
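The closed-form results in (38), (39), and (40) can be checked numerically. The Python sketch below assumes that $I$ is evaluated as a one-sided integral over $f \in [0, \infty)$ (consistent with the one-sided PSDs used here and with $I = \frac{\omega_p}{4}$ for a single pole), that $H_2$ has one zero at $\omega_z$ over a standard second-order denominator $1 + \frac{S}{Q\omega_0} + \frac{S^2}{\omega_0^2}$, and that $H_3$ adds a real pole at $\omega_p$. The normalized parameter values are hypothetical.

```python
import math


def one_sided_noise_integral(h, w_scale, n=100_000):
    """Approximate I = integral over f in [0, inf) of |H(j*2*pi*f)|^2 df.

    The substitution w = w_scale * tan(theta) maps the infinite range onto
    [0, pi/2); the midpoint rule is then applied in theta.
    """
    d_theta = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * d_theta
        w = w_scale * math.tan(theta)
        dw_dtheta = w_scale / math.cos(theta) ** 2
        total += abs(h(1j * w)) ** 2 * dw_dtheta * d_theta
    return total / (2.0 * math.pi)  # df = dw / (2*pi)


# Hypothetical normalized parameters (rad/s).
W0, Q, WZ, WP = 1.0, 0.8, 0.5, 3.0


def h1(s):
    """Single real pole, as in (38)."""
    return 1.0 / (1.0 + s / WP)


def h2(s):
    """One zero over a second-order denominator (assumed form of (39))."""
    return (1.0 + s / WZ) / (1.0 + s / (Q * W0) + (s / W0) ** 2)


def h3(s):
    """Third-order case: h2 with an additional real pole (assumed form of (40))."""
    return h2(s) * h1(s)


closed_1 = WP / 4.0
closed_2 = Q * W0 / 4.0 * (1.0 + (W0 / WZ) ** 2)
closed_3 = Q * W0 / 4.0 * (
    (1.0 + W0 / (Q * WP) + (W0 / WZ) ** 2)
    / (1.0 + W0 / (Q * WP) + (W0 / WP) ** 2)
)

num_1 = one_sided_noise_integral(h1, WP)
num_2 = one_sided_noise_integral(h2, W0)
num_3 = one_sided_noise_integral(h3, W0)

for label, num, closed in (("(38)", num_1, closed_1),
                           ("(39)", num_2, closed_2),
                           ("(40)", num_3, closed_3)):
    print(f"{label}: numeric = {num:.6f}, closed form = {closed:.6f}")
```

Multiplying any of these integrals by the one-sided PSD of a white noise source gives the corresponding output variance, as described by (37).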

As previously mentioned, channel thermal noise is the main contributor to noise in our receiver. It is typically represented as an equivalent current source between the drain and source terminals, as illustrated in Fig. 23. The one-sided PSD in the active region is approximately given by:

$$S_{xM}(f) = 4kT\gamma g_m, S_{xR}(f) = \frac{4kT}{R},$$
(41)

where $k = 1.38 \times 10^{-23}$ J/K denotes the Boltzmann constant, *T* represents the absolute temperature in Kelvin, and $\gamma$ is a coefficient set to $\frac{2}{3}$ for long-channel devices. In this equation, $S_{xM}(f)$ and $S_{xR}(f)$ refer to the one-sided PSDs of the MOS transistor and resistor, respectively.

The equivalent noise model of the SDT TIA circuit is depicted in Fig. 23, encompassing noise sources from all components. Thermal noise from resistors and MOSFETs is considered, while flicker noise has been neglected for



FIGURE 23. The noise equivalent circuit of the SDT.

simplicity.  $I_{n1}$  represents the noise from  $R_1$ ,  $I_{n2}$  corresponds to thermal noise from  $M_2$  and  $R_2$ , and  $I_{n3}$  is an equivalent noise current source representing the influence of  $M_3$ . The noise contribution of  $M_4$  and  $M_5$  is accounted for by  $I_{n5}$ , which first undergoes first-order filtering and then propagates to the differential output, similar to the noise from  $M_3$ . The current noise of  $M_3$  ( $I_{n3}$ ) follows the same path as the photodiode current, resulting in a similar transfer function. For  $M_4$  and  $M_5$ , the only difference is the multiplication by  $A_{cm} = \begin{pmatrix} g_{m3} \\ g_{m5} \\ g_{m5} \end{pmatrix}$ . The results are summarized in (42) and (43), respectively.

$$V_{on4} \simeq \frac{Z_0 \times (1 + 2A_f \tau_o S) \times I_{n4}}{\left(1 + \frac{(2A_f + 1)\tau_o}{(A_f + 1)}S + \frac{2A_f \tau_o^2}{(A_f + 1)}S^2\right)}$$
(42)

$$V_{on5} \simeq \frac{-A_{cm} \times Z_0 \times I_{n5}}{\left(1 + \frac{(2A_f + 1)\tau_o}{(A_f + 1)}S + \frac{2A_f\tau_o^2}{(A_f + 1)}S^2\right)}$$
(43)

In the calculation of the standard deviation (STD) of the noise, two assumptions were made. First, $\tau_{in}$ is negligible compared to $\tau_o$ and $\tau_f$. Second, to satisfy $Q = \frac{\sqrt{2}}{2}$, the two main time constants of the transimpedance amplifier (TIA) must be chosen such that $\tau_f = 2A_f \tau_o$. As a result, the transfer functions are expressed in terms of $A_f$ and $\tau_o$. For simplicity, we did not include $|H_x|^2$ in our initial calculations, but they are considered in the total noise integration. All calculations were adjusted to reveal the design trade-offs and to provide insight into the key system-level design parameters. Accurate results require a co-simulation.
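As a brief worked check of the second assumption, the common denominator of (42) and (43) can be matched against the standard second-order form $1 + \frac{S}{Q\omega_0} + \frac{S^2}{\omega_0^2}$. This is a sketch based only on the coefficients printed in (42) and (43):

$$\omega_0 = \frac{1}{\tau_o}\sqrt{\frac{A_f + 1}{2A_f}}, \qquad Q = \frac{\sqrt{2A_f\left(A_f + 1\right)}}{2A_f + 1}$$

For $A_f \gg 1$, $Q$ approaches $\frac{\sqrt{2}}{2}$; for example, $A_f = 10$ gives $Q \approx 0.706$. Under this reading, the condition $Q = \frac{\sqrt{2}}{2}$ holds approximately for large feedback gain rather than exactly.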

The noise current of  $R_1$  reaches the positive single-ended output  $V_{o1}$  through a first-order RC filter. In contrast, the noise current of  $M_1$  encounters two different paths with opposite signs, facilitating noise suppression across frequencies. The output voltages influenced by  $I_{n1}$  and  $I_{n2}$ generate  $V_{on1}$  and  $V_{on2}$ , as expressed in (44) and (45), respectively.

$$V_{on1} = \frac{R_1 \times I_{n1}}{(1 + \tau_o S)}$$
(44)

$$V_{on2} \simeq \frac{R_1 \times ((1 - A_f) + (2A_f + 1) \tau_o S + 2A_f \tau_o^2 S^2) \times I_{n2}}{\left( (A_f + 1) + (3A_f + 2) \tau_o S + (4A_f + 1) \tau_o^2 S^2 + 2A_f \tau_o^3 S^3 \right)}$$
(45)

It should be noted that in the noise calculation of  $M_1$ ,  $\frac{(A_1+A_2)}{(g_{m1}+g_{o3})} \simeq 2R_1$  is applied to obtain (45). Similarly, we also calculate the output noise affected by  $M_2$  and  $R_2$  as summarized in (46).

When dealing with fully differential stages such as the LAs, the differential pair, and the EA, a half-circuit can be used if the circuit is symmetrical. If the noise sources are uncorrelated, the total output variance is then multiplied by a factor of 2. Furthermore, the noise from the tail current sources appears as a common-mode signal when the differential pair is in equilibrium and is removed in the differential output. Even if the differential pair is not in equilibrium, the output noise generated by the tail current is minimal due to the symmetry [37].

The noise-equivalent circuit of the differential amplifier and LA is depicted in Fig. 24. In this circuit, the input is set to zero, and all sources of noise are highlighted as current sources affecting different nodes of the circuit. In Fig. 24a,  $I_{n1}$ ,  $I_{n2}$ , and  $I_{n3}$  represent the current noise of  $R_D$ ,  $M_6$ , and  $R_{eq}$ , respectively.



**FIGURE 24.** The noise equivalent circuit of the differential amplifier and LA.

Considering the design issues discussed in the context of the differential amplifier and Fig. 24a, the noise output voltage contributions from $R_D$, $M_6$, and $R_{eq}$ are expressed in (47) to (49). Analysis of the differential pair's noise transfer functions suggests that $R_{eq}$ is the main contributor, compared to $R_D$ and $M_6$.

$$V_{on1} = \frac{R_D I_{n1}}{(1 + \tau_{od} S)}$$
(47)

$$V_{on2} = -\frac{R_D I_{n2}}{2\left(1 + \frac{\tau_{od}}{2}S\right)}$$
(48)

$$V_{on3} = \frac{A_{dp}R_{eq}I_{n3}}{\left(1 + \frac{\tau_{eq}}{2}S\right)\left(1 + \tau_{od}S\right)}$$
(49)
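To compare the three noise paths, (47) to (49) can be evaluated directly as complex transfer functions. The following is a minimal sketch in which the function name and all component values are hypothetical. It returns only the transimpedance from each noise current to the output, so the noise current densities of $R_D$, $M_6$, and $R_{eq}$ still have to be applied separately to rank the contributors.

```python
import math

def diff_amp_noise_gains(f, r_d, r_eq, a_dp, tau_od, tau_eq):
    """Transfer functions V_on/I_n of (47), (48), and (49) at frequency f."""
    s = 1j * 2.0 * math.pi * f
    h1 = r_d / (1.0 + tau_od * s)                                        # (47), R_D
    h2 = -r_d / (2.0 * (1.0 + 0.5 * tau_od * s))                         # (48), M_6
    h3 = a_dp * r_eq / ((1.0 + 0.5 * tau_eq * s) * (1.0 + tau_od * s))   # (49), R_eq
    return h1, h2, h3

# Hypothetical example values (not taken from the paper)
R_D, R_EQ, A_DP = 500.0, 2.0e3, 4.0
TAU_OD, TAU_EQ = 100e-12, 200e-12

for freq in (0.0, 1.0e9, 2.45e9):
    h1, h2, h3 = diff_amp_noise_gains(freq, R_D, R_EQ, A_DP, TAU_OD, TAU_EQ)
    print(f"f = {freq:.3g} Hz: |H1| = {abs(h1):.1f}, |H2| = {abs(h2):.1f}, |H3| = {abs(h3):.1f} ohm")
```

At low frequency the three gains reduce to $R_D$, $R_D/2$, and $A_{dp}R_{eq}$, so the $R_{eq}$ path is weighted by the gain of the differential pair.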

The last stage of the noise analysis is the LA, depicted in Fig. 24b, where the noise of the tail current sources is neglected. In this circuit, the input is set to zero, and all noise sources are highlighted. The noise reaches the output through two different transfer functions: $R_{L1}$, $M_8$, and $M_{10}$ inject noise at the $V_{x1}$ node, while $R_{L2}$ and $M_{12}$ inject noise at the $V_{out}$ node. Their noise can therefore be represented by the current sources $I_{x1}$ and $I_{x2}$, as depicted in Fig. 24b.

As a result, we need to calculate two separate outputs: one when  $I_{x1}$  is applied and the other when  $I_{x2}$  is applied. Both transfer functions are computed in (50) and (51).

$$V_{outn1} = -\frac{A_{L2}R_{L1}I_{x1}}{(1 + \tau_{lx2}S)(1 + \tau_{lx1}S) + A_FA_{L2}}$$
(50)

$$V_{outn2} = -\frac{(1 + \tau_{lx1}S) R_{L2}I_{x2}}{(1 + \tau_{lx2}S) (1 + \tau_{lx1}S) + A_F A_{L2}}$$
(51)
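The two LA noise paths share the same denominator, which the short sketch below makes explicit by evaluating (50) and (51) for a noise current injected at the $V_{x1}$ node and at the $V_{out}$ node. The function name and all numeric values are hypothetical and serve only to exercise the expressions.

```python
import math

def la_noise_gains(f, r_l1, r_l2, a_l2, a_f, tau_lx1, tau_lx2):
    """Transfer functions V_outn1/I_x1 of (50) and V_outn2/I_x2 of (51)."""
    s = 1j * 2.0 * math.pi * f
    den = (1.0 + tau_lx2 * s) * (1.0 + tau_lx1 * s) + a_f * a_l2  # shared denominator
    h_x1 = -a_l2 * r_l1 / den                   # (50): noise injected at V_x1
    h_x2 = -(1.0 + tau_lx1 * s) * r_l2 / den    # (51): noise injected at V_out
    return h_x1, h_x2

# Hypothetical example values (not taken from the paper)
R_L1, R_L2 = 400.0, 300.0
A_L2, A_F = 3.0, 0.5
TAU_LX1, TAU_LX2 = 60e-12, 40e-12

for freq in (0.0, 1.0e9, 4.0e9):
    h_x1, h_x2 = la_noise_gains(freq, R_L1, R_L2, A_L2, A_F, TAU_LX1, TAU_LX2)
    print(f"f = {freq:.3g} Hz: |H_x1| = {abs(h_x1):.1f} ohm, |H_x2| = {abs(h_x2):.1f} ohm")
```

At DC both gains are reduced by the loop term $1 + A_F A_{L2}$, and the zero at $1/\tau_{lx1}$ in (51) lifts the $I_{x2}$ path relative to the $I_{x1}$ path at high frequency.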

## **APPENDIX C DC OFFSET CALCULATION**

To formulate the input offset of the EA, Kirchhoff's Voltage Law (KVL) should be applied at the input, as expressed in (52) (see Fig. 8d).

$$V_{if-EA} = V_{T17} - V_{T18} + \sqrt{\frac{2I_{D17}}{\beta_{17}}} - \sqrt{\frac{2I_{D18}}{\beta_{18}}}$$
(52)

In this context,  $I_{D17}$  and  $I_{D18}$  are the drain-source currents of  $M_{17}$  and  $M_{18}$ , respectively. The terms  $V_{T17}$  and  $V_{T18}$ represent the threshold voltages of  $M_{17}$  and  $M_{18}$ . The parameter  $\beta_{(n/p)x}$  is defined as  $\mu_{(n/p)}C_{ox}\left(\frac{W}{L}\right)_x$ . From Fig. 8d, we observe that  $I_{D17} = I_{D21} + I_{D23}$  and  $I_{D18} = I_{D24} + I_{D22}$ , while  $V_{SG21} = V_{SG24}$  and  $V_{SG22} = V_{SG23}$ .

The differences between two nominally matched circuit parameters are usually small in comparison to the absolute value of those parameters. This allows for a method where the separate contributions to offset voltage can be analyzed and then combined [36]. Similar to the approach in [36], for each parameter in the above equation, we define an average $X = \frac{X_1 + X_2}{2}$ and a difference $\Delta X = X_1 - X_2$. Based on these definitions, each pair of parameters can be rewritten as:

$$X_1 = X + \frac{\Delta X}{2}, X_2 = X - \frac{\Delta X}{2}$$
 (53)

When we apply  $V_{if-EA}$  at the input, we expect that  $V_{SG21} = V_{SG22}$  and also  $V_{SG23} = V_{SG24}$ , which leads to two important results summarized in (54) and (55), respectively.

$$\frac{\Delta I_{p1}}{I_{p1}} = -2\sqrt{\frac{\beta_{p1}}{2I_{p1}}}\Delta V_{Tp1} + \frac{\Delta\beta_{p1}}{\beta_{p1}}$$
(54)

$$\frac{\Delta I_{p2}}{I_{p2}} = -2\sqrt{\frac{\beta_{p2}}{2I_{p2}}}\Delta V_{Tp2} + \frac{\Delta\beta_{p2}}{\beta_{p2}}$$
(55)

Here,  $\Delta V_{Tp1}$  and  $\Delta V_{Tp2}$  represent the threshold voltage mismatches for the  $M_{21}$  and  $M_{22}$  pair, and the  $M_{23}$  and  $M_{24}$  pair, respectively, as defined by (53). The terms  $\frac{\Delta I_{p1}}{I_{p1}}$  and  $\frac{\Delta I_{p2}}{I_{p2}}$  characterize the mismatches in the drain current for the  $M_{21}$  and  $M_{22}$  pair, and the  $M_{23}$  and  $M_{24}$  pair, respectively, using (53).
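The first-order form of (54) and (55) can be checked against a direct calculation. The sketch below assumes an ideal square-law model, $I = \frac{\beta}{2}(V_{SG} - V_T)^2$ with $V_T$ taken as the threshold magnitude, which is a simplification and not necessarily the device model used in this work. Each parameter is split into average and difference as in (53), and all numeric values are hypothetical.

```python
import math

def exact_current_mismatch(beta, d_beta, v_t, d_vt, v_sg):
    """Relative current mismatch of two square-law devices sharing the same V_SG."""
    b1, b2 = beta + d_beta / 2.0, beta - d_beta / 2.0   # split as in (53)
    vt1, vt2 = v_t + d_vt / 2.0, v_t - d_vt / 2.0
    i1 = 0.5 * b1 * (v_sg - vt1) ** 2
    i2 = 0.5 * b2 * (v_sg - vt2) ** 2
    i_avg = 0.5 * (i1 + i2)
    return (i1 - i2) / i_avg, i_avg

def approx_current_mismatch(beta, d_beta, d_vt, i_avg):
    """First-order estimate in the form of (54) and (55)."""
    return -2.0 * math.sqrt(beta / (2.0 * i_avg)) * d_vt + d_beta / beta

# Hypothetical example values (not taken from the paper)
BETA, D_BETA = 1.0e-3, 1.0e-5   # A/V^2, 1 % beta mismatch
V_T, D_VT = 0.7, 2.0e-3         # V, 2 mV threshold mismatch
V_SG = 1.0                      # V

exact, i_avg = exact_current_mismatch(BETA, D_BETA, V_T, D_VT, V_SG)
approx = approx_current_mismatch(BETA, D_BETA, D_VT, i_avg)
print(f"exact: {exact:.6e}, first-order: {approx:.6e}")
```

For mismatches of this size the two results agree closely, which is what justifies treating the threshold and $\beta$ contributions as separate additive terms.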

When analyzing the offset of a differential amplifier, the element pairs that are mirror-symmetric about the vertical axis of the amplifier must be taken into account. The factors influencing the output current are described by (32). Finally, a first-order Taylor series approximation simplifies the resulting expression. In many cases involving CMOS differential amplifiers, we can rely on the estimation derived in (56), where $\frac{\Delta x}{x} \ll 1$ and $\frac{\Delta y}{y} \ll 1$.

$$\frac{\sqrt{1+\frac{\Delta x}{x}}}{\sqrt{1+\frac{\Delta y}{y}}} - \frac{\sqrt{1-\frac{\Delta x}{x}}}{\sqrt{1-\frac{\Delta y}{y}}} \simeq \left(\frac{\Delta x}{x} - \frac{\Delta y}{y}\right)$$
(56)
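The quality of the approximation in (56) is easy to check numerically. The sketch below compares the exact left-hand side with the first-order right-hand side for a few relative mismatches; the function names and the chosen test values are illustrative only.

```python
import math

def lhs_56(rx, ry):
    """Exact left-hand side of (56), with rx = dx/x and ry = dy/y."""
    return (math.sqrt(1.0 + rx) / math.sqrt(1.0 + ry)
            - math.sqrt(1.0 - rx) / math.sqrt(1.0 - ry))

def rhs_56(rx, ry):
    """First-order right-hand side of (56)."""
    return rx - ry

for rx, ry in [(0.01, 0.005), (0.05, 0.02), (0.2, 0.1)]:
    exact, approx = lhs_56(rx, ry), rhs_56(rx, ry)
    print(f"rx = {rx}, ry = {ry}: exact = {exact:.6f}, approx = {approx:.6f}")
```

Because the left-hand side is an odd function of the mismatches, the second-order terms cancel and the error grows only with the third power of the relative mismatch.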

By inserting (54) and (55) into (52), and using the definitions provided in (53), and applying the Taylor approximation in (56), we can derive (15).

#### REFERENCES

- [1] H. Park, Y.-U. Jeong, and S. Kim, "A 24-Gb/s/pin single-ended PAM-4 receiver with 1-Tap decision feedback equalizer using inverter-based summer for memory interfaces," *IEEE Access*, vol. 10, pp. 91888–91896, 2022.
- [2] X. Sun, C. H. Kang, M. Kong, O. Alkhazragi, Y. Guo, M. Ouhssain, Y. Weng, B. H. Jones, T. K. Ng, and B. S. Ooi, "A review on practical considerations and solutions in underwater wireless optical communication," *J. Lightw. Technol.*, vol. 38, no. 2, pp. 421–431, Jan. 15, 2020.
- [3] A. Giuglea, G. Belfiore, F. Protze, R. Henker, and F. Ellinger, "A high-speed 130-nm SiGe BiCMOS integrated duobinary driver for data rate capacity enhancement in VCSEL-based optical links," *IEEE Access*, vol. 12, pp. 28343–28352, 2024.
- [4] X. Li, H. Wang, J. Zhu, and C. P. Yue, "Dual-photodiode differential receivers achieving double photodetection area for gigabit-per-second optical wireless communication," *IEEE J. Solid-State Circuits*, vol. 58, no. 6, pp. 28343–28352, Jun. 2023.
- [5] H. A. Al-Mohammed, E. Yaacoub, K. Abualsaud, and S. A. Al-Maadeed, "Using quantum key distribution with free space optics to secure communications in high-speed trains," *IEEE Access*, vol. 12, pp. 1681–1692, 2024.
- [6] G. Dzialas, A. Fatemi, A. Peczek, L. Zimmermann, A. Malignaggi, and G. Kahmen, "A 56-Gb/s optical receiver with 2.08-μA noise monolithically integrated into a 250-nm SiGe BiCMOS technology," *IEEE Trans. Microw. Theory Techn.*, vol. 70, no. 1, pp. 392–401, Jan. 2022.
- [7] S.-J. Yang, J.-H. Lee, M.-J. Lee, and W.-Y. Choi, "A 20 Gb/s CMOS single-chip 850 nm optical receiver," *J. Lightw. Technol.*, vol. 42, no. 13, pp. 4525–4530, Jul. 1, 2024.
- [8] B. Fahs, J. Rollinson, and M. M. Hella, "A CMOS fully integrated 4×4 VLC-compliant receiver array with 2.5 Gb/s per channel," *IEEE Sensors J.*, vol. 23, no. 3, pp. 2375–2384, Feb. 2023.
- [9] Z. Lv, G. He, C. Qiu, Y. Fan, H. Wang, and Z. Liu, "CMOS monolithic photodetector with a built-in 2-dimensional light direction sensor for laser diode based underwater wireless optical communications," *Opt. Exp.*, vol. 29, no. 11, p. 16197, 2021.
- [10] T. Jukic, B. Steindl, R. Enne, and H. Zimmermann, "Monolithically integrated avalanche photodiode receiver in 0.35 μm bipolar complementary metal oxide semiconductor," *Opt. Eng.*, vol. 54, no. 11, Nov. 2015, Art. no. 110502.
- [11] T.-H. Hsu, J.-J. Jou, T.-T. Shih, and Y.-J. Hung, "Monolithic integration of photo-diode and transimpedance amplifier in CMOS 90 nm technology," in *Proc. 9th Int. Conf. Appl. Syst. Innov. (ICASI)*, Apr. 2023, pp. 247–249.
- [12] T. S. Kao, F. A. Musa, and A. C. Carusone, "A 5-Gbit/s CMOS optical receiver with integrated spatially modulated light detector and equalization," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 11, pp. 2844–2857, Nov. 2010.
- [13] Y. Dong and K. W. Martin, "A 4-Gbps POF receiver using linear equalizer with multi-shunt-shunt feedbacks in 65-nm CMOS," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 60, no. 10, pp. 617–621, Oct. 2013.
- [14] S.-H. Huang, W.-Z. Chen, Y.-W. Chang, and Y.-T. Huang, "A 10-Gb/s OEIC with meshed spatially-modulated photo detector in 0.18-µm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 46, no. 5, pp. 1158–1169, May 2011.

- [15] M.-J. Lee, J.-S. Youn, K.-Y. Park, and W.-Y. Choi, "A fully-integrated 12.5-Gb/s 850-nm CMOS optical receiver based on a spatially-modulated avalanche photodetector," *Opt. Exp.*, vol. 22, no. 3, pp. 2511–2518, 2014.
- [16] D. Lee, J. Han, G. Han, and S. M. Park, "An 8.5-Gb/s fully integrated CMOS optoelectronic receiver using slope-detection adaptive equalizer," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2861–2873, Dec. 2010.
- [17] J.-S. Youn, M.-J. Lee, K.-Y. Park, and W.-Y. Choi, "10-Gb/s 850-nm CMOS OEIC receiver with a silicon avalanche photodetector," *IEEE J. Quantum Electron.*, vol. 48, no. 2, pp. 229–236, Feb. 2012.
- [18] R. Swoboda, J. Knorr, and H. Zimmermann, "A 5-Gb/s OEIC with voltage-up-converter," *IEEE J. Solid-State Circuits*, vol. 40, no. 7, pp. 1521–1526, Jul. 2005.
- [19] R. Swoboda and H. Zimmermann, "11 Gb/s monolithically integrated silicon optical receiver for 850 nm wavelength," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2006, pp. 904–911.
- [20] S. Radovanovic, A.-J. Annema, and B. Nauta, "A 3-Gb/s optical detector in standard CMOS for 850-nm optical communication," *IEEE J. Solid-State Circuits*, vol. 40, no. 8, pp. 1706–1717, Aug. 2005.
- [21] F. Tavernier and M. S. J. Steyaert, "High-speed optical receivers with integrated photodiode in 130 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 10, pp. 2856–2867, Oct. 2009.
- [22] S. S. K. Poushi, B. Goll, K. Schneider-Hornstein, M. Hofbauer, and H. Zimmermann, "Area and bandwidth enhancement of an n<sup>+</sup>/p-well dot avalanche photodiode in 0.35 μm CMOS technology," *Sensors*, vol. 23, no. 7, pp. 3403–3418, 2023.
- [23] H.-Y. Jung, J.-M. Lee, and W.-Y. Choi, "A high-speed CMOS integrated optical receiver with an under-damped TIA," *IEEE Photon. Technol. Lett.*, vol. 27, no. 13, pp. 1367–1370, Jul. 1, 2015.
- [24] P. Brandl, S. Schidl, and H. Zimmermann, "PIN photodiode optoelectronic integrated receiver used for 3-Gb/s free-space optical communication," *IEEE J. Sel. Topics Quantum Electron.*, vol. 20, no. 6, pp. 391–400, Nov. 2014.
- [25] P. Brandl and H. Zimmermann, "3 Gbit/s optical receiver IC with high sensitivity and large integrated pin photodiode," *Electron. Lett.*, vol. 49, no. 8, pp. 552–554, Apr. 2013.
- [26] S. S. K. Poushi, B. Goll, K. Schneider-Hornstein, and H. Zimmermann, "Large active area, low capacitance multi-dot PIN photodiode in 0.35 μm CMOS technology," *IEEE Photon. J.*, vol. 16, no. 1, pp. 1–6, Feb. 2024.
- [27] B. Mesgari, H. Mahmoudi, and H. Zimmermann, "A single-to-differential transimpedance amplifier for low-noise and high-speed optical receivers," in *Proc. Austrochip Workshop Microelectron. (Austrochip)*, Oct. 2019, pp. 76–80.
- [28] P. Brandl, T. Jukic, R. Enne, K. Schneider-Hornstein, and H. Zimmermann, "Optical wireless APD receiver with high background-light immunity for increased communication distances," *IEEE J. Solid-State Circuits*, vol. 51, no. 7, pp. 1663–1673, Jul. 2016.
- [29] B. Nakhkoob and M. M. Hella, "A 4.7-Gb/s reconfigurable CMOS imaging optical receiver utilizing adaptive spectrum balancing equalizer," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 64, no. 1, pp. 182–194, Jan. 2017.
- [30] D. Milovančev, P. Brandl, T. Jukić, B. Steindl, N. Vokić, and H. Zimmermann, "Optical wireless APD receivers in 0.35 μm HV CMOS technology with large detection area," *Opt. Exp.*, vol. 27, no. 9, pp. 11930–11945, 2019.
- [31] Y. Dong and K. W. Martin, "A high-speed fully-integrated POF receiver with large-area photo detectors in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 2080–2092, Sep. 2012.
- [32] E. Säckinger, Analysis and Design of Transimpedance Amplifiers for Optical Receivers. Hoboken, NJ, USA: Wiley, 2017.
- [33] B. Razavi, Design of Integrated Circuits for Optical Communications. Hoboken, NJ, USA: Wiley, 2012.
- [34] S. M. Park and H.-J. Yoo, "1.25-Gb/s regulated cascode CMOS transimpedance amplifier for gigabit Ethernet applications," *IEEE J. Solid-State Circuits*, vol. 39, no. 1, pp. 112–121, Jan. 2004.
- [35] S. Galal and B. Razavi, "10-Gb/s limiting amplifier and laser/modulator driver in 0.18-µm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2138–2146, Dec. 2003.
- [36] P. R. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and Design of Analog Integrated Circuits. Berkeley, CA, USA: Wiley, 2009.
- [37] B. Razavi, Design of Analog CMOS Integrated Circuits, 2nd ed., Los Angeles, CA, USA: McGraw-Hill, 2016.
- [38] R. Dehghani, Design of CMOS Operational Amplifiers. Isfahan, Iran: Artech House, 2013.

- [39] J. Proesel, C. Schow, and A. Rylyakov, "25Gb/s 3.6pJ/b and 15Gb/s 1.37pJ/b VCSEL-based optical links in 90nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf.*, Feb. 2012, pp. 418–420.
- [40] E. Säckinger, Broadband Circuits for Optical Fiber Communication. Hoboken, NJ, USA: Wiley, 2005.
- [41] B. Mesgari, N. Vokic, B. Goll, B. Pichler, D. Milovancev, K. Schneider-Hornstein, H. Arthaber, and H. Zimmermann, "38.5 Gb/s RoF based optical receiver for 5G mobile remote radio head applications," in *Proc. Austrochip Workshop Microelectron. (Austrochip)*, Oct. 2020, pp. 66–70.
- [42] "HFAN-03.0.2: Optical receiver performance evaluation," Resour. Library. [Online]. Available: https://www.analog.com/
- [43] D. Abdelrahman and M. Atef, "Accurate characterization for continuous-time linear equalization in CMOS optical receivers," *IEEE Access*, vol. 10, pp. 129019–129028, 2022.
- [44] B. Radi, D. Abdelrahman, O. Liboiron-Ladouceur, G. Cowan, and T. C. Carusone, "Optimal optical receivers in nanoscale CMOS: A tutorial," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 69, no. 6, pp. 2604–2609, Jun. 2022.
- [45] S. F. Piraghaj and S. Saeedi, "Analysis of timing accuracy and sensitivity in a RF correlation-based impulse radio receiver with phase interpolation for data synchronization," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 66, no. 7, pp. 2749–2762, Jul. 2019.
- [46] H. Darabi, Radio Frequency Integrated Circuits and Systems. Cambridge, U.K.: Cambridge Univ. Press, 2020.
- [47] B. Abdollahi, B. Mesgari, S. Saeedi, E. Roshanshomal, A. Nabavi, and H. Zimmermann, "Transconductance boosting technique for bandwidth extension in low-voltage and low-noise optical TIAs," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 69, no. 3, pp. 834–838, Mar. 2022.
- [48] A. Dastgheib and B. Murmann, "Calculation of total integrated noise in analog circuits," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 55, no. 10, pp. 2988–2993, Nov. 2008.
- [49] M. de Medeiros Silva and L. B. Oliveira, "Regulated common-gate transimpedance amplifier designed to operate with a silicon photomultiplier at the input," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 61, no. 3, pp. 725–735, Mar. 2014.



**BASET MESGARI** received the M.Sc. degree in electrical engineering from Tarbiat Modares University, Tehran, Iran, in 2015. He is currently pursuing the Ph.D. degree in electrical engineering with Vienna University of Technology (TU-Wien), where his research centers on the analysis and design of millimeter-wave receivers for beamforming applications. Recently, he was awarded the Austrian Marshall Plan Fellowship, enabling a research visit with the University of California, Davis (UC Davis), focusing on "Energy-Efficient, Ultra-Wideband Beam Steering for Beyond 5G Transceivers." In addition to his Ph.D. studies, he has been actively involved as a Research Fellow with TU-Wien, contributing to various IC-design projects, including analog-mixed mode/RF and optoelectronic transceiver design. His research interests include the development of linear and nonlinear low-noise radio-frequency and millimeter-wave building blocks, high-speed, highly sensitive, and low-power optoelectronics.



**SEYED SAMAN KOHNEH POUSHI** (Member, IEEE) received the M.Sc. degree in electrical engineering from Tarbiat Modares University, Tehran, Iran, in 2013, and the Ph.D. degree in CMOS integrated optical sensors from TU Wien, in March 2024. In 2019, he joined the integrated circuit group with TU Wien as a Project Assistant. In February 2024, he was with Silicon Austria Laboratories as a Scientist in electronic sensors. His research primarily revolves around the design and characterization of CMOS-based photodiodes and the development of ASIC readout circuits for signal processing.



**HORST ZIMMERMANN** received the Dr.-Ing. degree, in 1991. He was then an Alexander-von-Humboldt Research Fellow with Duke University, Durham, N.C. working on diffusion in Si, GaAs, and InP. In 1993, he joined Kiel University working on optoelectronic integration. Since 2000, he has been a Professor of circuit engineering with TU Wien working on (Bi) CMOS analog and optoelectronic full-custom integrated circuits. He is the author of two Springer books and one IOP book, the co-author of five more Springer books, the co-author of another IOP book, and the co-author of more than 600 publications on integrated photodiodes and integrated circuits.