Rollins, J.G., Bendix, P. “Computer Software for Circuit Analysis and Design” The Electrical Engineering Handbook Ed. Richard C. Dorf Boca Raton: CRC Press LLC, 2000 1 13 Computer Software for Circuit Analysis and Design 13.1 Analog Circuit Simulation Introduction ? DC (Steady-State) Analysis ? AC Analysis ? Transient Analysis ? Process and Device Simulation ? Process Simulation ? Device Simulation ? Appendix 13.2 Parameter Extraction for Analog Circuit Simulation Introduction ? MOS DC Models ? BSIM Extraction Strategy in Detail 13.1 Analog Circuit Simulation J. Gregory Rollins Introduction Computer-aided simulation is a powerful aid during the design or analysis of electronic circuits and semicon- ductor devices. The first part of this chapter focuses on analog circuit simulation. The second part covers simulations of semiconductor processing and devices. While the main emphasis is on analog circuits, the same simulation techniques may, of course, be applied to digital circuits (which are, after all, composed of analog circuits). The main limitation will be the size of these circuits because the techniques presented here provide a very detailed analysis of the circuit in question and, therefore, would be too costly in terms of computer resources to analyze a large digital system. The most widely known and used circuit simulation program is SPICE (simulation program with integrated circuit emphasis). This program was first written at the University of California at Berkeley by Laurence Nagel in 1975. Research in the area of circuit simulation is ongoing at many universities and industrial sites. Com- mercial versions of SPICE or related programs are available on a wide variety of computing platforms, from small personal computers to large mainframes. A list of some commercial simulator vendors can be found in the Appendix. It is possible to simulate virtually any type of circuit using a program like SPICE. The programs have built- in elements for resistors, capacitors, inductors, dependent and independent voltage and current sources, diodes, MOSFETs, JFETs, BJTs, transmission lines, transformers, and even transformers with saturating cores in some versions. Found in commercial versions are libraries of standard components which have all necessary 1 The material in this chapter was previously published by CRC Press in The Circuits and Filters Handbook, Wai-Kai Chen, Ed., 1995. J. Gregory Rollins Technology Modeling Associates, Inc. Peter Bendix LSI Logic Corp. 1 ? 2000 by CRC Press LLC parameters prefitted to typical specifications. These libraries include items such as discrete transistors, op amps, phase-locked loops, voltage regulators, logic integrated circuits (ICs) and saturating transformer cores. Computer-aided circuit simulation is now considered an essential step in the design of integrated circuits, because without simulation the number of “trial runs” necessary to produce a working IC would greatly increase the cost of the IC. Simulation provides other advantages, however: ?The ability to measure “inaccessible” voltages and currents. Because a mathematical model is used all voltages and currents are available. No loading problems are associated with placing a voltmeter or oscilloscope in the middle of the circuit, with measuring difficult one-shot wave forms, or probing a microscopic die. ?Mathematically ideal elements are available. Creating an ideal voltage or current source is trivial with a simulator, but impossible in the laboratory. In addition, all component values are exact and no parasitic elements exist. ?It is easy to change the values of components or the configuration of the circuit. Unsoldering leads or redesigning IC masks are unnecessary. Unfortunately, computer-aided simulation has its own problems: ?Real circuits are distributed systems, not the “lumped element models” which are assumed by simulators. Real circuits, therefore, have resistive, capacitive, and inductive parasitic elements present besides the intended components. In high-speed circuits these parasitic elements are often the dominant perfor- mance-limiting elements in the circuit, and must be painstakingly modeled. ?Suitable predefined numerical models have not yet been developed for certain types of devices or electrical phenomena. The software user may be required, therefore, to create his or her own models out of other models which are available in the simulator. (An example is the solid-state thyristor which may be created from a NPN and PNP bipolar transistor.) ?The numerical methods used may place constraints on the form of the model equations used. The following sections consider the three primary simulation modes: DC, AC, and transient analysis. In each section an overview is given of the numerical techniques used. Some examples are then given, followed by a brief discussion of common pitfalls. DC (Steady-State) Analysis DC analysis calculates the state of a circuit with fixed (non-time varying) inputs after an infinite period of time. DC analysis is useful to determine the operating point (Q-point) of a circuit, power consumption, regulation and output voltage of power supplies, transfer functions, noise margin and fanout in logic gates, and many other types of analysis. In addition DC analysis is used to find the starting point for AC and transient analysis. To perform the analysis the simulator performs the following steps: 1.All capacitors are removed from the circuit (replaced with opens). 2.All inductors are replaced with shorts. 3.Modified nodal analysis is used to construct the nonlinear circuit equations. This results in one equation for each circuit node plus one equation for each voltage source. Modified nodal analysis is used rather than standard nodal analysis because an ideal voltage source or inductance cannot be represented using normal nodal analysis. To represent the voltage sources, loop equations (one for each voltage source or inductor), are included as well as the standard node equations. The node voltages and voltage source currents, then, represent the quantities which are solved for. These form a vector x. The circuit equations can also be represented as a vector F(x) = 0. 4.Because the equations are nonlinear, Newton’s method (or a variant thereof) is then used to solve the equations. Example 13.1. Simulation Voltage Regulator: We shall now consider simulation of the type 723 voltage regulator IC, shown in Fig. 13.1. We wish to simulate the IC and calculate the sensitivity of the output IV ? 2000 by CRC Press LLC characteristic and verify that the output current follows a “fold-back” type characteristic under overload conditions. The IC itself contains a voltage reference source and operational amplifier. Simple models for these elements are used here rather than representing them in their full form, using transistors, to illustrate model development. The use of simplified models can also greatly reduce the simulation effort. (For example, the simple op amp used here requires only eight nodes and ten components, yet realizes many advanced features.) Note in Fig. 13.1 that the numbers next to the wires represent the circuit nodes. These numbers are used to describe the circuit to the simulator. In most SPICE-type simulators the nodes are represented by numbers, with the ground node being node zero. Referring to Fig. 13.2, the 723 regulator and its internal op amp are represented by subcircuits. Each subcircuit has its own set of nodes and components. Subcircuits are useful for encapsulating sections of a circuit or when a certain section needs to be used repeatedly (see next section). The following properties are modeled in the op amp: 1.Common mode gain 2.Differential mode gain 3.Input impedance 4.Output impedance 5.Dominant pole 6.Output voltage clipping The input terminals of the op amp connect to a “T” resistance network, which sets the common and differential mode input resistance. Therefore, the common mode resistance is RCM + RDIF = 1.1E6 and the differential mode resistance is RDIF1 + RDIF2 = 2.0E5. Dependent current sources are used to create the main gain elements. Because these sources force current into a 1-W resistor, the voltage gain is Gm*R at low frequency. In the differential mode this gives (GDIF*R1 = 100). In the common mode this gives (GCM*R1*(RCM/(RDIF1 + RCM = 0.0909). The two diodes D1 and D2 implement clipping by preventing the voltage at node 6 from exceeding VCC or going below VEE. The diodes are made “ideal” by reducing the ideality factor n. Note that the diode current is I d = I s [exp(V d /(nV t )) – 1], where V t is the thermal voltage (0.026 V). Thus, reducing n makes the diode turn on at a lower voltage. A single pole is created by placing a capacitor (C1) in parallel with resistor R1. The pole frequency is therefore given by 1.0/(2*p*R1*C1). Finally, the output is driven by the voltage-controlled voltage source E1 (which has a voltage gain of unity), through the output resistor R4. The output resistance of the op amp is therefore equal to R4. To observe the output voltage as a function of resistance, the regulator is loaded with a voltage source (VOUT) and the voltage source is swept from 0.05 to 6.0 V. A plot of output voltage vs. resistance can then be obtained FIGURE 13.1 Regulator circuit to be used for DC analysis, created using PSPICE. ? 2000 by CRC Press LLC by plotting VOUT vs. VOUT/I(VOUT) (using PROBE in this case; see Fig. 13.3). Note that for this circuit, eventhough a current source would seem a more natural choice, a voltage source must be used as a load rather than a current source because the output characteristic curve is multivalued in current. If a current source were used it would not be possible to easily simulate the entire curve. Of course, many other interesting quantities can be plotted; for example, the power dissipated in the pass transistor can be approximated by plotting IC(Q3)*VC(Q3). For these simulations PSPICE was used running on an IBM PC. The simulation took < 1 min of CPU time. Pitfalls. Convergence problems are sometimes experienced if “difficult” bias conditions are created. An exam- ple of such a condition is if a diode is placed in the circuit backwards, resulting in a large forward bias voltage, SPICE will have trouble resolving the current. Another difficult case is if a current source were used instead of FIGURE 13.2 SPICE input listing of regulator circuit shown in Fig. 13.1. FIGURE 13.3 Output characteristics of regulator circuit using PSPICE. ? 2000 by CRC Press LLC a voltage to bias the output in the previous example. If the user then tried to increase the output current above 10 A, SPICE would not be able to converge because the regulator will not allow such a large current. AC Analysis Ac analysis uses phasor analysis to calculate the frequency response of a circuit. The analysis is useful for calculating the gain. 3 dB frequency input and output impedance, and noise of a circuit as a function of frequency, bias conditions, temperature, etc. Numerical Method 1.A DC solution is performed to calculate the Q-point for the circuit. 2.A linearized circuit is constructed at the Q point. To do this, all nonlinear elements are replaced by their linearized equivalents. For example, a nonlinear current source I = aV 1 2 + bV 2 3 would be replaced by a linear voltage controlled current source I = V 1 (2aV 1q ) + V 2 (3bV 2q 2 ). 3.All inductors and capacitors are replaced by complex impedances, and conductances evaluated at the frequency of interest. 4.Nodal analysis is now used to reduce the circuit to a linear algebraic complex matrix. The AC node voltages may now be found by applying an excitation vector (which represents the independent voltage and current sources) and using Gaussian elimination (with complex arithmetic) to calculate the node voltages. AC analysis does have limitations and the following types of nonlinear or large signal problems cannot be modeled: 1.Distortion due to nonlinearities such as clipping, etc. 2.Slew rate-limiting effects 3.Analog mixers 4.Oscillators Noise analysis is performed by including noise sources in the models. Typical noise sources include thermal noise in resistors I n 2 = 4kT Df /R, and shot I n 2 = 2qI d Df , and flicker noise in semiconductor devices. Here, T is temperature in Kelvins, k is Boltzmann’s constant, and Df is the bandwidth of the circuit. These noise sources are inserted as independent current sources, In j (f ) into the AC model. The resulting current due to the noise source is then calculated at a user-specified summation node(s) by multiplying by the gain function between the noise source and the summation node A js (f ). This procedure is repeated for each noise source and then the contributions at the reference node are root mean squared (RMS) summed to give the total noise at the reference node. The equivalent input noise is then easily calculated from the transfer function between the circuit input and the reference node A is (f ). The equation describing the input noise is therefore: Example 13.2. Cascode Amplifier with Macro Models: Here, we find the gain, bandwidth, input impedance, and output noise of a cascode amplifier. The circuit for the amplifier is shown in Fig. 13.5. The circuit is assumed to be fabricated in a monolithic IC process, so it will be necessary to consider some of the parasitics of the IC process. A cross-section of a typical IC bipolar transistor is shown in Fig. 13.4 along with some of the parasitic elements. These parasitic elements are easily included in the amplifier by creating a “macro model” for each transistor. The macro model is then implemented in SPICE form using subcircuits. The input to the circuit is a voltage source (VIN), applied differentially to the amplifier. The output will be taken differentially across the collectors of the two upper transistors at nodes 2 and 3. The input impedance of the amplifier can be calculated as VIN/I(VIN) or because VIN = 1.0 just as 1/I(VIN). These quantities are shown plotted using PROBE in Fig. 13.6. It can be seen that the gain of the amplifier falls off at high frequency I Af A fInf i is js j j = () () () [] ? 1 2 ? 2000 by CRC Press LLC FIGURE 13.4 BJT cross-section with macro model elements. FIGURE 13.5 Cascode amplifier for AC analysis, created using PSPICE. FIGURE 13.6 Gain and input impedance of cascode amplifier. ? 2000 by CRC Press LLC as expected. The input impedance also drops because parasitic capacitances shunt the input. This example took <1 min on an IBM PC. Pitfalls. Many novice users will forget that AC analysis is a linear analysis. They will, for example, apply a 1-V signal to an amplifier with 5-V power supplies and a gain of 1000 and be surprised when SPICE tells them that the output voltage is 1000 V. Of course, the voltage generated in a simple amplifier must be less than the power supply voltage, but to examine such clipping effects, transient analysis must be used. Likewise, selection of a proper Q point is important. If the amplifier is biased in a saturated portion of its response and AC analysis is performed, the gain reported will be much smaller than the actual large signal gain. Transient Analysis Transient analysis is the most powerful analysis capability of a simulator because the transient response is so hard to calculate analytically. Transient analysis can be used for many types of analysis, such as switching speed, distortion, basic operation of certain circuits like switching power supplies. Transient analysis is also the most CPU intensive and can require 100 or 1000 times the CPU time as a DC or AC analysis. Numerical Method In a transient analysis time is discretized into intervals called time steps. Typically the time steps are of unequal length, with the smallest steps being taken during portions of the analysis when the circuit voltages and currents are changing most rapidly. The capacitors and inductors in the circuit are then replaced by voltage and current sources based on the following procedure. The current in a capacitor is given by I c = CdV c /dt. The time derivative can be approximated by a difference equation: In this equation the superscript k represents the number of the time step. Here, k is the time step we are presently solving for and (k – 1) is the previous time step. This equation can be solved to give the capacitor current at the present time step. Here, Dt = t k – t k–1 , or the length of the time step. As time steps are advanced, V c k–1 ? V c k ; I c k–1 ? I c k . Note that the second two terms on the right hand side of the above equation are dependent only on the capacitor voltage and current from the previous time step, and are therefore fixed constants as far as the present step is concerned. The first term is effectively a conductance (g = 2C/Dt) multiplied by the capacitor voltage, and the second two terms could be represented by an independent current source. The entire transient model for the capacitor therefore consists of a conductance in parallel with two current sources (the numerical values of these are, of course, different at each time step). Once the capacitors and inductors have been replaced as indicated, the normal method of DC analysis is used. One complete DC analysis must be performed for each time point. This is the reason that transient analysis is so CPU intensive. The method outlined here is the trapezoidal time integration method and is used as the default in SPICE. Example 13.3. Phase-Locked Loop Circuit: Figure 13.7 shows the phase-locked loop circuit. The phase detector and voltage-controlled oscillator are modeled in separate subcircuits. Examine the VCO subcircuit and note the PULSE-type current source ISTART connected across the capacitor. The source gives a current pulse 03.E-6 s wide at the start of the simulation to start the VCO running. To start a transient simulation SPICE first computes a DC operating point (to find the initial voltages V c k–1 on the capacitors). As this DC point is a valid, although not necessarily stable, solution, an oscillator will remain at this point indefinitely unless some perturbation is applied to start the oscillations. Remember, this is an ideal mathematical model and no noise II C VV tt c k c k c k c k kk += - - - - - 1 1 1 2 IVCtVCtI c k c k c k c k = ( ) - ( ) - -- 22 11 DD. ? 2000 by CRC Press LLC sources or asymmetries exist that would start a real oscillator—it must be done manually. The capacitor C1 would have to be placed off-chip, and bond pad capacitance (CPAD1 and CPAD2) have been included at the capacitor nodes. Including the pad capacitances is very important if a small capacitor C1 is used for high- frequency operation. In this example, the PLL is to be used as a FM detector circuit and the FM signal is applied to the input using a single frequency FM voltage source. The carrier frequency is 600 kHz and the modulation frequency is 60 kHz. Figure 13.8 shows the input voltage and the output voltage of the PLL at the VCO output and at the phase detector output. It can be seen that after a brief starting transient, the PLL locks onto the input signal FIGURE 13.7 Phase-locked loop circuit for transient analysis, created with PSPICE. FIGURE 13.8 Transient analysis results of PLL circuit, created using PSPICE. ? 2000 by CRC Press LLC and that the phase detector output has a strong 60-kHz component. This example took 251 s on a Sun SPARC-2 workstation (3046 time steps, with an average of 5 Newton iterations per time step). Pitfalls. Occasionally SPICE will fail and give the message “Timestep too small in transient analysis”, which means that the process of Newton iterations at certain time steps could not be made to converge. One of the most common causes of this is the specification of a capacitor with a value that is much too large, for example, specifying a 1-F capacitor instead of a 1 pF capacitor (an easy mistake to make by not adding the “p” in the value specification). Unfortunately, we usually have no way to tell which capacitor is at fault from the type of failure generated other than to manually search the input deck. Other transient failures are caused by MOSFET models. Some models contain discontinuous capacitances (with respect to voltage) and others do not conserve charge. These models can vary from version to version so it is best to check the user’s guide. Process and Device Simulation Process and devices simulation are the steps that precede analog circuit simulation in the overall simulation flow (see Fig. 13.9). The simulators are also different in that they are not measurement driven as are analog circuit simulators. The input to a process simulator is the sequence of process steps performed (times, temper- atures, gas concentrations) as well as the mask dimensions. The output from the process simulator is a detailed description of the solid-state device (doping profiles, oxide thickness, junction depths, etc.). The input to the device simulator is the detailed description generated by the process simulator (or via measurement). The output of the device simulator is the electrical characteristics of the device (IV curves, capacitances, switching transient curves). Process and device simulation are becoming increasingly important and widely used during the integrated circuit design process. A number of reasons exist for this: ?As device dimensions shrink, second-order effects can become dominant. Modeling of these effects is difficult using analytical models. ?Computers have greatly improved, allowing time-consuming calculations to be performed in a reasonable amount of time. ?Simulation allows access to impossible to measure physical characteristics. ?Analytic models are not available for certain devices, for example, thyristors, heterojunction devices and IGBTS. ?Analytic models have not been developed for certain physical phenomena, for example, single event upset, hot electron aging effects, latchup, and snap-back. ?Simulation runs can be used to replace split lot runs. As the cost to fabricate test devices increases, this advantage becomes more important. ?Simulation can be used to help device, process, and circuit designers understand how their devices and processes work. Clearly, process and device simulation is a topic which can be and has been the topic of entire texts. The following sections attempt to provide an introduction to this type of simulation, give several examples showing what the simulations can accomplish, and provide references to additional sources of information. FIGURE 13.9 Data flow for complete process-device-circuit modeling. ? 2000 by CRC Press LLC Process Simulation Integrated circuit processing involves a number of steps which are designed to deposit (deposition, ion implan- tation), remove (etching), redistribute (diffusion), or transform (oxidation) the material of which the IC is made. Most process simulation work has been in the areas of diffusion, oxidation, and ion implantation; however, programs are available that can simulate the exposure and development of photo-resist, the associated optical systems, as well as gas and liquid phase deposition and etch. In the following section a very brief discussion of the governing equations used in SUPREM (from Stanford University, California) will be given along with the results of an example simulation showing the power of the simulator. Diffusion The main equation governing the movement of electrically charged impurities (acceptors in this case) in the crystal is the diffusion equation: Here, C is the concentration (#/cm 3 ) of impurities, C a is the number of electrically active impurities (#/cm 3 ), q is the electron charge, k is Boltzmann’s constant, T is temperature in degrees Kelvin, D is the diffusion constant, and E is the built-in electric field. The built-in electric field E in (V/cm) can be found from: In this equation n is the electron concentration (#/cm 3 ), which in turn can be calculated from the number of electrically active impurities (C a ). The diffusion constant (D) is dependent on many factors. In silicon the following expression is commonly used: The four D components represent the different possible charge states for the impurity: (x) neutral, (+) positive, (–) negative, (=) doubly negatively charged. n i is the intrinsic carrier concentration, which depends only on temperature. Each D component is in turn given by an expression of the type Here, A and B are experimentally determined constants, different for each type of impurity (x, +, –, =). B is the activation energy for the process. This expression derives from the Maxwellian distribution of particle energies and will be seen many times in process simulation. It is easily seen that the diffusion process is strongly influenced by temperature. The term F IV is an enhancement factor which is dependent on the concentration of interstitials and vacancies within the crystal lattice (an interstitial is an extra silicon atom which is not located on a regular lattice site; a vacancy is a missing silicon atom which results in an empty lattice site) F IV μ C I + C v . The concentration of vacancies, C v , and interstitials, C I , are in turn determined by their own diffusion equation: ? ? C t DC DqC kT a =?× ? - ? è ? ? ? ÷ E E =- ? kT qn n 1 DFD D n n D n n D n n IV x i ii =+++ ? è ? ? ? ÷ ? è ? ? ? ? ÷ ÷ +-= 2 DA B kT =- ? è ? ? ? ÷ exp ? ? C t DCRG v VV =+?× ×? - + ? 2000 by CRC Press LLC In this equation D V is another diffusion constant of the form A exp(–B/kT). R and G represent the recom- bination and generation of vacancies and interstitials. Note that an interstitial and a vacancy may recombine and in the process destroy each other, or an interstitial and a vacancy pair may be simultaneously generated by knocking a silicon atom off its lattice site. Recombination can occur anywhere in the device via a bulk recombination process R = A(C V C 1 )exp(–B/kT). Generation occurs where there is damage to the crystal structure, in particular at interfaces where oxide is being grown or in regions where ion implantation has occurred, as the high-energy ions can knock silicon atoms off their lattice sites. Oxidation Oxidation is a process whereby silicon reacts with oxygen (or with water) to form new silicon dioxide. Con- servation of the oxidant requires the following equation: Here, F is the flux of oxidant (#/cm 2 /s), N is the number of oxidant atoms required to make up a cubic centimeter of oxide, and dy/dt is the velocity with which the Si-SiO 2 interface moves into the silicon. In general the greater the concentration of oxidant (C 0 ), the faster the growth of the oxide and the greater the flux of oxidant needed at the Si-SiO 2 interface. Thus, F = k s C 0 . The flux of oxidant into the oxide from the gaseous environment is given by: F = h(HP ox – C 0 ) Here H is a constant, P is the partial pressure of oxygen in the gas, and C 0 is the concentration of oxidant in the oxide at the surface and h is of the form A exp(–B/kT). Finally, the movement of the oxidant within the already existing oxide is governed by diffusion: F = D 0 ?C . When all these equations are combined, it is found that (in the one-dimensional case) oxides grow linearly dy/dt μ t when the oxide is thin and the oxidant can move easily through the existing oxide. As the oxide grows thicker dy/dt μ because the movement of the oxidant through the existing oxide becomes the rate-limiting step. Modeling two-dimensional oxidation is a challenging task. The newly created oxide must “flow” away from the interface where it is begin generated. This flow of oxide is similar to the flow of a very thick or viscous liquid and can be modeled by a creeping flow equation: ? 2 V μ ?P ? · V = 0 V is the velocity at which the oxide is moving and P is the hydrostatic pressure. The second equation results form the incompressibility of the oxide. The varying pressure P within the oxide leads to mechanical stress, and the oxidant diffusion constant D 0 and the oxide growth rate constant k s are both dependent on this stress. The oxidant flow and the oxide flow are therefore coupled because the oxide flow depends on the rate at which oxide is generated at the interface and the rate at which the new oxide is generated depends on the availability of oxidant, which is controlled by the mechanical stress. Ion Implantation Ion implantation is normally modeled in one of two ways. The first involves tables of moments of the final distribution of the ions which are typically generated by experiment. These tables are dependent on the energy and the type of ion being implanted. The second method involves Monte-Carlo simulation of the implantation process. In Monte-Carlo simulation the trajectories of individual ions are followed as they interact with (bounce off) the silicon atoms in the lattice. The trajectories of the ions, and the recoiling Si atoms (which can strike more Si atoms) are followed until all come to rest within the lattice. Typically several thousand trajectories are dy dt N = F t ? 2000 by CRC Press LLC simulated (each will be different due to the random probabilities used in the Monte-Carlo method) to build up the final distribution of implanted ions. Process simulation is always done in the transient mode using time steps as was done with transient circuit simulation. Because partial differential equations are involved, rather than ordinary differential equations, spatial discretization is needed as well. To numerically solve the problem, the differential equations are dis- cretized on a grid. Either rectangular or triangular grids in one, two, or three dimensions are commonly used. This discretization process results in the conversion of the partial differential equations into a set of nonlinear algebraic equations. The nonlinear equations are then solved using a Newton method in a way very similar to the method used for the circuit equations in SPICE. Example 13.4. NMOS Transistor: In this example the process steps used to fabricate a typical NMOS tran- sistor will be simulated using SUPREM-4. These steps are 1.Grow initial oxide (30 min at 1000 K) 2.Deposit nitride layer (a nitride layer will prevent oxidation of the underlying silicon) 3.Etch holes in nitride layer 4.Implant P+ channel stop (boron dose = 5e12, energy = 50 keV) 5.Grow the field oxide (180 min at 1000 K wet O 2 ) 6.Remove all nitride 7.Perform P channel implant (boron dose = 1e11, energy = 40 keV) 8.Deposit and etch polysilicon for gate 9.Oxidize the polysilicon (30 min at 1000 K, dry O 2 ) 10.Implant the light doped drain (arsenic dose = 5e13 energy = 50 keV) 11.Deposit sidewall space oxide 12.Implant source and drain (arsenic, dose = 1e15, energy = 200 keV) 13.Deposit oxide layer and etch contact holes 14.Deposit and etch metal The top 4 mm of the completed structure, as generated by SUPREM-4, is shown in Fig. 13.10. The actual simulation structure used is 200 mm deep to allow correct modeling of the diffusion of the vacancies and interstitials. The gate is at the center of the device. Notice how the edges of the gate have lifted up due to the diffusion of oxidant under the edges of the polysilicon (the polysilicon, as deposited in step 8, is flat). The dashed contours show the concentration of dopants in both the oxide and silicon layers. The short dashes FIGURE 13.10Complete NMOS transistor cross section generated by process simulation, created with TMA SUPREM-4. ? 2000 by CRC Press LLC indicate N-type material, while the longer dashes indicate P-type material. This entire simulation requires about 30 min on a Sun SPARC-2 workstation. Device Simulation Device simulation uses a different approach from that of conventional lumped circuit models to determine the electrical device characteristics. Whereas with analytic or empirical models all characteristics are determined by fitting a set of adjustable parameters to measured data, device simulators determine the electrical behavior by numerically solving the underlying set of differential equations. The first is the Poisson equation, which describes the electrostatic potential within the device N d and N a are the concentration of donors and acceptors, i.e., the N- and P-type dopants. Q f is the concentration of fixed charge due, for example, to traps or interface charge. The electron and hole concentrations are given by n and p, respectively, and Y is the electrostatic potential. A set of continuity equations describes the conservation of electrons and holes: In these equations R and G describe the recombination and generation rates for the electrons and holes. The recombination process is influenced by factors such as the number of electrons and holes present as well as the doping and temperature. The generation rate is also dependent upon the carrier concentrations, but is most strongly influenced by the electric field, with increasing electric fields giving larger generation rates. Because this generation process is included, device simulators are capable of modeling the breakdown of devices at high voltage. J n and J p are the electron and hole current densities (in amperes per square centimeter). These current densities are given by another set of equations In this equation k is Boltzmann’s constant, m is the carrier mobility, which is actually a complex function of the doping, n, p, electric field, temperature, and other factors. In silicon the electron mobility will range between 50 and 1000 and the hole mobility will normally be a factor of 2 smaller. In other semiconductors such as gallium arsenide the electron mobility can be as high as 5000. T n and T p are the electron and hole mean temperatures, which describe the average carrier energy. In many models these default to the device temperature (300 K). In the first term the current is proportional to the electric field (?Y), and this term represents the drift of carriers with the electric field. In the second term the current is proportional to the gradient of the carrier concentration (?n), so this term represents the diffusion of carriers from regions of high concentration to those of low concentration. The model is therefore called the drift-diffusion model. ?× ×? = - - + - ( ) -+ eYqN N p n Q ad f ? ? ? ? n tq RG p tq RG n p =?×-+ ? è ? ? ? ÷ =- ?× - + ? è ? ? ? ÷ 1 1 J J J J n n p p qn kT q n qp kT q p =m-? + ? ? è ? ? ? ÷ =m-? - ? ? è ? ? ? ÷ Y Y ? 2000 by CRC Press LLC In devices in which self-heating effects are important, a lattice heat equation can also be solved to give the internal device temperature: where H is the heat generation term, which includes resistive (Joule) heating as well as recombination heating, H u . The terms s(T), l(T) represent the specific heat and the thermal conductivity of the material (both temperature dependent). Inclusion of the heat equation is essential in many power device problems. As with process simulation partial differential equations are involved, therefore, a spatial discretization is required. As with circuit simulation problems, various types of analysis are available: ?Steady state (DC), used to calculate characteristic curves of MOSFETs, BJTs diodes, etc. ?AC analysis, used to calculate capacitances, Y-parameters, small signal gains, and S-parameters. ?Transient analysis used for calculation of switching and large signal behavior, and special types of analysis such as radiation effects. Example 13.5. NMOS IV Curves: The structure generated in the previous SUPREM-IV simulation is now passed into the device simulator and bias voltages are applied to the gate and drain. Models were included with account for Auger and Shockley Reed Hall recombination, doping and electric field-dependent mobility, and impact ionization. The set of drain characteristics obtained is shown in Fig. 13.11. Observe how the curves bend upward at high V ds as the device breaks down. The V g , = 1 curve has a negative slope at I d = 1.5e-4A as the device enters snap-back. It is possible to model this type of behavior because impact ionization is included in the model. FIGURE 13.11I d vs. V ds curves generated by device simulation, created with TMA MEDICI. s ? ? lT T t HTT HH np R () =+?× () ×? =- + ( ) ×?+JJY ? 2000 by CRC Press LLC Figure 13.12 shows the internal behavior of the device with V gs = 3 V and I d = 3e-4A. The filled contours indicate impact ionization, with the highest rate being near the edge of the drain right beneath the gate. This is to be expected because this is the region in which the electric field is largest due to the drain depletion region. The dark lines indicate current flow from the source to the drain. Some current also flows from the drain to the substrate. This substrate current consists of holes generated by the impact ionization. The triangular grid used in the simulation can be seen in the source, drain, and gate electrodes. A similar grid was used in the oxide and silicon regions. Appendix Circuit Analysis Software SPICE2, SPICE3: University of California, Berkeley PSPICE: MicroSim Corporation, Irvine, CA (used in this chapter) HSPICE: Meta Software, Campbell, CA IsSPICE: Intusoft, San Pedro, CA SPECTRE: Cadence Design Systems, San Jose, CA SABRE: Analogy, Beaverton, OR Process and Device Simulators SUPREM-4, PISCES: Stanford University, Palo Alto, CA MINIMOS: Technical University, Vienna, Austria SUPREM-4, MEDICI, DAVINCI: Technology Modeling Associates, Palo Alto, CA (used in this chapter) SEMICAD: Dawn Technologies, Sunnyvale, CA Related Topics 3.2 Node and Mesh Analysis?23.1 Processes?27.1 Ideal and Practical Models FIGURE 13.12Internal behavior of MOSFET under bias, created with TMA MEDICI. ? 2000 by CRC Press LLC References P. Antognetti and G. Massobrio, Semiconductor Device Modeling with SPICE, New York: McGraw-Hill, 1988. P. W. Tuinenga, SPICE, A Guide to Circuit Simulation and Analysis Using PSPICE, Englewood Cliffs, N.J.: Prentice-Hall, 1988. J. A. Connelly and P. Choi, Macromodeling with SPICE, Englewood Cliffs, N.J.: Prentice-Hall, 1992. S. Selberherr, Analysis and Simulation of Semiconductor Devices, Berlin: Springer-Verlag, 1984. R. Dutton and Z. Yu, Technology CAD, Computer Simulation of IC Process and Devices, Boston: Kluwer Academic, 1993. 13.2 Parameter Extraction for Analog Circuit Simulation Peter Bendix Introduction Definition of Device Modeling We use various terms such as device characterization, parameter extraction, optimization, and model fitting to address an important engineering task. In all of these, we start with a mathematical model that describes the transistor behavior. The model has a number of parameters which are varied or adjusted to match the IV (current-voltage) characteristics of a particular transistor or set of transistors. The act of determining the appropriate set of model parameters is what we call device modeling. We then use the model with the particular set of parameters that represent our transistors in a circuit simulator such as SPICE 1 to simulate how circuits with our kinds of transistors will behave. Usually the models are supplied by the circuit simulator we chose. Occasionally we may want to modify these models or construct our own models. In this case we need access to the circuit simulator model subroutines as well as the program that performs the device characterization. Steps Involved in Device Characterization Device characterization begins with a test chip. Without the proper test chip structures, proper device modeling cannot be done from measured data. A good test chip for MOS technology would include transistors of varying geometries, gate oxide capacitance structures, junction diode capacitance structures, and overlap capacitance structures. This would be a minimal test chip. Additional structures might include ring oscillators and other circuits for checking the AC performance of the models obtained. It is very important that the transistors be well designed and their geometries be chosen appropriate for the technology as well as the desired device model. Although a complete test chip description is beyond the scope of this book, be aware that even perfect device models cannot correct for a poor test chip. Next we need data that represent the behavior of a transistor or set of transistors of different sizes. these data can come from direct measurement or they can be produced by a device simulator such as PISCES. 2 It is also possible to use a combination of a process simulator like SUPREM-IV 3 coupled to a device simulator, to provide the simulated results. The benefits of using simulation over measurement are that no expensive measurement equipment or fabricated wafers are necessary. This can be very helpful when trying to predict the device characteristics of a new fabrication process before any wafers have been produced. Once the measured (or simulated) data are available, parameter extraction software is used to find the best set of model parameter values to fit the data. 1 SPICE is a circuit simulation program from the Department of Electrical Engineering and Computer Science at the University of California at Berkeley. 2 PISCES is a process simulation program from the Department of Electrical Engineering at Stanford University, Stanford, CA. 3 SUPREM-IV is a process simulation program from the Department of Electrical Engineering at Stanford University, Stanford, CA. ? 2000 by CRC Press LLC Least-Squares Curve Fitting (Analytical) We begin this section by showing how to do least-squares curve fitting by analytical solutions, using a simple example to illustrate the method. We then mention least-squares curve fitting using numerical solutions in the next section. We can only find analytical solutions to simple problems. The more complex ones must rely on numerical techniques. Assume a collection of measured data, m 1 , . . ., m n . For simplicity, let these measured data values be functions of a single variable, v, which was varied from v 1 through v n , measuring each m i data point at each variable value v i , i running from 1 to n. For example, the m i data points might be drain current of an MOS transistor, and the v i might be the corresponding values of gate voltage. Assume that we have a model for calculating simulated values of the measured data points, and let these simulated values be denoted by s 1 , . . ., s n . We define the least- squares, root mean square (RMS) error as (13.1) where a weighting term is included for each data point. The goal is to have the simulated data match the measured data as closely as possible, which means we want to minimize the RMS error. Actually, what we have called the RMS error is really the relative RMS error, but the two terms are used synonymously. There is another way of expressing the error, called the absolute RMS error, defined as follows: (13.2) where we have used the term m min in the denominator to represent some minimum value of the measured data. The absolute RMS error is usually used when the measured values approach zero to avoid problems with small or zero denominators in (13.1). For everything that follows, we consider only the relative RMS error. The best result is obtained by combining the relative RMS formula with the absolute RMS formula by taking the maximum of the denominator from (13.1) or (13.2). We have a simple expression for calculating the simulated data points, s i , in terms of the input variable, v, and a number of model parameters, p 1 ,. . .,p m . That is, s i = f (v i , p 1 , . . . , p m ) (13.3) where f is some function. Minimizing the RMS error function is equivalent to minimizing its square. Also, we can ignore the term in the denominator of (13.1) as concerns minimizing, because it is a normalization term. In this spirit, we can define a new error term, (13.4) and claim that minimizing Error is equivalent to minimizing Error rms . To minimize Error, we set all partial derivatives of it with respect to each model parameter equal to zero; that is, write Error weight weight rms = - ( ){} {} é ? ê ê ê ê ê ù ? ú ú ú ú ú = = ? ? ii i i n ii i n sm m 2 1 2 1 12 Error weight weight rms min = - ( ){} {} é ? ê ê ê ê ê ù ? ú ú ú ú ú = = ? ? ii i i n i i n sm m 2 1 2 1 12 Error Error weight rms = ( ) {} é ? ê ê ù ? ú ú = ? 22 1 ii i n m ? 2000 by CRC Press LLC (13.5) Then solve the above equations for the value of p j . Least-Square Curve Fitting (Numerical) For almost all practical applications we are forced to do least-squares curve fitting numerically, because the analytic solutions as previously discussed are not obtainable in closed form. What we are calling least-squares curve fitting is more generally known as nonlinear optimization. Many fine references on this topic are available. We refer the reader to [Gill et al., 1981] for details. Extraction (as Opposed to Optimization) The terms “extraction” and “optimization” are, unfortunately, used interchangeably in the semiconductor industry; however, strictly speaking, they are not the same. By optimization, we mean using generalized least- squares curve fitting methods such as the Levenberg-Marquardt algorithm [Gill et al., 1981] to find a set of model parameters. By extraction, we mean any technique that does not use general least-squares fitting methods. This is a somewhat loose interpretation of the term extraction. The main point is that we write the equations we want and then solve them by whatever approximations we choose, as long as these approximations allow us to get the extracted results in closed form. This is parameter extraction. Extraction vs. Optimization Extraction has the advantage of being much faster than optimization, but it is not always as accurate. It is also much harder to supply extraction routines for models that are being developed. Each time you make a change in the model, you must make suitable changes in the corresponding extraction routine. For optimization, however, no changes are necessary other than the change in the model itself, because least-squares curve fitting routines are completely general. Also, if anything goes wrong in the extraction algorithm (and no access to the source code is available), almost nothing can be done to correct the problem. With optimization, one can always change the range of data, weighting, upper and lower bounds, etc. A least-squares curve fitting program can be steered toward a correct solution. Novices at device characterization find least-squares curve fitting somewhat frustrating because a certain amount of user intervention and intuition is necessary to obtain the correct results. These beginners prefer extraction methods because they do not have to do anything. However, after being burned by extraction routines that do not work, a more experienced user will usually prefer the flexibility, control, and accuracy that opti- mization provides. Commercial software is available that provides both extraction and optimization together. The idea here is to first use extraction techniques to make reasonable initial guesses and then use these results as a starting point for optimization, because optimization can give very poor results if poor initial guesses for the parameters are used. Nothing is wrong with using extraction techniques to provide initial guesses for optimization, but for an experienced user this is rarely necessary, assuming that the least-squares curve fitting routine is robust (converges well) and the experienced user has some knowledge of the process under characterization. Software that relies heavily on extraction may do so because of the nonrobustness of its optimizer. These comments apply when an experienced user is doing optimization locally, not globally. For global optimization (a technique we do not recommend), the above comparisons between extraction and optimization are not valid. The following section contains more detail about local vs. global optimization. Strategies: General Discussion The most naive way of using an optimization program would be to take all the measured data for all devices, put them into one big file, and fit to all these data with all model parameters simultaneously. Even for a very high quality, robust optimization program the chances of this method converging are slight. Even if the program does converge, it is almost certain that the values of the parameters will be very unphysical. This kind of approach is an extreme case of global optimization. We call any optimization technique that tries to fit with parameters to data outside their region of applicability a global approach. That is, if we try to fit to saturation ? ? Error for ( ) == p jm j 01, ,..., ? 2000 by CRC Press LLC region data with linear region parameters such as threshold voltage, mobility, etc., we are using a global approach. In general, we advise avoiding global approaches, although in the strategies described later, sometimes the rules are bent a little. Our recommended approach is to fit subsets of relevant parameters to corresponding subsets of relevant data in a way that makes physical sense. For example, in the MOS level 3 model, VT0 is defined as the threshold voltage of a long, wide transistor at zero back-bias. It does not make sense to use this parameter to fit to a short channel transistor, or to fit at nonzero back-bias values, or to fit to anywhere outside the linear region. In addition, subsets of parameters should be obtained in the proper order so that those obtained at a later step do not affect those obtained at earlier steps. That is, we would not obtain saturation region parameters before we have obtained linear region parameters because the values of the linear region parameters would influence the saturation region fits; we would have to go back and reoptimize on the saturation region parameters after obtaining the linear region parameters. Finally, never use optimization to obtain a parameter value when the parameter can be measured directly. For example, the MOS oxide thickness, TOX, is a model parameter, but we would never use optimization to find it. Always measure its value directly on a large oxide capacitor provided on the test chip. The recommended procedure for proper device characterization follows: 1. Have all the appropriate structures necessary on your test chip. Without this, the job cannot be performed properly. 2. Always measure whatever parameters are directly measurable. Never use otpimization for these. 3. Fit the subset of parameters to corresponding subsets of data, and do so in physically meaningful ways. 4. Fit parameters in the proper order so that those obtained later do not affect those obtained previously. If this is not possible, iteration may be necessary. Naturally, a good strategy cannot be mounted if one is not intimately familiar with the model used. There is no substitute for learning as much about the model as possible. Without this knowledge, one must rely on strategies provided by software vendors, and these vary widely in quality. Finally, no one can provide a completely general strategy applicable to all models and all process technologies. At some point the strategy must be tailored to suit the available technology and circuit performance require- ments. This not only requires familiarity with the available device models, but also information from the circuit designers and process architects. MOS DC Models Available MOS Models A number of MOS models have been provided over time with the original circuit simulation program, SPICE. In addition, some commercially available circuit simulation programs have introduced their own proprietary models, most notably HSPICE. 1 This section is concentrated on the standard MOS models provided by UC Berkeley’s SPICE, not only because they have become the standard models used by all circuit simulation programs, but also because the proprietary models provided by commercial vendors are not well documented and no source code is available for these models to investigate them thoroughly. MOS Levels 1, 2, and 3. Originally, SPICE came with three MOS models known as level 1, level 2, and level 3. The level 1 MOS model is a very crude first-order model that is rarely used. The level 2 and level 3 MOS models are extensions of the level 1 model and have been used extensively in the past and present [Vladimirescu and Liu, 1980]. These two models contain about 15 DC parameters each and are usually considered useful for digital circuit simulation down to 1 mm channel length technologies. They can fit the drain current for wide transistors of varying length with reasonable accuracy (about 5% RMS error), but have very little advanced fitting capability for analog application. They have only one parameter for fitting the subthreshold region, and no parameters for fitting the derivative of drain current with respect to drain voltage, G ds (usually considered critical for analog applications). They also have no ability to vary the mobility degradation with back-bias, so the fits to I ds in the saturation region at high back-bias are not very good. Finally, these models do not interpolate well over device 1 HSPICE is a commercially available, SPICE-like circuit simulation program from Meta Software, Campbell, CA. ? 2000 by CRC Press LLC geometry; e.g., if a fit it made to a wide-long device and a wide-short device, and then one observes how the models track for lengths between these two extremes, they usually do not perform well. For narrow devices they can be quite poor as well. Level 3 has very little practical advantage over level 2, although the level 2 model is proclaimed to be more physically based, whereas the level 3 model is called semiempirical. If only one can be used, perhaps level 3 is slightly better because it runs somewhat faster and does not have quite such an annoying kink in the transition region from linear to saturation as does level 2. Berkeley Short-Channel Igfet Model (BSIM). To overcome the many short-comings of level 2 and level 3, the BSIM and BSIM2 models were introduced. The most fundamental difference between these and the level 2 and 3 models is that BSIM and BSIM2 use a different approach to incorporating the geometry dependence [Ouster et al., 1988; Jeng et al., 1987]. In level 2 and 3 the geometry dependence is built directly into the model equations. In BSIM and BSIM2 each parameter (except for a very few) is written as a sum of three terms (13.6) where Par 0 is the zero-order term, Par L accounts for the length dependence of the parameter, Par W accounts for the width dependence, and L eff and W eff are the effective channel width and length, respectively. This approach has a large influence on the device characterization strategy, as discussed later. Because of this tripling of the number of parameters and for other reasons as well, the BSIM model has about 54 DC parameters and the BSIM2 model has over 100. The original goal of the BSIM model was to fit better than the level 2 and 3 models for submicron channel lengths, over a wider range of geometries, in the subthreshold region, and for nonzero back-bias. Without question, BSIM can fit individual devices better than level 2 and level 3. It also fits the subthreshold region better and it fits better for nonzero back-biases. However, its greatest shortcoming is its inability to fit over a large geometry variation. This occurs because (13.6) is a truncated Taylor series in 1/L eff and 1/W eff terms, and in order to fit better over varying geometries, higher power terms in 1/L eff and 1/W eff are needed. In addition, no provision was put into the BSIM model for fitting G ds , so its usefulness for analog applications is questionable. Many of the BSIM model parameters are unphysical, so it is very hard to understand the significance of these model parameters. This has profound implications for generating skew models (fast and slow models to represent the process corners) and for incorporating temperature dependence. Another flaw of the BSIM model is its wild behavior for certain values of the model parameters. If model parameters are not specified for level 2 or 3, they will default to values that will at least force the model to behave well. For BSIM, not specifying certain model parameters, setting them to zero, or various combinations of values can cause the model to become very ill-behaved. BSIM2. The BSIM2 model was developed to address the shortcomings of the BSIM model. This was basically an extension of the BSIM model, removing certain parameters that had very little effect, fixing fundamental problems such as currents varying the wrong way as a function of certain parameters, adding more unphysical fitting parameters, and adding parameters to allow fitting G ds . BSIM2 does fit better than BSIM, but with more than twice as many parameters as BSIM, it should. However, it does not address the crucial problem of fitting large geometry variations. Its major strengths over BSIM are fitting the subthreshold region better, and fitting G ds better. Most of the other shortcomings of BSIM are also present in BSIM2, and the large number of parameters in BSIM2 makes it a real chore to use in device characterization. BSIM3. Realizing the shortcomings of BSIM2, UC Berkeley recently introduced the BSIM3 model. This is an unfortunate choice of name because it implies BSIM3 is related to BSIM and BSIM2. In reality, BSIM3 is an entirely new model that in some sense is related more to level 2 and 3 than BSIM or BSIM2. The BSIM3 model abandons the length and width dependence approach of BSIM and BSIM2, preferring to go back to incorpo- rating the geometry dependence directly into the model equations, as do level 2 and 3. In addition, BSIM3 is a more physically based model, with about 30 fitting parameters (the model has many more parameters, but Parameter - Par Par L Par W 0 L eff W eff ++ , ? 2000 by CRC Press LLC the majority of these can be left untouched for fitting), making it more manageable, and it has abundant parameters for fitting G ds , making it a strong candidate for analog applications. It is an evolving model, so perhaps it is unfair to criticize it at this early stage. Its greatest shortcoming is, again, the inability to fit well over a wide range of geometries. It is hoped that future modifications will address this problem. In all fairness, however, it is a large order to ask a model to be physically based, have not too many parameters, be well behaved for all default values of the parameters, fit well over temperature, fit G ds , fit over a wide range of geometries, and still fit individual geometries as well as a model with over 100 parameters, such as BSIM2. Some of these features were compromised in developing BSIM3. Proprietary Models. A number of other models are available from commercial circuit simulator vendors, the literature, etc. Some circuit simulators also offer the ability to add a researcher’s own models. In general, we caution against using proprietary models, especially those which are supplied without source code and complete documentation. Without an intimate knowledge of the model equations, it is very difficult to develop a good device characterization strategy. Also, incorporating such models into device characterization software is almost impossible. To circumvent this problem, many characterization programs have the ability to call the entire circuit simulator as a subroutine in order to exercise the proprietary model subroutines. This can slow program execution by a factor of 20 or more, seriously impacting the time required to characterize a technology. Also, if proprietary models are used without source code, the circuit simulator results can never be checked against other circuit simulators. Therefore, we want to stress the importance of using standard models. If these do not meet the individual requirements, the next best approach is to incorporate a proprietary model whose source code one has access to. This requires being able to add the individual model not only to circuit simulators, but also to device characterization programs; it can become a very large task. MOS Level 3 Extraction Strategy in Detail The strategy discussed here is one that we consider to be a good one, in the spirit of our earlier comments. Note, however, that this is not the only possible strategy for the level 3 model. The idea here is to illustrate basic concepts so that this strategy can be refined to meet particular individual requirements. In order to do a DC characterization, the minimum requirement is one each of the wide-long, wide-short, and narrow-long devices. We list the steps of the procedure and then discuss them in more detail. STEP 1. Fit the wide-long device in the linear region at zero back-bias at V gs values above the subthreshold region, with parameters VT0 (threshold voltage), U0 (mobility), and THETA (mobility degradation with V gs ). STEP 2. Fit the wide-short device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters VT0, LD (length encroachment), and THETA. When finished with this step, replace VT0 and THETA with the values from step 1, but keep the value of LD. STEP 3. Fit the narrow-long device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters VT0, DW (width encroachment), and THETA. When finished with this step, replace VT0 and THETA with the values from step 1, but keep the value of DW. STEP 4. Fit the wide-short device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters RS and RD (source and drain series resistance). STEP 5. Fit the wide-long device in the linear region at all back-biases, at V gs values above the subthreshold region, with parameter NSUB (channel doping affects long channel variation of threshold voltage with back-bias). STEP 6. Fit the wide-short device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameter XJ (erroneously called the junction depth; affects short-channel variation of threshold voltage with back-bias). STEP 7. Fit the narrow-long device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameter DELTA (narrow channel correction to threshold voltage). ? 2000 by CRC Press LLC STEP 8.Fit the wide-short device in the saturation region at zero back-bias (or all back-biases) with param- eters VMAX (velocity saturation), KAPPA (saturation region slope fitting parameter), and ETA (V ds dependence of threshold voltage). STEP 9.Fit the wide-short device in the subthreshold region at whatever back-bias and drain voltage is appropriate (usually zero back-bias and low V ds ) with parameter NES (subthreshold slope fitting parameter). One may need to fit with VT0 also and then VT0 is replaced after this step with the value of VT0 obtained from step 1. This completes the DC characterization steps for the MOS level 3 model. One would then go on to do the junction and overlap capacitance terms (discussed later). Note that this model has no parameters for fitting over temperature, although temperature dependence is built into the model that the user cannot control. In Step 1 VT0, U0, and THETA are defined in the model for a wide-long device at zero back-bias. They are zero-order fundamental parameters without any short or narrow channel corrections. We therefore fit them to a wide-long device. It is absolutely necessary that such a device be on the test chip. Without it, one cannot obtain these parameters properly. The subthreshold region must be avoided also because these parameters do not control the model behavior in subthreshold. In Step 2 we use LD to fit the slope of the linear region curve, holding U0 fixed from step 1. We also fit with VT0 and THETA because without them the fitting will not work. However, we want only the value of LD that fits the slope, so we throw away VT0 and THETA, replacing them with the values from step 1. Step 3 is the same as step 2, except that we are getting the width encroachment instead of the length. In Step 1 the value of THETA that fits the high V gs portion of the wide-long device linear region curve was found. Because the channel length of a long transistor is very large, the source and drain series resistances have almost no effect here, but for a short-channel device, the series resistance will also affect the high V gs portion of the linear region curve. Therefore, in step 4 we fix THETA from step 1 and use RS and RD to fit the wide- short device in the linear region, high V gs portion of the curve. In Step 5 we fit with NSUB to get the variation of threshold voltage with back-bias. We will get better results if we restrict ourselves to lower values of V gs (but still above subthreshold) because no mobility degradation adjustment exists with back-bias, and therefore the fit may not be very good at higher V gs values for the nonzero back-bias curves. Step 6 is just like step 5, except we are fitting the short-channel device. Some people think that the value of XJ should be the true junction depth. This is not true. The parameter XJ is loosely related to the junction depth, but XJ is really the short-channel correction to NSUB. Do not be surprised if XJ is not equal to the true junction depth. Step 7 uses DELTA to make the narrow channel correction to the threshold voltage. This step is quite straightforward. Step 8 is the only step that fits in the saturation region. The use of parameters VMAX and KAPPA is obvious, but one may question using ETA to fit in the saturation region. The parameter ETA adjusts the threshold voltage with respect to V ds , and as such one could argue that ETA should be used to fit measurements of I ds sweeping V gs and stepping V ds to high values. In doing so, one will corrupt the fit in the saturation region, and usually we want to fit the saturation region better at the expense of the linear region. Step 9 uses NFS to fit the slope of the log(I ds ) vs. V gs curve. Often the value of VT0 obtained from step 1 will prevent one from obtaining a good fit in the subthreshold region. If this happens, try fitting with VT0 and NFS, but replacing the final value of VT0 with that from step 1 at the end, keeping only NFS from this final step. The above steps illustrate the concepts of fitting relevant subsets of parameters to relevant subsets of data to obtain physical values of the parameters, as well as fitting parameters in the proper order so that those obtained in the later steps will affect those obtained in earlier steps minimally. Please refer to Figs. 13.13 and 13.14 for how the resulting fits typically appear (all graphs showing model fits are provided by the device modeling software package Aurora, from Technology Modeling Associates, Inc., Palo Alto, CA). An experienced person may notice that we have neglected some parameters. For example, we did not use parameters KP and GAMMA. This means KP will be calculated from U0, and GAMMA will be calculated from NSUB. In a sense U0 and NSUB are more fundamental parameters than KP and GAMMA. For example, KP ? 2000 by CRC Press LLC depends on U0 and TOX; GAMMA depends on NSUB and TOX. If one is trying to obtain skew models, it is much more advantageous to analyze statistical distributions of parameters that depend on a single effect than those that depend on multiple effects. KP will depend on mobility and oxide thickness; U0 is therefore a more fundamental parameter. We also did not obtain parameter PHI, so it will be calculated from NSUB. The level 3 model is very insensitive to PHI, so using it for curve fitting is pointless. This illustrates the importance of being very familiar with the model equations. The kind of judgments described here cannot be made without such knowledge. FIGURE 13.13 Typical MOS level 3 linear region measured and simulated plots at various V bs values for a wide-short device. FIGURE 13.14 Typical MOS level 3 saturation region measured and simulated plots at various V gs and V bs values for a wide-short device. ? 2000 by CRC Press LLC Test Chip Warnings. The following hints will greatly assist in properly performing device characterization. 1. Include a wide-long device; without this, the results will not be physically correct. 2. All MOS transistors with the same width should be drawn with their sources and drains identical. No difference should be seen in the number of source/drain contacts, contact spacing, source/drain contact overlap, poly gate to contact spacing, etc. 3. Draw devices in pairs. That is, if the wide-long device is W/L = 20/20, make the wide-short device the same width as the wide-long device; e.g., make the short device 20/1, not 19/1. If the narrow-long device is 2/20, make the narrow-short device of the same width; i.e., make is 2/1, not 3/1, and similarly for the lengths. (Make the wide-short and the narrow-short devices have the same length.) BSIM Extraction Strategy in Detail All MOS model strategies have basic features in common; namely, fit the linear region at zero back-bias to get the basic zero-order parameters, fit the linear region at nonzero back-bias, fit the saturation region at zero back-bias, fit the saturation region at nonzero back-bias, and then fit the subthreshold region. It is possible to extend the type of strategy we covered for level 3 to the BSIM model, but that is not the way BSIM was intended to be used. The triplet sets of parameters for incorporating geometry dependence into the BSIM model, (13.6), allow an alternate strategy. We obtain sets of parameters without geometry dependence by fitting to individual devices without using the Par L and Par W terms. We do this for each device size individually. This produces sets of parameters relevant to each individual device. So, for device number 1 of width W(1) and length L(1) we would have a value for the parameter VFB which we will call VFB(1); for device number n of width W(n) and length L(n) we will have VFB(n). To get the Par 0 , Par L , and VFB W we fit to the “data points” VFB(1), . . .,VFB(n) with parameters VFB 0 , VFB L , and VFB W using (13.6) where L eff and W eff are different for each index, 1 through n. Note that as L and W become very large, the parameters must approach Par 0 . This suggests that we use the parameter values for the wide-long device as the Par 0 terms and only fit the other geometry sizes to get the Par L and Par W terms. For example, if we have obtained VFB(1) for our first device which is our wide-long device, we would set VFB 0 = VFB(1), and then fit to VFB(2), . . .,VBF(n) with parameters VFB L and VFB W , and similarly for all the other triplets of parameters. In order to use a general least-squares optimization program in this way the software must be capable of specifying parameters as targets, as well as measured data points. We now list a basic strategy for the BSIM model: STEP 1. Fit the wide-long device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters VFB (flatband voltage), MUZ (mobility), and U0 (mobility degradation), with DL (length encroachment) and DW (width encroachment) set to zero. STEP 2. Fit the wide-short device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters VFB, U0, and DL. STEP 3. Fit the narrow-long device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters VFB, U0, and DW. STEP 4. Refit the wide-long device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters VFB, MUZ, and U0, now that DL and DW are known. STEP 5. Fit the wide-short device in the linear region at zero back-bias, at V gs values above the subthreshold region, with parameters VFB, RS, and RD. When finished, replace the value of VFB with the value found in step 4. STEP 6. Fit the wide-long device in the linear region at all back-biases, at V gs values above the subthreshold region, with parameters K1 (first-order body effect), K2 (second-order body effect), U0, and X2U0 (V bs dependence of U0). STEP 7. Fit the wide-long device in the saturation region at zero back-bias with parameters U0, ETA (V ds dependence of threshold voltage), MUS (mobility in saturation), U1 (V ds dependence of mobility), and X3MS (V ds dependence of MUS). ? 2000 by CRC Press LLC STEP 8.Fit the wide-long device in the saturation region at all back-biases with parameter X2MS (V bs dependence of MUS). STEP 9.Fit the wide-long device in the subthreshold region at zero back-bias and low V ds value with parameter N0; then fit the subthreshold region nonzero back-bias low V ds data with parameter NB; and finally fit the subthreshold region data at higher V ds values with parameter ND. Or, fit all the subthreshold data simultaneously with parameters N0, NB, and ND. Repeat Steps 6 through 10 for all the other geometries, with the result of sets of geometry-independent parameters for each different size device. Then follow the procedure described previously for obtaining the geometry-dependent terms Par 0 , Par L , and Par W . In the above strategy we have omitted various parameters either because they have minimal effect or because they have the wrong effect and were modified in the BSIM2 model. Because of the higher complexity of the BSIM model over the level 3 model, many more strategies are possible than the one just listed. One may be able to find variations of the above strategy that suit the individual technology better. Whatever modifications are made, the general spirit of the above strategy probably will remain. Some prefer to use a more global approach with BSIM, fitting to measured data with Par L and Par W terms directly. Although this is certainly possible, it is definitely not a recommended approach. It represents the worst form of blind curve fitting, with no regard for physical correctness or understanding. The BSIM model was originally developed with the idea of obtaining the model parameters via extraction as opposed to optimization. In fact, UC Berkeley provides software for obtaining BSIM parameters using extraction algorithms, with no optimization at all. As stated previously, this has the advantage of being relatively fast and easy. Unfortunately, it does not always work. One of the major drawbacks of the BSIM model is that certain values of the parameters can cause the model to produce negative values of G ds in saturation. This is highly undesirable, not only from a modeling standpoint, but also because of the convergence problems it can cause in circuit simulators. If an extraction strategy is used that does not guarantee non-negative G ds , very little can be done to fix the problem when G ds becomes negative. Of course, the extraction algorithms can be modified, but this is difficult and time consuming. With optimization strategies, one can weight the fitting for G ds more heavily and thus force the model to produce non-negative G ds . We, therefore, do not favor extraction strategies for BSIM, or anything else. As with most things in life, minimal effort provides minimal rewards. BSIM2 Extraction Strategy We do not cover the BSIM2 strategy in complete detail because it is very similar to the BSIM strategy, except more parameters are involved. The major difference in the two models is the inclusion of extra terms in BSIM2 for fitting G ds (refer to Fig. 13.15, which shows how badly BSIM typically fits 1/G ds vs. V ds ). Basically, the BSIM2 strategy follows the BSIM strategy for the extraction of parameters not related to G ds . Once these have been obtained, the last part of the strategy includes steps for fitting to G ds with parameters that account for channel length modulation and hot electron effects. The way this proceeds in BSIM2 is to fit I ds first, and then parameters MU2, MU3, and MU4 are used to fit to 1/G ds vs. V ds curves for families of V gs and V bs . This can be a very time consuming and frustrating experience, because fitting to 1/G ds is quite difficult. Also, the equations describing how G ds is modeled with MU2, MU3, and MU4 are very unphysical and the interplay between the parameters makes fitting awkward. The reader is referred to Fig. 13.16, which shows how BSIM2 typically fits 1/G ds vs. V ds . BSIM2 is certainly better than BSIM but it has its own problems fitting 1/G ds . BSIM3 Comments The BSIM3 model is very new and will undoubtedly change in the future [Huang et al., 1993]. We will not list a BSIM3 strategy here, but focus instead on the features of the model that make it appealing for analog modeling. BSIM3 has terms for fitting G ds that relate to channel length modulation, drain-induced barrier lowering, and hot electron effects. They are incorporated completely differently from the G ds fitting parameters of BSIM2. In BSIM3 these parameters enter through a generalized Early voltage relation, with the drain current in saturation written as ? 2000 by CRC Press LLC (13.7) where V A is a generalized Early voltage made up of three terms as (13.8) FIGURE 13.15 Typical BSIM 1/G d vs. V ds measured and simulated plots at various V gs values for a wide-short device. FIGURE 13.16 Typical BSIM2 1/G d vs. V ds measured and simulated plots at various V gs values for a wide-short device. II VV V ds d ds d A =+ - ( ) é ? ê ê ù ? ú ú sat sat 1 11 1 1 VV V V AA A A =++ CLM DIBL HCE ? 2000 by CRC Press LLC with the terms in (13.8) representing generalized Early voltages for channel length modulation (CLM), drain- induced barrier lowering (DIBL), and hot carrier effects (HCE). This formulation is more physically appealing than the one used in BSIM2, making it easier to fit 1/G ds vs. V ds curves with BSIM2. Figures 13.17 and 13.18 show how BSIM3 typically fits I ds vs. V ds and 1/G ds vs. V ds . Most of the model parameters for BSIM3 have physical significance so they are obtained in the spirit of the parameters for the level 2 and 3 models. The incorporation of temperature dependence is also easier in BSIM3 because the parameters are more physical. All this, coupled with the fact that about 30 parameters exist for BSIM3 as compared to over 100 for BSIM2, makes BSIM3 a logical choice for analog design. However, BSIM3 is evolving, and shortcomings to the model may still exist that may be corrected in later revisions. FIGURE 13.17Typical BSIM3 saturation region measured and simulated plots at various V gs values for a wide-short device. FIGURE 13.18Typical BSIM3 1/G d vs. V ds measured and simulated plots at various V gs values for a wide-short device. ? 2000 by CRC Press LLC Which MOS Model To Use? Many MOS models are available in circuit simulators, and the novice is bewildered as to which model is appropriate. No single answer exists, but some questions must be asked before making a choice: 1. What kind of technology am I characterizing? 2. How accurate a model do I need? 3. Do I want to understand the technology? 4. How important are the skew model files (fast and slow parameter files)? 5. How experienced am I? Do I have the expertise to handle a more complicated model? 6. How much time can I spend doing device characterization? 7. Do I need to use this model in more than one circuit simulator? 8. Is the subthreshold region important? 9. Is fitting G ds important? Let us approach each question with regard to the models available. If the technology is not submicron, perhaps a simpler model such as level 3 is capable of doing everything needed. If the technology is deep submicron, then use a more complicated model such as BSIM, BSIM2, or BSIM3. If high accuracy is required, then the best choice is BSIM3, mainly because it is more physical than all the other models and is capable of fitting better. For a good physical understanding of the process being characterized. BSIM and BSIM2 are not good choices. These are the least physically based of all the models. The level 2 and 3 models have good physical interpretation for most of the parameters, although they are relatively simple models. BSIM3 is also more physically based, with many more parameters than level 2 or 3, so it is probably the best choice. If meaningful skew models need to be generated, then BSIM and BSIM2 are very difficult to use, again, because of their unphysical parameter sets. Usually, the simplest physically based model is the best for skew model generation. A more complicated physically based model such as BSIM3 may also be difficult to use for skew model generation. If the user is inexperienced, none of the BSIM models should be used until the user’s expertise improves. Our advice is to practice using simpler models before tackling the harder ones. If time is critical, the simpler models will definitely be much faster for use in characterization. The more complicated models require more measurements over wider ranges of voltages as well as wider ranges of geometries. This, coupled with the larger number of parameters, means they will take some time with which to work. The BSIM2 model will take longer than all the rest, especially if the G ds fitting parameters are to be used. The characterization results may need to be used in more than one circuit simulator. For example, if a foundry must supply models to various customers, they may be using different circuit simulators. In this case proprietary models applicable to a single circuit simulator should not be used. Also, circuit designers may want to check the circuit simulation results on more than one circuit simulator. It is better to use standard Berkeley models (level 2, level 3, BSIM, BSIM2, and BSIM3) in such cases. If the subthreshold region is important, then level 2 or level 3 cannot be used, and probably not even BSIM; BSIM2 or BSIM3 must be used instead. These two models have enough parameters for fitting the subthreshold region. If fitting G ds is important, BSIM2 and BSIM3 are, again, the only choices. None of the other models have enough parameters for fitting G ds . Finally, if a very unusual technology is to be characterized, none of the standard models may be appropriate. In this case commercially available specialized models or the user’s own models must be used. This will be a large task, so the goals must justify the effort. Skew Parameter Files This chapter discussed obtaining model parameters for a single wafer, usually one that has been chosen to represent a typical wafer for the technology being characterized. The parameter values obtained from this wafer correspond to a typical case. Circuit designers also want to simulate circuits with parameter values representing the extremes of process variation, the so-called fast and slow corners, or skew parameter files. These represent the best and worst case of the process variation over time. ? 2000 by CRC Press LLC Skew parameter values are obtained usually by tracking a few key parameters, measuring many wafers over a long period of time. The standard deviation of these key parameters is found and added to or subtracted from the typical parameter values to obtain the skew models. This method is extremely crude and will not normally produce a realistic skew model. It will almost always overestimate the process spread, because the various model parameters are not independent—they are correlated. Obtaining realistic skew parameter values, taking into account all the subtle correlations between parameters, is more difficult. In fact, skew model generation is often more an art than a science. Many attempts have been made to utilize techniques from a branch of statistics called multivariate analysis [Dillon and Goldstein, 1984]. In this approach principal component or factor analysis is used to find parameters that are linear combinations of the original parameters. Only the first few of these new parameters will be kept; the others will be discarded because they have less significance. This new set will have fewer parameters than the original set and therefore will be more manageable in terms of finding their skews. The user sometimes must make many choices in the way the common factors are utilized, resulting in different users obtaining different results. Unfortunately, a great deal of physical intuition is often required to use this approach effectively. To date, we have only seen it applied to the simpler MOS models such as level 3. It is not known if this is a viable approach for a much more complicated model such as BSIM2 [Power et al., 1993]. Related Topic 24.3 The Metal-Oxide Semiconductor Field-Effect Transistor (MOSFET) References W. R. Dillon and M. Goldstein, Multivariate Analysis Methods and Applications, New York: John Wiley & Sons, 1984. P. E. Gill, W. Murray, and M. Wright, Practical Optimization, Orlando, Fla.: Academic Press, 1981. J. S. Duster, J.-C. Jeng, P. K. Ko, and C. Hu, “User’s Guide for BSIM2 Parameter Extraction Program and The SPICE3 with BSIM Implementation,” Electronic Research Laboratory, Berkeley: University of California, 1988. J.-H. Huang, Z. H. Liu, M.-C. Jeng, P. K. Ko, and C. Hu, “BSIM3 Manual,” Berkeley: University of California, 1993. M.-C. Jeng, P. M. Lee, M. M. Kuo, P. K. Ko, and C. Hu, “Theory, Algorithms, and User’s Guide for BSIM and SCALP,” Version 2.0, Electronic Research Laboratory, Berkeley: University of California, 1987. J. A. Power, A. Mathewson, and W. A. Lane, “An Approach for Relating Model Parameter Variabilities to Process Fluctuations,” Proc. IEEE Int. Conf. Microelectronic Test Struct., vol. 6, Mar. 1993. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in C, Cambridge, U.K.: Cambridge University Press, 1988. B. J. Sheu, D. L. Scharfetter, P. K. Ko, and M.-C. Jeng, “BSIM: Berkeley Short-Channel IGFET Model for MOS Transistors,” IEEE J. Solid-State Circuits, vol. SC-22, no. 4, Aug. 1987. A. Vladimirescu and S. Liu, “The Simulation of MOS Integrated Circuits Using SPICE2,” memorandum no. UCB/ERL M80/7, Berkeley: University of California, 1980. Further Information Other recommended publications which are useful in device characterization are L. W. Nagel, “SPICE2: A Computer Program to Simulate Semiconductor Circuits,” memorandum no. ERL-M520, Berkeley: University of California, 1975. G. Massobrio and P. Antognetti, Semiconductor Device Modeling with SPICE, New York: McGraw-Hill, 1993. ? 2000 by CRC Press LLC