ASICs... THE COURSE (1 WEEK)

7 PROGRAMMABLE ASIC INTERCONNECT

Key concepts: programmable interconnect • raw materials: aluminum-based metallization and a line capacitance of 0.2 pF/mm

7.1 Actel ACT

[Figure: The interconnect architecture used in an Actel ACT family FPGA. (Source: Actel.)]
Figure annotations:
• Logic Modules (LMs): 8 or 14 (A1010/A1020) rows of 44 modules
• Routing channels: 7 or 13 (A1010/A1020) full-size and 2 half-size (top and bottom)
• Each LM has 8 inputs: 4 input stubs on top and 4 on the bottom
• Each LM output drives an output stub that spans 2 channels up and 2 channels down
• Labeled items: input stub, output stub, long vertical track (LVT), antifuse, two-antifuse connection, four-antifuse connection

Features and keywords:
• Wiring channels (or just channels)
• Horizontal channels
• Vertical channels
• Tracks
• Channel capacity
• Long vertical tracks (LVTs)
• Input stubs and output stubs
• Wire segments
• Segmented channel routing
• Long lines

[Figure: ACT 1 horizontal and vertical channel architecture, an expanded view of part of the channel. (Source: Actel.)]
Figure annotations:
• 25 horizontal tracks per channel, varying between 4 columns and 44 columns long: 22 signal tracks, global clock (GCLK), VDD, and GND
• 8 vertical tracks for input stubs
• 5 vertical tracks: 4 tracks for output stubs, 1 track for a long vertical track (LVT)
• Dedicated connection to the module output (no antifuse needed)
• Labeled items: Logic Module (LM), track number, channel height, module height, column width, programmed antifuse

Features:
• Input stubs
• Output stubs
• Long vertical tracks (LVT)
• Fully populated interconnect array

7.1.1 Routing Resources

7.1.2 Elmore's Constant

The time constant τ_Di is often called the Elmore delay and is different for each node. I call τ_Di the Elmore time constant as a reminder that, if we approximate V_i by an exponential waveform, the delay of the RC tree using 0.35/0.65 trip points is approximately τ_Di seconds.
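The link between the 0.35/0.65 trip points and the Elmore time constant can be checked with a few lines of arithmetic. The sketch below assumes a single-exponential waveform and an illustrative time constant of 1 ns; the function name `trip_time` and the value of `TAU` are hypothetical and not taken from the course material.

```python
import math

def trip_time(tau, v_trip):
    """Time for a falling waveform V(t) = exp(-t / tau) to reach v_trip (in volts)."""
    if not 0.0 < v_trip < 1.0:
        raise ValueError("v_trip must lie strictly between 0 and 1")
    return -tau * math.log(v_trip)

TAU = 1.0e-9  # assumed Elmore time constant of 1 ns (illustrative only)

# A falling edge exp(-t/tau) crosses 0.35 V at the same time that a
# rising edge 1 - exp(-t/tau) crosses 0.65 V.
t_fall = trip_time(TAU, 0.35)
t_rise = trip_time(TAU, 1.0 - 0.65)

print(f"falling to 0.35 V: {t_fall / TAU:.2f} tau")
print(f"rising to 0.65 V:  {t_rise / TAU:.2f} tau")
```

Both times come out at about 1.05 τ, which is why the delay measured at these trip points is approximately one Elmore time constant.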
Actel FPGA routing resources
(H = horizontal tracks per channel; V = vertical tracks per column)

Device   H    V    Rows, R   Columns, C   Total antifuses on each chip   H × V × R × C
A1010    22   13   8         44           112,000                        100,672
A1020    22   13   14        44           186,000                        176,176
A1225A   36   15   13        46           250,000                        322,920
A1240A   36   15   14        62           400,000                        468,720
A1280A   36   15   18        82           750,000                        797,040

[Figure: Measuring the delay of a net. (a) An RC tree. (b) The waveforms (node voltage versus time, t /s) as a result of closing the switch at t = 0.]

$V_i(t) = \exp(-t/\tau_{Di})$ ;  $\tau_{Di} = \sum_{k=1}^{n} R_{ki} C_k$

where R_ki is the resistance shared by the paths from V_0 to node k and from V_0 to node i, and C_k is the capacitance at node k.

7.1.3 RC Delay in Antifuse Connections

• Two antifuses will generate a 3RC time constant
• Three antifuses a 6RC time constant
• Four antifuses give a 10RC time constant
• Interconnect delay grows quadratically (∝ n²) as we increase the interconnect length and the number of antifuses, n

7.1.4 Antifuse Parasitic Capacitance

7.1.5 ACT 2 and ACT 3 Interconnect

Keywords: channel density • fast fuse

[Figure: Actel routing model. (a) A four-antifuse connection. L0 is an output stub, L1 and L3 are horizontal tracks, L2 is a long vertical track (LVT), and L4 is an input stub. (b) An RC-tree model. Each antifuse is modeled by a resistance and each interconnect segment is modeled by a capacitance.]

$\tau_{D4} = R_{14}C_1 + R_{24}C_2 + R_{34}C_3 + R_{44}C_4$
$\quad\;\; = (R_1 + R_2 + R_3 + R_4)C_4 + (R_1 + R_2 + R_3)C_3 + (R_1 + R_2)C_2 + R_1 C_1$

With every antifuse resistance equal to R:

$\tau_{D4} = 4RC_4 + 3RC_3 + 2RC_2 + RC_1$
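The 3RC, 6RC, and 10RC figures follow directly from the Elmore sum for an RC chain in which each antifuse is a series resistance and each wire segment is a capacitance to ground. The following is a minimal sketch under that model; the function name `elmore_chain` and the numerical values of `R` and `C` are assumptions chosen for illustration, not data from the course.

```python
def elmore_chain(resistances, capacitances):
    """Elmore time constant at the far end of an RC chain.

    resistances[k] is the series resistance feeding node k+1 and
    capacitances[k] is the capacitance at node k+1. For the last node,
    the resistance shared with node k is R_1 + ... + R_k.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    tau = 0.0
    r_shared = 0.0
    for r, c in zip(resistances, capacitances):
        r_shared += r          # resistance from the source up to this node
        tau += r_shared * c    # this node's contribution to the sum
    return tau

R = 500.0    # ohms; assumed antifuse resistance (illustrative)
C = 0.2e-12  # farads; assumed equal segment capacitance (illustrative)

for n in (1, 2, 3, 4):
    tau = elmore_chain([R] * n, [C] * n)
    print(f"{n} antifuse(s): tau = {tau / (R * C):.0f} RC = {tau * 1e9:.2f} ns")
```

With equal R and C the result is n(n + 1)/2 times RC, which is the quadratic growth with the number of antifuses.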
Actel interconnect parameters

Parameter                                       A1010/A1020                    A1010B/A1020B
Technology                                      2.0 μm, λ = 1.0 μm             1.2 μm, λ = 0.6 μm
Die height (A1010)                              240 mil                        144 mil
Die width (A1010)                               360 mil                        216 mil
Die area (A1010)                                86,400 mil² = 56 Mλ²           31,104 mil² = 56 Mλ²
Logic Module (LM) height (Y1)                   180 μm = 180 λ                 108 μm = 180 λ
LM width (X)                                    150 μm = 150 λ                 90 μm = 150 λ
LM area (X × Y1)                                27,000 μm² = 27 kλ²            9,720 μm² = 27 kλ²
Channel height (Y2)                             25 tracks = 287 μm             25 tracks = 170 μm
Channel area per LM (X × Y2)                    43,050 μm² = 43 kλ²            15,300 μm² = 43 kλ²
LM and routing area (X × Y1 + X × Y2)           70,000 μm² = 70 kλ²            25,000 μm² = 70 kλ²
Antifuse capacitance                            -                              10 fF
Metal capacitance                               0.2 pF/mm                      0.2 pF/mm
Output stub length (spans 3 LMs + 4 channels)   4 channels = 1688 μm           4 channels = 1012 μm
Output stub metal capacitance                   0.34 pF                        0.20 pF
Output stub antifuse connections                100                            100
Output stub antifuse capacitance                -                              1.0 pF
Horiz. track length                             4–44 cols. = 600–6600 μm       4–44 cols. = 360–3960 μm
Horiz. track metal capacitance                  0.1–1.3 pF                     0.07–0.8 pF
Horiz. track antifuse connections               52–572 antifuses               52–572 antifuses
Horiz. track antifuse capacitance               -                              0.52–5.72 pF
Long vertical track (LVT)                       8–14 channels = 3760–6580 μm   8–14 channels = 2240–3920 μm
LVT metal capacitance                           0.8–1.3 pF                     0.45–0.8 pF
LVT track antifuse connections                  200–350 antifuses              200–350 antifuses
LVT track antifuse capacitance                  -                              2–3.5 pF

Antifuse resistance (ACT 1): 0.5 kΩ (typ.), 0.7 kΩ (max.)

Actel interconnect:
• An input stub (1 channel) connects to 25 antifuses
• An output stub (4 channels) connects to 100 (25 × 4) antifuses
• An LVT (A1010, 8 channels) connects to 200 (25 × 8) antifuses
• An LVT (A1020, 14 channels) connects to 350 (25 × 14) antifuses
• A four-column horizontal track connects to 52 (13 × 4) antifuses
• A 44-column horizontal track connects to 572 (13 × 44) antifuses
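The antifuse counts and several of the capacitance entries above are simple products, so they can be reproduced as a sanity check. The sketch below assumes the quoted 0.2 pF/mm metal capacitance, 10 fF per antifuse, 25 horizontal tracks per channel, and 13 vertical tracks per column; the function and constant names are hypothetical.

```python
METAL_CAP_PF_PER_MM = 0.2   # metal line capacitance quoted for both technologies
ANTIFUSE_CAP_PF = 0.010     # 10 fF per antifuse (A1010B/A1020B figure)
TRACKS_PER_CHANNEL = 25     # horizontal tracks crossed by a vertical wire per channel
TRACKS_PER_COLUMN = 13      # vertical tracks crossed by a horizontal wire per column

def metal_cap_pf(length_um):
    """Metal capacitance in pF of a wire of the given length in micrometres."""
    return METAL_CAP_PF_PER_MM * length_um / 1000.0

def vertical_wire_antifuses(channels):
    """Antifuses on a stub or LVT that crosses the given number of channels."""
    return TRACKS_PER_CHANNEL * channels

def horizontal_track_antifuses(columns):
    """Antifuses on a horizontal track spanning the given number of columns."""
    return TRACKS_PER_COLUMN * columns

def antifuse_cap_pf(count):
    """Total parasitic capacitance in pF of `count` unprogrammed antifuses."""
    return count * ANTIFUSE_CAP_PF

# Output stub of the 2.0 um parts: 1688 um long, crossing 4 channels.
stub_fuses = vertical_wire_antifuses(4)
print(f"output stub: {metal_cap_pf(1688):.2f} pF metal, {stub_fuses} antifuses, "
      f"{antifuse_cap_pf(stub_fuses):.2f} pF antifuse load at 10 fF each")

# Horizontal tracks from 4 to 44 columns long.
for cols in (4, 44):
    n = horizontal_track_antifuses(cols)
    print(f"{cols}-column track: {n} antifuses, {antifuse_cap_pf(n):.2f} pF")
```

On a long track the antifuse load is several times the metal capacitance, so the parasitic capacitance of the unprogrammed antifuses sets most of the interconnect load.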
7.2 Xilinx LCA

[Figure: Xilinx LCA interconnect. (a) The LCA architecture (notice the matrix element size is larger than a CLB). (b) A simplified representation of the interconnect resources. Each of the lines is a bus.]
Labeled items: long lines, double-length lines, single-length lines, programmable interconnection points (PIPs), switching matrix, CLB matrix height (Y), CLB matrix width (X)

• The vertical lines and horizontal lines run between CLBs.
• The general-purpose interconnect joins switch boxes (also known as magic boxes or switching matrices).
• The long lines run across the entire chip. It is possible to form internal buses using long lines and the three-state buffers that are next to each CLB.
• The direct connections (not used on the XC4000) bypass the switch matrices and directly connect adjacent CLBs.
• The Programmable Interconnection Points (PIPs) are programmable pass transistors that connect the CLB inputs and outputs to the routing network.
• The bidirectional (BIDI) interconnect buffers restore the logic level and logic strength on long interconnect paths.

XC3000 interconnect parameters

Parameter                                          XC3020
Technology                                         1.0 μm, λ = 0.5 μm
Die height                                         220 mil
Die width                                          180 mil
Die area                                           39,600 mil² = 102 Mλ²
CLB matrix height (Y)                              480 μm = 960 λ
CLB matrix width (X)                               370 μm = 740 λ
CLB matrix area (X × Y)                            177,600 μm² = 710 kλ²
Matrix transistor resistance, RP1                  0.5–1 kΩ
Matrix transistor parasitic capacitance, CP1       0.01–0.02 pF
PIP transistor resistance, RP2                     0.5–1 kΩ
PIP transistor parasitic capacitance, CP2          0.01–0.02 pF
Single-length line (X, Y)                          370 μm, 480 μm
Single-length line capacitance: CLX, CLY           0.075 pF, 0.1 pF
Horizontal Longline (8X)                           8 cols. = 2960 μm
Horizontal Longline metal capacitance, CLL         0.6 pF
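To get a feel for the size of these parameters, the sketch below runs an Elmore estimate for one hypothetical route through the pass-transistor interconnect. The ladder topology (PIP, single-length line, switching-matrix transistor, single-length line, PIP) is an assumption made for illustration and is not the equivalent circuit from the course figures; the resistances and parasitic capacitances are taken as the midpoints of the quoted ranges, and the name `elmore_ladder` is hypothetical.

```python
def elmore_ladder(stages):
    """Elmore time constant at the end of an RC ladder.

    stages is a list of (series_resistance_ohms, node_capacitance_farads)
    pairs, ordered from the driver to the load.
    """
    tau = 0.0
    r_upstream = 0.0
    for r_series, c_node in stages:
        r_upstream += r_series      # total resistance back to the driver
        tau += r_upstream * c_node  # contribution of this node
    return tau

# Midpoints of the quoted ranges (assumed for illustration).
RP1 = 750.0       # ohms, switching-matrix pass transistor
RP2 = 750.0       # ohms, PIP pass transistor
CP1 = 0.015e-12   # farads, matrix transistor parasitic
CP2 = 0.015e-12   # farads, PIP transistor parasitic
CLX = 0.075e-12   # farads, horizontal single-length line

# Assumed route: CLB output -> PIP -> line -> matrix -> line -> PIP -> CLB input.
stages = [
    (RP2, CP2 + CLX),  # output PIP drives the first single-length line
    (RP1, CP1 + CLX),  # matrix transistor drives the second single-length line
    (RP2, CP2),        # input PIP at the destination CLB
]

tau = elmore_ladder(stages)
print(f"estimated Elmore time constant: {tau * 1e12:.1f} ps")
```

Under these assumptions the estimate is a few hundred picoseconds for a route through one switching matrix. Each additional matrix adds a series resistance that every downstream capacitance sees, so the delay depends on the path.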
[Figure: Components of interconnect delay in a Xilinx LCA array. (a) A portion of the interconnect around the CLBs. (b) A switching matrix. (c) A detailed view inside the switching matrix showing the pass-transistor arrangement. (d) The equivalent circuit for the connection between nets 6 and 20 using the matrix. (e) A view of the interconnect at a Programmable Interconnection Point (PIP). (f) and (g) The equivalent schematic of a PIP connection. (h) The complete RC delay path.]
Labeled items: CLB1, CLB2, CLB3, switching matrix, PIP, RP1, CP1, RP2, CP2, C1 to C4

7.3 Xilinx EPLD

[Figure: The Xilinx EPLD UIM (Universal Interconnection Module). (a) A simplified block diagram of the UIM. The UIM bus width, n, varies from 68 (XC7236) to 198 (XC73108). (b) The UIM is actually a large programmable AND array. (c) The parasitic capacitance of the EPROM cell.]
Labeled items: FB, 9 I/Os per FB, 21 inputs per FB, programmable AND array with n inputs, EPROM, word line, bit line, sense amplifier

7.4 Altera MAX 5000 and 7000

[Figure: A simplified block diagram of the Altera MAX interconnect scheme. (a) The PIA (Programmable Interconnect Array) is deterministic: delay is independent of the path length. (b) Each LAB (Logic Array Block) contains a programmable AND array. (c) Interconnect timing within a LAB is also fixed.]
Labeled items: LAB1 to LAB6, PIA, macrocells, programmable AND array, tPIA, tLAD
7.5 Altera MAX 9000

[Figure: The Altera MAX 9000 interconnect scheme. (a) A 4 × 5 array of Logic Array Blocks (LABs), the same size as the EPM9400 chip. (b) A simplified block diagram of the interconnect architecture showing the connection of the FastTrack buses to a LAB.]
Labeled items: row FastTrack, column FastTrack, 114-wide LAB local array, 16 macrocells, bus-to-bus connections A, B, and C

7.6 Altera FLEX

[Figure: The Altera FLEX interconnect scheme. (a) The row and column FastTrack interconnect. The chip shown, with 4 rows × 21 columns, is the same size as the EPF8820. (b) A simplified diagram of the interconnect architecture showing the connections between the FastTrack buses and a LAB. Boxes A, B, and C represent the bus-to-bus connections.]
Labeled items: row FastTrack, column FastTrack, FastTrack aspect ratio, 32-wide LAB local interconnect, 8 Logic Elements (LEs)

7.7 Summary

The RC products of the parasitic elements of an antifuse and of a pass transistor are not too different. However, an SRAM cell is much larger than an antifuse, which leads to coarser interconnect architectures for SRAM-based programmable ASICs. The EPROM device lends itself to large wired-logic structures. These differences in programming technology lead to different architectures:
• The antifuse FPGA architectures are dense and regular.
• The SRAM architectures contain nested structures of interconnect resources.
• The complex PLD architectures use long interconnect lines but achieve deterministic routing.

Key points:
• The difference between deterministic and nondeterministic interconnect
• Estimating interconnect delay
• Elmore's constant

7.8 Problems