ASICs... THE COURSE (1 WEEK)

15 ASIC CONSTRUCTION

Key terms and concepts:
• A microelectronic system (or system on a chip) is the town and ASICs (or system blocks) are the buildings.
• System partitioning corresponds to town planning.
• Floorplanning is the architect's job.
• Placement is done by the builder.
• Routing is done by the electrician.

15.1 Physical Design

Key terms and concepts: Divide and conquer • system partitioning • floorplanning • chip planning • placement • routing • global routing • detailed routing

15.2 CAD Tools

Key terms and concepts: goals and objectives for each physical design step

System partitioning:
• Goal. Partition a system into a number of ASICs.
• Objectives. Minimize the number of external connections between the ASICs. Keep each ASIC smaller than a maximum size.

Floorplanning:
• Goal. Calculate the sizes of all the blocks and assign them locations.
• Objective. Keep the highly connected blocks physically close to each other.

Placement:
• Goal. Assign the interconnect areas and the location of all the logic cells within the flexible blocks.
• Objectives. Minimize the ASIC area and the interconnect density.

Global routing:
• Goal. Determine the location of all the interconnect.
• Objective. Minimize the total interconnect area used.

Detailed routing:
• Goal. Completely route all the interconnect on the chip.
• Objective. Minimize the total interconnect length used.

15.2.1 Methods and Algorithms

Key terms and concepts: methods or algorithms are exact or heuristic (algorithm is usually reserved for a method that always gives a solution) • the complexity O(f(n)) is important because n is very large • algorithms may be constant, logarithmic, linear, or quadratic in time • many VLSI problems are NP-complete •
we need metrics: a measurement function or objective function, a cost function or gain function, and possibly constraints

Part of an ASIC design flow showing the system partitioning, floorplanning, placement, and routing steps. These steps may be performed in a slightly different order, iterated or omitted depending on the type and size of the system and its ASICs. As the focus shifts from logic to interconnect, floorplanning assumes an increasingly important role. Each of the steps shown in the figure must be performed and each depends on the previous step. However, the trend is toward completing these steps in a parallel fashion and iterating, rather than in a sequential manner.

[Figure: ASIC design flow. Steps: design entry, synthesis, system partitioning, floorplanning, placement, routing. Labels: VHDL/Verilog, netlist, chip, block, logic cells.]

15.3 System Partitioning

Key terms and concepts: partitioning • we can't do "What is the cheapest way to build my system?" • we can do "How do I split this circuit into pieces that will fit on a chip?"

System partitioning for the Sun Microsystems SPARCstation 1

# | SPARCstation 1 ASIC | Gates /k-gate | Pins | Package | Type
1 | SPARC IU (integer unit) | 20 | 179 | PGA | CBIC
2 | SPARC FPU (floating-point unit) | 50 | 144 | PGA | FC
3 | Cache controller | 9 | 160 | PQFP | GA
4 | MMU (memory-management unit) | 5 | 120 | PQFP | GA
5 | Data buffer | 3 | 120 | PQFP | GA
6 | DMA (direct memory access) controller | 9 | 120 | PQFP | GA
7 | Video controller/data buffer | 4 | 120 | PQFP | GA
8 | RAM controller | 1 | 100 | PQFP | GA
9 | Clock generator | 1 | 44 | PLCC | GA
System partitioning for the Sun Microsystems SPARCstation 10

# | SPARCstation 10 ASIC | Gates | Pins | Package | Type
1 | SuperSPARC Superscalar SPARC | 3M-transistors | 293 | PGA | FC
2 | SuperCache cache controller | 2M-transistors | 369 | PGA | FC
3 | EMC memory control | 40k-gate | 299 | PGA | GA
4 | MSI MBus–SBus interface | 40k-gate | 223 | PGA | GA
5 | DMA2 Ethernet, SCSI, parallel port | 30k-gate | 160 | PQFP | GA
6 | SEC SBus to 8-bit bus | 20k-gate | 160 | PQFP | GA
7 | DBRI dual ISDN interface | 72k-gate | 132 | PQFP | GA
8 | MMCodec stereo codec | 32k-gate | 44 | PLCC | FC

15.4 Estimating ASIC Size

Some useful numbers for ASIC estimates, normalized to a 1 μm technology

Parameter | Typical value | Comment | Scaling
Lambda, λ | 0.5 μm = 0.5 × (minimum feature size) | In a 1 μm technology, λ ≈ 0.5 μm. | NA
Effective gate length | 0.25 to 1.0 μm | Less than drawn gate length, usually by about 10 percent. | λ
I/O-pad width (pitch) | 5 to 10 mil = 125 to 250 μm | For a 1 μm technology, 2LM (λ = 0.5 μm). Scales less than linearly with λ. | λ
I/O-pad height | 15 to 20 mil = 375 to 500 μm | For a 1 μm technology, 2LM (λ = 0.5 μm). Scales approximately linearly with λ. | λ
Large die | 1000 mil/side, 10^6 mil^2 | Approximately constant | 1
Small die | 100 mil/side, 10^4 mil^2 | Approximately constant | 1
Standard-cell density | 1.5 × 10^-3 gate/μm^2 = 1.0 gate/mil^2 | For 1 μm, 2LM, library = 4 × 10^-4 gate/λ^2 (independent of scaling). | 1/λ^2
Standard-cell density | 8 × 10^-3 gate/μm^2 = 5.0 gate/mil^2 | For 0.5 μm, 3LM, library = 5 × 10^-4 gate/λ^2 (independent of scaling). | 1/λ^2
Gate-array utilization | 60 to 80% | For 2LM, approximately constant | 1
Gate-array utilization | 80 to 90% | For 3LM, approximately constant | 1
Gate-array density | (0.8 to 0.9) × standard-cell density | For the same process as standard cells | 1
Standard-cell routing factor = (cell area + route area)/cell area | 1.5 to 2.5 (2LM); 1.0 to 2.0 (3LM) | Approximately constant | 1
Package cost | $0.01/pin, "penny per pin" | Varies widely, figure is for low-cost plastic package, approximately constant | 1
Wafer cost | $1k to $5k, average $2k | Varies widely, figure is for a mature, 2LM CMOS process, approximately constant | 1

15.5 Power Dissipation

Key terms and concepts: dynamic (switching current and short-circuit current) and static (leakage current and subthreshold current) power dissipation

15.5.1 Switching Current

Key terms and concepts: I = C(dV/dt) • power dissipation = IV = CV(dV/dt), which dissipates an energy of 0.5 C V_{DD}^2 during one-half the period of the input, t = 1/(2f) • total power = P_1 = f C V_{DD}^2 • estimate power by counting nodes that toggle

15.5.2 Short-Circuit Current

Key terms and concepts: P_2 = (1/12) β f t_{rf} (V_{DD} - 2V_{tn})^3 • short-circuit current is typically less than 20 percent of the switching current

Estimating circuit size. (a) ASIC memory size. These figures are for static RAM constructed using compilers in a 2LM ASIC process, but with no special memory design rules. The actual area of a RAM will depend on the speed and number of read–write ports. (b) Multiplier size for a 2LM process. The actual area will depend on the multiplier architecture and speed.

[Figure: (a) RAM area/λ^2 (10^6 to 10^9) versus word depth/bits (64 to 4096) for word lengths of 4, 8, 16, and 32 bits. (b) Multiplier area/λ^2 (10^6 to 10^8) versus multiplier size = m × n /bits (8 × 8 to 64 × 64).]

15.5.3 Subthreshold and Leakage Current

Key terms and concepts: subthreshold current is normally less than 5 pA/μm of gate width • subthreshold current for 10 million transistors (each 10 μm wide) is 0.1 mA •
subthreshold current does not scale • it takes about 120 mV to reduce subthreshold current by a factor of 10 • if V_t = 0.36 V, at V_{GS} = 0 V we can only reduce I_{DS} to 0.001 times its value at V_{GS} = V_t • leakage current • field transistors • quiescent leakage current, I_{DDQ} • IDDQ test

15.6 FPGA Partitioning

15.6.1 ATM Simulator

15.6.2 Automatic Partitioning with FPGAs

Key terms and concepts: In Altera AHDL you can direct the partitioner to automatically partition logic into chips within the same family, using the AUTO keyword:

    DEVICE top_level IS AUTO; % let the partitioner assign logic

Partitioning of the ATM board using Lattice Logic ispLSI 1048 FPGAs. Each FPGA contains 48 generic logic blocks (GLBs).

Chip # | Size
1 | 42 GLBs
2 | 64k-bit × 8 SRAM
3 | 38 GLBs
4 | 38 GLBs
5 | 42 GLBs
6 | 64k-bit × 16 SRAM
7 | 36 GLBs
8 | 22 GLBs
9 | 256k-bit × 16 SRAM
10 | 43 GLBs
11 | 40 GLBs
12 | 30 GLBs

The asynchronous transfer mode (ATM) cell format. The ATM protocol uses 53-byte cells or packets of information with a data payload and header information for routing and error control.

[Figure: ATM cell layout by byte number (1 to 53) and bit number (8 to 1). Header fields: GFC/VPI, VPI, VCI, PTI, CLP, HEC, followed by the payload bytes. GFC = generic flow control; VPI = virtual path identifier; VCI = virtual channel identifier; PTI = payload type identifier; CLP = cell loss priority; HEC = header error control.]

15.7 Partitioning Methods

Key terms and concepts: Examples of goals: A maximum size for each ASIC • A maximum number of ASICs • A maximum number of connections for each ASIC • A maximum number of total connections between all ASICs

15.7.1 Measuring Connectivity

Key terms and concepts: a network has circuit modules (logic cells) and terminals (connectors or pins) • modelled by a graph with vertexes (logic cells) connected by edges (electrical connections, nets or signals) • cutset • net cutset •
edge cutset (for the graph) • external connections • internal connections • net cuts • edge cuts

15.7.2 A Simple Partitioning Example

Key terms and concepts: two types of network partitioning: constructive partitioning and iterative partitioning improvement

15.7.3 Constructive Partitioning

Key terms and concepts: seed growth or cluster growth uses a seed cell and forms clusters or cliques • a useful starting point

15.7.4 Iterative Partitioning Improvement

Key terms and concepts: interchange (swap two) and group (swap many) migration • greedy algorithms find a local minimum • group migration algorithms such as the Kernighan–Lin algorithm (basis of min-cut methods) can do better

15.7.5 The Kernighan–Lin Algorithm

Key terms and concepts: a cost matrix plus connectivity matrix models the system • the measure is the cut cost, or cut weight • be careful to distinguish external edge cost and internal edge cost • net-cut partitioning and edge-cut partitioning • hypergraphs, with stars and hyperedges, model connections better than edges • the Fiduccia–Mattheyses algorithm uses linked lists to reduce the complexity of the K–L algorithm and is very widely used • base logic cell • balance • critical net

15.7.6 The Ratio-Cut Algorithm

Key terms and concepts: ratio-cut algorithm • ratio • set cardinality • ratio cut

15.7.7 The Look-ahead Algorithm

Key terms and concepts: gain vector • look-ahead algorithm

Networks, graphs, and partitioning. (a) A network containing circuit logic cells and nets. (b) The equivalent graph with vertexes and edges. For example: logic cell D maps to node D in the graph; net 1 maps to the edge (A, B) in the graph. Net 3 (with three connections) maps to three edges in the graph: (B, C), (B, F), and (C, F). (c) Partitioning a network and its graph. A network with a net cut that cuts two nets. (d) The network graph showing the corresponding edge cut.
The net cutset in (c) contains two nets, but the corresponding edge cutset in (d) contains four edges. This means a graph is not an exact model of a network for partitioning purposes.

[Figure: (a) network and (b) graph, with labels: vertex, node, or point; edge; module, cell, or block; terminal, or pin; net, signal, or wire. (c) Net cut: net cutset = two nets. (d) Edge cut: edge cutset = four edges. Annotations: a three-terminal net requires three edges; a single wire is modeled by multiple edges in the network graph; only one wire is needed to connect several modules on the same net.]

Partitioning example. (a) We wish to partition this network into three ASICs with no more than four logic cells per ASIC. (b) A partitioning with five external connections (nets 2, 4, 5, 6, and 8), which is the minimum number. (c) A constructed partition using logic cell C as a seed. It is difficult to get from this local minimum, with seven external connections (2, 3, 5, 7, 9, 11, 12), to the optimum solution of (b).

[Figure: a network of logic cells A to L and nets 1 to 12, shown (a) unpartitioned, (b) partitioned into ASIC 1, ASIC 2, and ASIC 3 with five external connections, and (c) partitioned by seed growth from logic cell C.]

A hypergraph. (a) The network contains a net y with three terminals. (b) In the network hypergraph we can model net y by a single hyperedge (B, C, D) and a star node. Now there is a direct correspondence between wires or nets in the network and hyperedges in the graph.

[Figure: (a) network and (b) hypergraph with logic cells A to D and nets w, x, y, z; labels: star, hyperedge. Annotation: one wire corresponds to one hyperedge in a hypergraph.]

Partitioning a graph using the Kernighan–Lin algorithm. (a) Shows how swapping node 1 of partition A with node 6 of partition B results in a gain of g = 2 (the number of edges cut falls from 4 to 2). (b) A graph of the gain resulting from swapping pairs of nodes.
(c) The total gain is equal to the sum of the gains obtained at each step.

[Figure: (a) Partitions A and B of a ten-node graph. Original configuration: edges cut = 4. After swapping nodes 1 and 6: edges cut = 2, so gain g_1 = 4 - 2 = 2. (b) Gain g_i from swapping the i th pair of nodes, plotted against i, the number of pairs of nodes pretend swapped. (c) Total gain G_n from swapping the first n pairs of nodes, plotted against n, the number of pairs of nodes actually swapped; the maximum is marked max(G_n).]

Terms used by the Kernighan–Lin partitioning algorithm. (a) An example network graph. (b) The connectivity matrix, C; the column and rows are labeled to help you see how the matrix entries correspond to the node numbers in the graph. For example, C_{17} (column 1, row 7) equals 1 because nodes 1 and 7 are connected. In this example all edges have an equal weight of 1, but in general the edges may have different weights.

[Figure (a): ten-node graph split into partitions A and B, with an internal edge and an external edge labeled.]

Connectivity matrix, C (rows and columns are node numbers):

         1  2  3  4  5  6  7  8  9 10
     1   0  0  0  0  0  0  1  0  0  0
     2   0  0  0  0  0  1  0  1  0  0
     3   0  0  0  1  1  0  0  0  0  0
     4   0  0  1  0  0  0  0  1  0  0
     5   0  0  1  0  0  0  0  0  0  0
     6   0  1  0  0  0  0  0  0  0  0
     7   1  0  0  0  0  0  0  0  1  0
     8   0  1  0  1  0  0  0  0  0  1
     9   0  0  0  0  0  0  1  0  0  1
    10   0  0  0  0  0  0  0  1  1  0

An example of network partitioning that shows the need to look ahead when selecting logic cells to be moved between partitions. Partitionings (a), (b), and (c) show one sequence of moves; partitionings (d), (e), and (f) show a second sequence. The partitioning in (a) can be improved by moving node 2 from A to B with a gain of 1. The result of this move is shown in (b). This partitioning can be improved by moving node 3 to B, again with a gain of 1. The partitioning shown in (d) is the same as (a). We can move node 5 to B with a gain of 1 as shown in (e), but now we can move node 4 to B with a gain of 2.
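
The Kernighan–Lin terms used in this section (connectivity matrix, edges cut, and the gain from swapping a pair of nodes) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than part of the notes: the edge list is read off the ten-node connectivity matrix C, the starting partitions are taken to be A = {1, ..., 5} and B = {6, ..., 10}, the helper names (connectivity, cut_weight, d_value, swap_gain) are hypothetical, and the gain expression g = D_a + D_b - 2 c_ab (with D = external cost - internal cost) is the usual textbook form, which these notes do not spell out.

```python
# Edge list read off the ten-node connectivity matrix C (assumption: all weights are 1).
EDGES = [(1, 7), (2, 6), (2, 8), (3, 4), (3, 5), (4, 8), (7, 9), (8, 10), (9, 10)]


def connectivity(edges, n):
    """Build a symmetric (n+1) x (n+1) matrix; row and column 0 are unused padding."""
    c = [[0] * (n + 1) for _ in range(n + 1)]
    for u, v in edges:
        c[u][v] = c[v][u] = 1
    return c


def cut_weight(c, part_a, part_b):
    """Sum the weights of all edges that cross between the two partitions."""
    return sum(c[a][b] for a in part_a for b in part_b)


def d_value(c, node, own, other):
    """External edge cost minus internal edge cost for one node."""
    external = sum(c[node][x] for x in other)
    internal = sum(c[node][x] for x in own if x != node)
    return external - internal


def swap_gain(c, a, b, part_a, part_b):
    """Reduction in cut weight if node a (in A) and node b (in B) are exchanged."""
    return d_value(c, a, part_a, part_b) + d_value(c, b, part_b, part_a) - 2 * c[a][b]


C = connectivity(EDGES, 10)
A = {1, 2, 3, 4, 5}      # assumed starting partition A
B = {6, 7, 8, 9, 10}     # assumed starting partition B

before = cut_weight(C, A, B)
gain = swap_gain(C, 1, 6, A, B)
after = cut_weight(C, (A - {1}) | {6}, (B - {6}) | {1})
print(before, gain, after)
```

Under these assumptions the predicted gain equals the drop in the number of edges cut when nodes 1 and 6 are exchanged, which is the check a K–L pass relies on when it "pretend swaps" pairs before committing to the best prefix of swaps.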

15.7.8 Simulated Annealing

Key terms and concepts: simulated-annealing algorithm uses an energy function as a measure • probability of accepting a move is exp(-ΔE/T) • ΔE is an increase in energy function • T corresponds to temperature • we hill climb to get out of a local minimum • cooling schedule • T_{i+1} = αT_i • good results at the expense of long run times • Xilinx used simulated annealing in one version of their tools

15.7.9 Other Partitioning Objectives

Key terms and concepts: timing, power, technology, cost and test constraints • many of these are hard to measure and not well handled by current tools

15.8 Summary

Key terms and concepts: The construction or physical design of a microelectronics system is a very large and complex problem. To solve the problem we divide it into several steps: system partitioning, floorplanning, placement, and routing. To solve each of these smaller problems we need goals and objectives, measurement metrics, as well as algorithms and methods • The goals and objectives of partitioning • Partitioning as an art not a science • The simple nature of the algorithms necessary for VLSI-sized problems • The random nature of the algorithms we use • The controls for the algorithms used in ASIC design
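
The simulated-annealing rule from Section 15.7.8 (accept an uphill move with probability exp(-ΔE/T), then cool with T_{i+1} = αT_i) can be sketched in Python as follows. This is only an illustration under stated assumptions: the energy function is taken to be the number of edges cut, moves are pairwise swaps so that the two partitions stay the same size, the edge list is read off the connectivity matrix of the Kernighan–Lin example, and the function names and parameter values (t0, alpha, steps_per_t, t_min, seed) are hypothetical choices, not values from the notes or from any vendor tool.

```python
import math
import random

# Assumed example graph: edges read off the ten-node connectivity matrix.
EDGES = [(1, 7), (2, 6), (2, 8), (3, 4), (3, 5), (4, 8), (7, 9), (8, 10), (9, 10)]
NODES = list(range(1, 11))


def accept_move(delta_e, temperature, rng):
    """Always accept downhill moves; accept uphill ones with probability exp(-dE/T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def cut_size(edges, side):
    """Energy function (assumption): the number of edges that cross the partition."""
    return sum(1 for u, v in edges if side[u] != side[v])


def anneal_bipartition(edges, nodes, t0=2.0, alpha=0.9, steps_per_t=20,
                       t_min=0.05, seed=0):
    rng = random.Random(seed)
    half = len(nodes) // 2
    side = {n: (0 if i < half else 1) for i, n in enumerate(nodes)}
    energy = cut_size(edges, side)
    best_side, best_energy = dict(side), energy
    t = t0
    while t > t_min:
        for _ in range(steps_per_t):
            a = rng.choice([n for n in nodes if side[n] == 0])
            b = rng.choice([n for n in nodes if side[n] == 1])
            side[a], side[b] = 1, 0            # trial swap keeps the partitions balanced
            new_energy = cut_size(edges, side)
            if accept_move(new_energy - energy, t, rng):
                energy = new_energy            # may be uphill: this is the hill climb
                if energy < best_energy:
                    best_side, best_energy = dict(side), energy
            else:
                side[a], side[b] = 0, 1        # reject: undo the swap
        t *= alpha                             # cooling schedule T_{i+1} = alpha * T_i
    return best_side, best_energy


initial_cut = cut_size(EDGES, {n: (0 if n <= 5 else 1) for n in NODES})
best_side, best_cut = anneal_bipartition(EDGES, NODES)
print(initial_cut, best_cut)
```

Because the sketch remembers the best partition seen, the returned cut is never worse than the starting one; the long run times mentioned in the notes show up here as the number of temperature steps times steps_per_t, both of which would need to grow considerably for a VLSI-sized network.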