Brewer, J.E., Zargham, M.R., Tragoudas, S., Tewksbury, S. “Integrated Circuits”
The Electrical Engineering Handbook
Ed. Richard C. Dorf
Boca Raton: CRC Press LLC, 2000
25
Integrated Circuits
25.1 Integrated Circuit Technology
Technology Perspectives • Technology Generations • National Technology Roadmap for Semiconductors
25.2 Layout, Placement, and Routing
What Is Layout? • Floorplanning Techniques • Placement Techniques • Routing Techniques
25.3 Application-Specific Integrated Circuits
Introduction • Primary Steps of VLSI ASIC Design • Increasing Impact of Interconnection Delays on Design •
General Transistor-Level Design of CMOS Circuits • ASIC Technologies • Interconnection Performance
Modeling • Clock Distribution • Power Distribution • Analog and Mixed-Signal ASICs
25.1 Integrated Circuit Technology
Joe E. Brewer
Integrated circuit (IC) technology, the cornerstone of the modern electronics industry, is subject to rapid change.
Electronic engineers, especially those engaged in research and development, can benefit from an understanding
of the structure and pattern of growth of the technology.
Technology Perspective
A solid state IC is a group of interconnected circuit elements formed on or within a continuous substrate.
While an integrated circuit may be based on many different material systems, silicon is by far the dominant
material. More than 98% of contemporary electronic devices are based on silicon technology. On the order of
85% of silicon ICs are complementary metal oxide semiconductor (CMOS) devices.
From an economic standpoint the most important metric for an IC is the “level of functional integration.”
Since the invention of the IC by Jack Kilby in 1958, the level of integration has steadily increased. The pleasant
result is that cost and physical size per function decrease continuously, and we enjoy a flow of new, affordable
information processing products that pervade all aspects of our day-to-day lives. The historical rate of increase
is a doubling of functional content per chip every 18 months.
For engineers who work with products that use semiconductor devices, the challenge is to anticipate and
make use of these enhanced capabilities in a timely manner. It is not an overstatement to say that survival in
the marketplace depends on rapid “design-in” and deployment.
For engineers who work in the semiconductor industry, or in its myriad of supporting industries, the
challenge is to maintain this relentless growth. The entire industry is marching to a drumbeat. The cost of
technology development and the investment in plant and equipment have risen to billions of dollars. Companies
that lag behind face a serious loss of market share and, possibly, dire economic consequences.
Joe E. Brewer
Northrop Grumman Corporation
Mehdi R. Zargham and
Spyros Tragoudas
Southern Illinois University
Stuart Tewksbury
West Virginia University
Technology Generations
The concept of a technology generation emerged from analysis of historical records, was clearly defined by
Gordon Moore in the 1960s, and codified as Moore’s law. The current version of the law is that succeeding
generations will support a four times increase in circuit complexity, and that new generations emerge at
approximately 3-year intervals. The associated observations are that linear dimensions of device features change
by a factor of 0.7, and the economically viable die size grows by a factor of 1.6.
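To make the arithmetic concrete, the short Python sketch below simply compounds these nominal factors (0.7 on linear features, 1.6 on die area, 4 on complexity) from the 1995 reference generation; it is an illustration only, and the computed feature sizes merely approximate the rounded roadmap values.

# Illustrative only: apply the nominal per-generation scaling factors quoted above
# (0.7x linear feature, 1.6x die area, 4x complexity every ~3 years), starting from
# the 1995 0.35 um generation used as the "0" reference in this section.
feature_um, die_area, complexity = 0.35, 1.0, 1.0
for generation in range(6):                    # 1995, 1998, ..., 2010
    year = 1995 + 3 * generation
    print(f"{year}: ~{feature_um:.2f} um, die area ~{die_area:.1f}x, complexity ~{complexity:.0f}x")
    feature_um *= 0.7
    die_area *= 1.6
    complexity *= 4.0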
Minimum feature size stated in microns (micrometers) is the term used most frequently to label a technology
generation. “Feature” refers to a geometric object in the mask set such as a linewidth or a gate length. The
“minimum feature” is the smallest dimension that can be reliably used to form the entity.
Figure 25.1 displays the technology evolution sequence. In the diagram succeeding generations are numbered
using the current generation as the “0” reference. Because this material was written in 1996, the “0” generation
is the 0.35 µm minimum feature size technology that began volume production in 1995.
An individual device generation has been observed to have a reasonably well-defined life cycle which covers
about 17 years. The first year of volume manufacture is the reference point for a generation, but its lifetime
actually extends further in both directions. As shown in Fig. 25.2, one can think of the stages of maturity as
ranging over a linear scale which measures years to production in both the plus and minus directions. The
17-year life cycle of a single generation, with new generations being introduced at 3-year intervals, means that
at any given time up to six generations are being worked on. This tends to blur the significance of research
news and company announcements unless the reader is sensitive to the technology overlap in time.
To visualize this situation, consider Fig. 25.3. The top row lists calendar years. The second row shows how
the life cycle of the 0.35 µm generation relates to the calendar. The third row shows the life cycle of the 0.25 µm
generation vs. the calendar. Looking down any column corresponding to a specific calendar year, one can see
which generations are active and identify their respective life cycle year.
FIGURE 25.1 Semiconductor technology generation time sequence.
FIGURE 25.2 Life cycle of a semiconductor technology generation.
FIGURE 25.3 Time overlap of semiconductor technology generations.
One should not interpret the 17-year life cycle as meaning that no work is being performed that is relevant
to a generation before the 17-year period begins. For example, many organizations are conducting experiments
directed at transistors with gate lengths smaller than 0.1 µm. This author’s interpretation is that when basic
research efforts have explored technology boundary conditions, the conditions are ripe for a specific generation
to begin to coalesce as a unique entity. When a body of research begins to seek compatible materials and
processes to enable design and production at the target feature size, the generation life cycle begins. This is a
rather diffused activity at first, and it becomes more focused as the cycle proceeds.
National Technology Roadmap for Semiconductors
The National Technology Roadmap for Semiconductors (NTRS) is an almost 200-page volume distributed by the
Semiconductor Industry Association (SIA). Focused on mainstream leading edge technology, the roadmap
provides a common vision for the industry. It enables a degree of cooperative precompetitive research and
development among the fiercely competitive semiconductor device manufacturers. It is a dynamic document
which will be revised and reissued to reflect learning on an as-needed basis.
The NTRS is compiled by engineers and scientists from all sectors of the U.S. IC technology base. Industry,
academia, and government organizations participate in its formulation. Key leaders are the Semiconductor
Research Corporation (SRC) and SEMATECH industry consortia. The roadmap effort is directed by the
Roadmap Coordinating Group (RCG) of the SIA.
The starting assumption of the NTRS is that Moore’s law will continue to describe the growth of the
technology. The overall roadmap comprises many individual roadmaps which address defined critical areas of
semiconductor research, development, engineering, and manufacturing. In each area, needs and potential
solutions for each technology generation are reviewed. Of course, this process is more definitive for the early
generations because knowledge is more complete and the range of alternatives is restricted.
The NTRS document provides a convenient summary table which presents some of the salient characteristics
of the six technology generations ranging from 1995 to 2010. That summary is reproduced (with minor
variations in format) as Table 25.1.
TABLE 25.1 Overall Roadmap Technology Characteristics
Year of First DRAM Shipment / Minimum Feature (µm): 1995/0.35, 1998/0.25, 2001/0.18, 2004/0.13, 2007/0.10, 2010/0.07
(Values in each row below are listed in that generation order.)
Memory
  Bits/chip (DRAM/Flash): 64M, 256M, 1G, 4G, 16G, 64G
  Cost/bit @ volume (millicents): 0.017, 0.007, 0.003, 0.001, 0.0005, 0.0002
Logic (high-volume microprocessor)
  Logic transistors/cm² (packed): 4M, 7M, 13M, 25M, 50M, 90M
  Bits/cm² (cache SRAM): 2M, 6M, 20M, 50M, 100M, 300M
  Cost/transistor @ volume (millicents): 1, 0.5, 0.2, 0.1, 0.05, 0.02
Logic (low-volume ASIC)
  Transistors/cm² (auto layout): 2M, 4M, 7M, 12M, 25M, 40M
  Non-recurring engineering cost/transistor (millicents): 0.3, 0.1, 0.05, 0.03, 0.02, 0.01
Number of chip I/Os
  Chip to package (pads), high performance: 900, 1350, 2000, 2600, 3600, 4800
Number of package pins/balls
  Microprocessor/controller: 512, 512, 512, 512, 800, 1024
  ASIC (high performance): 750, 1100, 1700, 2200, 3000, 4000
  Package cost (cents/pin): 1.4, 1.3, 1.1, 1.0, 0.9, 0.8
Chip frequency (MHz)
  On-chip clock, cost performance: 150, 200, 300, 400, 500, 625
  On-chip clock, high performance: 300, 450, 600, 800, 1000, 1100
  Chip-to-board speed, high performance: 150, 200, 250, 300, 375, 475
Chip size (mm²)
  DRAM: 190, 280, 420, 640, 960, 1400
  Microprocessor: 250, 300, 360, 430, 520, 620
  ASIC: 450, 660, 750, 900, 1100, 1400
Maximum number of wiring levels (logic)
  On-chip: 4–5, 5, 5–6, 6, 6–7, 7–8
Electrical defect density (d/m²): 240, 160, 140, 120, 100, 25
Minimum mask count: 18, 20, 20, 22, 22, 24
Cycle time, days (theoretical): 9, 10, 10, 11, 11, 12
Maximum substrate diameter (mm)
  Bulk or epitaxial or SOI wafer: 200, 200, 300, 300, 400, 400
Power supply voltage (V)
  Desktop: 3.3, 2.5, 1.8, 1.5, 1.2, 0.9
  Battery: 2.5, 1.8–2.5, 0.9–1.8, 0.9, 0.9, 0.9
Maximum power
  High performance with heatsink (W): 80, 100, 120, 140, 160, 180
  Logic without heatsink (W): 5, 7, 10, 10, 10, 10
  Battery (W): 2.5, 2.5, 3.0, 3.5, 4.0, 4.5
Design and test
  Volume tester cost/pin ($K): 3.3, 1.7, 1.3, 0.7, 0.5, 0.4
  Number of test vectors (µP/M): 16–32, 16–32, 16–32, 8–16, 4–8, 4
  % IC function with BIST/DFT: 25, 40, 50, 70, 90, 90+
Related Topics
1.1 Resistors • 23.1 Processes
Further Information
The NTRS is available from the SIA, 181 Metro Drive, Suite 450, San Jose, CA 95110, telephone 408-436-6600, fax
408-436-6646. The document can also be accessed via the SEMATECH home page at <http://www.sematech.org>.
Information concerning the IC life cycle can be found in Larrabee, G. B. and Chatterjee, P., “DRAM Manufacturing
in the 90’s — Part 1: The History Lesson” and “Part 2: The Roadmap,” Semiconductor International, pp. 84–92, May 1991.
25.2 Layout, Placement, and Routing
Mehdi R. Zargham and Spyros Tragoudas
Very large scale integrated (VLSI) electronics presents a challenge, not only to those involved in the development
of fabrication technology, but also to computer scientists, computer engineers, and electrical engineers. The
ways in which digital systems are structured, the procedures used to design them, the trade-offs between
hardware and software, and the design of computational algorithms will all be greatly affected by the coming
changes in integrated electronics.
A VLSI chip can today contain millions of transistors and is expected to contain more than 100 million
transistors in the year 2000. One of the main factors contributing to this increase is the effort that has been
invested in the development of computer-aided design (CAD) systems for VLSI design. The VLSI CAD systems
are able to simplify the design process by hiding the low-level circuit theory and device physics details from the
designer, and allowing him or her to concentrate on the functionality of the design and on ways of optimizing it.
A VLSI CAD system supports descriptions of hardware at many levels of abstraction, such as system,
subsystem, register, gate, circuit, and layout levels. It allows designers to design a hardware device at an abstract
level and progressively work down to the layout level. A layout is a complete geometric representation (a set
of rectangles) from which the latest fabrication technologies directly produce reliable, working chips. A VLSI
CAD system also supports verification, synthesis, and testing of the design. Using a CAD system, the designer
can make sure that all of the parts work before actually implementing the design.
A variety of VLSI CAD systems are commercially available that perform all or some of the levels of abstraction
of design. Most of these systems support a layout editor for designing a circuit layout. A layout editor is software
that provides commands for drawing lines and boxes, copying objects, moving objects, erasing unwanted
objects, and so on. The output of such an editor is a design file that describes the layout. Usually, the design
file is represented in a standard format, called Caltech Intermediate Form (CIF), which is accepted by the
fabrication industry.
What Is Layout?
For a specific circuit, a layout specifies the position and dimension of the different layers of materials as they
would be laid on the silicon wafer. However, the layout description is only a symbolic representation, which
simplifies the description of the actual fabrication process. For example, the layout representation does not
explicitly indicate the thickness of the layers, the thickness of the oxide coating, the amount of ionization in the transistor
channels, etc., but these factors are implicitly understood in the fabrication process. Some of the main layers
used in any layout description are n-diffusion, p-diffusion, poly, metal-1, and metal-2. Each of these layers is
represented by a polygon of a particular color or pattern. As an example, Fig. 25.4 presents a specific pattern
for each layer that will be used through the rest of this section.
As is shown in Fig. 25.5, an n-diffusion layer crossing a poly layer implies an nMOS transistor, and a
p-diffusion crossing poly implies a pMOS transistor.
Note that the widths of diffusion and poly are represented with a scalable parameter called lambda. These
measurements, referred to as design rules, are introduced to prevent errors on the chip, such as preventing thin
lines from opening (disconnecting) and short circuiting.
FIGURE 25.4 Different layers.
FIGURE 25.5 Layout and fabrication of MOS transistors.
Implementing the design rules based on lambda makes the design process independent of the fabrication
process. This allows the design to be rescaled as the fabrication process improves.
Metal layers are used as wires for connections between the components. This is because metal has the lowest
propagation delay compared to the other layers. However, sometimes a poly layer is also used for short wires
in order to reduce the complexity of the wire routing. Any wire can cross another wire without getting electrically
affected as long as they are in different layers. Two different layers can be electrically connected together using
contacts. The fabrication process of the contacts depends on types of the layers that are to be connected.
Therefore, a layout editor supports different types of contacts by using different patterns.
From the circuit layout, the actual chip is fabricated. Based on the layers in the layout, various layers of
materials, one on top of the other, are laid down on a silicon wafer. Typically, the process of laying down
each of these materials involves several steps, such as masking, oxide coating, lithography and etching [Mead
and Conway, 1980]. For example, as shown in Fig. 25.6(a), for fabricating an nMOS transistor, first two masks,
one for poly and one for n-diffusion, are obtained from the circuit layout. Next, the n-diffusion mask is used
to create a layer of silicon oxide on the wafer [see Fig. 25.6(b)]. The wafer will be covered with a thin layer of
oxide in places where the transistors are supposed to be placed as opposed to a thick layer in other places. The
poly mask is used to place a layer of polysilicon on top of the oxide layer to define the gate terminals of the
transistor [see Fig. 25.6(c)]. Finally, the n-diffusion regions are made to form the source and drain terminals
of the transistor [see Fig. 25.6(d)].
To better illustrate the concept of layout design, the design of an inverter in the CMOS technology is shown
in Fig. 25.7. An inverter produces an output voltage that is the logical inverse of its input. Considering the circuit
diagram of Fig. 25.7(a), when the input is 1, the lower nMOS is on, but the upper pMOS is off. Thus, the output
becomes 0 by becoming connected to the ground through the nMOS. On the other hand, if the input is 0, the
pMOS is on and the nMOS is off, so the output must find a charge-up path through the pMOS to the supply
and therefore becomes 1. Figure 25.7(b) represents a layout for such an inverter. As can be seen from this figure,
the problem of a layout design is essentially reduced to drawing and painting a set of polygons. Layout editors
provide commands for drawing such polygons. The commands are usually entered at the keyboard or with a
mouse and, in some menu-driven packages, can be selected as options from a pull-down menu.
FIGURE 25.6 Fabrication steps for an nMOS transistor.
In addition to the drawing commands, often a layout system provides tools for minimizing the overall area
of the layout (i.e., the size of the chip). Today a VLSI chip consists of many individual cells, with each one laid
out separately. A cell can be an inverter, a NAND gate, a multiplier, a memory unit, etc. The designer can make
the layout of a cell and then store it in a file called the cell library. Later, each time the designer wants to design
a circuit that requires the stored cell, he or she simply copies the layout from the cell library. A layout may
consist of many cells. Most layout systems provide floorplanning, placement, and routing
routines for placing the cells and then interconnecting them with wires in a way that minimizes the layout
area. As an example, Fig. 25.8 presents the placement of three cells. The area between the cells is used for
routing. The entire routing surface is divided into a set of rectangular routing areas called channels. The sides
of each channel consist of a set of terminals. A wire that connects the terminals with the same ID is called a
net. The router finds a location for the wire segments of each net within the channel. The following sections
classify various types of placement and routing techniques and provide an overview of the main steps of some
of these techniques.
Floorplanning Techniques
The floorplanning problem in computer-aided design of integrated circuits is similar to that in architecture,
and the goal is to find a location for each cell based on proximity (layout adjacency) criteria to other cells. We
FIGURE 25.7 An inverter.
FIGURE 25.8 Placement and routing.
consider rectangular floorplans whose boundaries are rectangles. It is desirable to obtain a floorplan that
minimizes the overall area of the layout.
An important problem in floorplanning is cell sizing, which determines the dimensions of variable cells whose
area is fixed. All cells are assumed to be rectangular, and the goal of cell sizing is to determine the width and
height of each cell, subject to predetermined upper and lower bounds on their ratio and to their product being
equal to the cell's area, so that the final floorplan has optimal area.
One of the early approaches to floorplanning is hierarchical, where bipartitioning or partitioning into
more than two parts is recursively employed and a floorplan tree is constructed. The tree simply reflects the
hierarchical construction of the floorplan. Figure 25.9 shows a hierarchical floorplan and its associated tree.
The partitioning problem and related algorithms are discussed extensively later in this section.
Many early hierarchical floorplanning tools insist that the floorplan be sliceable. A sliceable floorplan is
recursively defined as follows: (a) a cell or (b) a floorplan that can be bipartitioned into two sliceable floorplans
with either a horizontal or vertical line. Figure 25.10 shows a sliceable floorplan whose tree is binary.
Many tools that produce sliceable floorplans are still in use because of their
simplicity. In particular, many problems arising in sliceable floorplanning are solv-
able optimally in polynomial time [Sarrafzadeh and Wong, 1996]. Unfortunately,
sliceable floorplans are rarely optimal (in terms of their area), and they often result
in layouts with very difficult routing phases. (Routing is discussed later in this
section.) Figure 25.11 shows a compact floorplan that is not sliceable.
Hierarchical tools that produce nonsliceable floorplans have also been proposed
[Sarrafzadeh and Wong, 1996]. The major problem in the development of such
tools is that we are often facing problems that are intractable and thus we have to
rely on heuristics in order to obtain fast solutions. For example, the cell sizing
problem can be tackled optimally in sliceable floorplans [Otten, 1983 and Stock-
meyer, 1983] but the problem is intractable for general nonsliceable floorplans.
A second approach to floorplanning is the rectangular dual graph. The idea here is to use duality arguments
and express the cell adjacency constraints in terms of a graph, and then use an algorithm to translate the graph
into a rectangular floorplan. A rectangular dual graph of a rectangular floorplan is a planar graph G = (V,E),
where V is the set of cells and E is the set of edges, and an edge (C1, C2) is in E if and only if cells C1 and C2
are adjacent in the floorplan. See Fig. 25.12 for a rectangular floorplan and its rectangular dual graph G.
FIGURE 25.9 A hierarchical floorplan and its associated tree. The root node has degree 5. An internal node labeled with |
indicates a vertical slicing, and an internal node labeled with — indicates a horizontal slicing.
FIGURE 25.10 A sliceable floorplan and its associated binary tree.
FIGURE 25.11 A compact layout that is not sliceable.
Let us assume that the floorplan does not contain cross junctions. Figure 25.13 shows a cross junction. This
restriction does not significantly increase the area of a floorplan because, as Fig. 25.13 shows, a cross junction
can be replaced by two T-junctions by simply adding a short edge e.
It has been shown that in the absence of cross junctions the dual graph is planar triangulated (PT), and
every T-junction corresponds to a triangulated face of the dual PT graph. Unfortunately, not all PT graphs
have a rectangular floorplan. For example, in the graph of Fig. 25.14 we cannot satisfy the adjacency require-
ments of edges (a,b), (b,c), and (c,a) at the same time. Note that the latter edges form a cycle of length three
that is not a face. It has been shown that a PT graph has a rectangular floorplan if and only if it does not
contain such cycles of length three. Moreover, a linear time algorithm to obtain such a floorplan has been
presented [Sarrafzadeh and Wong, 1996]. The rectangular dual graph approach is a new method for floorplan-
ning, and many floorplanning problems, such as the sizing problem, have not been tackled yet.
Rectangular floorplans can be obtained using simulated annealing and genetic algorithms. Both techniques
are used to solve general optimization problems for which the solution space is not well understood. The
approaches are easy to implement, but the algorithms have many parameters which require empirical adjust-
ments, and the results are usually unpredictable.
A final approach to floorplanning, which unfortunately requires substantial computational resources and
results in an intractable problem, is to formulate the problem as a mixed-integer linear program (LP).
Consider the following definitions:
Wi, Hi, Ri: width, height, and area of cell Ci
Xi, Yi: coordinates of the lower left corner of cell Ci
X, Y: the width and height of the final floorplan
Ai, Bi: lower and upper bounds for the ratio Wi/Hi of cell Ci
Pij, Qij: variables that take 0/1 values for each pair of cells Ci and Cj
The goal is to find Xi, Yi, Wi, and Hi for each cell so that all constraints are satisfied and XY is minimized.
The latter is a nonlinear objective. However, we can fix the width W and minimize the height of the floorplan
as follows:
FIGURE 25.12 A rectangular floorplan and its associated dual planar graph.
FIGURE 25.13 A cross junction can be replaced by 2 T-junctions.
FIGURE 25.14 For a cycle of size 3 that is not a face we cannot satisfy all constraints.
min Y
Xi + Wi ≤ W
Y ≥ Yi + Hi
The complete mixed-integer LP formulation is [Sutanthavibul et al., 1991]:
min Y
Xi, Yi, Wi ≥ 0
Pij, Qij = 0 or 1
Xi + Wi ≤ W
Y ≥ Yi + Hi
Xi + Wi ≤ Xj + W(Pij + Qij)
Xj + Wj ≤ Xi + W(1 – Pij + Qij)
Yi + Hi ≤ Yj + H(1 + Pij – Qij)
Yj + Hj ≤ Yi + H(2 – Pij – Qij)
When Hi appears in the above equations, it must be replaced (using first-order approximation techniques)
by Hi = Di Wi + Ei, where Di and Ei are defined below:
Wmin = √(Ri Ai)    Wmax = √(Ri Bi)    Hmin = √(Ri/Bi)    Hmax = √(Ri/Ai)
Di = (Hmax – Hmin)/(Wmin – Wmax)
Ei = Hmax – Di Wmin
The unknown variables are Xi, Yi, Wi, Pij, and Qij. All other variables are known. The equations can then be
fed into an LP solver to find a minimum cost solution for the unknowns.
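As a concrete illustration of how such a formulation can be handed to a solver, the Python sketch below builds the constraints above for a tiny made-up set of cells using the third-party PuLP package and its bundled CBC solver (an assumption; any mixed-integer LP solver would do). The cell areas, aspect-ratio bounds, the fixed width W, and the explicit width bounds derived from the ratio limits are illustrative choices, not data from the text.

# A tiny, illustrative model of the mixed-integer LP floorplan formulation above,
# using the (assumed available) PuLP package. All cell data are made-up values.
import pulp

cells = {            # name: (area Ri, ratio lower bound Ai, ratio upper bound Bi)
    "c1": (4.0, 0.5, 2.0),
    "c2": (6.0, 0.5, 2.0),
    "c3": (2.0, 0.5, 2.0),
}
W = 4.0              # fixed floorplan width
Hbound = 12.0        # loose upper bound on floorplan height (the H in the constraints)

prob = pulp.LpProblem("floorplan", pulp.LpMinimize)
Y = pulp.LpVariable("Y", lowBound=0)
X = {c: pulp.LpVariable("X_" + c, lowBound=0) for c in cells}
Yc = {c: pulp.LpVariable("Y_" + c, lowBound=0) for c in cells}
Wc = {c: pulp.LpVariable("W_" + c, lowBound=0) for c in cells}
prob += Y            # objective: minimize the floorplan height

def height(c):
    # first-order approximation Hi = Di*Wi + Ei derived from Ri, Ai, Bi
    R, A, B = cells[c]
    wmin, wmax = (R * A) ** 0.5, (R * B) ** 0.5
    hmin, hmax = (R / B) ** 0.5, (R / A) ** 0.5
    D = (hmax - hmin) / (wmin - wmax)
    return D * Wc[c] + (hmax - D * wmin)

for c in cells:
    R, A, B = cells[c]
    prob += X[c] + Wc[c] <= W                  # cell stays inside the fixed width
    prob += Y >= Yc[c] + height(c)             # floorplan height covers the cell
    prob += Wc[c] >= (R * A) ** 0.5            # width bounds implied by the
    prob += Wc[c] <= (R * B) ** 0.5            # aspect-ratio limits (added here)

names = list(cells)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        ci, cj = names[i], names[j]
        P = pulp.LpVariable("P_%s_%s" % (ci, cj), cat="Binary")
        Q = pulp.LpVariable("Q_%s_%s" % (ci, cj), cat="Binary")
        # non-overlap: at least one relative-position constraint must be active
        prob += X[ci] + Wc[ci] <= X[cj] + W * (P + Q)
        prob += X[cj] + Wc[cj] <= X[ci] + W * (1 - P + Q)
        prob += Yc[ci] + height(ci) <= Yc[cj] + Hbound * (1 + P - Q)
        prob += Yc[cj] + height(cj) <= Yc[ci] + Hbound * (2 - P - Q)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for c in cells:
    print(c, X[c].value(), Yc[c].value(), Wc[c].value())
print("floorplan height:", Y.value())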
Placement Techniques
Placement is a restricted version of floorplanning where all cells have fixed dimensions. The objective of a
placement routine is to determine an optimal position on the chip for a set of cells in a way that the total
occupied area and total estimated length of connections are minimized. Given that the main cause of delay in
a chip is the length of the connections, providing shorter connections becomes an important objective in placing
a set of cells. The placement should be such that no cells overlap and enough space is left to complete all the
connections.
All exact methods known for determining an optimal solution require a computing effort that increases
exponentially with the number of cells. To overcome this problem, many heuristics have been proposed [Preas and
Lorenzetti, 1988]. There are basically three strategies of heuristics for solving the placement problem, namely,
constructive, partitioning, and iterative methods. Constructive methods create placement in an incremental
manner where a complete placement is only available when the method terminates. They often start by placing
a seed (a seed can be a single cell or a group of cells) on the chip and then continuously placing other cells
based on some heuristics such as size of cells, connectivity between the cells, design condition for connection
lengths, or size of chip. This process continues until all the cells are placed on the chip. Partitioning methods
divide the cells into two or more partitions so that the number of connections that cross the partition boundaries
is minimized. The process of dividing is continued until the number of cells per partition becomes less than a
certain small number. Iterative methods seek to improve an initial placement by repeatedly modifying it.
Improvement might be made by moving one cell to a new position or switching the positions of two or more
cells. After a change is made to the current placement configuration based on some cost function, a decision
is made to see whether to accept the new configuration. This process continues until an optimal (in most cases
a near optimal) solution is obtained. Often the constructive methods are used to create initial placement on
which an iterative method subsequently improves.
Constructive Method
In most of the constructive methods, at each step an unplaced cell is selected and then located in the proper
area. There are different strategies for selecting a cell from the collection of unplaced cells [Wimer and Koren,
1988]. One strategy is to select the cell that is most strongly connected to already placed cells. For each unplaced
cell, we find the total of its connections to all of the already placed cells. Then we select the unplaced cell that
has the maximum number of connections. As an example, consider the cells in Fig. 25.15. Assume that cells c1
and c2 are already placed on the chip. In Fig. 25.16 we see that cell c5 has been selected as the next cell to be
placed. This is because cell c5 has the largest number of connections (i.e., three) to cells c1 and c2.
FIGURE 25.15 Initial configuration.
FIGURE 25.16 Selection based on the number of connections.
The foregoing strategy does not consider area as a factor and thus results in fragmentation of the available
free area; this may make it difficult to place some of the large unplaced cells later. This problem can be overcome,
however, by considering the product of the number of connections and the area of the cell as the criterion for
selection. Figure 25.17 presents an example of such a strategy. Cell c3 is selected as the next choice because the
product of its area and its number of connections to c1 and c2 is the maximum.
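A minimal sketch of these two selection rules is given below in Python; the connectivity lists and cell areas are made-up values (not taken from the figures), with repeated entries standing for multiple wires between the same pair of cells.

# Illustrative selection rules for constructive placement: pick the unplaced cell
# with the most connections to already-placed cells, or weight that count by the
# cell's area to avoid fragmenting the free space. Data below are hypothetical.
def select_next(unplaced, placed, conns, area=None):
    # conns[c] lists the cells c connects to; repeated entries = multiple wires
    def score(c):
        n = sum(1 for other in conns[c] if other in placed)
        return n * area[c] if area else n
    return max(unplaced, key=score)

placed = {"c1", "c2"}
conns = {"c3": ["c1", "c2"], "c4": ["c2"], "c5": ["c1", "c1", "c2"]}
print(select_next({"c3", "c4", "c5"}, placed, conns))                        # most connections
print(select_next({"c3", "c4", "c5"}, placed, conns,
                  area={"c3": 9, "c4": 2, "c5": 4}))                         # connections x area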
Partitioning Method
The approaches for the partitioning method can be classified as quadratic and sliced bisection. In both
approaches the layout is divided into two subareas, A and B, each having a size within a predefined range. Each
cell is assigned to one of these subareas. This assignment is such that the number of interconnections between
the two subareas is minimal. For example, Fig. 25.18 presents successive steps for the quadratic and sliced-
bisection methods. As shown in Fig. 25.18(a), in the first step of the quadratic method the layout area is divided
into two almost equal parts; in the second step the layout is further divided into four almost equal parts in the
opposite direction. This process continues until each subarea contains only one cell. Similar to the quadratic
method, the sliced-bisection also divides the layout area into several subareas.
The sliced-bisection method has two phases. In the first phase, the layout area is iteratively divided into a
certain number of almost equal subareas in the same direction. In this way, we end up with a set of slices [see
Fig. 25.18(b)]. Similarly, the second phase divides the area into a certain number of subareas; however, the
slicing is done in the opposite direction.
Several heuristics have been proposed for each of the preceding partitioning methods. Here, for example,
we emphasize the work of Fiduccia–Mattheyses [Fiduccia and Mattheyses, 1982], which uses the quadratic
method. For simplicity, their algorithm is only explained for one step of this method. Initially the set of cells
is randomly divided into two sets, A and B. Each set represents a subarea of the layout and has size equal to
the area it represents. A cell is selected from one of these sets to be moved to the other set. The selection of the
FIGURE 25.17 Selection based on the number of connections and area.
FIGURE 25.18 Partitioning.
cell depends on three criteria. The cell should be free, and it must have the maximum gain among the gains
of all other free cells. A cell c has gain g if the number of interconnections between the cells of the two sets
decreases by g units when c is moved from its current set to the other. Finally, the selected cell's move should
not violate a predefined balancing criterion that guarantees that the sizes of the two sets remain almost equal. After
moving the selected cell from its current set to the complementary set, it is no longer free. A new partition,
which corresponds to the new instance of the two sets A and B, is created. The cost of a partition is defined as
the number of interconnections between cells in the two sets of the partition. The Fiduccia–Mattheyses algo-
rithm keeps track of the best partition encountered so far, i.e., the partition with the minimum cost. The
algorithm will move the selected cell to the complementary set even if its gain is negative. In this case, the new
partition is worse than the previous one, but the move can eventually lead to better partitions.
This process of generating new partitions and keeping track of the best encountered partition is repeated until
no free cells are left. At this point of the process, which is called a pass, the algorithm returns the best partition
and terminates. To obtain better partitions, the algorithm can be modified such that more passes occur. This
can easily be done by selecting the best partition of a pass as the initial partition of the next pass. In this partition,
however, all cells are free. The modified algorithm terminates whenever a new pass returns a partition that is no
better than the partition returned by the previous pass. This way, the number of passes generated will never be
more than the number of the interconnections in the circuit and the algorithm always terminates.
The balancing criterion in the Fiduccia–Mattheyses algorithm is easily maintained when the cells have
uniform areas. The only way a pass can be implemented to satisfy the criterion is to start with a random initial
partition in which the two sets differ by one cell, each time select the cell of maximum gain from the larger
set, move the cell and generate a new partition, and repeat until no free cells are left.
In the example of Fig. 25.19, the areas of the cells are nonuniform. However, the assigned cell areas ensure
that the previously described operation occurs so that the balancing criterion is satisfied. (The cell areas are
omitted in this figure, but they correspond to the ones given in Fig. 25.15.) Figure 25.19 illustrates a pass of
the Fiduccia–Mattheyses algorithm. During this pass, six different partitions are generated, i.e., the initial
partition and five additional ones. Note that according to the description of the pass, the number of additional
partitions equals the number of cells in the circuit.
FIGURE 25.19 Illustration of a pass.
Each partition consists of the cells of the circuit (colored according to the set to which they belong and
labeled with an integer), the gain value associated with each free cell in the set from which the selected cell will
be moved (this value can be a negative number), and the nets (labeled with letters). In the figure a cell that is
no longer free is distinguished by an empty circle placed inside the rectangle that represents that cell.
The initial partition has cost 5 since nets a, b, h, g, f connect the cells in the two sets. Then the algorithm
selects cell 1, which has the maximum gain. The new partition has cost 3 (nets e, g, f), and cell 1 is no longer
free. The final partition has no free cells. The best partition in this pass has cost 3.
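The sketch below illustrates one pass of this scheme in Python. It is written for clarity rather than speed (the bucket data structure that gives the real algorithm its linear-time behavior is omitted), it assumes uniform cell areas for the balancing rule, and the netlist and initial partition are made-up example data.

# A simplified, illustrative single pass in the spirit of Fiduccia-Mattheyses.
def fm_pass(nets, side):
    """nets: list of nets, each a list of cell names.
    side: dict cell -> 'A' or 'B', the initial partition (modified in place)."""
    def cut_cost():
        return sum(1 for net in nets if len({side[c] for c in net}) == 2)

    def gain(cell):
        # cut nets that would become uncut minus uncut nets that would become cut
        g = 0
        for net in nets:
            if cell not in net or len(net) < 2:
                continue
            same = sum(1 for c in net if side[c] == side[cell])
            if same == 1:                 # cell is alone on its side: net becomes uncut
                g += 1
            elif same == len(net):        # net entirely on the cell's side: it becomes cut
                g -= 1
        return g

    free = set(side)
    best_cost, best_sides = cut_cost(), dict(side)
    while free:
        # balancing: take the free cell of maximum gain from the larger set
        size_a = sum(1 for s in side.values() if s == 'A')
        larger = 'A' if size_a >= len(side) - size_a else 'B'
        candidates = [c for c in free if side[c] == larger] or list(free)
        cell = max(candidates, key=gain)
        side[cell] = 'B' if side[cell] == 'A' else 'A'   # move even if gain < 0
        free.remove(cell)
        if cut_cost() < best_cost:
            best_cost, best_sides = cut_cost(), dict(side)
    side.update(best_sides)               # return the best partition of the pass
    return best_cost

# hypothetical 6-cell ring circuit and an arbitrary initial partition
nets = [["1", "2"], ["2", "3"], ["3", "4"], ["4", "5"], ["5", "6"], ["1", "6"]]
side = {"1": "A", "2": "A", "3": "A", "4": "B", "5": "B", "6": "B"}
print(fm_pass(nets, side), side)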
Iterative Method
Many iterative techniques have been proposed. Here, we emphasize one of these techniques called simulated
annealing. Simulated annealing, as proposed by Kirkpatrick et al. [1983], makes the connection between
statistical mechanics and combinatorial optimization problems. The main advantage with simulated annealing
is its hill-climbing ability, which allows it to back out of inferior local solutions and find better solutions.
Sechen has applied simulated annealing to the placement problem and has obtained good solutions [Sechen,
1990]. The method basically involves the following steps:
BEGIN
1. Find an initial configuration by placing the cells randomly. Set the initial temperature, T, and the
   maximum number of iterations.
2. Calculate the cost of the initial configuration. A general form of the cost function may be:
   Cost = c1 * (Area of layout) + c2 * (Total interconnection length)
   where c1 and c2 are tuning factors.
3. While (stopping criterion is not satisfied)
   {
   a. For (maximum number of iterations)
      {
      1. Transform the old configuration into a new configuration. This transformation can be in the
         form of an exchange of the positions of two randomly selected cells or a change of the position
         of a randomly selected cell.
      2. Calculate the cost of the new configuration.
      3. If (new cost < old cost), accept the new configuration; else accept it with probability
         e^(-(new cost - old cost)/T). There are also other options for the probability function.
      }
   b. Update the temperature.
   }
END
The parameter T is called the temperature; it is initially set to a very large value, so that the probability of accepting
“uphill” moves is very close to 1, and it is slowly decreased toward zero according to a rule called the cooling
schedule. Usually, the new reduced temperature is calculated as follows:
New temperature = (cooling rate) × (Old temperature)
Using a faster cooling rate can result in getting stuck at local minima; however, a cooling rate that is too
slow can pass over the possible global minima. In general, the cooling rate is taken from approximately 0.80 to
0.95.
Usually, the stopping criterion for the while-loop is implemented by recording the cost function’s value at
the end of each temperature stage of the annealing process. The stopping criterion is satisfied when the cost
function’s value has not changed for a number of consecutive stages.
Though simulated annealing is not the ultimate solution to placement problems, it gives very good results
compared to the other popular techniques. The long execution time of this algorithm is its major disadvantage.
Although a great deal of research has been done in improving this technique, substantial improvements have
not been achieved.
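For illustration, the following Python sketch implements the loop above for a toy placement problem in which the cost is simply the total half-perimeter wire length (the area term of the cost function is omitted and cell overlaps are ignored). The netlist, grid size, and schedule parameters are arbitrary choices.

# A runnable sketch of the annealing loop above for a toy placement problem.
import math, random

def anneal(cells, nets, grid=10, T=100.0, cooling=0.9, iters=200, stop_stages=5):
    pos = {c: (random.randrange(grid), random.randrange(grid)) for c in cells}

    def cost():
        total = 0
        for net in nets:                   # half-perimeter bounding box per net
            xs = [pos[c][0] for c in net]
            ys = [pos[c][1] for c in net]
            total += (max(xs) - min(xs)) + (max(ys) - min(ys))
        return total

    best, unchanged = cost(), 0
    while unchanged < stop_stages:         # stop after several stages with no gain
        for _ in range(iters):
            a, b = random.sample(cells, 2)
            old = cost()
            pos[a], pos[b] = pos[b], pos[a]            # move: swap two cells
            new = cost()
            if new > old and random.random() >= math.exp(-(new - old) / T):
                pos[a], pos[b] = pos[b], pos[a]        # reject the uphill move
        T *= cooling                       # cooling schedule
        stage_cost = cost()
        unchanged = unchanged + 1 if stage_cost >= best else 0
        best = min(best, stage_cost)
    return pos, best

cells = ["c%d" % i for i in range(8)]
nets = [["c0", "c1", "c2"], ["c2", "c3"], ["c3", "c4", "c5"], ["c5", "c6", "c7"]]
print(anneal(cells, nets)[1])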
Routing Techniques
Given a collection of cells placed on a chip, the routing problem is to connect the terminals (or ports) of these
cells for a specific design requirement. The routing problem is often divided into three subproblems, handled by global,
detailed, and specialized routers. The global router considers the overall routing region in order to distribute the
nets over the channels based on their capacities while keeping the length of each net as short as possible. For
every channel that a net passes through, the net’s id is placed on the sides of the channel. Once the terminals of
each channel are determined, the detailed router connects all the terminals with the same id by a set of wire
segments. (A wire segment is a piece of material described by a layer, two end-points, and a width.) The specialized
router is designed to solve a specific problem such as routing of power and ground wires. These wires require
special attention for two reasons: 1) they are usually routed in one layer in order to reduce the parasitic capacitance
of contacts, and 2) they are usually wider than other wires (signal and data) since they carry more current.
Detailed routers further divide into general-purpose and restricted routers. The general-purpose routers
impose very few constraints on the routing problem and operate on a single connection at a time. Since these
routers work on the entire design in a serial fashion, the size of the problems they can attempt is limited. On
the other hand, the restricted routers require some constraints on the routing problem, such as empty rectan-
gular areas with all of the pins on the periphery. Because of their limited scope, these routers can do a better
job of modeling the contention of nets for the routing resources and therefore can be viewed as routing the
nets in parallel.
To reduce the complexity of restricted routers, this type of router often uses a rectangular grid on which
trunks (horizontal wire segments) and branches (vertical wire segments) are placed on different layers. In other
words, the layers supported by the technology are divided into two groups, horizontal and vertical. This is
known as the Manhattan model. On the other hand, in a non-Manhattan model, the assignment of a layer to
a vertical or horizontal direction is not enforced. Given this freedom in assigning directions to layers, the channel
width and the number of vias are reduced in many cases, so the latter model usually produces a better result than the former.
In the literature, many different techniques have been proposed for
restricted routers. In general these techniques can be grouped into four
different approaches: 1) algorithms (such as left-edge, maze, greedy, hier-
archical); 2) expert systems; 3) neural networks; and 4) genetic algorithms
[Zobrist, 1994; Sarrafzadeh, 1996; and Lengauer, 1990]. As an example,
we consider here only one of the techniques, called a maze router which
is widely known. The maze router can be used as a global router and/or
detailed router. It finds the shortest rectilinear path by propagating a
wavefront from a source point toward a destination point [Lee, 1969].
Considering the routing surface as a rectangular array of cells, the algo-
rithm starts by marking the source cell as visited. In successive steps, it
visits all the unvisited neighbors of visited cells. This continues until the
destination cell is visited. For example, consider the cell configuration
given in Fig. 25.20.
We would like to find a minimal-crossing path from source cell A
(cell 2) to destination cell B (cell 24). (The minimal-crossing path is
defined as a path that crosses over the fewest number of existing paths.)
The algorithm begins by assigning the start cell 2 to a list, denoted as L,
i.e., L consists of the single entry {2}. For each entry in L, its immediate
neighbors (which are not blocked) will be added to an auxiliary list L′.
(The auxiliary cell list L′ is provided for momentary storage of names of
cells.) Therefore, list L′ contains entries {1,3}. To these cells a chain coor-
dinate and a weight are assigned, as denoted in Fig. 25.21. For example,
in cell 3, we have the pair (0, →), meaning that the chain coordinate points
toward the right and the cell weight is 0. The weight for a cell represents the
number of wires that should be crossed in order to reach that cell from
the source cell. The cells with minimum weight in list L¢ are appended
FIGURE 25.20 Initial configuration.
FIGURE 25.21 First step.
to list L. Thus, cells 1 and 3 are appended to list L. Moreover, cell 2 is erased from list L. Appending the
immediate neighbors of the cells in L to L′, we find that list L′ now contains entries 4 and 8. Note that cell 8
has a weight of 1; this is because a wire must be crossed in order to reach this cell. Again the cells with
minimum weight in list L′ are appended to list L, and cell 3 and cell 1 are erased from L. Now L contains
entry {4}. The above procedure is repeated until it reaches the destination cell B. Then a solution is found by tracing
the chain coordinates from cell B back to cell A, as shown in Fig. 25.22.
The importance of Lee’s algorithm is that it always finds the shortest path between two points. Since it routes
one net at a time, however, there is a possibility of having some nets unrouted at the end of the routing process.
The other weak points of this technique are the requirements of a large memory space and long execution time.
For this reason, the maze router is often used as a side router for the routing of critical nets and/or routing of
leftover unrouted nets.
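The sketch below shows the unweighted core of such a maze router in Python: a breadth-first wavefront expansion from the source followed by a trace-back along decreasing labels. The grid size, obstacles, and terminals are made-up values; the weighted minimal-crossing variant described above would replace the FIFO queue with a priority queue keyed on the crossing weight.

# A minimal sketch of Lee-style wave propagation on a rectangular grid of cells.
from collections import deque

def lee_route(rows, cols, blocked, src, dst):
    dist = {src: 0}
    frontier = deque([src])
    while frontier:                                    # wavefront expansion
        cell = frontier.popleft()
        if cell == dst:
            break
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and nxt not in blocked and nxt not in dist):
                dist[nxt] = dist[cell] + 1
                frontier.append(nxt)
    if dst not in dist:
        return None                                    # net cannot be routed
    path, cell = [dst], dst
    while cell != src:                                 # trace back toward the source
        r, c = cell
        cell = next(n for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
                    if dist.get(n) == dist[cell] - 1)
        path.append(cell)
    return path[::-1]                                  # source-to-destination order

blocked = {(1, 1), (1, 2), (1, 3)}                     # cells occupied by existing wiring
print(lee_route(4, 5, blocked, (0, 0), (3, 4)))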
Defining Terms
Floorplanning: A floorplanning routine determines an approximate position for each cell so that the total area
is minimized.
Layout: Specifies the position and dimension of the different layers of materials as they would be laid on
the silicon wafer.
Placement: A placement routine determines an optimal position on the chip for a set of cells with fixed
dimensions in a way that the total occupied area and the total estimated length of connections are minimized.
Routing: Given a collection of cells placed on a chip, the routing routine connects the terminals of these cells
for a specific design requirement.
Related Topic
23.1 Processes
References
C. M. Fiduccia and R. M. Mattheyses, “A linear-time heuristic for improving network partitions,” Proceedings
of the 19th Annual Design Automation Conference, (July), pp. 175–181, 1982.
S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598
(May), pp. 671–680, 1983.
C. Y. Lee, “An algorithm for path connections and its application,” IRE Transactions on Electronic Computers,
(Sept.), pp. 346–365, 1969.
T. Lengauer, Combinatorial Algorithms for Integrated Circuit Layout, New York: John Wiley & Sons, 1990.
C. A. Mead and L. A. Conway, Introduction to VLSI Systems, Reading, Mass.: Addison-Wesley, 1980.
R. H. J. M. Otten, “Efficient floorplan optimization,” International Journal on Computer Design, pp. 499–503,
IEEE/ACM, 1983.
B. Preas and M. Lorenzetti, Physical Design Automation of VLSI Systems, Menlo Park, Calif.: Benjamin/Cum-
mings, 1988.
FIGURE 25.22 Final step.
M. Sarrafzadeh and C. K. Wong, An Introduction to VLSI Physical Design, New York: McGraw-Hill, 1996.
C. Sechen, “Chip-planning, placement and global routing of macro-cell integrated circuits using simulated
annealing,” International Journal of Computer Aided VLSI Design 2, pp. 127–158, 1990.
L. Stockmeyer, “Optimal orientation of cells in slicing floorplan designs,” Information and Control, 57 (2)
pp. 91–101, 1983.
S. Sutanthavibul, E. Shargowitz, and J. B. Rosen, “An analytical approach to floorplan design and optimization,”
IEEE Transactions on Computer-Aided Design, 10 (6), pp. 761–769, 1991.
S. Wimer and I. Koren, “Analysis of strategies for constructive general block placement,” IEEE Transactions on
Computer-Aided Design, vol. 7, no. 3 (March), pp. 371–377, 1988.
G. W. Zobrist, Editor, Routing, Placement, and Partitioning, Ablex Publishing, 1994.
Further Information
Other recommended layout design publications include Weste and Eshraghian, Principles of CMOS VLSI Design:
A Systems Perspective, Reading, Mass.: Addison-Wesley, 1988, and the book by B. Preas and M. Lorenzetti,
Physical Design Automation of VLSI Systems, Menlo Park, Calif.: Benjamin/Cummings, 1988. The first book
describes the design and analysis of a layout. The second book describes different techniques for development
of CAD systems.
Another source is IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, which is
published monthly by the Institute of Electrical and Electronics Engineers.
25.3 Application-Specific Integrated Circuits
S. K. Tewksbury
Introduction
Present-day silicon very large scale integration complementary metal-oxide semiconductor (VLSI CMOS)
technologies can place and interconnect several million transistors (representing over a million gates) on a
single integrated circuit (IC) approximately 1 cm square. Provided with such a vast number of gates, a digital
system designer can implement very sophisticated and complex system functions (including full systems) on
a single IC. However, efficient design (including optimized performance) of such functions using all these gates
is a puzzle of immense complexity. If this technology had been provided to the world overnight,
it is doubtful that designers could in fact make use of this vast amount of logic on an IC.
However, this technology instead has evolved over a long period of time (about three decades), starting with
only a few gates per IC, with the number of gates per IC doubling consistently about every 18 months (a
progression referred to as Moore’s Law), and evolving to the present high gate densities per IC. Projections
[The National Technology Roadmap for Semiconductors, 1994] of this evolution over the next 15 years, as
shown in Table 25.2, promise continued dramatic advances in the amount of logic and memory which will be
provided on individual ICs. Paralleling the technology evolution, computer-aided design (CAD) tools and
electronics design automation (EDA) tools [Rubin, 1987; Sherwani, 1993; Hill and Peterson, 1993; Banerjee,
1994] have evolved to assist designers of the increasingly complex ICs. With these CAD tools, today’s design
teams effectively have an army of very “experienced” designers embedded in the tools, capable of applying the
knowledge gained over the long and steady history of IC evolution. This prior experience captured in CAD/EDA
tools includes the ability to convert a high-level description [Camposano and Wolf, 1991; Gajski et al., 1992]
of a specific function (e.g., ALU, register, control unit, microcontroller, etc.) into an efficient physical imple-
mentation of that function.
Figure 25.23 is a photomicrograph of a contemporary, high-performance VLSI application-specific IC (ASIC)
circuit, the ADSP-21060 (SHARC) digital signal processor (DSP) from Analog Devices, Inc. The right two thirds
of the IC is 4 Mbit of SRAM, providing considerable on-chip memory. The DSP is on the left third of the IC.
A 0.5-µm CMOS technology with two levels of metal was used, with a total of about 20 million transistors on
the IC. The IC provides 120 MFLOPS of performance.
This section summarizes the design process, the gate-level physical design, and several issues which have
become particularly important with today’s VLSI technologies, with a focus on ASICs. An important and
growing issue is that of testing, including built-in testing, design for testability, and related topics (see e.g., Jha
and Kundu [1990] and Parker [1992]). This is a broad topic, beyond the scope of this section.
Primary Steps of VLSI ASIC Design
The VLSI IC design process consists of a sequence of well-defined steps [Preas and Lorenzetti, 1988; Hill et al.,
1989; DeMicheli, 1994a and b] related to the definition of the functions to be designed; organization of the
circuit blocks implementing these logic functions within the area of the IC; verification and simulation at several
stages of design (e.g., behavioral simulation, gate-level simulation, circuit simulation [White and Sangiovanni-
Vincentelli, 1987; Lee et al., 1993], etc.); routing of physical interconnections among the blocks, and final
detailed placement and transistor-level layout of the VLSI circuit. This process can also be used hierarchically
to design one of the blocks making up the overall IC, representing that circuit block in terms of simpler blocks.
This establishes the “top-down” hierarchical approach, extendable to successively lower-level elements of the
overall design.
These general steps are illustrated in Fig. 25.24(a), roughly showing the basic steps taken. A representative
example [Lipman, 1995] of a contemporary design approach is illustrated in Fig. 25.24(b). Design approaches
are continually changing and Fig. 25.24(b) is merely one of several current design sequences. Below, we
summarize the general steps highlighted in Fig. 25.24(a).
A. Behavioral Specification of Function: The behavioral specification is essentially a description of the func-
tion expected to be performed by the IC. The design can be represented by schematic capture, with the
designer representing the design using block diagrams. High-level description languages (HDLs) such
as VHDL [Armstrong, 1989; Lipsett et al., 1990; Mazor and Langstraat, 1992] and Verilog [Thomas and
Moorby, 1991] are increasingly used to provide a detailed specification of the function in a manner
which is largely independent of the physical design of the function. VHDL and Verilog are “hardware
description languages,” representing designs from a variety of viewpoints including behavioral descrip-
tions, structural descriptions, and logical descriptions. Figure 25.25(a) illustrates the specification of the
overall function in terms of subfunctions (A(s), B(s), …, E(s)) as well as expansion of one subfunction
(C(s)) into still simpler functions (c1, c2, c3, …, c6).
B. Verification of Function’s Behavior: It is important to verify that the behavioral specification of today’s
complex ICs properly represents the behavior desired. In the case of HDL languages, there may be
software “debugging” of the “program” until the desired behavior is obtained, much as programming
languages need to be debugged until correct operation is obtained. Early verification is important since
any errors in the specification of the function will lead to an IC which is faulty as a result of the design,
rather than of physical defects.
TABLE 25.2 Prediction of VLSI Evolution by the Semiconductor Industry Association
Year: 1995, 1998, 2001, 2004, 2007, 2010
Feature size (µm): 0.35, 0.25, 0.18, 0.13, 0.10, 0.07
DRAM bits/chip: 64M, 256M, 1G, 4G, 16G, 64G
ASIC gates/chip: 5M, 14M, 26M, 50M, 210M, 430M
Chip size (ASIC) (mm²): 450, 660, 750, 900, 1100, 1400
Maximum number of wiring levels (logic): 4–5, 5, 5–6, 6, 6–7, 7–8
On-chip speed (MHz): 300, 450, 600, 800, 1000, 1100
Chip-to-board speed (MHz): 150, 200, 250, 300, 375, 475
Desktop supply voltage (V): 3.3, 2.5, 1.8, 1.5, 1.2, 0.9
Maximum power (W), heatsink: 80, 100, 120, 140, 160, 180
Maximum power (W), no heatsink: 5, 7, 10, 10, 10, 10
Power (W), battery systems: 2.5, 2.5, 3.0, 3.5, 4.0, 4.5
Number of I/Os: 900, 1350, 2000, 2600, 3600, 4800
Adapted from The National Technology Roadmap for Semiconductors, Semiconductor Industry
Association, San Jose, Calif., 1994.
FIGURE 25.23 Photomicrograph of the SHARC DSP of Analog Devices, Inc. (Courtesy of Douglas Garde, Analog Devices,
Inc., Norwood, Mass.)
C. Mapping of Logical Function into Physical Blocks: Next, the logical functions, e.g., A(s), B(s), …, E(s) in
Fig. 25.25(a), are converted into physical circuit blocks, e.g., blocks A(b), B(b), …, E(b) in Fig. 25.25(b).
Each physical block represents a logic function as a set of interconnected gates. Although details of the
physical layout are not known at this point, the estimated area and aspect ratio (ratio of height to width)
of each circuit block are needed to organize these blocks within the area of the IC.
FIGURE 25.24 Representative VLSI design sequences. (a) Simplified but representative sequence. (b) Example design
approach from recent trade journal [Lipman, 1995].
FIGURE 25.25 Circuit stages of design. (a) Initial specification (e.g., HDL, schematic, etc.) of ASIC function in terms of
functions, with the next lower level description of function C illustrated. (b) Estimated size of physical blocks implementing
functions. (c) Floorplanning to organize blocks on an IC. (d) Placement and routing of interconnections among blocks.
D. Floorplanning: Next, the individual circuit blocks must be compactly arranged to fit within the minimum
area. Floorplanning establishes this organization, as illustrated in Fig. 25.25(c). At this stage, routing of
interconnections among blocks has not been performed, perhaps requiring modifications to the floorplan
after routing. During floorplanning, the design of a logic function in terms of a block function
can be modified to achieve shapes which better match the available IC area. For example, the block E(b)
in Fig. 25.25(b) has been redesigned to provide a different geometric shape for block E(f) in the floorplan
in Fig. 25.25(c).
E. Verification/Simulation of Function Performance: Given the floorplan, it is possible to estimate the average
length of interconnections among the blocks (with the actual lengths not known until after interconnection
routing in step F below). Signal timing throughout the IC is estimated, allowing verification that the
various circuit blocks interact within timing margins.
F. Placement and Routing: When an acceptable floorplan has been established, the next step is to complete
the routing of interconnections among the various blocks of the design. As interconnections are routed,
the overall IC area expands to provide the area for the wiring, with the possibility that the original
floorplan (which ignored interconnections) is not optimal. Placement [Shahookar and Mazumder, 1991]
considers various rearrangements of the circuit blocks, without changing their internal design but
allowing rotations, etc. For example, the arrangement of some of the blocks in Fig. 25.25(c) has been
changed in the arrangement in Fig. 25.25(d).
G. Verification/Simulation of Performance: Following placement and routing, the detailed layout of the circuit
blocks and interconnections has been established and more accurate simulations of signal timing and
circuit behavior can be performed, verifying that the circuit behaves as desired with the interblock
interconnections in place.
H. Physical Design at Transistor/Cell Level: As the above steps are completed, the design is moving closer
toward a full definition of the ASIC circuit in terms of a set of physical masks precisely specifying
placement of all the transistors and interconnections. In this step, that process is completed.
I. Verification/Simulation of Performance: Before fabricating the masks and proceeding with manufacture
of the ASIC circuit, a final verification of the ASIC is normally performed. Figure 25.24(b) represents
this step as “golden simulation,” a process based on detailed and accurate simulation tools, tuned to the
process of the foundry and providing the final verification of desired performance.
Increasing Impact of Interconnection Delays on Design
In earlier generations of VLSI technology (with larger transistors and wider interconnection lines/spacings),
delays through the low-level logic gates greatly dominated delays along interconnection lines. This was largely
the result of the lower resistance of the larger cross-section interconnections. Under these conditions, a single
pass through a design sequence such as shown in Fig. 25.24 was often adequate. In particular, the placement
of blocks on the ICs, although impacting the lengths of interconnections among blocks, did not have a major
impact on performance since the gate delays within the blocks were substantially larger than interconnection
delays between blocks. Under such conditions, the steps of floorplanning and of placement and routing focused
on such objectives as minimum overall area and minimum interconnection area.
However, as feature sizes have decreased below about 0.5 μm, this condition has changed and current VLSI
technology has interconnection delays substantially larger than logic delays. The increasing importance of
interconnection delays is driven by several effects. The smaller feature size leads to interconnections with a
higher resistance R* per unit length and a higher capacitance C* per unit length (the capacitance increase
also reflecting additional metal layers and coupling capacitances). For interconnection lines among high-level
blocks (spanning the IC), the result is a larger RC time constant, (R*·L)(C*·L), with L the line length. While
interconnect delays are increasing, gate delays are decreasing. Figure 25.26 illustrates the general behavior on
technology scaling to smaller features. A logic function F in a previous-generation technology requires a smaller
physical area and has a higher speed in a later, scaled technology (i.e., a technology with feature sizes decreased).
Although the intrablock line lengths decrease (relaxing the impact within the block of higher R*C*), the
interblock lines continue to have lengths proportional to the overall IC size (which is increasing), with the
larger R*C* leading to increased RC delays on such interconnections.
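The scaling argument can be illustrated numerically. The following minimal Python sketch evaluates the RC product (R*·L)(C*·L) for an intra-block line and for a cross-chip line before and after scaling; the scale factors and line parameters are assumptions chosen only to show the trend, not process data.

# Minimal numerical sketch of the scaling argument above. The line parameters and
# scale factors are illustrative assumptions, not data for any particular process.
def rc_product_s(r_per_mm, c_per_mm, length_mm):
    """RC time constant of a line: (R* x L) x (C* x L), in seconds."""
    return (r_per_mm * length_mm) * (c_per_mm * length_mm)

# "Previous-generation" technology (illustrative values).
r0, c0 = 25.0, 0.2e-12          # ohm/mm and F/mm
intra0, chip0 = 2.0, 10.0       # intra-block and cross-chip line lengths, mm

# Scaled technology: features shrink by 0.7; R* per unit length is assumed to rise
# by 1/0.7 (only the line width shrinks); intra-block lines shrink with the block,
# while the cross-chip line grows with the die edge (assumed 1.25x here).
r1, c1 = r0 / 0.7, c0
intra1, chip1 = intra0 * 0.7, chip0 * 1.25

print("intra-block RC:", rc_product_s(r0, c0, intra0), "->", rc_product_s(r1, c1, intra1))
print("cross-chip  RC:", rc_product_s(r0, c0, chip0), "->", rc_product_s(r1, c1, chip1))

With these assumed values the intra-block RC product decreases slightly while the cross-chip RC product roughly doubles, reflecting the behavior shown in Fig. 25.26.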
As the interconnection delays have become increasingly dominant, the design process has evolved into an
iterative process through the design steps, as illustrated by the back-and-forward annotation arrows in
Fig. 25.24(a). Initial estimates of delays in step B need to be refined through back-annotation of interconnection
delay parameters obtained after floorplanning and/or after placement and routing to reflect the actual inter-
connection characteristics, perhaps requiring changes in the initial specification of the desired function in terms
of logical and physical blocks. This iterative process moving between the logical design and the physical design
of an ASIC has been problematic since often the logical design is performed by the company developing the
ASIC, whereas the physical design is performed by the company (the “foundry”) fabricating the ASIC. CAD
tools are an important vehicle for coordination of the interface between the designer and the foundry.
General Transistor-Level Design of CMOS Circuits
The previous section has emphasized the CAD tools and general design steps involved in designing an ASIC.
A top-down approach was emphasized, with the designer addressing successively more-detailed portions of the
overall design through a hierarchical organization of the overall description of the function. However, the design
process also presumes a considerable understanding of the bottom-up principles through which the overall IC
function will eventually appear as a fully detailed specification of the transistor and interconnection structures
throughout the overall IC [Dillinger, 1988; Weste and Eshraghian, 1993; Wolf, 1994; Rabaey, 1996; Kang and
Leblebici, 1996].
Figure 25.27 illustrates the transistor-level description of a simple three-input NAND gate. VLSI ASIC logic
circuits are dominated by this general structure, with the PMOS transistors (making up the pull-up section)
connected to the supply voltage Vdd and the NMOS transistors (making up the pull-down section) connected
to the ground return GND. When the logic function generates a logic “1” output, the pull-up section is shorted
through its PMOS transistors to Vdd, leading to a high output voltage, while the pull-down section is open,
with no connection of GND to the output. When generating a logic “0” output, the pull-down section is shorted
to GND while the pull-up section is open (no path to Vdd). Since the output is either a 1 or a 0, only one of
the sections (pull-up or pull-down) is shorted, with no dc current flowing directly from Vdd to GND through
the logic circuit pull-up and pull-down sections.
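The complementary behavior of the two networks can be checked exhaustively for a small gate. The following Python sketch models the three-input NAND of Fig. 25.27 at the switch level (each transistor treated as an ideal switch) and confirms that, for every input combination, exactly one of the pull-up and pull-down networks conducts; it is an illustrative model only, not a circuit simulator.

# Switch-level sketch of the complementary networks described above, for a
# three-input NAND. Each transistor is treated as an ideal switch.
from itertools import product

def pull_up_conducts(a, b, c):
    """PMOS pull-up network: three PMOS devices in parallel, each on when its input is 0."""
    return (not a) or (not b) or (not c)

def pull_down_conducts(a, b, c):
    """NMOS pull-down network: three NMOS devices in series, each on when its input is 1."""
    return bool(a and b and c)

for a, b, c in product((0, 1), repeat=3):
    up, down = pull_up_conducts(a, b, c), pull_down_conducts(a, b, c)
    assert up != down                      # exactly one network conducts: output driven, no dc path Vdd->GND
    print(a, b, c, "->", 1 if up else 0)   # NAND output: 0 only when all inputs are 1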
The PMOS transistors used in the pull-up section are fabricated with P-type source and drain regions on
N-type substrates. The NMOS transistors used in the pull-down section, on the other hand, are fabricated with
N-type source and drain regions on P-type substrates. Since a given silicon wafer is either N-type or P-type, a
deep, opposite doping-type region must be placed in the silicon wafer for those transistors needing a substrate
of the opposite type. The shaded regions in Fig. 25.27(a) represent the “substrate types” within which the
transistors are fabricated. A deep doping of the desired substrate type is provided, with transistors fabricated
within such deep, substrate doping “wells.” To allow tight packing of transistors, large substrate doping wells
are used, with a large number of transistors placed in each well.
FIGURE 25.26 Interconnect lengths under scaling of feature size. (a) Initial VLSI ASIC function F with line A extending
across the IC. (b) Interconnection cross section with resistance R* per unit length. (c) Interconnection cross section, in scaled
technology, with increased resistance per unit length. (d) VLSI ASIC function G in scaled technology containing function F
but reduced in size (including interconnection A) and containing a long line B extending across the IC.
Each logic cell must be connected to power (Vdd) and ground (Gnd), requiring that external power and
ground connections to the IC be routed (on continuous metal lines to avoid resistive voltage drops) to each
logic cell on the IC. Early IC technologies provided only a single level of metallization on which to route power
and ground and an interdigitated layout, illustrated in Fig. 25.28(a), was adopted. Given this power and ground
layout approach, channels of pull-up sections and channels of pull-down sections were placed between the
power and ground interconnections, as illustrated in Fig. 25.28(b).
Bottom-up IC design is also hierarchical, with the designer completing detailed layout of a specific logic
function (e.g., a binary adder), placing that detailed layout (a cell) in a library of physical designs and then
reusing that library cell when other instantiations of the cell are required elsewhere in the IC. Figure 25.29
illustrates this cell-based approach, with cells of common height but varying width placed in rows between the
power and ground lines. The straight power and ground lines shown in Fig. 25.29 allow tight packing of adjacent
rows.
To achieve tight packing of cells within a row, adjacent cells (Fig. 25.29) are abutted against each other. Since
metal interconnections are used within cells (and most basic cells are reused throughout the design), it is
generally not possible to route intercell metal interconnections over cells. This constraint can be relaxed, as
shown in Fig. 25.30(b), when additional metal layers are available, restricting a subset of the layers for intracell
use and allowing over-the-cell routing [Sherwani et al., 1995] with the other layers.
When metal intercell interconnections cannot be safely routed over cells, interconnection channels must be
provided between rows of logic cells, leading to the wiring channels above and/or below rows of logic cells as
shown in Fig. 25.29. In this approach, all connections to and from a logic cell are fed from the top and/or
bottom wiring channel. The width of the wiring channel is adjusted to provide space for the number of intercell
interconnections required in the channel. Special cells providing through-cell routing can be used to support
short interconnections between adjacent rows of cells. Given this layout style at the lowest level of cells, larger
functions can be readily constructed, as illustrated in Fig. 25.30(a). Figure 25.29(b) illustrates interconnections
(provided in the polysilicon layer under the metal layers) to the logic cells from the wiring channel. For classical
CMOS logic cells, the same set of input signals is applied to both the pull-up and pull-down sections, as in
the example in Fig. 25.27(b). By organizing the sequence of transistors along the pull-up and pull-down sections
properly, inputs can extend vertically between a PMOS transistor in the pull-up section and a corresponding
NMOS transistor in the pull-down section. Algorithms to determine the appropriate ordering of transistors
evolved early in CAD tools.
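The required width of a wiring channel is closely tied to its density, i.e., the largest number of intercell nets crossing any vertical cut of the channel, since each such net needs its own horizontal track at that cut. The short Python sketch below computes this lower bound for a made-up set of net spans; the net list and track pitch are assumptions used only for illustration.

# Sketch of the relation between a wiring channel's width and its density: the
# maximum number of intercell nets crossing any column is a lower bound on the
# number of horizontal tracks required. Net spans and pitch are assumptions.
def channel_density(nets):
    """nets: list of (left, right) column spans of the nets routed in the channel."""
    left_edge = min(left for left, right in nets)
    right_edge = max(right for left, right in nets)
    return max(sum(1 for left, right in nets if left <= column <= right)
               for column in range(left_edge, right_edge + 1))

nets = [(0, 4), (2, 7), (3, 5), (6, 9)]     # hypothetical pin-to-pin spans of four nets
tracks = channel_density(nets)              # lower bound on horizontal routing tracks
track_pitch_um = 1.4                        # assumed metal track pitch
print(f"density = {tracks} tracks -> channel height >= {tracks * track_pitch_um:.1f} um")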
FIGURE 25.27 Transistor representation of three-input NAND gate. (a) Transistor representation without regard to layout.
(b) Transistor representation using parallel rows of PMOS and NMOS transistors, with interconnections connected from a
wiring channel.
ASIC Technologies
Drawing on the discussion above, the primary ASIC technologies (gate arrays, sea-of-gate arrays, standard cell
ASICs, ASICs with “megacells,” and field-programmable gate arrays) can be easily summarized. For comparison,
full custom VLSI is briefly described first.
Full Custom Design
In full custom design, custom logic cells are designed, starting at the lowest level (i.e., transistor-based cell design)
and extending to higher levels (e.g., combinations of cells for higher-level functions) to create the overall IC
function. Figure 25.31(a) illustrates the general layout at the cell level. The designer can exploit new cell designs
FIGURE 25.28 Power and ground distribution (interdigitated lines) with rows of logic cells and rows of wiring channels.
(a) Overall power distribution and organization of logic cells and wiring channels. (b) Local region of power distribution
network.
FIGURE 25.29 Cell-based logic design, with cells organized between power and ground lines and with intercell wiring in
channels above (and/or below, also) the cell row.
FIGURE 25.30 (a) Construction of larger blocks from cells with wiring channels between rows of cells within the block.
(b) Over-the-cell routing on upper metal levels which are not used within the cells.
which improve the performance of the specific function being designed, can provide interconnections through
wiring areas between logic cells to create compact functions, can use the full variety of CMOS circuit designs
(e.g., dynamic logic, pass-transistor logic, etc.), can use cells previously developed privately for earlier ICs, and
can use cells from a standard library provided by the foundry.
Standard Cell ASIC Technology
Consider a full custom circuit design completed using only predefined logic cells from a library provided by
the foundry and physical design processes provided by standard EDA/CAD tools. This approach, standard
cell design [Heinbuch, 1987; SCMOS Standard Cell Library, 1989], is one of the primary ASIC technologies. A
critical issue impacting standard cell ASIC design is the quality of the standard cell library provided by the
foundry: a rich set of library cells makes the coordination between the logical design and the physical
design substantially more effective. Figure 25.31(a) also illustrates the general standard cell approach, using
standard library cells of a common height (set by the cells used), logic rows of design-specified width,
and wiring channels whose width varies to accommodate the number of interconnection wires determined
during place and route.
Gate Array ASIC Technology
The gate array technology [Hollis, 1987] is based on partially prefabricated (up to but not including the final
metallization layer) wafers with simple gate cells. Such noncustomized wafers are stockpiled and the ASIC
designer specifies the final metallization layer added to customize the gate array. Gate array cells draw on the
general cell design shown earlier, Fig. 25.27(b). Figure 25.32 illustrates representative noncustomized transistor-level
cells (although different foundries use different physical layouts), with the dashed lines representing
metal layers (including power and ground) added during the final metallization process.
FIGURE 25.31 (a) Full custom layout (with custom cells) or standard cell ASIC layout (using library cells). (b) Gate array
layout, with fixed width wiring channels (in prefabricated, up to metallization, wafers).
FIGURE 25.32 Basic noncustomized gate array cells (two-input, three- and four-input examples). The dashed lines represent
the power and ground lines, as well as interconnections in the wiring channel, which are placed on the IC during customization.
The ASIC designer’s task is to translate the desired VLSI logic function into a physical design using the basic
gate cells provided on the noncustomized IC. To avoid routing intercell interconnections around long
rows of logic cells, some of the “cells” along the row are feedthrough cells, allowing routing of interconnections
to other logic cell rows. Included in the designer’s toolset are library functions predefining the construction of
more-complex logic functions (e.g., adders, registers, multipliers, etc.) from the gate cells on the noncustomized
gate array IC.
The gate array technology shares the costs of masks among all ASIC customers and exploits high-volume
production of the noncustomized wafers. On the other hand, construction of higher-level functions must adhere
to the predefined positions and type of gate cells, leading to less-efficient and lower-performance designs than
in the standard cell approach. In addition, the width of the wiring channel is fixed, limiting the number of
parallel interconnections in the channel and imposing wasted area if all available wires are not used.
Sea-of-Gates ASIC Technology
The sea-of-gates technology also uses premanufactured, noncustomized gate arrays. However, additional met-
allization layers are provided, with lower-level metallization layer(s) used to program the internal function of
the cells and the upper-level layer(s) used for over-the-cell routing [Sherwani et al., 1995] of signals among cells,
such as illustrated earlier in Fig. 25.30(b). This eliminates the need for wiring channels and feedthrough cells,
leading to denser arrays of transistors.
CMOS Circuits Using Megacell Elements
The examples above have focused on low-level logic functions. However, as the complexity of VLSI ICs has
increased, it has become increasingly important to include standard, high-level functions (e.g., microprocessors,
DSPs, PCI interfaces, MPEG coders, RAM arrays, etc.) within an ASIC. For example, an earlier generation of
microprocessor may offer the necessary performance and would occupy only a small portion of the area of a
present-generation ASIC. Including such a standard microprocessor has the advantages of allowing the ASIC
design to be completed more quickly as well as providing users with the microprocessor standard instruction
set and software development tools. Such large cells are called megacells. As VLSI technologies advance and as
standards increasingly impact design decisions, the use of standard megacells will become increasingly common.
Field-Programmable Gate Arrays: Evolving to an ASIC Technology
The field-programmable gate array (FPGA), like the gate array, places fixed cells on the wafer, and the FPGA
designer constructs more-complex functions from these cells. However, the cells provided on the FPGA can be
substantially more complex than the simple gates provided on the gate array. In addition, the term field
programmable highlights the customizing of the ASIC by the user, rather than by the foundry manufacturing
the FPGA. The mask-programmable gate array (MPGA) is similar to the FPGA (using more-complex cells than
the gate array), but the programming is performed through the addition of a metal layer by the manufacturer.
Figure 25.33 shows an example of cells and programmable interconnections for a representative FPGA
technology [Actel, 1995]. The array of cells is constructed from two types of cell, which alternate along the
logic cell rows of the FPGA. The combinational logic cell — “C-module” in Fig. 25.33(a) — provides a ROM-
based lookup table (LUT) able to efficiently implement a complex logic function (with four data inputs and
two control signals). The sequential cell (S-module) adds a flip-flop to the combinational module, allowing
efficient realization of sequential circuits. The interconnection approach illustrated in Fig. 25.33(c) is based on
(1) short vertical interconnections directly connecting adjacent modules, (2) long vertical interconnections
extending through the overall array, (3) long horizontal interconnections extending across the overall array,
(4) points at which the long vertical interconnections can be connected to cells, and (5) points at which the
long vertical and horizontal lines can be used for general routing. The long horizontal and vertical lines are
broken into segments, with programmable links between successive segments. The programmer can then
connect a set of adjacent line segments to create the desired interconnection line. In addition, programmable
connection points allow the programmer to make transitions between the long vertical lines and the long
horizontal lines. By connecting various inputs of an FPGA cell to either Vdd or GND, the cell can be
“programmed” to perform one of its possible functions. The basic array is complemented by additional driver
and other circuitry around the perimeter of the FPGA for interfacing to the “external world.”
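The lookup-table idea behind the combinational module can be sketched in a few lines of Python: the inputs form an address into a small ROM whose contents define the logic function, so an n-input table can realize any n-input Boolean function. The function programmed below (a 2-to-1 multiplexer with an enable) and the code itself are illustrative assumptions, not a model of any particular vendor's cell.

# Sketch of a ROM-based lookup table (LUT): the inputs form an address into a
# table that stores one output bit per input combination.
def make_lut(truth_function, n_inputs):
    """Precompute the table contents for an n-input Boolean function."""
    return [truth_function(*(((index >> bit) & 1) for bit in range(n_inputs)))
            for index in range(2 ** n_inputs)]

def lut_eval(table, *inputs):
    """Evaluate the LUT: the input bits simply form the table address."""
    address = sum(bit << position for position, bit in enumerate(inputs))
    return table[address]

mux_table = make_lut(lambda sel, a, b, en: en & (b if sel else a), 4)
print(lut_eval(mux_table, 0, 1, 0, 1))   # sel=0, a=1, b=0, en=1 -> 1 (selects a)
print(lut_eval(mux_table, 1, 1, 0, 1))   # sel=1               -> 0 (selects b)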
Different FPGA manufacturers have developed different basic cells, seeking to provide the most useful
functionality in the cells for generation of overall FPGA functions. Cells range from fine-grained cells consisting
of basic gates through medium-grained cells providing more complex programmable functions to large-grained
cells. Different FPGA manufacturers also provide different approaches to the programming step. FPGA pro-
gramming technologies include one-time programming or multiple-time programming capabilities with the
programming either nonvolatile (i.e., programming is retained when power is turned off) or volatile (i.e.,
programming is lost when power is turned off). Physical programming includes antifuse (fuse) approaches in
which the programming nodes are normally off (on) and are “blown” into a permanently on (off) state,
providing one-time, nonvolatile programming. Electrical switches also can be used for programming, with an
electrical control signal setting the state of the switch. The state control signal can be provided by an EPROM
(one-time, nonvolatile), an EEPROM (multiple-time, nonvolatile), or an SRAM (multiple-time, volatile), with
different approaches having different advantages and disadvantages.
Interconnection Performance Modeling
Accurate estimation of signal timing is increasingly important in contemporary VLSI ASICs and will become
even more important as feature sizes decrease further. Higher clock rates impose tighter timing margins,
requiring more accurate modeling of the signal delays. In addition, more-sophisticated models are required for
smaller feature size VLSI.
Perhaps of greatest impact is the rapid increase in the importance of interconnect delay relative to gate delay.
In the earlier 1-μm VLSI technologies, typical gate delays were about six times the average interconnection
delays. For the 0.5-μm technologies, gate delays had decreased while interconnect delays had increased, becoming
approximately equal. At 0.3 μm, the decreasing gate delay and increasing interconnect delays have led to average
interconnect delays about six times greater than typical gate delays. Accurate estimation of signal delays early
in the design is therefore increasingly difficult, since the designer does not have a detailed knowledge of
interconnection lengths and nearby lines, which can cause coupling noise, until much of the design has been
completed. As the design proceeds and the interconnection lengths become better specified, parameters related
to the signal performance can be fed back (back-annotated) to the earlier design steps, allowing the design to
be adapted to reflect necessary changes to achieve desired performance.
FIGURE 25.33 Example of FPGA elements (Actel FPGA family [Actel, 1995]). Combinational cells (a) and sequential
cells (b). (c) Programmable wiring organization.
In earlier VLSI technologies, a linear delay model was adequate, representing the overall delay t from the
input of one cell (cell A in Fig. 25.34) to the input of the connected cell (cell B in Fig. 25.34) by an analytic
form such as t = t(0) + k1 · C(out) + k2 · t(s), where t(0) is the intrinsic (internal) delay of the cell with no
loading, C(out) is the output capacitance seen by the output driver of the cell, t(s) is the rise/fall time of the
signal, and the parameters k1 and k2 are constants. In the case of deep submicron CMOS technologies, the
overall delay must be divided into the intrinsic delay of the gate and the extrinsic delay of the interconnect,
each having substantially more-complex models than the linear model.
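The linear model is easily evaluated; the short Python sketch below shows how the three terms combine for one cell. The coefficient values are made-up assumptions standing in for the characterization data a real cell library would supply.

# Worked example of the linear cell-delay model t = t(0) + k1*C(out) + k2*t(s).
# Coefficient values are assumptions standing in for library characterization data.
def linear_cell_delay_ns(t0_ns, k1_ns_per_pf, k2, c_out_pf, t_s_ns):
    return t0_ns + k1_ns_per_pf * c_out_pf + k2 * t_s_ns

delay_ns = linear_cell_delay_ns(t0_ns=0.10,         # intrinsic (unloaded) delay of the cell
                                k1_ns_per_pf=0.80,  # sensitivity to output load
                                k2=0.15,            # sensitivity to input rise/fall time
                                c_out_pf=0.25,      # capacitance seen by the cell's output driver
                                t_s_ns=0.40)        # input signal rise/fall time
print(f"estimated cell delay = {delay_ns:.2f} ns")  # 0.10 + 0.20 + 0.06 = 0.36 ns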
Factors impacting the intrinsic delay of gates include the following, with the input and output signals referring
to logic cell A in Fig. 25.34.
1.A 0-to-1 change in an input to cell A may cause a different delay to the output of cell A than a 1-to-0
change in that input.
2.Starting from the time when the input starts to change, slower transition times lead to a longer delay
before the threshold voltage is reached, leading to a longer delay to the output.
3.Once the input passes the threshold voltage, a slower changing input may lead to a longer delay to the
output transition.
4.The delay from a change in a given input to a change in the output may depend on the state of other
inputs to the circuit.
These are merely representative examples of the more-complex behavior seen in the gates as the feature sizes
decrease. Together, those complexities lead to a nonlinear delay model, which is typically implemented as a
LUT, rather than with an analytic expression.
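A minimal sketch of such a table-based model is shown below in Python: delay is tabulated against output load and input transition time, and intermediate operating points are obtained by interpolation between table entries. The table contents and axis values are assumptions for illustration only.

# Sketch of a table-based (nonlinear) delay model with bilinear interpolation.
# The table values below are made-up assumptions, not library data.
loads_pf = [0.05, 0.20, 0.80]             # table axis: output capacitance
slews_ns = [0.10, 0.40, 1.60]             # table axis: input rise/fall time
delay_table_ns = [                        # delay_table_ns[i][j] for loads_pf[i], slews_ns[j]
    [0.08, 0.11, 0.20],
    [0.15, 0.19, 0.31],
    [0.42, 0.47, 0.66],
]

def interpolate_1d(x, xs, ys):
    """Piecewise-linear interpolation along one axis (x assumed within the table range)."""
    for i in range(len(xs) - 1):
        if x <= xs[i + 1]:
            frac = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + frac * (ys[i + 1] - ys[i])
    return ys[-1]

def lut_delay_ns(load_pf, slew_ns):
    # interpolate along the slew axis for each tabulated load, then along the load axis
    per_load = [interpolate_1d(slew_ns, slews_ns, row) for row in delay_table_ns]
    return interpolate_1d(load_pf, loads_pf, per_load)

print(f"{lut_delay_ns(0.25, 0.30):.3f} ns")   # about 0.200 ns for this assumed table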
The models used for interconnections [Tewksbury, 1994] have also changed, reflecting the changing inter-
connection parameters and increasing clock rates. The three primary models are as follows.
Lumped RC Model: If the rise/fall times of the signal are substantially greater than the round-trip propagation
delay of the signal, then the voltage and current are approximately constant across the length of the
interconnection. The interconnection is modeled as a single lumped resistance and a single lumped
capacitance. The signal does not incur a propagation delay, and at all points along the line the signal
has the same rise/fall times.
Distributed RC Model: If the line is sufficiently long, the signal sees a decreasing resistance and capacitance
toward the destination as the signal traverses the line. To represent this changing RC, the distributed RC
model divides the overall line into shorter segments, each of which can be represented by the lumped
RC model above. The transfer function of the overall line is then the product of the transfer functions
of the sections. The propagation delay is negligible, though the rise/fall times increase as the signal
propagates toward the far-end gates (significant if the line is tapped along its length to drive multiple
gates).
Distributed RLC Model: As the rise/fall times become shorter, the relative contributions of capacitance and
inductance change. In particular, the impedance of the capacitance is inversely proportional to frequency
while that of the inductance is proportional to frequency. At sufficiently high data rates, the inductance
effects become significant. In this case, the signal is delayed as it propagates toward the far-end gates,
with the rise/fall times increasing along the line. Different terminal points of a net will see the signal at
different times with different rise/fall times.
FIGURE 25.34 Logic cell delay (intrinsic delay) and intercell interconnection delay (extrinsic delay).
Given the wide range of lengths of signal interconnections, all three models above are relevant: the lumped RC
model is suitable for short interconnections, the distributed RC model for low-to-moderate speed signals on longer-
length interconnections, and the distributed RLC model for high-speed signals on longer interconnections.
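A designer (or a CAD tool) can select among the three models by comparing the signal rise/fall time with the line's round-trip propagation delay, as suggested in the model descriptions above. The following Python sketch applies such a rule of thumb; the propagation velocity and the threshold factors are assumptions, not hard design rules.

# Rule-of-thumb sketch for choosing an interconnect model by comparing the
# signal rise/fall time with the line's round-trip time of flight.
def choose_model(rise_time_ps, length_mm, velocity_mm_per_ps=0.15):
    round_trip_ps = 2.0 * length_mm / velocity_mm_per_ps
    if rise_time_ps > 5.0 * round_trip_ps:
        return "lumped RC"           # line is electrically short
    elif rise_time_ps > round_trip_ps:
        return "distributed RC"      # resistance and capacitance vary along the line
    else:
        return "distributed RLC"     # inductance and true propagation delay matter

for length_mm in (0.5, 5.0, 20.0):
    print(f"{length_mm:5.1f} mm line, 100 ps edge -> {choose_model(100.0, length_mm)}")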
Accurate modeling of signals propagating on an interconnection requires detailed knowledge of the capac-
itance C* and inductance L* per unit length along the length of the line. As additional metal layers have been
provided, capacitance to neighboring lines (on different or the same metal layer) has become increasingly
important, even exceeding the capacitance to ground in some cases. Extraction of accurate interconnect delay
parameters may require the use of three-dimensional field solvers, with two-dimensional analysis used for less-
accurate modeling of signal behavior.
In addition to the effects noted above, crosstalk (increasingly problematic for buses whose parallel lines run
long distances) and reflections (of increasing importance as the signal frequencies increase) degrade signals.
This broad range of effects impacting signal delay, distortion, and noise has made signal integrity an increasingly
important issue in VLSI design. Signal integrity effects also appear on the “dc” power and ground lines, due
to large transient currents caused by switching gates and switching drivers of output lines.
Clock Distribution
Signals are increasingly distorted not only by long line lengths, but also by the higher clock frequency in present-
day VLSI circuits, with the combination of long lines and high clock rates of particular concern. Present-day
VLSI circuits include a vast number of flip-flops (often as registers) distributed across the area of the VLSI
circuit. Synchronous ASICs use a common clock, distributed to each of these flip-flops. With the clock signal
being the longest interconnection on the VLSI circuit and the highest-frequency signal, design of the clock
distribution network is critical for highest performance.
Complex synchronous ASICs are designed assuming that all flip-flops are clocked simultaneously. Clock skew
is the maximum difference between the times of clock transitions at any two flip-flops of the overall VLSI
circuit. The clock network must deliver clock signals to each of the flip-flops within the margins set by the
allowed clock skew, margins which are substantially less than the clock period. For example, part of the 2-ns
clock period of a high-speed VLSI circuit operating with a 500-MHz clock is consumed by the rise/fall times
of the signals appearing at the input to the flip-flop and by the specified setup and hold times of the flip-flop.
The result is that the clock must be applied to the flip-flop within a time interval that is small compared with the
clock period.
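The timing-budget arithmetic behind this statement is simple, as the following sketch for the 500-MHz example shows; the individual contributions assumed here (edge rates, setup and hold requirement, worst-case logic and interconnect delay) are illustrative values only.

# Timing-budget sketch for the 500-MHz example; all individual numbers are assumptions.
clock_period_ps = 2000.0       # 500-MHz clock
rise_fall_ps = 300.0           # edge rates at the flip-flop clock and data inputs (assumed)
setup_plus_hold_ps = 250.0     # flip-flop setup + hold requirement (assumed)
logic_and_wire_ps = 1200.0     # worst-case combinational + interconnect delay between flip-flops (assumed)

skew_budget_ps = clock_period_ps - (rise_fall_ps + setup_plus_hold_ps + logic_and_wire_ps)
print(f"allowed clock skew <= {skew_budget_ps:.0f} ps, "
      f"i.e., about {100 * skew_budget_ps / clock_period_ps:.0f}% of the clock period")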
The distance over which the clock signal can travel before incurring a delay greater than the clock skew
defines isochronous regions (illustrated in Fig. 25.35(a) as shaded regions) within the IC. If the external clock
can be provided to such regions with zero clock skew, then clock routing within the isochronous region is not
critical. Figure 25.35(a) illustrates the H-tree approach, whose clock paths have equal lengths to terminal points,
ideally delivering clock pulses to each of the terminal points (leaf nodes) of the tree simultaneously (zero skew).
In a real circuit, precisely zero clock skew is not achieved since different network segments encounter different
environments of data lines coupled electrically to the clock line segment.
In Fig. 25.35(a), a single buffer drives the entire H-tree network, requiring a large area buffer and wide clock
lines toward the connection of the clock line to the external clock signal. Such a large buffer can account for
up to 30% or more of the total VLSI circuit power dissipation. Figure 25.35(b) illustrates a distributed buffer
approach, with a given buffer only having to drive those clock line segments to the next level of buffers. In this
case, the buffers can be smaller and the clock lines can be narrower. The 300-MHz DEC Alpha microprocessor,
for example, uses an H-tree clock distribution network with multiple stages of buffering extending to the final
legs of the H-tree network. Another approach to relax the clock distribution problem uses multiple input/output
(I/O) pins for the clock. In this case, a number of smaller H-trees can be driven separately, one starting at each
clock I/O pin.
The constraint on clock timing is a bound on clock skew, not a requirement for zero clock skew. In
Fig. 25.35(c), the clock network uses multiple buffers but allows different path lengths consistent with clock
skew margins. For tight margins, an H-tree can be used to deliver clock pulses to local regions in which
distribution proceeds using a different buffered network approach such as that in Fig. 25.35(c).
Other approaches for clock distribution are gaining in importance. For example, a lower frequency clock
can be distributed across the VLSI circuit, with phase-locked loops (PLLs) used to multiply the clock rate at
various sites on the circuit. In addition to multiplying the low clock rate, the PLL can also adjust the phase of
the high rate clock, correcting clock skew which may have been introduced in the routing of the low rate clock
to that PLL. Another approach is to generate a local clock at a local register when the register input data changes.
This self-timed circuit approach leads to asynchronous circuits but can be quite effective for register-transfer
logic (generating a local clock for a large register).
Power Distribution
Present-day VLSI ASICs consume considerable power, unless specifically designed for battery-operated, low-
power portable electronics [Chandrakasan and Brodersen, 1995]. A 40-W IC operating at 3.3 V requires a
current of 12 A, with currents increasing in future generations of high-power VLSI (due not only to the higher
power dissipation but also to lower Vdd).
Voltage and ground lines must be properly sized to prevent the peak current density from exceeding the level at
which the power lines will be physically “blown out,” leading to rapid and catastrophic failure of the circuit.
An equally serious problem is gradual deterioration of a voltage or ground line, eventually leading to failure
as a result of electromigration. Electromigration failure affects both signal and power lines, but is particularly
important in power lines because of the constant direction of the current. As current flows through an aluminum
interconnection, the average force exerted on the metal atoms by the electrons leads to a slow migration of
those atoms in the direction of electron flow, causing the line to migrate in the direction of the electrons
(opposite to the current flow). In regions of the metal line where discontinuities occur (e.g., at the naturally
occurring grain boundaries), a void can develop, creating an open in the line. Fortunately, there is a current
density threshold level (about 1 mA/μm) below which electromigration is insignificant. Notably, copper, in
addition to having a lower resistivity than aluminum, has greater resistance to electromigration. Accurate
estimates of power dissipation due to logic switching within logic blocks of the ASIC are also necessary to assess
thermal heating within the IC.
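A simple consequence of such a threshold is a minimum width for a power line carrying a given current. The sketch below treats the limit quoted above as roughly 1 mA per micrometer of line width (a per-width reading of the threshold, which is an assumption) and uses an assumed branch current; both numbers are for illustration only.

# Sketch of power-line sizing against an electromigration limit. Both the
# per-width limit and the branch current are illustrative assumptions.
j_max_ma_per_um = 1.0          # assumed electromigration limit per micrometer of line width
branch_current_ma = 1500.0     # assumed dc current carried by one trunk of the power grid

min_width_um = branch_current_ma / j_max_ma_per_um
print(f"minimum trunk width ~ {min_width_um:.0f} um")   # motivates wide trunks or a grid of parallel lines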
Another issue in power distribution concerns ground bounce (or simultaneous switching noise), which is
increasingly problematic as the number of ASIC I/O data pins increases. Consider M output lines switching
simultaneously to the 1 state, each of those lines outputting a current transient I(out) within a time t(out). (If
M output lines switch simultaneously to the 0 state, then a corresponding input current transient is produced.)
The net output current M · I(out) is fed through the Vdd pin (returned to ground in the case of outputs
switching to 0). With an inductance L associated with the Vdd pin, a transient voltage ΔV ≈ L · M · I(out)/t(out)
is imposed on Vdd. A similar effect occurs on the ground connection for outputs switching to 0. With a total
output current of 200 mA/ns and a ground pin inductance of 5 nH, the voltage transient is about 1 V. The
voltage transient propagates through the IC, potentially causing logic blocks to fail to produce the correct
outputs. The transient voltage can be reduced by reducing the power line inductance L, for example by replacing
the single Vdd and Gnd pins by multiple Vdd and Gnd pins, with K voltage pins reducing the inductance by a
factor of K.
FIGURE 25.35 H-tree clock distribution. (a) Example of single driver and isochronous (shaded) regions. (b) Example of
distributed drivers. (c) Example of clock distribution with unequal line lengths but within skew tolerances.
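The estimate ΔV ≈ L · M · I(out)/t(out) and the benefit of multiple supply and ground pins can be reproduced with a few lines of Python; the per-driver current and switching time below are assumptions chosen so that the total transient matches the 200 mA/ns, 5-nH example given above.

# Sketch of the simultaneous-switching-noise estimate dV ~ L * M * I(out) / t(out).
# Driver current and switching time are assumptions matching the text's example.
def ground_bounce_v(pin_inductance_nh, m_outputs, i_out_ma, t_out_ns, n_supply_pins=1):
    l_eff_h = (pin_inductance_nh / n_supply_pins) * 1e-9   # K pins in parallel reduce the inductance
    di_dt = (m_outputs * i_out_ma * 1e-3) / (t_out_ns * 1e-9)
    return l_eff_h * di_dt

print(ground_bounce_v(5.0, m_outputs=8, i_out_ma=25.0, t_out_ns=1.0))                    # ~1.0 V
print(ground_bounce_v(5.0, m_outputs=8, i_out_ma=25.0, t_out_ns=1.0, n_supply_pins=4))   # ~0.25 V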
With power distribution and power line noise problems growing in importance, EDA/CAD tools are rapidly
evolving to provide the designer with early estimates and final accurate assessments of various measures of
current and of power dissipation.
Analog and Mixed-Signal ASICs
One of the exciting ASIC areas undergoing rapid development is the addition of analog integrated circuits
[Geiger et al., 1990; Ismail and Fiez, 1994; Laker and Sansen, 1994] to the standard digital VLSI ASIC,
corresponding to mixed-signal VLSI. Mixed-signal ICs allow the IC to interact directly with the real physical
(and analog) world. Library cells representing various analog circuit functions supplement the usual digital
circuit cells of the library, allowing the ASIC designer to add needed analog circuits within the same general
framework as the addition of digital circuit cells. Automotive electronics is a representative example, with many
sensors providing analog information which is converted into a digital format and analyzed using microcom-
puters or other digital circuits. Mixed-signal library cells include A/D and D/A converters, comparators, analog
switches, sample-and-hold circuits, etc., while analog library cells include op amps, precision voltage sources,
and phase-locked loops.
As such mixed-signal VLSI ASICs evolve, EDA/CAD tools will also evolve to address the performance and
design issues related to analog circuits and their behavior in a digital circuit environment. In addition, analog
high-level description languages (AHDLs) are being developed to support high-level specifications of both
analog and mixed-signal circuits.
Summary
For about three decades, microelectronics technologies have been evolving, starting with primitive digital logic
functions and evolving to the extraordinary capabilities available in present-day VLSI ASICs. This evolution
promises to continue for at least another decade, leading to VLSI ICs containing complex systems and vast
memory on a single IC. ASIC technologies (including the EDA/CAD tools which guide design to a final IC)
deliver this complex technology to the systems designers, including those not associated with a company having
a microfabrication facility. This delivery of a highly complex technology to the average electronic systems
designer is the result of a steady migration of specialized skill to very powerful EDA/CAD tools which control
the complexity of the design process and the result of a need to provide a wide variety of electronics designers
with access to technologies which earlier had been available only within large, vertically integrated companies.
Defining Terms
ASIC: Application-specific integrated circuit — an integrated circuit designed for a specific application.
CAD: Computer-aided design — software programs which assist the design of electronic, mechanical, and
other components and systems.
CMOS: Complementary metal-oxide semiconductor — a transistor circuit composed of PMOS and NMOS
transistors.
EDA: Electronics design automation — software programs which automate various steps in the design of
electronics components and systems.
Extrinsic delay: Also called point-to-point delay, the delay from the transition at the output of a logic cell to the
transition at the input of another logic cell.
HDL: High-level description language — a software “language” used to describe the function performed by a
circuit (or collection of circuits).
IC: Integrated circuit — a substrate, normally silicon, in or on which electronic devices and interconnections have
been fabricated.
Intrinsic delay: Also called pin-to-pin delay, the delay from the transition of an input to a logic cell to the
transition at the output of that logic cell.
Mixed-signal ICs: Integrated circuits including circuitry performing digital logic functions as well as circuitry
performing analog circuit functions.
NMOS transistor: A metal-oxide semiconductor transistor which is in the on state when the voltage input is
high and in the off state when the voltage input is low.
PMOS transistor: A metal-oxide semiconductor transistor which is in the off state when the voltage input is
high and in the on state when the voltage input is low.
Rise/(fall) time: The time required for a signal (normally voltage) to change from a low (high) value to a high
(low) value.
Vdd: The supply voltage used to drive logic within an IC.
VLSI: Very large scale integration — microelectronic integrated circuits containing large numbers (presently millions)
of transistors and interconnections to realize a complex electronic function.
Wiring channel: A region extending between the power and ground lines on an IC and dedicated for placement
of interconnections among logic cells.
Related Topic
79.1 IC Logic Family Operation and Characteristics
References
Actel, Actel FPGA Data Book and Design Guide, Sunnyvale, Calif.: Actel Corp., 1995.
J. R. Armstrong, Chip-Level Modeling with VHDL, Englewood Cliffs, N.J.: Prentice-Hall, 1989.
P. Banerjee, Parallel Algorithms for VLSI Computer-Aided Design, Englewood Cliffs, N.J.: Prentice-Hall, 1994.
R. Camposano, and W. Wolf (Eds.), High Level VLSI Synthesis, Norwell, Mass.: Kluwer Academic Publishers,
1991.
A. Chandrakasan and R. Brodersen, Low Power Digital CMOS Design, Norwell, Mass.: Kluwer Academic
Publishers, 1995.
G. De Micheli, Synthesis of Digital Circuits, New York: McGraw-Hill, 1994a.
G. De Micheli, Synthesis and Optimization of Digital Circuits, New York: McGraw-Hill, 1994b.
T. E. Dillinger, VLSI Engineering, Englewood Cliffs, N.J.: Prentice-Hall, 1988.
D. Gajski, N. Dutt, A. Wu, and S. Lin, High-Level Synthesis: Introduction to Chip and Systems Design, Norwell,
Mass.: Kluwer Academic Publishers, 1992.
R. L. Geiger, P. E. Allen, and N. R. Strader, VLSI Design Techniques for Analog and Digital Circuits, New York:
McGraw-Hill, 1990.
D. V. Heinbuch, CMOS Cell Library, New York: Addison-Wesley, 1987.
F. J. Hill, and G. R. Peterson, Computer Aided Logical Design with Emphasis on VLSI, New York: John Wiley &
Sons, 1993.
D. Hill, D. Shugard, J. Fishburn, and K. Keutzer, Algorithms and Techniques for VLSI Layout Synthesis, Norwell,
Mass.: Kluwer Academic Publishers, 1989.
E. E. Hollis, Design of VLSI Gate Array ICs, Englewood Cliffs, N.J.: Prentice-Hall, 1987.
M. Ismail, and T. Fiez, Analog VLSI: Signal and Information Processing, New York: McGraw-Hill, 1994.
N. Jha, and S. Kundu, Testing and Reliable Design of CMOS Circuits, Norwell, Mass.: Kluwer Academic Publishers,
1990.
S.-M. Kang, and Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design, New York: McGraw-Hill,
1996.
K. R. Laker, and W. M. C. Sansen, Design of Analog Integrated Circuits and Systems, New York: McGraw-Hill, 1994.
K. Lee, M. Shur, T. A. Fjeldly, and Y. Ytterdal, Semiconductor Device Modeling for VLSI, Englewood Cliffs, N.J.:
Prentice-Hall, 1993.
J. Lipman, “EDA tools put it together,” Electronics Design News (EDN), Oct 26, 1995, pp. 81–92.
R. Lipsett, C. Schaefer, and C. Ussery, VHDL: Hardware Description and Design, Norwell, Mass.: Kluwer
Academic Publishers, 1990.
S. Mazor, and P. Langstraat, A Guide to VHDL, Norwell, Mass.: Kluwer Academic Publishers, 1992.
The National Technology Roadmap for Semiconductors, San Jose, Calif.: Semiconductor Industry Association,
1994.
K. P. Parker, The Boundary-Scan Handbook, Norwell, Mass.: Kluwer Academic Publishers, 1992.
B. Preas, and M. Lorenzetti, Physical Design Automation of VLSI Systems, Menlo Park, Calif.: Benjamin-Cum-
mings, 1988.
J. M. Rabaey, Digital Integrated Circuits: A Design Perspective, Englewood Cliffs, N.J.: Prentice-Hall, 1996.
S. Rubin, Computer Aids for VLSI Design, New York: Addison-Wesley, 1987.
SCMOS Standard Cell Library, Center for Integrated Systems, Mississippi State University, 1989.
K. Shahookar, and P. Mazumder, “VLSI placement techniques,” ACM Computing Surveys, vol. 23(2),
pp. 143–220, 1991.
N. A. Sherwani, Algorithms for VLSI Design Automation, Norwell, Mass.: Kluwer Academic Publishers, 1993.
N. A. Sherwani, S. Bhingarde, and A. Panyam, Routing in the Third Dimension: From VLSI Chips to MCMs,
Piscataway, N.J.: IEEE Press, 1995.
S. Tewksbury (Ed.), Microelectronic Systems Interconnections: Performance and Modeling, Piscataway, N.J.: IEEE
Press, 1994.
D. E. Thomas, and P. Moorby, The Verilog Hardware Description Language, Norwell, Mass.: Kluwer Academic
Publishers, 1991.
N. H. E. Weste, and K. Eshraghian, Principles of CMOS VLSI Design, New York: Addison-Wesley, 1993.
J. White, and A. Sangiovanni-Vincentelli, Relaxation Methods for Simulation of VLSI Circuits, Norwell, Mass.:
Kluwer Academic Publishers, 1987.
W. Wolf, Modern VLSI Design: A Systems Approach, Englewood Cliffs, N.J.: Prentice-Hall, 1994.
Further Information
The Institute for Electrical and Electronics Engineers, Inc. (IEEE) publishes several professional journals which
describe the broad range of issues related to contemporary VLSI circuits, including the IEEE Journal of Solid-
State Circuits and the IEEE Transactions on Very Large Scale Integration Systems. Other applications-related
journals from the IEEE cover VLSI-related topics. Representative examples include the IEEE Transactions on
Signal Processing, the IEEE Transactions on Computers, the IEEE Transactions on Image Processing, and the IEEE
Transactions on Communications. Several conferences also highlight VLSI circuits, including the IEEE Solid-
State Circuits Conference, the IEEE Custom Integrated Circuits Conference, and several others.
Commercial software tools and experiences related to ASIC designs are changing rapidly, but are well covered
in several trade journals including Integrated System Design (the Verecom Group, Los Altos, Calif.), Computer
Design (PennWell Publishing Co., Nashua, N.H.), Electronics Design News (Cahners Publishing Co., Highlands
Ranch, Colo.), and Electronic Design (Penton Publishing Inc., Cleveland, Ohio).
There are also many superb books covering the many topics related to IC design and VLSI design. The
references for this chapter consist mainly of books, all of which are well-established treatments of various aspects
of VLSI circuits and VLSI design automation.