Pricer, W.D., Katz, R.H., Lee, P.A., Mansuripur, M. “Memory Devices”
The Electrical Engineering Handbook
Ed. Richard C. Dorf
Boca Raton: CRC Press LLC, 2000
80
Memory Devices
80.1 Integrated Circuits (RAM, ROM)
Dynamic RAMs (DRAMs) • Static RAMs (SRAMs) • Nonvolatile Programmable Memories • Read-Only Memories (ROMs)
80.2 Basic Disk System Architectures
Basic Magnetic Disk System Architecture • Characterization of I/O Workloads • Extensions to Conventional Disk Architectures
80.3 Magnetic Tape
A Brief Historical Review • Introduction • Magnetic Tape • Tape Format • Recording Modes
80.4 Magneto-Optical Disk Data Storage
Preliminaries and Basic Definitions • The Optical Path • Automatic Focusing • Automatic Tracking • Thermomagnetic Recording Process • Magneto-Optical Readout • Materials of Magneto-Optical Data Storage
80.1 Integrated Circuits (RAM, ROM)
W. David Pricer
The major forms of semiconductor memory in descending order of present economic importance are
1. Dynamic Random-Access Memories (DRAMs)
2. Static Random-Access Memories (SRAMs)
3. Nonvolatile Programmable Memories (PROMs, EEPROMs, EAROMs, EPROMs)
4. Read-Only Memories (ROMs)
DRAMs and SRAMs differ little in their applications. DRAMs are distinguished from SRAMs in that no
bistable electronic circuit internal to the storage cell maintains the information. Instead DRAM information is
stored “dynamically” as charge on a capacitor. All modern designs feature one field-effect transistor (FET) to
access the information for both reading and writing and a thin film capacitor for information storage. SRAMs
maintain their bistability, so long as power is applied, by a cross-coupled pair of inverters within each storage
cell. Almost always two additional transistors serve to access the internal nodes for reading and writing. Most
modern cell designs are CMOS, with two P-channel and four N-channel FETs.
Programmable memories operate much like read-only memories with the important attribute that they can
be programmed at least once, and some can be reprogrammed a million times or more. Storage is almost always
by means of a floating-gate FET. Information in such storage cells is not indefinitely nonvolatile. The discharge
time constant is on the order of ten years. ROMs are generally programmed by a custom information mask
within the fabrication sequence. As the name implies, information thence can only be read. The information
thus stored is truly nonvolatile, even when power is removed. This is the most dense form of semiconductor
storage (and the least flexible). Other forms of semiconductor memories, such as associative memories and
charge-coupled devices, are used rarely.
W. David Pricer
IBM
Randy H. Katz
University of California, Berkeley
Peter A. Lee
Department of Trade and Industry,
London
M. Mansuripur
University of Arizona, Tucson
© 2000 by CRC Press LLC
Dynamic RAMs (DRAMs)
The universally used storage cell circuit of one transistor and
one capacitor has remained unchanged for over 20 years. The
physical implementation, however, has undergone much
diversity and many refinements. The innovation in physical
implementation is driven primarily by the need to maintain
a nearly constant value of capacitance while the surface area
of the cell has decreased. A nearly fixed value of capacitance
is needed to meet two important design goals. The cell has no
internal amplification. Once the information is accessed, the
stored voltage is vastly attenuated by the much larger bit line
capacitance (see Fig. 80.1). The resulting signal must be kept
larger than the resolution limits of the sensing amplifier.
DRAMs in particular are also sensitive to a problem called soft
errors. These are typically initiated by atomic events such as
the incidence of a single alpha particle. An alpha particle can
cause a spurious signal of 50,000 electrons or more. All mod-
ern DRAM designs resolve this problem by constructing the
capacitor in space out of the plane of the transistors (see
Fig. 80.2 for examples). Placing the capacitor in space unusable for transistor fabrication has allowed great
strides in DRAM density, generally at the expense of fabrication complexity. DRAM chip capacity has increased
by about a factor of four every three years.
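To make the signal budget above concrete, the following back-of-the-envelope sketch uses assumed, illustrative values (roughly 30 fF of cell capacitance, a bit line ten times larger, and a 1.65-V stored level; none of these numbers come from the text) to estimate the charge-sharing attenuation and to compare the stored charge with the roughly 50,000 electrons an alpha particle can deposit:

```python
# Charge-sharing estimate for a one-transistor DRAM cell (illustrative values only).
C_cell = 30e-15      # cell storage capacitance, farads (assumed ~30 fF)
C_bitline = 300e-15  # bit-line capacitance, farads (assumed ~10x the cell)
V_stored = 1.65      # stored "one" level relative to the bit-line precharge, volts (assumed)

# When the access transistor turns on, the stored charge is shared with the bit line,
# attenuating the readable signal by roughly C_cell / (C_cell + C_bitline).
dV_signal = V_stored * C_cell / (C_cell + C_bitline)
print(f"bit-line signal swing  ~ {dV_signal * 1000:.0f} mV")

# Compare the stored charge with the ~50,000 electrons an alpha particle can deposit.
q = 1.602e-19                                   # electron charge, coulombs
stored_electrons = C_cell * V_stored / q
print(f"stored charge          ~ {stored_electrons:,.0f} electrons")
print(f"alpha-particle upset   ~ 50,000 electrons "
      f"({50_000 / stored_electrons:.0%} of the stored charge)")
```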
DRAMs are somewhat slower than SRAMs. This relationship derives directly from the smaller signal available from DRAMs and from certain constraints put on the support circuitry by the DRAM array. DRAMs also require periodic intervals to “refresh” lost charge from the capacitor. This charge is lost primarily across the semiconductor junctions and must be replenished every few milliseconds. The manufacturer usually supplies these “housekeeping” functions with on-chip circuitry.
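As a rough sketch of what these housekeeping functions cost, assume, purely for illustration, a part organized as 1024 rows that must each be refreshed within a 4-ms window, with about 100 ns needed per row refresh:

```python
# Rough DRAM refresh-overhead estimate (all numbers are illustrative assumptions).
rows_to_refresh = 1024       # rows in the array that need periodic refresh
refresh_interval = 4e-3      # every row must be refreshed within this window, seconds
time_per_row = 100e-9        # one row refresh cycle, seconds

busy_time = rows_to_refresh * time_per_row     # time spent refreshing per interval
overhead = busy_time / refresh_interval        # fraction of time unavailable to the user
print(f"refresh occupies {busy_time * 1e6:.0f} us of every "
      f"{refresh_interval * 1e3:.0f} ms window -> {overhead:.1%} overhead")
```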
FIGURE 80.1 Cell and bit line capacitance.
FIGURE 80.2 (a) Cross section of “trench capacitors” etched vertically into the semiconductor surface of a DRAM integrated circuit. (Courtesy of IBM.) (b) Cross section of “stacked” capacitors fabricated above the semiconductor surface of a DRAM integrated circuit. (Source: M. Taguchi et al., “A 40-ns 64-b parallel data bus architecture,” IEEE J. Solid-State Circuits, vol. 26, no. 11, p. 1495. © 1991 IEEE. With permission.)
THE REVOLUTION OF ELECTRONICS TECHNOLOGY

The last three decades have witnessed a revolution in electrical and, especially, electronics technology. This revolution was paced by changes in solid-state electronics that greatly expanded capabilities while at the same time radically reduced costs. The entire field of electrical engineering has grown far beyond the boundaries that characterized it just a generation ago. Electrical engineers have become the creators and masters of the most pervasive technology of our time, with profound effects on society and on their profession.
The effects of the electronics revolution are complex. For the profession, the most obvious impact has been explosive growth. The increase in the number of students studying in the field continues to be dramatic and shows no signs of slowing. The electrical engineering community represents the largest single technical group in the world, and the members of the IEEE make up the world’s largest engineering society. (Courtesy of the IEEE Center for the History of Electrical Engineering.)
This 64-kbit random-access memory chip, developed by IBM in 1978, was one of the densest of its time. It could store as many as 64,000 bits of information, roughly equivalent to 1,000 eight-letter words. (Photo courtesy of the IEEE Center for the History of Electrical Engineering.)
Signal detection and amplification remain a critical focus of good DRAM design. Figure 80.3 illustrates an
arrangement called a “folded bit line.” This design cancels many of the noise sources originating in the array
and decreases circuit sensitivity to manufacturing process variations. It also achieves a high ratio of storage
cells per sense amplifier. Note the presence of the dummy cells, which create a reference signal midway between
a “one” and a “zero” for the convenience of the sense amplifier. The stored reference voltage in this case is
created by shorting two driven bit lines after one of the storage cells has been written.
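A minimal sketch of the sensing idea described above: the dummy-cell reference sits midway between a written “one” and a written “zero,” and the sense amplifier compares the accessed bit line against it (the signal levels used here are assumptions for illustration):

```python
# Folded-bit-line sensing with a dummy-cell reference (illustrative signal levels).
V_ONE, V_ZERO = 0.15, -0.15   # bit-line swings (volts) for a stored "1" and "0", assumed

# The reference is created by shorting two driven bit lines after writing a "1" and a "0",
# which leaves a level midway between the two states.
v_reference = (V_ONE + V_ZERO) / 2

def sense(v_bitline: float) -> int:
    """Differential sense-amplifier decision against the dummy-cell reference."""
    return 1 if v_bitline > v_reference else 0

# A noisy "1" and a noisy "0" are still resolved correctly.
print(sense(V_ONE - 0.05))   # -> 1
print(sense(V_ZERO + 0.05))  # -> 0
```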
Large DRAM integrated circuit chips frequently provide other features that users may find useful. Faster
access is provided between certain adjacent addresses, usually along a common word line. Some designs feature
on-chip buffer memories, low standby power modes, or error correction circuitry. A few DRAM chips are
designed to mesh with the constraints of particular applications such as image support for CRT displays. Some
on-chip features are effectively hidden from the user. These may include redundant memory addresses which
the maker activates by laser to improve manufacturing yield.
The largest single market for DRAMs is with microprocessors in personal computers. Rapid microprocessor
performance improvements have led DRAM manufacturers to offer improvements especially designed for the
“PC” environment. Extended Data Out mode (EDO) keeps the data accessed from a DRAM valid over a longer
period of the DRAM cycle. EDO mode is intended to ease the synchronization problem between a DRAM and
the increasingly higher speed microprocessor. Synchronous DRAM (SDRAM) allows the rapid sequential
transfer of large blocks of data between the microprocessor and the DRAM without extensive signal “handshaking.” While SDRAMs do nothing to improve the access time to first data, they greatly improve the “bandwidth” between microprocessor and DRAM.
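The bandwidth argument can be illustrated with a toy timing model; the cycle counts below are assumptions chosen for illustration, not specifications of any device. A synchronous burst pays the first-access latency once and then streams one word per clock, whereas repeated asynchronous accesses pay the full access time for every word:

```python
# Toy comparison of asynchronous vs. burst (SDRAM-style) block transfers.
# All cycle counts are illustrative assumptions, not device specifications.
WORDS_IN_BLOCK = 16
FIRST_ACCESS_CYCLES = 6     # latency to the first word (same in both cases)
BURST_CYCLES_PER_WORD = 1   # a synchronous burst streams one word per clock

async_cycles = WORDS_IN_BLOCK * FIRST_ACCESS_CYCLES               # pay the full access each time
burst_cycles = FIRST_ACCESS_CYCLES + (WORDS_IN_BLOCK - 1) * BURST_CYCLES_PER_WORD

print(f"asynchronous:        {async_cycles} cycles for {WORDS_IN_BLOCK} words")
print(f"burst (SDRAM-style): {burst_cycles} cycles for {WORDS_IN_BLOCK} words")
print(f"bandwidth improvement: {async_cycles / burst_cycles:.1f}x "
      "(first-word latency is unchanged)")
```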
Static RAMs (SRAMs)
The primary advantages of SRAMs as compared to DRAMs are high speed and ease of use. In addition, SRAMs fabricated in CMOS technology exhibit extremely low standby power. This latter feature is effectively used in much portable equipment, such as pocket calculators. Bipolar SRAMs are generally faster but less dense than FET versions. Figure 80.4 illustrates two cells. SRAM performance is dominated by the speed of the support circuits, leading some manufacturers to design bipolar support circuits for FET arrays.
Bipolar designs frequently incorporate circuit consolidation unavailable in FET technology, such as the multi-emitter cell shown in Fig. 80.4(a). Here one of the two lower emitters is normally forward biased, turning one inverter on and the other off for bistability. The upper emitters can be used either to extract a differential signal or to discharge one collector towards ground in order to write the cell. The word line is pulsed positive to both read and write the cell.
MULTICOORDINATE DIGITAL INFORMATION STORAGE DEVICE
Jay W. Forrester
Patented February 28, 1956
#2,736,880

Up to this time, digital data storage was generally done by encoding binary data on rotating magnetic drums or other means where data had to be stored and retrieved sequentially. This patent describes a system whereby data could be stored and retrieved randomly by a simple addressing scheme. It used tiny doughnut-shaped ferromagnetic cores with windings to magnetically polarize the material in one direction or the other. This was about one hundred times faster than rotating drums and took up perhaps 2% of the volume. A 4-Kbyte core memory module would take up about 60 cubic inches and could access data in less than one millisecond. Random access memory (RAM) was born. Core memory (as it has become known) was non-volatile; that is, the information would not be lost when power was cut. Modern non-volatile “flash” memory is yet again thousands of times faster and achieves data densities over 100,000 times greater than the breakthrough magnetic core memory described by Forrester. (Copyright © 1995, DewRay Products, Inc. Used with permission.)
A few RAMs use polysilicon load resistors of very high resistance value in place of the two P-channel
transistors shown in Fig. 80.4(b). Most are full CMOS designs like the one shown. Sometimes the P-channel
transistors are constructed by thin film techniques and are physically placed over the N-channel transistors to
improve density. When both P- and N-channel transistors are fabricated in the same plane of the single-crystal
semiconductor, the standby current can be extremely low. Typically this can be microamps for megabit chips.
The low standby current is possible because each cell sources and sinks only that current needed to overcome
the actual node leakage within the cell.
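To put “microamps for megabit chips” in perspective, here is a quick calculation with assumed values for the standby current and supply voltage:

```python
# Standby-power estimate for a full-CMOS SRAM (illustrative values).
cells = 1_048_576          # a 1-Mbit array
standby_current = 2e-6     # total standby current, amperes (assumed "a few microamps")
supply_voltage = 3.3       # volts (assumed)

leakage_per_cell = standby_current / cells
standby_power = standby_current * supply_voltage
print(f"average leakage per cell ~ {leakage_per_cell * 1e12:.1f} pA")
print(f"standby power            ~ {standby_power * 1e6:.1f} uW")
```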
Selecting the proper transconductance for each transistor is an important focus of the designer. The accessing transistors should be large enough to extract a large read signal, but not so large that they disturb the stored information. During the write operation, these same transistors must be capable of overriding the current drive of at least one of the internal CMOS inverters.
The superior performance of SRAMs derives from their larger signal and the absence of a need to refresh
the stored information as in a DRAM. As a result, SRAMs need fewer sense amplifiers. Likewise these amplifiers
are not constrained to match the cell pitch of the array. SRAM design engineers have exploited this freedom
to realize higher-performance sense amplifiers.
FIGURE 80.3 Folded bit line array.
FIGURE 80.4 (a) Bipolar SRAM cell. (b) CMOS SRAM cell.
Practical SRAM designs routinely achieve access times of a few nanoseconds to a few tens of nanoseconds.
Cycle time typically equals access time, and in at least one pipelined design, cycle time is actually less than
access time.
SRAM integrated circuit chips have fewer special on-chip features than DRAM chips, primarily because no
special performance enhancements are needed. By contrast, many other integrated circuit chips feature on-
chip SRAMs. For example, many ASICs (application-specific integrated circuits) feature on-chip RAMs
because of their low power and ease of use.
All modern microprocessors include one or more on-chip “cache” SRAM memories which provide a high
speed link between processor and memory.
Nonvolatile Programmable Memories
A few nonvolatile memories are programmable just once. These have
arrays of diodes or transistors with fuses or antifuses in series with
each semiconductor cross point. Aluminum, titanium, tungsten,
platinum silicide, and polysilicon have all been successfully used as
fuse technology (see Fig. 80.5).
Most nonvolatile cells rely on trapped charge stored on a floating
gate in an FET. These can be rewritten many times. The trapped
charge is subject to very long term leakage, on the order of ten years.
The number of times the cell may be rewritten is limited by pro-
gramming stress-induced degradation of the dielectric. Charge
reaches the floating gate either by tunneling or by avalanche injec-
tion from a region near the drain. Both phenomena are induced by
over-voltage conditions and hence the degradation after repeated
erase/write cycles. Commercially available chips typically promise
100 to 100,000 write cycles. Erasure of charge from the floating gate
may be by tunneling or by exposure to ultraviolet light. Asperities
on the polysilicon gate and silicon-rich oxide have both been shown
to enhance charging and discharging of the gate. The nomenclature
used is not entirely consistent throughout the industry. However,
EPROM is generally used to describe cells which are electrically written but UV erased. EEPROM is used to describe cells which are both electrically written and electrically erased.
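Treating the trapped charge as decaying exponentially with the ten-year time constant mentioned above gives a feel for retention. This is a simplified model chosen for illustration; real retention depends strongly on temperature, stress history, and the dielectric itself:

```python
import math

# Simplified floating-gate retention model: exponential decay with a ~10-year time constant.
TAU_YEARS = 10.0

def charge_remaining(years: float) -> float:
    """Fraction of the originally trapped charge left after the given time."""
    return math.exp(-years / TAU_YEARS)

for years in (1, 5, 10, 20):
    print(f"after {years:2d} years: {charge_remaining(years):.0%} of the charge remains")
```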
Cells are of either a two- or a one-transistor design. Where two
transistors are used, the second transistor is a conventional enhance-
ment mode transistor (see Fig. 80.6). The second transistor works
to minimize the disturb of unselected cells. It also removes some
constraints on the writing limits of the programmable transistor, which in one state may be depletion mode.
The two transistors in series then assume the threshold of the second (enhancement) transistor, or a very high
threshold as determined by the programmable transistor. Some designs are so cleverly integrated that the
features of the two transistors are merged.
Flash EEPROMs describe a family of single-transistor cell EEPROMs. Cell sizes are about half that of two-
transistor EEPROMs, an important economic consideration. Care must be taken that these cells are not
programmed into the depletion mode. An array of depletion mode cells would confound the read operation
by providing multiple signal paths. Programming to enhancement only thresholds can be accomplished by a
sequence of partial program and then monitor subcycles, until the threshold is brought to compliance with
specification limits. Flash EEPROMs require bulk erasure of large portions of the array.
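The partial-program-and-monitor sequence described above can be sketched as a simple verify loop. The cell model, threshold limits, and step size below are illustrative assumptions, not the behavior of any particular device:

```python
# Sketch of the "partial program, then monitor" subcycle sequence for a flash cell.
VT_SPEC_MIN = 1.0   # programmed cell must end up enhancement-mode, above this threshold (V)
VT_SPEC_MAX = 3.0   # ...but must not be over-programmed past this limit (V)
VT_STEP = 0.4       # threshold shift produced by one partial programming pulse (V), assumed

def program_cell(vt_initial: float, max_pulses: int = 20) -> float:
    """Apply short programming pulses until the threshold is within specification."""
    vt = vt_initial
    for pulse in range(1, max_pulses + 1):
        if VT_SPEC_MIN <= vt <= VT_SPEC_MAX:
            print(f"verified in spec after {pulse - 1} pulses (Vt = {vt:.2f} V)")
            return vt
        vt += VT_STEP                    # partial program subcycle
        # ...followed by a monitor (verify) subcycle on the next loop iteration
    raise RuntimeError("cell failed to program within the pulse budget")

program_cell(vt_initial=-0.6)   # an erased (depletion-mode) cell is walked up into spec
```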
NVRAM is a term used to describe a SRAM or DRAM with nonvolatile circuit elements. The cell is built to
operate as a RAM with normal power applied. On command or with power failure imminent, the EEPROM
elements can be activated to capture the last state of the RAM cell. The nonvolatile information is restored to
a SRAM cell by normal internal cell regeneration when power is restored.
FIGURE 80.5 PROM cells.
Read-Only Memories (ROMs)
ROMs are the only form of semiconductor storage which is
permanently nonvolatile. Information is retained without
power applied, and there is not even very gradual informa-
tion loss as in EEPROMs. It is also the most dense form of
semiconductor storage. ROMs are, however, less used than
RAMs or EEPROMs. ROMs must be personalized by a mask
in the fabrication process. This method is cumbersome and
expensive unless many identical parts are to be made. Fur-
thermore it seems much “permanent” information is not
really permanent and must be occasionally updated.
ROM cells can be formed as diodes or transistors at every
intersection of the word and bit lines of a ROM array (see
Fig. 80.7). One of the masks in the chip fabrication process
programs which of these devices will be active. Clever layout and circuit techniques may be used to obtain
further density. Two such techniques are illustrated in Figs. 80.8 and 80.9. The X array shares bit and virtual
ground lines. The AND array places many ROM cells in series. Each of these series AND ROM cells is either an enhancement or a depletion channel of an FET. Sensing is accomplished by pulsing the gates of all series cells positive except the gate which is to be interrogated. Current will flow through all series channels only if the interrogated channel is depletion mode.
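A small sketch of the series (AND) array readout just described: every gate in the stack is driven high except the one being interrogated, so current flows only if the interrogated cell is a depletion device. Mapping a depletion-mode cell to a stored 1 is an illustrative convention, not something fixed by the text:

```python
# Readout of one series string in an AND-type ROM array.
# True = depletion-mode cell (conducts with its gate low), False = enhancement-mode.
string_cells = [True, False, True, True, False, False, True, False]

def read_cell(cells, interrogated: int) -> int:
    """Pulse every gate high except the interrogated one; sense whether current flows."""
    conducting = all(
        True if position != interrogated   # gate driven high: channel conducts regardless
        else cells[position]               # gate held low: conducts only if depletion-mode
        for position in range(len(cells))
    )
    return 1 if conducting else 0

word = [read_cell(string_cells, i) for i in range(len(string_cells))]
print(word)   # -> [1, 0, 1, 1, 0, 0, 1, 0]
```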
FIGURE 80.6 Cross section of two-transistor EEPROM cells.
FIGURE 80.7 ROM cell.
FIGURE 80.8 Layout of ROS X array.
FIGURE 80.9 Layout of ROS AND array.
ROM applications include look-up tables, machine-level instruction code for computers, and small arrays
used to perform logic (see PLA in Section 81.4 of this handbook).
Defining Terms
Antifuse: A fuse-like device which when activated becomes low impedance.
Application-specific integrated circuits (ASICs): Integrated circuits specifically designed for one particular
application.
Avalanche injection: The physics whereby electrons highly energized in avalanche current at a semiconductor
junction can penetrate into a dielectric.
Depletion mode: An FET which is on when zero volts bias is applied from gate to source.
Enhancement mode: An FET which is off when zero volts bias is applied from gate to source.
Polysilicon: Silicon in polycrystalline form.
Tunneling: A quantum-mechanical phenomenon whereby an electron can pass directly through a thin dielectric barrier.
Related Topic
25.3 Application-Specific Integrated Circuits
References
H. Kalter et al., “A 50 nsec 16 Mb DRAM with 10 nsec data rate and on-chip ECC,” IEEE Journal of Solid-State
Circuits, vol. SC 25, no. 5, 1990.
H. Kato, “A 9 nsec 4 Mb BiCMOS SRAM with 3.3 V operation,” Digest of Technical Papers ISSCC, vol 35, 1992.
H. Kawagoe and N. Tsuji, “Minimum size ROM structure compatible with silicon-gate E/D MOS LSI,” IEEE Journal of Solid-State Circuits, vol. SC-11, no. 2, 1976.
Further Information
W. Donoghue et al., “A 256K HCMOS ROM using a four-state cell approach,” IEEE Journal of Solid-State Circuits, vol. SC-20, no. 2, 1985.
D. Frohmann-Bentchkowsky, “A fully decoded 2048 bit electronically programmable MOS-ROM,” Digest of
Technical Papers ISSCC, vol. 14, 1971.
L. A. Glasser and D. W. Dobberpuhl, The Design and Analysis of VLSI Circuits, Reading, Mass.: Addison-Wesley,
1985.
F. Masuoka, “Are you ready for next-generation dynamic RAM chips?” IEEE Spectrum Magazine, vol. 27, no. 11, 1990.
R. D. Pashley and S. K. Lai, “Flash memories: The best of two worlds,” IEEE Spectrum Magazine, vol. 26, no.
12, 1989.
80.2 Basic Disk System Architectures
Randy H. Katz
Architects of high-performance computers have long been forced to acknowledge the existence of a large gap
between the speed of the CPU and the speed of its attached I/O devices. A number of techniques have been
developed in an attempt to narrow this gap, and we shall review them in this chapter.
A key measure of magnetic disk technology is the growth in the maximum number of bits that can be stored
per square inch, i.e., the bits per inch in a disk track times the number of tracks per inch of media. Called
MAD, for maximal areal density, the “First Law in Disk Density” predicts [Frank, 1987]:
MAD = 10^((Year - 1971)/10)    (80.1)
This is plotted against several real disk products in Fig. 80.10. Magnetic disk technology has doubled capacity
and halved price every three years, in line with the growth rate of semiconductor memory. Between 1967 and
1979 the growth in disk capacity of the average IBM data processing system more than kept up with its growth
in main memory, maintaining a ratio of 1000:1 between disk capacity and physical memory size [Stevens, 1981].
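A quick evaluation of Eq. (80.1) shows the trend it implies; reading the result as a multiple of the 1971 density sidesteps the question of absolute units, which the chapter does not spell out:

```python
# Evaluate the "First Law in Disk Density", Eq. (80.1): MAD = 10 ** ((year - 1971) / 10).
def mad(year: int) -> float:
    return 10 ** ((year - 1971) / 10)

for year in (1971, 1981, 1991):
    print(f"{year}: predicted areal density ~ {mad(year):5.1f} (x the 1971 value)")

# The law implies a tenfold increase every decade, i.e., about 26% growth per year:
print(f"annual growth factor ~ {10 ** (1 / 10):.2f}")
```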
In contrast to primary memory technologies, the performance of conventional magnetic disks has improved
only modestly. These mechanical devices, the elements of which are described in more detail in the next section,
are dominated by seek and rotation delays: from 1971 to 1981, the raw seek time for a high-end IBM disk
improved by only a factor of two while the rotation time did not change [Harker et al., 1981]. Greater recording
density translates into a higher transfer rate once the information is located, and extra positioning actuators
for the read/write heads can reduce the average seek time, but the raw seek time only improved at a rate of
7% per year. This is to be compared to a doubling in processor power every year, a doubling in memory density
every two years, and a doubling in disk density every three years. The gap between processor performance and
disk speeds continues to widen, and there is no reason to expect a radical improvement in raw disk performance
in the near future.
To maintain balance, computer systems have been using even larger main memories or solid-state disks to
buffer some of the I/O activity. This may be an acceptable solution for applications whose I/O activity has
locality of reference and for which volatility is not an issue, but applications dominated by a high rate of random
requests for small pieces of data (e.g., transaction processing) or by a small number of sequential requests for
massive amounts of data (e.g., supercomputer applications) face a serious performance limitation.
The rest of the chapter is organized as follows. In the next section, we will briefly review the fundamentals
of disk system architecture. The third section describes the characteristics of the applications that demand high
I/O system performance. Conventional ways to improve disk performance are discussed in the last section.
Basic Magnetic Disk System Architecture
We will review here the basic terminology of magnetic disk devices and controllers and then examine the disk
subsystems of three manufacturers (IBM, Cray, and DEC). Throughout this section we are concerned with
technologies that support random access, rather than sequential access (e.g., magnetic tape). A more detailed
discussion, focusing on the structure of small dimension disk drives, can be found in Vasudeva [1988]. The
basic concepts are illustrated in Fig. 80.11. A spindle consists of a collection of platters. Platters are metal disks
covered with a magnetic material for recording information. Each platter contains a number of circular
recording tracks. A sector is a unit of a track that is physically read or written at the same time. In traditional
magnetic disks, the constant angular rotation of the platters dictates that sectors on inner tracks are recorded
more densely than sectors on the outer tracks. Thus, the platter can spin at a constant rate and the same amount
of data can be recorded on the inner and outer tracks.[1] Some modern disks use zone recording techniques to more densely record data on the outer tracks, but this requires more sophisticated read/write electronics.

[1] Some optical disks use a technique called constant linear velocity (CLV), where the platter rotates at different speeds depending on the relative position of the track. This allows more data to be stored on the outer tracks than the inner tracks, but because it takes more delay to vary the speed of rotation, the technique is better suited to sequential rather than random access.
The read/write head is an electromagnet that produces switchable magnetic fields to read and record bit streams on a platter’s track. It is associated with a disk arm, attached to an actuator. The head “flies” close to, but never touches, the rotating platter (except perhaps when powered down). This is the classical definition of a Winchester disk. The actuator is a mechanical assembly that positions the head electronics over the appropriate track. It is possible to have multiple read/write mechanisms per surface, e.g., multiple heads per arm (at one extreme, one could have a head-per-track position, that is, the disk equivalent of a magnetic drum) or multiple arms per surface through multiple actuators. Due to costs and technical limitations, it is usually uneconomical to build a device with a large number of actuators and heads.
FIGURE 80.10 Maximal areal density law. Squares represent predicted density; triangles are the MAD reported for the
indicated products.
FIGURE 80.11 Disk terminology. Heads reside on arms which are positioned by actuators. Tracks are concentric rings on
platters. A sector is the basic unit of read/write. A cylinder is a stack of tracks at one actuator position. An HDA is everything
in the figure plus the air-tight casing. In some devices it is possible to transfer from multiple surfaces simultaneously. The
collection of heads that participate in a single logical transfer that is spread over multiple surfaces is called a head group.
A cylinder is a stack of tracks at one actuator position. A head disk assembly (HDA) is the collection of
platters, heads, arms, and actuators, plus the air-tight casing. A disk drive is an HDA plus all associated
electronics. A disk might be a platter, an actuator, or a drive, depending on the context.
We can illustrate these concepts by describing two first-generation supercomputer disks, the Cray DD-19
and the CDC 819 [Bucher and Hayes, 1980]. These were state-of-the-art disks around 1980. Each disk has
40 recording surfaces (20 platters), 411 cylinders, and 18 (DD-19) or 20 (CDC 819) 512-byte sectors per track.
Both disks possess a limited “parallel read-out” capability. A given data word is actually byte interleaved over
four surfaces. Rather than a single set of read/write electronics for the actuator, these disks have four sets, so
it is possible to read or write with four heads at a time. Four heads on adjacent arms are called a head group.
A disk track is thus composed of the stacked recording tracks of four adjacent surfaces, and there are 10 tracks
per cylinder, spread over 40 surfaces. The advances over the last decade can be illustrated by the Cray DD-49,
which is a typical high-end supercomputer disk of today. It consists of 16 recording surfaces (9 platters),
886 cylinders, 42 4096-byte sectors per track, with 32 read/write heads organized into eight head groups, four
groups on each of two independent actuators. Each actuator can sweep the entire range of tracks, and by
“scheduling” the arms to position the actuator closest to the target track of the pending request, the average
seek time can be reduced. The DD-49 has a capacity of 1.2 Gbytes of storage and can transfer at a sustained
rate of 9.6 Mbytes/s.
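The quoted DD-49 capacity can be roughly reconstructed from the geometry given above, treating a logical track as the four stacked surfaces of one head group so that there are eight logical tracks per cylinder; that reading of the organization is an inference from the text, not a statement in it:

```python
# Rough capacity check for the Cray DD-49 from the figures quoted in the text.
cylinders = 886
head_groups = 8              # 32 heads organized as eight groups of four surfaces
sectors_per_track = 42       # per logical track
bytes_per_sector = 4096

capacity_bytes = cylinders * head_groups * sectors_per_track * bytes_per_sector
print(f"computed capacity ~ {capacity_bytes / 1e9:.2f} Gbytes (text quotes 1.2 Gbytes)")

# At the sustained rate of 9.6 Mbytes/s, one logical track moves in roughly:
track_bytes = sectors_per_track * bytes_per_sector
print(f"one logical track = {track_bytes / 1024:.0f} Kbytes, "
      f"~{track_bytes / 9.6e6 * 1000:.0f} ms at 9.6 Mbytes/s")
```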
A variety of standard and proprietary interfaces are defined for transferring the data recorded on the disk
to or from the host. We concentrate on industry standards here. On the disk surface, information is represented
as alternating polarities of magnetic fields. These signals need to be sensed, amplified, and decoded into
synchronized pulses by the read electronics. For example, the pulse-level protocol ST506/412 standard describes
the way pulses can be extracted from the alternating flux fields. The bit-level ESDI, SMD, and IPI-2 standards
describe the bit encoding of signals. At the packet level, these bits must be aligned into bytes, error correcting
codes need to be applied, and the extracted data must be delivered to the host. These “intelligent” standards
include SCSI (small computer standard interface) and IPI-3.
The ST506 is a low-cost but primitive interface, most appropriate for interfacing floppy disks to personal
computers and low-end workstations. For example, the controller must perform data separation on its own;
this is not done for it by the disk device. As a result, its transfer rate is limited to 0.625 Mbytes/s. The SMD
interface is higher performance and is used extensively in connecting disks to mainframe disk controllers. ESDI
is similar, but geared more towards smaller disk systems. One of its innovations over the ST506 is its ability to
specify a seek to a particular track number rather than requiring track positioning via step-by-step pulses. Its
performance is in the range of 1.25–1.875 Mbytes/s. SCSI has so far been used primarily with workstations
and minicomputers, but offers the highest degree of integration and intelligence. Implementations with per-
formance at the level of 1.5–4 Mbytes/s are common. The newer IPI-3 standard has the advantages of SCSI,
but provides even higher performance at a higher cost. It is beginning to make inroads into mainframe systems.
However, because of the very widespread use of SCSI, many believe that SCSI-2, an extension of SCSI to wider
signal paths, will become the de facto standard for high-performance small disks.
The connection pathway between the host and the disk device varies widely depending on the desired level
of performance. A low-end workstation or personal computer would use a SCSI interface to directly connect the
device to the host. A higher end file server or minicomputer would typically use a separate disk controller to
manage several devices at the same time. These devices attach to the controller through SMD interfaces. It is the
controller’s responsibility to implement error checking and corrections and direct memory transfer to the host.
Mainframes tend to have more devices and more complex interconnection schemes to access them. In IBM
terminology [Buzen and Shum, 1986], the channel path, i.e., the set of cables and associated electronics that
transfer data and control information between an I/O device and main memory, consists of a channel, a storage
director, and a head of string (see Fig. 80.12). The collection of disks that share the same pathway to the head
of string is called a string.
In earlier IBM systems, a channel path and channel are essentially the same thing. The channel processor is
the hardware that executes channel programs, which are fetched from the host’s memory. A subchannel is the
execution environment of a channel program, similar to a process on a conventional CPU. Formerly, a
subchannel was statically assigned for execution to a particular channel, but a major innovation in high-end
IBM systems (308X and 3090) allows subchannels to be dynamically switched among channel paths. This is
like allocating a process to a new processor within a multiprocessor system every time it is rescheduled for
execution.
I/O program control statements, e.g., transfer in channel, are interpreted by the channel, while the storage
director (also known as the device controller or control unit) handles seek and data-transfer requests. Besides
these control functions, it may also perform certain datapath functions, such as error detection/correction and
mapping between serial and parallel data. In response to requests from the storage director, the device will
position the access mechanism, select the appropriate head, and perform the read or write. If the storage director
is simply a control unit, then the datapath functions will be handled by the head of string (also known as a
string controller).
To minimize the latency caused by copying into and out of buffers, the IBM I/O system uses little buffering
between the device and memory.[1]
In a high-performance environment, devices spend a good deal of time
waiting for the pathway’s resources to become free. These resources are used for time periods related to disk
transfer speeds, measured in milliseconds. One possible method for improving utilization is to support dis-
connect/reconnect. A subchannel can connect to a device, issue a seek, disconnect to free the channel path for
other requests, and reconnect later to perform the transfer when the seek is completed. Unfortunately, not all
reconnects can be serviced immediately, because the control units are busy servicing other devices. These RPS
misses (to be described in more detail in the next section) are a major source of delay in heavily utilized IBM
storage subsystems [Buzen and Shum, 1987]. Performance can be further improved by providing multiple paths between memory and devices. To this purpose, IBM’s high-end systems support dynamic path reconnect, a mechanism that allows a subchannel to change its channel path each time it cycles through a disconnect/reconnect with a given device. Rather than wait for its currently allocated path to become free, it can be assigned to another available path.
[1] Only the most recent generation of storage directors (e.g., IBM 3880, 3990) incorporate disk caches, but care must be taken to avoid cache management-related delays [Buzen, 1982].
FIGURE 80.12 Host-to-device pathways. For large IBM mainframes, the connection between host and device must pass
through a channel, storage director, and string controller. Note that multiple storage directors can be attached to a channel,
multiple string controllers per storage director, and multiple devices per string controller. This multipathing approach makes
it possible to share devices among hosts and to provide alternative pathways to better utilize the drives and controllers.
While logically correct, the figure does not reflect the true physical components of high-end IBM systems (308X, 3090). The
concept of channel has disappeared from these systems and has been replaced by a channel path.
Turning to supercomputer I/O systems, we will now examine the I/O architecture of the Cray machines.
Because the Cray I/O system (IOS) varies from model to model, the following discussion concentrates on the
IOS found on the Cray X-MP and Y-MP [Cray, 1988]. In general, the IOS consists of two to four I/O processors
(IOPs), each with its own local memory and sharing a common buffer memory with the other IOPs. The IOP
is designed to be a simple, fast machine for controlling data transfers between devices and the central memory
of the Cray main processors. Since it executes the control statements of an I/O program, it is not unlike the
IBM channel processor in terms of its functionality, except that IO programs reside in its local memory rather
than in the host’s. An IOP’s local memory is connected through a high-speed communications interface, called
a channel in Cray terminology, to a disk control unit (DCU). A given port into the local memory can be time
multiplexed among multiple channels. Data is transferred back and forth between devices and the main
processors through the IOP’s local memory, which is interfaced to central memory through a 100-Mbyte/s
channel pair (one pathway for each direction of transfer).
The DCU provides the interface between the IOP and the disk drives and is similar in functionality to IBM’s
storage director. It oversees the data transfers between devices and the IOP’s local memory, provides speed
matching buffer storage, and transmits control signals and status information between the IOP and the devices.
Disk storage units (DSUs) are attached to the DCU through point-to-point connections. The DSU contains
the disk device and is responsible for dealing with its own defect management, by using a technique called
sector slipping. Figure 80.13 summarizes the elements of the Cray I/O system.

FIGURE 80.13 Elements of the Cray I/O system for the Y-MP. An IOS contains up to four IOPs. The MIOP connects to the operator workstation and performs mainly maintenance functions. The XIOP supports block multiplexing and is most appropriate for controlling relatively slow speed devices, such as tapes. The BIOP and DIOP are designed for controlling high-speed devices like disks. Up to four disk storage units (DSUs) can be attached through the disk control unit (DCU) to the IOP. Three DCUs can be connected to each of the BIOP and DIOP, leading to a total of 24 disks per IOS. The Y-MP can be configured with two IOSs, for a system total of 48 devices.
Digital Equipment Corporation’s high-end I/O strategy is described in terms of the digital storage architecture
(DSA) and is embodied in system configurations such as the VAXCluster shared disk system (see Fig. 80.14).
The architecture provides a rigorous definition of how storage subsystems and host computers interact. It
achieves this by defining a client/server message-based model for I/O interaction based on device-independent
interfaces [Massiglia, 1986; Kronenberg et al., 1986]. A mass storage subsystem is viewed at the architectural
level as consisting of logical block machines capable of storing and retrieving fixed blocks of data, i.e., the I/O
system supports the transfer of logical blocks between CPUs and devices given a logical block number. From
the viewpoint of physical components, a subsystem consists of controllers which connect computers to drives.
The software architecture is divided into four levels: the Operating System Client (also called the Class Driver),
the Class Server (Controller), the Device Client (Data Controller), and the Device Server (Device). The Disk
Class Driver, resident on a host CPU, accepts requests for disk I/O service from applications, packages these
requests into messages, and transmits them via a communications interface (such as the Computer Interconnect
port driver) to the Disk Class Server resident within a controller in the I/O subsystem. The command set
supported by the Class Server includes such relatively device-independent operations as read logical block,
write logical block, bring on-line, and request status. The Disk Class Server[1] interprets the transmitted com-
mands, handles the scheduling of command execution, tracks their progress, and reports status back to the
Class Driver. Note the absence of seek or select head commands. This interface can be used equally well for
solid-state disks as for conventional magnetic disks. Device-specific commands are issued at a lower level of
the architecture, i.e., between the Device Client (disk controller) and Device Server (disk device). The former
provides the path for moving commands and data between hosts and drives, and it is usually realized physically
by a piece of hardware that corresponds to the device controller. The latter coincides with the physical drives
used for storing and retrieving data.
It is interesting to contrast these proprietary approaches with an industry standard approach like SCSI,
admittedly targeted for the low to mid range of performance. SCSI defines the logical and physical interface
between a host bus adapter (HBA) and a disk controller, usually embedded within the assembly of the disk
device. The HBA accepts I/O requests from the host, initiates I/O actions by communicating with the controllers,
and performs direct memory access transfers between its own buffers and the memory of the host. Requesters
of service are called initiators, while providers of service are called targets. Up to eight nodes can reside on a
single SCSI string, sharing a common pathway to the HBA. The embedded controller performs device handling
and error recovery. Physically, the interface is implemented with a single daisy-chained cable, and the 8-bit
datapath is used to communicate control and status information, as well as data. SCSI defines a layered
communications protocol, including a message layer for protocol and status, and a command/status layer for
target operation execution. The HBA roughly corresponds to the function of the IBM channel processor or
Cray IOP, while the embedded controller is similar to the IBM storage director/string controller or the Cray
DCU. Despite the differences in terminology, the systems we have surveyed exhibit significant commonality
of function and similar approaches for partitioning these functions among hardware components.
Characterization of I/O Workloads
Before characterizing the I/O behavior of different workloads, it is necessary to first understand the elements
of disk performance. Disk performance is a function of the service time, which consists of three main compo-
nents: seek time, rotational latency, and data transfer time.[2]

[1] Other kinds of class servers are also supported, such as for tape drives.
[2] In a heavily utilized system, delays waiting for a device can match actual disk service times, which in reality is composed of device queuing, controller overhead, seek, rotational latency, reconnect misses, error retries, and data transfer.

FIGURE 80.14 VAXCluster architecture. CPUs are connected to HSCs (hierarchical storage controllers) through a dual CI (computer interconnect) bus. Thirty-one hosts and 31 HSCs can be connected to a CI. Up to 32 disks can be connected to an HSC-70.

Seek time is the time needed to position the heads
to the appropriate track position containing the desired data. It is a function of a substantial initial start-up
cost to accelerate the disk head (on the order of 6 ms) as well as the number of tracks that must be traversed. Typical
average seek times, i.e., the time to traverse between two randomly selected tracks (approximately 28% of the data
band), are in the range of 10 to 20 ms. The track-to-track seek time is usually below 10 ms and as low as 2 ms.
The second component of service time is rotational latency. It takes some time for the desired sector to
rotate under the head position before it can be read or written. Today’s devices spin at a rate of approximately
3600 rpm, or 60 revolutions per second (we expect to see rotation speeds increase to 5400 rpm in the near
future). For today’s disks, a full revolution is 16 ms, and the average latency is 8 ms. Note that the worst-case
latencies are comparable to average seeks.
The last component is the transfer time, i.e., the time to physically transfer the bytes from disk to the host.
While the transfer time is a strong function of the number of bytes to be transferred, seek and rotational
latencies times are independent of the transfer blocksize. If data is to be read or written in large chunks, it
makes sense to choose a large blocksize, since the “fixed cost” of seek and latency is better amortized across a
large data transfer.
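A short calculation makes the point about fixed costs. It uses representative values from this section (a 12-ms average seek and 3600 rpm) together with an assumed 2-Mbyte/s transfer rate:

```python
# Disk service time = seek + rotational latency + transfer (representative values).
avg_seek = 12e-3                 # seconds, within the 10-20 ms range quoted above
rpm = 3600
avg_latency = 0.5 * 60 / rpm     # half a revolution on average -> ~8.3 ms
transfer_rate = 2e6              # bytes/second (assumed, typical of the era)

for block in (4 * 1024, 64 * 1024, 1024 * 1024):
    transfer = block / transfer_rate
    service = avg_seek + avg_latency + transfer
    fixed_cost = (avg_seek + avg_latency) / service
    print(f"{block // 1024:5d} Kbyte block: service {service * 1000:6.1f} ms, "
          f"seek+latency is {fixed_cost:.0%} of it")
```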
A low-performance I/O system might dedicate the pathway between the host and the disk for the entire
duration of the seek, rotate, and transfer times. Assuming small blocksizes, transfer time is a small component
of the overall service time, and these pathways can be better utilized if they are shared among multiple devices.
Thus, higher performance systems support independent seeks, in which a device can be directed to detach itself
from the pathway while seeking to the desired track (recall the discussion of dynamic path reconnect in the
previous section). The advantage is that multiple seeks can be overlapped, reducing overall I/O latency and
better utilizing the available I/O bandwidth.
However, to make it possible for devices to reattach to the pathway, the I/O system must support a mechanism
called rotational position sensing, i.e., the device interrupts the I/O controller when the desired sector is under
the heads. If the pathway is currently in use, the device must pay a full rotational delay before it can again
attempt to transfer. These rotational position sensing miss delays (RPS delays) represent a major source of
degradation in many existing I/O systems [Buzen and Shum, 1987]. This arises from the lack of device buffering
and the real-time service requirements of magnetic disks. At the time that these architectures were established,
buffer memories were expensive and the demands for high I/O performance were less pressing with slower
speed CPUs. An alternative, made more attractive by today’s relative costs of electronic and mechanical com-
ponents, is to associate a track buffer with the device that can be filled immediately. This can then be used as
the source of the transfer when the pathway becomes available [Houtekamer, 1985].
I/O intensive applications vary widely in the demand they place on the I/O system. They run the gamut
from processing small numbers of bulk I/Os that must be handled with minimum delay (supercomputer I/O)
to large numbers of simple tasks that touch small amounts of data (transaction processing). An important
design challenge is to develop an I/O system that can handle the performance needs of these diverse workloads.
A given workload’s demand for I/O service can be specified in terms of three metrics: throughput, latency,
and bandwidth. Throughput refers to the number of requests for service made per unit time. Latency measures
how long it takes to service an individual request. Bandwidth gauges the amount of data flowing between
service requesters (i.e., applications) and service providers (i.e., devices).
As observed by Bucher and Hayes [1980], supercomputer I/O can be characterized almost entirely by
sequential I/O. Typically, computation parameters are moved in bulk from disk to in-memory data structures,
and results are periodically written back to disk. These workloads demand large bandwidth and minimum
latency, but are characterized by low throughput. Contrast this with transaction processing, which is charac-
terized by enormous numbers of random accesses, relatively small units of work, and a demand for moderate
latency with very high throughput.
Figure 80.15 shows another way of thinking about the varying demands of I/O intensive applications. It
shows the percentage of time different applications spend in the three components of I/O service time. Trans-
action processing systems spend the majority of their service time in seek and rotational latency; thus techno-
logical advances which reduce the transfer time will not affect their performance very much. On the other
hand, scientific applications spend a more equal amount of time in seek and data transfer, and their performance
is sensitive to any improvement in disk technology.
Extensions to Conventional Disk Architectures
In this subsection, we will focus on techniques for improving the performance of conventional disk systems,
i.e., methods which allow us to reduce the seek time, rotational latency, or transfer time of conventional disks.
By reducing disk service times, we also decrease device queuing delays. These techniques include fixed-head
disks, parallel transfer disks, increased disk density, solid-state disks, disk caches, and disk scheduling.
Fixed-Head Disk
The concept of a fixed-head disk is to place a read/write head at every track position. The need for positioning
the heads is eliminated, thus eliminating the seek time altogether. The approach does not assist in reducing
rotational latencies, nor does it lessen the transfer time. Fixed-head disks were often used in the early days of
computing systems as a back-end store for virtual memory. However, since modern disks have hundreds of
tracks per surface, placing a head at every position is no longer viewed as an economical solution.
Parallel Transfer Disks
Some high-performance disk drives make it possible to read or write from multiple disk surfaces at the same
time. For example, the Cray DD-19 and DD-49 disks described in the second section have a parallel transfer
capability. The advantage is that much higher transfer rates can be achieved, but no assistance is provided for
seek or rotational latency. Thus transfer units are correspondingly larger in these systems.
A number of economic and technological issues limit the usefulness of parallel transfer disks. From the
economic perspective, providing more than one set of read/write electronics per actuator is expensive. Further,
current disks use sophisticated control systems to lock onto an individual track, and it is difficult to do this
simultaneously across tracks within the same cylinder. Hence, the Cray strategy limits head groups to only
four surfaces. There appears to be a fundamental trade-off between track density and the number of platters:
as the track density increases, it becomes ever more difficult to lock onto tracks across many platters, and the
number of surfaces that can participate in a parallel transfer is reduced. For example, current Cray track densities
are around 980 tracks/inch, and require a rather sophisticated closed-loop track-following servo system to
position the heads accurately with finely controlled voice coil actuators. A lower-cost ($/megabyte) high-performance disk system can be constructed from several standard drives rather than from a single parallel transfer device, in part because of the relatively small sales volume of parallel transfer devices compared to standard drives.
Increasing Disk Density
As described in the first section, the improvements in disk recording density are likely to continue. Higher bit
densities are achieved through a combination of the use of thinner films on the disk platters (e.g., densities
improve from 16,000 bpi to 21,000 bpi when thick iron oxide is replaced with thin film materials), smaller gaps
between the poles of the read/write head’s electromagnet, and heads which fly closer to the disk surface.
FIGURE 80.15 I/O system parameters as a function of application. Transaction processing applications are seek and
rotational latency limited, since only small blocks are usually transferred from disk. Image-processing applications, on the
other hand, transfer huge blocks and thus spend most of their I/O time in data transfer. Scientific computing applications
tend to fall in between. (Source: I. Y. Bucher and A. H. Hayes, “I/O performance measurement on Cray-1 and CDC 7600
computers,” Proc. Cray Users Group Conference, October 1980. With permission.)
While vertical recording techniques have long been touted as the technology of the future, advances in head
technology make it possible to continue using conventional horizontal methods, but still keep disks on the
MAD curve. These magneto-resistive heads employ noninductive methods for reading, which work well with
dense horizontal recording fields. A more conventional head is still needed for writing, but this dual-head
organization permits separate optimizations for read and write.
Also, the choice of coding technique can have a significant effect on density. Standard modified frequency
modulation techniques require approximately one flux change per bit, while more advanced run-length limited
codes can increase density by an additional factor of 50%. Densities as high as 31,429 bpi can be attained with
these techniques. As the recording densities increase, the transfer times decrease, as more bits transit beneath
the heads per unit time. Of course, this approach provides no improvement in seek and latency times. Most
of the increase in density comes from increases in the number of tracks per inch, which does not improve (and
may actually reduce) performance.
Although increased densities are inevitable, the problem is primarily economic. Increasing the tracks per
inch may make seeks slower as it becomes more time consuming for the heads to correctly “lock” onto the
appropriate track. The sensing electronics get more complex and thus more expensive. Once again, it can be
argued that higher capacity can be achieved at lower cost by using several smaller disks rather than one expensive
high-density disk.
Solid-State Disks
Solid-state disks (SSD), constructed from relatively slow memory chips, can be viewed either as a kind of large
and slow main memory or as a small and high-speed disk. When viewed as large main memory, the SSD is
often called expanded storage (ES). The expanded storage found in the IBM 3090 class machines [Buzen and
Shum, 1986] supports operations for paging data blocks from and to main memory. Usually, the expanded
storage looks to the system more like memory than an I/O device: it is directly attached to main memory
through a high-speed bus rather than an I/O controller. The maximum transfer bandwidth on the IBM 3090
between expanded store and memory is two orders of magnitude faster than conventional devices: approxi-
mately 216 Mbytes/s—one word each 18.5 ns!
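The quoted figure is easy to check: one word every 18.5 ns, assuming the 4-byte word implied by the arithmetic, reproduces the stated bandwidth:

```python
# Check of the IBM 3090 expanded-storage bandwidth figure: one word each 18.5 ns.
word_bytes = 4          # assuming a 4-byte word, which reproduces the quoted number
cycle = 18.5e-9         # seconds per word transferred

bandwidth = word_bytes / cycle
print(f"~{bandwidth / 1e6:.0f} Mbytes/s")   # -> ~216 Mbytes/s, as quoted
```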
Further, unlike conventional devices, a transfer between memory and expanded storage is performed syn-
chronously with the CPU. This is viewed as acceptable, because the transfer requires so little time and does not
involve the usual operating system overheads of I/O set-up and interrupts. Note that to transfer data from ES
to disk requires the data to be first staged into main memory.
The Cray X-MP and Y-MP also support SSDs, which can come in configurations of up to 4096 Mbytes,
approximately four times the capacity of the DD-49. The SSD has the potential for enormous bandwidth. It
can be attached to the Cray IO system or directly to the CPU through up to two 1000-Mbyte/s channels. Access
can be arranged in one of three ways [Reinhardt, 1988]. The first alternative is to treat the SSD as a logical
disk, with users responsible for staging heavily accessed files to it. Unfortunately, this leads to the inevitable
contention for SSD space. Further, the operating system’s disk device drivers are not tuned for the special
capabilities of SSDs, and some performance is lost. The second alternative is to use the SSD as an extended
memory, in much the same manner as IBM’s extended storage. Special system calls for accessing the SSD bypass
the usual disk-handling code, and a 4096-byte sector can be accessed in about 25 µs. The last alternative is to use the
SSD as a logical device cache, i.e., as a second-level cache for multitrack chunks of files that resides between
the system’s in-main memory file cache and the physical disk devices. Cray engineers have observed workload
speedups for their UNIX-like operating system of a factor of four over conventional disk when the cache is
enabled. These results indicate that SSDs are most appropriate for containing “hot spot” data [Gawlick, 1987].
Conventional wisdom has it that 20% of the data receives 80% of the accesses, and this has been widely observed
in transaction processing systems [Gawlick, 1987].
If SSDs are to be used to replace magnetic disks, then they must be made nonvolatile, and herein lies their
greatest weakness. This can be achieved through battery back-up, but the technique is controversial. First, it is
difficult to verify that the batteries will be fully charged when needed, i.e., when conventional power fails.
Second, it is difficult to determine how long is long enough when powering the SSD with batteries. This should
probably be long enough to off-load the disk’s contents to magnetic media. Fortunately, low-power DRAM
and wafer scale integration technology are making feasible longer battery hold times.
Another weakness is their cost. At the present time, there is a 10- to 20-fold difference in price per megabyte between magnetic disk storage and DRAM. While wafer scale integration
may bring this price down in the future, for the near term SSDs will be limited to a staging or caching function.
Disk Caches
Disk caches place buffer memories between the host and the device. If disk data is likely to be re-referenced,
caches can be effective in eliminating the seek and rotational latencies. Unfortunately, this effectiveness depends
critically on the access behavior of the applications. Truly random access with little re-referencing cannot make
effective use of disk caches. However, applications that exhibit a large degree of sequential access can use a
cache to good purpose, because data can be staged into the cache before it is actually requested.
Disk caches can become even more useful if they are made nonvolatile using the battery back-up techniques
described in the previous subsection (and with the same potential problems). A nonvolatile cache will allow
“fast writes”: the application need not wait for the write I/O to actually complete before it is notified that it
has completed. For some applications environments, disk caches have the beneficial effect of reducing the
number of reads and thus the number of I/O requests seen by the disks. This has the interesting side effect of
increasing the percentage of writes found in the I/O mix, and some observers believe that writes may dominate
I/O performance in future systems.
As already mentioned, a disk cache can also lead to better utilization of the host-to-device pathways. A device
can transfer data into a cache even if the pathway is in use by another device on the same string. Thus caches
are effective in avoiding rotational position sensing misses.
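The "fast write" idea can be sketched in a few lines of Python. A write completes as soon as it lands in the (nominally nonvolatile) cache, and the disk write happens later; served reads also reduce the read traffic the disk sees. The Disk interface, the cache size, and the oldest-first flush policy are simplifying assumptions for illustration, not a description of any particular controller.

    from collections import OrderedDict

    class FastWriteCache:
        """Minimal write-behind cache: writes complete once buffered (assumed battery-backed)."""
        def __init__(self, disk, capacity=1024):
            self.disk = disk              # assumed object with read(block) / write(block, data)
            self.capacity = capacity
            self.dirty = OrderedDict()    # block number -> data, oldest entry first

        def write(self, block, data):
            self.dirty[block] = data      # caller is notified of completion here ("fast write")
            self.dirty.move_to_end(block)
            if len(self.dirty) > self.capacity:
                self.flush_one()          # destage the oldest block in the background

        def read(self, block):
            # Serving reads from the cache reduces the number of reads seen by the disk.
            return self.dirty[block] if block in self.dirty else self.disk.read(block)

        def flush_one(self):
            block, data = self.dirty.popitem(last=False)
            self.disk.write(block, data)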
Disk Scheduling
The mechanical delays as seen by a set of simultaneous I/O requests can be reduced through effective disk
scheduling. For example, seek times can be reduced if a shortest-seek-time-first scheduling algorithm is used
[Smith, 1981]. That is, among the queue of pending I/O requests, the one next selected for service is the one
that requires the shortest seek time from the current location of the read/write heads. The literature on disk
scheduling algorithms is vast, and the effectiveness of a particular scheduling approach depends critically on
the workload. It has been observed that scheduling algorithms work best when there are long queues of pending
requests; unfortunately, this situation seems to occur rarely in existing systems [Smith, 1981].
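As a concrete illustration of shortest-seek-time-first, the sketch below orders a queue of pending requests by repeatedly picking the cylinder closest to the current head position. Using cylinder distance as a stand-in for seek time is a simplification; real drives have nonlinear seek profiles.

    def sstf_order(head_pos, pending_cylinders):
        """Return the service order chosen by shortest-seek-time-first scheduling."""
        pending = list(pending_cylinders)
        order = []
        while pending:
            nearest = min(pending, key=lambda cyl: abs(cyl - head_pos))
            pending.remove(nearest)
            order.append(nearest)
            head_pos = nearest
        return order

    # Example: head at cylinder 50 with a queue of pending requests.
    print(sstf_order(50, [82, 170, 43, 140, 24, 16, 190]))   # [43, 24, 16, 82, 140, 170, 190]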
Disk Arrays
An alternative to the approaches just described is to exploit parallelism by grouping together a number of
physical disks and making these appear to applications as a single logical disk. This has the advantage that the
bandwidth of several disks can be harnessed to service a single logical I/O request or can support multiple
independent I/Os in parallel. Further, arrays can be constructed using existing, widely available disk technology,
rather than the more specialized and more expensive approaches described in the previous subsection. For
example, Cray offers a device called the DS-40, which appears as a single logical disk device but which is actually
implemented internally as four drives. A logical track is constructed from sectors across the four disks. The
DS-40 can transfer at a peak rate of 20 Mbyte/s, with a sustained transfer rate of 9.6 Mbyte/s, and thus is strictly
faster than the DD-49.
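The sketch below shows the kind of mapping such an array performs: logical sectors are striped round-robin across the member drives so that a logical track spans all of them and a sequential transfer engages every spindle. The four-drive count follows the DS-40 description above; the sectors-per-track figure is an arbitrary assumption for illustration.

    def locate(logical_sector, n_drives=4, sectors_per_track=48):
        """Map a logical sector number to (drive, physical track, physical sector)."""
        drive = logical_sector % n_drives              # round-robin striping across the drives
        index_on_drive = logical_sector // n_drives
        track, sector = divmod(index_on_drive, sectors_per_track)
        return drive, track, sector

    # Sixteen consecutive logical sectors touch all four drives four times each,
    # which is what lets the array sustain a higher transfer rate than a single disk.
    print([locate(s)[0] for s in range(16)])   # [0, 1, 2, 3, 0, 1, 2, 3, ...]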
Defining Terms
Arm: A mechanical assembly that positions the head to the correct track for reading or writing.
Bandwidth: The amount of data per unit time flowing between host computers and storage devices.
Cylinder: A stack of tracks at one actuator position.
Disk drive: An HDA plus all associated electronics.
Head: An electromagnet that produces switchable magnetic fields to read and record bit streams on a platter’s
track.
Head disk assembly (HDA): The collection of platters, heads, arms, and actuators, plus the air-tight casing,
that makes up the storage device. Basically, this is everything but the electronics for controlling the drive
and interfacing it to a computer system.
Latency: How long it takes to service an individual request.
Maximal areal density (MAD): The maximum number of bits that can be stored per square inch. Computed
by multiplying the bits per inch in a disk track times the number of tracks per inch of media.
Platters: Metal disks covered with a magnetic material for recording information.
Rotational latency: The time it takes for the desired sector to rotate under the head position before it can be
read or written.
Rotational position sensing: A storage device interrupts the I/O controller when the desired sector is under
the heads.
Sector: A unit of storage that is physically read or written at the same time.
Seek time: The time needed to position the heads to the appropriate track position containing the desired data.
Spindle: The collection of disk platters.
Track buffer: A memory buffer embedded in the disk drive. It can hold the contents of the current disk track.
Tracks: The circular recording regions on a platter.
Transfer time: The time taken to physically transfer the bytes from disk to the host.
Throughput: The number of requests for disk service per unit time.
Winchester disk: A magnetic disk in which the read/write heads fly above the recording surface on an air
bearing. This is in contrast to contact recording, such as a floppy disk, in which the head and the magnetic
media are actually touching.
Related Topic
36.2 Magnetic Recording
References
I. Y. Bucher and A. H. Hayes, “I/O performance measurement on Cray-1 and CDC 7600 computers,” Proceedings
of the Cray Users Group Conference, October 1980.
J. Buzen, “BEST/1 analysis of the IBM 3880-13 cached storage controller,” Proc. CMG XIII Conference, 1982.
J. P. Buzen and A. Shum, “I/O architecture in MVS/370 and MVS/XA,” CMG Transactions, vol. 54, pp. 19–26,
Fall 1986.
J. P. Buzen and A. Shum, “A unified operational treatment of RPS reconnect delays,” Proc. 1987 Sigmetrics
Conference, Performance Evaluation Review, vol. 15, no. 1, May 1987.
Cray Research, Inc., “CRAY Y-MP Computer Systems Functional Description Manual,” HR-4001, January 1988.
P. D. Frank, “Advances in head technology,” presentation at Challenges in Disk Technology Short Course,
Institute for Information Storage Technology, University of Santa Clara, Santa Clara, Calif., December
12–15, 1987.
D. Gawlick, Private Communication, November 1987.
J. M. Harker et al., “A quarter century of disk file innovation,” IBM Journal of Research and Development, vol.
25, no. 5, pp. 677–689, September 1981.
G. Houtekamer, “The local disk controller,” Proc. 1985 Sigmetrics Conference, August 1985.
N. P. Kronenberg, H. Levy, and W. D. Strecker, “VAXClusters: A closely-coupled distributed system,” ACM Trans.
on Comp. Systems, vol. 4, no. 2, pp. 130–146, May 1986.
P. Massiglia, Digital Large System Mass Storage Handbook, Colorado Springs, Col.: Digital Equipment Corpo-
ration, 1986.
S. Reinhardt, “A blueprint for the UNICOS operating system,” Cray Channels, vol. 10, no. 3, pp. 20–24, Fall
1988.
A. J. Smith, “Input/output optimization and disk architectures: A survey,” in Performance and Evaluation 1,
North-Holland Publishing Company, 1981, pp. 104–117.
L. D. Stevens, “The evolution of magnetic storage,” IBM Journal of Research and Development, vol. 25, no. 5,
pp. 663–675, September 1981.
A. Vasudeva, “A case for disk array storage system,” Proc. Reliability Conference, Santa Clara, Calif., 1988.
J. Voelcker, “Winchester disks reach for a gigabyte,” IEEE Spectrum, pp. 64–67, February 1987.
Further Information
International Business Machines (IBM) Corporation developed the first rotating magnetic storage device in
the mid-1950s and has always been an industry leader in the storage industry. In honor of the 25-year anniversary
of the invention of the magnetic disk, IBM’s Journal of Research and Development in September 1981 reviewed
the development of the technology up to that time. Two particularly notable papers are
L. D. Stevens, “The evolution of magnetic storage,” IBM Journal of Research and Development, vol. 25, no. 5,
pp. 663–675, September 1981.
J. M. Harker et al., “A quarter century of disk file innovation,” IBM Journal of Research and Development, vol. 25,
no. 5, pp. 677–689, September 1981.
For a more up-to-date review of progress in the disk drive industry, see:
J. Voelcker, “Winchester disks reach for a gigabyte,” IEEE Spectrum, pp. 64–67, February 1987.
80.3 Magnetic Tape
Peter A. Lee
(Based on P. A. Lee, “Memory subsystems,” in Digital Systems Reference Book, B. Holdsworth and G. R. Martin, Eds., Oxford: Butterworth-Heinemann, 1991, chap. 2.6. With permission.)
Computers depend on memory to execute programs and to store program code and data. They also need access
to stored program code and data in a nonvolatile memory (i.e., a form in which the information is not lost
when the power is removed from the computer system). Different types of memory have been developed for
different tasks. This memory can be categorized according to its price per bit, access time, and other parameters.
Table 80.1 shows a typical hierarchy for memory which places the smallest and fastest memory at the top in
level 0 and in general the largest, slowest, and cheapest at the bottom in level 4 [Ciminiera and Valenzano,
1987]. Auxiliary (secondary or mass) memory of level 4 forms the large storage capacity for program and code
that are not currently required by the CPU. This is usually nonvolatile and is at a low cost per bit. Computer
magnetic tape falls within this category and is the subject of this section.
A Brief Historical Review
Probably the first recorded storage device, developed by Schickard in 1623, used mechanical positions of cogs
and gears to work a semi-automatic calculator. Then came Pascal’s calculating machine based on 10 digits per
wheel. In 1812 punched cards were used in weaving looms to store patterns for woven material. Since that time
there have been many mechanical and, latterly, electromechanical devices developed for memory and storage.
In 1948 at Manchester University in England the cathode ray (Williams) tube and the magnetic drum were developed; the Williams tube stores held 1024 and 1280 bits, and the magnetic drum had a capacity of 120K bits. Cambridge
University developed the mercury delay line in 1949, which represented the first fully operational delay line
memory, consisting of 576 bits per tube with a total capacity of 18K bits and a circulation time of 1.1 ms.
The first commercial computer with a magnetic tape system was introduced in 1951. The UNIVAC I had a
magnetic tape system of 1.44M bits on 150 feet of tape and was capable of storing 128 characters per inch. The
tape could be read at a rate of 100 ips. Optical memories are now available as very fast storage devices and will
replace magnetic storage in the next few years. At present these devices are expensive although it is envisaged
that optical disks with large silicon caches will be the storage arrangement of the future where computer systems
utilizing CAD software and image processing can take advantage of the large storage capacities with fast access
times. In the future, semiconductor memories are likely to continue their advancing trend.
Introduction
Today’s microprocessors are capable of addressing up to 16 Mbytes of main memory. To take advantage of this
large capacity, it is usual to have several programs residing in memory at the same time. With intelligent memory
management units (MMUs), the programs can be swopped in and out of the main memory to the auxiliary
memory when required. For the system to keep pace with this program swopping, it must have a fast auxiliary
memory to write to. In the past, most auxiliary systems like magnetic tape and disks have had slow access times,
and this has meant that expensive systems have evolved to cater for this requirement. Now that auxiliary memory has improved, with fast access times and low cost per bit, computer systems have been developed that provide memory swopping with large nonvolatile storage systems. Although the basic technology has not changed over the last 20 years, new materials and different approaches have brought a new form of auxiliary memory to the market at very low cost.
Magnetic Tape
Magnetic tape currently provides the cheapest form of storage for large quantities of computer data in a
nonvolatile form. The tape is arranged on a reel and has several different packaging styles. It is made from a
polyester transportation layer with a deposited layer of oxide having a property similar to a ferrite material
with a large hysteresis. Magnetic tape is packaged either in a cartridge, on a reel, or in a cassette. The magnetic
cartridge is manufactured in several tape lengths and cartridge sizes capable of storing up to 2 G (giga) bytes
of data. These can be purchased in many popular preformatted styles.
The magnetic tape reel is usually 1/2 inch or 1 inch wide and has lengths of 600, 1200, and 2400 feet. Most
reels can store data at rates from 800 bits per inch (bpi) up to 6250 bpi. The reel-to-reel magnetic tape reader
is generally bulkier and more expensive than the cartridge readers due to the complicated pneumatic drive
mechanisms, but it provides a large data storage capacity with high access speeds [Wiehler, 1974]. An example
of a typical magnetic tape drive with the reel-to-reel arrangement is shown in Fig. 80.16.
A cheap storage medium is the magnetic cassette. Based on the audio cassette, this uses the normal audio
cassette recorder for reading and writing data via the standard Kansas City interface through a serial computer
I/O line. A logic data “1” is recorded by a high frequency and a logic data “0” by a lower frequency. High-
density cassettes can store up to 60 Mbytes of data on each tape and are popular with the computer games
market as a cheap storage medium for program distribution.
Both reel-to-reel and cartridge tapes are generally organised by using nine separate tracks across the tape as
shown in Fig. 80.17(a).
Each track has its own read and write head operated independently from other tracks [see Fig. 80.17(b)].
Tracks 1 to 8 are used for data and track nine for the parity bit. Data is written on the tape in rows of magnetized
islands, using for example EBCDIC (Extended Binary Coded Decimal Interchange Code).
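A short sketch of the parity computation for one tape row: eight data bits go to tracks 1 to 8 and the ninth track carries a parity bit. Odd parity is assumed here purely for illustration; the actual convention depends on the recording format in use.

    def tape_row(byte, odd_parity=True):
        """Return the nine track values (tracks 1-8 data, track 9 parity) for one byte."""
        data_bits = [(byte >> i) & 1 for i in range(8)]
        parity = sum(data_bits) % 2
        if odd_parity:                      # force an odd number of 1s across the nine tracks
            parity = 1 - parity
        return data_bits + [parity]

    print(tape_row(0xC1))   # 0xC1 is 'A' in EBCDIC; the last element is the parity bit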
Each read/write head is shaped from a ferromagnetic material with an air gap 1 µm wide as seen in Fig. 80.18.
The writing head is concerned with converting an electrical pulse into a magnetic state and can be magnetized
in one of two directions. This is done by passing a current through the magnetic coil which sets up a leakage
field across the 1-µm gap. When the current is reversed the field across the gap is changed, reversing the polarity
of the magnetic field on the tape. The head magnetizes the passing magnetic tape recording the state of the
magnetic field in the air gap. A logic 1 is recorded as a change in polarity on the tape, and a logic 0 is recorded
as no change in polarity, as seen in Fig. 80.19. Reading the magnetic tape states from the tape and converting
them to electrical signals is done by the read head. The bit sequences in Fig. 80.19 show the change in magnetic
states on the tape. When the tape is passed over the read head, it induces a voltage into the magnetic coil which is converted to digital levels to retrieve the original data.

TABLE 80.1 Memory Hierarchy
             Data                Code                    MMU
Level 0      CPU register        Instruction registers   MMU registers
Level 1      Data cache          Instruction cache       MMU memory
Level 2      On-board cache
Level 3      Main memory
Level 4      Auxiliary memory
Source: P. A. Lee, “Memory subsystems,” in Digital Systems Reference Book, B. Holdsworth and G. R. Martin, Eds., Oxford: Butterworth-Heinemann, 1991, p. 2.6/3. With permission.
Tape Format
Information is stored on magnetic tape in the form of a coherent sequence of rows forming a block. This usually corresponds to a page of computer memory and is the minimum amount of data written to or read from magnetic tape with each program statement. Each block of data is separated by a block gap which is approximately 15 mm long and has no data stored in it. This is shown in Fig. 80.20.
FIGURE 80.16 (a) Magnetic tape drive. (b) Magnetic tape reel arrangement. (Source: K. London, Introduction to Computers, London: Faber and Faber, 1986, p. 141. With permission.)
FIGURE 80.17 Magnetic tape format. (Source: P. A. Lee, “Memory subsystems,” in Digital Systems Reference Book, B. Holdsworth and G. R. Martin, Eds., Oxford: Butterworth-Heinemann, 1991, p. 2.6/11. With permission.)
Block gaps are used to allow the tape to accelerate to its operational speed and for the tape to decelerate
when stopping at the end of a block. Block gaps use up to 50% of the tape space available for recording, although
this may be reduced by making the block sizes larger but has the disadvantage of requiring larger memory
buffers to accommodate the data.
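The fraction of tape that actually holds data follows directly from the block length and the fixed gap, as the small calculation below shows. The 4096-byte block size is an assumed example; the 0.6-in. (approximately 15-mm) gap and the 800–6250-bpi densities come from the figures quoted in this section.

    GAP_INCHES = 0.6                      # block gap of roughly 15 mm

    def tape_utilization(block_bytes, bpi):
        """Fraction of tape length occupied by data for a given block size and density.
        One byte occupies one row across the nine tracks, so block length = bytes / bpi."""
        block_inches = block_bytes / float(bpi)
        return block_inches / (block_inches + GAP_INCHES)

    for bpi in (800, 1600, 6250):
        print(bpi, "bpi -> %.0f%% of the tape carries data" % (100 * tape_utilization(4096, bpi)))
    # Larger blocks or lower densities waste less tape on gaps; at 6250 bpi the gap
    # consumes nearly half the tape, consistent with the 50% figure quoted above.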
FIGURE 80.18 Read/write head layout. (Source: P. A. Lee, “Memory subsystems,” in Digital Systems Reference Book, B.
Holdsworth and G. R. Martin, Eds., Oxford: Butterworth-Heinemann, 1991, p. 2.6/12. With permission.)
FIGURE 80.19 Write and read pulses on magnetic tape. (Source: P. A. Lee, “Memory subsystems,” in Digital Systems
Reference Book, B. Holdsworth and G. R. Martin, Eds., Oxford: Butterworth-Heinemann, 1991, p. 2.6/12. With permission.)
FIGURE 80.20 Magnetic tape format. (Source: P. A. Lee, “Memory subsystems,” in Digital Systems Reference Book, B.
Holdsworth and G. R. Martin, Eds., Oxford: Butterworth-Heinemann, 1991, p. 2.6/12. With permission.)
A number of blocks make up a file identified by a tape file marker which is written to the tape by the tape
controller. The entire length of tape is enclosed between the beginning and end of tape markers. These normally
consist of a photosensitive material that triggers sensors on the read/write heads. When a new tape is loaded,
it normally advances to the beginning-of-tape marker and is then ready for access by the CPU. The end-of-tape marker is used to prevent the tape from running off the end of the tape spool and indicates the limit of the storage length.
Recording Modes
Several recording modes are used with the express objective of storing data at the highest density and with the
greatest reliability of noncorruption of retrieved data. Two popular but contrasting modes are the non-return-
to-zero (NRZ) and phase encoding (PE) modes. These are incompatible although some magnetic tape drives
have detectors to sense the mode and operate in a bimodal way. The NRZ technique is shown in Fig. 80.19,
where only the 1 bit is displayed by a reversal of magnetization on the tape. The magnetic polarity remains
unchanged for logic 0. An external clock track is also required for this mode because a pulse is not always
generated for each row of data on the tape.
The PE technique allows both the 0 and 1 states to be displayed by changes of magnetization. A 1 bit is given
by a north-to-north pole on the tape, and a 0 bit is given by a south-to-south pole on the tape. PE provides
approximately double the recording density and processor speed of NRZ. PE tapes carry an identification mark
called a burst, which consists of successive magnetization changes at the beginning of track 4. This allows the
tape drive to recognize the tape mode and configure itself accordingly.
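To make the two recording modes concrete, the sketch below turns a bit string into a sequence of magnetization levels: in NRZ (as described above) a 1 reverses the polarity and a 0 leaves it unchanged, while in phase encoding every bit cell contains a reversal whose direction carries the bit. Which direction represents a 1 is an assumption made here only for illustration.

    def nrz(bits, level=-1):
        """Non-return-to-zero: reverse the magnetization for a 1, keep it for a 0."""
        out = []
        for b in bits:
            if b:
                level = -level
            out.append(level)
        return out

    def phase_encode(bits):
        """Phase encoding: every bit cell holds a transition whose direction encodes the bit
        (assumed here: 1 -> low-to-high, 0 -> high-to-low)."""
        out = []
        for b in bits:
            out.extend([-1, +1] if b else [+1, -1])
        return out

    bits = [1, 0, 1, 1, 0]
    print(nrz(bits))           # [1, 1, -1, 1, 1]; a 0 produces no transition, hence the external clock track
    print(phase_encode(bits))  # every PE cell contains its own transition, so no clock track is needed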
Defining Terms
Access time: The cycle time for the computer store to present information to the CPU. Access times vary
from less than 40 ns for level 0 register storage up to tens of seconds for magnetic tape storage.
Auxiliary (secondary, mass, or backing) storage: Computer stores which have a capacity to store enormous
amounts of information in a nonvolatile form. This type of memory has an access time usually greater
than main memory and consists of magnetic tape drives, magnetic disk stores, and optical disk stores.
Ferromagnetic material: Materials that exhibit strong magnetic properties. These include metals such as cobalt, iron, and some alloys.
Magnetic tape: A polyester film sheet coated with a ferromagnetic powder, which is used extensively in
auxiliary memory. It is produced on a reel, in a cassette, or in a cartridge transportation medium.
Nonvolatile memory: The class of computer memory that retains its stored information when the power
supply is cut off. It includes magnetic tape, magnetic disks, flash memory, and most types of ROM.
Related Topic
36.2 Magnetic Recording
References
L. Ciminiera and A. Valenzano, Advanced Microprocessor Architectures, Reading, Mass.: Addison-Wesley, 1987.
B. Holdsworth and G. Martin, Eds., Digital Systems Reference Book, Oxford: Butterworth-Heinemann, 1991, pp. 2.6/1–2.6/11.
R. Hyde, “Overview of memory management,” Byte, pp. 219–225, April 1988.
J. Isailović, Video Disc and Optical Memory Systems, Englewood Cliffs, N.J.: Prentice-Hall, 1985.
K. London, Introduction to Computers, London: Faber and Faber Press, 1986, p. 141.
M. Mano, Computer Systems Architecture, Englewood Cliffs, N.J.: Prentice-Hall, 1982.
R. Matick, Computer Storage Systems & Technology, New York: John Wiley, 1977.
A. Tanenbaum, Structured Computer Organisation, Englewood Cliffs, N.J.: Prentice-Hall, 1990.
G. Wiehler, Magnetic Peripheral Data Storage, Heydon & Son, 1974.
Further Information
The IEEE Transactions on Magnetics is available from the IEEE Service Center, Customer Service Department,
445 Hoes Lane, Piscataway, NJ 08855-1331; 800-678-IEEE (outside the USA: 908-981-0060). An IEEE-sponsored
Conference on Magnetism and Magnetic Materials was held in December 1992. The British Tape Industry
Association (BTIA) has a computer media committee, and further information on standards, etc. can be
obtained from British Tape Industry Association, Carolyn House, 22-26 Dingwall Road, Croydon CR0 9XF,
England. The equivalent American Association also provides information on computer tape and can be con-
tacted at International Tape Manufacturers’ Association, 505 Eighth Avenue, New York, NY 10018.
80.4 Magneto-Optical Disk Data Storage
M. Mansuripur
Since the early 1940s, magnetic recording has been the mainstay of electronic information storage worldwide.
Audio tapes provided the first major application for the storage of information on magnetic media. Magnetic
tape has been used extensively in consumer products such as audio tapes and video cassette recorders (VCRs);
it has also found application in backup/archival storage of computer files, satellite images, medical records, etc.
Large volumetric capacity and low cost are the hallmarks of tape data storage, although sequential access to
the recorded information is perhaps the main drawback of this technology. Magnetic hard disk drives have
been used as mass storage devices in the computer industry ever since their inception in 1957. With an areal
density that has doubled roughly every other year, hard disks have been and remain the medium of choice for
secondary storage in computers. (At the time of this writing, achievable densities on hard disks are in the range of 10⁷ bits/cm². Random access to arbitrary blocks of data in these devices can take on the order of 10 ms, and individual read/write heads can transfer data at the rate of several megabits per second.) Another magnetic data storage device, the floppy disk, has been successful in
areas where compactness, removability, and fairly rapid access to the recorded information have been of prime
concern. In addition to providing backup and safe storage, inexpensive floppies with their moderate capacities
(2 Mbyte on a 3.5-in. diameter platter is typical nowadays) and reasonable transfer rates have provided the
crucial function of file/data transfer between isolated machines. All in all, it has been a great half-century of
progress and market dominance for magnetic recording devices, which are only now beginning to face a
potentially serious challenge from the technology of optical recording.
Like magnetic recording, a major application area for optical data storage systems is the secondary storage
of information for computers and computerized systems. Like the high-end magnetic media, optical disks can
provide recording densities in the range of 10⁷ bits/cm² and beyond. The added advantage of optical recording
is that, like floppies, these disks can be removed from the drive and stored on the shelf. Thus the functions of
the hard disk (i.e., high capacity, high data transfer rate, rapid access) may be combined with those of the
floppy (i.e., backup storage, removable media) in a single optical disk drive. Applications of optical recording
are not confined to computer data storage. The enormously successful audio compact disk (CD), which was
introduced in 1983 and has since become the de facto standard of the music industry, is but one example of
the tremendous potentials of the optical technology.
A strength of optical recording is that, unlike its magnetic counterpart, it can support read-only, write-once,
and erasable/rewritable modes of data storage. Consider, for example, the technology of optical audio/video
disks. Here the information is recorded on a master disk which is then used as a stamper to transfer the embossed
patterns to a plastic substrate for rapid, accurate, and inexpensive reproduction. The same process is employed
in the mass production of read-only files (CD-ROM, O-ROM) which are now being used to distribute software,
catalogues, and other large databases. Or consider the write-once read-many (WORM) technology, where one
can permanently store massive amounts of information on a given medium and have rapid, random access to
them afterwards. The optical drive can be designed to handle read-only, WORM, and erasable media all in one
unit, thus combining their useful features without sacrificing performance and ease of use or occupying too much space. What is more, the media can contain regions with prerecorded information as well as regions for
read/write/erase operations, both on the same platter. These possibilities open new vistas and offer opportunities
for applications that have heretofore been unthinkable; the interactive video disk is perhaps a good example
of such applications.
In this article we will lay out the conceptual basis for optical data storage systems; the emphasis will be on
disk technology in general and magneto-optical disk in particular. The first section is devoted to a discussion
of some elementary aspects of disk data storage including the concept of track and definition of the access time.
The second section describes the basic elements of the optical path and its functions; included are the properties
of the semiconductor laser diode, characteristics of the beamshaping optics, and certain features of the focusing
objective lens. Because of the limited depth of focus of the objective and the eccentricity of tracks, optical disk
systems must have a closed-loop feedback mechanism for maintaining the focused spot on the right track.
These mechanisms are described in the third and fourth sections for automatic focusing and automatic track
following, respectively. The physical process of thermomagnetic recording in magneto-optic (MO) media is
described next, followed by a discussion of the MO readout process in the sixth section. The final section
describes the properties of the MO media.
Preliminaries and Basic Definitions
A disk, whether magnetic or optical, consists of a number of tracks along which the information is recorded. These tracks may be concentric rings of a certain width, W_t, as shown in Fig. 80.21. Neighboring tracks may be separated from each other by a guard band whose width we shall denote by W_g. In the least sophisticated recording scheme imaginable, marks of length Δ₀ are recorded along these tracks. Now, if each mark can be in either one of two states, present or absent, it may be associated with a binary digit, 0 or 1. When the entire disk surface of radius R is covered with such marks, its capacity C₀ will be

C₀ = πR² / [(W_t + W_g) Δ₀]   bits per surface      (80.2)

Consider the parameter values typical of current optical disk technology: R = 67 mm corresponding to 5.25-in. diameter platters, Δ₀ = 0.5 µm which is roughly determined by the wavelength of the read/write laser diodes, and W_t + W_g = 1 µm for the track pitch. The disk capacity will then be around 28 × 10⁹ bits, or 3.5 gigabytes. This is a reasonable estimate and one that is fairly close to reality, despite the many simplifying assumptions made in its derivation. In the following paragraphs we examine some of these assumptions in more detail.
FIGURE 80.21 Physical appearance and general features of an optical disk. The read/write head gains access to the disk through a window in the jacket; the jacket itself is for protection purposes only. The hub is the mechanical interface with the drive for mounting and centering the disk on the spindle. The track shown at radius r₀ is of the concentric-ring type.
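A quick numerical check of Eq. (80.2) with the parameter values just quoted, written as a short Python sketch:

    import math

    R = 67e-3            # disk radius, m (5.25-in. platter)
    track_pitch = 1e-6   # W_t + W_g, m
    delta0 = 0.5e-6      # minimum mark length, m

    C0 = math.pi * R**2 / (track_pitch * delta0)                    # Eq. (80.2)
    print("C0 = %.1e bits = %.1f gigabytes" % (C0, C0 / 8 / 1e9))   # about 2.8e10 bits, or 3.5 GB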
The disk was assumed to be fully covered with information-carrying marks. This is generally not the case in practice. Consider a disk rotating at ν revolutions per second (rps). For reasons to be clarified later, this rotational speed should remain constant during the disk operation. Let the electronic circuitry have a fixed clock duration T_c. Then only pulses of length T_c (or an integer multiple thereof) may be used for writing. Now, a mark written along a track of radius r, with a pulse-width equal to T_c, will have length ℓ, where

ℓ = 2πν r T_c      (80.3)

Thus for a given rotational speed ν and a fixed clock cycle T_c, the minimum mark length ℓ is a linear function of track radius r, and ℓ decreases toward zero as r approaches zero. One must, therefore, pick a minimum usable track radius, r_min, where the spatial extent of the recorded marks is always greater than the minimum allowed mark length, Δ₀. Equation (80.3) yields

r_min = Δ₀ / (2πν T_c)      (80.4)

One may also define a maximum usable track radius r_max, although for present purposes r_max = R is a perfectly good choice. The region of the disk used for data storage is thus confined to the area between r_min and r_max. The total number N of tracks in this region is given by

N = (r_max – r_min) / (W_t + W_g)      (80.5)

The number of marks on any given track in this scheme is independent of the track radius; in fact, the number is the same for all tracks, since the period of revolution of the disk and the clock cycle uniquely determine the total number of marks on any individual track. Multiplying the number of usable tracks N with the capacity per track, we obtain for the usable disk capacity

C = N / (ν T_c)      (80.6)

Replacing for N from Eq. (80.5) and for ν T_c from Eq. (80.4), we find

C = 2π r_min (r_max – r_min) / [(W_t + W_g) Δ₀]      (80.7)

If the capacity C in Eq. (80.7) is considered a function of r_min with the remaining parameters held constant, it is not difficult to show that maximum capacity is achieved when

r_min = ½ r_max      (80.8)

With this optimum r_min, the value of C in Eq. (80.7) is only half that of C₀ in Eq. (80.2). In other words, the estimate of 3.5 gigabytes per side for 5.25-in. disks seems to have been optimistic by a factor of two.
One scheme often proposed to enhance the capacity entails the use of multiple zones, where either the rotation speed ν or the clock period T_c is allowed to vary from one zone to the next. In general, zoning schemes can reduce the minimum usable track radius below that given by Eq. (80.8). More importantly, however, they allow tracks with larger radii to store more data than tracks with smaller radii. The capacity of the zoned disk is somewhere between C of Eq. (80.7) and C₀ of Eq. (80.2), the exact value depending on the number of zones implemented.
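The factor-of-two penalty can be checked numerically. The sketch below evaluates Eq. (80.7) at the optimum r_min of Eq. (80.8), using the same illustrative parameter values as before.

    import math

    R = 67e-3            # disk radius, m
    track_pitch = 1e-6   # W_t + W_g, m
    delta0 = 0.5e-6      # minimum mark length, m

    r_max = R
    r_min = 0.5 * r_max                                                  # Eq. (80.8)
    C = 2 * math.pi * r_min * (r_max - r_min) / (track_pitch * delta0)   # Eq. (80.7)
    C0 = math.pi * R**2 / (track_pitch * delta0)                         # Eq. (80.2), for comparison

    print("usable capacity C = %.1e bits (%.2f of C0)" % (C, C / C0))    # ~1.4e10 bits, exactly 0.50 of C0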
A fraction of the disk surface area is usually reserved for preformat information and cannot be used for data
storage. Also, prior to recording, additional bits are generally added to the data for error correction coding
and other housekeeping chores. These constitute a certain amount of overhead on the user data and must be
allowed for in determining the capacity. A good rule of thumb is that overhead consumes approximately 20%
of the raw capacity of an optical disk, although the exact number may vary among the systems in use. Substrate
defects and film contaminants during the deposition process can create bad sectors on the disk. These are
typically identified during the certification process and are marked for elimination from the sector directory.
Needless to say, bad sectors must be discounted when evaluating the capacity.
Modulation codes may be used to enhance the capacity beyond what has been described so far. Modulation
coding does not modify the minimum mark length of Δ₀, but frees the longer marks from the constraint of being integer multiples of Δ₀. The use of this type of code results in more efficient data storage and an effective number of bits per Δ₀ that is greater than unity. For example, the popular (2, 7) modulation code has an effective bit density of 1.5 bits per Δ₀. This or any other modulation code can increase the disk capacity beyond the estimate of Eq. (80.7).
The Concept of Track
The information on magnetic and optical disks is recorded along tracks. Typically, a track is a narrow annulus
at some distance r from the disk center. The width of the annulus is denoted by W_t, while the width of the guard band, if any, between adjacent tracks is denoted by W_g. The track pitch is the center-to-center distance between neighboring tracks and is therefore equal to W_t + W_g. A major difference between the magnetic floppy disk, the magnetic hard disk, and the optical disk is that their respective track pitches are presently of the order of 100, 10, and 1 µm. Tracks may be fictitious entities, in the sense that no independent existence outside the
pattern of recorded marks may be ascribed to them. This is the case, for example, with the audio compact disk
format where prerecorded marks simply define their own tracks and help guide the laser beam during readout.
In the other extreme are tracks that are physically engraved on the disk surface before any data is ever recorded.
Examples of this type of track are provided by pregrooved WORM and magneto-optical disks. Figure 80.22
shows micrographs from several recorded optical disk surfaces. The tracks along which the data are written are
clearly visible in these pictures.
It is generally desired to keep the read/write head stationary while the disk spins and a given track is being
read from or written onto. Thus, in an ideal situation, not only should the track be perfectly circular, but also
the disk must be precisely centered on the spindle axis. In practical systems, however, tracks are neither precisely
circular, nor are they concentric with the spindle axis. These eccentricity problems are solved in low-perfor-
mance floppy drives by making tracks wide enough to provide tolerance for misregistrations and misalignments.
Thus the head moves blindly to a radius where the track center is nominally expected to be and stays put until
the reading or writing is over. By making the head narrower than the track pitch, the track center is allowed
to wobble around its nominal position without significantly degrading the performance during the read/write
operation. This kind of wobble, however, is unacceptable in optical disk systems, which have a very narrow
track, about the same size as the focused beam spot. In a typical situation arising in practice, the eccentricity of a given track may be as much as ±50 µm while the track pitch is only about 1 µm, thus requiring active
track-following procedures.
One method of defining tracks on an optical disk is by means of pregrooves that are either etched, stamped,
or molded onto the substrate. In grooved media of optical storage, the space between neighboring grooves is
the so-called land [see Fig. 80.23(a)]. Data may be written in the grooves with the land acting as a guard band.
Alternatively, the land regions may be used for recording while the grooves separate adjacent tracks. The groove
depth is optimized for generating an optical signal sensitive to the radial position of the read/write laser beam. For the push-pull method of track-error detection the groove depth is in the neighborhood of λ/8, where λ is the wavelength of the laser beam.
FIGURE 80.23 (a) Lands and grooves in an optical disk. The substrate is transparent, and the laser beam must pass through it before reaching the storage medium. (b) Sampled-servo marks in an optical disk. These marks, which are offset from the track center, provide information regarding the position of the focused spot.
In digital data storage applications, each track is divided into small segments or sectors, intended for the
storage of a single block of data (typically either 512 or 1024 bytes). The physical length of a sector is thus a
few millimeters. Each sector is preceded by header information such as the identity of the sector, identity of
the corresponding track, synchronization marks, etc. The header information may be preformatted onto the
substrate, or it may be written on the storage layer prior to shipping the disk. Pregrooved tracks may be “carved”
on the optical disk either as concentric rings or as a single continuous spiral. There are certain advantages to
each format. A spiral track can contain a succession of sectors without interruption, whereas concentric rings
may each end up with some empty space that is smaller than the required length for a sector. Also, large files
may be written onto (and read from) spiral tracks without jumping to the next track, which occurs when
concentric tracks are used. On the other hand, multiple-path operations such as write-and-verify or erase-and-
write, which require two paths each for a given sector, or still-frame video are more conveniently handled on
concentric-ring tracks.
Another track format used in practice is based on the sampled-servo concept. Here the tracks are identified
by occasional marks placed permanently on the substrate at regular intervals, as shown in Fig. 80.23. Details
of track following by the sampled-servo scheme will follow shortly; suffice it to say at this point that servo
marks help the system identify the position of the focused spot relative to the track center. Once the position
is determined it is fairly simple to steer the beam and adjust its position.
FIGURE 80.22 Micrographs of several types of optical storage media. The tracks are straight and narrow (track pitch = 1.6 µm), with an orientation angle of ≈ –45°. (A) Ablative, write-once tellurium alloy. (B) Ablative, write-once organic dye. (C) Amorphous-to-crystalline, write-once phase-change alloy GaSb. (D) Erasable, amorphous magneto-optic alloy GdTbFe. (E) Erasable, crystalline-to-amorphous phase-change tellurium alloy. (F) Read-only CD-audio, injection-molded from polycarbonate with a nickel stamper. (Source: Ullmann’s Encyclopedia of Industrial Chemistry, 5th ed., vol. A14, Weinheim: VCH, 1989, p. 196. With permission.)
Disk Rotation Speed
When a disk rotates at a constant angular velocity ω, a track of radius r moves with the constant linear velocity V = rω. Ideally, one would like to have the same linear velocity for all the tracks, but this is impractical except in a limited number of situations. For instance, when the desired mode of access to the various tracks is sequential, such as in audio and video disk applications, it is possible to place the head in the beginning at the inner radius and move outward from the center thereafter while continuously decreasing the angular velocity. By keeping the product of r and ω constant, one can thus achieve constant linear velocity for all the tracks. (In compact disk players the linear velocity is kept constant at 1.2 m/s. The starting position of the head is at the inner radius r_min = 25 mm, where the disk spins at 460 rpm. The spiral track ends at the outer radius r_max = 58 mm, where the disk’s angular velocity is 200 rpm.) Sequential access mode, however, is the exception
rather than the norm in data storage systems. In most appli-
cations, the tracks are accessed randomly with such rapidity
that it becomes impossible to adjust the rotation speed for
constant linear velocity. Under these circumstances, the
angular velocity is best kept constant during the normal
operation of the disk. Typical rotation speeds are 1200 and
1800 rpm for slower drives and 3600 rpm for the high data
rate systems. Higher rotation rates (5000 rpm and beyond)
are certainly feasible and will likely appear in future storage
devices.
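For the constant-linear-velocity case, the required spindle speed at each radius follows from V = rω. The short sketch below reproduces the compact disk numbers mentioned above (1.2 m/s between radii of 25 and 58 mm).

    import math

    def rpm_for_clv(linear_velocity, radius):
        """Spindle speed (rpm) needed to hold a constant linear velocity at a given radius."""
        omega = linear_velocity / radius          # rad/s, from V = r * omega
        return omega * 60.0 / (2.0 * math.pi)

    print(round(rpm_for_clv(1.2, 25e-3)))   # ~458 rpm at the inner radius
    print(round(rpm_for_clv(1.2, 58e-3)))   # ~198 rpm at the outer radius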
Access Time
The direct-access storage device or DASD, used in computer
systems for the mass storage of digital information, is a disk
drive capable of storing large quantities of data and accessing
blocks of this data rapidly and in arbitrary order. In
read/write operations it is often necessary to move the head
to new locations in search of sectors containing specific data
items. Such relocations are usually time-consuming and can
become the factor that limits performance in certain appli-
cations. The access time t_a is defined as the average time spent in going from one randomly selected spot on the disk to another. t_a can be considered the sum of a seek time, t_s, which is the average time needed to acquire the target track, and a latency, t_l, which is the average time spent on the target track waiting for the desired sector. Thus,

t_a = t_s + t_l      (80.9)

The latency is half the revolution period of the disk, since a randomly selected sector is, on the average, halfway along the track from the point where the head initially lands. Thus for a disk rotating at 1200 rpm t_l = 25 ms, while at 3600 rpm t_l ≈ 8.3 ms. The seek time, on the other hand, is independent of the rotation speed, but is determined by the traveling distance of the head during an average seek, as well as by the mechanism of head actuation. It can be shown that the average length of travel in a random seek is one third of the full stroke. (In our notation the full stroke is r_max – r_min.) In magnetic disk drives where the head/actuator assembly is relatively light-weight (a typical Winchester head weighs about 5 grams) the acceleration and deceleration periods are
short, and seek times are typically around 10 ms in small drives (i.e., 5.25 and 3.5 in.). In optical disk systems, on the other hand, the head, being an assembly of discrete elements, is fairly large and heavy (typical weight ≈100 grams), resulting in values of t_s that are several times greater than those obtained in magnetic recording
systems. The seek times reported for commercially available optical drives presently range from 20 ms in high-
performance 3.5-in. drives to about 80 ms in larger drives. We emphasize, however, that the optical disk tech-
nology is still in its infancy; with the passage of time, the integration and miniaturization of the elements within
the optical head will surely produce lightweight devices capable of achieving seek times of the order of a few
milliseconds.
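Equation (80.9) is easy to exercise numerically. The sketch below combines the half-revolution latency with a few representative average seek times; the seek values are the rough figures quoted in this section, not specifications of any particular drive.

    def access_time(rpm, avg_seek_s):
        """Average access time (s): average seek plus half a revolution of latency, Eq. (80.9)."""
        latency = 0.5 * 60.0 / rpm
        return avg_seek_s + latency

    # magnetic drive: ~10 ms seek at 3600 rpm -> about 18 ms
    print("%.1f ms" % (1e3 * access_time(3600, 10e-3)))
    # optical drive: 20-80 ms seek at 1800 rpm -> roughly 37-97 ms
    print("%.1f ms" % (1e3 * access_time(1800, 20e-3)))
    print("%.1f ms" % (1e3 * access_time(1800, 80e-3)))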
The Optical Path
The optical path begins at the light source which, in practically
all laser disk systems in use today, is a semiconductor GaAs
diode laser. Several unique features have made the laser diode
indispensable in optical recording technology, not only for the
readout of stored information but also for writing and erasure.
The small size of this laser has made possible the construction
of compact head assemblies, its coherence properties have
enabled diffraction-limited focusing to extremely small spots,
and its direct modulation capability has eliminated the need
for external modulators. The laser beam is modulated by con-
trolling the injection current; one applies pulses of variable
duration to turn the laser on and off during the recording
process. The pulse duration can be as short as a few nanosec-
onds, with rise and fall times typically less than 1 ns. Although
readout can be accomplished at constant power level, i.e., in
CW mode, it is customary for noise reduction purposes to
modulate the laser at a high frequency (e.g., several hundred
megahertz during readout).
Collimation and Beam Shaping
Since the cross-sectional dimensions of the active region in a laser diode are only about one micrometer, diffraction effects cause the emerging beam to diverge rapidly. This phenomenon is
depicted schematically in Fig. 80.24(a). In practical applica-
tions of the laser diode, the expansion of the emerging beam
is arrested by a collimating lens, such as that shown in
Fig. 80.24(b). If the beam happens to have aberrations (astig-
matism is particularly severe in diode lasers), then the colli-
mating lens must be designed to correct this defect as well.
In optical recording it is most desirable to have a beam with
circular cross section. The need for shaping the beam arises
from the special geometry of the laser cavity with its rectan-
gular cross section. Since the emerging beam has different
dimensions in the directions parallel and perpendicular to the
junction, its cross section at the collimator becomes elliptical,
with the initially narrow dimension expanding more rapidly
to become the major axis of the ellipse. The collimating lens
thus produces a beam with elliptical cross section. Circular-
ization may be achieved by bending various rays of the beam
at a prism, as shown in Fig. 80.24(c). The bending changes the
beam’s diameter in the plane of incidence but leaves the diam-
eter in the perpendicular direction intact.
FIGURE 80.24 (a) Away from the facet, the output beam of a diode laser diverges rapidly. In general, the beam diameter along X is different from that along Y, which makes the cross section of the beam elliptical. Also, the radii of curvature R_x and R_y are not the same, thus creating a certain amount of astigmatism in the beam. (b) Multi-element collimator lens for laser diode applications. Aside from collimating, this lens also corrects astigmatic aberrations of the beam. (c) Beam shaping by deflection at a prism surface. θ₁ and θ₂ are related by Snell’s law, and the ratio d₂/d₁ is the same as cos θ₂/cos θ₁. Passage through the prism circularizes the elliptical cross section of the beam.
Focusing by the Objective Lens
The collimated and circularized beam of the diode laser
is focused on the surface of the disk using an objective
lens. The objective is designed to be aberration-free, so
that its focused spot size is limited only by the effects of
diffraction. Figure 80.25(a) shows the design of a typical
objective made from spherical optics. According to the
classical theory of diffraction, the diameter of the beam,
d, at the objective’s focal plane is given by

d ≈ λ / NA      (80.10)

where λ is the wavelength of light, and NA is the numerical aperture of the objective. (Numerical aperture is defined as NA = n sin θ, where n is the refractive index of the image space, and θ is the half-angle subtended by the exit pupil at the focal point. In optical recording systems the image space is air, whose index is very nearly unity; thus for all practical purposes NA = sin θ.)
In optical recording it is desired to achieve the smallest
possible spot, since the size of the spot is directly related
to the size of marks recorded on the medium. Also, in
readout, the spot size determines the resolution of the
system. According to Eq. (80.10) there are two ways to
achieve a small spot: first by reducing the wavelength and,
second, by increasing the numerical aperture of the objec-
tive. The wavelengths currently available from GaAs lasers
are in the range of 670–840 nm. It is possible to use a
nonlinear optical device to double the frequency of these
diode lasers, thus achieving blue light. Good efficiencies
have been demonstrated by frequency doubling. Also
recent developments in II–VI materials have improved the
prospects for obtaining green and blue light directly from
semiconductor lasers. Consequently, there is hope that in
the near future optical storage systems will operate in the
wavelength range of 400–500 nm. As for the numerical
aperture, current practice is to use a lens with NA ≈ 0.5–0.6. Although this value might increase slightly in
the coming years, much higher numerical apertures are
unlikely, since they put strict constraints on the other
characteristics of the system and limit the tolerances. For
instance, the working distance at high numerical aperture
is relatively short, making access to the recording layer
through the substrate more difficult. The smaller depth of
focus of a high numerical aperture lens will make attain-
ing/maintaining proper focus more of a problem, while
the limited field of view might restrict automatic track-following procedures. A small field of view also places
constraints on the possibility of read/write/erase operations involving multiple beams.
FIGURE 80.25 (a) Multi-element lens design for a high numerical aperture video disk objective. (Source: D. Kuntz, “Specifying laser diode optics,” Laser Focus, March 1984. With permission.) (b) Various parameters of the objective lens. The numerical aperture is NA = sin θ. The spot diameter d and the depth of focus δ are given by Eqs. (80.10) and (80.11), respectively. (c) Focusing through the substrate can cause spherical aberration at the active layer. The problem can be corrected if the substrate is taken into account while designing the objective.
The depth of focus of a lens, δ, is the distance away from the focal plane over which tight focus can be maintained [see Fig. 80.25(b)]. According to the classical diffraction theory

δ ≈ λ / NA²      (80.11)

Thus for a wavelength of λ = 700 nm and NA = 0.6, the depth of focus is about ±1 µm. As the disk spins
under the optical head at the rate of several thousand rpm, the objective lens must stay within a distance of
f ± δ from the active layer if proper focus is to be maintained. Given the conditions under which drives usually
operate, it is impossible to make rigid enough mechanical systems to yield the required positioning tolerances.
On the other hand, it is fairly simple to mount the objective lens in an actuator capable of adjusting its position
with the aid of closed-loop feedback control. We shall discuss the technique of automatic focusing in the next
section. For now, let us emphasize that by going to shorter wavelengths and/or larger numerical apertures (as
is required for attaining higher data densities) one will have to face a much stricter regime as far as automatic
focusing is concerned. Increasing the numerical aperture is particularly worrisome, since δ drops with the square of NA.
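Equations (80.10) and (80.11) give a feel for how quickly the focusing tolerances tighten. The sketch below evaluates the spot diameter and the depth of focus for a 780-nm laser over the range of numerical apertures mentioned above.

    wavelength = 780e-9   # m, typical GaAs laser diode

    for NA in (0.5, 0.55, 0.6):
        spot = wavelength / NA            # Eq. (80.10)
        depth = wavelength / NA**2        # Eq. (80.11)
        print("NA=%.2f  spot = %.2f um  depth of focus = %.2f um" % (NA, spot * 1e6, depth * 1e6))
    # Raising NA shrinks the spot linearly but shrinks the depth of focus quadratically,
    # which is why high-NA objectives demand tighter automatic focusing.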
A source of spherical aberrations in optical disk systems is the substrate through which the light must travel
to reach the active layer of the disk. Figure 80.25(c) shows the bending of the rays at the disk surface that causes
the aberration. This problem can be solved by taking into account the effects of the substrate in the design of
the objective, so that the lens is corrected for all aberrations including those arising at the substrate. Recent
developments in molding of aspheric glass lenses have gone a long way in simplifying the lens design problem.
Figure 80.26 shows a pair of molded glass aspherics designed for optical disk system applications; both the
collimator and the objective are single-element lenses and are corrected for aberrations.
FIGURE 80.26 Molded glass aspheric lens pair for optical disk applications. These singlets can replace the multi-element spherical lenses shown in Figs. 80.24(b) and 80.25(a).
Automatic Focusing
We mentioned in the preceding section that since the objective has a large numerical aperture (NA ≥ 0.5), its depth of focus δ is rather shallow (δ ≈ ±1 µm at λ = 780 nm). During all read/write/erase operations, therefore, the disk must remain within a fraction of a micrometer from the focal plane of the objective. In practice, however, the disks are not flat and they are not always mounted rigidly parallel to the focal plane, so that movements away from focus occur a few times during each revolution. The peak-to-peak movement in and out of focus may be as much as 100 µm. Without automatic focusing of the objective along the optical axis,
this runout (or disk flutter) will be detrimental to the operation of the system. In practice, the objective is
mounted on a small motor (usually a voice coil) and allowed to move back and forth in order to keep its
distance within an acceptable range from the disk. The spindle turns at a few thousand rpm, which is a hundred
or so revolutions per second. If the disk moves in and out of focus a few times during each revolution, then
the voice coil must be fast enough to follow these movements in real time; in other words, its frequency response
must extend to several kilohertz.
The signal that controls the voice coil is obtained from the light reflected from the disk. There are several
techniques for deriving the focus error signal, one of which is depicted in Fig. 80.27(a). In this so-called
obscuration method a secondary lens is placed in the path of the reflected light, one-half of its aperture is
covered, and a split detector is placed at its focal plane. When the disk is in focus, the returning beam is
collimated and the secondary lens will focus the beam at the center of the split detector, giving a difference
signal ΔS equal to zero. If the disk now moves away from the objective, the returning beam will become converging, as in Fig. 80.27(b), sending all the light to detector #1. In this case ΔS will be positive and the voice coil will push the lens towards the disk. On the other hand, when the disk moves close to the objective, the returning beam becomes diverging and detector #2 receives the light [see Fig. 80.27(c)]. This results in a negative ΔS that forces the voice coil to pull back in order to return ΔS to zero. A given focus error detection scheme is generally characterized by the shape of its focus error signal ΔS versus the amount of defocus Δz; one such curve is shown in Fig. 80.27(d). The slope of the focus error signal (FES) curve near the origin is of particular importance, since it determines the overall performance and stability of the servo loop.
FIGURE 80.27 Focus error detection by the obscuration method. In (a) the disk is in focus, and the two halves of the split detector receive equal amounts of light. When the disk is too far from the objective (b) or too close to it (c), the balance of detector signals shifts to one side or the other. A plot of the focus error signal (FES) versus defocus is shown in (d), and its slope near the origin is identified as the FES gain, G.
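A minimal numerical version of the obscuration-method error signal: the two split-detector outputs are combined into a difference that is zero in focus and changes sign with the direction of defocus. The normalization by the total light and the simple proportional voice-coil drive are illustrative choices, not a description of a specific servo design.

    def focus_error_signal(s1, s2):
        """Normalized split-detector difference: positive when detector #1 gets most of
        the light (disk too far from the objective), negative when detector #2 does."""
        total = s1 + s2
        return 0.0 if total == 0 else (s1 - s2) / total

    def voice_coil_drive(s1, s2, gain=1.0):
        # A simple proportional law: push the lens toward the disk for positive FES,
        # pull it back for negative FES. Real servos add filtering and loop compensation.
        return gain * focus_error_signal(s1, s2)

    print(focus_error_signal(0.5, 0.5))   # 0.0  -> in focus
    print(focus_error_signal(0.9, 0.1))   # +0.8 -> disk too far; drive the lens toward the disk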
Automatic Tracking
Consider a track at a certain radial location, say r₀, and imagine viewing this track through the access window shown in Fig. 80.21. It is through this window that the head gains access to arbitrarily selected tracks. To a
shown in Fig. 80.21. It is through this window that the head gains access to arbitrarily selected tracks. To a
viewer looking through the window, a perfectly circular track centered on the spindle axis will look stationary,
irrespective of the rotation rate. However, any eccentricity will cause an apparent radial motion of the track.
The peak-to-peak distance traveled by a track (as seen through the window) depends on a number of factors
including centering accuracy of the hub, deformability of the substrate, mechanical vibrations, manufacturing
tolerances, etc. For a typical 3.5-in. disk, for example, this peak-to-peak motion can be as much as 100 µm
during one revolution. Assuming a revolution rate of 3600 rpm, the apparent velocity of the track in the radial
direction will be several millimeters per second. Now, if the focused spot remains stationary while trying to read
from or write to this track, it is clear that the beam will miss the track for a good fraction of every revolution cycle.
Practical solutions to the above problem are provided by automatic tracking techniques. Here the objective
is placed in a fine actuator, typically a voice coil, which is capable of moving the necessary radial distances and
maintaining a lock on the desired track. The signal that
controls the movement of this actuator is derived from
the reflected light itself, which carries information about
the position of the focused spot. There exist several
mechanisms for extracting the track error signal (TES);
all these methods require some sort of structure on the
disk surface in order to identify the track. In the case of
read-only disks (CD, CD-ROM, and video disk), the
embossed pattern of data provides ample information
for tracking purposes. In the case of write-once and
erasable disks, tracking guides are “carved” on the sub-
strate in the manufacturing process. As mentioned ear-
lier, the two major formats for these tracking guides are pregrooves (for continuous tracking) and sampled-
servo marks (for discrete tracking). A combination of the two schemes, known as continuous/composite format,
is often used in practice. This scheme is depicted in Fig. 80.28 which shows a small section containing five tracks,
each consisting of the tail end of a groove, synchronization marks, a mirror area used for adjusting focus/track
offsets, a pair of wobble marks for sampled tracking, and header information for sector identification.
FIGURE 80.28 Servo fields in continuous/composite format contain a mirror area and offset marks for tracking. (Field labels: Groove, Synch Mark, Mirror, Wobble Marks, Header.)
Tracking on Grooved Regions
As shown in Fig. 80.23(a), grooves are continuous depressions that are embossed, etched, or molded
onto the substrate prior to deposition of the storage medium. If the data is recorded on the grooves, then the
lands are not used except for providing a guard band between neighboring grooves. Conversely, the land regions
may be used to record the information, in which case grooves provide the guard band. Typical track widths
are about one wavelength. The guard bands are somewhat narrower than the tracks, their exact shape and
dimensions depending on the beam size, required track-servo accuracy, and the acceptable levels of cross-talk
between adjacent tracks. The groove depth is usually around one-eighth of one wavelength (λ/8), since this
depth can be shown to give the largest TES in the push-pull method. Cross sections of the grooves may be
rectangular, trapezoidal, triangular, etc.
When the focused spot is centered on track, it is diffracted symmetrically from the two edges of the track,
resulting in a balanced far field pattern. As soon as the spot moves away from the center, the symmetry breaks
down and the light distribution in the far field tends to shift to one side or the other. A split photodetector
placed in the path of the reflected light can therefore sense the relative position of the spot and provide the
appropriate feedback signal. This strategy is depicted schematically in Fig. 80.29; also shown in the figure are
intensity plots at the detector plane for light reflected from various regions of the disk. Note how the intensity
shifts to one side or the other depending on the direction of motion of the spot.
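The resulting track error signal amounts to a simple difference between the two detector halves. In the Python sketch below, the normalization by A + B is a common refinement assumed here rather than something stated in the text.

def push_pull_tes(a: float, b: float) -> float:
    """Push-pull track error signal from the two halves A and B of a split
    detector placed in the path of the reflected light: zero when the spot
    is centered on track, nonzero with a sign that indicates the direction
    of the off-track error."""
    total = a + b
    return (a - b) / total if total > 0.0 else 0.0

# push_pull_tes(0.5, 0.5)  ->  0.0        (spot on the track center)
# push_pull_tes(0.7, 0.3)  ->  about 0.4  (spot displaced toward one groove edge)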
Sampled Tracking
Since dynamic track runout is usually a slow and gradual process, there is actually no need for continuous
tracking as done on grooved media. A pair of embedded marks, offset from the track center as in Fig. 80.23(b),
can provide the necessary information for correcting the relative position of the focused spot. The reflected
intensity will indicate the positions of the two servo marks as two successive short pulses. If the beam happens
to be on track, the two pulses will have equal magnitudes and there will be no need for correction. If, on the
other hand, the beam is off-track, one of the pulses will be stronger than the other. Depending on which pulse
is the stronger, the system will recognize the direction in which it has to move and will correct the error
accordingly. The servo marks must appear frequently enough along the track to ensure proper track following.
In a typical application, the track might be divided into groups of 18 bytes, with 2 bytes dedicated as servo offset
areas and 16 bytes filled with other format information or left blank for user data.
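A minimal sketch of the sampled-servo decision is given below, assuming hypothetical pulse-amplitude inputs and an arbitrary dead band; the returned value is only a direction command, and its mapping to an actual actuator motion is left unspecified.

SECTOR_GROUP_BYTES = 18   # typical layout quoted above:
SERVO_OFFSET_BYTES = 2    #   2 bytes of servo-offset area per group ...
USER_DATA_BYTES = 16      #   ... and 16 bytes of data or other format information

def sampled_tes(pulse1: float, pulse2: float, deadband: float = 0.02) -> int:
    """Compare the amplitudes of the two wobble-mark pulses: equal pulses mean
    the beam is on track (return 0); otherwise the sign of the result tells
    which pulse is the stronger and hence the required correction direction.
    The dead-band fraction is an illustrative assumption."""
    reference = max(pulse1, pulse2, 1e-12)
    difference = pulse1 - pulse2
    if abs(difference) <= deadband * reference:
        return 0
    return 1 if difference > 0 else -1

# sampled_tes(1.00, 1.00)  ->  0   (on track, no correction needed)
# sampled_tes(1.20, 0.80)  ->  1   (off track; the sign gives the correction direction)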
Thermomagnetic Recording Process
Recording and erasure of information on a magneto-optical disk are both achieved by the thermomagnetic
process. The essence of thermomagnetic recording is shown in Fig. 80.30. At the ambient temperature the film
FIGURE 80.28 Servo fields in the continuous/composite format contain a mirror area and offset marks for tracking.
(The labeled elements in the figure are the groove, the synch mark, the mirror area, the wobble marks, and the header.)
has a high magnetic coercivity¹ and therefore does not respond to the externally applied field. When a focused
beam raises the local temperature of the film, the hot spot becomes magnetically soft (i.e., its coercivity drops).
As the temperature rises, coercivity drops continuously until such time as the field of the electromagnet finally
overcomes the material’s resistance to reversal and switches its magnetization. Turning the laser off brings the
temperatures back to normal, but the reverse-magnetized domain remains frozen in the film. In a typical
situation in practice, the film thickness may be around 300 Å, laser power at the disk ≈10 mW, diameter of
the focused spot ≈1 μm, laser pulse duration ≈50 ns, linear velocity of the track ≈10 m/s, and the magnetic
field strength ≈200 gauss. The temperature may reach a peak of 500 K at the center of the spot, which is
sufficient for magnetization reversal, but is not nearly high enough to melt or crystallize or in any other way
modify the material’s structure.
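The switching condition can be caricatured with a toy coercivity-versus-temperature model, sketched below in Python. The linear functional form and the ambient-coercivity and critical-temperature values are illustrative assumptions, chosen only to be compatible with the typical numbers quoted above; they are not material data.

def coercivity_oe(temp_k: float,
                  hc_ambient_oe: float = 10000.0,   # assumed ambient coercivity
                  t_ambient_k: float = 300.0,
                  t_critical_k: float = 500.0) -> float:
    """Toy model: coercivity falls roughly linearly from its (high) ambient
    value to zero at a critical temperature near the 500 K peak quoted above."""
    if temp_k >= t_critical_k:
        return 0.0
    fraction = (t_critical_k - temp_k) / (t_critical_k - t_ambient_k)
    return hc_ambient_oe * min(1.0, max(0.0, fraction))

def magnetization_switches(temp_k: float, applied_field_oe: float = 200.0) -> bool:
    """The heated spot reverses its magnetization once the external field
    (about 200 gauss, i.e., roughly 200 Oe) exceeds the local,
    temperature-reduced coercivity."""
    return applied_field_oe > coercivity_oe(temp_k)

# magnetization_switches(300.0)  ->  False  (ambient film: coercivity too high)
# magnetization_switches(500.0)  ->  True   (center of the focused spot)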
The materials of magneto-optical recording have strong perpendicular magnetic anisotropy. This type of
anisotropy favors the “up” and “down” directions of magnetization over all other orientations. The disk is
initialized in one of these two directions, say up, and the recording takes place when small regions are selectively
reverse-magnetized by the thermomagnetic process. The resulting magnetization distribution then represents
the pattern of recorded information. For instance, binary sequences may be represented by a mapping of zeros
to up-magnetized regions and ones to down-magnetized regions (non-return to zero or NRZ). Alternatively,
the NRZI scheme might be used, whereby transitions (up-to-down and down-to-up) are used to represent the
ones in the bit-sequence.
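The two mappings can be stated compactly in code. The short Python sketch below uses +1 and -1 for up- and down-magnetized regions, a convention chosen here purely for illustration.

def nrz_encode(bits):
    """NRZ: map each 0 to an up-magnetized region (+1) and each 1 to a
    down-magnetized region (-1), as in the mapping described above."""
    return [+1 if bit == 0 else -1 for bit in bits]

def nrzi_encode(bits, initial_state=+1):
    """NRZI: every 1 in the bit sequence produces a transition (up-to-down or
    down-to-up); every 0 leaves the magnetization unchanged."""
    states, state = [], initial_state
    for bit in bits:
        if bit == 1:
            state = -state
        states.append(state)
    return states

# nrz_encode([0, 1, 1, 0, 1])   ->  [+1, -1, -1, +1, -1]
# nrzi_encode([0, 1, 1, 0, 1])  ->  [+1, -1, +1, +1, -1]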
¹Coercivity of a magnetic medium is a measure of its resistance to magnetization reversal. For example, consider a thin
film with perpendicular magnetic moment saturated in the +Z direction. A magnetic field applied along –Z will succeed in
reversing the direction of magnetization only if the field is stronger than the coercivity of the film.
FIGURE 80.29 (a) Push-pull sensor for tracking on grooves. (b) Calculated distribution of light intensity at the detector
plane when the disk is in focus and the beam is centered on track. (c) Calculated intensity distribution at the detector plane
with disk in focus but the beam centered on the groove edge. (d) Same as (c) except for the spot being focused on the
opposite edge of the groove. (The two halves of the split detector are labeled A and B; the net tracking signal is A − B.)
Recording by Laser Power Modulation (LPM)
In this traditional approach to thermomagnetic recording, the electromagnet produces a constant field, while
the information signal is used to modulate the power of the laser beam. As the disk rotates under the focused
spot, the on/off laser pulses create a sequence of up/down domains along the track. The Lorentz electron
micrograph in Fig. 80.30(b) shows a number of domains recorded by LPM. The domains are highly stable and
may be read over and over again without significant degradation. If, however, the user decides to discard a
recorded block and to use the space for new data, the LPM scheme does not allow direct overwrite; the system
must erase the old data during one disk revolution cycle and record the new data in a subsequent revolution cycle.
During erasure, the direction of the external field is reversed, so that up-magnetized domains in Fig. 80.30(a)
now become the favored ones. Whereas writing is achieved with a modulated laser beam, in erasure the laser
stays on for a relatively long period of time, erasing an entire sector. Selective erasure of individual domains is
not practical, nor is it desired, since mass data storage systems generally deal with data at the level of blocks,
which are recorded onto and read from individual sectors. Note that at least one revolution period elapses
between the erasure of an old block and its replacement by a new block. The electromagnet therefore need not
be capable of rapid switchings. (When the disk rotates at 3600 rpm, for example, there is a period of 16 ms or
so between successive switchings.) This kind of slow reversal allows the magnet to be large enough to cover all
the tracks simultaneously, thereby eliminating the need for a moving magnet and an actuator. It also affords a
relatively large gap between the disk and the magnet, which enables the use of double-sided disks and relaxes
the mechanical tolerances of the system without overburdening the magnet’s driver.
The obvious disadvantage of LPM is its lack of direct overwrite capability. A more subtle concern is that it
is perhaps unsuitable for the PWM (pulse width modulation) scheme of representing binary waveforms. Due
to fluctuations in the laser power, spatial variations of material properties, lack of perfect focusing and track
following, etc., the length of a recorded domain along the track may fluctuate in small but unpredictable ways.
If the information is to be encoded in the distance between adjacent domain walls (i.e., PWM), then the LPM
scheme of thermomagnetic writing may suffer from excessive domain-wall jitter. Laser power modulation works
well, however, when the information is encoded in the position of domain centers (i.e., pulse position modu-
lation or PPM). In general, PWM is superior to PPM in terms of the recording density, and, therefore, recording
techniques that allow PWM are preferred.
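The distinction can be made concrete with a toy example: if each recorded mark is described by the positions of its two walls, then symmetric length fluctuations of the kind LPM is prone to leave the mark centers (and hence PPM data) essentially untouched, while shifting the wall positions on which PWM relies. The positions in the Python sketch below are made up purely for illustration.

def ppm_symbols(marks):
    """PPM: the information lives in the positions of the domain centers."""
    return [(leading + trailing) / 2.0 for leading, trailing in marks]

def pwm_symbols(marks):
    """PWM: the information lives in the positions of the domain walls."""
    return [edge for mark in marks for edge in mark]

# Nominal marks and the same marks with a symmetric 0.05-µm length error
# (wall positions along the track, in micrometers; illustrative numbers only):
nominal  = [(1.00, 1.50), (3.00, 3.50)]
jittered = [(0.95, 1.55), (2.95, 3.55)]

# ppm_symbols(nominal) and ppm_symbols(jittered) both give about [1.25, 3.25],
# so PPM data survive, whereas pwm_symbols() returns shifted wall positions,
# so PWM data would be corrupted.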
Recording by Magnetic Field Modulation
Another method of thermomagnetic recording is based on magnetic field modulation (MFM) and is depicted
schematically in Fig. 80.31(a). Here the laser power may be kept constant while the information signal is used
to modulate the magnetic field. Photomicrographs of typical domain patterns recorded in the MFM scheme
are shown in Fig. 80.31(b). Crescent-shaped domains are the hallmark of the field modulation technique. If
one assumes (using a much simplified model) that the magnetization aligns itself with the applied field within
FIGURE 80.30 Thermomagnetic recording process. (a) The field of the electromagnet helps reverse the direction of
magnetization in the area heated by the focused laser beam. (b) Lorentz micrograph of domains written thermomagnetically.
The various tracks shown here were written at different laser powers, with power level decreasing from top to bottom.
(Source: F. Greidanus et al., Paper 26B-5, presented at the International Symposium on Optical Memory, Kobe, Japan,
September 1989. With permission.)
a region whose temperature has passed a certain critical value, T_crit, then one can explain the crescent shape
of these domains in the following way: With the laser operating in the CW mode and the disk moving at
constant velocity, temperature distribution in the magnetic medium assumes a steady-state profile, such as that
shown in Fig. 80.31(c). Of course, relative to the laser beam, the temperature profile is stationary, but in the
frame of reference of the disk the profile moves along the track with the linear track velocity. The isotherm
corresponding to T_crit is identified as such in the figure; within this isotherm the magnetization aligns itself
with the applied field. Figure 80.31(d) shows a succession of critical isotherms along the track, each obtained
at the particular instant of time when the magnetic field switches direction. From this picture it is easy to infer
how the crescent-shaped domains form and also understand the relation between the waveform that controls
the magnet and the resulting domain pattern.
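Under the same simplified model, the wall positions along the track are set almost entirely by the field-switching instants. The Python sketch below assumes a constant track velocity and a fixed along-track offset for the critical isotherm, both of which are illustrative simplifications.

def wall_positions_m(switch_times_s, track_velocity_m_s=10.0, isotherm_offset_m=0.0):
    """Along-track positions of the domain walls created at the field-switching
    instants t_n: to first order, each wall is frozen where the T_crit isotherm
    sits at that instant, i.e., at x_n = v * t_n + constant offset."""
    return [track_velocity_m_s * t + isotherm_offset_m for t in switch_times_s]

# With the typical 10 m/s track velocity quoted earlier, switching the field
# every 100 ns spaces the walls roughly 1 µm apart, governed by timing rather
# than by laser power or focus:
# wall_positions_m([0e-9, 100e-9, 200e-9])  ->  approximately [0.0, 1e-6, 2e-6]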
The advantages of magnetic field modulation recording are that (1) direct overwriting is possible and (2) domain-
wall positions along the track, being rather insensitive to defocus and laser power fluctuations, are fairly
accurately controlled by the timing of the magnetic field switchings. On the negative side, the magnet must
now be small and fly close to the disk surface, if it is to produce rapidly switched fields with a magnitude of a
hundred gauss or so. Systems that utilize magnetic field modulation often fly a small electromagnet on the
opposite side of the disk from the optical stylus. Since mechanical tolerances are tight, this might compromise
the removability of the disk. Moreover, the requirement of close proximity between the magnet and the storage
medium dictates the use of single-sided disks in practice.
FIGURE 80.31 (a) Thermomagnetic recording by magnetic field modulation. The power of the beam is kept constant,
while the magnetic field direction is switched by the data signal. (b) Polarized-light microphotograph of recorded domains.
(c) Computed isotherms produced by a CW laser beam, focused on the magnetic layer of a disk. The disk moves with
constant velocity under the beam. The region inside the isotherm marked as T_crit is above the critical temperature for writing,
that is, its magnetization aligns with the direction of the applied field. (d) Magnetization within the heated region (above
T_crit) follows the direction of the applied field, whose switchings occur at times t_n. The resulting domains are crescent-shaped.
Magneto-Optical Readout
The information recorded on a perpendicularly magnetized medium may be read with the aid of the polar
magneto-optical Kerr effect. When linearly polarized light is normally incident on a perpendicular magnetic
medium, its plane of polarization undergoes a slight rotation upon reflection. This rotation of the plane of
polarization, whose sense depends on the direction of magnetization in the medium, is known as the polar
Kerr effect. The schematic representation of this phenomenon in Fig. 80.32 shows that if the polarization vector
suffers a counterclockwise rotation upon reflection from an up-magnetized region, then the same vector will
rotate clockwise when the magnetization is down. A magneto-optical medium is characterized in terms of its
reflectivity R and its Kerr rotation angle θ_k. R is a real number (between 0 and 1) that indicates the fraction
of the incident power reflected back from the medium at normal incidence. θ_k is generally quoted as a positive
number, but is understood to be positive or negative depending on the direction of magnetization; in MO
readout, it is the sign of θ_k that carries the information about the state of magnetization, i.e., the recorded bit
pattern.
The laser used for readout is usually the same as that used for recording, but its output power level is
substantially reduced in order to avoid erasing (or otherwise obliterating) the previously recorded information.
For instance, if the power of the write/erase beam is 20 mW, then for the read operation the beam is attenuated
to about 3 or 4 mW. The same objective lens that focuses the write beam is now used to focus the read beam,
creating a diffraction-limited spot for resolving the recorded marks. Whereas in writing the laser was pulsed
to selectively reverse-magnetize small regions along the track, in readout it operates with constant power, i.e.,
in CW mode. Both up- and down-magnetized regions are read as the track passes under the focused spot. The
reflected beam, which is now polarization-modulated, goes back through the objective and becomes collimated
once again; its information content is subsequently decoded by polarization-sensitive optics, and the scanned
pattern of magnetization is reproduced as an electronic signal.
Differential Detection
Figure 80.33 shows the differential detection system that is the basis of magneto-optical readout in practically
all erasable optical storage systems in use today. The beam splitter (BS) diverts half of the reflected beam away
from the laser and into the detection module.¹ The polarizing beam splitter (PBS) splits the beam into two
parts, each carrying the projection of the incident polarization along one axis of the PBS, as shown in
Fig. 80.33(b). The component of polarization along one of the axes goes straight through, while the component
¹The use of an ordinary beam splitter is an inefficient way of separating the incoming and outgoing beams, since half
the light is lost in each pass through the splitter. One can do much better by using a so-called “leaky” polarizing beam splitter.
FIGURE 80.32 Schematic diagram describing the polar magneto-optical Kerr effect. Upon reflection from the surface of
a perpendicularly magnetized medium, the polarization vector undergoes a rotation. The sense of rotation depends on the
direction of magnetization, M, and switches sign when M is reversed.
along the other axis splits off and branches to the side. The PBS is oriented such that in the absence of the Kerr
effect its two branches will receive equal amounts of light. In other words, if the polarization, upon reflection
from the disk, did not undergo any rotations whatsoever, then the beam entering the PBS would be polarized
at 45° to the PBS axes, in which case it would split equally between the two branches. Under this condition,
the two detectors generate identical signals and the differential signal ΔS will be zero. Now, if the beam returns
from the disk with its polarization rotated clockwise (rotation angle = θ_k), then detector #1 will receive more
light than detector #2, and the differential signal will be positive. Similarly, a counterclockwise rotation will
generate a negative ΔS. Thus, as the disk rotates under the focused spot, the electronic signal ΔS reproduces
the pattern of magnetization along the scanned track.
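Assuming ideal optics and simple Malus-law projections of the reflected polarization onto the PBS axes (a model consistent with, though not spelled out in, the description above, and one that ignores the light lost at the first beam splitter), the differential signal reduces to ΔS = RP sin 2θ_k. The Python sketch below evaluates it.

import math

def differential_signal_mw(read_power_mw: float, reflectivity: float,
                           kerr_angle_deg: float) -> float:
    """ΔS from differential detection: the reflected power R*P, polarized at
    45° ± θ_k to the PBS axes, splits into S1 = R*P*cos²(45° - θ_k) and
    S2 = R*P*cos²(45° + θ_k), so ΔS = S1 - S2 = R*P*sin(2θ_k)."""
    theta = math.radians(kerr_angle_deg)
    s1 = reflectivity * read_power_mw * math.cos(math.radians(45.0) - theta) ** 2
    s2 = reflectivity * read_power_mw * math.cos(math.radians(45.0) + theta) ** 2
    return s1 - s2

# A 3-mW read beam, 50% reflectivity, and a 0.5° Kerr rotation give
# ΔS of about ±0.026 mW, the sign tracking the direction of magnetization:
# differential_signal_mw(3.0, 0.5, +0.5), differential_signal_mw(3.0, 0.5, -0.5)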
Materials of Magneto-Optical Data Storage
Amorphous rare earth transition metal alloys are presently the media of choice for erasable optical data storage
applications. The general formula for the composition of the alloy may be written (Tb_y Gd_{1−y})_x (Fe_z Co_{1−z})_{1−x},
where terbium and gadolinium are the rare earth (RE) elements, while iron and cobalt are the transition metals
FIGURE 80.33 Differential detection scheme utilizes a polarizing beam splitter and two photodetectors in order to convert
the rotation of polarization to an electronic signal. E∥ and E⊥ are the reflected components of polarization; they are,
respectively, parallel and perpendicular to the direction of incident polarization. The diagram in (b) shows the orientation
of the PBS axes relative to the polarization vectors.
(TM). In practice, the transition metals constitute roughly 80 atomic percent of the alloy (i.e., x ≈ 0.2). In the
transition metal subnetwork, the fraction of cobalt is usually small, typically around 10%, and iron is the
dominant element (z ≈ 0.9). Similarly, in the rare earth subnetwork Tb is the main element (y ≈ 0.9) while
the gadolinium content is small or it may even be absent in some cases. Since the rare earth elements are highly
reactive to oxygen, RE-TM films tend to have poor corrosion resistance and, therefore, require protective
coatings. In multilayer disk structures, the dielectric layers that enable optimization of the medium for the best
optical/thermal behavior also perform the crucial function of protecting the MO layer from the environment.
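As a quick check of the numbers quoted above, the atomic fraction of each element follows directly from x, y, and z; the short Python sketch below evaluates them for the typical composition (the function name and the dictionary form are, of course, only illustrative).

def retm_atomic_fractions(x: float, y: float, z: float) -> dict:
    """Atomic fractions of the four elements in (Tb_y Gd_{1-y})_x (Fe_z Co_{1-z})_{1-x}."""
    return {
        "Tb": x * y,
        "Gd": x * (1.0 - y),
        "Fe": (1.0 - x) * z,
        "Co": (1.0 - x) * (1.0 - z),
    }

# With the typical values x ≈ 0.2, y ≈ 0.9, z ≈ 0.9 quoted above:
# retm_atomic_fractions(0.2, 0.9, 0.9)
#   ->  Tb ≈ 0.18, Gd ≈ 0.02, Fe ≈ 0.72, Co ≈ 0.08 (up to rounding),
# i.e., about 80 at.% transition metal, with Co roughly 10% of the TM subnetwork.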
The amorphous nature of the material allows its composition to be continuously varied until a number of
desirable properties are achieved. In other words, the fractions x, y, z of the various elements are not constrained
by the rules of stoichiometry. Disks with very large areas can be coated uniformly with thin films of these
media, and, in contrast to polycrystalline films whose grains and grain boundaries scatter the beam and cause
noise, amorphous films are continuous, smooth, and substantially free from noise. The films are deposited
either by sputtering from an alloy target or by co-sputtering from multiple elemental targets. In the latter case,
the substrate moves under the various targets and the fraction of a given element in the alloy is determined by
the time spent under each target as well as the power applied to that target. During film deposition the substrate
is kept at a low temperature (usually by chilled water) in order to reduce the mobility of deposited atoms and
thus inhibit crystal growth. The type of the sputtering gas (argon, krypton, xenon, etc.) and its pressure during
sputtering, the bias voltage applied to the substrate, deposition rate, nature of the substrate and its pretreatment,
and temperature of the substrate all can have dramatic effects on the composition and short-range order of
the deposited film. A comprehensive discussion of the factors that influence film properties will take us beyond
the intended scope here; the interested reader may consult the vast literature of this field for further information.
Defining Terms
Automatic focusing: The process in which the distance of the disk from the objective’s focal plane is contin-
uously monitored and fed back to the system in order to keep the disk in focus at all times.
Automatic tracking: The process in which the distance of the focused spot from the track center is contin-
uously monitored and the information fed back to the system in order to maintain the read/write beam
on track at all times.
Compact disk (CD): A plastic substrate embossed with a pattern of pits that encode audio signals in digital
format. The disk is coated with a metallic layer (to enhance its reflectivity) and read in a drive (CD
player) that employs a focused laser beam and monitors fluctuations of the reflected intensity in order
to detect the pits.
Error correction coding (ECC): Systematic addition of redundant bits to a block of binary data, as insurance
against possible read/write errors. A given error-correcting code can recover the original data from a
contaminated block, provided that the number of erroneous bits is less than the maximum number
allowed by that particular code.
Grooved media of optical storage: A disk embossed with grooves of either the concentric-ring type or the
spiral type. If grooves are used as tracks, then the lands (i.e., regions between adjacent grooves) are the
guard bands. Alternatively, lands may be used as tracks, in which case the grooves act as guard bands.
In a typical grooved optical disk in use today the track width is 1.1 μm, the width of the guard band is
0.5 μm, and the groove depth is 70 nm.
Magneto-optical Kerr effect: The rotation of the plane of polarization of a linearly polarized beam of light
upon reflection from the surface of a perpendicularly magnetized medium.
Objective lens: A well-corrected lens of high numerical aperture, similar to a microscope objective, used to
focus the beam of light onto the surface of the storage medium. The objective also collects and recollimates
the light reflected from the medium.
Optical path: Optical elements in the path of the laser beam in an optical drive. The path begins at the laser
itself and contains a collimating lens, beam shaping optics, beam splitters, polarization-sensitive elements,
photodetectors, and an objective lens.
Preformat: Information such as sector address, synchronization marks, servo marks, etc., embossed perma-
nently on the optical disk substrate.
Sector: A small section of track with the capacity to store one block of user data (typical blocks are either
512 or 1024 bytes). The surface of the disk is covered with tracks, and tracks are divided into contiguous
sectors.
Thermomagnetic process: The process of recording and erasure in magneto-optical media, involving local
heating of the medium by a focused laser beam, followed by the formation or annihilation of a reverse-
magnetized domain. The successful completion of the process usually requires an external magnetic field
to assist the reversal of the magnetization.
Track: A narrow annulus or ring-like region on a disk surface, scanned by the read/write head during one
revolution of the spindle; the data bits of magnetic and optical disks are stored sequentially along these
tracks. The disk is covered either with concentric rings of densely packed circular tracks or with one
continuous, fine-pitched spiral track.
Related Topics
42.2 Optical Fibers and Cables ? 43.1 Introduction
References
A. B. Marchant, Optical Recording, Reading, Mass.: Addison-Wesley, 1990.
P. Hansen and H. Heitman, “Media for erasable magneto-optic recording,” IEEE Trans. Mag., vol. 25,
pp. 4390–4404, 1989.
M. H. Kryder, “Data-storage technologies for advanced computing,” Scientific American, vol. 257, pp. 116–125,
1987.
G. Bouwhuis, J. Braat, A. Huijser, J. Pasman, G. Van Rosmalen, and K. S. Immink, Principles of Optical Disk
Systems, Bristol: Adam Hilger Ltd., 1985, chap. 2 and 3.
Special issue of Applied Optics on video disks, July 1, 1978.
E. Wolf, “Electromagnetic diffraction in optical systems. I. An integral representation of the image field,” Proc.
R. Soc. Ser. A, vol. 253, pp. 349–357, 1959.
M. Mansuripur, “Certain computational aspects of vector diffraction problems,” J. Opt. Soc. Am. A, vol. 6,
pp. 786–806, 1989.
D. O. Smith, “Magneto-optical scattering from multilayer magnetic and dielectric films,” Opt. Acta, vol. 12,
p. 13, 1965.
P. S. Pershan, “Magneto-optic effects,” J. Appl. Phys., vol. 38, pp. 1482–1490, 1967.
K. Egashira and R. Yamada, “Kerr effect enhancement and improvement of readout characteristics in MnBi
film memory,” J. Appl. Phys., vol. 45, pp. 3643–3648, 1974.
H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids, London: Oxford University Press, 1954.
P. Kivits, R. deBont, and P. Zalm, “Superheating of thin films for optical recording,” Appl. Phys., vol. 24,
pp. 273–278, 1981.
M. Mansuripur, G. A. N. Connell, and J. W. Goodman, “Laser-induced local heating of multilayers,” Appl. Opt.,
vol. 21, p. 1106, 1982.
J. Heemskerk, “Noise in a video disk system: experiments with an (AlGa)As laser,” Appl. Opt., vol. 17, p. 2007,
1978.
A. Arimoto, M. Ojima, N. Chinone, A. Oishi, T. Gotoh, and N. Ohnuki, “Optimum conditions for the high
frequency noise reduction method in optical video disk players,” Appl. Opt., vol. 25, p. 1398, 1986.
M. Mansuripur, G. A. N. Connell, and J. W. Goodman, “Signal and noise in magneto-optical readout,” J. Appl.
Phys., vol. 53, p. 4485, 1982.
J. W. Beck, “Noise considerations of optical beam recording,” Appl. Opt., vol. 9, p. 2559, 1970.
S. Chikazumi and S. H. Charap, Physics of Magnetism, New York: John Wiley, 1964.
B. G. Huth, “Calculation of stable domain radii produced by thermomagnetic writing,” IBM J. Res. Dev.,
pp. 100–109, 1974.
A. P. Malozemoff and J. C. Slonczewski, Magnetic Domain Walls in Bubble Materials, New York: Academic Press,
1979.
A. M. Patel, “Signal and error-control coding,” in Magnetic Recording, vol. II, C. D. Mee and E. D. Daniel, Eds.
New York: McGraw-Hill, 1988.
K. A. S. Immink, “Coding methods for high-density optical recording,” Philips J. Res., vol. 41, pp. 410–430, 1986.
L. I. Maissel and R. Glang, Eds., Handbook of Thin Film Technology, New York: McGraw-Hill, 1970.
G. L. Weissler and R. W. Carlson, Eds., Vacuum Physics and Technology, vol. 14 of Methods of Experimental
Physics, New York: Academic Press, 1979.
T. Suzuki, “Magneto-optic recording materials,” Mater. Res. Soc. Bull., pp. 42-47, Sept. 1996.
K. G. Ashar, Magnetic Disk Drive Technology, New York: IEEE Press, 1997.
Further Information
Proceedings of the Optical Data Storage Conference are published annually by SPIE, the International Society
for Optical Engineering. These proceedings document the latest developments in the field of optical recording
each year. Two other conferences in this field are the International Symposium on Optical Memory (ISOM),
whose proceedings are published as a special issue of the Japanese Journal of Applied Physics, and the Magneto-
Optical Recording International Symposium (MORIS), whose proceedings appear in a special issue of the Journal
of the Magnetics Society of Japan.