Network File
System Concepts
Chuan-Ming Liu
Computer Science and Information Engineering
National Taipei University of Technology
Taipei, Taiwan
Outline
INTRODUCTION
REMOTE FILES
REVIEW OF THE UNIX FILE SYSTEM
NFS OVERVIEW
FILE HANDLE
STATELESS SERVER
MOUNT AND TRANSPORT PROTOCOLS
Introduction
This chapter describes the general concept
of remote file access and reviews the
concepts underlying a particular remote
access mechanism
Because the remote file access mechanism
derives many ideas and details from the
underlying OS, the chapter reviews the
Linux file system and the semantics of file
operations
Remote File Access vs. Transfer
File transfer services: invoked by users
to copy an entire file
File access services: invoked by
application programs to transfer one small
block of data at a time
Remote File Access vs. Transfer (cont.)
The server checks each request to verify
that the client is authorized to access the
specified file, performs the specified
operation, and returns a result to the client
The Network File System (NFS), defined by Sun
Microsystems, is a remote file access
mechanism that has been widely accepted
throughout the computer industry
Operations on Remote Files
NFS provides the same operations (open,
read, write, seek, and close) on remote
files that one expects to use on local files
The seek operation moves to a specified
position in the file
File Access Among
Heterogeneous Computers
A remote file access service must handle
differences in the way the client and server
systems name files, denote paths through
directories, and store information about files
The file access software must accommodate
differences in the semantic interpretation of
file operations
NFS was designed to accommodate
heterogeneous computer systems
NFS and Unix File Semantics
The NFS designers adopted UNIX file
system semantics when defining the
meaning of individual operations
Understanding the original UNIX file
system is essential to understanding NFS
because NFS uses the UNIX file system
terminology and semantics
Review of the UNIX File System
Basic definition
UNIX defines a file to consist of a sequence of
bytes
UNIX files can grow dynamically
UNIX numbers the bytes in a file starting at
zero
The size of a file is defined to be the number
of bytes
UNIX permits random access to any byte in a file
Review: A Byte Sequence
Without Record Boundaries
UNIX does not have notions of record
boundaries, record blocking, indexed files,
or typed files found in other systems
The point is that the file system itself does
not understand the file contents;
applications that use a file must agree on
the format
Review: A File’s Owner and
Group Identifier
Each file has a single owner, represented by
the numeric identifier of the user who created
the file
Ownership information is stored with the file
UNIX assigns each subset of users (a group)
a numeric group identifier
The system compares the owner and group
identifiers stored with a file to the user and
group identifiers of a particular application
process to determine what operations that
program can perform on the file
Review: Protection and Access
Fig. 1 shows that file access permissions
(called file mode, protection mode, or file
access mode) can be viewed as a matrix of
protection bits
Fig. 2 illustrates how UNIX encodes file
protection bits into the 9 low-order bits of a file
mode integer
UNIX defines additional bits of the mode
integer to specify other properties of the file
(e.g., mode bits specify whether the file is a
regular file or a directory)
Review: The Open-Read-Write-Close Paradigm
Applications that run on a system such as Linux
use the open-read-write-close paradigm
For example,
fdesc = open("filename", O_CREAT |
O_RDWR, 0644)
O_CREAT: specifies that the file should be
created if it does not already exist
O_RDWR: specifies that the file should be opened
for both reading and writing
Review: Data Transfer
An application uses read to transfer data
from a file into memory, and uses write to
transfer data from memory to a file
For example, n = read(fdesc, buf, 24)
Both read and write begin transfer at the
current file position, and both operations
update the file position when they finish
Review: Data Transfer (cont.)
If an application attempts to read more
bytes than the file contains, the read
function extracts as many bytes as the file
contains and returns the number read as its
result
If the file is positioned at the end of a file
when an application calls read, the read
call returns zero to indicate an end-of-file
condition
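The short-read and end-of-file behavior can be observed with a small C sketch. The function name short_read_demo and the idea of writing a 10-byte scratch file are assumptions made for illustration; open, write, lseek, read, close, and unlink are the standard POSIX calls.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write 10 bytes to a scratch file, then ask read for 24 bytes twice.
 * The two results are stored in *first and *second.
 * Returns 0 on success and -1 on error.
 * short_read_demo is a hypothetical name used only in this sketch. */
int short_read_demo(const char *path, ssize_t *first, ssize_t *second)
{
    char buf[24];
    int fdesc = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);

    if (fdesc < 0)
        return -1;

    /* Put 10 bytes in the file, then rewind to byte 0. */
    if (write(fdesc, "0123456789", 10) != 10 ||
        lseek(fdesc, 0, SEEK_SET) < 0) {
        close(fdesc);
        unlink(path);
        return -1;
    }

    *first = read(fdesc, buf, sizeof buf);   /* fewer bytes than requested */
    *second = read(fdesc, buf, sizeof buf);  /* position is now at end of file */

    close(fdesc);
    unlink(path);                            /* remove the scratch file */
    return 0;
}
```

The first call returns the number of bytes actually available, and the second call reports end of file.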
Review: Permission to Search a Directory
UNIX systems organize files into a
hierarchy using directories to hold files and
other directories
The directory permissions (the same 9
protection bits) only specify which
operations are allowed on the directory
itself
Review: Permission to Search a Directory (cont.)
UNIX interprets the execute permission
bits for directories to mean search
permission
If an application has search permission, it
can reference a file that lies in a directory
Search permission can be used to hide or
uncover an entire subtree of the file
directory without modifying the
permissions on individual files in the
subtree
Review: Random Access
After a file has been opened, the position
can be changed by calling the function lseek
lseek takes three arguments that specify a
file descriptor, an offset, and a measure for
the offset (the origin from which it is counted):
lseek(fdesc, 100L, SEEK_SET)
SEEK_SET (called L_SET on older BSD systems)
measures the offset from the start of the file
This call moves the current position of the file
with descriptor fdesc to byte number 100
Review: Seeking Beyond the End of File
If an application seeks beyond the end of
file and writes new data, the file system
extends the file
Fig. 3 illustrates the concept
Thus, the file size records the highest byte
position into which data has been written,
not the total number of bytes written
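A short C sketch makes the distinction concrete: it writes a single byte at offset 1000 of an empty file and then asks for the file size. The function name size_after_seek_and_write is hypothetical; fstat is the descriptor-based variant of stat and is used here only to read back the size.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file, seek to byte 1000, write one byte, and
 * return the resulting file size (or -1 on error).
 * size_after_seek_and_write is a hypothetical helper name. */
long size_after_seek_and_write(const char *path)
{
    struct stat st;
    long size = -1;
    int fdesc = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);

    if (fdesc < 0)
        return -1;

    /* Only one byte is written, but it lands at position 1000. */
    if (lseek(fdesc, 1000L, SEEK_SET) >= 0 &&
        write(fdesc, "x", 1) == 1 &&
        fstat(fdesc, &st) == 0)
        size = (long)st.st_size;

    close(fdesc);
    unlink(path);   /* remove the scratch file */
    return size;
}
```

Although only one byte was written, the reported size covers every position up to and including byte 1000.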
Review: File Position and
Concurrent Access
The UNIX file system permits multiple
application programs to access a file
concurrently
The descriptor for each open file
references a data structure that records a
current position in the file
Review: File Position and
Concurrent Access (cont.)
When a process calls fork to create a new
process, the child inherits copies of all file
descriptors the parent had opened at the
time of the fork
Each call to open generates a new
descriptor with a file position that is
independent of that obtained by previous
calls to open
Review: File Position and
Concurrent Access (cont.)
Separating the current file position from
the file itself permits multiple applications
to access a file concurrently without
interference
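The independence of positions obtained from separate calls to open can be checked with the C sketch below. The function name independent_positions and the 8-byte scratch file are assumptions for illustration; lseek with offset 0 and SEEK_CUR is a standard way to query the current position without moving it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open the same file twice, read 4 bytes through the first descriptor,
 * and report the current position of each descriptor.
 * Returns 0 on success and -1 on error.
 * independent_positions is a hypothetical helper name. */
int independent_positions(const char *path, off_t *pos_a, off_t *pos_b)
{
    char buf[4];
    int a, b, rc = -1;
    int w = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

    if (w < 0)
        return -1;
    if (write(w, "abcdefgh", 8) != 8) {
        close(w);
        unlink(path);
        return -1;
    }
    close(w);

    a = open(path, O_RDONLY);   /* first descriptor, position 0 */
    b = open(path, O_RDONLY);   /* second descriptor, its own position 0 */

    if (a >= 0 && b >= 0 && read(a, buf, sizeof buf) == 4) {
        *pos_a = lseek(a, 0, SEEK_CUR);   /* advanced by the read */
        *pos_b = lseek(b, 0, SEEK_CUR);   /* untouched by the read on a */
        rc = 0;
    }

    if (a >= 0) close(a);
    if (b >= 0) close(b);
    unlink(path);
    return rc;
}
```

Reading through one descriptor advances only that descriptor's position; the other still points at the start of the file.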
Review: Semantics of Write
During Concurrent Access
UNIX systems do not provide mutual
exclusion or define the semantics of
concurrent access except to specify that a file
always contains the data written most
recently
Responsibility for correctness falls to the
programmers (e.g., using a lock mechanism,
flock)
A programmer must be careful to construct
concurrent programs in such a way that they
always produce the same results
Review: File Names and Paths
Fig. 4 illustrates the names of files and
directories for an example hierarchical file
system
Each directory or file has a full path name
that denotes the position of the file within
the hierarchy
Review: inode
A UNIX file system stores information
about each file on stable storage and the
information is kept in a structure known as
the file’s inode
Review: inode (cont.)
The inode contains:
the owner and group identifiers
the file mode descriptor
the time of last access
the time of last modification
the file size
the disk drive and file system on which the file resides
the number of directory entries for the file
the number of disk blocks currently used by the file
the basic file type (e.g., regular file or directory)
Review: inode (cont.)
UNIX separates information such as
ownership and file protection bits from the
directory entry for a file
Doing so makes it possible to have two
directory entries that point to the same file
Fig. 5 shows an illustration of hard links, or
links (the terms refer to directory entries for a file)
Review: Stat Operation
The system function stat(pathname,
&result_struct) extracts information about
a file from its inode and returns the
information to the caller
The second argument specifies the address of
a structure into which stat places the result; it
must be the address of an area in memory
large enough to hold the structure in Fig. 5a
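The C sketch below ties the stat call to the hard-link discussion: it creates a file, adds a second directory entry with link, and then calls stat on the second name. The function name stat_after_link and the scratch file contents are assumptions for illustration; st_size, st_nlink, and st_mode are standard fields of struct stat.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a 5-byte file at path, add a second directory entry
 * (hard link) at link_path, and stat the file through the new name.
 * Returns 0 on success and -1 on error.
 * stat_after_link is a hypothetical helper name. */
int stat_after_link(const char *path, const char *link_path,
                    struct stat *out)
{
    int rc;
    int fdesc = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

    if (fdesc < 0)
        return -1;
    if (write(fdesc, "hello", 5) != 5) {
        close(fdesc);
        unlink(path);
        return -1;
    }
    close(fdesc);

    unlink(link_path);               /* ignore failure: name may not exist */
    if (link(path, link_path) < 0) { /* second entry for the same inode */
        unlink(path);
        return -1;
    }

    rc = stat(link_path, out);       /* the caller supplies the structure */

    unlink(link_path);
    unlink(path);
    return rc;
}
```

Because both names refer to the same inode, the result reports the size written through the first name and a link count of two.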
Review: The File Naming Mechanism
Instead of forcing users to identify a drive
as well as a file, the designers invented the
idea of allowing the system manager to
attach the hierarchy on one drive to the
hierarchy on another
The result is a single, unified file
namespace that permits the user to work
without knowing the location of files
Review: The File Naming Mechanism (cont.)
The naming mechanism operates as follows:
The manager designates the hierarchy on one
of the drives to be the root
The manager creates an empty directory in the
root hierarchy; let the full path name of the
empty directory be given by a string /a
The manager instructs the naming mechanism
to overlay a new hierarchy (usually one from
some other drive) over directory /a
Review: The File Naming Mechanism (cont.)
The file naming mechanism provides users
and application programs with a single,
uniform file hierarchy even though the
underlying files span multiple physical
disks
It also allows a system manager to
partition a single physical disk drive into
one or more file systems; each file system
is an independent hierarchy
Review: File System Mounts
The system manager uses mount (usually
automatically at system startup) to specify how
a file system on one drive should be attached in
the hierarchy
Fig. 6 illustrates 3 file systems that have been
mounted to form a single hierarchy
The user cannot distinguish where files and
directories are because mounting hides all the
boundaries under a uniform naming scheme
Review: File System Mounts (cont.)
The system includes an executable
application, mount, that queries the system,
and then displays the list of mounted file
systems
The concept of mounting file systems to
form a single hierarchy provides incredible
flexibility
It also provides a convenient way to
introduce remote files into a hierarchy
Review: File Name Resolution
In a Linux system, name resolution means
finding the inode that identifies a file
A path name is resolved one component at
a time
Resolution begins at the root of the
hierarchy and at the first component of the path
It repeatedly extracts the next component
from the path and finds a file or
subdirectory with that name
Review: Symbolic Links
A symbolic link (shortcut) is a special text
file that contains the name of another file
For example, if a program opens file /a/b/c,
the system finds that it is a symbolic
link with value /a/q and automatically
switches to file /a/q
The chief advantage of symbolic links lies
in their generality, because a symbolic link
can name any file or directory
Review: Symbolic Links (cont.)
A symbolic link can be used to abbreviate a long path
name or to make a directory in a distant
part of the hierarchy appear to be much
closer
The chief disadvantage of symbolic links
arises from their lack of consistency and
reliability: a symbolic link can refer to a
nonexistent object, and a set of symbolic
links can form a cycle
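The reliability problem can be detected from an application, as the following C sketch suggests. The function name is_dangling_symlink is hypothetical; the sketch relies on the standard difference between lstat, which examines the link itself, and stat, which follows the link to its target.

```c
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if link_path is a symbolic link whose target does not exist,
 * 0 if it is a symbolic link that resolves, and -1 otherwise.
 * is_dangling_symlink is a hypothetical helper name. */
int is_dangling_symlink(const char *link_path)
{
    struct stat st;

    /* lstat inspects the link itself without following it. */
    if (lstat(link_path, &st) < 0 || !S_ISLNK(st.st_mode))
        return -1;

    /* stat follows the link; failure with ENOENT means the
     * named object is missing. */
    if (stat(link_path, &st) == 0)
        return 0;
    return (errno == ENOENT) ? 1 : -1;
}
```

A cycle of links is reported differently: on POSIX systems stat fails with ELOOP in that case, which this sketch maps to -1.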
Files Under NFS
NFS uses many of the UNIX file system
definitions
The next sections describe several features
of NFS and show how they relate to the
UNIX file system described earlier
NFS File Types
NFS uses the same basic file types as
UNIX
It defines enumerated values that a server
can use when specifying a file type, as
shown in Fig. 6a
UNIX permits system managers to
configure I/O devices in the file system
namespace
NFS File Types (cont.)
NFS has adopted UNIX’s terminology that
divides I/O devices into block-oriented (e.g., a
hard drive that always transfers data in
512-byte blocks) and character-oriented (e.g., a
serial port)
A file name that corresponds to a block-oriented
device has type block-special, while
a name that corresponds to a character-oriented
device has type character-special
NFS File Modes
Like UNIX, NFS assumes that each file or
directory has a mode that specifies its type
and access protection
Fig. 7 lists individual bits of the NFS file
mode integer and gives their meanings
The definitions correspond directly to those
returned by the UNIX stat function
NFS File Modes (cont.)
Although NFS defines file protection bits
that determine whether a client can read or
write a particular file, NFS denies a remote
machine access to all devices, even if the
protection bits specify that access is
allowed
NFS File Attributes
NFS uses the term file attributes when
referring to file information
Structure fattr3 describes the file attributes
that NFS provides, as shown in Fig. 7a
(similar to the information that the UNIX
stat function returns)
NFS Client and Server
An NFS server runs on a machine that has
a local file system
An NFS client runs on an arbitrary
machine, and accesses one or more remote
machines that each run an NFS server
Fig. 8 shows the procedures in an OS that
are called when an application opens a file
File Server
A computer that has large disks is often
dedicated to serve as an NFS server
A dedicated file server forbids users from
running applications
This keeps the load low
and accelerates the response to access requests
NFS Client Operation
The path name syntax used by the remote
file system may differ from that of the
client machine (e.g., Windows uses
backslash (\) vs. UNIX uses slash (/))
To accommodate potential differences
between the client and server path name
syntax, NFS follows a simple rule: only the
client side interprets full path names
NFS Client Operation (cont.)
To trace a full path name through the
server’s hierarchical directory system, the
client sends each individual path name
component one at a time and receives
information about the file or directory it
names (e.g., to look up path name /a/b/c on
a server, the client starts at the root and
looks up a, then b, then c)
The chief disadvantage of requiring the
client to parse path names: it requires an
exchange across the network for each
component in the path
NFS Client Operation (cont.)
The chief advantage is that both the applications
and the client code can be written to access
remote files without knowing where files will
be located or the naming conventions used
by the file systems on the servers
To keep applications on client machines
independent of file locations and server
computer systems, NFS requires that only
clients interpret full path names
NFS Client and UNIX Systems
Implementations of NFS client code for a
UNIX system employ an extended version
of the mount mechanism (chief advantage:
consistency) to integrate remote file
systems into the naming hierarchy along
with local file systems
NFS Client and UNIX Systems (cont.)
When an application performs an operation
on a file descriptor (e.g., read), the system
checks to see whether the descriptor refers
to a local file or a remote file
If the file is remote, the OS calls NFS
client code that translates the operation into
an equivalent NFS operation and places an
RPC to the server
NFS Mounts
When managers add NFS mount entries to
a UNIX mount table, they must specify a
remote machine that operates an NFS
server, a hierarchy on that server, a local
directory onto which the mount will be
added, and information that specifies
details about the mount
For example, the output from the UNIX
mount command is shown in Fig. 8a
NFS Mounts (cont.)
NFS defines two basic paradigms for
remote mounts: soft mount and hard
mount
Using a soft mount specifies that an NFS
client should implement a timeout
mechanism and consider the server offline
if the timeout expires
Using a hard mount specifies that an NFS
client should not use a timeout mechanism;
it keeps retrying until the server responds
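The difference between the two paradigms can be sketched as a retry policy in C. This is a simplified model under stated assumptions, not NFS client code: struct flaky_server, flaky_call, and call_with_policy are hypothetical names, and an unanswered attempt is modeled as a return value of -1 instead of a real timer.

```c
/* Hypothetical server that does not answer the first `down_for`
 * attempts; `attempts` counts how many requests it has received. */
struct flaky_server { int down_for; int attempts; };

/* Stand-in for one request: 0 means a reply arrived,
 * -1 models an attempt that went unanswered. */
static int flaky_call(struct flaky_server *s)
{
    s->attempts++;
    return (s->attempts > s->down_for) ? 0 : -1;
}

enum mount_kind { SOFT_MOUNT, HARD_MOUNT };

/* Issue a request under the given mount policy.
 * Soft mount: give up and report failure after max_tries
 * unanswered attempts. Hard mount: ignore max_tries and retry
 * until the server answers. Returns 0 on success, -1 on give-up. */
int call_with_policy(struct flaky_server *s, enum mount_kind kind,
                     int max_tries)
{
    int tries = 0;

    for (;;) {
        if (flaky_call(s) == 0)
            return 0;
        tries++;
        if (kind == SOFT_MOUNT && tries >= max_tries)
            return -1;   /* treat the server as offline */
    }
}
```

With a server that misses its first five requests, a soft mount limited to three tries reports failure to the application, while a hard mount blocks until the sixth attempt succeeds.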
NFS Mounts (cont.)
Usually,all mounts are created
automatically at system startup
File Handle
NFS arranges for a server to assign each
file a unique file handle that it uses as an
identifier
The server makes up a handle and sends
it to the client when the client first opens
the file
The client sends the handle back to the
server when it requests operations on the
file
File Handle (cont.)
From the client’s point of view, the file
handle consists of a 64-byte string that the
server uses to identify a file
In NFS terminology, a file handle is
opaque to the client, meaning that a client
cannot decode the handle or fabricate a
handle itself
File Handle (cont.)
Servers choose some of the bits in a handle
at random to help ensure that clients cannot
fabricate a valid handle
To improve security, the server encodes a
timestamp in the handle to limit a handle’s
lifetime
Handles Replace Path Names
When an application specifies a file, the
application gives a complete path
The OS follows the usual algorithm to
resolve the path name: it starts at the file
system root and looks up each component
of the path
When it reaches a remote mount point, the
local OS passes control to the NFS client,
as shown in Fig. 9 (look up a file with path
/a/b/c in the server’s hierarchy)
Stateless Servers
The NFS design stores state information at
the client side, allowing servers to remain
stateless
Because the server is stateless, disruptions
in service (e.g., a server reboot) do not
invalidate the state that clients hold
Because a stateless server does not need to
allocate resources for each client, a
stateless design can scale to handle many
more clients than a stateful design
A stateless server cannot keep any notion of
position, whether in a file or directory
File Positioning with a
Stateless Server
Because NFS uses a stateless server design,
the server cannot store a file position for
each application that is using a file
Storing file position information in the
client’s local file table helps optimize
operations that change the file position
If the client calls lseek, the system records
the new file position in the table without
sending a message to the server
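A toy C model shows why the client-side table suffices: every read request carries its own offset, so the server needs no memory of earlier requests. All names here (server_file, server_read, struct client_file, client_lseek, client_read, client_requests_sent) are hypothetical and chosen for illustration; the real NFS read procedure has a different signature.

```c
#include <string.h>

/* Hypothetical remote file held by the "server". */
static const char server_file[] = "0123456789abcdef";
static int server_requests;   /* messages received by the server */

/* Stateless read: the request itself carries the offset, so the
 * server keeps no per-client position. Returns bytes copied. */
static int server_read(long offset, char *buf, int count)
{
    long size = (long)sizeof server_file - 1;

    server_requests++;
    if (offset < 0 || offset >= size || count <= 0)
        return 0;
    if (count > size - offset)
        count = (int)(size - offset);
    memcpy(buf, server_file + offset, (size_t)count);
    return count;
}

/* One entry in the client's local file table. */
struct client_file { long position; };

/* Changing the position touches only the local table;
 * no message goes to the server. */
long client_lseek(struct client_file *f, long offset)
{
    if (offset < 0)
        return -1;
    f->position = offset;
    return offset;
}

/* A read sends the locally stored position along with the request
 * and then advances the local position by the bytes received. */
int client_read(struct client_file *f, char *buf, int count)
{
    int n = server_read(f->position, buf, count);
    f->position += n;
    return n;
}

/* Number of messages the server has received so far. */
int client_requests_sent(void)
{
    return server_requests;
}
```

In this model a call to client_lseek followed by client_read produces exactly one message, and that message names the offset explicitly.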
Operations on Directories
NFS defines a directory to consist of a set
of pairs: a file name and a pointer to the
named file
NFS provides operations that permit a
client to insert a file in a directory, delete
a file from a directory, search a directory
for a name, and read the contents of a
directory
Reading a Directory Statelessly
Because directories can be arbitrarily large
and communication networks impose a
fixed limit on the size of a single message,
reading the contents of a directory may
require multiple requests
The NFS designers chose to overcome the
limitations of stateless servers by arranging
for an NFS server to return a position
identifier when it answers a request for an
entry from a directory
Reading a Directory Statelessly (cont.)
When a client wishes to read entries from a
remote directory, it steps through the
directory by making repeated requests that
each specify the position identifier (called
a magic cookie) returned in the previous
request
A magic cookie does not guarantee
atomicity, nor does it lock the directory
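The cookie-driven loop can be sketched in C with a toy directory. The names dir_names, toy_readdir, and count_entries are hypothetical, and the cookie is modeled as a plain array index purely for illustration; a real client must treat the cookie as an opaque value that only the server can interpret.

```c
/* Hypothetical directory contents held by the "server". */
static const char *const dir_names[] = { "a", "b", "c", "d", "e" };
#define DIR_COUNT 5

/* One stateless directory-read request. The cookie says where to
 * resume (0 means the beginning). Fills names[] with at most `max`
 * entries, stores the cookie for the next request in *next_cookie,
 * and returns the number of entries delivered (0 at the end). */
int toy_readdir(int cookie, const char *names[], int max,
                int *next_cookie)
{
    int n = 0;

    if (cookie < 0)
        cookie = 0;
    while (cookie + n < DIR_COUNT && n < max) {
        names[n] = dir_names[cookie + n];
        n++;
    }
    *next_cookie = cookie + n;
    return n;
}

/* Client loop: keep passing back the cookie from the previous reply
 * until the server returns no more entries. *requests receives the
 * number of requests issued. Returns the total entries seen. */
int count_entries(int per_request, int *requests)
{
    const char *names[8];
    int cookie = 0, total = 0, n;

    *requests = 0;
    if (per_request < 1 || per_request > 8)
        return -1;
    do {
        n = toy_readdir(cookie, names, per_request, &cookie);
        (*requests)++;
        total += n;
    } while (n > 0);
    return total;
}
```

Nothing in this loop stops another client from changing the directory between requests, which is why the cookie offers no atomicity.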
Reading a Directory Statelessly (cont.)
Users seldom understand the details of
how the system interprets concurrent
directory operations because they rarely
need to know
Multiple Hierarchies in an NFS Server
Later versions of NFS allow a single server
to provide remote access to files located in
several hierarchies
This requires an additional mechanism to allow
a client to specify one of the possible
hierarchies and obtain a handle for its root
The Mount Protocol
NFS uses a mount protocol to handle the
problem of finding a root directory; the
mount protocol is a separate remote program,
not part of the NFS remote program
The mount protocol provides 4 basic services that
clients need before they can use NFS:
The Mount Protocol (cont.)
It allows a client to obtain a list of the directory
hierarchies that the client can access through
NFS
It accepts full path names that allow the client
to identify a particular directory hierarchy
It authenticates each client’s request and
validates the client’s permission to access the
requested hierarchy
It returns a file handle for the root directory of
the hierarchy a client specifies
The Mount Protocol (cont.)
A client system uses the mount protocol to
contact a server and verify access to the
remote file system before adding the remote
mount to its local hierarchical namespace
If the mount protocol approves access, the
client code stores the handle for the root of
the remote file system so it can use the
handle when an application tries to open a
file on that file system
Transport Protocols for NFS
The original version of NFS was designed
for use in a LAN environment and
included basic facilities for timeout and
retransmission of requests
The early version of NFS did not work
well in a WAN: retransmission was not
adaptive, and the protocol did not include
provisions for either congestion control or
end-to-end flow control
Transport Protocols for NFS (cont.)
Implementations of NFS are now available
that use TCP to achieve file access over the
connected Internet
Summary
To allow many clients to access a server
and to keep the servers isolated from client
crashes, NFS uses stateless servers
When a client looks up a particular
component name, the server returns a 64-byte
file handle that the client uses as a
reference to the file or directory in
subsequent operations
Summary (cont.)
NFS adopted the open-read-write-close
paradigm used in UNIX, along with the
basic file types and file protection modes
A companion to NFS, the mount protocol
makes it possible for a single NFS server to
provide access to multiple directory
hierarchies
Once the client obtains a handle for the root,
it can use NFS procedures to access
directories and files in that hierarchy