Operating Systems Lecture Notes

16 April 2012 • Distributed File Systems

Outline

Background
Naming and transparency.
Remote file access.
Stateful vs stateless service.
File replication.

Background

A distributed file system is a file system connected to the rest of the system via a ~~bus~~ network.

Consequences

Aggregating network-based disks is much easier than aggregating bus-based disks.
- At the cost of managing dispersed storage units.
Heterogeneous disks are easier to accommodate on a network than on a bus.
File systems can become unified and semantically richer.
- Backup and CD-jukebox file systems.

Structure

A distributed file system is a service offered by a server to clients.
- Service: an interface and associated semantic definitions (behaviors).
- Server: A system (software + hardware) implementing a service.
- Client: A process (software) invoking service operations.
The service offered should look a lot like file-system system calls.

Location

Where is the file?
Location transparency makes that question hard to answer.
- Under regular file-system semantics.
- But it may be possible under distributed file-system semantics.
Naming maps logical to physical objects.
- File names and files, for example.
- An extended dictionary look-up.
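
The dictionary view of naming can be sketched as follows. This is a minimal illustration, not any particular system's scheme: the table contents, host names, paths, and the `resolve` and `migrate` helpers are all assumptions made up for the example.

```python
# Hypothetical name table: logical file name -> (host, physical path).
NAME_TABLE = {
    "SquareListRow.scm": ("serverA", "/export/vol3/0042"),
    "notes.txt": ("serverB", "/export/vol1/0007"),
}


def resolve(logical_name):
    """Map a logical name to its physical location."""
    try:
        return NAME_TABLE[logical_name]
    except KeyError:
        raise FileNotFoundError(logical_name)


def migrate(logical_name, new_host, new_path):
    """Move a file: only the mapping changes, the logical name does not."""
    if logical_name not in NAME_TABLE:
        raise FileNotFoundError(logical_name)
    NAME_TABLE[logical_name] = (new_host, new_path)
```

A client that only ever calls `resolve("SquareListRow.scm")` keeps working after a `migrate`, because the logical name carries no location.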

Location Properties

Assigning meaning and operations to location can be subtle.
Location transparency is the ideal: logical objects are location agnostic.
```
SquareListRow.scm
```
Location independence adds logical location information to logical objects.
```
yesterday:SquareListRow.scm
```

Naming Locations

Non-transparent naming: <host>:<file>, a non-logical mapping.
- Globally unique names (usually).
Explicitly patching remote file systems into local file systems.
- Mounting directories.
Automatically global file systems.
- A difficult semantic and operational problem.
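
Mounting can be pictured as a table lookup on path prefixes. The sketch below assumes a hypothetical mount table and a `locate` helper that picks the longest matching mount point; real systems do this inside the kernel's path-name resolution, so treat it only as an illustration of the idea.

```python
import posixpath

# Hypothetical mount table: local mount point -> (remote host, remote directory).
MOUNTS = {
    "/home": ("fileserver", "/export/home"),
    "/home/projects": ("projserver", "/export/projects"),
}


def locate(path):
    """Return (host, path on that host); host is None for a local file."""
    path = posixpath.normpath(path)
    best = None
    for mount_point in MOUNTS:
        # Match whole path components only, so "/homework" is not under "/home".
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return (None, path)
    host, remote_dir = MOUNTS[best]
    return (host, remote_dir + path[len(best):])
```

After the mount, a user names `/home/alice/a.txt` without mentioning `fileserver`; the host appears only in the mount table.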

Caching

Networks add delay and latency to file operations.
Client-system caching is almost mandatory for good (tolerable) performance.
- Satisfy requests out of the client cache.
A file can have pieces distributed over several client caches.
- This is the cache-consistency problem with bells on.
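
A minimal sketch of satisfying requests out of a client cache. The `Server` and `CachingClient` classes, the block size, and the request counter are assumptions for illustration; the server is an in-memory stand-in, not a real network service.

```python
class Server:
    """Stand-in for a remote file server; counts the requests it receives."""

    def __init__(self, files):
        self.files = files          # name -> bytes
        self.requests = 0

    def read_block(self, name, block_no, block_size):
        self.requests += 1
        start = block_no * block_size
        return self.files[name][start:start + block_size]


class CachingClient:
    def __init__(self, server, block_size=4):
        self.server = server
        self.block_size = block_size
        self.cache = {}             # (name, block_no) -> bytes

    def read_block(self, name, block_no):
        key = (name, block_no)
        if key not in self.cache:   # miss: one trip to the server
            self.cache[key] = self.server.read_block(
                name, block_no, self.block_size)
        return self.cache[key]      # hit: no network traffic
```

Each client caches only the blocks it has touched, which is how pieces of one file end up spread over several client caches.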

Cache Location

Remote-file caches can be in primary store, or on disk.
Disk-based caches are stable (with care) and capacious, but slow.
Store-based caches are fast, but small and unstable.
Most likely, some combination of both.
There may be (usually are) distinct client and server caching needs.

Cache Management

The usual concerns apply, magnified by the network.
- Reliability decreases in strange ways.
- Cache invalidation can occur from anywhere.
Write-through improves reliability, but at high performance cost.
Write-back has better performance, but costs reliability.
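
The two write policies can be compared with a small sketch. The `Store`, `WriteThroughCache`, and `WriteBackCache` classes are hypothetical; `Store.writes` stands in for the number of network writes reaching the server.

```python
class Store:
    """Stand-in for the server's storage; counts writes that reach it."""

    def __init__(self):
        self.blocks = {}
        self.writes = 0

    def write(self, key, data):
        self.writes += 1
        self.blocks[key] = data


class WriteThroughCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def write(self, key, data):
        self.cache[key] = data
        self.store.write(key, data)     # every write reaches the server


class WriteBackCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()

    def write(self, key, data):
        self.cache[key] = data
        self.dirty.add(key)             # server copy is stale until flush

    def flush(self):
        for key in sorted(self.dirty):
            self.store.write(key, self.cache[key])
        self.dirty.clear()
```

Three writes to one block cost three server writes under write-through and one under write-back, but a client crash before `flush` loses the write-back data.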

Consistency

A process somewhere writes to a file. Where are those changes visible?
Several processes in several places write to a file. What's the result?
Client-side copies make these questions hard to answer.
- Outlawing client-side copies is not practical.
Let's skip conflicting concurrent modifications.

Consistency Management

Two general approaches: client-based and server-based.
- The difference is in the check initiator.
A client validates its copies with the server (pull).
- Client sends a checksum to the server.
A server informs clients their copies are outdated (push).
- Server sends notices to clients.
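
Both approaches can be sketched against one in-memory server. Everything here is an assumption for illustration: the class names, the use of SHA-256 as the checksum, and the rule that a server forgets a client once it has notified it.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


class Server:
    def __init__(self):
        self.files = {}      # name -> bytes
        self.holders = {}    # name -> clients to notify on change (push only)

    def fetch(self, name):
        return self.files[name]

    def is_current(self, name, client_checksum):
        # Pull: the client asks whether its copy still matches.
        return checksum(self.files[name]) == client_checksum

    def register(self, name, client):
        self.holders.setdefault(name, []).append(client)

    def write(self, name, data):
        self.files[name] = data
        # Push: tell every registered holder that its copy is outdated.
        for client in self.holders.pop(name, []):
            client.invalidate(name)


class PullClient:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def read(self, name):
        data = self.cache.get(name)
        if data is None or not self.server.is_current(name, checksum(data)):
            data = self.cache[name] = self.server.fetch(name)
        return data


class PushClient:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def invalidate(self, name):
        self.cache.pop(name, None)

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.fetch(name)
            self.server.register(name, self)
        return self.cache[name]
```

The pull client pays a validation message on every read; the push client reads its cache freely but forces the server to remember who holds copies.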

Caching v Consistency

What caching gives, it takes away.
- Client-side caching reduces network traffic for file operations.
- But increases traffic due to consistency management.
The task is to strike a balance between caching and consistency.
- This is a deal with the devil.

But Wait

Other factors intervene:
- File semantics - temp or private files.
- Access patterns - the read-write mix, the amount.
- Network characteristics - LAN vs WAN.
These characteristics may not be static.

Server Side

The server can look like a file system with primary-store buffers.
- A familiar system with lots of advantages.
  - Better performance, semantically rich operations, more efficient function.
But crashes are catastrophic.
- Collected information disappears, along with the caches.

Stateless Servers

Take the opposite approach: the server holds no information.
- Except for simple performance tricks.
Nothing is lost when a server crashes, and recovery's fast.
- But the server doesn't know anything.
Clients have to be detailed on every operation.
- The state has moved from the server to the client.
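
The shift of state from server to client can be seen in a sequential read. In this sketch (all class names are hypothetical), the stateful server tracks an offset per open handle, while the stateless server expects the file name, offset, and count on every request and the client keeps the offset itself.

```python
class StatefulServer:
    """Remembers an offset per open handle."""

    def __init__(self, files):
        self.files = files
        self.open_files = {}   # handle -> [name, offset]; lost in a crash
        self.next_handle = 0

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_files[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        entry = self.open_files[handle]
        name, offset = entry
        data = self.files[name][offset:offset + count]
        entry[1] = offset + len(data)
        return data

    def crash(self):
        self.open_files = {}   # in-memory state disappears


class StatelessServer:
    """Every request is self-describing; nothing is remembered."""

    def __init__(self, files):
        self.files = files

    def read(self, name, offset, count):
        return self.files[name][offset:offset + count]

    def crash(self):
        pass                   # nothing to lose


class StatelessClient:
    """Holds the offset that the stateless server refuses to keep."""

    def __init__(self, server, name):
        self.server, self.name, self.offset = server, name, 0

    def read(self, count):
        data = self.server.read(self.name, self.offset, count)
        self.offset += len(data)
        return data
```

After `crash`, the stateful server no longer recognizes the client's handle, while the stateless client simply continues from its own offset.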

Stateful v Stateless

Stateless servers
- are fast and simple in operation and recovery.
- offer simple operations and off-load work to the clients.
Stateful servers
- can be fast, but usually aren't simple.
- offer complex and low-cost operations.
This is the classic time-space trade-off.

Comments

There is a network time-space trade-off at work too.
- Stateless servers tend toward larger, more frequent messages.
Some file systems fit better with one or the other.
- UNIX (file, offset) semantics are stateful.
Stateless servers are closer to fancy disks;
stateful servers are closer to fancy file systems.

Duplication

Have duplicate server systems.
The naming scheme should be replication transparent (arguable).
More copies should mean more reliability and better performance.
- If one server's down, go to a duplicate.
- Put a duplicate near a busy client.
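
A minimal sketch of going to a duplicate when a server is down. The `Replica` class and the `read_replicated` helper are hypothetical, and the nearest-first ordering of the list is an assumption; the caller uses the same path whichever copy answers.

```python
class Replica:
    """Stand-in for one server holding a copy of the files."""

    def __init__(self, name, files, up=True):
        self.name, self.files, self.up = name, files, up

    def read(self, path):
        if not self.up:
            raise ConnectionError(self.name + " is down")
        return self.files[path]


def read_replicated(replicas, path):
    """Try replicas in order (nearest first); return (data, replica name)."""
    for replica in replicas:
        try:
            return replica.read(path), replica.name
        except ConnectionError:
            continue            # this copy is unreachable, try the next
    raise ConnectionError("no replica available for " + path)
```

The sketch covers reads only; keeping the copies identical under writes is the consistency problem again.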

Network File System

The Network File System (NFS) was developed by Sun Microsystems in the mid-1980s.
- NFS is an open standard, managed by the IETF.
Originally developed to provide Unix-like file semantics over a LAN.
- Services have grown over successive versions.
- Currently at version 4.1 (RFC 5661).

NFS Basics

An NFS server provides directories in its local file system.
Clients mount NFS-offered directories into their local file systems.
- It's transparent (mostly) after the mount.
Clients and servers communicate via the NFS protocol, originally over UDP, with TCP standard from version 4.
- No other client-server relation assumed.

Summary

A distributed file system puts a network between the file system and everything else.
The degree of location transparency determines how much of the network shows through.
Using a network accentuates old issues and raises some new ones.
- Time-space trade-offs, consistency, replication, naming.

References

Distributed File Systems: Concepts and Examples by Eliezer Levy and Abraham Silberschatz in Computing Surveys, December 1990.
A Comparison of Three Distributed File System Architectures: Vnode, Sprite, and Plan 9 by Brent Welch in Computing Systems, Spring, 1994.

This page last modified on 2012 April 18.