When & Where
The second exam will be held online on March 30, 2020. It will be available starting at 6:20pm and you have until 10:00pm to complete it.
The exam will be administered as a Canvas quiz and will contain four short-answer and 16 multiple-choice questions as well as an honor pledge. I expect that it will take most of you well under an hour to complete but I do not want time to be a stress factor, so you will have at least three hours to complete the exam.
Be sure you are in an environment where you have reliable Internet connectivity and are free of distractions during the allotted time.
The exam is an individual exam. You may refer to your notes but you may not work in a group or consult classates or other people. I rely on your honor for this. Any evidence of a violation or plagerism will result in a grade of 0 on the exam and a grade of no higher than a D (or F if taking a pass/fail option).
I want to try to avoid using ProctorTrack since I feel it’s uncomfortably invasive. If this format does not work well, I may have to resort to it for exam 3.
You can use my past exams as a guide to what this exam may look like.
I do not refer to old exams when I come up with a new one, so it is likely that many of the topics that I considered important in past exams will show up on future exams. Some material may have changed, however, so do not worry about questions that appear to relate to topics we have not covered.
You are responsible for the material since exam 1, weeks 5–8.
I’ve prepared a study guide that attempts to cover most of the material you should know and will amend it during the week. It is not a substitute for the lectures, lecture material, and other reading matter. My goal is to put most of the information you need to know in as concise a form as possible, with more elaboration in areas where textbook coverage may be lacking.
Topics that you should know and may be on the exam include:
- Understand properties: safety, liveness, fairness
- Types: centralized, token-based, contention-based
- Central-server algorithm
- Distributed mutual exclusion algorithms:
- Token ring
- Ricart & Agrawala
- Understand pros and cons of each
- Bully algorithm
- Ring algorithm
- Chang and Roberts ring algorithm optimizations (I won’t ask you to recite the algorithm but understand what it does)
- Network partition problem (split-brain/segmentation)
- Purpose of replicated state machines
- Understand consensus requirements
- Uniform agreement
- Termination (= progress)
- Goal: fault tolerant consensus algorithm
- States that a server may take
- Terms and term numbers
- Leader election, timeouts, heartbeat, split votes
- RPCs used
- Log matching, committing entries
Network Attached Storage
- File service, file server
- Service models: remote access vs. upload/downloa
- Consistency: sequential semantics, session semantics, ambiguous semantics
- Stateful vs. stateless design
- Caching considerations
- Case studies
- Mounting filesystems
- You do not need to know the automounter (we did not discuss it in class)
- UDP vs TCP transport
- Directory and file access protocol (don’t memorize the list of RPC calls but understand why they exist, what lookup does, why there’s no open)
- Inconsistencies because of caching & validation
- Know that a user-level lock manager was added to NFS but otherwise don’t bother with features of successive versions.
- Don’t memorize features of NFS v4 but know:
- the protocol is now stateful
- supports better caching (similar to oplocks) - compound RPC support added
- AFS (versions 1,2)
- Service model, semantics
- Cache coherence (callbacks)
- Replicated volumes
- Accessible Volume Storage Group
- Client modification log
- Disconnected operation
- DFS (AFS v.3)
- Token mechanism: don’t memorize tokens but understand the goals
- High-level protocol (RPC-like)
- Caching model: oplocks/leases (don’t memorize oplocks but know what their purpose is)
- Microsoft DFS: just know that it adds consistent naming, similar to AFS
- SMB2 – understand these concepts that improved performance
- Credit-based flow control
Distributed file systems
- Purpose: lock manager, name server, and simple file system
- Architecture of a Chubby cell: master and replicas
- Named locks
- Event notification
- Whole file reads and writes
- Don’t try to memorize the API
- Client cache consistency
- Use of Paxos
What is a parallel file system?
- chunkservers - what do they do?
- master - what does it do?
- Google cluster environment: colocate computation and data on same set of machines
- chunk handle
- operation log
- why large chunks?
- replication: data flow vs. control flow
- HDFS is practically a clone of GFS. Focus on GFS terminology. If I use any HDFS terminology, I will identify the GFS equivalent
- NameNode = Master
- DataNode = Chunkserver
- blocks = chunks
- Transaction log = Operation log
- You do not have to know any of the differences between GFS and HDFS (they’re minor)
- Rack-aware distribution - client tries to read from the closest replica
Distributed Lookup (Distributed Hash Tables)
- Purpose: distributed, highly available key-value store
- Central coordinator
- What is an overlay network?
- Back propagation, time to live
- Distributed hash table (DHT)
- What is consistent hashing?
- Content-Addressable Network
- Multi-dimensional hashes, zone, node insertion
- Ring, successor nodes
- How do finger tables make query routing more efficient?
- Amazon Dynamo
- Understand the goal: distributed, highly available key-value store
- Functional interface: get, put
- Consistency model for replication (eventually consistent)
- Goals: incremental scalability, symmetry, decentralization, heterogenous systems
- Conflict resolution
- Who does it?
- Use of vector timestamps for conflict detection
- Partitioning among multiple nodes
- What is a virtual node and why are they a good idea?
Properties of transactions (ACID)
Nested transactions: private work spaces, write-ahead logs
Two-phase commit protocol
- Understand the need for a log
- What happens in each of the phases?
- Problems if coordinator or participants die
Three-phase commit protocol
- Coordinator recovery
- Understand the conditions of what happens to the protocol when the coordinator crashes (slide 33)
- Problems with 3PC
What does applying a consensus protocol (e.g., Raft) achieve?
Scaling via replication and Brewer’s CAP Theorem
- Fault tolerance
Eventual consistency: BASE approach vs. ACID
- Schedules, serial schedule
- Lock manager
- Two-phase locking (2PL)
- What is the purpose? Preserves serializability (“Isolated” in ACID)
- Cascading aborts
- Strong strict two-phase locking (SS2PL)
- Separating read and write locks
- Read locks (shared locks), write locks (exclusive locks), commit locks
- Two-version-based concurrency control
- Timestamp ordering
- Optimistic concurrency control: just the concepts
- Validation phase
- Update phase
- Use of timestamp ordering
- Multiversion Concurrency Control
- Snapshot isolation
- Use of timestamps
- Conditions for deadlock, deadlock resolution
- Centralized detection
- Wait-for graph, Global wait-for graph
- Phantom deadlock (what’s meant by this?)
- Detection: Chandy-Misra-Haas edge chasing algorithm
- Just the high level: a probe message circulates back to the originator
- Prevention: wait-die, wound-wait
- Just the high level: cyclies are prevented because transactions can only wait on older (or younger) resources.
- I will not ask you to remember which is wait-die (old can only wait on young) or which is wound-wait (old kills young)