Goal: Determine whether transactions will wait for resources in a way that creates an indefinite wait, called deadlock, and find ways to ensure that this does not happen.
A deadlock occurs when there is a circular dependency among processes holding and requesting resources. Four conditions must all hold for a deadlock to occur:
- mutual exclusion: A resource can be held by at most one process.
- hold and wait: Processes that already hold resources can wait for another resource.
- non-preemption: A resource, once granted, cannot be taken away.
- circular wait: Two or more processes form a cycle in which each is waiting for a resource held by another process in the cycle.
Three approaches can be used for managing deadlock in distributed systems.
- A centralized deadlock detection approach uses a central coordinator to manage a resource graph of processes and the resources they are using. Each time a process wants to acquire or release a resource, it sends a message (waiting-for or releasing) to this coordinator. The coordinator builds a graph of all the processes and the resources they are holding or waiting for, called a wait-for graph. If a cycle is detected in the graph, the coordinator knows a deadlock exists. If release and waiting-for messages are received out of order, the coordinator may believe that there is a deadlock cycle when none really exists: had the release message been processed first, no cycle would have appeared. This condition is known as phantom deadlock.
- In the Chandy-Misra-Haas distributed deadlock detection algorithm, a process that is about to wait for a resource first sends a probe message to the process holding that resource. The receiving process forwards the probe to every process that holds resources it is waiting for. This is called edge chasing. If the original process receives its own probe message, it knows that a dependency cycle, and hence deadlock, will exist if it waits for the resource it wants.
- Deadlock prevention approaches require processes to access resources in restricted ways to ensure that a deadlock cannot occur. One approach to implementing this is to make decisions based on the timestamps of the transactions competing for resources.
  - The wait-die algorithm states that if a younger process is using the resource, then the older process (that wants the resource) waits. If an older process is holding a resource, then the younger process that wants the resource kills itself (that’s ok; transactions are designed to be restartable). This forces the resource utilization graph to be directed from older to younger processes, making cycles impossible.
  - The wound-wait algorithm ensures that the graph flows from younger to older processes, so cycles are again impossible. An older process that wants a resource preempts (kills) the younger process holding it. If a younger process wants a resource that an older one is using, it waits until the older process is done.
- Deadlock avoidance requires predicting the precise resources that will be needed, the times they will be needed, and which processes will need them, and then managing resource allocation or transaction scheduling so that deadlock cannot occur. This is generally impossible to predict, so avoidance is not a practical approach.
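
The centralized detection approach described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `Coordinator` class and its method names are hypothetical, and it assumes that each resource has at most one holder and that each process is blocked on at most one resource at a time, so a cycle can be found by following a single chain of edges.

```python
class Coordinator:
    """Hypothetical central coordinator tracking who holds and who waits."""

    def __init__(self):
        self.holder = {}   # resource -> process currently holding it
        self.wanted = {}   # process -> resource it is blocked on

    def acquire(self, process, resource):
        # The process now holds the resource and is no longer blocked.
        self.holder[resource] = process
        self.wanted.pop(process, None)

    def waiting_for(self, process, resource):
        # "waiting-for" message: the process is blocked on the resource.
        self.wanted[process] = resource

    def release(self, process, resource):
        # "releasing" message: drop the holder entry if it matches.
        if self.holder.get(resource) == process:
            del self.holder[resource]

    def has_deadlock(self):
        # Follow waiter -> holder edges; revisiting a process means a cycle.
        for start in self.wanted:
            seen = set()
            p = start
            while p in self.wanted:
                if p in seen:
                    return True
                seen.add(p)
                p = self.holder.get(self.wanted[p])
        return False


coordinator = Coordinator()
coordinator.acquire("A", "R1")
coordinator.acquire("B", "R2")
coordinator.waiting_for("A", "R2")
# B actually released R2 before asking for R1, but suppose the
# waiting-for message reaches the coordinator first:
coordinator.waiting_for("B", "R1")
phantom = coordinator.has_deadlock()      # cycle A -> B -> A is reported
coordinator.release("B", "R2")            # the late release arrives
after_release = coordinator.has_deadlock()
```

In this run the coordinator reports a cycle only because the release message arrived late, which is the phantom deadlock case; once the release is processed, the cycle disappears.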
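
The edge-chasing idea behind the Chandy-Misra-Haas algorithm can be illustrated with a small single-machine simulation. This is only a sketch under stated assumptions: real probes are network messages between processes, whereas here a queue stands in for the network, and the `would_deadlock` function and the `waits_for` dictionary are hypothetical names introduced for the example.

```python
from collections import deque


def would_deadlock(initiator, holder, waits_for):
    """Simulate a probe sent by `initiator` to `holder`.

    `waits_for` maps each blocked process to the set of processes that
    hold resources it is waiting for. Returns True if the probe gets
    back to the initiator, meaning that waiting would close a cycle.
    """
    # Each probe is (initiator, sender, receiver).
    probes = deque([(initiator, initiator, holder)])
    forwarded = set()  # processes that have already forwarded the probe
    while probes:
        origin, sender, receiver = probes.popleft()
        if receiver == origin:
            return True            # the initiator received its own probe
        if receiver in forwarded:
            continue               # do not forward the same probe twice
        forwarded.add(receiver)
        # A blocked receiver forwards the probe along each outgoing edge;
        # a receiver that is not blocked has no edges and drops it.
        for nxt in waits_for.get(receiver, ()):
            probes.append((origin, receiver, nxt))
    return False


# B waits for C, and C waits for A. If A now waits for B, a cycle forms.
cycle = would_deadlock("A", "B", {"B": {"C"}, "C": {"A"}})
# Without the C -> A edge, the probe dies out at C.
no_cycle = would_deadlock("A", "B", {"B": {"C"}})
```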
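
The wait-die and wound-wait rules described above reduce to a comparison of two timestamps. The sketch below assumes that a smaller timestamp means an older transaction and that timestamps are unique; the function names and the returned strings are hypothetical and chosen only for illustration.

```python
def wait_die(requester_ts, holder_ts):
    """Decide what the requester does under wait-die."""
    # Older requester waits for the younger holder; younger requester dies.
    return "wait" if requester_ts < holder_ts else "die"


def wound_wait(requester_ts, holder_ts):
    """Decide what the requester does under wound-wait."""
    # Older requester preempts (wounds) the younger holder;
    # younger requester waits for the older holder.
    return "wound holder" if requester_ts < holder_ts else "wait"


# Transaction with timestamp 5 is older than the one with timestamp 9.
old_wants_from_young = (wait_die(5, 9), wound_wait(5, 9))
young_wants_from_old = (wait_die(9, 5), wound_wait(9, 5))
```

In both schemes a process only ever waits in one direction of age (older on younger in wait-die, younger on older in wound-wait), which is why no cycle of waiting processes can form.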