APSP Floyd-Warshall
Problem
Given a directed weighted graph (with no negative weight cycles), we want to be able to answer queries of the form "what is the minimum distance between v and w?" quickly. This is the classic All Pairs Shortest Paths (APSP) problem.
Attempt 1
Do not pre-process the graph at all. Run Dijkstra each time you get a query (if the query has the same source vertex as a previous query, there is no need to run Dijkstra again).
Then, to answer Q queries (Q > V), it would take O(VE log V). In the case of a dense graph, where E = O(V²), this becomes O(V³ log V).
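A sketch of this query-caching approach in Python (the adjacency-list format and the helper names are assumptions; note that Dijkstra requires non-negative edge weights):

```python
import heapq

def dijkstra(adj, src):
    """adj maps node -> list of (neighbour, weight) pairs.
    Returns a dict of shortest distances from src."""
    INF = float("inf")
    dist = {v: INF for v in adj}
    dist[src] = 0
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                      # stale heap entry, skip
        for nb, w in adj[u]:
            if d + w < dist[nb]:
                dist[nb] = d + w
                heapq.heappush(pq, (dist[nb], nb))
    return dist

cache = {}

def query(adj, v, w):
    """Answer a distance query, re-using earlier runs with the same source."""
    if v not in cache:
        cache[v] = dijkstra(adj, v)
    return cache[v][w]
```

Each distinct source vertex triggers at most one Dijkstra run; repeated queries from the same source are answered from the cache in O(1).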
Floyd-Warshall (FW)
Floyd and Warshall gave a beautifully simple algorithm to calculate APSP in O(V³) time. So, it is better to use FW rather than repeatedly running Dijkstra in the case of a dense graph. In the case of a sparse graph, running Dijkstra V times takes O(VE log V) = O(V² log V) since E = O(V), which is faster than FW because log V grows slower than V. In fact, in the case of a sparse graph, we don't know how to solve APSP faster than O(V² log V).
Notice that if all the edge weights were identical, we could simply run BFS from each node and get the APSP in O(VE) time.
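A sketch of this unit-weight special case in Python (the adjacency-list representation is an assumption, not from the notes):

```python
from collections import deque

def unweighted_apsp(adj):
    """adj maps node -> list of neighbours; all edges have weight 1.
    Runs BFS from every source and returns dist[v][w] for all pairs."""
    INF = float("inf")
    nodes = list(adj)
    dist = {v: {w: INF for w in nodes} for v in nodes}
    for src in nodes:
        dist[src][src] = 0
        q = deque([src])
        while q:
            u = q.popleft()
            for nb in adj[u]:
                if dist[src][nb] == INF:   # first time we see nb => shortest
                    dist[src][nb] = dist[src][u] + 1
                    q.append(nb)
    return dist
```

Each BFS costs O(V + E), so running it from every node gives the claimed O(VE) total for connected graphs.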
FW is also useful for finding the diameter of a graph (since the diameter involves knowing the shortest path between all pairs of nodes, and then taking the maximum one).
FW outputs dist[v,w], the shortest distance from v to w, for all pairs of vertices (v,w).
Why can we use DP here? Well, shortest paths have an amazing amount of optimal sub-structure: We know that if P is the shortest path from u to v and it contains w, then P contains the shortest path from u to w and from w to v. So, a shortest path between two nodes is composed of the shortest path between many intermediary nodes too.
Moreover, (and more importantly) many shortest path calculations depend on the same sub-pieces. That is, they have overlapping subproblems!
Although it is easy to understand and believe that shortest paths have overlapping subproblems, it is much more difficult to correctly identify the right subproblems.
This is where the genius of Floyd and Warshall truly shines:
Let S[v,w,P] be the shortest path from v to w that only uses intermediate nodes (if any) from the set P. In other words, S[v,w,P] is the length of the shortest path from v to w that does not include any vertices not in P.
Let e(v,w) be the weight of an edge from v to w.
Then, our base case is: S[v,w,∅] = e(v,w) (where ∅ represents the empty set). That is, if you cannot use any intermediate nodes, the only way to get from v to w is through a direct edge, if any. If no such edge exists, e(v,w) = ∞.
But the problem now is that there are so many possible such sets P. In particular, there are 2^V possible subsets of the vertex set, all of which are candidate subproblems that we might need to solve. But the question is, do we really need to solve all of them? Or can we choose only a few to solve and still guarantee that we get the correct answer?
This step represents another flash of creativity and intelligence by Floyd and Warshall. They realised that they only needed to solve the subproblems for n+1 sets (where n = V). In particular, only the following sets need to be considered:
P_0 = ∅
P_1 = {1}
P_2 = {1, 2}
P_3 = {1, 2, 3}
…
P_n = {1, 2, 3, …, n}
At each step, as we grow our set P, we are allowed to use more nodes as intermediate nodes if necessary. So, in the last step, we are allowed to use all possible n nodes as intermediate nodes, giving us the correct shortest paths.
How do we obtain the recurrence relation now, using the pre-calculated subproblems? Say we are trying to calculate S[v,w,P_8] and we have already calculated all the shortest paths when we are allowed to use nodes in P_7 as intermediaries.
Then, there are two possibilities:
The shortest path in S[v,w,P_8] includes 8 (the new shortest path contains 8 as an intermediate node)
Since 8 is an intermediate node along the shortest path from v to w using only the nodes in P_8, the shortest distance from v to w along this new path must be S[v,8,P_7] + S[8,w,P_7].
We cannot have any other nodes like 9, 10, 11, … in this path from v to 8 or from 8 to w since they do not appear in the overall path from v to w using P_8.
The shortest path in S[v,w,P_8] does not include 8 (that is, even though we are allowed to use 8, it does not give us a shorter path)
So, the recurrence relation is S[v,w,P_8] = min( S[v,w,P_7] , S[v,8,P_7] + S[8,w,P_7] ).

Optimising Space
Notice that we don't actually need to store a 3-D matrix for S. To calculate the APSP using intermediate nodes in P_i, we only need the values computed for P_{i-1}. So, a single 2-D matrix suffices: we store the current values and keep over-writing the old ones, updating as and when necessary. Notice that it can never be the case that the shortest path from v to k includes k as an intermediate node (as that would indicate a negative cycle containing k). So we don't need to worry about some entries already using P_k while others still use P_{k-1} as we add k to the set of allowed intermediate nodes and progressively update the DP table.
Pseudocode

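A sketch of the algorithm in Python (the adjacency-matrix input, with float('inf') marking missing edges, is an assumed representation):

```python
def floyd_warshall(w):
    """w is an n x n matrix: w[v][u] = e(v,u), float('inf') if no edge.
    Returns S with shortest distances; assumes no negative weight cycles."""
    n = len(w)
    S = [row[:] for row in w]          # base case: S[v][u] = e(v,u)
    for v in range(n):
        S[v][v] = 0                    # a node is at distance 0 from itself
    for k in range(n):                 # grow the allowed set: P_k = {0, ..., k}
        for v in range(n):
            for u in range(n):
                # either avoid k, or route through k: v -> k -> u
                if S[v][k] + S[k][u] < S[v][u]:
                    S[v][u] = S[v][k] + S[k][u]
    return S
```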
In the above code, the outer loop tracks which nodes are allowed to be used (the current value of k determines P_k). The inner two loops deal with looking at every pair of nodes. We return a 2-D matrix where S[v][w] is the shortest distance from node v to w.
Merely glancing at the 3 nested for-loops should convince you that the running time of FW is O(V³).
In fact, FW is so elegant that it only needs 4 lines of code to express the key algorithm: 3 of which are for-loops, and one represents the sheer brilliance of the two computer scientists.
Note that there are V³ subproblems in FW: for each node (there are V nodes), you need to find the shortest path to all other nodes (again V nodes) using V sets of allowed intermediate nodes. So, V × V × V = O(V³). Another way to think about this is that we were originally using a 3-D matrix to store all our subproblems, but now we are overwriting entries in a 2-D matrix (since we don't need to store older values). In particular, each entry in the 2-D matrix is overwritten exactly V times (once for each iteration of the outermost loop) and since the size of the matrix is V², there must be V³ subproblems. Using this, we also observe that all our time is spent filling this table, and it takes O(1) to fill up a cell in the memo table since we are just calculating the minimum of two values. So, the running time is O(V³).
Path Reconstruction
What if you want the path, along with the path length between any two nodes?
There is also an optimal substructure for this! If z is the first hop on the shortest path from v to w, then the shortest path from v to w is the edge (v, z) followed by the shortest path from z to w.

So, rather than storing the entire path, it suffices to store just the first hop for each destination (e.g. routing table) and you can reconstruct the path between any pair of nodes.
You only need O(V2) space for this. For each pair of nodes, store the first node in the path. That is, path[v][w] stores the node that is 1 hop away from v in the path from v to w. Then, to reconstruct the path, keep looking at those 1-hop away nodes till you reach w.
For example, in the above diagram, path[v][w]=z. Then look at path[z][w] to determine the next node. And so on.
Actually, there is no reason we even need to store the "first" node. You can store any node (say, z) along the shortest path from v to w and then recursively find the shortest path from v to z and from z to w. In FW, you can store this intermediate node each time you modify/update the matrix entry for a pair.
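The first-hop bookkeeping described above can be maintained inside FW itself; a sketch in Python (the names nxt and reconstruct are illustrative, and the adjacency-matrix input is an assumption):

```python
def floyd_warshall_paths(w):
    """w[v][u] = edge weight, float('inf') if no edge.
    Returns (S, nxt): S[v][u] is the shortest distance, and
    nxt[v][u] is the node one hop after v on a shortest v -> u path."""
    INF = float("inf")
    n = len(w)
    S = [row[:] for row in w]
    nxt = [[u if w[v][u] < INF else None for u in range(n)] for v in range(n)]
    for v in range(n):
        S[v][v] = 0
        nxt[v][v] = v
    for k in range(n):
        for v in range(n):
            for u in range(n):
                if S[v][k] + S[k][u] < S[v][u]:
                    S[v][u] = S[v][k] + S[k][u]
                    # first hop toward k is now the first hop toward u
                    nxt[v][u] = nxt[v][k]
    return S, nxt

def reconstruct(nxt, v, u):
    """Follow the first-hop table from v until we reach u."""
    if nxt[v][u] is None:
        return None                     # no path exists
    path = [v]
    while v != u:
        v = nxt[v][u]
        path.append(v)
    return path
```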

Variants of FW
Transitive closure - Return a matrix M where M[v][w]=1 if there exists a path from v to w; M[v][w]=0 otherwise
Minimum bottleneck edge - For (v,w), the bottleneck edge of a path is the heaviest edge on that path. Return a matrix B where B[v][w] = the minimum, over all paths from v to w, of the bottleneck edge weight.
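Both variants reuse the same triple loop; only the relaxation step changes. A sketch in Python (input formats are assumptions, matching the matrix representation used above):

```python
def transitive_closure(adj):
    """adj[v][w] is truthy iff the edge v -> w exists.
    Returns M with M[v][w] = 1 iff w is reachable from v
    (by convention, every node reaches itself)."""
    n = len(adj)
    M = [[1 if adj[v][w] else 0 for w in range(n)] for v in range(n)]
    for v in range(n):
        M[v][v] = 1
    for k in range(n):
        for v in range(n):
            for w in range(n):
                # reachable either without k, or via k
                M[v][w] = M[v][w] or (M[v][k] and M[k][w])
    return M

def min_bottleneck(w):
    """w[v][u] = edge weight, float('inf') if no edge.
    Returns B with B[v][u] = min over v -> u paths of the heaviest edge."""
    n = len(w)
    B = [row[:] for row in w]
    for k in range(n):
        for v in range(n):
            for u in range(n):
                # either avoid k, or take the heavier of the two halves via k
                B[v][u] = min(B[v][u], max(B[v][k], B[k][u]))
    return B
```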