APSP Floyd-Warshall

Problem

Given a directed weighted graph (with no negative weight cycles), we want to be able to answer queries of the form “what is the minimum distance between vv and ww?” quickly. This is a classic All Pairs Shortest Path Algorithm.

Attempt 1

Do not pre-process the graph at all. Run Dijkstra each time you get a query (if the query has the same source vertex as a previous query, no need to run Dijsktra again).

Then, to answer QQ queries (Q>V)(Q > V), it would take O(VElogV)O(VElogV). In case of a dense graph, this becomes O(V3logV)O(V^3logV)

Floyd-Warshall (FW)

Floyd and Warshall gave a beautifully simple algorithm to calculate APSP in O(V3)O(V^3) time. So, it is better to use FW rather than repeatedly running Dijsktra in case of a dense graph. In case of a sparse graph, running Dijkstra VV times is faster than FW since logVlogV grows slower than E=O(V)E = O(V). In fact, in case of a sparse graph, we don’t know how to solve APSP faster than O(V2logV)O(V^2logV).

Notice that if all the edge weights were identical, we could simply run BFS/DFS from each node and get the APSP in O(VE)O(VE) time.

FW is also useful for finding the diameter of a graph (since diameter involves knowing the shortest path between all pairs of nodes, and then taking the maximum one)

FW outputs dist[v,w]dist[v,w] to be the shortest distance from vv to ww, for all pairs of vertices (v,w)(v,w).

Why can we use DP here? Well, shortest paths have an amazing amount of optimal sub-structure: We know that if PP is the shortest path from uu to vv and it contains ww, then PP contains the shortest path from uu to ww and from ww to v.v. So, a shortest path between two nodes is composed of the shortest path between many intermediary nodes too.

Moreover, (and more importantly) many shortest path calculations depend on the same sub-pieces. That is, they have overlapping subproblems!

Although it is easy to understand and believe that shortest paths have overlapping subproblems, it is much more difficult to correctly identify these right subproblems?

This is where the genius of Floyd and Warshall truly shines:

Let S[v,w,P]S[v,w,P] be the shortest path from vv to ww that only uses intermediate nodes (if any) from the set PP. In other words, S[v,w,P]S[v,w,P] is the length of the shortest path from vv to ww that does not include any vertices not in PP.

Let e(v,w)e(v,w) be the weight of an edge from vv to ww.

Then, our base case is: S[v,w,ϕ]=e(v,w)S[v,w,\phi]= e(v,w) (where ϕ\phi represents the empty set). That is, if you cannot use any intermediate nodes, the only way to get from vv to ww is through a direct edge, if any. If no such edge exists, e(v,w)=e(v,w) = \infty.

But the problem now is that there are so many possible such sets PP. In particular, there are 2V2^V possible subsets of PP that are all candidates and possible subproblems that we need to solve. But the question is, do we really need to solve all of them? Or can we only choose a few to solve and still guarantee that we get the correct answer.

This step represents another flair of creativity and intelligence by Floyd and Warshall. They realised that they only needed to solve n+1n + 1 subproblems (where n=Vn = V). In particular, only the following sets need to be considered as subproblems:

P0=ϕP_0= \phi

P1=1P_1 = {1}

P2=1,2P_2 = {1,2}

P3=1,2,3P_3 = {1,2,3}

\dots

Pn=1,2,3,,nP_n = {1,2,3,\dots,n}

At each step, as we grow our set PP, we are allowed to use more nodes as intermediate nodes if necessary. So, in the last step, we are allowed to use all possible nn nodes as intermediate nodes, giving us the correct shortest paths.

How do we obtain the recurrence relation now, using the pre-calculated subproblems? Say, we are trying to calculate S[v,w,P8]S[v,w,P_8] and we have already calculated all the shortest paths when we are allowed to use nodes in P7P_7 as intermediaries.

Then, there are two possibilities:

  1. The shortest path in S[v,w,P8]S[v,w,P_8] includes 8 (the new shortest path contains 8 as an intermediate node)

    1. Since 88 is an intermediate node along the shortest path from vv to ww using only the nodes in P8P_8, the shortest distance from vv to ww along this new path must be S[v,8,P7]+S[8,w,P7]S[v,8,P_7] + S[8,w,P_7]

    2. We cannot have any other nodes like 9,10,11,9,10,11,\dots in this path from vv to 88 or from 88 to ww since they do not appear in the overall path from vv to ww using P8P_8.

  2. The shortest path in S[v,w,P8]S[v,w,P_8] does not include 8 (that is, even though we are allowed to use 88, it does not give us a shorter path)

So, the recurrence relation is S[v,w,P8]=min( S[v,w,P7] , S[v,8,P7]+S[8,w,P7] )S[v,w,P_8] = min(\ S[v,w,P_7]\ ,\ S[v,8,P_7] + S[8,w,P_7]\ ).

Optimising Space

Notice that we don’t actually need to store a 3-D matrix for S. To calculate the APSP using intermediate nodes in PiP_i we only need the values from . So, a single 2-D matrix suffices as we store the current value and keep over-writing the old values, updating as and when necessary. Notice that it can never be the case that the shortest path from vv to kk includes kk as an intermediate node (as that would indicate a negative cycle containing kk). So we don’t need to worry about some values already using PkP_k while others using Pk1P_{k-1} while we are adding kk to the set of allowed intermediate nodes as we progressively update the dp table.

Pseudocode

In the above code, the outer loop tracks which nodes are allowed to be used (the current value of kk determines PkP_k). The inner 2 loops, deal with looking at every pair of nodes. We return a 2-D matrix where S[v][w]S[v][w] is the shortest path from node vv to ww.

Merely glancing at the 3 nested for loops should convince you that the running time of FW is O(V3)O(V^3)

In fact, FW is so elegant that it only needs 4 lines of code to express the key algorithm: 3 of which are for-loops, and one represents the sheer brilliance of the two computer scientists.

Note that there are V3V^3 subproblems in FW: for each node (there are VV nodes), you need to find the shortest path to all other nodes (again VV nodes) using VV sets of allowed intermediate nodes. So, V×V×VV\times V \times V = O(V3)O(V^3). Another way to think about this is that we were originally using a 3-D matrix to store all our subproblems but now we are overwriting entries in a 2-D matrix (since we don’t need to store older values). In particular, each entry in the 2-D matrix is overwritten exactly VV times (once for each iteration of the outermost loop) and since the size of the matrix is V2V^2, there must be V3V^3 subproblems. Using this, we also observe that all our time is spent filling this table and it takes is O(1)O(1) to fill up a cell in the memo table since we are just calculating the minimum of two values. So, the running time is O(V3)O(V^3).

Path Reconstruction

What if you want the path, along with the path length between any two nodes?

There is also an optimal substructure for this! If zz is the first hop on the shortest path from vv to ww, then the shortest path from vv to ww is z+z + shortest path from zz to ww.

So, rather than storing the entire path, it suffices to store just the first hop for each destination (e.g. routing table) and you can reconstruct the path between any pair of nodes.

You only need O(V2)O(V^2) space for this. For each pair of nodes, store the first node in the path. That is, path[v][w]path[v][w] stores the node that is 1 hop away from vv in the path from vv to ww. Then, to reconstruct the path, keep looking at those 1-hop away nodes till you reach ww.

For example, in the above diagram, path[v][w]=zpath[v][w] = z. Then look at path[z][w]path[z][w] to determine the next node. And so on.

Actually, there is no reason we need to even store the “first” node. You can store any node (say, zz) along the shortest path from vv to ww and then recursively find the shortest path from vv to zz and zz to ww. In FW, you can store this intermediate node each time you modify/update the matrix entry for a pair.

Variants of FW

  1. Transitive closure - Return a matrix MM where M[v][w]=1M[v][w] = 1 if there exists a path from vv to ww; M[v][w]=0M[v][w] = 0 otherwise

  2. Minimum bottleneck edge - For (v,w)(v,w), the bottleneck edge is the heaviest edge on a path between vv and ww. Return a matrix BB where B[v][w]B[v][w] = weight of the minimum bottleneck along vv to ww.

Last updated