
APSP Floyd-Warshall

Problem

Given a directed weighted graph (with no negative weight cycles), we want to answer queries of the form “what is the minimum distance between $v$ and $w$?” quickly. This is the classic All-Pairs Shortest Paths (APSP) problem.

Attempt 1

Do not pre-process the graph at all. Run Dijkstra each time you get a query (if the query has the same source vertex as a previous query, there is no need to run Dijkstra again).

Then, to answer $Q$ queries (with $Q > V$), it would take $O(VE \log V)$ time, since in the worst case we end up running Dijkstra from every one of the $V$ vertices. In case of a dense graph ($E = O(V^2)$), this becomes $O(V^3 \log V)$.

Floyd-Warshall (FW)

Floyd and Warshall gave a beautifully simple algorithm that calculates APSP in $O(V^3)$ time. So, it is better to use FW rather than repeatedly running Dijkstra in case of a dense graph. In case of a sparse graph ($E = O(V)$), running Dijkstra $V$ times is faster than FW since $\log V$ grows slower than $V$. In fact, in case of a sparse graph, we don't know how to solve APSP faster than $O(V^2 \log V)$.
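
For concreteness, here is a quick back-of-the-envelope comparison (not from the original page, just plugging $E = O(V^2)$ for dense graphs and $E = O(V)$ for sparse graphs into the $O(VE \log V)$ cost of running Dijkstra $V$ times):

$$\text{dense: } V \cdot V^2 \log V = V^3 \log V \gg V^3 \qquad\qquad \text{sparse: } V \cdot V \log V = V^2 \log V \ll V^3$$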

Notice that if all the edge weights were identical, we could simply run BFS from each node and get the APSP in $O(VE)$ time.

FW is also useful for finding the diameter of a graph (since the diameter is the maximum, over all pairs of nodes, of the shortest-path distance between them).

FW outputs $dist[v,w]$, the shortest distance from $v$ to $w$, for all pairs of vertices $(v,w)$.

Why can we use DP here? Well, shortest paths have an amazing amount of optimal sub-structure: we know that if $P$ is the shortest path from $u$ to $v$ and it contains $w$, then $P$ contains the shortest path from $u$ to $w$ and the shortest path from $w$ to $v$. So, a shortest path between two nodes is composed of shortest paths between many intermediate nodes too.

Moreover, (and more importantly) many shortest path calculations depend on the same sub-pieces. That is, they have overlapping subproblems!

Although it is easy to understand and believe that shortest paths have overlapping subproblems, it is much more difficult to correctly identify the right subproblems.

This is where the genius of Floyd and Warshall truly shines:

Let $S[v,w,P]$ be the length of the shortest path from $v$ to $w$ that only uses intermediate nodes (if any) from the set $P$. In other words, $S[v,w,P]$ is the length of the shortest path from $v$ to $w$ whose intermediate vertices all lie in $P$.

Let $e(v,w)$ be the weight of the edge from $v$ to $w$.

Then, our base case is $S[v,w,\phi] = e(v,w)$ (where $\phi$ represents the empty set). That is, if you cannot use any intermediate nodes, the only way to get from $v$ to $w$ is through a direct edge, if any. If no such edge exists, $e(v,w) = \infty$.

But the problem now is that there are very many possible such sets $P$. In particular, there are $2^V$ possible subsets of the vertex set, each of which is a candidate $P$ and a subproblem we might need to solve. But do we really need to solve all of them? Or can we choose only a few to solve and still guarantee that we get the correct answer?

This step represents another flash of creativity and intelligence by Floyd and Warshall. They realised that only $n + 1$ such sets need to be considered (where $n = V$). In particular, only the following sets need to be considered as subproblems:

$P_0 = \phi$

$P_1 = \{1\}$

$P_2 = \{1, 2\}$

$P_3 = \{1, 2, 3\}$

$\dots$

$P_n = \{1, 2, 3, \dots, n\}$

At each step, as we grow our set $P$, we are allowed to use more nodes as intermediate nodes if necessary. So, in the last step, we are allowed to use all possible $n$ nodes as intermediate nodes, giving us the correct shortest paths.

How do we obtain the recurrence relation now, using the pre-calculated subproblems? Say we are trying to calculate $S[v,w,P_8]$ and we have already calculated all the shortest paths when we are allowed to use nodes in $P_7$ as intermediaries.

Then, there are two possibilities:

  1. The shortest path in $S[v,w,P_8]$ includes $8$ (the new shortest path contains $8$ as an intermediate node)

    1. Since $8$ is an intermediate node along the shortest path from $v$ to $w$ using only the nodes in $P_8$, the shortest distance from $v$ to $w$ along this new path must be $S[v,8,P_7] + S[8,w,P_7]$

    2. We cannot have any other nodes like $9, 10, 11, \dots$ in this path from $v$ to $8$ or from $8$ to $w$ since they do not appear in the overall path from $v$ to $w$ using $P_8$.

  2. The shortest path in $S[v,w,P_8]$ does not include $8$ (that is, even though we are allowed to use $8$, it does not give us a shorter path)

So, the recurrence relation is $S[v,w,P_8] = \min(\ S[v,w,P_7]\ ,\ S[v,8,P_7] + S[8,w,P_7]\ )$.
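
More generally, for step $k$ (this is just the example above with $8$ replaced by an arbitrary $k$):

$$S[v,w,P_k] = \min(\ S[v,w,P_{k-1}]\ ,\ S[v,k,P_{k-1}] + S[k,w,P_{k-1}]\ )$$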

Optimising Space

Notice that we don't actually need to store a 3-D matrix for $S$. To calculate the APSP using intermediate nodes in $P_i$, we only need the values computed for $P_{i-1}$. So, a single 2-D matrix suffices: we store the current values and keep over-writing the old ones, updating as and when necessary. Notice that it can never be the case that the shortest path from $v$ to $k$ includes $k$ as an intermediate node (as that would indicate a negative cycle containing $k$). So, while we are adding $k$ to the set of allowed intermediate nodes as we progressively update the DP table, we don't need to worry about some entries already using $P_k$ while others are still using $P_{k-1}$.

Pseudocode
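
The original page shows the code as an image; below is a minimal Java sketch of the same idea, assuming the graph is given as a $V \times V$ adjacency matrix `e` of edge weights (the names `floydWarshall`, `INF`, and `S` are illustrative, not from the original).

```java
// Minimal Floyd-Warshall sketch. Assumes e[v][w] is the weight of edge (v, w),
// e[v][v] = 0, and e[v][w] = INF if there is no edge from v to w.
static final int INF = Integer.MAX_VALUE / 2;   // halved to avoid overflow when adding

static int[][] floydWarshall(int[][] e) {
    int n = e.length;
    int[][] S = new int[n][n];
    for (int v = 0; v < n; v++)
        S[v] = e[v].clone();                    // base case: S[v][w] = e(v, w)

    for (int k = 0; k < n; k++)                 // allow node k as an intermediate node
        for (int v = 0; v < n; v++)
            for (int w = 0; w < n; w++)
                // either skip k, or go v -> k -> w
                S[v][w] = Math.min(S[v][w], S[v][k] + S[k][w]);
    return S;
}
```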

In the above code, the outer loop tracks which nodes are allowed to be used (the current value of $k$ determines $P_k$). The inner two loops deal with looking at every pair of nodes. We return a 2-D matrix where $S[v][w]$ is the shortest distance from node $v$ to $w$.

Merely glancing at the 3 nested for-loops should convince you that the running time of FW is $O(V^3)$.

In fact, FW is so elegant that it only needs 4 lines of code to express the key algorithm: 3 of which are for-loops, and one represents the sheer brilliance of the two computer scientists.

Note that there are $V^3$ subproblems in FW: for each node (there are $V$ nodes), you need to find the shortest path to all other nodes (again $V$ nodes) using $V$ sets of allowed intermediate nodes. So, $V \times V \times V = O(V^3)$. Another way to think about this is that we were originally using a 3-D matrix to store all our subproblems, but now we are overwriting entries in a 2-D matrix (since we don't need to store older values). In particular, each entry in the 2-D matrix is overwritten exactly $V$ times (once for each iteration of the outermost loop), and since the size of the matrix is $V^2$, there must be $V^3$ subproblems. Using this, we also observe that all our time is spent filling this table, and it takes $O(1)$ to fill up a cell in the memo table since we are just calculating the minimum of two values. So, the running time is $O(V^3)$.

Path Reconstruction

What if you want the path, along with the path length between any two nodes?

There is also an optimal substructure for this! If $z$ is the first hop on the shortest path from $v$ to $w$, then the shortest path from $v$ to $w$ is the edge $(v, z)$ followed by the shortest path from $z$ to $w$.

So, rather than storing the entire path, it suffices to store just the first hop for each destination (e.g. routing table) and you can reconstruct the path between any pair of nodes.

You only need $O(V^2)$ space for this. For each pair of nodes, store the first node in the path. That is, $path[v][w]$ stores the node that is 1 hop away from $v$ in the path from $v$ to $w$. Then, to reconstruct the path, keep looking at those 1-hop-away nodes till you reach $w$.

For example, if $z$ is the first hop on the shortest path from $v$ to $w$, then $path[v][w] = z$. Then look at $path[z][w]$ to determine the next node, and so on.
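
As a small sketch (not from the original page), this is how the reconstruction loop might look in Java, assuming a first-hop matrix `path` as described above (with `path[v][w] == w` when the shortest path is the direct edge):

```java
// Follow the stored first hops from v until we reach w.
// Assumes w is reachable from v.
static java.util.List<Integer> reconstruct(int[][] path, int v, int w) {
    java.util.List<Integer> route = new java.util.ArrayList<>();
    route.add(v);
    while (v != w) {
        v = path[v][w];   // the node 1 hop away from v on the shortest v -> w path
        route.add(v);
    }
    return route;
}
```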

Actually, there is no reason we need to even store the “first” node. You can store any node (say, $z$) along the shortest path from $v$ to $w$ and then recursively find the shortest path from $v$ to $z$ and from $z$ to $w$. In FW, you can store this intermediate node each time you modify/update the matrix entry for a pair.
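
Here is a sketch of that idea (again illustrative, not from the original page): record an intermediate node whenever an entry improves, then reconstruct recursively. The names `mid` and `buildPath` are made up for this example.

```java
// mid[v][w] = -1 means the shortest v -> w path is just the direct edge (v, w);
// otherwise mid[v][w] is some intermediate node on that shortest path.
static int[][] mid;

static int[][] floydWarshallWithPaths(int[][] e) {    // e as in the earlier sketch
    int n = e.length;
    int[][] S = new int[n][n];
    mid = new int[n][n];
    for (int v = 0; v < n; v++) {
        S[v] = e[v].clone();
        java.util.Arrays.fill(mid[v], -1);
    }
    for (int k = 0; k < n; k++)
        for (int v = 0; v < n; v++)
            for (int w = 0; w < n; w++)
                if (S[v][k] + S[k][w] < S[v][w]) {
                    S[v][w] = S[v][k] + S[k][w];
                    mid[v][w] = k;                     // k lies on the new shortest v -> w path
                }
    return S;
}

// Appends the vertices strictly between v and w (in order) to path.
static void buildPath(int v, int w, java.util.List<Integer> path) {
    int k = mid[v][w];
    if (k == -1) return;                               // direct edge: nothing in between
    buildPath(v, k, path);
    path.add(k);
    buildPath(k, w, path);
}
```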

Variants of FW

  1. Transitive closure - Return a matrix $M$ where $M[v][w] = 1$ if there exists a path from $v$ to $w$, and $M[v][w] = 0$ otherwise.

  2. Minimum bottleneck edge - For a path between $v$ and $w$, its bottleneck edge is the heaviest edge on that path. Return a matrix $B$ where $B[v][w]$ is the minimum, over all paths from $v$ to $w$, of the weight of the bottleneck edge. A sketch of both variants follows below.
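
Both variants reuse the same triple loop with a different “combine” step. A minimal Java sketch of the two (method names are illustrative):

```java
// Variant 1: transitive closure. reach[v][w] is true iff there is a path v -> w.
static boolean[][] transitiveClosure(boolean[][] adj) {
    int n = adj.length;
    boolean[][] reach = new boolean[n][n];
    for (int v = 0; v < n; v++) reach[v] = adj[v].clone();
    for (int k = 0; k < n; k++)
        for (int v = 0; v < n; v++)
            for (int w = 0; w < n; w++)
                reach[v][w] = reach[v][w] || (reach[v][k] && reach[k][w]);
    return reach;
}

// Variant 2: minimum bottleneck edge. B[v][w] is the smallest possible
// "heaviest edge" over all paths from v to w (INF if w is unreachable).
static int[][] minBottleneck(int[][] e) {   // e uses INF for missing edges, as before
    int n = e.length;
    int[][] B = new int[n][n];
    for (int v = 0; v < n; v++) B[v] = e[v].clone();
    for (int k = 0; k < n; k++)
        for (int v = 0; v < n; v++)
            for (int w = 0; w < n; w++)
                B[v][w] = Math.min(B[v][w], Math.max(B[v][k], B[k][w]));
    return B;
}
```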
