from ipynb.fs.full.Graph import DirectedAdjList, Edge
A flow network is a directed graph where each edge is labeled with a capacity (non-negative number) and there are two distinguished vertices: the source $s$ and sink $t$.
A flow is a function $f$ that maps each edge to a number such that
The value of a flow, written $|f|$, is the sum of the flows out of the source minus the flows into the source. $$ |f| = \sum_u f(s,u) - \sum_v f(v,s) $$
The maximum flow problem is to find the flow with the maximum value for a given flow network.
name = ['A','B','C','D','E','S','T']
def vertex_name(u):
return name[u]
id = {'A':0,'B':1,'C':2,'D':3,'E':4,'S':5,'T':6}
named_capacity = {('S','A'):2, ('S','B'):3, ('S','C'): 1, ('A','D'):1,
('A','C'):2, ('B','C'):2, ('B','E'):2, ('C','D'):2,
('C', 'E'): 4, ('C', 'T'): 2, ('D','T'):2, ('E','T'):1}
capacity = {Edge(id[e[0]],id[e[1]]):c for (e,c) in named_capacity.items()}
def label_capacity(capacity):
return lambda e: str(capacity[e])
edges = capacity.keys()
G = DirectedAdjList(7, edges, vertex_label=vertex_name,
edge_label=label_capacity(capacity))
G.show()
In the following graph, we've labeled each edge with two number, written $f:c$, where $f$ is the flow and $c$ is the capacity. The flow value is $4$, the sum of the three out edges of $S$.
flow_by_name = {('S','A'): 1, ('S','B'): 2, ('S','C'): 1, ('A','C'): 0,
('A','D'):1, ('B','C'): 2, ('B','E'):0, ('C','D'):0,
('C','E'):1, ('C','T'):2, ('D','T'): 1, ('E','T'): 1 }
flow = {Edge(id[e[0]],id[e[1]]):f for (e,f) in flow_by_name.items()}
def label_flow(flow, capacity):
return lambda e: str(flow[e]) + ":" + str(capacity[e])
G = DirectedAdjList(G.num_vertices(), edges, vertex_name, label_flow(flow, capacity))
G.show()
The residual network of a flow network $G$ and flow $f$ is another graph $G_f$ with the same vertices as $G$. For each edge $(u,v) \in G$:
The residual network includes just the edges that can be used to add or remove flow from the network.
We can go in the other direction and compute the residual capacity of an edge in the residual network, $e \in G_f$.
def residual_capacity(e, flow, capacity, G):
if e in G.edges():
return capacity[e] - flow[e]
elif e.flip() in G.edges():
return flow[e.flip()]
else:
return 0
The following is the residual network for the example graph. (Ignoring the backward edges for now.)
def residual_network(G, flow, capacity):
residual_capacity = {}
G_flow = DirectedAdjList(G.num_vertices(), vertex_label=G.vertex_label,
edge_label=label_capacity(residual_capacity))
for e in G.edges():
residual = capacity[e] - flow[e]
if residual > 0:
new_e = G_flow.add_edge(e.source, e.target)
residual_capacity[new_e] = residual
if flow[e] > 0:
new_e = G_flow.add_edge(e.target, e.source)
residual_capacity[new_e] = flow[e]
return (G_flow, residual_capacity)
(G_flow, residual_cap) = residual_network(G, flow, capacity)
G_flow.show()
Instead of explicitly creating the residual network, we sometimes would just like to navigate it by creating its edges on the fly.
def residual_out_edges(u, G, flow, capacity):
for e in G.out_edges(u):
if capacity[e] - flow[e] > 0:
yield e
for e in G.in_edges(u):
if flow[e] > 0:
yield e.flip()
An augmenting path is a path from the source to the sink (without cycles) in the residual network.
The residual capacity of an augmenting path is the smallest of the residual capacities of all the edges on the path.
def residual_capacity_of_path(path, capacity):
return min([capacity[e] for e in path])
Lemma 26.2 and Corollary 26.3 Given a flow network $G$ and a flow $f$, you can add the residual capacity of any augmenting path in $G_f$ to all the edges in the path to obtain a new flow $f'$ and $|f'| > |f|$.
We can find an augmenting path by applying Depth-First Search to the residual network. The following is an augmenting path in the example graph with a residual capacity of $1$.
from ipynb.fs.full.DepthFirstSearch import DFS_find_path
augmenting_path = DFS_find_path(G_flow, id['S'], id['T'])
def color_resid_path(path):
on_path = set(path)
def color(e):
if e in on_path:
return "red"
else:
return "black"
return color
G_flow_aug = DirectedAdjList(G_flow.num_vertices(), G_flow.edges(),
vertex_label=vertex_name, edge_label=label_capacity(residual_cap),
edge_color=color_resid_path(augmenting_path))
path_resid_cap = residual_capacity_of_path(augmenting_path, residual_cap)
print("residual capacity of path: " + str(path_resid_cap))
G_flow_aug.show()
residual capacity of path: 1
And here's the flow network with the residual capacity added to the flows on the augmenting path.
def augment_flow(aug_path, flow, resid_cap, edge_set):
for e in aug_path:
if e in edge_set:
flow[e] = flow[e] + resid_cap
else:
flow[e.flip()] = flow[e.flip()] - resid_cap
augment_flow(augmenting_path, flow, path_resid_cap, G.edges())
G_aug = DirectedAdjList(G.num_vertices(), G.edges(), G.vertex_label, label_flow(flow, capacity))
G_aug.show()
The Ford-Fulkerson method repeatedly finds augmenting paths and adds their residual capacity to the flow.
It stops when it can no longer find an augmenting path.
But does that coincide with finding the maximum flow?
This theorem will tell us that a flow is maximum if and only if there are no augmenting paths in the residual network.
But first we have to learn what a min-cut is.
A cut $(S,T)$ of a flow network includes the source on one side $S$ and the sink on the other $T$.
The net flow across a cut $f(S,T)$ is the sum of the flows from $S$ to $T$ minus the flows from $T$ to $S$.
The capacity of a cut $c(S,T)$ is the sum of the capacities of the edges that cross from $S$ to $T$.
The minimum cut of a network is a cut whose capacity is minimum with respect to all other cuts.
Lemma 26.4 The net flow across any cut is equal to the value of the flow.
The following is a cut in the example graph. The net flow across this cut is $4$ which indeed matches the flow value (the sum of the flows leaving $S$). The capacity of this cut is $11$.
from graphviz import Source
Source.from_file('cut-flow.dot', engine='neato')
Corollary 26.5 The value of every flow is no larger than the capacity of any cut.
Theorem 26.6 (Max-flow min-cut theorem) The following conditions are equivalent:
Next we show that (2) implies (3). Since there are no augmenting paths, we create a cut $S$ of all the nodes reachable from the source in the residual graph and put the rest in $T$. By Lemma 26.4, we have $$ |f| = f(S,T) $$ and by definition, $$ f(S,T) = \sum_{u\in S} \sum_{v\in T} f(u,v) - \sum_{v\in T}\sum_{u\in S} f(v,u) $$
Thus, the above equation simplifies to $$ f(S,T) = \sum_{u\in S} \sum_{v\in T} c(u,v) $$ which means $$ f(S,T) = c(S,T) $$
Finally, we show that (3) implies (1). So there is a cut $(S,T)$ such that $|f| = c(S,T)$. Suppose $f'$ is another flow. By Corollary 26.5, $|f'| \leq c(S,T)$. So $|f'| \leq |f|$.
QED
The algorithm performs the following steps:
flow
on every edge to 0.G_flow
and look for an augmenting path aug_path
.flow
.def ford_fulkerson_max_flow(G, s, t, capacity):
flow = {e:0 for e in G.edges()}
(G_flow,residual_capacity) = residual_network(G, flow, capacity)
aug_path = DFS_find_path(G_flow, s, t)
while aug_path:
resid_cap = residual_capacity_of_path(aug_path, residual_capacity)
augment_flow(aug_path, flow, resid_cap, G.edges())
G_aug = DirectedAdjList(G.num_vertices(), G.edges(),
G.vertex_label, label_flow(flow, capacity),
color_path(aug_path))
display(G_aug.show())
(G_flow,residual_capacity) = residual_network(G, flow, capacity)
aug_path = DFS_find_path(G_flow, s, t)
return flow
The following shows each step of the Ford-Fulkerson algorithm on the example graph. Some augmenting paths involve a "backward" edge in the residual graph, which we depict as a blue edge in the graph. The augmenting path traverses the backward edge in reverse, which removes some flow from that edge and adds it to another edges.
def color_path(path):
on_path = set(path)
def color(e):
if e in on_path:
return "red"
elif e.flip() in on_path:
return "blue"
else:
return "black"
return color
max_flow = ford_fulkerson_max_flow(G, id['S'], id['T'], capacity)
DirectedAdjList(G.num_vertices(), G.edges(), G.vertex_label,
label_flow(max_flow, capacity)).show()
while
loop iterates at most $|f|$ times, incrementing the flow by at least $1$ each time.residual_network
is $O(m)$ for the same reason. DFS_find_path
is $O(n + m)$.residual_capacity
is $O(m)$ (in the worst case, the path contains all the edges).The Edmonds-Karp algorithm is almost the same as Ford-Fulkerson but it instead uses Breadth-First Search to find augmenting paths.
from ipynb.fs.full.BreadthFirstSearch import BFS_find_path
def edmonds_karp_max_flow(G, s, t, capacity):
flow = {e:0 for e in G.edges()}
while True:
(G_flow, residual_capacity) = residual_network(G, flow, capacity)
aug_path = BFS_find_path(G_flow, s, t)
if not aug_path:
break;
resid_cap = residual_capacity_of_path(aug_path, residual_capacity)
augment_flow(aug_path, flow, resid_cap, G.edges())
G_aug = DirectedAdjList(G.num_vertices(), G.edges(),
G.vertex_label, label_flow(flow, capacity),
color_path(aug_path))
display(G_aug.show())
return flow
max_flow = edmonds_karp_max_flow(G, id['S'], id['T'], capacity)
DirectedAdjList(G.num_vertices(), G.edges(), G.vertex_label,
label_flow(max_flow, capacity)).show()
The Edmonds-Karp algorithm has time complexity $O(n m^2)$.
This is because the number of iterations through the while
loop is $O(nm)$
(explained below) and the body of the loop is $O(m)$.
By using BFS we restrict the search for augmenting paths to shortest paths.
Consider the distance from the source to all other vertices in the residual network. That distance either stays the same or increases after each augmentation. The reason is that when we augment the flow, at least one edge $(u,v)$ on a shortest path is removed from the residual graph, causing the distances to vertices further down the path to possibly increase. That edge may get added back later via flow through its "backward" edge, but when it does, it will be via a strictly longer path. Here's why:
Now, a path can only get so long, up to the number of vertices. So the maximum number of times we can augment the flow is the number of edges times the number of vertices: $O(nm)$.
The push-relabel approach to computing maximum flow is more efficient than Edmonds-Karp, with an $O(n^3)$ time complexity. It is based on the following ideas:
relabel
operation.At the beginning, the source is placed at height $n$ and all other vertices at height $0$. The neighbors of the source are given excess flow equal to the capacity of the edge from the source. The source is given a negative excess equal to the capacities of its out-edges.
The push operation can be applied along an edge when the source node has positive excess flow, the residual capacity of the edge is positive, and the height of the source is one greater than the target.
def can_push(e, excess, flow, capacity, height, G):
return excess[e.source] > 0 \
and residual_capacity(e, flow, capacity, G) > 0 \
and height[e.source] == height[e.target] + 1
The push operation moves as much of the excess flow as it can (up to the residual capacity of the edge)
from the source to the target of the edge. It also updates the flow for the edge. We'll explain the update_overflow
function later.
def push(e, excess, height, flow, capacity, G, overflow = None, overflow_node = None):
delta = min(excess[e.source], residual_capacity(e, flow, capacity, G))
if e in G.edges():
flow[e] = flow[e] + delta
else:
flow[e.flip()] = flow[e.flip()] - delta
excess[e.source] = excess[e.source] - delta
excess[e.target] = excess[e.target] + delta
update_overflow(e.source, excess, overflow, overflow_node)
update_overflow(e.target, excess, overflow, overflow_node)
def update_overflow(u, excess, overflow, overflow_node):
if overflow != None:
if excess[u] > 0 and overflow_node[u] == None:
overflow_node[u] = overflow.insert(u)
if excess[u] == 0 and overflow_node[u]:
overflow.remove(overflow_node[u])
overflow_node[u] = None
A saturating push is a push that increases the flow on the edge to be equal to the capacity. A saturating push causes the edge to be removed from the residual network.
The relabel operation is applicable to a node when it contains excess flow and all of its neighbors in the residual network are at equal or greater height.
def can_relabel(u, height, excess, G, flow, capacity):
adj = [e.target for e in residual_out_edges(u, G, flow, capacity)]
return excess[u] > 0 and all([height[u] <= height[v] for v in adj]) \
and len(G_flow.adjacent(u)) > 0
The relabel operation increase the height of the node to one greater than the lowest of its neighbors.
def relabel(u, height, G, flow, capacity):
adj = [e.target for e in residual_out_edges(u, G, flow, capacity)]
assert len(adj) > 0
height[u] = 1 + min([height[v] for v in adj])
We'll use the following function to display the excess flow and height of each node.
def label_excess(excess, height):
return lambda u: name[u] + ' e' + str(excess[u]) + ' h' + str(height[u])
Consider again the following example network that has been initialized as described above.
flow = { e:0 for e in G.edges()}
height = [0 for u in G.vertices()]
excess = [0 for u in G.vertices()]
height[id['S']] = G.num_vertices()
for e in G.out_edges(id['S']):
flow[e] = capacity[e]
excess[e.target] = capacity[e]
excess[id['S']] = excess[id['S']] - capacity[e]
G = DirectedAdjList(G.num_vertices(), edges, label_excess(excess,height), label_flow(flow, capacity))
G.show()
Node A
has excess flow but no downhill neighbors, so we raise it to height 1.
if can_relabel(id['A'], height, excess, G, flow, capacity):
relabel(id['A'], height, G, flow, capacity)
G.show()
Now we can push the excess from node A
to node C
.
e = Edge(id['A'],id['C'])
if can_push(e, excess, flow, capacity, height, G):
push(e, excess, height, flow, capacity, G)
G.show()
Similarly, we can relabel C
and then push 2 units of its excess to T
.
if can_relabel(id['C'], height, excess, G, flow, capacity):
relabel(id['C'], height, G, flow, capacity)
e = Edge(id['C'],id['T'])
if can_push(e, excess, flow, capacity, height, G):
push(e, excess, height, flow, capacity, G)
G.show()
By repeatedly pushing and relabeling, the maximum flow will eventually
reach the sink T
.
However, there is often some excess flow remaining in some of the vertices. That means the flow does not yet satisfy the "flows-in equals flows-out" constraint.
So the process of pushing and relabeling continues, raising many of the vertices above the source vertex and returning all the excess flow to the source.
For efficiency, we'd like to quickly identify nodes that can be relabeled or from which
we can push some flow. Those are nodes that are overflowing, i.e., that have positive excess flow.
So we maintain a list of nodes named overflow
. The update_overflow
function adds or removes
the given node from the overflow
.
The algorithm proceeds as follows:
overflow
list.overflow
list, repeat the following stepsoverflow
list.from llist import dllist
def goldberg_push_relabel_max_flow(G, s, t, capacity):
# initialize the auxiliary data structures
height = [0 for u in G.vertices()]
excess = [0 for u in G.vertices()]
height[s] = G.num_vertices()
overflow = dllist()
overflow_node = [None for u in G.vertices()]
flow = { e:0 for e in G.edges()}
# push flow from the source to its neighbors
for e in G.out_edges(s):
flow[e] = capacity[e]
excess[e.target] = capacity[e]
update_overflow(e.target, excess, overflow, overflow_node)
excess[s] = excess[s] - capacity[e]
# the main loop, repeatedly relabel and push from the overflowing nodes
while len(overflow) > 0:
u = overflow.first.value
if u == t:
overflow.popleft()
continue
if can_relabel(u, height, excess, G, flow, capacity):
relabel(u, height, G, flow, capacity)
else:
for e in residual_out_edges(u, G, flow, capacity):
if can_push(e, excess, flow, capacity, height, G):
push(e, excess, height, flow, capacity, G, overflow, overflow_node)
break
return flow
G = DirectedAdjList(G.num_vertices(), G.edges(), vertex_label=vertex_name)
max_flow = goldberg_push_relabel_max_flow(G, id['S'], id['T'], capacity)
DirectedAdjList(G.num_vertices(), G.edges(), vertex_label=vertex_name,
edge_label=label_flow(max_flow, capacity)).show()
Let $n$ be the number of vertices in the graph and $m$ the number of edges.
So the total time is dominated by the number of non-saturating pushes. Assuming $n < m$, the time complexity is $O(n^2 m)$.