# The secrets of submodularity

Today’s topic is submodularity! This special structure allows a whole new mathematical toolbox to be used in the never-ending fight to improve formulations and reach optimal solutions faster.

Let $N := \{1,\dots,n\}$ be a finite ground set. A submodular function $f : 2^N\rightarrow\mathbb{R}$ is a set-defined function that exhibits a “diminishing returns” property. That is, adding an element to a small set results in a greater impact on the function than adding that element to a larger set. There are multiple equivalent ways to define a submodular function. Perhaps the most intuitive of these is below.

Let $S_1, S_2\subseteq N$ with $S_1\subseteq S_2$. $f$ is submodular if for every $x\in N\setminus S_2$, we have $f(S_1\cup \{x\}) - f(S_1) \geq f(S_2\cup \{x\}) - f(S_2)$. Here, $f(S_1\cup \{x\}) - f(S_1)$ represents the marginal increase in the set function when element $x\in N\setminus S_2$ is added to a smaller subset of $N$, $S_1$. This marginal increase must be at least as large as the marginal increase obtained by adding $x$ to a larger subset of $N$ that contains $S_1$.

An equivalent definition is as follows. Let $S_1, S_2\subseteq N$. $f$ is submodular if $f(S_1) + f(S_2) \geq f(S_1\cap S_2) + f(S_1\cup S_2)$.

It should also be noted that under certain conditions, optimizing submodular functions can be very easy.

Let’s examine a few places in public sector operations research that submodularity could arise.

Example 1: Path interdiction

Let $P=\{a_1, a_2, \dots, a_{|P|}\}$ be a path in a network. In my previous blog post, we discussed basic maximum-reliability network interdiction. In the same vein, let $p_a$ be the probability of traversing arc $a\in P$ undetected when arc $a$ is not interdicted. Let $q_a$ be the probability of traversing arc $a\in P$ undetected when the arc is interdicted. Naturally, $p_a>q_a$ for all $a\in P$. Let $S\subseteq P$ be the set of path arcs that we choose to inderdict. Let the function $f : 2^P\rightarrow [0,1]$ be the probability of an attacker traversing the entire path undetected. This function can be written as

$f(S)= \left(\displaystyle\prod_{a\in P} p_a\right) \left(\displaystyle\prod_{a\in S} \frac{q_a}{p_a}\right)$.

Then $f(S)$ is supermodular. A supermodular function simply reverses the inequalities above: i.e. $f(S_1\cup \{x\}) - f(S_1) \leq f(S_2\cup \{x\}) - f(S_2)$ holds for $S_1, S_2\subseteq N$ with $S_1\subseteq S_2$ and every $x\in N\setminus S_2$. The proof is obtained by directly applying the definition of supermodularity. The fact that $f$ is supermodular is very intuitive; the more arcs that we interdict, the less of a reduction (not increase!) we notice in the attacker’s traversal probability.

It follows that $-f(S)$ is submodular.

Imagine we were to give up our lives of fortune as operations researchers and pursue a humbler existence in the radio entertainment industry. We wish to select a subset of radio towers from which to broadcast our station. The broadcast ranges of these towers overlap.

Our goal is to cover as much surface area as possible. Let $N := \{1,\dots,n\}$ be the set of towers. For a subset of selected towers $S\subseteq N$, let $f(S)$ be the surface area covered by those towers. Then $f(S)$ is submodular. Below are a few pictures to illustrate the point.

Suppose we have $n=4$ potential radio towers to use for coverage. Suppose also that want to use the second radio tower, because the cost of doing so is negligible.

What happens if we choose to also use the third radio tower? Naturally, our covered surface will increase. Note that the marginal increase in surface area is the strictly-blue region.

Cool! Now, suppose that we initially choose to utilize the first and second radio stations.

When both of these radio stations are selected, what happens to the marginal increase in surface area when we include the third tower in our broadcast coverage?

Again, the strictly-blue region represents the surface area we added to our network by including the third radio tower. This blue area is much smaller than it was when we included the second tower and not the first. Looking at the definition of submodularity, we expect this marginal increase to be smaller! Mathematically, we have $f(\{2\}\cup\{3\}) - f(\{2\}) \geq f(\{1,2\}\cup\{3\}) - f(\{1,2\})$.

Conforti, Michele, Gérard Cornuéjols, and Giacomo Zambelli. Integer programming. Vol. 271. Berlin: Springer, 2014.

## 4 thoughts on “The secrets of submodularity”

1. Nice post – I particularly enjoyed the radio tower pictures. Is it “easy” to optimize the two examples you provided here?

Like

1. ET says:

Submodular maximization in general is known to NP-hard. However, we can still find acceptable solutions to some submodular maximization problems. For example, if we only allow ourselves to utilize a fixed number of radio towers, we can use a result from Nemhauser and friends. They showed that a greedy algorithm has a performance guarantee of ~63.2%.

The best-case scenario for submodular maximization is when we are maximizing a nonnegative linear function over what’s known as a submodular polyhedron (with a few other tiny assumptions). In this case, the greedy algorithm yields precisely the optimal solution.

Like

2. soovinyoon says:

I think this is a very nice introduction of submodularity and public sector OR. And I also enjoyed the picture in the radio station example, they are very well visualized!

We see examples for submodular maximization(and supermodular minimization) here, that’s NP-hard in general. Do you have any other thoughts for the inverse case, submodular minimization, in public sector OR ?

Like

1. ET says:

Submodular minimization is easier. This makes sense when you think of submodular functions as the “discrete version” of convex functions. There exist many polynomial algorithms for submodular minimization. The min-cut problem, which could be often seen in public sector OR, can be formulated as a submodular minimization problem.

Like