Uncertain Information

By Eric DuBois

Back in August 2011, I was in Massachusetts preparing for my first season as a teacher at Nature’s Classroom in Colebrook, CT. Colebrook is located in the Berkshire Mountains in the rural northwest corner of Connecticut, and getting there requires driving for about an hour on small state and local roads. Unfortunately, just as I was preparing to leave, Hurricane Irene hit southern New England, and by the time it was over, the region was a mess.

Many of the rivers across the region were flooded. Since many of the roads either follow, or at a minimum cross, rivers, they had become impassable in the aftermath of the storm. To further complicate matters, the huge number of downed trees and power lines added additional roadblocks to my journey. From the outside, this appears to be a fairly straightforward OR problem. All we must do is find the new shortest path. Right?

Unfortunately, in this instance, as in most real disasters, reality does not fit the model very well. Specifically, most shortest-path models assume perfect knowledge of the network, or at the very least a good working knowledge of the probabilities that the roads will be open. Two things worked against that in Connecticut.

  1. The ‘interdicted’ locations change over time. As the water moves down the mountains, it floods progressively larger watercourses. When I first set out, the major mountain streams may have flooded the local roads while the smaller rivers are just beginning to rise. By the time I have reached the mountains, those smaller rivers have hit flood stage. Worse yet, there is no guarantee that the local roads are passable again, since they may have washed out or become impassable with debris left by the receding water.
  2. More importantly, information is quite scarce on where the blockages are. In the actual event, the State Police could provide little more than a rough sketch of which roads outside of the suburbs were underwater, and almost no information on where debris had blocked transit. It took me over two and a half hours to find a navigable route to Colebrook, and that was after checking with the police and news sources for an hour. Other teachers, who had not checked the news as carefully, were not so lucky and did not arrive until the next day.

The moral of this tale is the importance, and the difficulty, of maintaining an accurate picture of the state of the system during a disaster. A significant body of literature now looks at the use of social media and crowd-sourced information to flesh out the details of a disaster. Unfortunately, this information is often inaccurate and misleading. This can arise from something as simple as the character limit of a tweet, all the way up to a complete misunderstanding of the situation by the individuals on the ground. Unlike state police or disaster responders, these individuals have likely had no formal training in communicating accurately, nor previous disaster experience against which to gauge conditions. So while the information provided may be better than nothing, it is certainly no panacea.

I can certainly see using this information to help gauge where infrastructure restoration is most likely needed. However, if we need to route emergency vehicles around these roadblocks, probabilities and guesswork are not, in my opinion, a terribly great way to go about finding the optimal choice. As operations researchers, how do you feel we can make use of this new source of data to inform our disaster response planning?


Lagrangian Relaxation

By Suzan Afacan

In my research, I am using the Lagrangian relaxation dual to solve a mixed-integer program. This week I would like to explain the basics of Lagrangian relaxation.

One of the most useful methods for solving combinatorial optimization models, especially for larger instances, is Lagrangian relaxation. Not only can we obtain a lower bound on the original (minimizing) model, but we can also obtain a feasible solution, and thus an upper bound, by using the Lagrangian relaxation dual. The most important feature of Lagrangian relaxation is that it makes a difficult problem easier by relaxing the hard constraints.

The basic steps of the relaxation are:

  • Take an integer or mixed-integer programming formulation,
  • Attach Lagrangian multipliers to a set of complicating constraints (choosing which constraints to relax takes an educated guess, or luck),
  • Move these relaxed constraints into the objective function,
  • Solve the resulting, easier integer program to optimality,
  • The optimal value of this relaxed problem gives a lower (upper) bound on the original minimization (maximization) problem.

Mathematically speaking, here is a general example:

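Using standard notation (a generic sketch rather than any specific application), suppose the original problem is

z^* = \min\{cx : Ax \geq b,\; Dx \geq d,\; x \in \mathbb{Z}^n_+\},

where Ax \geq b are the complicating constraints. Attaching multipliers u \geq 0 to these constraints and moving them into the objective gives the Lagrangian subproblem

L(u) = \min\{cx + u(b - Ax) : Dx \geq d,\; x \in \mathbb{Z}^n_+\},

which is easier to solve because only the easy constraints remain. For any u \geq 0 we have L(u) \leq z^*, so every subproblem yields a lower bound, and the Lagrangian dual \max_{u \geq 0} L(u) searches for the best such bound.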

To solve the Lagrangian relaxation dual, the most often used method is the Subgradient Algorithm. Here is the basic outline of the Subgradient Algorithm:

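A minimal Python sketch of one common variant (my own illustration, not a standard library routine; solve_subproblem is a problem-specific placeholder that returns L(u) and the corresponding minimizer x):

import numpy as np

def subgradient_method(solve_subproblem, A, b, upper_bound,
                       max_iters=200, theta=2.0, stall_limit=10):
    # Approximately solve max_{u >= 0} L(u), where
    # L(u) = min { cx + u(b - Ax) : easy constraints } for relaxed constraints Ax >= b.
    u = np.zeros(len(b))              # starting multipliers; better problem-specific starts may exist
    best_bound, stalls = -np.inf, 0
    for _ in range(max_iters):
        L_u, x = solve_subproblem(u)  # optimal value and minimizer of the subproblem at u
        if L_u > best_bound:
            best_bound, stalls = L_u, 0
        else:
            stalls += 1
            if stalls >= stall_limit: # no improvement for a while: halve the parameter
                theta, stalls = theta / 2.0, 0
        g = b - A @ x                 # a subgradient of L at u
        if np.allclose(g, 0):
            break                     # relaxed constraints satisfied exactly; stop
        step = theta * (upper_bound - L_u) / float(g @ g)  # Polyak's formula
        u = np.maximum(0.0, u + step * g)                  # step and project onto u >= 0
    return best_bound, u

Here Θ plays the role of the decreasing parameter described next, and Polyak's formula can be swapped for any step-size rule satisfying the properties discussed below.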

The Subgradient Algorithm requires an upper bound on the original objective as an input. We also need to set a starting value for the Lagrangian multiplier u, initialize the decreasing parameter Θ to 2, and set the best bound to negative infinity.

Some of these values can be chosen in a problem-specific way. For instance, the starting value for u can be set to 0, or better starting values can be found depending on the problem. The important properties for the step sizes t_k are

t_k \rightarrow 0 \text{ as } k \rightarrow \infty \quad \text{and} \quad \sum_{k=1}^{\infty} t_k = \infty

We want the step sizes to shrink toward zero so the iterates settle down, but not so quickly that the algorithm gets stuck before reaching the optimal value; this is exactly what the two conditions above guarantee. Even though the formula used in the Subgradient Algorithm (Polyak’s formula) might provide a good solution for the Lagrangian relaxation dual, other choices of step size that satisfy these properties might work better than Polyak’s formula, depending on the problem.

To conclude,

Lagrangian relaxation can be used to tackle difficult problems by relaxing the most difficult constraints, but you need to choose carefully which constraints to relax and which step sizes to use in order to get the best Lagrangian relaxation solution. If you can make good choices for your problem, Lagrangian relaxation is a great method. Even if you cannot find the best choices, Lagrangian relaxation still gives you a lower bound, and it is usually not hard to construct a feasible solution from the Lagrangian relaxation dual solution to obtain an upper bound on the original problem.

 

References:

Fisher, M. L. (2004). Comments on “The Lagrangian Relaxation Method for Solving Integer Programming Problems”. Management Science, 50(12 Supplement), 1872-1874.

Ahuja, R. K., Magnanti, T. L., & Orlin, J. B. (1993). Network Flows: Theory, Algorithms, and Applications. Prentice Hall.

 

 

Get your priorities straight!

I recently read a paper entitled “Prioritization via Stochastic Optimization” (Koç and Morton, 2014), which was a timely read based on what we’ve been discussing in class lately. [1] The purpose of the paper is to present a new approach to solving resource-constrained activity-selection problems (RCASPs) using stochastic optimization and activity prioritization. The example application given by the authors is the p-median model, a model with which we’re all familiar by this point. (As a quick recap, the idea of the p-median model is that there is a set of customers located at various points and a maximum number p of facilities that can be built to serve them. The objective is to choose facility locations that minimize the total distance the customers must travel to reach their assigned facilities.) Formally, the p-median model can be written (with the facility budget denoted k below)

\min_{x,y} \sum_{j\in J}\sum_{i\in I} d_{ij}y_{ij}\\  \text{s.t.} \phantom{XX} \sum_{i \in I}x_i \leq k\\  \phantom{XXXX} x_i \geq y_{ij}\phantom{XX} i \in I, j \in J\\  \phantom{XXXX} \sum_{i \in I} y_{ij} = 1\phantom{XX} j \in J\\  \phantom{XXXX} x_i \in \{0,1\}\phantom{XX} i \in I\\  \phantom{XXXX} y_{ij} \in \{0,1\}\phantom{XX} i \in I, j \in J
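To make the recap concrete, here is a tiny sketch (my own illustration, with made-up coordinates rather than data from the paper) that solves a small deterministic instance by brute-force enumeration:

from itertools import combinations

# made-up customer and candidate-site coordinates, and a facility budget
customers = [(0, 0), (1, 5), (4, 1), (6, 6), (9, 2)]
sites = [(1, 1), (5, 5), (8, 1), (3, 3)]
k = 2

def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def total_distance(open_sites):
    # every customer travels to its nearest open facility
    return sum(min(dist(c, sites[i]) for i in open_sites) for c in customers)

candidates = (subset
              for r in range(1, k + 1)
              for subset in combinations(range(len(sites)), r))
best = min(candidates, key=total_distance)
print("open facilities:", best, "total distance:", round(total_distance(best), 2))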

When everything is deterministic, this model is relatively straightforward. Problems occur, however, when the maximum number of facilities is uncertain (which could be the result of uncertainty in the facility-building budget), but the decisions about where to place the facilities must be made before the uncertainty is realized. The figure below depicts the optimal facility placements when the maximum number of facilities is known in advance (for p = 1, 2, 3, and 4).

 

Deterministic placement of facilities for p = 1, 2, 3, and 4

 

Now, we add uncertainty. If either one or two facilities could be built with equal probability, we get the first figure below. If up to three facilities or up to four facilities could be built with equal probabilities, we get the second and third figures, respectively.

Prioritized placement of facilities for p = 1 or 2 with equal probability
Prioritized placement of facilities for p = 1, 2, or 3 with equal probability
Prioritized placement of facilities for p = 1, 2, 3, or 4 with equal probability

We’ll formalize this model in a moment, but first I want to explicitly outline the series of events that occur in this prioritized model. First, a prioritized list of activities (facility locations) is selected without knowing the actual budget. Next, the uncertainty is realized. Finally, we make additional decisions (which facility covers each customer) depending on which scenario is encountered. Thus, perhaps the most “interesting” part of this problem is how to prioritize the activities. A priority list is a many-to-one assignment of activities to priority levels. We have two requirements for a priority list:

  1. A lower priority activity cannot be selected before a higher priority one in any scenario.
  2. Either all activities on priority level t are selected or none are.

The first-stage problem, then, corresponds to creating a priority list. Suppose F^\omega(\mathscr{L}) corresponds to the optimal objective value for scenario \omega \in \Omega. We write the first-stage problem as

\min_{\mathscr{L}} \sum_{\omega \in \Omega}q^\omega F^\omega(\mathscr{L})\\  \text{s.t.} \phantom{XX} \mathscr{L} = [\mathscr{L}_1,\dotsm,\mathscr{L}_L]\\  \phantom{XXXX}\mathscr{L}_t \subseteq I \phantom{XX} t = 1,\dotsm,L\\  \phantom{XXXX}\mathscr{L}_t \neq \emptyset\phantom{XX} t = 1,\dotsm,L\\  \phantom{XXXX}\mathscr{L}_t \cap \mathscr{L}_{t'}= \emptyset \phantom{XX} t \neq t', t,t' = 1,\dotsm,L\\  \phantom{XXXX} \bigcup_{t=1,\dotsm,L} \mathscr{L}_t = I

Essentially, these constraints require the priority list \mathscr{L} = [\mathscr{L}_1,\dotsm,\mathscr{L}_L] to be a list of subsets of the activities such that every priority level contains at least one activity, no activity appears in more than one priority level, and, taken together, the levels cover all of the activities.

The second-stage problem is

\min_{x,y} \sum_{\omega \in \Omega}q^\omega\sum_{j\in J}\sum_{i\in I} d_{ij}y_{ij}^\omega\\  \text{s.t.} \phantom{XX} \sum_{i \in I}x^\omega_i \leq k^\omega\\  \phantom{XXXX} x_i^\omega \geq y_{ij}^\omega\phantom{XX} i \in I, j \in J\\  \phantom{XXXX} \sum_{i \in I} y_{ij}^\omega = 1\phantom{XX} j \in J\\  \phantom{XXXX} x_i^\omega \in \{0,1\}\phantom{XX} i \in I\\  \phantom{XXXX} y_{ij}^\omega \in \{0,1\}\phantom{XX} i \in I, j \in J\\  \phantom{XXXX} x_i^\omega \geq x_{i'}^\omega \phantom{XX} i \in \mathscr{L}_{t}, i' \in\mathscr{L}_{t'}, t < t', t,t' = 1,\dotsm,L\\  \phantom{XXXX} x_i^\omega = x_{i'}^\omega \phantom{XX} i,i' \in\mathscr{L}_t, t = 1,\dotsm,L

where the second-to-last constraint enforces the requirement that all higher-priority activities are selected before lower priority ones and the final constraint requires either all activities or no activities on a given priority level to be selected. The rest of the constraints are the same as in the original p-median model, but the variables and parameters are scenario-dependent. Of course, the first-stage model is not a standard integer program, so we would like to find a formulation for this problem that can be solved by standard stochastic optimization techniques. Toward that end, we create additional decision variables s_{ii'}, which are 1 if activity i has no lower priority than activity i’ and 0 otherwise. The corresponding two-stage stochastic integer model is

\min_{x,y,s} \sum_{\omega \in \Omega}q^\omega\sum_{j\in J}\sum_{i\in I} d_{ij}y_{ij}^\omega\\  \text{s.t.} \phantom{XX} \sum_{i \in I}x^\omega_i \leq k^\omega\\  \phantom{XXXX}  s_{ii'} + s_{i'i} \geq 1 \phantom{XX} i < i', i,i' \in I\\  \phantom{XXXX} s_{ii'} \in \{0,1\} \phantom{XX} i \neq i', i,i' \in I\\  \phantom{XXXX} x_i^\omega \geq x_{i'}^\omega + s_{ii'} - 1 \phantom{XX} i \neq i', i,i' \in I, \omega \in \Omega\\  \phantom{XXXX} x_i^\omega \geq y_{ij}^\omega\phantom{XX} i \in I, j \in J\\  \phantom{XXXX} \sum_{i \in I} y_{ij}^\omega = 1\phantom{XX} j \in J\\  \phantom{XXXX} x_i^\omega \in \{0,1\}\phantom{XX} i \in I\\  \phantom{XXXX} y_{ij}^\omega \in \{0,1\}\phantom{XX} i \in I, j \in J

The first constraint ensures either activities i and i’ have the same priority level or one is higher than the other. It’s also relatively straightforward to check that the third constraint becomes the final two constraints in the previous model depending on the values of the pair s_{ii'} and s_{i'i}: if s_{ii'} = 1 and s_{i'i} = 0, it reads x_i^\omega \geq x_{i'}^\omega, while if both equal 1 it forces x_i^\omega = x_{i'}^\omega.
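As a concrete illustration of how these linking constraints fit together, here is a small sketch of the extensive form using PuLP (my own paraphrase of the model above, not the authors’ implementation; the data are made up):

import itertools
import pulp

# toy data (hypothetical): 3 candidate sites, 4 customers, 2 budget scenarios
I, J, Omega = range(3), range(4), range(2)
d = {(i, j): abs(2 * i - j) + 1 for i in I for j in J}   # made-up distances
k = {0: 1, 1: 2}                                          # facility budget per scenario
q = {0: 0.5, 1: 0.5}                                      # scenario probabilities

m = pulp.LpProblem("prioritized_p_median", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", [(i, w) for i in I for w in Omega], cat="Binary")
y = pulp.LpVariable.dicts("y", [(i, j, w) for i in I for j in J for w in Omega], cat="Binary")
s = pulp.LpVariable.dicts("s", [(i, ip) for i in I for ip in I if i != ip], cat="Binary")

m += pulp.lpSum(q[w] * d[i, j] * y[i, j, w] for w in Omega for i in I for j in J)
for w in Omega:
    m += pulp.lpSum(x[i, w] for i in I) <= k[w]            # budget in scenario w
    for j in J:
        m += pulp.lpSum(y[i, j, w] for i in I) == 1         # every customer assigned
        for i in I:
            m += x[i, w] >= y[i, j, w]                       # only to open facilities
for i, ip in itertools.combinations(I, 2):
    m += s[i, ip] + s[ip, i] >= 1                            # priorities are comparable
for i in I:
    for ip in I:
        if i != ip:
            for w in Omega:
                m += x[i, w] >= x[ip, w] + s[i, ip] - 1      # priority linking constraint
m.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[m.status], pulp.value(m.objective))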

Now, the activity-prioritization model isn’t particularly easy to solve in general (in fact, it’s NP-hard), but it does allow for potential stochastic or structural interdependencies between activities, which is an improvement on historical ways of looking at activity portfolios where each activity is treated as an independent entity. For more details on alternative formulations and some valid inequalities, I’ll refer you to the paper. Thanks for reading!

[1] Koç, Ali, and David P. Morton. “Prioritization via stochastic optimization.” Management Science 61.3 (2014): 586-603.

 

Playing a round of disaster management

This week, coming right after spring break (and therefore with no new class material to cover), I would like to bring a very casual topic to the blog: a board game about disaster management, more specifically, Pandemic.

It is one of the most popular cooperative board games, published by Z-Man Games in 2007. (I guess many of you might already have played it.) Four diseases have broken out in the world, and players, acting as CDC agents, work cooperatively to stop the spread and discover all cures before the diseases wipe out the world.

I am bringing this up as a topic because the framework of this game is really well suited to optimization models. The game board is a world map, which is exactly a network with 48 big cities as nodes and arcs of equal length connecting them, as you can see in the picture. Arrivals of disease in a city, as well as contagion outbreaks, are probabilistic events, implemented through the composition and frequency of the card deck. Moreover, since this is a co-op game, we can say that there is only one stakeholder, which is a setting welcomed by operations researchers.

There are many kinds of strategic decisions players can make, and one of the main decisions is where to locate research centers so that players can reach all cities easily from them. This can be considered as either a maximal covering or a p-median facility location problem. Actually, some people have already studied this problem using basic network analysis. Quick spoiler: when we locate 3 research centers, Atlanta/Hong Kong/Cairo minimizes the average distance. To read more about the ‘optimal placement’ decisions they figured out, you can view Optimal Placement of Research Centers in Pandemic and Overanalyzing Board Games: Network Analysis and Pandemic.
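As a rough sketch of the kind of analysis behind those posts (using a made-up toy map, not the actual 48-city board), one can compute shortest-path distances with breadth-first search and enumerate candidate sets of research centers:

from itertools import combinations
from collections import deque

# toy map (hypothetical city connections, not the real Pandemic board)
edges = [("Atlanta", "Chicago"), ("Chicago", "Montreal"), ("Montreal", "NewYork"),
         ("NewYork", "Atlanta"), ("Atlanta", "Miami"), ("Miami", "Bogota"),
         ("Bogota", "Lima"), ("Lima", "Santiago"), ("NewYork", "London")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def bfs_distances(source):
    # hop distances from source to every other city
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

dist = {c: bfs_distances(c) for c in graph}
cities = sorted(graph)
best = min(combinations(cities, 3),
           key=lambda centers: sum(min(dist[c][r] for r in centers) for c in cities))
print("best 3 research centers on the toy map:", best)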

Some say that there’s no fun in breaking everything down into efficient mathematical equations. But I’d rather claim that it is always interesting to see how our intuitive decisions work compared to a computed ‘optimal’ one, or to study why discrepancies between the model and reality arise. And we know that we can’t do either without doing the math.

Have fun!

Soovin

 

The secrets of submodularity

Today’s topic is submodularity! This special structure allows a whole new mathematical toolbox to be used in the never-ending fight to improve formulations and reach optimal solutions faster.

Let N := \{1,\dots,n\} be a finite ground set. A submodular function f : 2^N\rightarrow\mathbb{R} is a set-defined function that exhibits a “diminishing returns” property. That is, adding an element to a smaller set results in at least as great an increase in the function as adding that same element to a larger set containing it. There are multiple equivalent ways to define a submodular function. Perhaps the most intuitive of these is below.

Let S_1, S_2\subseteq N with S_1\subseteq S_2. f is submodular if for every x\in N\setminus S_2, we have f(S_1\cup \{x\}) - f(S_1) \geq f(S_2\cup \{x\}) - f(S_2). Here, f(S_1\cup \{x\}) - f(S_1) represents the marginal increase in the set function when element x\in N\setminus S_2 is added to a smaller subset of N, S_1. This marginal increase must be at least as large as the marginal increase obtained by adding x to a larger subset of N that contains S_1.

An equivalent definition is as follows. Let S_1, S_2\subseteq N. f is submodular if f(S_1) + f(S_2) \geq f(S_1\cap S_2) + f(S_1\cup S_2).

It should also be noted that under certain conditions, optimizing submodular functions can be very easy: unconstrained submodular minimization is solvable in polynomial time, and greedily maximizing a monotone submodular function under a cardinality constraint comes within a factor of (1 - 1/e) of the optimum.

Let’s examine a few places in public sector operations research that submodularity could arise.

Example 1: Path interdiction

Let P=\{a_1, a_2, \dots, a_{|P|}\} be a path in a network. In my previous blog post, we discussed basic maximum-reliability network interdiction. In the same vein, let p_a be the probability of traversing arc a\in P undetected when arc a is not interdicted. Let q_a be the probability of traversing arc a\in P undetected when the arc is interdicted. Naturally, p_a>q_a for all a\in P. Let S\subseteq P be the set of path arcs that we choose to interdict. Let the function f : 2^P\rightarrow [0,1] be the probability of an attacker traversing the entire path undetected. This function can be written as

f(S)= \left(\displaystyle\prod_{a\in P} p_a\right) \left(\displaystyle\prod_{a\in S} \frac{q_a}{p_a}\right).

Then f(S) is supermodular. A supermodular function simply reverses the inequalities above: i.e. f(S_1\cup \{x\}) - f(S_1) \leq f(S_2\cup \{x\}) - f(S_2) holds for S_1, S_2\subseteq N with S_1\subseteq S_2 and every x\in N\setminus S_2. The proof is obtained by directly applying the definition of supermodularity. The fact that f is supermodular is very intuitive; the more arcs that we interdict, the less of a reduction (not increase!) we notice in the attacker’s traversal probability.
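Writing that direct application out (a quick verification using the definition above):

f(S\cup \{x\}) - f(S) = \left(\displaystyle\prod_{a\in P} p_a\right) \left(\displaystyle\prod_{a\in S} \frac{q_a}{p_a}\right)\left(\frac{q_x}{p_x} - 1\right).

Since each ratio q_a/p_a lies in (0,1), the middle product can only shrink as S grows, while the last factor is negative. Hence for S_1\subseteq S_2 and x\in P\setminus S_2, f(S_1\cup \{x\}) - f(S_1) \leq f(S_2\cup \{x\}) - f(S_2), which is exactly the supermodular inequality.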

It follows that -f(S) is submodular.

Example 2: Radio tower coverage

Imagine we were to give up our lives of fortune as operations researchers and pursue a humbler existence in the radio entertainment industry. We wish to select a subset of radio towers from which to broadcast our station. The broadcast ranges of these towers overlap.

Our goal is to cover as much surface area as possible. Let N := \{1,\dots,n\} be the set of towers. For a subset of selected towers S\subseteq N, let f(S) be the surface area covered by those towers. Then f(S) is submodular. Below are a few pictures to illustrate the point.

Suppose we have n=4 potential radio towers to use for coverage. Suppose also that we want to use the second radio tower, because the cost of doing so is negligible.

Coverage with tower 2

What happens if we choose to also use the third radio tower? Naturally, our covered surface will increase. Note that the marginal increase in surface area is the strictly-blue region.

Coverage with towers 2 and 3

Cool! Now, suppose that we initially choose to utilize the first and second radio stations.

Coverage with towers 1 and 2

When both of these radio stations are selected, what happens to the marginal increase in surface area when we include the third tower in our broadcast coverage?

Coverage with towers 1, 2, and 3

Again, the strictly-blue region represents the surface area we added to our network by including the third radio tower. This blue area is much smaller than it was when we added the third tower to the second tower alone. Looking at the definition of submodularity, this is exactly what we expect: the marginal increase should be smaller. Mathematically, we have f(\{2\}\cup\{3\}) - f(\{2\}) \geq f(\{1,2\}\cup\{3\}) - f(\{1,2\}).
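For the skeptical, here is a tiny sanity check in code (toy coverage sets of my own, not the regions in the figures above):

# model each tower's coverage as a set of grid cells; f(S) = size of the union
coverage = {
    1: {"a", "b", "g"},
    2: {"c", "d", "e", "f"},
    3: {"e", "f", "g"},
}

def f(towers):
    covered = set()
    for t in towers:
        covered |= coverage[t]
    return len(covered)                        # "surface area" = number of covered cells

gain_given_2 = f({2, 3}) - f({2})              # marginal gain of tower 3 given {2}
gain_given_12 = f({1, 2, 3}) - f({1, 2})       # marginal gain of tower 3 given {1, 2}
print(gain_given_2, ">=", gain_given_12)       # diminishing returns: 1 >= 0 here
assert gain_given_2 >= gain_given_12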

Conforti, Michele, Gérard Cornuéjols, and Giacomo Zambelli. Integer programming. Vol. 271. Berlin: Springer, 2014.

Overview of a Mail Transportation Vehicle Routing Problem (VRP)

Way back in my first blog post, I talked about how there are several areas where operations research would likely be useful within the State of Wisconsin, and one of the areas I highlighted was the State’s interdepartmental mail transportation system. Over the last few months, I’ve been working to model this system with Professor Linderoth, and I figured it’d be worth discussing here since it’s technically a public sector OR problem.

To provide some context for this problem, here’s the basic setup. There are 57 locations in Madison that require mail deliveries each day, and five trucks are available to make these deliveries. Each truck is assigned a total of four routes (one for each time period), and these routes all start and end at the “depot” where the mail is sorted and prepared for the next delivery. Some locations require multiple stops throughout the day, but otherwise there are very few additional complexities. As a result, this problem can be represented using a set-covering model where the objective is to choose a set of routes that minimizes the total distance the trucks have to travel each day.

This model can be seen below:

Mail Transportation VRP model (Full)
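In broad strokes (my paraphrase of the structure, not the exact model shown above), the route-selection problem has a set-covering form. Letting R be the set of candidate routes, c_r the length of route r, and a_{ir} = 1 if route r visits location i, it reads

\min_{x} \sum_{r\in R} c_r x_r\\  \text{s.t.} \phantom{XX} \sum_{r\in R} a_{ir}x_r \geq 1\phantom{XX} i \in I\\  \phantom{XXXX} x_r \in \{0,1\}\phantom{XX} r \in R

with additional constraints (omitted here) limiting how many routes each of the five trucks can run in each of the four time periods.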

Unfortunately, despite the simple mathematical formulation, developing a reasonable solution to this problem is quite challenging because of how many possible routes there are. Even when the number of locations on each route is restricted to 15, there are still more than 2.88e+25 routes that could be taken [1]. That’s a lot of routes, so generating and including all of them in the model simply isn’t practical.
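(A quick two-line check of that count, assuming Python 3.8+ for math.perm:)

import math
print(math.perm(57, 15))  # 57!/(57 - 15)!, roughly 2.88e+25 ordered routes of 15 stops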

Instead, a more reasonable approach is to create a small set of “good” routes that could be used to find a heuristic solution. While a number of sophisticated techniques exist for this purpose, I originally started with an extremely simple approach and added additional complexity over time. At first, this consisted of randomly generating a large number of routes and throwing out any that exceeded a “high cost” threshold. Building off of this basic structure, I then improved the process by creating routes using location “clusters” to increase the likelihood of generating a good route each time. These routes then served as a starting point for the iterative column-generation scheme that I’ve been working on most recently, which is something I’ll probably describe in more detail in a future blog post. Although these are fairly crude approaches to route generation, they’ve already resulted in a 15-20% reduction in total travel time when compared to the current state, and that number can only improve as further refinements are made.
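A rough sketch of that simplest route-generation idea (my own illustration, not the actual code for the project) might look like this:

import random

def route_cost(route, dist, depot=0):
    # total distance of depot -> stops -> depot
    stops = [depot] + list(route) + [depot]
    return sum(dist[a][b] for a, b in zip(stops, stops[1:]))

def generate_routes(locations, dist, n_samples=10000, max_stops=15, cost_cap=100.0):
    # sample random routes and keep only those under the "high cost" threshold
    routes = set()
    for _ in range(n_samples):
        size = random.randint(1, max_stops)
        route = tuple(random.sample(locations, size))
        if route_cost(route, dist) <= cost_cap:
            routes.add(route)
    return routes

# toy usage: a depot (0) plus 10 locations with random symmetric distances
n = 11
dist = [[0.0] * n for _ in range(n)]
for a in range(n):
    for b in range(a + 1, n):
        dist[a][b] = dist[b][a] = random.uniform(1, 10)
routes = generate_routes(list(range(1, n)), dist, n_samples=2000, max_stops=4, cost_cap=25.0)
print(len(routes), "candidate routes kept")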

[1] Expressed mathematically, this is 57!/(57-15)! since there are 57 locations, the order of stops along a route matters, and each location can only be visited once per route.

Bilevel Programming: An Introduction

Happy week-before-spring-break! In less than a month, I am going to be presenting a paper entitled “Interdicting a Nuclear-Weapons Project,” by Brown et al., written in 2009 [1]. While I will obviously be getting into more of the gritty details in my presentation, I wanted to give you a little flavor of what the problem looks like. In particular, I’d like to briefly introduce the concept of bilevel programming and how it relates to nuclear weapon project interdiction.

A bilevel problem can be thought of as a “game” between two decision makers. One decision maker makes a decision based on a set of constraints and some objective that benefits them, and then a second decision maker with a different (often conflicting) objective takes an action that benefits themselves. In [2], the authors present the general form of the bilevel programming problem as follows:

\max_x ax + by \text{ s.t. } \{y = \arg \min_y cx + dy\text{ s.t. } Ax + By \geq \bar{b}\}.

In this general form, the first decision maker (the leader) anticipates how the second decision maker (the follower) will minimize their own objective and chooses an action accordingly, counteracting the follower’s expected response. The follower then takes action after seeing what the leader chose to do, which affects their feasible set of actions. Of course, further restrictions can be imposed, such as requiring certain variables to be integer. Many bilevel problems are nonconvex, so they can be incredibly difficult to solve.

The interdiction problem is a good example of a bilevel programming problem. With nuclear weapons proliferation specifically, the proliferator would like to quickly produce (and possibly disseminate) a batch of weapons, whereas the interdictor would like to delay that production as much as possible. The underlying assumption is that the proliferator can see any action taken by the interdictor and can thus act to mitigate or avoid it. The objective used in the nuclear weapon interdiction paper is

z = \max_x\{\min_y S_{end}\}

where S_{end} is the earliest start time of the final task that must be completed by the proliferator, and x and y are the decisions made by the interdictor and the proliferator, respectively.
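To give a feel for this max-min structure, here is a toy illustration (mine, not from the paper) solved by brute-force enumeration; all numbers are made up:

from itertools import combinations

# the follower picks the fastest of three "paths" to finish the project;
# the leader interdicts up to one path, adding a delay to it, and wants the
# follower's best completion time to be as late as possible
base_time = {"A": 10, "B": 12, "C": 15}   # follower's completion time per option
delay     = {"A": 8,  "B": 5,  "C": 2}    # extra time if the leader interdicts that option
budget = 1                                 # leader may interdict at most one option

best_value, best_plan = -float("inf"), None
for plan in combinations(base_time, budget):           # enumerate the leader's choices
    follower_best = min(                                # the follower responds optimally
        base_time[o] + (delay[o] if o in plan else 0) for o in base_time
    )
    if follower_best > best_value:
        best_value, best_plan = follower_best, plan

print("interdict:", best_plan, "-> follower finishes at time", best_value)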

The nuclear weapons interdiction problem also has many constraints and multiple decisions to be made, so it is a complex problem. I’ll get into the details of the model and the algorithms used to solve it in my presentation, and probably in a future blog post as well, but for the time being I wanted to set up the idea of bilevel programming and how nuclear weapons proliferation can be viewed as a bilevel program.

[1] Brown, Gerald G., et al. “Interdicting a nuclear-weapons project.” Operations Research 57.4 (2009): 866-877.

[2] Bard, Jonathan F. “An algorithm for solving the general bilevel programming problem.” Mathematics of Operations Research 8.2 (1983): 260-272.