HOME

Dual variables as marginal values

Table of Contents

This section is dedicated to all the economics majors taking the course (but everyone else needs to learn this stuff too!)

The archetypal factory problem

Say you run a factory and are trying to decide how many units of each kind of product to manufacture; let \(x_j\) be the number of units of product #\(j\) you plan to make. Each of the \(n\) products requires some amount of each of \(m\) different resource; let \(a_{ij}\) be the amount of resource #\(i\) needed to manufacture one unit of product #\(i\). Say also that \(b_i\) is the amount of resource #\(i\) you have available and that \(c_j\) is the predicted profit you make from each unit of product #\(j\). Then to maximize profit you would want to solve the standard form linear program:

Maximize \( c_1 x_1 + c_2 x_2 + \cdots c_n x_n \)

subject to \[\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n & \le b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n & \le b_2 \\ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n & \le b_m \\ x_1, x_2, \ldots, x_n & \ge 0 \\ \end{aligned}\]

Here the constraints1 express that the manufacture of all the products together should not require more than the available amount of each resource.

(Notice that not all standard form linear programs can be interpreted as a factory-type situation: in these situations all \(a_{ij} \ge 0\) and all \(b_i \ge 0\). It’s still a good guide for intuition in general.)

In this model, what do the dual variables \(y_1, \ldots, y_m\) mean? Let’s start by figuring out their units. The dual constraints look like \(\sum_{i=1}^m a_{ij} y_i \ge c_j\). So the units of \(y_i\) must be something that whose product with the units of \(a_{ij}\) matches the units of \(c_j\). The units of \(a_{ij}\) are (whatever unit resource #\(i\) is measured in) per (unit of product #\(j\)); and the units of \(c_j\) are, say, dollars per (unit of product #\(j\)). This means that \(y_i\) is measured in units of dollars per (whatever unit resource #\(i\) is measured in).

That should make it plausible that, in an optimal dual solution, \(y_i\) is the marginal value of resource #\(i\), that is, \(y_i\) is the amount of extra profit you’d expect to make for each additional unit of resource #\(i\) available. This is therefore also the price that you, as factory planner, should think is fair for one unit of resource #\(i\). So, if someone offers to sell you resource #\(i\) at less than \(y_i\) dollars per unit, that’s a good deal and you should buy some if you can.

This advice is only “local”, that is, for small enough changes in the amount of resource #\(i\) available, \(y_i\) correctly predicts the change in profit, but for larger changes the situation can change. If you like multi-variable calculus, you can think of the marginal values this way: the optimal value of the primal LP is a function of all the \(a_{ij}\), \(b_i\) and the \(c_j\); let’s focus on the \(b_i\) and regard the \(a_{ij}\) and \(c_j\) as constants and let \(z(b_1, \ldots, b_m)\) be the maximum profit as a function of available resources. Then we are saying that \(y_i = \frac{\partial z}{\partial b_i}\).

We will now prove a theorem which says that, with some caveats, this interpretation is correct.

The marginal values theorem

The following theorem won’t mention anything about resources or factories, but it might be more intuitive if you keep the above interpretation in mind. The theorem has two parts: the first shows that the marginal values bound the optimal value of the altered primal, the second gives a condition under which that bound really is the new optimal value.

Consider the following standard form LP, its dual and an altered primal with slightly modified constraints:

Primal LP Dual LP   Altered Primal LP
max \(\mathbf{c} \cdot \mathbf{x}\) min \(\mathbf{b} \cdot \mathbf{y}\) max \(\mathbf{c} \cdot \mathbf{x}\)
subject to subject to subject to
\( A \mathbf{x} \le \mathbf{b} \) \( A^\top \mathbf{y} \ge \mathbf{c} \) \( A \mathbf{x} \le \mathbf{b} + \Delta\mathbf{b} \)
\( \mathbf{x} \ge 0\) \( \mathbf{y} \ge 0\) \( \mathbf{x} \ge 0\)
  1. Assume the primal (and therefore also its dual) has an optimal solution, and let \(\mathbf{y}^*\) be any optimal solution of the dual; so that the optimal value of the primal is \(z^* = \mathbf{b}\cdot \mathbf{y}^*\). If \(\mathbf{x}\) is any feasible solution of the altered primal, then \(\mathbf{c} \cdot \mathbf{x} \le z^* + \mathbf{y}^* \cdot \Delta\mathbf{b}\). In particular, if the altered primal is feasible (it may not be, depending on \(\Delta \mathbf{b}\)), then its optimal value \(z'\) satisfies \(z' \le z^* + \mathbf{y}^* \cdot \Delta\mathbf{b}\).
  2. Consider the basis \(\mathbf{x}_B\) of the final dictionary that the Simplex Method produces for the primal LP and use \(A_B\), \(\mathbf{c}_B\), etc. as usual. Take \(\mathbf{y}^* = \left(\mathbf{c}_B^\top A_B^{-1}\right)^\top\) as the optimal dual solution. If \(A_B^{-1} (\mathbf{b} + \Delta \mathbf{b}) \ge 0\), then the altered primal is guaranteed to have an optimal solution and the optimal value is \(z' = z^* + \mathbf{y}^* \cdot \Delta\mathbf{b}\).

  1. Notice that the dual of the altered primal is

    Minimize \((\mathbf{b} + \Delta \mathbf{b}) \cdot \mathbf{y}\)

    subject to \( \left\{ \begin{aligned} A^\top \mathbf{y} & \ge \mathbf{c} \\ \mathbf{y} &\ge 0 \end{aligned} \right.\)

    In particular, notice that it has the same constraints as the dual of the original primal. This tells us that \(\mathbf{y}^*\), the optimal dual solution, is still feasible for the altered dual. And so, weak duality for the altered LPs directly says that given any feasible solution of the altered primal, say \(\mathbf{x}\), we have \(\mathbf{c} \cdot \mathbf{x} \le (\mathbf{b} + \Delta \mathbf{b}) \cdot \mathbf{y} = z^* + \Delta \mathbf{b} \cdot \mathbf{y}\).

  2. Now, consider the final dictionary in the Simplex Method for the original primal LP:

    \[\begin{aligned} \mathbf{x}_B & = A_B^{-1} \mathbf{b} - A_B^{-1} A_N \mathbf{x}_N \\ z & = \mathbf{c}_B^\top A_B^{-1} \mathbf{b} + \left(\mathbf{c}_N^\top - \mathbf{c}_B^\top A_B^{-1} A_N \right) \mathbf{x}_N \end{aligned}\]

    What if we try to use the same basis for the altered LP? The dictionary would look like this:

    \[\begin{aligned} \mathbf{x}_B & = A_B^{-1} (\mathbf{b} + \Delta \mathbf{b}) - A_B^{-1} A_N \mathbf{x}_N \\ z & = \mathbf{c}_B^\top A_B^{-1} (\mathbf{b} + \Delta \mathbf{b}) + \left(\mathbf{c}_N^\top - \mathbf{c}_B^\top A_B^{-1} A_N \right) \mathbf{x}_N \end{aligned}\]

    Is this an optimal dictionary for the altered LP? Well, the \(z\)-row has the same coefficients as before so they are all non-positive as required for optimality. But this new dictionary is not necessarily feasible! Feasibility requires \(A_B^{-1} (\mathbf{b} + \Delta \mathbf{b}) \ge 0\). So, under that assumption, we do get an optimal dictionary for the altered primal, and the optimal value of the altered LP is \(\mathbf{c}_B^\top A_B^{-1} (\mathbf{b} + \Delta \mathbf{b}) = \mathbf{y}^* \cdot (\mathbf{b} + \Delta \mathbf{b})= z^* + \mathbf{y}^* \cdot \Delta \mathbf{b}\), as desired.

The case of non-degenerate optimal solutions

How often should you expect the second half of the theorem to be applicable? To answer that notice that typically optimal solutions are non-degenerate: typically all the basic variables in the final dictionary have strictly positive values, that is, \(A_B^{-1} \mathbf{b} > 0\). In that case, any vector \(\Delta \mathbf{b}\) whose entries are close enough to zero will satisfy \(A_B^{-1} (\mathbf{b} + \Delta \mathbf{b}) \ge 0\).

You may have noticed that the first part of the theorem is for any optimal dual solution but the second part is only for \(\mathbf{y}^* = \left(\mathbf{c}_B^\top A_B^{-1}\right)^\top\). In the non-degenerate case where \(A_B^{-1} \mathbf{b} > 0\), the dual optimal solution is unique, so you needn’t worry then about the distinction! Indeed, all the basic variables \(\mathbf{x}_B\) are positive, so complementary slackness gives us \(m\) different equations for \(y_1, \ldots, y_m\); namely, we get that \(A_B^\top \mathbf{y}^* = \mathbf{c}_B\). The only solution of that system is \(\mathbf{y}^* = \left(A_B^\top\right)^{-1} \mathbf{c}_B = \left(\mathbf{c}_B^\top A_B^{-1}\right)^\top \).


1

The ones that are not just positivity constraints, I mean.

Omar Antolín Camarena