Duality theorems

Bounding the optimal value

As motivation for the dual, let’s discuss how we can bound the optimal value of a linear program. Let’s stick to the case of a linear program in standard form. We’ll illustrate with the following example:

Maximize \(x_1 - 2x_2\)

subject to \[\begin{aligned} x_1 + 3x_2 & \le 4 \\ x_1 - 4x_2 & \le 2 \\ x_1, x_2 & \ge 0 \end{aligned}\]
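To make the bounds below easy to check numerically, here is a minimal computational sketch of this example, assuming Python with scipy is available. Note that scipy.optimize.linprog minimizes, so we negate the objective, and its default variable bounds already impose \(x_1, x_2 \ge 0\):

    from scipy.optimize import linprog

    # maximize x1 - 2*x2  <=>  minimize -(x1 - 2*x2)
    res = linprog(c=[-1, 2],
                  A_ub=[[1, 3], [1, -4]],  # x1 + 3*x2 <= 4  and  x1 - 4*x2 <= 2
                  b_ub=[4, 2])             # default bounds give x1, x2 >= 0
    print(-res.fun, res.x)                 # optimal value (18/7, about 2.571) and a maximizer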

Lower bounds

For a maximization problem, finding a lower bound is simply a matter of finding a feasible solution: the value the objective function takes on some feasible solution is certainly less than or equal to the maximum value it can take on feasible solutions! Here, for instance, we can easily see that \(x_1 = 2, x_2 = 0\) is feasible, which shows the optimal value is at least \(x_1 - 2x_2 = 2\).

Upper bounds

Finding an upper bound for a maximization problem is necessarily more subtle: to show that, for example, \(2.5\) is an upper bound [1] for the optimal value, we would need some argument that explains why none of the feasible solutions could possibly give an objective function value of more than \(2.5\).

Here’s one example of an upper bound we can prove for the optimal value in the example LP:

Suppose \(x_1, x_2\) form a feasible solution of the LP. Then \(x_1, x_2 \ge 0\) and \(x_1 + 3x_2 \le 4\), from which we can conclude that \(x_1 \le 4\) (and also that \(3x_2 \le 4\), i.e., \(x_2 \le \frac{4}{3}\)). Because \(x_1 \le 4\) and \(x_2 \ge 0\), we conclude \(z = x_1 - 2x_2 \le x_1 \le 4\).

How can we search for better (i.e., smaller) upper bounds? Above we only used the first inequality in the LP (plus the positivity constraints); we can try combining both. For example, multiplying the first constraint by \(\frac{1}{3}\), the second by \(\frac{2}{3}\), and adding, we get \(\frac{1}{3}(x_1 + 3x_2) + \frac{2}{3}(x_1-4x_2) \le \frac{4}{3} + \frac{4}{3}\), that is, \( x_1 - \frac{5}{3} x_2 \le \frac{8}{3}\). Since \(x_2 \ge 0\), we have \(z = x_1 - 2x_2 \le x_1 - \frac{5}{3} x_2 \le \frac{8}{3}\).

Can we find an even better upper bound by using different coefficients? Let's be systematic about it. Say we multiply the first constraint by \(y_1\) and the second by \(y_2\). To keep the inequalities going in the same direction we will require \(y_1, y_2 \ge 0\). Adding those multiples of the constraints we get: \[y_1(x_1 + 3x_2) + y_2(x_1-4x_2) \le 4y_1 + 2y_2.\]

When will the left-hand side be \(\ge z\)? Regrouping to see the coefficients of \(x_1\) and \(x_2\) explicitly, we can rewrite the inequality as: \[(y_1 + y_2) x_1 + (3y_1 - 4y_2) x_2 \le 4y_1 + 2y_2.\]

Since \(z = x_1 - 2x_2\) (and \(x_1, x_2 \ge 0\)!), we will have \( z \le (y_1 + y_2) x_1 + (3y_1 - 4y_2) x_2\) as long as:

\[ \begin{aligned} y_1 + y_2 & \ge 1 \\ 3y_1 - 4y_2 & \ge -2 \\ \end{aligned}\]

Altogether we have shown that whenever \(y_1, y_2 \ge 0\) satisfy the above inequalities, the number \(4y_1 + 2y_2\) is an upper bound for the optimal value of the LP we started with.

We might now ask: what is the best (i.e., smallest) upper bound obtainable by this method? The answer is given by the optimal value of the following LP, called the dual of the original LP:

Minimize \(4y_1 + 2y_2\)

subject to \[ \begin{aligned} y_1 + y_2 & \ge 1 \\ 3y_1 - 4y_2 & \ge -2 \\ y_1, y_2 & \ge 0 \end{aligned}\]
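The same kind of sketch (again assuming scipy) solves this dual; since linprog expects \(\le\) constraints, the two \(\ge\) constraints are negated. As the duality theorems below will explain, its optimal value should coincide with the primal's:

    from scipy.optimize import linprog

    # minimize 4*y1 + 2*y2  s.t.  y1 + y2 >= 1,  3*y1 - 4*y2 >= -2,  y >= 0
    res = linprog(c=[4, 2],
                  A_ub=[[-1, -1], [-3, 4]],  # the >= constraints negated into <= form
                  b_ub=[-1, 2])
    print(res.fun, res.x)  # 18/7 again, attained at y = (2/7, 5/7)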

The dual of a linear program in standard form

Repeating the steps that led to the dual LP in the above example but now for a general LP in standard form, we see that the dual of:

Maximize \( c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \)

subject to \[\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n & \le b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n & \le b_2 \\ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n & \le b_m \\ x_1, x_2, \ldots, x_n & \ge 0 \\ \end{aligned}\]

is given by:

Minimize \( b_1 y_1 + b_2 y_2 + \cdots + b_m y_m \)

subject to \[\begin{aligned} a_{11} y_1 + a_{21} y_2 + \cdots + a_{m1} y_m & \ge c_1 \\ a_{12} y_1 + a_{22} y_2 + \cdots + a_{m2} y_m & \ge c_2 \\ \vdots \\ a_{1n} y_1 + a_{2n} y_2 + \cdots + a_{mn} y_m & \ge c_n \\ y_1, y_2, \ldots, y_m & \ge 0 \\ \end{aligned}\]

Or, even better, in matrix form, the dual of:

Maximize \( \mathbf{c} \cdot \mathbf{x} \)

subject to \(\left\{\begin{aligned} A \mathbf{x} & \le \mathbf{b} \\ \mathbf{x} & \ge 0 \\ \end{aligned}\right.\)

is the problem:

Minimize \( \mathbf{b} \cdot \mathbf{y} \)

subject to \(\left\{\begin{aligned} A^\top \mathbf{y} & \ge \mathbf{c} \\ \mathbf{y} & \ge 0 \\ \end{aligned}\right.\)
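The matrix form translates almost verbatim into code. Here is a sketch of a hypothetical helper, solve_dual (the name and interface are made up for illustration), that forms and solves the dual of a standard-form LP given \(\mathbf{c}\), \(A\), \(\mathbf{b}\), assuming numpy and scipy:

    import numpy as np
    from scipy.optimize import linprog

    def solve_dual(c, A, b):
        """Solve the dual (min b.y s.t. A^T y >= c, y >= 0) of the
        standard-form LP (max c.x s.t. A x <= b, x >= 0)."""
        A = np.asarray(A, dtype=float)
        # rewrite A^T y >= c as -A^T y <= -c; linprog's default bounds give y >= 0
        return linprog(c=b, A_ub=-A.T, b_ub=-np.asarray(c, dtype=float))

    # the running example again: the dual's optimal value matches the primal's
    print(solve_dual(c=[1, -2], A=[[1, 3], [1, -4]], b=[4, 2]).fun)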

This formula for the dual of an LP only applies to problems in standard form! To find the dual of an LP that is not in standard form, you need to convert it to standard form first. A good exercise is to show that the dual of the dual of a problem is equivalent to the problem you started with!

Weak duality theorem

From the way we constructed the dual, it is clear that the value of the dual objective function at any feasible solution of the dual is an upper bound for the value of the objective function of the original, or primal, problem at any of its feasible solutions. This fact is important enough to be recorded as a theorem:

Weak duality. Consider the following pair of dual linear programs:

Primal LP: max \(\mathbf{c} \cdot \mathbf{x}\) subject to \(A \mathbf{x} \le \mathbf{b}\), \(\mathbf{x} \ge 0\).

Dual LP: min \(\mathbf{b} \cdot \mathbf{y}\) subject to \(A^\top \mathbf{y} \ge \mathbf{c}\), \(\mathbf{y} \ge 0\).

If \(\mathbf{x}\) is any feasible solution of the primal and \(\mathbf{y}\) is any feasible solution of the dual, then \(\mathbf{c} \cdot \mathbf{x} \le \mathbf{b} \cdot \mathbf{y}\).

Moreover, if equality holds, that is, if \(\mathbf{c} \cdot \mathbf{x} = \mathbf{b} \cdot \mathbf{y}\), then \(\mathbf{x}\) is an optimal solution of the primal LP and \(\mathbf{y}\) is an optimal solution of the dual LP.

We already proved the first part (the part before “moreover”) at the same time as we introduced the dual!

Here’s another proof though, to get some practice with matrix manipulations:

Assume \(\mathbf{x}\) is feasible for the primal and \(\mathbf{y}\) is feasible for the dual. Then \[ \mathbf{c}^\top \mathbf{x} \le (A^\top \mathbf{y})^\top \mathbf{x} = \mathbf{y}^\top A \mathbf{x} \le \mathbf{y}^\top \mathbf{b}, \]

where:

  • the first inequality is because \( A^\top \mathbf{y} \ge \mathbf{c}\) and \(\mathbf{x} \ge 0\),
  • the equality is by standard properties of the transpose,
  • and the second inequality is because \( A \mathbf{x} \le \mathbf{b}\) and \(\mathbf{y} \ge 0\).

Now for the “moreover” part.

Assume, as before, that \(\mathbf{x}\) is feasible for the primal and \(\mathbf{y}\) is feasible for the dual, and that \( \mathbf{c} \cdot \mathbf{x} = \mathbf{b} \cdot \mathbf{y}\). Let’s show that \(\mathbf{x}\) is optimal for the primal LP; a very similar argument would show that \(\mathbf{y}\) is optimal for the dual LP.

Let \(\mathbf{x}'\) be any other feasible solution for the primal. To show that \(\mathbf{x}\) is optimal, we need to show that \( \mathbf{c} \cdot \mathbf{x}' \le \mathbf{c} \cdot \mathbf{x}\). But by the first part of the theorem applied to \(\mathbf{x}'\) and \(\mathbf{y}\), we know that \( \mathbf{c} \cdot \mathbf{x}' \le \mathbf{b} \cdot \mathbf{y}\); and we are assuming that \( \mathbf{b} \cdot \mathbf{y} = \mathbf{c} \cdot \mathbf{x}\), which finishes the proof.
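Weak duality is also easy to sanity-check numerically. A minimal sketch, assuming numpy, using the running example with the feasible points seen earlier (\(\mathbf{x} = (2, 0)\) for the primal and the multipliers \(\mathbf{y} = (\frac{1}{3}, \frac{2}{3})\) for the dual):

    import numpy as np

    c, b = np.array([1.0, -2.0]), np.array([4.0, 2.0])
    A = np.array([[1.0, 3.0], [1.0, -4.0]])

    x = np.array([2.0, 0.0])    # feasible for the primal (checked earlier)
    y = np.array([1/3, 2/3])    # feasible for the dual (the combining multipliers)

    assert np.all(A @ x <= b) and np.all(x >= 0)    # primal feasibility
    assert np.all(A.T @ y >= c) and np.all(y >= 0)  # dual feasibility
    print(c @ x, "<=", b @ y)                       # 2.0 <= 2.666..., as weak duality predicts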

Possibilities for a primal-dual pair of LPs

Recall that the fundamental theorem of linear programming says that any LP falls into exactly one of three cases: it is infeasible, unbounded, or has an optimal solution. For an LP and its dual, which combinations are possible? The duality theorems rule some combinations out. Already the weak duality theorem tells us that if both primal and dual are feasible, then neither can be unbounded (“each bounds the other”). A stronger version of the duality theorem, which we will discuss next, rules out the possibility of one problem of the pair being infeasible while the other has an optimal solution. It turns out all other combinations are possible (this needs to be shown by giving examples of each case; maybe this would be a good problem for a later assignment!). Here’s a table summarizing the possibilities:

Primal \ Dual    infeasible    unbounded    optimal
infeasible       yes           yes          no (SD)
unbounded        yes           no (WD)      no (WD)
optimal          no (SD)       no (WD)      yes

(Here “no (WD)” means the combination is ruled out by weak duality, and “no (SD)” that it is ruled out by strong duality.)

Strong duality theorem

Strong duality. Assume a primal-dual pair of linear programs satisfies any of the following conditions:

  1. the primal has an optimal solution,
  2. the dual has an optimal solution, or,
  3. both primal and dual are feasible,

Then both primal and dual must have optimal solutions (say \(\mathbf{x^*}\) and \(\mathbf{y^*}\)) and the optimal values of the objective functions are equal (\( \mathbf{c} \cdot \mathbf{x^*} = \mathbf{b} \cdot \mathbf{y^*}\), using the above notation for the primal and dual).

We’ll first argue it is enough to show the following claim:

If the primal has an optimal solution \(\mathbf{x^*}\), then the dual has a feasible solution \(\mathbf{y^*}\) such that \( \mathbf{c} \cdot \mathbf{x^*} = \mathbf{b} \cdot \mathbf{y^*}\).

Notice that by the second part of the weak duality theorem, the conclusion of the claim implies that \(\mathbf{y^*}\) is optimal for the dual. So the claim is enough to show strong duality in case 1. Case 2 is just case 1 applied to the dual problem. And in case 3, weak duality implies that neither problem is unbounded, so by the fundamental theorem both primal and dual have optimal solutions, and we can establish case 3 using either case 1 or case 2.

Now we show the claim. Imagine we solve the primal problem by the Simplex Method and we get as final dictionary the one corresponding to a basis \(\mathbf{x}_B\). Then the final dictionary is

\[\begin{aligned} \mathbf{x}_B & = A_B^{-1} \mathbf{b} - A_B^{-1} A_N \mathbf{x}_N \\ z & = \mathbf{c}_B^\top A_B^{-1} \mathbf{b} + \left(\mathbf{c}_N^\top - \mathbf{c}_B^\top A_B^{-1} A_N \right) \mathbf{x}_N \end{aligned}\]

Since this is the final dictionary and we have an optimal solution, all the coefficients in the \(z\)-row must be non-positive, so we know that \[ \mathbf{c}_N^\top - \mathbf{c}_B^\top A_B^{-1} A_N \le 0.\]

That vector of coefficients has one entry for every non-basic variable, and the entry corresponding to a particular non-basic \(x_j\) is \(c_j - \mathbf{c}_B^\top A_B^{-1} A_j\) (where \(A_j\) is the \(j\)-th column of \(A^\mathsf{aug}\), the column corresponding to \(x_j\)). So we’ve learned that \(c_j - \mathbf{c}_B^\top A_B^{-1} A_j \le 0\) for every \(j\) such that \(x_j\) is non-basic in the final dictionary.

We want to know that this inequality is also true for \(j\)’s such that \(x_j\) is a basic variable. Those inequalities for \(x_j\) basic put together would say that \[ \mathbf{c}_B^\top - \mathbf{c}_B^\top A_B^{-1} A_B \le 0,\] which is certainly true because the left-hand side is zero!

So now we know that \(c_j - \mathbf{c}_B^\top A_B^{-1} A_j \le 0\) for every \(j\). Now separate the variables into original variables and slack variables to get:

\begin{align} \mathbf{c}^\top - \mathbf{c}_B^\top A_B^{-1} A & \le 0 \label{main} \\ \mathbf{0}^\top - \mathbf{c}_B^\top A_B^{-1} I & \le 0 \label{pos} \end{align}

(Recall that the columns of \(A^\mathsf{aug}\) corresponding to the original variables form the matrix \(A\) and the columns for slack variables form an identity matrix; also, the entries of \(\mathbf{c}^\mathsf{aug}\) for the original variables form the vector \(\mathbf{c}\) and the entries for the slacks are all \(0\).)

The feasible solution of the dual that we seek is given by \(\mathbf{y^*}^\top = \mathbf{c}_B^\top A_B^{-1}\) (so \(\mathbf{y^*} = (A_B^{-1})^\top \mathbf{c}_B \)). We just need to check it really is feasible for the dual and that \( \mathbf{c} \cdot \mathbf{x^*} = \mathbf{b} \cdot \mathbf{y^*}\):

Main constraints:
\(A^\top \mathbf{y^*} \ge \mathbf{c}\) is equivalent to \eqref{main} (just transpose both sides of the inequality).
Positivity constraints:
\(\mathbf{y^*} \ge 0\) is equivalent to \eqref{pos}.
Equality of objective functions:
we have \(\mathbf{b} \cdot \mathbf{y^*} = \mathbf{y^*}^\top \mathbf{b} = (\mathbf{c}_B^\top A_B^{-1}) \mathbf{b}\), which, as you can see from the final dictionary for the primal, is the optimal value of the primal objective function.

Magic coefficients

Notice from the proof (not just the statement!) of the strong duality theorem that an optimal solution of the dual can be read off from the final dictionary for the primal! The proof shows you can take \(\mathbf{y^*}^\top = \mathbf{c}_B^\top A_B^{-1}\) as an optimal dual solution, and looking at how we obtained \eqref{pos}, we see we can describe the vector \(\mathbf{c}_B^\top A_B^{-1}\) more intuitively as minus the coefficients of the (primal) slack variables in the \(z\)-row of the final dictionary. We’ll call this phenomenon magic coefficients.

Take, for example, the following LP:

Maximize \( x_1 + 2x_2 - x_3\)

subject to \[\begin{aligned} 2x_1 + x_2 + x_3 & \le 14\\ 4x_1 + 2x_2 + 3x_3 & \le 28\\ 2x_1 + 5x_2 + 5x_3 & \le 30\\ x_1, x_2, x_3 & \ge 0 \end{aligned}\]

After running the Simplex Method on it, we obtained the following final dictionary:

\[\begin{array}{*{10}{r}} x_1 = & 5 & & -\frac{5}{8} x_4& +\frac{1}{8} x_6\\ x_2 = & 4 & -x_3& +\frac{1}{4} x_4& -\frac{1}{4} x_6\\ x_5 = & 0 & -x_3& +2 x_4\\ z = & 13& -3 x_3& -\frac{1}{8} x_4& -\frac{3}{8} x_6\\ \end{array}\]

The slack variables are \(x_4, x_5, x_6\), which appear with coefficients \(-\frac{1}{8}, 0, -\frac{3}{8}\) in the \(z\)-row (the coefficient of \(x_5\) is \(0\) because it is basic in the final dictionary). So by “magic coefficients”, \(y_1 = \frac{1}{8}, y_2 = 0, y_3 = \frac{3}{8}\) is an optimal solution of the dual. [2]
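We can double-check these magic coefficients against the formula \(\mathbf{y^*}^\top = \mathbf{c}_B^\top A_B^{-1}\) from the proof. A sketch assuming numpy, using the basis \(\{x_1, x_2, x_5\}\) of the final dictionary above:

    import numpy as np

    A = np.array([[2, 1, 1], [4, 2, 3], [2, 5, 5]], dtype=float)
    A_aug = np.hstack([A, np.eye(3)])     # columns for x1, x2, x3, then slacks x4, x5, x6
    c_aug = np.array([1, 2, -1, 0, 0, 0], dtype=float)
    B = [0, 1, 4]                         # basic variables in the final dictionary: x1, x2, x5

    A_B, c_B = A_aug[:, B], c_aug[B]
    y_star = np.linalg.solve(A_B.T, c_B)  # solves A_B^T y = c_B, i.e. y*^T = c_B^T A_B^{-1}
    print(y_star)                         # [0.125, 0., 0.375], i.e. (1/8, 0, 3/8)

    b = np.array([14.0, 28.0, 30.0])
    print(b @ y_star)                     # 13.0, the optimal value z from the dictionary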


[1] We’ll see that it is not! But this is irrelevant to the point being made.

[2] It’s not the only one! Can you find all of them without using the Simplex Method on the dual?

Omar Antolín Camarena