Matrix formulas for dictionaries

Matrix form of a linear program

Just like linear algebra benefits from formulating linear systems of equations in matrix notation, so too does linear programming. Recall our standard form linear program:

Maximize \( c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \)

subject to \[\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n & \le b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n & \le b_2 \\ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n & \le b_m \\ x_1, x_2, \ldots, x_n & \ge 0 \\ \end{aligned}\]

We can rewrite this in matrix form by setting:

  • \( \mathbf{x} = (x_1, x_2, \ldots, x_n)^\top\),
  • \( \mathbf{c} = (c_1, c_2, \ldots, c_n)^\top\),
  • \( \mathbf{b} = (b_1, b_2, \ldots, b_m)^\top\), and
  • \(A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{pmatrix}\).

With those definitions the LP becomes:

Maximize \( \mathbf{c}^\top \mathbf{x} \)

subject to \(\left\{\begin{aligned} A \mathbf{x} & \le \mathbf{b} \\ \mathbf{x} & \ge 0 \\ \end{aligned}\right.\)
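If you want to check examples numerically, here is a minimal sketch using SciPy's linprog routine; the data \(A\), \(\mathbf{b}\) and \(\mathbf{c}\) below are made up purely for illustration, and since linprog minimizes we pass \(-\mathbf{c}\) to maximize.

    import numpy as np
    from scipy.optimize import linprog

    # Illustrative data (not from the text): m = 2 constraints, n = 2 variables.
    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([4.0, 6.0])
    c = np.array([3.0, 5.0])

    # linprog minimizes, so we maximize c^T x by minimizing (-c)^T x.
    # The default variable bounds (0, None) already encode x >= 0.
    res = linprog(-c, A_ub=A, b_ub=b)
    print(res.x, -res.fun)   # optimal x and the maximum of c^T x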

Adding the slack variables

The first thing we did to solve this LP in the Simplex Method was add slack variables \(x_{n+1}, \ldots, x_{n+m}\), one per constraint (other than the positivity constraints). In matrix terms that corresponds to considering the following linear program:

Maximize \(\mathbf{c}^\top \mathbf{x}_{\mathsf{dec}}\)

subject to \(\left\{\begin{aligned} A \mathbf{x}_{\mathsf{dec}} + \mathbf{x}_\mathsf{slack} & = \mathbf{b} \\ \mathbf{x}_{\mathsf{dec}}, \mathbf{x}_{\mathsf{slack}} & \ge 0, \\ \end{aligned}\right.\)

where \( \mathbf{x}_{\mathsf{dec}} = (x_1, x_2, \ldots, x_n)^\top\) is the vector of original or decision variables and \( \mathbf{x}_{\mathsf{slack}} = (x_{n+1}, x_{n+2}, \ldots, x_{n+m})^\top\) is the vector of slack variables.

We can rewrite this last LP in terms of the augmented \(m \times (n + m)\) matrix \(A^\mathsf{aug} = \begin{pmatrix} A & I \end{pmatrix}\), where \(I\) denotes the \(m \times m\) identity matrix. Using \(\mathbf{x}\) to now denote the vector of all variables, both original and slack, and setting \(\mathbf{c}^\mathsf{aug} = (c_1, \dots, c_n, 0, \ldots, 0)^\top\) (with a zero for every slack variable), the LP is:

Maximize \(\left(\mathbf{c}^\mathsf{aug}\right)^\top \mathbf{x}\)

subject to \(\left\{\begin{aligned} A^\mathsf{aug} \mathbf{x} & = \mathbf{b} \\ \mathbf{x} & \ge 0 \\ \end{aligned}\right.\)
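In code, building \(A^\mathsf{aug}\) and \(\mathbf{c}^\mathsf{aug}\) is just a matter of stacking an identity block and padding with zeros; here is a sketch with NumPy, reusing the made-up data from the previous snippet.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    c = np.array([3.0, 5.0])
    m, n = A.shape

    A_aug = np.hstack([A, np.eye(m)])          # the m x (n + m) matrix (A  I)
    c_aug = np.concatenate([c, np.zeros(m)])   # a zero coefficient per slack variable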

Dictionaries

As a warm-up, let’s consider the initial dictionary. Roughly speaking, it looks like:

“slack variables = right-hand sides of constraints - left-hand sides”

“\(z\) = objective function”

This is exactly what we get by solving for \(\mathbf{x}_{\mathsf{slack}}\) in the next-to-last LP we wrote above:

\[\begin{aligned} \mathbf{x}_{\mathsf{slack}} & = \mathbf{b} - A \mathbf{x}_{\mathsf{dec}} \\ z & = \mathbf{c}^\top \mathbf{x}_{\mathsf{dec}} \end{aligned}\]
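These two formulas translate directly into code; with the non-basic (decision) variables held at zero they return the starting basic feasible solution \(\mathbf{x}_{\mathsf{slack}} = \mathbf{b}\) and \(z = 0\). A small sketch on the same made-up data:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([4.0, 6.0])
    c = np.array([3.0, 5.0])

    x_dec = np.zeros(2)          # non-basic variables held at zero
    x_slack = b - A @ x_dec      # top rows of the initial dictionary
    z = c @ x_dec                # z-row
    print(x_slack, z)            # prints [4. 6.] 0.0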

Later on in the Simplex Method, after performing some pivots, the non-basic variables are no longer \(\mathbf{x}_{\mathsf{dec}}\) and the basic variables are no longer \(\mathbf{x}_{\mathsf{slack}}\).

Let’s say that we are interested in some dictionary where the basic variables form a vector \(\mathbf{x}_B\) and the non-basic variables form a vector \(\mathbf{x}_N\). Let \(A_B\) and \(A_N\) denote the submatrices of \(A^\mathsf{aug}\) formed by the columns corresponding to basic and non-basic variables, respectively. The first order of business is to see that the equation \(A^\mathsf{aug} \mathbf{x} = \mathbf{b}\) can be written as \(A_B \mathbf{x}_B + A_N \mathbf{x}_N = \mathbf{b} \).

One way to think about this is to use block matrix multiplication. Think of ordering the variables in \(\mathbf{x}\) so that all the basics come first and the vector is partitioned as \(\mathbf{x} = \begin{pmatrix}\mathbf{x}_B \\ \mathbf{x}_N\end{pmatrix}\); rearranging the columns of \(A^\mathsf{aug}\) in the corresponding order, we get \(A^\mathsf{aug} = \begin{pmatrix}A_B & A_N\end{pmatrix}\). Then, by block multiplication, \(A^\mathsf{aug} \mathbf{x} = \begin{pmatrix}A_B & A_N\end{pmatrix} \begin{pmatrix}\mathbf{x}_B \\ \mathbf{x}_N\end{pmatrix} = A_B \mathbf{x}_B + A_N \mathbf{x}_N \).

I definitely encourage everyone to learn all about block matrix multiplication, but there is a less fancy way to deal with this particular situation: if you have a matrix \(M\) with columns \(M_1, M_2, \ldots, M_p\) and a vector \(\mathbf{v} = (v_1, \ldots, v_p)^\top\), then the product \(M \mathbf{v}\) is the linear combination of the columns of \(M\) whose coefficients are the entries of \(\mathbf{v}\); in symbols, \(M \mathbf{v} = v_1 M_1 + \cdots + v_p M_p\), which you can easily check by computing both sides. So, if the columns of \(A^\mathsf{aug}\) are \(A_1, \ldots, A_{m+n}\), then \(A^\mathsf{aug} \mathbf{x} = A_B \mathbf{x}_B + A_N \mathbf{x}_N \) simply because both sides work out to be \(x_1 A_1 + \cdots + x_{m+n} A_{m+n}\).
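Either argument is easy to confirm numerically: select the basic and non-basic columns of \(A^\mathsf{aug}\) and the matching entries of \(\mathbf{x}\), and the two sides agree. The split into basic and non-basic variables below is arbitrary, chosen just for illustration.

    import numpy as np

    A_aug = np.array([[2.0, 1.0, 1.0, 0.0],
                      [1.0, 3.0, 0.0, 1.0]])   # (A  I) for the toy data above
    x = np.array([1.0, 0.5, 1.5, 3.5])          # arbitrary values of all n + m variables

    basic, nonbasic = [0, 3], [1, 2]            # an arbitrary basic/non-basic split

    A_B, A_N = A_aug[:, basic], A_aug[:, nonbasic]
    x_B, x_N = x[basic], x[nonbasic]

    print(np.allclose(A_aug @ x, A_B @ x_B + A_N @ x_N))   # True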

OK, now that we’ve justified the equation, we can solve for \(\mathbf{x}_B\): since \(A_B\) is invertible (as it must be for \(\mathbf{x}_B\) to be the set of basic variables of a dictionary), we get \(\mathbf{x}_B = A_B^{-1} \mathbf{b} - A_B^{-1} A_N \mathbf{x}_N \). This is the top part of the dictionary, the part that gives formulas for the basic variables. To get the \(z\)-row, we can split the entries of \(\mathbf{c}^\mathsf{aug}\) into the coefficients of basic and non-basic variables, say, \(\mathbf{c}_B\) and \(\mathbf{c}_N\). Then the objective function is \[ \begin{aligned} z & = (\mathbf{c}^\mathsf{aug})^\top \mathbf{x} \\ & = \mathbf{c}_B^\top \mathbf{x}_B + \mathbf{c}_N^\top \mathbf{x}_N \\ & = \mathbf{c}_B^\top (A_B^{-1} \mathbf{b} - A_B^{-1} A_N \mathbf{x}_N) + \mathbf{c}_N^\top \mathbf{x}_N \\ & = \mathbf{c}_B^\top A_B^{-1} \mathbf{b} + \left(\mathbf{c}_N^\top - \mathbf{c}_B^\top A_B^{-1} A_N \right) \mathbf{x}_N \end{aligned}\]

Putting it all together, we see that the dictionary whose basic variables are \(\mathbf{x}_B\) is given by

\[\begin{aligned} \mathbf{x}_B & = A_B^{-1} \mathbf{b} - A_B^{-1} A_N \mathbf{x}_N \\ z & = \mathbf{c}_B^\top A_B^{-1} \mathbf{b} + \left(\mathbf{c}_N^\top - \mathbf{c}_B^\top A_B^{-1} A_N \right) \mathbf{x}_N \end{aligned}\]
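These formulas translate line by line into code. Here is a sketch with NumPy, again on the made-up data from earlier; the basis chosen below (the two decision variables) is just one possible choice.

    import numpy as np

    A_aug = np.array([[2.0, 1.0, 1.0, 0.0],
                      [1.0, 3.0, 0.0, 1.0]])
    b = np.array([4.0, 6.0])
    c_aug = np.array([3.0, 5.0, 0.0, 0.0])

    basic, nonbasic = [0, 1], [2, 3]            # example basis: the decision variables

    A_B, A_N = A_aug[:, basic], A_aug[:, nonbasic]
    c_B, c_N = c_aug[basic], c_aug[nonbasic]
    A_B_inv = np.linalg.inv(A_B)

    constants = A_B_inv @ b                      # values of x_B when x_N = 0
    coeffs = -A_B_inv @ A_N                      # coefficients of x_N in the x_B rows
    z_const = c_B @ A_B_inv @ b                  # objective value at this basic solution
    z_coeffs = c_N - c_B @ A_B_inv @ A_N         # coefficients of x_N in the z-row

    print(constants, coeffs, z_const, z_coeffs)

When every entry of z_coeffs is less than or equal to zero, the dictionary is optimal, which is exactly the stopping criterion of the Simplex Method.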

Omar Antolín Camarena