Introduction to the simplex method

Slack variables
Dictionaries
- The solution associated to a dictionary
- Feasibility
Pivoting

We’ll start by explaining the “easy case” of the Simplex Method: when you start with a linear program in standard form where all the right-hand sides of the constraints are non-negative.

Roughly speaking, you turn the LP into a dictionary¹, and then repeatedly pivot to get new dictionaries until at some point the numbers in the dictionary indicate you are done. We’ll illustrate the procedure with the following example:

Maximize \( x_1 + 2x_2 - x_3\)

subject to \[\begin{aligned} 2x_1 + x_2 + x_3 & \le 14\\ 4x_1 + 2x_2 + 3x_3 & \le 28\\ 2x_1 + 5x_2 + 5x_3 & \le 30\\ x_1, x_2, x_3 & \ge 0 \end{aligned}\]

Slack variables

The first step will be to introduce slack variables, one for each of the constraints (except the positivity constraints). These are simply the difference between the right-hand side and the left-hand side of a constraint. For example, for the first constraint we define a slack variable \( x_4 = 14 - 2x_1 - x_2 - x_3 \). In terms of this new variable, the first constraint is equivalent simply to \(x_4 \ge 0\), the positivity constraint for \( x_4 \).

Introducing these slack variables leads to a linear program equivalent to the original one² but for which all constraints are either equations or positivity constraints:

Maximize \( x_1 + 2x_2 - x_3\)

subject to \[\begin{array}{*{10}{r}} 2x_1 & + x_2 & + x_3 & + x_4 &&&= 14\\ 4x_1 & + 2x_2 & + 3x_3 && + x_5 &&= 28\\ 2x_1 & + 5x_2 & + 5x_3 &&& + x_6 &= 30\\ x_1, & x_2, & x_3, & x_4, & x_5, & x_6 & \ge 0 \end{array}\]

Dictionaries

Solving for the slack variables we get the initial dictionary for our problem:

\[\begin{array}{*{10}{r}} x_4 = & 14 & -2 x_1& -x_2& -x_3\\ x_5 = & 28 & -4 x_1& -2 x_2& -3 x_3\\ x_6 = & 30 & -2 x_1& -5 x_2& -5 x_3\\ z = & 0& + x_1& +2 x_2& -x_3\\ \end{array}\]

The variables on the left, \(x_4, x_5, x_6\), are the basic variables for this dictionary; the set of these variables is called the basis. (In the initial dictionary the basic variables are the slack variables, that changes after pivoting.) The rest of the variables are called non-basic. Notice that at the bottom we’ve added a variable \(z\) for the objective function.

Each dictionary is a system of equations that is equivalent to the equality constraints from the LP obtained from the original LP by adding slack variables. In terms of dictionaries, the LP we want to solve is equivalent to “maximize \(z\) subject to the equations in the dictionary and to positivity constraints for all \(x_i\)”.

The solution associated to a dictionary

To any dictionary there is an associated solution of the system of equations: just set all the non-basic variables to zero and compute the values of the basic variables from the dictionary. For the above dictionary the associated solution is \[ x_1 = x_2 = x_3 = 0, x_4 = 14, x_5 = 28, x_6 = 30. \]

Feasibility

The solution associated to a dictionary certainly satisfies all of the constraints that are equations, but may fail to satisfy the positivity constraints. When it does also satisfy the positivity constraints it is said to be a feasible solution and the dictionary is called feasible too.

Here the initial dictionary is feasible because the LP we started with happened to have all the right-hand sides of constraints non-negative. This needn’t always happen and later on we will cover what to do when some constraints have negative right-hand sides; but for now, let’s focus on the case when the initial dictionary is feasible, like in this example.

Pivoting

The main idea of the Simplex Method is to go from dictionary to dictionary by exchanging a basic variable for a non-basic one, in such a way that:

The objective function increases at each step³.
The dictionary is feasible at every step.

Let’s explain how to pick the variables you swap.

Picking a non-basic variable to enter the basis

In the dictionary we have, the \(z\)-row is \(z=0+x_1+2x_2-x_3\). This tells us that for the solution associated to the dictionary, the objective function has the value \(0\); and it also tells us, for each non-basic variable, whether increasing that variable will increase or decrease \(z\). Since \(x_1\) and \(x_2\) have positive coefficients, increasing them would increase \(z\). On the other hand, increasing \(x_3\) will decrease \(z\).

So, to increase \(z\) we will choose either \(x_1\) or \(x_2\) to enter the basis.

Either one would be a valid choice and make some progress toward solving the LP, but we’ll use a definite rule to pick one:

Standard rule⁴ for picking the entering variable: among the non-basic variables that have a positive coefficient in the objective function, choose the one with the largest coefficient. If more than one variable is tied for largest coefficient, pick the one with smallest index (e.g., if both \(x_4\) and \(x_2\) had a coefficient of \(3\), and \(3\) was the largest, you’d pick \(x_2\)).

This is not done for mathematical reasons but for logistical reasons! By having a standard rule for the course, I can make sure that when you solve an LP you don’t get crazy numbers, and the course grader has an easier time, too.

Picking a basic variable to exit the basis

We’ve settled on putting \(x_2\) in the basis and now we need to decide which basic variable to swap it with. The solution associated to our current dictionary has \(x_2=0\). The idea now is to increase the value of \(x_2\) as much as possible while keeping the resulting dictionary feasible. Let’s focus on how the values of the basic variables depend on \(x_2\) (keeping \(x_1=x_3=0\)):

\[\begin{array}{rrr} x_4 = & 14 & -x_2\\ x_5 = & 28 & -2 x_2\\ x_6 = & 30 & -5 x_2\\ z = & 0 & +2 x_2\\ \end{array}\]

We see that:

To keep \(x_4 \ge 0\), we can only increase \(x_2\) up to \(14\).
To keep \(x_5 \ge 0\), we can only increase \(x_2\) up to \(14\).
To keep \(x_6 \ge 0\), we can only increase \(x_2\) up to \(6\).

So, to keep all variables non-negative, we can only go up to \(6\). Since \(x_6\) had the strictest requirement on the entering variable \(x_2\), we pick \(x_6\) as the exiting variable.

In case two variable are tied for strictest bound on the entering variable we need a tie breaking rule:

Standard rule for picking the exiting variable: In case of ties, pick the variable with the smallest index.

Performing the pivot

So we’ve decided that \(x_2\) will enter the basis and \(x_6\) will exit. We get the next dictionary by:

Solving for \(x_2\) in the equation for \(x_6\).
Plugging that value for \(x_2\) into the equations for the other basic variables and the equation for the objective function.

This gives us:

\[\begin{array}{*{10}{r}} x_2 = & 6 & -\frac{2}{5} x_1& -x_3& -\frac{1}{5} x_6\\ x_4 = & 8 & -\frac{8}{5} x_1& & +\frac{1}{5} x_6\\ x_5 = & 16 & -\frac{16}{5} x_1& -x_3& +\frac{2}{5} x_6\\ z = & 12& +\frac{1}{5} x_1& -3 x_3& -\frac{2}{5} x_6\\ \end{array}\]

This is a new dictionary that still represents the same LP. The associated solution has \(x_1=x_3=x_6=0, x_2=6, x_4=8, x_5=16\) and \(z=12\). This is a feasible solution so we learn that getting \(z=12\) is possible, and thus the maximum \(z\), which is what we seek, is \(12\) or larger.

One way to tell you made a mistake

The solution turned out feasible again by virtue of the way we picked the exiting variable. If we had, for example, decided to make \(x_2\) enter but \(x_4\) exit we would have gotten the following dictionary:

\[\begin{array}{*{10}{r}} x_2 = & 14 & -2 x_1& -x_3& -x_4\\ x_5 = & 0 & & -x_3& +2 x_4\\ x_6 = & -40 & +8 x_1& & +5 x_4\\ z = & 28& -3 x_1& -3 x_3& -2 x_4\\ \end{array}\]

The associated solution has \(x_6=-40\) violating the positivity constraint. If that ever happens, you’ll know you made a mistake somewhere!

When to stop pivoting

At this point you know how to pivot to increase the objective function while keeping the dictionary feasible. For example, in the last feasible dictionary we can pivot by having \(x_1\) enter. Either \(x_4\) or \(x_5\) would have to exit and our tie-breaking rule has us choose \(x_4\). After pivoting we get:

\[\begin{array}{*{10}{r}} x_1 = & 5 & & -\frac{5}{8} x_4& +\frac{1}{8} x_6\\ x_2 = & 4 & -x_3& +\frac{1}{4} x_4& -\frac{1}{4} x_6\\ x_5 = & 0 & -x_3& +2 x_4\\ z = & 13& -3 x_3& -\frac{1}{8} x_4& -\frac{3}{8} x_6\\ \end{array}\]

This new dictionary has all of the coefficients in the \(z\)-row negative, so we can’t pick a new entering variable. This means we are done with this problem and \(z=13\) is the maximum value of the objective function. The solution associated to this dictionary, \(x_3=x_4=x_6=0, x_1=5, x_2=4, x_5=0\) is an optimal solution.

Why exactly is \(z=13\) the maximum? Clearly the Simplex Method stops here, since there is no way to pick variables for the next pivot according to the rules, but can’t there be a sneaky way to increase \(z\), say, by doing a pivot that’s against the rules because it decreases \(z\), followed by a pivot that makes \(z\) even bigger than \(13\)?

There’s no need to worry, that can’t happen and the dictionary tells us why: we have \(z = 13 -3 x_3 -\frac{1}{8} x_4 -\frac{3}{8} x_6\) and each variable has a positivity constraint, so that equation really tells us that \(z = 13 - {}\) (something non-negative) \(\le 13\).

Additionally we learn that \(z=13\) only when \(x_3=x_4=x_6=0\). And having those variable be zero tells us that \(x_1=5, x_2=4, x_5=0\), so that the optimal solution we found is the only optimal solution.

There is an alternative way of presenting the Simplex Method using tableaux instead of dictionaries. This really just a different way of writing the same calculation. If you’ve studied linear programming before you are likely to have seen tableaux. It’s a good exercise to solve the same LP both ways, so you can see how this dictionary method corresponds to the tableau. If you don’t know about tableaux, don’t worry about them.

Notice that this equivalent LP is not in standard form.

Or, at least, doesn’t decrease. Sometimes you are forced to make degenerate pivots which don’t change the associated solution or the value of the objective function.

⁴

In some course materials you might see this standard rule called “Anstee’s rule” instead, named tongue in cheek after Richard Anstee who has taught this course many times. Don’t expect non-UBC people to call it Anstee’s rule.