Linear programming is a technique to optimize any problem with multiple variables and constraints. It's a simple but powerful tool every data scientist should master.

Imagine you are a strategist recruiting an army. You have:

  • Three resources: 🌾food, 🪵wood, and 🪙gold
  • Three units: 🗡️swordsmen, 🏹bowmen, and 🐎horsemen.

Horsemen are stronger than bowmen, who are in turn stronger than swordsmen. The following table provides the cost and power of each unit:

Unit 🌾Food 🪵Wood 🪙Gold 💪Power
🗡️Swordsman 60 20 0 70
🏹Bowman 80 10 40 95
🐎Horseman 140 0 100 230

Now we have 1200 🌾food, 800 🪵wood, and 600 🪙gold. How should we maximize the power of our army considering these resources?

We could simply find the unit with the best power/cost ratio, take as many of them as possible, and repeat the process with the other two units. But this "guess and check" solution might not even be optimal...

Now imagine we have millions of units and resources: the previous greedy strategy is likely to completely miss the optimal solution. It is possible to use a machine learning algorithm (e.g., a genetic algorithm) to solve this problem, but we have no guarantee that the solution will be optimal either.

Fortunately for us, there is a method that can solve our problem in an optimal way: linear programming (or linear optimization), which is part of the field of operations research (OR). In this article, we'll use it to find the best numbers of swordsmen, bowmen, and horsemen to build the army with the highest power possible.

You can run the code from this tutorial with the following Google Colab notebook.

🧠 I. Solvers

In Python, there are different libraries for linear programming such as the multi-purposed SciPy, the beginner-friendly PuLP, the exhaustive Pyomo, and many others.

Today, we are going to use Google OR-Tools, which is quite user-friendly, comes with several prepackaged solvers, and has by far the most stars on GitHub.

If the installation doesn't work, please restart the kernel and try again: it can fail sometimes. ¯\_(ツ)_/¯

!python -m pip install --upgrade --user -q ortools

All these libraries have a hidden benefit: they act as interfaces to use the same model with different solvers. Solvers like Gurobi, Cplex, or SCIP have their own APIs, but the models they create are tied to a specific solver.

OR-Tools allows us to use an abstract (and quite pythonic) way of modeling our problems. We can then choose one or several solvers to find an optimal solution. The model we built is thus highly reusable!

OR-Tools comes with its own linear programming solver, called GLOP (Google Linear Optimization Package). It is an open-source project created by Google's Operations Research Team and written in C++.

Other solvers are available such as SCIP, an excellent non-commercial solver created in 2005 and updated and maintained to this day. We could also use popular commercial options like Gurobi and Cplex. However, we would need to install them on top of OR-Tools and get the appropriate licenses (which can be quite costly). For now, let's try GLOP.

from ortools.linear_solver import pywraplp

# Create a solver using the GLOP backend
solver = pywraplp.Solver('Maximize army power', pywraplp.Solver.GLOP_LINEAR_PROGRAMMING)

🧮 II. Variables

We created an instance of the OR-Tools solver using GLOP. Now, how to use linear programming? The first thing we want to define is the variables we want to optimize.

In our example, we have three variables: the number of 🗡️swordsmen, 🏹bowmen, and 🐎horsemen in the army. OR-Tools accepts three types of variables:

  • NumVar for continuous variables;
  • IntVar for integer variables;
  • BoolVar for boolean variables.

We're looking for round numbers of units, so let's choose IntVar. We then need to specify lower and upper bounds for these variables. We want at least 0 unit, but we don't really have an upper bound. So we can say that our upper bound is infinity (or any big number we will never reach). It can be written as:

$$0 \leq swordsmen < \infty \\ 0 \leq bowmen < \infty \\ 0 \leq horsemen < \infty$$

Let's translate it into code. Infinity is replaced by solver.infinity() in OR-Tools. Other than that, the syntax is quite straightforward:

swordsmen = solver.IntVar(0, solver.infinity(), 'swordsmen')
bowmen = solver.IntVar(0, solver.infinity(), 'bowmen')
horsemen = solver.IntVar(0, solver.infinity(), 'horsemen')

⛓️ III. Constraints

We defined our variables, but the constraints are just as important.

Perhaps counter-intuitively, adding more constraints helps the solver to find an optimal solution faster. Why is this the case? Think of the solver as a tree: constraints help it trim branches and reduce the search space.

In our case, we have a limited number of resources we can use to produce units. In other words, we can't spend more resources than we have. For instance, the 🌾food spent to recruit units cannot be higher than 1200. The same is true with 🪵wood (800) and 🪙gold (600).

According to our table, units have the following costs:

  • 1 swordsman = 🌾60 + 🪵20;
  • 1 bowman = 🌾80 + 🪵10 + 🪙40;
  • 1 horseman = 🌾140 + 🪙100.

We can write one constraint per resource as follows:

$$60\times swordsmen + 80\times bowmen + 140\times horsemen \leq 1200 \\ 20\times swordsmen + 10\times bowmen \leq 800 \\ 40\times bowmen + 100\times horsemen \leq 600$$

In OR-Tools, we simply add the constraints to our solver instance with solver.Add().

solver.Add(swordsmen*60 + bowmen*80 + horsemen*140 <= 1200) # Food
solver.Add(swordsmen*20 + bowmen*10 <= 800)                 # Wood
solver.Add(bowmen*40 + horsemen*100 <= 600)                 # Gold

🎯 IV. Objective

Now that we have our variables and constraints, we want to define our goal (or objective function).

In linear programming, this function has to be linear (like the constraints), so of the form $ax + by + cz + d$. In our example, the objective is quite clear: we want to recruit the army with the highest power. The table gives us the following power values:

  • 1 swordsman = 💪70;
  • 1 bowman = 💪95;
  • 1 horseman = 💪230.

Maximizing the power of the army amounts to maximizing the sum of the power of each unit. Our objective function can be written as:

$$max\ 70\times swordsmen + 95\times bowmen + 230\times horsemen$$

In general, there are two types of objective functions: maximizing and minimizing. In OR-Tools, we declare this goal with solver.Maximize() or solver.Minimize().

solver.Maximize(swordsmen*70 + bowmen*95 + horsemen*230)

And we're done! There are three steps to model any linear optimization problem:

  1. Declaring the variables to optimize with lower and upper bounds;
  2. Adding constraints to these variables;
  3. Defining the objective function to maximize or to minimize.

Now that is clear, we can ask the solver to find an optimal solution for us.

🥇 V. Optimize!

Calculating the optimal solution is done with solver.Solve(). This function returns a status that can be used to check that the solution is indeed optimal.

Let's print the highest total power we can get with the best army configuration.

status = solver.Solve()

# If an optimal solution has been found, print results
if status == pywraplp.Solver.OPTIMAL:
  print('================= Solution =================')
  print(f'Solved in {solver.wall_time():.2f} milliseconds in {solver.iterations()} iterations')
  print()
  print(f'Optimal power = {solver.Objective().Value()} 💪power')
  print('Army:')
  print(f' - 🗡️Swordsmen = {swordsmen.solution_value()}')
  print(f' - 🏹Bowmen = {bowmen.solution_value()}')
  print(f' - 🐎Horsemen = {horsemen.solution_value()}')
else:
  print('The solver could not find an optimal solution.')
================= Solution =================
Solved in 87.00 milliseconds in 2 iterations

Optimal power = 1800.0 💪power
Army:
 - 🗡️Swordsmen = 6.0000000000000036
 - 🏹Bowmen = 0.0
 - 🐎Horsemen = 5.999999999999999

Great! The solver found an optimal solution: our army has a total power of 💪1800 with 6 🗡️swordsmen and 6 🐎horsemen (sorry bowmen!).

Let's unpack this result:

  • The solver decided to take the maximum number of 🐎horsemen (6, since we only have 🪙600 and they each cost 🪙100);
  • The remaining resources are spent in 🗡️swordsmen: we have $1200 – 6*140 = 360$🌾food left, which is why the solver chose 6 🗡️swordsmen;
  • We can deduce that the horsemen are the best unit and the bowmen are the worst one because they haven't been chosen at all.

Okay, but there's something quite weird: these numbers are not round, even though we specified that we wanted integers (IntVar). So what happened?

Unfortunately, answering this question requires a deep dive into linear programming… To keep things simple in this introduction, let's say it's because of GLOP. Solvers have characteristics we have to take into account, and GLOP doesn't handle integers. This is another proof that building reusable models is more than just convenient.

We'll explain why GLOP has this strange behavior and how to fix it in a more advanced tutorial.

Conclusion

We saw through this example the five main steps of any linear optimization problem:

  1. Choosing a solver: in our case, we selected GLOP for convenience.
  2. Declaring variables: the parameters to optimize were the number of swordsmen, bowmen, and horsemen.
  3. Declaring constraints: each of these units has a cost. The total cost could not exceed our limited resources.
  4. Defining objective: the criterion to maximize was the total power of this army. It could have been something else, like the number of units.
  5. Optimizing: GLOP found an optimal solution to this problem in less than a second.

This is the main benefit of linear programming: the algorithm gives us a guarantee that the solution that was found is optimal (with a certain error). This guarantee is powerful, but comes at a cost: the model can be so complex that the solver takes years (or more) to find an optimal solution. In this scenario, we have two options:

  • We can stop the solver after a certain time (and probably obtain a suboptimal answer);
  • We can use a metaheuristic like a genetic algorithm to calculate an excellent solution in a short amount of time.

In the next article, we'll talk about the different types of optimization problems and generalize our approach to an entire class of them.

I hope you enjoyed this introduction! Feel free to share it and spread the knowledge about linear optimization. Let's connect on Twitter where I post summaries of these articles. Cheers!

🥇 Linear Programming Course

🔎 Course overview

📝 Chapter 1: Introduction to Linear Programming

📝 Chapter 2: Integer vs. Linear Programming

📝 Chapter 3: Constraint Programming