A Tale of Two Coffees

As winter casts its frosty spell over DS City, 'The Cozy Bean' transforms into a haven for coffee aficionados and a bustling hub where artists and data scientists converge. The Cozy Bean roasts its own coffee to unique perfection. As rent and labor prices have soared recently, the owner has turned to you to help protect the shop's profitability. Do you think you can save this beloved hotspot?


[To dive straight into the challenge, you can skip ahead to the section just before the curvy arrow. Be sure not to scroll below it.]

Evelyn, the owner of The Cozy Bean, tells you that she usually buys raw coffee at a bargain by procuring beans that must be roasted in the same week, or they will go bad. She says that she can afford to buy these perishable beans because the shop's roasting capabilities are highly limited and demand always surpasses its production anyway. Coffee is therefore roasted just in time and is consumed in The Cozy Bean immediately.

Evelyn goes on to explain that there are two kinds of coffee beans on the market, Amber Harmony (A) and Black Velvet (B). Both cost the same, $500 per standard coffee unit (SCU). Both bring the same revenue, $5,000 per SCU. And both have the same expected roasting time of 10 hours per SCU, which is the only limiting factor for production. They can only roast for 60 hours each week and this limit cannot be extended as this would require a different business license.

You listen intently as she goes on to elaborate that the only difference between the two coffees is how their roasting times vary, based on how dry the beans are to begin with. Drier beans roast faster, whereas wetter beans take longer.

For production reasons, roasting times are always full hours. When you ask if Evelyn has any data available on the roasting times, you're impressed to find that she has a spreadsheet going back two years. For every week, she noted down how long roasting took for the coffee that had been bought for that week.

As you investigate the data, you notice that Amber Harmony's roasting times per SCU vary normally around 10 hours (always rounded to the nearest integer) with a standard deviation of 2 hours. Black Velvet has a minimum roasting time of 8 hours plus some additional time that follows a Poisson distribution with lambda 2, which implies that its expected roasting time is also 10 hours (its standard deviation, however, is √2 ≈ 1.4 hours, since a Poisson distribution's variance equals its lambda).
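As a quick sanity check of these distributional claims, you can sample both roasting-time distributions directly (an illustrative NumPy simulation; it is not part of Evelyn's spreadsheet or of the Seeker model below):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000

# Amber Harmony: normal around 10 h with sd 2, rounded to full hours
r_a = np.round(rng.normal(10, 2, n))
# Black Velvet: fixed 8 h plus a Poisson(2) surcharge
r_b = 8 + rng.poisson(2, n)

print("mean roast time  AH: %.2f  BV: %.2f" % (r_a.mean(), r_b.mean()))
print("P(roast <= 9 h)  AH: %.3f  BV: %.3f"
      % ((r_a <= 9).mean(), (r_b <= 9).mean()))
```

Both means come out at 10 hours, yet the shapes of the two distributions differ markedly.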

You denote the randomized roasting time for Amber Harmony with R(A), and the randomized roasting time for Black Velvet with R(B). Then, the production planning problem, where you determine how much of each coffee A or B to roast, is a simple linear program:

Maximize 5,000 A + 5,000 B

Such that

                 A R(A) + B R(B) <= 60

                 with 0 <= A <= Q(A) and 0 <= B <= Q(B)

Whereby Q(X) is the quantity of coffee X that Evelyn bought for that week.
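For any one week, once R(A) and R(B) are realized, this is an ordinary two-variable LP. As a sketch of what solving it looks like outside of Seeker (using SciPy's linprog, with purely illustrative quantities), one might write:

```python
from scipy.optimize import linprog

def optimal_roast(r_a, r_b, q_a, q_b, hours=60):
    """Maximize revenue 5000*A + 5000*B under the weekly roasting budget."""
    # linprog minimizes, so negate the revenue coefficients
    res = linprog(c=[-5000, -5000],
                  A_ub=[[r_a, r_b]],   # A R(A) + B R(B) <= hours
                  b_ub=[hours],
                  bounds=[(0, q_a), (0, q_b)])
    return res.x, -res.fun

# One illustrative week in which both coffees need exactly 10 h/SCU
amounts, revenue = optimal_roast(r_a=10, r_b=10, q_a=5, q_b=3.33)
print("roast", amounts, "for revenue", revenue)
```

With both roasting times at 10 hours, any split that roasts 6 SCUs in total exhausts the 60-hour budget and yields the maximum revenue of $30,000.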


"But here is the problem," Evelyn laments. "I do not know which coffee I should buy. The roasting times only become clear once roasting starts; at that point, optimizing production is straightforward. But when I purchase the beans, those times are unknown." Evelyn looks desperate. "With our fixed costs now at $20,000 per week, I'm worried for The Cozy Bean's future."

Resolved to assist Evelyn, you refine the problem statement. Now, the primary decision is about the quantities of each coffee type to purchase, while the production amounts A and B will be set optimally later.


Maximize Expected Profit = Expected Revenue - 500 (Q(A) + Q(B)) - 20,000

Such that Revenue = Maximum 5,000 A + 5,000 B

                                     s.t. A R(A) + B R(B) <= 60

                                     with 0 <= A <= Q(A) and 0 <= B <= Q(B)

                  R(A) ~ Round(Norm(10, 2))

                  R(B) ~ 8 + Pois(2)

                   0 <= Q(A), Q(B) [fractions of SCUs are allowed]


Is one coffee better than the other? Or is there no difference, since the expectation operator and all terms are linear, and both coffees have the same expected roasting time anyway? What should Evelyn purchase?




Grab a coffee and think some more?



To maximize her expected profits, Evelyn should purchase 5 SCUs of Amber Harmony and 3.33 SCUs of Black Velvet. Her expected profit is then $7,210.

This result is counter-intuitive in many ways.

  1. At first, Black Velvet appears disadvantageous because its minimum roasting time is 8 hours, while Amber Harmony may frequently roast in less than 8 hours. Plus, the Poisson-distributed Black Velvet runs a significant risk that the roasting times could be much, much longer than 10 hours, while the risk that the normally-distributed Amber Harmony needs very long roasting times declines exponentially.

  2. As it turns out, if we were to choose just one coffee, Black Velvet would actually be the better choice, with an expected profit of $6,800 per week when purchasing 7.5 SCUs (the best we can do with Amber Harmony is $6,657 when purchasing 8.6 SCUs of that coffee). Black Velvet offers better profitability because its probability mass for roasting times of 8 or 9 hours far exceeds that of Amber Harmony.

  3. But why then is it not best to buy only Black Velvet? Because the coffee is so cheap that we can buy more than we will use. The solution above procures 8.33 SCUs of coffee when, in expectation, we can only roast 6 SCUs. This provides a cushion that allows us to react once we know the roasting times of both coffees and then to roast more of the coffee that is drier. But this strategy requires both coffees to be available.
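These observations can be checked with a plain Monte Carlo simulation (a NumPy sketch, independent of Seeker). Because both coffees earn the same revenue per SCU, the inner LP is solved greedily by filling the 60-hour budget with the faster-roasting coffee first:

```python
import numpy as np

def expected_profit(q_a, q_b, n=400_000, seed=7):
    rng = np.random.default_rng(seed)
    # Roasting times; clipped at 1 h since times are full, positive hours
    r_a = np.maximum(np.round(rng.normal(10, 2, n)), 1)
    r_b = 8 + rng.poisson(2, n)
    # Greedy inner optimum: roast the drier (faster) coffee first,
    # then spend the remaining hours on the slower one
    a_first = r_a <= r_b
    fast_r, slow_r = np.where(a_first, r_a, r_b), np.where(a_first, r_b, r_a)
    fast_q, slow_q = np.where(a_first, q_a, q_b), np.where(a_first, q_b, q_a)
    fast = np.minimum(fast_q, 60 / fast_r)
    slow = np.minimum(slow_q, (60 - fast * fast_r) / slow_r)
    revenue = 5000 * (fast + slow)
    return float(np.mean(revenue - 500 * (q_a + q_b) - 20000))

print("mixed (5, 3.33):  ", round(expected_profit(5, 3.33)))
print("only BV (0, 7.5): ", round(expected_profit(0, 7.5)))
print("only AH (8.6, 0): ", round(expected_profit(8.6, 0)))
```

The estimates land close to the figures above, confirming that the mixed purchase dominates either single-coffee strategy.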


Automated Decision Support

Decision-making under uncertainty is often counter-intuitive and the best course of action may be very surprising. Thankfully, you have help in the form of InsideOpt's Seeker. Here is the Seeker model for our problem above:

import seeker as skr

coffee = ['Amber Harmony', 'Black Velvet']

# Create a stochastic Seeker Environment
env = skr.Env("license.sio", True)

# Declare the decision variables
Q = {c: env.continuous(0, 10) for c in coffee}

# Set up the nested LP
# Revenue coefficients
lpObj = env.convert([5000, 5000])
# Maximum production
lpVarBounds = [env.convert(0), Q['Amber Harmony']
    , env.convert(0), Q['Black Velvet']]
# Maximum roast time
noBound = env.convert(-1e5)
lpRowBounds = [noBound, env.convert(60)]
# Stochastic roast time constraint
R = {'Amber Harmony': env.max_0(env.round(env.normal(10, 2)))
    , 'Black Velvet': 8 + env.poisson(2, 1e20)}
lpMatrix = [list(R.values())]
lp = env.lp(lpObj, lpVarBounds, lpRowBounds, lpMatrix, True)

# Enforce LP feasibility and optimality
env.enforce_eq(lp.get_solution_status(), 1)

# Compute profit and expected profit
revenue = lp.get_objective()
profit = revenue - 500 * env.sum(list(Q.values())) - 20000
expProfit = env.aggregate_mean(profit)

# Maximize expected profit
env.maximize(expProfit, 60)

# Write the solution
print("Buy Decisions", [Q[c].get_value() for c in Q])
print("Exp Profit", expProfit.get_value())

Buy Decisions [4.9997395442476105, 3.3346340091727367]

Exp Profit 7209.505947287972


As you can see, it is very easy in Seeker to formulate a nested LP optimization. What is more, notice that the nested LP in this case depends on both the outer decision variables Q as well as stochastic data in its maximum roast time constraint.


Risk Management

We built Seeker to make better decisions under uncertainty. However, the real value of this new solver lies in the ability to spend less time on solving problems and more time on discussing what problem should be solved. In this particular scenario, we Insiders would definitely suggest considering the risk of running a very low profit in a given week, especially if Evelyn's cash flow is tight.

Let us use Seeker to have a look at the variability of the profit:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Per-scenario profits from the Seeker model
dat = profit.get_values()
datdf = pd.DataFrame(dat, columns=["profit"])

# Kernel density estimate of the weekly profit distribution
sns.kdeplot(data=datdf, x="profit", fill=True)
plt.show()

As the distribution of our profit shows, there is a good chance that Evelyn will be very lucky and make in excess of $10,000 in some weeks. However, there are also weeks when profits are very low or where she might even encounter a (temporary) loss.

We can ask Seeker to tell us what risk Evelyn runs of making $1,000 or less.

risk = env.aggregate_relative_frequency_leq(profit, 1000)
print("Risk <= $1,000:", risk.get_value())

Risk <= $1,000: 0.04056
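The same figure can be estimated from any sample of weekly profits. As a generic illustration with NumPy (the synthetic sample below merely stands in for the per-scenario values that profit.get_values() returns):

```python
import numpy as np

# Synthetic stand-in for the per-scenario profits of the Seeker model
rng = np.random.default_rng(0)
profits = rng.normal(7200, 3500, 100_000)

# Relative frequency of weeks at or below the $1,000 threshold
risk = float(np.mean(profits <= 1000))
print("Risk <= $1,000:", risk)
```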


Simultaneously Optimizing Risk and Profit

A 4% chance of having a low-profit week is not great, but acceptable for Evelyn. She asks how much it would cost to lower the risk, so we inquire what levels of risk and profit she considers fair and excellent. She replies that a 3% risk of a low-profit week with an expected profit of $5,000 would be fair, while a 1% risk with an $8,000 expected profit would be excellent. We enter these values into Seeker's multi-objective optimizer:

env.multi_objective([risk, expProfit] # Objectives
                    , [0.03, 5000]    # Fair Levels
                    , [0.01, 8000]    # Excellent Levels
                    , [False, True]   # Min Risk, Max Profit
                    , 60)             # Time Limit
print("Reduced Risk", risk.get_value())
print("Buy Decisions", [Q[c].get_value() for c in Q])
print("Exp Profit", expProfit.get_value())

Reduced Risk 0.01618

Buy Decisions [2.804282004443563, 4.914618222403016]

Exp Profit 7191.540962722829


Note how Seeker found some excellent value by reducing the overall amount of coffee from 8.33 SCUs to 7.71 SCUs while increasing the amount of Black Velvet by almost 50%. This costs Evelyn a meager $18 in expected profit, but reduces her risk by more than 60%! The probability of making a profit of $1,000 or less falls from 4% to 1.6%. Seeker is able to find such favorable trade-offs and thereby enables stakeholders to engage in a meaningful conversation about the appropriate levels of risk vs. reward.


The Future of Optimization

  1. Distributions of estimated data matter tremendously when making decisions under uncertainty. Even for small problems, point-forecasted MIP models are frequently 8-40% sub-optimal when compared with Seeker.

  2. Stochastic optimization that simply replicates the deterministic model for 50-100 scenarios is insufficient for real-world applications.

  3. Be skeptical when a salesperson tells you their solver provided provably optimal solutions. A near-optimal solution to a realistic model of the world is almost always superior to an optimal solution to an approximated model, such as any model that uses point forecasts and ignores all uncertainty.


Be Part of This Future

Are you ready to try Seeker for yourself? Test our limited demo version:

  • Go to https://colab.research.google.com/ and open a new Python Notebook

  • pip install insideopt-demo

  • import seekerdemo as skr

  • env = skr.Env()    (or env = skr.Env("", True) for a stochastic environment)


The Seeker Documentation and a Seeker GPT can be found here.