Puzzle: The Bamboozling Shipping Riddle

Puzzle: The Bamboozling Shipping Riddle

You work at SafeData LLC in DS City. You were hired as a consultant by LastDrop.com, a drop-shipper. They have gathered some data on their operations, and they are perplexed by what they found. Can you help them?


LastDrop.com operates a big warehouse near the port where ample supplies of all products are always available. They also rent space in a smaller warehouse in the Minsky Midlands that gets supplies from the large central warehouse.

They sell two products, AceAir and BreezeBox, that directly compete with each other. LastDrop.com has a stellar data science team and an operations research team that is the envy of the industry.

The data science team has considered the following weekly sales data for the two products in the Minsky Midlands for the past three years:

Format: weekly units sold [AceAir BreezeBox]

[100 10] [9 101] [11 99] [11 99] [10 100]

[9 101] [10 100] [11 99] [9 101] [11 99]

[10 99] [11 99] [10 100] [9 101] [10 100]

[10 100] [10 100] [99 11] [8 102] [10 100]

[9 101] [11 99] [11 99] [12 98] [12 98]

[9 101] [12 99] [99 11] [8 102] [9 101]

[100 10] [9 101] [101  9] [9 101] [9 101]

 [12 99] [11 99] [9 101] [11 99] [9 101]


 [9 101] [10 100] [ 11 99] [10 100] [9 101]

 [9 101] [11 99] [12 98] [11 99] [ 9 101]

(Notice that there is no seasonality, it is always hot in the Minsky Midlands.)


Based on their data, they found that the expected weekly demand for AceAirs in the Minsky Midlands is 19 units per week, whereas the expected demand for BreezeBoxes is 91 (as it turns out, these expected values are exactly correct).

They have a truck that runs from the central warehouse to the regional warehouse weekly and that can transport 110 units at most. If they run the truck, it costs them $350. In return, it costs $10 less per AceAir unit to ship from the regional warehouse, and for BreezeBoxes they still save $5 per unit when compared to shipping from the central warehouse, which they can always do if there is more demand than local supply. On the other hand, if not all relocated units sell, they incur additional inventory holding costs of $3 for AceAirs and $2 for BreezeBoxes at the local warehouse.

The OR team did the math based on this data and found that shipping 19 units of AceAir and 91 BreezeBoxes was provably optimal, with an expected cost of $350 compared to $645 when not relocating inventory.

LastDrop.com has been relocating inventory for a year. Now, due to some bridge closures followed by a truck driver strike, they have not been able to run the truck for the past 4 weeks. To their astonishment, instead of the expected 85% increase, they found that their operational costs went up by a mere 15% when compared with the average operational costs for the prior year. The demands for the past four weeks were [9 101] [10 100] [11 99] [10 100].

Now, the strike is about to end, and the bridges are open again. They have asked you to shed light on the matter. Should they resume the truck runs? And if so, how many units of each product should they ship?







Hint: Look at the sales data again. Do you notice the shift in demand between the two products in some weeks?

[10 100] -> [99 11] -> [8 102]

Out of the 50 weeks given here, this happens 5 times, so with 10% probability. Simplify the problem by assuming that sales are either [10 100] (with 90% probability) or [100 10] (with 10% probability). In this simplified setup, you should now be able to make an optimal shipping recommendation.






The company should resume shipping and relocate 10 AceAir units and 100 BreezeBoxes to the regional warehouse weekly.


Optimization based on expected values has an overfitting problem

Note that the data science and optimization departments at LastDrop.com did a perfect job: the expected demand for AceAirs is 19, and 91 for BreezeBoxes. And for the scenario where this correct demand forecast is exactly met, shipping the corresponding number of units is indeed provably optimal. The problem is: This forecasted mean scenario never happens.

Looking at the sales data, we see that the demand is bimodal: There is a high 90% chance that consumers will ask for ~10 AceAirs and ~100 BreezeBoxes, but in some weeks (with 10% probability) the demand is exactly reversed. And this variability makes all the difference because it means that the optimization overfits a scenario that has almost no chance of materializing. So much for provably optimal solutions when any data in the model is estimated!


Truly optimal solutions must take the variability into account

To address this problem, the DS and OR teams at LastDrop.com should have handled the integration of their respective components better. Instead of providing just the expected demands, the DS team should have provided an estimate of the distribution of demand. The OR team, on the other hand, should have used a stochastic solver like InsideOpt Seeker(TM) instead of plain mathematical programming.

By the way, in this example, the solution provided by the LastDrop.com teams incurs average operational costs of ~$520 (not $350 as hallucinated when overfitting the mean scenario in the perfectly deterministic world of integer programming), compared with ~$645 when not shipping at all (FYI, there are even problem instances for which not shipping at all is actually more cost-efficient than the 'optimized' solution based on mean estimates, while the stochastic solution provides significant savings). The fact that LastDrop.com's operational costs went up by a mere 15% in the past four weeks is due to the fact that they did not encounter an inversion of demands in this time period. When no inversion occurs, their operational cost is only ~$600.

The optimal stochastic solution, on the other hand, incurs average costs of ~$465 (including the 10% possibility of inversions). That means 10.5% operational savings for your client, not bad!



The next time you see a significant difference between the optimized solution value on paper and your actual operational costs (you are comparing these regularly, aren't you?), you will want to have a word with your DS and OR teams - and maybe reach out to one of our experts at InsideOpt!

In case you are curious, here is the InsideOpt Seeker(TM) model for this problem:

def main():
    truck_capacity = 110
    truck_costs = 350
    inventory_costs = [3, 2]
    shipping_costs = [10, 5]

# create Seeker environment with path to license and

# set stochastic resolution to 10^6

    env = Env("Seeker_License.sio", stochastic=True)
    env.set_stochastic_parameters(round(1e6), 0)

# create stochastic vector based on historical data

    demand_model = Demand()
    demands = env.user_vector_distribution(demand_model)

# create decision variables: decision to relocate, and shipment quantities

    relocate = env.categorical(0, 1)
    ship = [env.ordinal(0, truck_capacity), env.ordinal(0, truck_capacity)]

# define realized shipment quantities

    real_ship = [env.min([relocate * truck_capacity, ship[i]]) for i in range(2)]

# enforce truck capacity

    total_ship = env.sum(real_ship)
    env.enforce_leq(total_ship, truck_capacity)

# compute excess and shortage

    excess = [env.max_0(real_ship[i] - demands[i]) for i in range(2)]
    shortage = [env.max_0(demands[i] - real_ship[i]) for i in range(2)]

# compute the costs

    local_inventory_costs = [excess[i]*inventory_costs[i] for i in range(2)]
    central_shipping_costs = [shortage[i]*shipping_costs[i] for i in range(2)]
    relocation_costs = truck_costs*relocate
    costs = relocation_costs + env.sum(central_shipping_costs) + env.sum(local_inventory_costs)
    expected_costs = env.aggregate_mean(costs)

# minimize expected costs for half a minute

env.minimize(expected_costs, 30)

# report results

print("Estimated Costs:", expected_costs.get_value())
print("Relocate:", relocate.get_value())
print("AceAir:", real_ship[0].get_value(), " BreezeBox:", real_ship[1].get_value())
print("Evaluations:", env.get_number_evaluations())

# end the environment


The Result:

Estimated Costs: 457.972692

Relocate: 1.0

AceAir: 10.0 BreezeBox: 100.0

Evaluations: 34164

We can also ask Seeker(TM) to return the density of the stochastic 'costs' variable by calling 'costs.get_values()' and plot it using seaborn:


In this way, we get a better understanding of the risks we run in our operations when executing an optimized plan. Do your operational plans come with such a risk profile?

Please reach out to one of our specialists at info@insideopt.com if you want to start using OR technology of the 21st century!