*Two main players in the fruit industry are seeking to increase their sales in an overall growing market. The incumbent is Amazing Apples Inc.; the up-and-coming fruit company is Better Bananas LLC. Better Bananas has hired your firm to optimize their marketing campaign. But beware: marketing is madness, and you are bound to be surprised!*

Better Bananas has designated $500,000 for their campaign. They are targeting three different age groups: TikTokers (T, under 30), Hashtaggers (H, 30–50), and SilverSurfers (S, over 50). Now they need to decide how to distribute their budget among these groups in increments of k = $50,000. The problem is that when they launch their campaign, they are sure that Amazing Apples will counter with their own campaign. And Amazing Apples has deep pockets; they can spend $1 million on their own marketing campaign.

Each company is only aiming to maximize its own sales; there is no incentive to hurt the sales of the other company. However, by marketing its own product, each company inadvertently diminishes the effect of the other's marketing.

Let us denote the marketing spend on age group G by company X as X_G (e.g., B_T for Better Bananas' spend on TikTokers, or A_S for Amazing Apples' spend on SilverSurfers). Moreover, recall that k = $50,000.

Then, for Better Bananas, the expected growth in sales is given by this formula:

Bananas [in thousand pounds] = 300 * B_T/(A_T+k) + 350 * B_H/(A_H+k) + 400 * B_S/(A_S+k)

The formula for Amazing Apples is:

Apples [in thousand pounds] = 250 * A_T/(B_T+k) + 500 * A_H/(B_H+k) + 750 * A_S/(B_S+k)

The only additional information we have is that Better Bananas cannot spend more than $300,000 on any one age group, while Amazing Apples can spend at most $500,000 on any individual age group. After these maxima are reached, no further increase in sales is to be gained.

The simple question for you is: How much money should Better Bananas allocate to each age group? You have 10 tokens of $50,000 each to distribute and cannot allocate more than 6 tokens to any one age group.

____________

Scroll down for the solution.

___________

Maybe think a bit more? The solution is not obvious.

The optimal allocation is $50,000 to TikTokers, $150,000 to Hashtaggers, and $300,000 to SilverSurfers. Amazing Apples' optimal reply is then to allocate $500,000 to TikTokers and an equal amount to Hashtaggers. With these allocations, we expect an increase in sales of 2.52 million pounds of bananas and 2.5 million pounds of apples.

Better Bananas' allocation of $300,000 to SilverSurfers should come as no surprise, as marketing spend on this age group leads to the largest increase in sales (the biggest bang for the buck). But why is it not optimal to spend the entire remaining $200,000 on the Hashtaggers?

The answer is simple: if we were to allocate $200,000 to the Hashtaggers, then it would be better for Amazing Apples to compete with us in the SilverSurfers segment rather than in the Hashtaggers segment. Specifically, Amazing Apples' optimal allocation of their $1 million would then be to spend $500,000 on TikTokers and the same amount on SilverSurfers. This would lead to increased sales of only 1.61 million pounds of bananas and a whopping 3.57 million pounds of apples (the primary sales driver being that nobody competes with their marketing to TikTokers).
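These figures are easy to verify by plugging the allocations into the two formulas. A minimal plain-Python sketch (no Seeker required):

```python
k = 50_000  # one budget token

def bananas(B, A):
    # Better Bananas' sales growth [thousand pounds], per the formula above
    return 300 * B[0] / (A[0] + k) + 350 * B[1] / (A[1] + k) + 400 * B[2] / (A[2] + k)

def apples(A, B):
    # Amazing Apples' sales growth [thousand pounds], per the formula above
    return 250 * A[0] / (B[0] + k) + 500 * A[1] / (B[1] + k) + 750 * A[2] / (B[2] + k)

# optimal play: B = [50k, 150k, 300k]; Amazing Apples replies [500k, 500k, 0]
B_opt, A_opt = [50_000, 150_000, 300_000], [500_000, 500_000, 0]
print(bananas(B_opt, A_opt), apples(A_opt, B_opt))        # ~2522.7 and 2500.0

# naive play: B = [0, 200k, 300k]; Amazing Apples replies [500k, 0, 500k]
B_naive, A_naive = [0, 200_000, 300_000], [500_000, 0, 500_000]
print(bananas(B_naive, A_naive), apples(A_naive, B_naive))  # ~1618.2 and ~3571.4
```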

Therefore, by **deliberately lowering** our marketing spend in the Hashtaggers segment, we leave enough room for Amazing Apples to invest there instead of in the SilverSurfers segment, which we care more about. Put more simply: don't go bananas with your marketing.

*This example shows how our business results can depend on the actions of other players who are not necessarily antagonists but who pursue their own objectives. There are many surprising and often counterintuitive effects that can occur in such a setting. Thankfully, InsideOpt Seeker(TM) can model these problems with ease. Below, you will find the complete model for this problem.*

```python
 1  import seeker as skr
 2
 3  env = skr.Env("Seeker_License.sio")
 4  k = 50000
 5  investments_B = [env.ordinal(0, 6) * k for _ in range(3)]
 6  obj_A_const = [250, 500, 750]
 7  obj_A = [obj_A_const[i] / (investments_B[i] + k) for i in range(3)]
 8  var_bounds = env.convert([0, 10*k, 0, 10*k, 0, 10*k])
 9  row_bounds = env.convert([0, 1000000])
10  matrix = env.convert([[1, 1, 1]])
11  opt_model_A = env.lp(obj_A, var_bounds, row_bounds, matrix, maximize=True)
12  solution_A_value = opt_model_A.get_objective()
13  investments_A = opt_model_A.get_solution()
14  obj_B_const = [300, 350, 400]
15  env.enforce_leq(env.sum(investments_B), 500000)
16  obj_B = env.sum([obj_B_const[i] * investments_B[i] / (investments_A[i] + k) for i in range(3)])
17
18  env.maximize(obj_B, 1)
19
20  print("Investments B:", [investments_B[i].get_value() for i in range(3)])
21  print("Extra Bananas [1,000 pounds]:", obj_B.get_value())
22  print("Investments A:", [investments_A[i].get_value() for i in range(3)])
23  print("Extra Apples [1,000 pounds]:", solution_A_value.get_value())
24  print("Evaluations:", env.get_number_evaluations())
```

*Note that, in Seeker(TM), you can specify terms that conduct their own separate optimization. In Line 11, we specify a linear programming optimization model that expresses that Amazing Apples is pursuing its own separate objective function, which itself depends on our investment decisions. We obtain access to the optimal investment decisions from Amazing Apples in Line 13.*

*After optimizing our own objective for 1 second in Line 18, we obtain this output:*

Investments B: [50000.0, 150000.0, 300000.0]

Extra Bananas [1,000 pounds]: 2522.727272727273

Investments A: [500000.0, 500000.0, 0.0]

Extra Apples [1,000 pounds]: 2500.0000

Evaluations: 45405

*Overall, thanks to Seeker(TM), we increased the effectiveness of our campaign by over 55%. Not bad for a day's work!*

*You work at SafeData LLC in DS City. You were hired as a consultant by LastDrop.com, a drop-shipper. They have gathered some data on their operations, and they are perplexed by what they found. Can you help them?*

LastDrop.com operates a big warehouse near the port where ample supplies of all products are always available. They also rent space in a smaller warehouse in the Minsky Midlands that gets supplies from the large central warehouse.

They sell two products, AceAir and BreezeBox, that directly compete with each other. LastDrop.com has a stellar data science team and an operations research team that is the envy of the industry.

The data science team has compiled the following weekly sales data for the two products in the Minsky Midlands for the past three years:

Format: weekly units sold [AceAir BreezeBox]

[100 10] [9 101] [11 99] [11 99] [10 100]

[9 101] [10 100] [11 99] [9 101] [11 99]

[10 99] [11 99] [10 100] [9 101] [10 100]

[10 100] [10 100] [99 11] [8 102] [10 100]

[9 101] [11 99] [11 99] [12 98] [12 98]

[9 101] [12 99] [99 11] [8 102] [9 101]

[100 10] [9 101] [101 9] [9 101] [9 101]

[12 99] [11 99] [9 101] [11 99] [9 101]

....

[9 101] [10 100] [11 99] [10 100] [9 101]

[9 101] [11 99] [12 98] [11 99] [9 101]

(Notice that there is no seasonality; it is always hot in the Minsky Midlands.)

Based on their data, they found that the expected weekly demand for AceAirs in the Minsky Midlands is 19 units, whereas the expected weekly demand for BreezeBoxes is 91 (as it turns out, these expected values are **exactly correct**).

They have a truck that runs weekly from the central warehouse to the regional warehouse and can transport at most 110 units. If they run the truck, it costs them $350. In return, each AceAir unit sold from the regional warehouse costs $10 less to ship than from the central warehouse, and each BreezeBox unit $5 less. (Shipping from the central warehouse is always possible when demand exceeds local supply.) On the other hand, if not all relocated units sell, they incur inventory holding costs of $3 per AceAir and $2 per BreezeBox at the local warehouse.

The OR team did the math based on this data and found that shipping 19 units of AceAir and 91 BreezeBoxes was provably optimal, with an expected cost of $350 compared to $650 when not relocating inventory.

LastDrop.com has been relocating inventory for a year. Now, due to some bridge closures followed by a truck driver strike, they have not been able to run the truck for the past 4 weeks. To their astonishment, instead of the expected 85% increase, they found that their operational costs went up by a mere 15% when compared with the average operational costs for the prior year. The demands for the past four weeks were [9 101] [10 100] [11 99] [10 100].

Now, the strike is about to end, and the bridges are open again. They have asked you to shed light on the matter. Should they resume the truck runs? And if so, how many units of each product should they ship?

____________

THIS IS A HARD PROBLEM - SCROLL DOWN FOR A HINT

____________

____________

Hint: Look at the sales data again. Do you notice the shift in demand between the two products in some weeks?

[10 100] -> **[99 11]** -> [8 102]

Out of the 50 weeks given here, this happens 5 times, so with 10% probability. Simplify the problem by assuming that sales are either [10 100] (with 90% probability) or [100 10] (with 10% probability). In this simplified setup, you should now be able to make an optimal shipping recommendation.

SCROLL DOWN FOR SOLUTION

____________

The company should resume shipping and relocate 10 AceAir units and 100 BreezeBoxes to the regional warehouse weekly.

*Optimization based on expected values has an overfitting problem*

Note that the data science **and** optimization departments at LastDrop.com did a perfect job: the expected demand for AceAirs is 19, and 91 for BreezeBoxes. And for the scenario where this correct demand forecast is exactly met, shipping the corresponding number of units is indeed provably optimal. The problem is: This forecasted mean scenario never happens.

Looking at the sales data, we see that the demand is bimodal: There is a high 90% chance that consumers will ask for ~10 AceAirs and ~100 BreezeBoxes, but in some weeks (with 10% probability) the demand is exactly reversed. And this variability makes all the difference because it means that the optimization **overfits** a scenario that has almost no chance of materializing. So much for provably optimal solutions when any data in the model is estimated!

*Truly optimal solutions must take the variability into account*

To address this problem, the DS and OR teams at LastDrop.com should have handled the **integration** of their respective components better. Instead of providing just the expected demands, the DS team should have provided an estimate of the **distribution** of demand. The OR team, on the other hand, should have used a **stochastic solver** like InsideOpt Seeker(TM) instead of plain mathematical programming.

By the way, in this example, the solution provided by the LastDrop.com teams incurs average operational costs of ~$520 (not $350, as hallucinated when overfitting the mean scenario in the perfectly deterministic world of integer programming), compared with ~$645 when not shipping at all. (There are even problem instances for which not shipping at all is actually more cost-efficient than the 'optimized' solution based on mean estimates, while the stochastic solution provides significant savings.) The fact that LastDrop.com's operational costs went up by a mere 15% in the past four weeks is simply because they did not encounter an inversion of demands in this period: when no inversion occurs, their no-truck operational cost is only ~$600.

The optimal stochastic solution, on the other hand, incurs average costs of ~$465 (including the 10% possibility of inversions). **That means 10.5% operational savings for your client, not bad!**
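These cost figures can be reproduced under the simplified two-scenario demand model from the hint (90% chance of demand [10 100], 10% chance of [100 10]). A minimal plain-Python sketch; the exact-data averages (~$520 and ~$465) come out slightly higher because of the small ±1-2 unit noise around each mode:

```python
TRUCK = 350
SHIP = [10, 5]   # extra cost per unit shipped from the central warehouse
HOLD = [3, 2]    # holding cost per unsold relocated unit
SCENARIOS = [((10, 100), 0.9), ((100, 10), 0.1)]

def expected_cost(plan):
    """Expected weekly cost of relocating `plan` units under the two-scenario model."""
    total = 0.0
    for demand, prob in SCENARIOS:
        truck = TRUCK if sum(plan) > 0 else 0
        shortage = sum(SHIP[i] * max(demand[i] - plan[i], 0) for i in range(2))
        excess = sum(HOLD[i] * max(plan[i] - demand[i], 0) for i in range(2))
        total += prob * (truck + shortage + excess)
    return total

print(expected_cost((19, 91)))   # mean-demand plan  -> 512.0
print(expected_cost((10, 100)))  # stochastic optimum -> 458.0
print(expected_cost((0, 0)))     # no relocation      -> 645.0
```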

*The next time you see a significant difference between the optimized solution value on paper and your actual operational costs (you **are** comparing these regularly, aren't you?), you will want to have a word with your DS and OR teams - and maybe reach out to one of our experts at InsideOpt!*

*In case you are curious, here is the InsideOpt Seeker(TM) model for this problem:*

```python
from seeker import Env  # assuming Env is importable at package level, as skr.Env above

def main():
    truck_capacity = 110
    truck_costs = 350
    inventory_costs = [3, 2]   # holding cost per unsold unit: [AceAir, BreezeBox]
    shipping_costs = [10, 5]   # extra cost per unit shipped from central: [AceAir, BreezeBox]

    # create Seeker environment with path to license and
    # set stochastic resolution to 10^6
    env = Env("Seeker_License.sio", stochastic=True)
    env.set_stochastic_parameters(round(1e6), 0)

    # create stochastic vector based on historical data
    # (Demand is the user-defined distribution model; its definition is omitted here)
    demand_model = Demand()
    demands = env.user_vector_distribution(demand_model)

    # create decision variables: decision to relocate, and shipment quantities
    relocate = env.categorical(0, 1)
    ship = [env.ordinal(0, truck_capacity), env.ordinal(0, truck_capacity)]

    # define realized shipment quantities
    real_ship = [env.min([relocate * truck_capacity, ship[i]]) for i in range(2)]

    # enforce truck capacity
    total_ship = env.sum(real_ship)
    env.enforce_leq(total_ship, truck_capacity)

    # compute excess and shortage
    excess = [env.max_0(real_ship[i] - demands[i]) for i in range(2)]
    shortage = [env.max_0(demands[i] - real_ship[i]) for i in range(2)]

    # compute the costs
    local_inventory_costs = [excess[i] * inventory_costs[i] for i in range(2)]
    central_shipping_costs = [shortage[i] * shipping_costs[i] for i in range(2)]
    relocation_costs = truck_costs * relocate
    costs = relocation_costs + env.sum(central_shipping_costs) + env.sum(local_inventory_costs)
    expected_costs = env.aggregate_mean(costs)

    # minimize expected costs for half a minute
    env.minimize(expected_costs, 30)

    # report results
    print("Estimated Costs:", expected_costs.get_value())
    print("Relocate:", relocate.get_value())
    print("AceAir:", real_ship[0].get_value(), " BreezeBox:", real_ship[1].get_value())
    print("Evaluations:", env.get_number_evaluations())

    # end the environment
    env.end()

if __name__ == "__main__":
    main()
```

*The Result:*

*Estimated Costs: 457.972692*

*Relocate: 1.0*

*AceAir: 10.0 BreezeBox: 100.0*

*Evaluations: 34164*

*We can also ask Seeker(TM) to return the density of the stochastic 'costs' variable by calling 'costs.get_values()' and plot it using seaborn.*
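*The plot itself is not reproduced here, but its shape is easy to anticipate: for the (10, 100) plan, the cost density is sharply bimodal. The sketch below simulates illustrative samples under the simplified two-scenario model - a stand-in for what 'costs.get_values()' would return, not actual Seeker output - with the seaborn call left as a comment:*

```python
import random

random.seed(42)

# illustrative weekly-cost samples for the (10, 100) relocation plan under
# the simplified two-scenario demand model (NOT actual Seeker output)
samples = []
for _ in range(10_000):
    if random.random() < 0.10:
        samples.append(350 + 90 * 10 + 90 * 2)  # inversion week [100 10]: $1,430
    else:
        samples.append(350)                     # regular week [10 100]: $350

print(sum(samples) / len(samples))  # close to the expected ~458

# to plot the density:
#   import seaborn as sns
#   sns.histplot(samples)
```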

*In this way, we get a better understanding of the risks we run in our operations when executing an optimized plan. Do your operational plans come with such a risk profile?*

*Please reach out to one of our specialists at info@insideopt.com if you want to start using OR technology of the 21st century!*

The AI4TSP Competition aims to foster research at the intersection of optimization (addressing efficient computation for hard problems) and artificial intelligence (addressing the handling of uncertainty). TSP stands for Traveling Salesperson Problem, one of the most studied routing problems in the optimization community. The task is to determine the order in which to visit a given set of locations, starting from, and returning to, a given home location, such that the total travel time is minimized. In its original form, the travel times between all locations are known upfront. In the competition, these times were not known but were sampled from a probability distribution. The objective was to visit as many locations as possible within a given time budget, with each location offering a specific reward. To complicate matters further, each location visited on the tour had to be reached within a fixed time window: arriving too early meant waiting until the location opened, and arriving too late incurred a penalty.

When travel times are known, the problem looks innocent enough. However, consider this: the number of possible tours grows super-exponentially, as n! = 1 * 2 * 3 * ... * n (n factorial) for a problem instance with n locations. If we could evaluate, in parallel, one potential tour for every atom in the universe, and each atomic processor could evaluate one tour per Planck time (the shortest time unit in which anything can be measured), and we ran that computer from the Big Bang until today, we still could not enumerate all solutions for even one TSP instance with 91 locations. The biggest problems in the competition had 200 locations, with roughly 10^375 potential tours.
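To get a feel for these magnitudes, here is a quick back-of-the-envelope check in plain Python, using the log-gamma function so the enormous factorials are never materialized:

```python
import math

def log10_factorial(n):
    # log10(n!) via the log-gamma function: lgamma(n + 1) = ln(n!)
    return math.lgamma(n + 1) / math.log(10)

print(log10_factorial(91))   # ~140.1: tour orderings for 91 locations
print(log10_factorial(200))  # ~374.9: tour orderings for 200 locations
```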

The competition consisted of two tracks. In the first, we had to determine a tour for a given TSP instance that would work well in expectation, averaged over all possible travel-time scenarios. That is to say: a tour had to be chosen upfront, and we had to stick to it no matter how the travel times turned out during execution. The results were then averaged, ten times over 100,000 scenarios each, to determine the winner.

In the second track, we were allowed to build the tour on the fly: at every location, we could choose which location to visit next, taking into account how much time had already elapsed. The policy determining the next location was evaluated on 100 travel-time realizations for each of 1,000 different TSP instances to determine the winner.

InsideOpt optimization expert Dr. Meinolf Sellmann collaborated with Prof. Dr. Kevin Tierney, Dr. Tapan Shah at GE Research, Fynn Schmitt-Ulms, and Andre Hottung from the University of Bielefeld to compete and win 1st Prize in both tracks.

Congratulations, Meinolf and Team!
