Does creativity emerge from search, as stated in this video?
The Principle of Search and Inference
Search algorithms are by no means new. The entire field of operations research can be viewed as the science of search and deterministic inference. Search is the process of zooming in: making a choice, trying it out, and then shifting the focus elsewhere. Deterministic inference (sometimes called logic or dual inference) is the opposite process of zooming out: finding that certain choices can never work and eliminating them from further consideration. The very same principle powers logic programming (think Sudoku solving, where you as a human probably hate instances that require search) and game-playing AI (alpha-beta search).
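To make the two moves concrete, here is a minimal, self-contained sketch (my own illustration, not from the original text) that colors a tiny graph: the propagation loop plays the role of deterministic inference, the recursive branching plays the role of search.

```python
def solve(domains, edges):
    """domains: {node: set of allowed colors}; edges: list of (node, node)."""
    domains = {v: set(d) for v, d in domains.items()}

    # Deterministic inference (zoom out): repeatedly remove colors that clash
    # with neighbors whose color is already forced, until nothing changes.
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            for a, b in ((u, v), (v, u)):
                if len(domains[a]) == 1:
                    forced = next(iter(domains[a]))
                    if forced in domains[b]:
                        if len(domains[b]) == 1:
                            return None  # two adjacent nodes forced to the same color
                        domains[b].discard(forced)
                        changed = True

    if all(len(d) == 1 for d in domains.values()):
        return {v: next(iter(d)) for v, d in domains.items()}  # solved

    # Search (zoom in): pick an undecided node (smallest domain first), try a
    # color, and shift the focus elsewhere (backtrack) if that choice fails.
    var = min((v for v in domains if len(domains[v]) > 1),
              key=lambda v: len(domains[v]))
    for color in sorted(domains[var]):
        trial = {v: set(d) for v, d in domains.items()}
        trial[var] = {color}
        result = solve(trial, edges)
        if result is not None:
            return result
    return None

# Usage: three-color a triangle with one extra node attached.
three_colors = {"red", "green", "blue"}
print(solve({v: three_colors for v in "ABCD"},
            [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]))
```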
Game 2, Move 37
While the memory of Move 37 in Game 2 of AlphaGo vs. Lee Sedol is more recent, just ask Garry Kasparov how he felt about another Game 2, Move 37, namely 37. Be4 in his match against IBM's Deep Blue in 1997. I worked at Bell Labs in 1997/1998, and Ken Thompson, who had advised Kasparov, told me how Kasparov would not shut up about this one move. He certainly considered it deep and creative, to the point where he seriously considered the possibility that IBM was cheating. (I later worked with Murray Campbell, one of the creators of Deep Blue, at IBM. He assured me they were not cheating.)
The Missing Spark
So the result of search and deterministic inference can be surprising and appear fresh and unexpectedly neat. But is that creativity?
When talking about intelligence and creativity, there appears to be a missing spark, doesn't there? And indeed, while plain search and deterministic inference certainly work, they can be very inefficient. Intelligence-driven creativity would be expected to do more than just try things and rule out what does not work.
For example, think of an intrepid mathematician who works around the clock, generating ever-new hypotheses and eventually finding one that can be proven. Of course, they can come up with a new theorem this way, but they would arrive there more by tenacity than by genius. A truly gifted mathematician would be expected to have a certain je ne sais quoi: the uncanny ability to focus their attention on structures that yield interesting new theorems, as if guided by an oracle.
Machine Intuition
Clearly, what traditional search algorithms are missing is intuition. Or, as we computer scientists would call it: our traditional algorithms lack the ability to conduct statistical inference. Just look at the standard textbooks on OR and see how little space they devote to branching. But isn't this magical search guidance precisely the part that a deterministic Turing machine is missing (P ?= NP)?
This was my starting point in 2007. I wanted optimization algorithms to learn to do better search. When I told my colleagues that I worked on the use of machine learning in OR, I was told I was a fool (the actual words used were harsher) for thinking that a machine could learn to have this magical guidance, because the problems were 'too hard' (nota bene: the same people are now working on ML4OR grants). And indeed, the first results were not great. Learning while solving a problem instance did lead to a more effective search, but it was slower overall because of the computational effort invested in learning.
So we shifted our attention to offline learning. The question was: Could a machine adapt its search behavior based on the experience of solving similar problems in the past? This approach separated the learning from the solving and allowed us to benefit from our learnings without having to pay for them while solving a problem.
The crucial term in this scientific quest is "similar." Just as in ML, there is also a no-free-lunch theorem in OR: For every algorithmic strategy that works well for one problem instance, there exists another problem instance for which the same strategy is abysmal. So, inherently, we have to start from a distribution of problem instances.
But now we are facing a chicken-and-egg problem: The best way to search depends on the distribution of search situations we encounter, but our search choices also affect the distribution of search situations we will encounter later. We investigated this issue in a paper in 2008 on the accuracy of search heuristics.
The immediate result of this research was that we would not be able to fix a distribution of "typical" search situations and create a training set to which we could then apply the ML kitchen sink: as soon as we let the learned strategies make the choices for us, they skew the distribution of search situations away from the ones they were trained to handle well. (Ironically, there also exist search strategies that are positively dumb but that skew the search in such a way that their single-minded way of making choices becomes more and more accurate. Read the paper above; it is fun!)
The only feasible approach to learning superior search strategies for us was to learn based on end-to-end search performance. And this gave rise to the idea of building automatic algorithm configurators.
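As a rough illustration of the core loop of an automatic algorithm configurator (a sketch of the principle, not the actual systems referenced here), the code below judges randomly sampled parameter settings purely by their end-to-end performance on a set of training instances. The parameter names and the `run_solver` callback are hypothetical placeholders for your solver's parameters and for launching it on one instance.

```python
import random

PARAM_SPACE = {
    "restart_frequency": [50, 100, 500, 1000],
    "branching_rule":    ["most_constrained", "random", "activity_based"],
    "tabu_tenure":       [5, 10, 20],
}

def sample_config(rng):
    """Draw one random configuration from the parameter space."""
    return {name: rng.choice(values) for name, values in PARAM_SPACE.items()}

def configure(run_solver, training_instances, budget=100, seed=0):
    """Return the sampled configuration with the best average end-to-end cost.

    run_solver(config, instance) must return a cost, e.g., the runtime or the
    final objective value reached within a time limit.
    """
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(budget):
        config = sample_config(rng)
        cost = sum(run_solver(config, inst) for inst in training_instances)
        cost /= len(training_instances)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config
```

Production configurators are far more sophisticated (racing, adaptive capping, model-based or genetic search over the parameter space), but the principle of scoring a configuration only by its end-to-end result is the same.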
The rest of the story is then very simple: We built configurators that find good default search strategies first, and then we enabled them to tailor the search strategies to the problem instance at hand by using algorithm portfolios. In total, we won 20 first places at international competitions in this way, so it is safe to say that this approach to augmenting search algorithms with statistical inference works.
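A portfolio adds the instance-specific part: extract cheap features of the instance at hand and run whichever configuration worked best on similar instances in the past. The toy sketch below uses a simple nearest-neighbor rule; the feature names and instance representation are invented for illustration only.

```python
import math

def features(instance):
    """Hypothetical, cheap-to-compute descriptors of a problem instance."""
    return [instance["num_variables"],
            instance["num_constraints"],
            instance["constraint_density"]]

def select_configuration(new_instance, history):
    """history: list of (feature_vector, best_known_configuration) pairs
    collected while tuning on past instances."""
    new_feats = features(new_instance)
    _, best_config = min(history,
                         key=lambda pair: math.dist(new_feats, pair[0]))
    return best_config
```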
Learning to Search
In an effort to overcome the static flavor of offline learning, even when using instance-specific tuning, we eventually built so-called 'hyper-reactive' search algorithms: search algorithms that track features of the search itself (How long has it been since we found an improving solution? Where have we spent most of our search effort so far?) and then learn how to use these features to adapt their search strategies online.
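The following toy sketch conveys the flavor only (it is not the actual mechanism described here): a local search tracks how long it has been since the last improvement and reacts by perturbing more aggressively. In the real approach, the mapping from search features to reactions is itself learned offline; the simple rule below is a hand-written stand-in.

```python
import random

def hyper_reactive_search(objective, neighbor, initial, iterations=10_000, seed=0):
    rng = random.Random(seed)
    current = best = initial
    best_value = objective(best)
    stagnation = 0                          # tracked feature of the search itself
    for _ in range(iterations):
        strength = 1 + stagnation // 100    # reaction: perturb harder when stuck
        candidate = current
        for _ in range(strength):
            candidate = neighbor(candidate, rng)
        if objective(candidate) <= objective(current):
            current = candidate             # accept equal or improving moves
        if objective(current) < best_value:
            best, best_value = current, objective(current)
            stagnation = 0
        else:
            stagnation += 1
    return best, best_value

# Usage: minimize (x - 37)^2 over the integers with random +/-1 moves.
print(hyper_reactive_search(lambda x: (x - 37) ** 2,
                            lambda x, rng: x + rng.choice([-1, 1]),
                            initial=0))
```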
Using this approach, we won the AI4TSP (think: traveling salesperson with uncertain travel times) competition in 2021. This win (on a problem none of us had ever worked on before) sparked the creation of InsideOpt: We wanted companies and academics to have access to this technology at last.
In October 2023, we launched Seeker 1.0.
As you would expect, the entire idea of the solver is that it gets tuned for your business problems: for the instances that you encounter and for the time budget you can afford. Why would you care about some industry benchmark you have never heard of before? You want to know that the tools you choose work for your problems. With a tuned Seeker, you leverage a solver whose search strategies are specifically tailored to your business.
Bridging the Gap
In the 1980s, rule-based systems ruled and people believed that expert systems were the answer. Now the pendulum has swung to the other side, and the main efforts are spent on statistical inference. To me, both thesis and antithesis are lacking, and the synthesis is actually quite obvious: We need our tools to leverage deterministic inference, statistical inference, and search. Each plays its part, and together they create incredibly powerful new capabilities, where a solver like Seeker can be shipped, tailor itself automatically to your business, and provide robust decision support even under uncertainty.
So what is creativity? In my opinion, it is the ability to efficiently find solutions through the confluence of the following three ingredients (a toy sketch of how they can come together follows the list):
- Logic (deterministic inference): Use your hard knowledge to avoid futile work.
- Chance (search): Go to places you have little knowledge of and investigate them further in the hope of finding something interesting.
- Intuition (statistical inference): Learn how to make more lucky guesses.
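Here is a toy rendering of how the three ingredients might meet in a single branching decision; the pruned domains, the learned `guidance_model`, and the 10% exploration rate are all hypothetical placeholders, not a description of any particular solver.

```python
import random

def choose_branch(pruned_domains, guidance_model, rng):
    """pruned_domains: {variable: values still possible after deterministic
    inference (logic)}; guidance_model(variable, value) -> estimated promise.
    Assumes at least one variable still has more than one value."""
    candidates = [(var, val)
                  for var, vals in pruned_domains.items() if len(vals) > 1
                  for val in vals]
    # Intuition (statistical inference): score every remaining choice.
    scored = [(guidance_model(var, val), var, val) for var, val in candidates]
    # Chance (search): do not always follow the model; occasionally explore.
    if rng.random() < 0.1:
        _, var, val = rng.choice(scored)
    else:
        _, var, val = max(scored, key=lambda entry: entry[0])
    return var, val

# Usage with a dummy stand-in for a learned scoring model.
rng = random.Random(0)
print(choose_branch({"x": {1, 2, 3}, "y": {2}}, lambda var, val: -val, rng))
```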
What do you think?