Discussion about this post

Calvin McCarter

I agree (as I would, since I'm an ML scientist at a lab-in-the-loop therapeutics startup!), but the devil is in the details of making closed-loop optimization work effectively. A recurring theme in the literature and discourse around active learning and black-box optimization is that it's tricky to balance exploration and exploitation. But there's an even trickier problem: how to do effective exploration and effective exploitation at all. Your model needs to be intelligent enough to actually learn something useful from each iteration of the loop, or your loop becomes mere frenetic futility.
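To make "an iteration of the loop" concrete, here's a minimal sketch of the design-measure-learn cycle. Everything here is a stand-in: `propose_designs` and `run_assay` are hypothetical placeholders for the proposal strategy and the wet-lab step, and the random forest is just a generic surrogate.

```python
# Minimal closed-loop sketch with hypothetical placeholders for the
# proposal strategy (propose_designs) and the wet-lab step (run_assay).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def closed_loop(initial_X, initial_y, propose_designs, run_assay, n_rounds=5):
    X, y = np.asarray(initial_X), np.asarray(initial_y)
    model = RandomForestRegressor()
    for _ in range(n_rounds):
        model.fit(X, y)                             # learn from everything measured so far
        candidates = propose_designs(model, X, y)   # exploration/exploitation decisions live here
        measured = run_assay(candidates)            # the slow, expensive part
        X = np.vstack([X, candidates])
        y = np.concatenate([y, measured])
    return X, y, model
```

All of the difficulty discussed below hides inside `propose_designs`.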

With poor ML modeling, exploration is simply trying random "mutations". For high-dimensional and "spiky" optimization spaces where successes are rare, this doesn't give positive signal, so you can't learn. (In other words, learning by process of elimination is not viable.) Thus, your exploration still needs to be strategic, so that the designs you propose lead to useful data for learning and optimization.

Similarly, with poor ML modeling, exploitation is simply trying random mutations around the design that has worked best so far. This is also unlikely to succeed, because in high dimensions not only is the space of designs large, but even the space of mutations is large. Furthermore, a design that performs well is frequently still a "local maximum", so your model has to do something smarter than merely search its neighborhood. Thus, weak ML models are unlikely to deliver improvements: your model needs to learn to construct reasonable hypotheses for why certain previous designs worked better than others, and to generalize compositionally from the designs that worked well previously.
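As a toy illustration (assuming binary design vectors purely for simplicity), here is what the naive mutation-based strategies look like next to a proposal step where the surrogate model ranks a large random pool:

```python
# Toy proposal strategies over binary design vectors (a simplifying assumption),
# contrasting naive mutation-based exploration/exploitation with a step where
# the surrogate model decides what gets measured next.
import numpy as np

rng = np.random.default_rng(0)

def random_exploration(X, n_candidates=16, n_flips=3):
    # Naive exploration: random "mutations" of randomly chosen previous designs.
    candidates = X[rng.integers(len(X), size=n_candidates)].copy()
    for c in candidates:
        flip = rng.integers(c.shape[0], size=n_flips)
        c[flip] = 1 - c[flip]
    return candidates

def local_exploitation(X, y, n_candidates=16, n_flips=1):
    # Naive exploitation: small perturbations of the single best design so far.
    candidates = np.tile(X[np.argmax(y)], (n_candidates, 1))
    for c in candidates:
        flip = rng.integers(c.shape[0], size=n_flips)
        c[flip] = 1 - c[flip]
    return candidates

def model_guided_proposals(model, X, y, n_pool=10_000, n_candidates=16):
    # Model-guided proposals: let the surrogate score a large pool and keep the
    # top designs, so the model (not the mutation operator) decides what to measure.
    pool = rng.integers(0, 2, size=(n_pool, X.shape[1]))
    scores = model.predict(pool)
    return pool[np.argsort(scores)[-n_candidates:]]
```

Only the last of these matches the `propose_designs` signature in the loop sketch above, and only it lets what the model has learned shape the next batch; in practice you'd also want an explicit uncertainty or acquisition term rather than a greedy top-k.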

Thus, while I agree that closed-loop optimization is necessary, it's not sufficient. You need strong AI models that aggregate as much domain knowledge as possible, that are able to reason strategically about optimal next actions, and (most importantly, IMO) that are able to reason inductively about how actions influence state and, in turn, objectives. (One loose analogy: Big Tech companies were correct to anticipate that conversational AI was the architecture of the future, but Siri, Alexa, and Google Assistant struggled because the right ML tech, LLMs, hadn't yet been unlocked by OpenAI.)

This doesn't mean that the inductive reasoning needs to take the form of simplistic, legible hypotheses! One promising direction is TabPFN (https://www.nature.com/articles/s41586-024-08328-6.pdf), which uses meta-learning on synthetic data to train a neural network to perform tabular prediction tasks in-context, giving it a native inductive reasoning ability. (Full disclosure: I'm a bit biased, as I've contributed in a small way to this research direction.) Regardless of whether this or some other approach ends up working best, I think getting the ML details right is key.
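For instance, a sketch of how such a model might slot in as the surrogate in the loop above, assuming the tabpfn package's scikit-learn-style TabPFNRegressor interface (toy random data stands in for real assay results):

```python
# Illustrative only: an in-context tabular model as the surrogate, assuming
# the tabpfn package's scikit-learn-style TabPFNRegressor interface.
import numpy as np
from tabpfn import TabPFNRegressor

rng = np.random.default_rng(0)
X_measured = rng.random((64, 20))        # designs measured so far (toy stand-in data)
y_measured = rng.random(64)              # their assay readouts (toy stand-in data)
candidate_pool = rng.random((2_000, 20)) # designs we could try next

surrogate = TabPFNRegressor()
surrogate.fit(X_measured, y_measured)    # "fit" is in-context conditioning, not gradient training
scores = surrogate.predict(candidate_pool)
next_batch = candidate_pool[np.argsort(scores)[-16:]]  # top-scoring designs for the next round
```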

Akash Kulgod

This is excellent. Thanks for writing the long version.
