The use and abuse of machine learning in astronomy

Machine learning is both underused and overused in astrophysics.

Where is it overused?

  • Many ML applications in astrophysics aren’t necessary. When modern ML is applied to the wrong problem, its results can be overinterpreted and extrapolated too far.
  • A typical example is a deep neural network applied to a low-data, low-dimensional problem.

A simpler algorithm will often do just as well and be far more interpretable. Interpretation is the essence of science!

In this regime we need more inductive biases and priors, more Bayes, and more explicit interpretation.
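To make the low-data point concrete, here is a toy sketch. Everything in it is an assumption for illustration: the "truth" is a made-up straight line, the noise level is arbitrary, and a degree-9 polynomial stands in for an over-flexible model (it is not a neural network, just the cheapest way to show the same failure mode).

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def truth(x):
    # Hypothetical underlying relation: a plain straight line.
    return 2.0 * x + 1.0

# A low-data, low-dimensional problem: 15 noisy points in one variable.
x_train = np.sort(rng.uniform(0.0, 1.0, 15))
y_train = truth(x_train) + rng.normal(scale=0.5, size=x_train.size)

# Evaluate slightly beyond the training range to probe extrapolation.
x_test = np.linspace(0.0, 1.2, 200)

# Simple model: straight line. Flexible stand-in: degree-9 polynomial.
simple = Polynomial.fit(x_train, y_train, deg=1)
flexible = Polynomial.fit(x_train, y_train, deg=9)

# Mean squared error against the noise-free truth.
simple_mse = float(np.mean((simple(x_test) - truth(x_test)) ** 2))
flexible_mse = float(np.mean((flexible(x_test) - truth(x_test)) ** 2))

# The simple model's parameters can be read off directly.
intercept, slope = simple.convert().coef

print(f"linear fit:   MSE = {simple_mse:.3g}, slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"degree-9 fit: MSE = {flexible_mse:.3g}")
```

The straight line gives you two numbers you can interpret (a slope and an intercept), while the flexible fit chases the noise and goes badly wrong just outside the training range.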

However, it’s also underused.

  • There are countless areas where ML has not yet been applied and where it could have a massive impact.
  • ML is not just regression & classification!

We need more simulation-based inference, more symbolic learning, and more variety of applications.

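The most obvious area at the moment seems to be simulation-based (likelihood-free) inference, where parameters are inferred by comparing simulator output to data rather than by evaluating an explicit likelihood. This will have a huge impact on astrophysics and is just getting started. The idea is essentially summarized by the following figure:

[Figure: Likelihood-free inference. See papers.]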
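As a rough illustration of the idea, here is a minimal sketch of rejection ABC (approximate Bayesian computation), the simplest classical form of likelihood-free inference. It is not the neural approach that modern simulation-based inference usually refers to, and every specific here is an assumption for illustration: the `simulate` function is a hypothetical stand-in for an expensive simulator, the prior is flat on [-5, 5], and the sample mean is used as the summary statistic.

```python
import numpy as np

def simulate(mu, n, rng):
    # Hypothetical forward model: n noisy measurements with unknown mean mu.
    # In practice this would be a simulator whose likelihood we cannot write down.
    return rng.normal(loc=mu, scale=1.0, size=n)

def rejection_abc(observed, n_draws=20000, tolerance=0.05, seed=0):
    """Keep prior draws whose simulated summary lands close to the observed one."""
    rng = np.random.default_rng(seed)
    obs_summary = observed.mean()
    # Assumed prior: flat on [-5, 5].
    candidates = rng.uniform(-5.0, 5.0, size=n_draws)
    accepted = []
    for mu in candidates:
        sim_summary = simulate(mu, observed.size, rng).mean()
        # Accept the draw only if the simulation resembles the data.
        if abs(sim_summary - obs_summary) < tolerance:
            accepted.append(mu)
    return np.array(accepted)

# "Observed" data generated with a known mean, so the result can be checked.
data_rng = np.random.default_rng(42)
observed = simulate(1.5, 50, data_rng)

posterior_samples = rejection_abc(observed)
print(f"accepted {posterior_samples.size} draws")
print(f"posterior mean = {posterior_samples.mean():.2f}, std = {posterior_samples.std():.2f}")
```

The accepted draws approximate the posterior without the likelihood ever being evaluated; only the simulator is called. Rejection sampling wastes most of its simulations, which is a large part of why learned alternatives are attractive when each simulation is expensive.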

Symbolic learning also seems to be extremely underused in a field built on (!) symbolic expressions. Check out this paper where we extract Newton’s law from a trained graph network: arXiv:1909.05862.

[Figure: Symbolic formula from a NN]
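For a feel of what recovering a formula means, here is a toy sketch of symbolic regression by brute-force search over a handful of candidate expressions. It is not the method from the paper: there is no graph network here, the forces are simulated directly, and the candidate list, the constant `G = 2.0`, and the 1% noise level are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated two-body data in arbitrary units (all values are assumptions).
m1 = rng.uniform(1.0, 10.0, n)
m2 = rng.uniform(1.0, 10.0, n)
r = rng.uniform(1.0, 5.0, n)
G = 2.0
force = G * m1 * m2 / r**2 * (1.0 + 0.01 * rng.normal(size=n))

# A tiny, hand-written space of candidate expressions, each with one free constant c.
candidates = {
    "c * m1 * m2": lambda m1, m2, r: m1 * m2,
    "c * m1 * m2 / r": lambda m1, m2, r: m1 * m2 / r,
    "c * m1 * m2 / r**2": lambda m1, m2, r: m1 * m2 / r**2,
    "c * m1 * m2 / r**3": lambda m1, m2, r: m1 * m2 / r**3,
    "c * (m1 + m2) / r**2": lambda m1, m2, r: (m1 + m2) / r**2,
}

results = {}
for name, expr in candidates.items():
    basis = expr(m1, m2, r)
    # Closed-form least-squares estimate of c in force ≈ c * basis.
    c = float(basis @ force / (basis @ basis))
    mse = float(np.mean((force - c * basis) ** 2))
    results[name] = (c, mse)

# Pick the expression with the smallest mean squared error.
best = min(results, key=lambda k: results[k][1])
best_c, best_mse = results[best]
print(f"best expression: {best} with c = {best_c:.3f} (MSE = {best_mse:.3g})")
```

Dedicated symbolic regression tools search vastly larger expression spaces and trade accuracy against complexity, but the output has the same flavor: a formula you can read, not a table of weights.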

Also see @pushmeet’s nice article where he discusses how interpretable ML can help accelerate science more generally.
