SymbolicRegression.jl


Distributed, high-performance symbolic regression in Julia.

Check out PySR for a Python frontend.

<img src="https://astroautomata.com/data/srdemoimage1.png" alt="demo1" width="700"/> <img src="https://astroautomata.com/data/srdemoimage2.png" alt="demo2" width="700"/>

Cite this software

Quickstart

Install in Julia with:

using Pkg
Pkg.add("SymbolicRegression")

The heart of this package is the EquationSearch function, which takes a 2D array (shape [features, rows]) and attempts to model a 1D array (shape [rows]) using analytic functional forms.
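
Because the expected layout is [features, rows], data stored with one row per sample has to be transposed before it is passed in. The following is a minimal sketch in plain Julia; the name samples and its values are made up for illustration and are not part of the package:

```julia
# Hypothetical data: 3 samples (rows), 2 features (columns).
samples = Float32[1 2; 3 4; 5 6]

# Swap the axes so that each row of X holds one feature.
X = permutedims(samples)   # size (2, 3), i.e. (features, rows)

# The target is a plain vector with one entry per sample.
y = X[1, :] .^ 2 .+ X[2, :]

@show size(X) size(y)
```

The length of y should match the second dimension of X, since both count samples.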

Run distributed on four processes with:

using SymbolicRegression

X = randn(Float32, 5, 100)
y = 2 * cos.(X[4, :]) + X[1, :] .^ 2 .- 2

options = SymbolicRegression.Options(
    binary_operators=(+, *, /, -),
    unary_operators=(cos, exp),
    npopulations=20
)

hall_of_fame = EquationSearch(X, y, niterations=40, options=options, numprocs=4)

We can view the equations in the dominating Pareto frontier with:

dominating = calculate_pareto_frontier(X, y, hall_of_fame, options)
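
To illustrate what "dominating" means here, the sketch below filters a list of candidates down to those that achieve a strictly lower loss than every simpler candidate. It is plain Julia on made-up (complexity, loss) pairs; the function pareto_front is a hypothetical helper for illustration and is not necessarily how calculate_pareto_frontier is implemented:

```julia
# Hypothetical (complexity, loss) pairs, sorted by increasing complexity.
candidates = [(1, 2.0), (3, 0.9), (4, 1.2), (5, 0.1), (7, 0.1)]

# Keep a candidate only if its loss is lower than that of every simpler one.
function pareto_front(candidates)
    front = eltype(candidates)[]
    best_loss = Inf
    for (complexity, loss) in candidates
        if loss < best_loss
            push!(front, (complexity, loss))
            best_loss = loss
        end
    end
    return front
end

front = pareto_front(candidates)
println(front)
```

In this toy input, the candidates with complexity 4 and 7 are dropped because a simpler candidate already reaches an equal or lower loss.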

We can convert the best equation (the last entry of the frontier) to a SymbolicUtils.jl expression with node_to_symbolic:

using SymbolicUtils  # provides simplify
eqn = node_to_symbolic(dominating[end].tree, options)
println(simplify(eqn*5 + 3))

We can also print out the full Pareto frontier like so:

println("Complexity\tMSE\tEquation")

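for member in dominating
    # Use names that do not shadow Base.size and Base.string.
    complexity = count_nodes(member.tree)  # number of nodes in the expression tree
    loss = member.loss
    equation = string_tree(member.tree, options)

    println("$(complexity)\t$(loss)\t$(equation)")
end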

Contents