Parametrized Template Expressions
Template expressions in SymbolicRegression.jl can include parametric forms: expressions with tunable constants that are optimized during the search. They can even include class-specific parameters that vary by category, analogous to `ParametricExpression`s.
In this tutorial, we'll demonstrate how to use parametric template expressions to learn a model where:
- Some constants are shared across all data points
- Other constants vary by class
- The structure combines known forms (like cosine) with unknown sub-expressions
using SymbolicRegression
using Random: MersenneTwister, randn, rand
using MLJBase: machine, fit!, predict, report
The Model Structure
We'll work with a model that combines:
- A cosine term with class-specific phase shifts
- A polynomial term
- Global scaling parameters
Specifically, let's say that our true model has the form:
\[y = A \cos(f(x_2) + \Delta_c) + g(x_1) - B\]
where:
- $A$ is a global amplitude (same for all classes)
- $\Delta_c$ is a phase shift that depends on the class label
- $f(x_2)$ is some function of $x_2$ (in our case, just $x_2$)
- $g(x_1)$ is some function of $x_1$ (in our case, $x_1^2$)
- $B$ is a global offset
We'll generate synthetic data where:
- $A = 2.0$ (amplitude)
- $\Delta_1 = 0.1$ (phase shift for class 1)
- $\Delta_2 = 1.5$ (phase shift for class 2)
- $B = 2.0$ (offset)
# Set random seed for reproducibility
rng = MersenneTwister(0)
# Number of data points
n = 200
# Generate random features
x1 = randn(rng, n) # feature 1
x2 = randn(rng, n) # feature 2
class = rand(rng, 1:2, n) # class labels 1 or 2
# Define the true parameters
Δ_phase = [0.1, 1.5] # phase shift for class 1 and 2
A = 2.0 # amplitude
B = 2.0 # offset
# Add some noise
eps = randn(rng, n) * 1e-5
# Generate targets using the true underlying function
y = [
A * cos(x2[i] + Δ_phase[class[i]]) + x1[i]^2 - B
for i in 1:n
]
y .+= eps
200-element Vector{Float64}:
-0.5701122294617497
-0.5997248546207727
-3.213537760261551
-3.7125477018459843
-2.4548070885543494
-1.9517096670556926
1.1041462223217429
-2.244653893032382
-2.2719553386239197
-1.2771884871477148
⋮
-0.03912319768225371
-2.942389458733852
-0.5295870820960696
-1.0432446686006935
-0.2995868424235526
-3.538898427486066
-0.811395381747143
-0.2146004346109726
0.5166949494020697
Defining the Template
Now we'll use the `@template_spec` macro to encode this structure, which will create a `TemplateExpressionSpec` object.
# Define the template structure with sub-expressions f and g
template = @template_spec(
expressions=(f, g),
parameters=(p1=2, p2=2)
) do x1, x2, class
return p1[1] * cos(f(x2) + p2[class]) + g(x1) - p1[2]
end
TemplateExpressionSpec{TemplateStructure{(:f, :g), (:p1, :p2), Main.var"#3#4", @NamedTuple{f::Int64, g::Int64}, @NamedTuple{p1::Int64, p2::Int64}}}(TemplateStructure{(:f, :g), (:p1, :p2), Main.var"#3#4", @NamedTuple{f::Int64, g::Int64}, @NamedTuple{p1::Int64, p2::Int64}}(Main.var"#3#4"(), (f = 1, g = 1), (p1 = 2, p2 = 2)))
Let's break down this template:
- We declared two sub-expressions, `f` and `g`, that we want to learn.
  - By calling `f(x2)` and `g(x1)`, the forward pass will constrain both expressions to only include a single input argument.
- We declared two parameter vectors: `p1` (length 2) and `p2` (length 2).
- The template combines these components as follows (see the sketch after this list for a concrete view of the parameter indexing):
  - `p1[1]` is the amplitude (a global parameter)
  - `cos(f(x2) + p2[class])` adds a class-specific phase shift via `p2[class]`
  - `g(x1)` represents (we hope) the quadratic term
  - `p1[2]` is the global offset
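To make the parameter indexing concrete, here is a minimal plain-Julia sketch of what the template computes once the sub-expressions and parameters are fixed. This is not the library's internals; the `_demo` names are hypothetical stand-ins, filled with the true generating values from above:
# Hedged sketch of the template's forward pass, with hypothetical
# stand-ins (the `_demo` names are illustrative, not from the library)
p1_demo = [2.0, 2.0]    # global amplitude and offset, as in the true model
p2_demo = [0.1, 1.5]    # per-class phase shifts, selected by each sample's label
f_demo(x) = x           # stand-in for the learned sub-expression f
g_demo(x) = x^2         # stand-in for the learned sub-expression g
y_demo = [
    p1_demo[1] * cos(f_demo(x2[i]) + p2_demo[class[i]]) + g_demo(x1[i]) - p1_demo[2]
    for i in 1:n
]
Note how `p2_demo[class[i]]` selects a different phase shift per sample; this is exactly the role `p2[class]` plays in the template.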
Now we'll set up an `SRRegressor` with our template:
model = SRRegressor(
binary_operators = (+, -, *, /),
niterations = 300,
maxsize = 20,
expression_spec = template,
)
# Package data up for MLJ
X = (; x1, x2, class)
mach = machine(model, X, y)
untrained Machine; caches model-specific representations of data
model: SRRegressor(defaults = nothing, …)
args:
1: Source @069 ⏎ ScientificTypesBase.Table{Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}}
2: Source @598 ⏎ AbstractVector{ScientificTypesBase.Continuous}
At this point, you would run:
fit!(mach)
which will evolve expressions following our template structure. The final result is accessible with:
report(mach)
which returns a named tuple of the fitted results, including the `.equations` field containing the `TemplateExpression` objects that dominated the Pareto front.
Interpreting Results
After training, you can inspect the expressions found:
r = report(mach)
best_expr = r.equations[r.best_idx]
You can also extract the individual sub-expressions (stored as `ComposableExpression` objects):
inner_exprs = get_contents(best_expr)
metadata = get_metadata(best_expr)
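For example, you might print the sub-expressions to verify their structure. This is a hedged sketch: it assumes `get_contents` returns a named tuple keyed by the sub-expression names declared in the template (`f` and `g` here):
# Hedged sketch: assumes inner_exprs is keyed by the declared names
println(inner_exprs.f)   # ideally equivalent to x1 (relative argument indexing)
println(inner_exprs.g)   # ideally equivalent to x1 * x1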
The learned expression should closely match our true generating function:
- `f(x2)` should be approximately `x2` (note that it will show up as `x1` in the raw contents, but this is simply relative indexing of its arguments!)
- `g(x1)` should be approximately `x1^2`
- The parameters should be close to their true values (a hedged access sketch follows this list):
  - `p1[1] ≈ 2.0` (amplitude)
  - `p1[2] ≈ 2.0` (offset)
  - `p2[1] ≈ 0.1 mod 2π` (phase shift for class 1)
  - `p2[2] ≈ 1.5 mod 2π` (phase shift for class 2)
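If you want to compare the fitted parameters against the true values numerically, note that the exact field layout of the metadata object is an assumption here and may differ between versions; check `propertynames(metadata)` if the access below errors:
# Assumption: fitted parameter vectors are stored on the metadata under
# their declared names; verify with propertynames(metadata) first
p1_fit = metadata.parameters.p1
p2_fit = metadata.parameters.p2
@show p1_fit p2_fit   # expect ≈ [2.0, 2.0] and ≈ [0.1, 1.5] (mod 2π)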
You can use the learned expression to make predictions using either `predict(mach, X)`, or by calling `best_expr(X_raw)` directly (note that `X_raw` needs to be a matrix of shape `(n, d)`, where `n` is the number of samples and `d` is the dimension of the features).
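As a final check, here is a hedged sketch of evaluating fit quality through the MLJ interface (the error metric is just for illustration):
# Predict on the training features through the MLJ interface
ypred = predict(mach, X)
# A simple illustrative error metric
mean_abs_err = sum(abs.(ypred .- y)) / n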