Customization

Many parts of SymbolicRegression.jl are designed to be customizable.

The usual way to do this in Julia is to define a new type that subtypes an abstract type from the package, and then extend the package's internal methods with new methods for that type.
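As a minimal sketch of this pattern, the toy module below stands in for a package. The names `ToyPackage`, `AbstractShape`, `area`, `describe`, and `Square` are hypothetical and used only for illustration; none of them are part of SymbolicRegression.jl.

```julia
# Hypothetical stand-in for a package that exposes an abstract type.
module ToyPackage
    abstract type AbstractShape end

    # Internal function that subtypes are expected to implement.
    area(shape::AbstractShape) = error("area not implemented for $(typeof(shape))")

    # Package code written only against the abstract type.
    describe(shape::AbstractShape) = "shape with area $(area(shape))"
end

# User code: subtype the abstract type...
struct Square <: ToyPackage.AbstractShape
    side::Float64
end

# ...and extend the package's internal method for the new type.
ToyPackage.area(s::Square) = s.side^2

# The package's generic code now dispatches to the new method.
println(ToyPackage.describe(Square(2.0)))
```

The same idea applies to the abstract types described on this page: the package's generic code calls functions on the abstract type, and dispatch picks up whichever methods you have added for your subtype.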

Custom Options

For example, you can define a custom options type:

SymbolicRegression.CoreModule.OptionsStructModule.AbstractOptions (Type)
AbstractOptions

An abstract type that stores all search hyperparameters for SymbolicRegression.jl. The standard implementation is Options.

You may wish to create a new subtype of AbstractOptions to override certain functions or create new behavior. Ensure that this new type has all properties of Options.

For example, if we have new options that we want to add to Options:

Base.@kwdef struct MyNewOptions
    a::Float64 = 1.0
    b::Int = 3
end

we can create a combined options type that forwards properties to each corresponding type:

using SymbolicRegression: SymbolicRegression

struct MyOptions{O<:SymbolicRegression.Options} <: SymbolicRegression.AbstractOptions
    new_options::MyNewOptions
    sr_options::O
end
const NEW_OPTIONS_KEYS = fieldnames(MyNewOptions)

# Constructor with both sets of parameters:
function MyOptions(; kws...)
    new_options_keys = filter(k -> k in NEW_OPTIONS_KEYS, keys(kws))
    new_options = MyNewOptions(; NamedTuple(new_options_keys .=> Tuple(kws[k] for k in new_options_keys))...)
    sr_options_keys = filter(k -> !(k in NEW_OPTIONS_KEYS), keys(kws))
    sr_options = SymbolicRegression.Options(; NamedTuple(sr_options_keys .=> Tuple(kws[k] for k in sr_options_keys))...)
    return MyOptions(new_options, sr_options)
end

# Make all `Options` available while also making `new_options` accessible
function Base.getproperty(options::MyOptions, k::Symbol)
    if k in NEW_OPTIONS_KEYS
        return getproperty(getfield(options, :new_options), k)
    else
        return getproperty(getfield(options, :sr_options), k)
    end
end

Base.propertynames(options::MyOptions) = (NEW_OPTIONS_KEYS..., fieldnames(SymbolicRegression.Options)...)

which lets you access a and b from MyOptions objects, while also making all properties of Options available to internal methods in SymbolicRegression.jl.
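To see the forwarding behavior in isolation, here is a self-contained sketch that runs without SymbolicRegression.jl. `BaseOptions` is a hypothetical stand-in for `Options`, and `ExtraOptions`, `CombinedOptions`, and `EXTRA_KEYS` are illustrative names. The keyword splitting uses a filtered generator rather than the `NamedTuple` construction above, purely for brevity.

```julia
# `BaseOptions` is a hypothetical stand-in for `SymbolicRegression.Options`.
Base.@kwdef struct BaseOptions
    maxsize::Int = 30
    populations::Int = 15
end

Base.@kwdef struct ExtraOptions
    a::Float64 = 1.0
    b::Int = 3
end

struct CombinedOptions
    extra::ExtraOptions
    base::BaseOptions
end
const EXTRA_KEYS = fieldnames(ExtraOptions)

# Route each keyword argument to the struct that owns it.
function CombinedOptions(; kws...)
    extra = ExtraOptions(; (k => v for (k, v) in kws if k in EXTRA_KEYS)...)
    base = BaseOptions(; (k => v for (k, v) in kws if !(k in EXTRA_KEYS))...)
    return CombinedOptions(extra, base)
end

# Forward property access to whichever wrapped struct holds the key.
function Base.getproperty(o::CombinedOptions, k::Symbol)
    inner = k in EXTRA_KEYS ? getfield(o, :extra) : getfield(o, :base)
    return getproperty(inner, k)
end
Base.propertynames(::CombinedOptions) = (EXTRA_KEYS..., fieldnames(BaseOptions)...)

opts = CombinedOptions(; a=2.5, maxsize=20)
println((opts.a, opts.b, opts.maxsize, opts.populations))
```

Keys that were not passed fall back to the defaults of the struct that owns them, so code that only knows about the base properties keeps working unchanged.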


In general, you can define a new method of any function in SymbolicRegression.jl for your custom options type to customize its behavior.

Custom Mutations

You can define custom mutation operators by defining a new method on mutate!, as well as subtyping AbstractMutationWeights:

SymbolicRegression.MutateModule.mutate! (Function)
mutate!(
    tree::N,
    member::P,
    ::Val{S},
    mutation_weights::AbstractMutationWeights,
    options::AbstractOptions;
    kws...,
) where {N<:AbstractExpression,P<:PopMember,S}

Perform a mutation on the given tree and member using the specified mutation type S. Various kws are provided to access other data needed for some mutations.

You may overload this function to handle new mutation types for new AbstractMutationWeights types.

Keywords

  • temperature: The temperature parameter for annealing-based mutations.
  • dataset::Dataset: The dataset used for scoring.
  • score: The score of the member before mutation.
  • loss: The loss of the member before mutation.
  • curmaxsize: The current maximum size constraint, which may be different from options.maxsize.
  • nfeatures: The number of features in the dataset.
  • parent_ref: Reference to the mutated member's parent (only used for logging purposes).
  • recorder::RecordType: A recorder to log mutation details.

Returns

A MutationResult{N,P} object containing the mutated tree or member (but not both), the number of evaluations performed (if any), and a flag indicating whether to return immediately from the mutation function or to let the next_generation function handle accepting or rejecting the mutation. For example, a simplify operation will not change the loss, so it can always return immediately.

SymbolicRegression.CoreModule.MutationWeightsModule.AbstractMutationWeights (Type)
AbstractMutationWeights

An abstract type that defines the interface for mutation weight structures in the symbolic regression framework. Subtypes of AbstractMutationWeights specify how often different mutation operations occur during the mutation process.

You can create custom mutation weight types by subtyping AbstractMutationWeights and defining your own mutation operations. Additionally, you can overload the sample_mutation function to handle sampling from your custom mutation types.

Usage

To create a custom mutation weighting scheme with new mutation types, define a new subtype of AbstractMutationWeights and implement the necessary fields. Here's an example using Base.@kwdef to define the struct with default values:

using SymbolicRegression: AbstractMutationWeights

# Define custom mutation weights with default values
Base.@kwdef struct MyMutationWeights <: AbstractMutationWeights
    mutate_constant::Float64 = 0.1
    mutate_operator::Float64 = 0.2
    custom_mutation::Float64 = 0.7
end

Next, overload the sample_mutation function to include your custom mutation types:

# Define the list of mutation names (symbols)
const MY_MUTATIONS = [
    :mutate_constant,
    :mutate_operator,
    :custom_mutation
]

# Import the `sample_mutation` function to overload it
import SymbolicRegression: sample_mutation
using StatsBase: StatsBase

# Overload the `sample_mutation` function
function sample_mutation(w::MyMutationWeights)
    weights = [
        w.mutate_constant,
        w.mutate_operator,
        w.custom_mutation
    ]
    weights = weights ./ sum(weights)  # Normalize weights to sum to 1.0
    return StatsBase.sample(MY_MUTATIONS, StatsBase.Weights(weights))
end

# Pass it when defining `Options`:
using SymbolicRegression: Options
options = Options(mutation_weights=MyMutationWeights())

This allows you to customize the mutation sampling process to include your custom mutations according to their specified weights.
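To make the effect of the weights concrete, the following sketch draws mutation names with probability proportional to their weights. It deliberately swaps `StatsBase.sample` for a hand-written inverse-CDF sampler so that it runs with only the standard library. `MUTATION_NAMES` and `sample_name` are hypothetical names for illustration, not part of SymbolicRegression.jl.

```julia
using Random: MersenneTwister

const MUTATION_NAMES = [:mutate_constant, :mutate_operator, :custom_mutation]

# Inverse-CDF sampling: draw u in [0, 1) and return the first name whose
# cumulative normalized weight exceeds u.
function sample_name(rng, names, weights)
    probs = weights ./ sum(weights)
    u = rand(rng)
    cumulative = 0.0
    for (name, p) in zip(names, probs)
        cumulative += p
        u < cumulative && return name
    end
    return last(names)  # guard against floating-point round-off
end

rng = MersenneTwister(0)
weights = [0.1, 0.2, 0.7]
draws = [sample_name(rng, MUTATION_NAMES, weights) for _ in 1:10_000]

# Empirical frequencies should be close to the normalized weights.
freqs = Dict(n => count(==(n), draws) / length(draws) for n in MUTATION_NAMES)
println(freqs)
```

With the weights above, roughly 70% of draws should be `:custom_mutation`, which is the behavior the `sample_mutation` overload is meant to produce.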

To integrate your custom mutations into the mutation process, ensure that the mutation functions corresponding to your custom mutation types are defined and properly registered with the symbolic regression framework. You may need to define methods for mutate! that handle your custom mutation types.
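A `mutate!` method for the `:custom_mutation` symbol might look roughly like the sketch below. It makes several assumptions: that `mutate!` and `MutationResult` can be imported from `SymbolicRegression.MutateModule` as the docstring paths on this page suggest, that `MutationResult{N,P}` accepts a `tree` keyword, and that `my_custom_transform` (a hypothetical placeholder that only copies the tree) is replaced by your real mutation logic. Check the signatures against the version of the package you have installed.

```julia
using SymbolicRegression:
    AbstractExpression, AbstractMutationWeights, AbstractOptions, PopMember
import SymbolicRegression.MutateModule: mutate!, MutationResult

# Same weights struct as in the example above, repeated so this block stands alone.
Base.@kwdef struct MyMutationWeights <: AbstractMutationWeights
    mutate_constant::Float64 = 0.1
    mutate_operator::Float64 = 0.2
    custom_mutation::Float64 = 0.7
end

# Hypothetical placeholder: a real mutation would return a modified expression.
my_custom_transform(tree::AbstractExpression) = copy(tree)

# Dispatch on `Val{:custom_mutation}` and on the custom weights type.
function mutate!(
    tree::N,
    member::P,
    ::Val{:custom_mutation},
    ::MyMutationWeights,
    options::AbstractOptions;
    kws...,
) where {N<:AbstractExpression,P<:PopMember}
    new_tree = my_custom_transform(tree)
    # Return only the tree (assumed keyword constructor), leaving scoring and
    # acceptance or rejection to the caller.
    return MutationResult{N,P}(; tree=new_tree)
end
```

Returning the tree rather than a member leaves scoring to the search loop, which fits a mutation that changes the expression and therefore its loss.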

See Also

  • MutationWeights: A concrete implementation of AbstractMutationWeights that defines default mutation weightings.
  • sample_mutation: Function to sample a mutation based on current mutation weights.
  • mutate!: Function to apply a mutation to an expression tree.
  • AbstractOptions: See how to extend abstract types for customizing options.
SymbolicRegression.MutateModule.condition_mutation_weights! (Function)
condition_mutation_weights!(weights::AbstractMutationWeights, member::PopMember, options::AbstractOptions, curmaxsize::Int)

Adjusts the mutation weights based on the properties of the current member and options.

This function modifies the mutation weights to ensure that the mutations applied to the member are appropriate given its current state and the provided options. It can be overloaded to customize the behavior for different types of expressions or members.

Note that the weights have already been copied, so you can safely modify them in place.

Arguments

  • weights::AbstractMutationWeights: The mutation weights to be adjusted.
  • member::PopMember: The current population member being mutated.
  • options::AbstractOptions: The options that guide the mutation process.
  • curmaxsize::Int: The current maximum size constraint for the member's expression tree.
SymbolicRegression.MutateModule.MutationResult (Type)
MutationResult{N<:AbstractExpression,P<:PopMember}

Represents the result of a mutation operation in the genetic programming algorithm. This struct is used to return values from mutate! functions.

Fields

  • tree::Union{N, Nothing}: The mutated expression tree, if applicable. Either tree or member must be set, but not both.
  • member::Union{P, Nothing}: The mutated population member, if applicable. Either member or tree must be set, but not both.
  • num_evals::Float64: The number of evaluations performed during the mutation. Defaults to 0.0, and is only used for operations like optimize.
  • return_immediately::Bool: If true, the mutation process returns immediately and bypasses further checks. Used for operations like simplify or optimize, where the loss value of the result is already known.

Usage

This struct encapsulates the result of a mutation operation. Either a new expression tree or a new population member is returned, but not both.

Return the member if you want to return immediately, and have computed the loss value as part of the mutation.


Custom Expressions

You can create your own expression types by defining a new type that extends AbstractExpression.

DynamicExpressions.ExpressionModule.AbstractExpression (Type)
AbstractExpression{T,N}

(Experimental) Abstract type for user-facing expression types, which contain both the raw expression tree operating on a value type of T, as well as associated metadata to evaluate and render the expression.

See ExpressionInterface for a full description of the interface implementation, as well as tests to verify correctness.

If you wish to use @parse_expression, you can also customize the parsing behavior with

  • parse_leaf

The interface is fairly flexible, and permits you to define specific functional forms, extra parameters, etc. See the documentation of DynamicExpressions.jl for more details on what methods you need to implement. You can test the implementation of a given interface by using ExpressionInterface, which makes use of Interfaces.jl:

DynamicExpressions.InterfacesModule.ExpressionInterface (Type)
ExpressionInterface

An Interfaces.jl Interface with mandatory components (:get_contents, :get_metadata, :get_tree, :get_operators, :get_variable_names, :copy, :with_contents, :with_metadata) and optional components (:copy_into!, :count_nodes, :count_constant_nodes, :count_depth, :index_constant_nodes, :has_operators, :has_constants, :get_scalar_constants, :set_scalar_constants!, :string_tree, :default_node_type, :constructorof, :tree_mapreduce).

Defines the interface of AbstractExpression for user-facing expression types, which can store operators, extra parameters, functional forms, variable names, etc.

Extended help

Mandatory keys:

  • get_contents: extracts the runtime contents of an expression
  • get_metadata: extracts the runtime metadata of an expression
  • get_tree: extracts the expression tree from AbstractExpression
  • get_operators: returns the operators used in the expression (or pass operators explicitly to override)
  • get_variable_names: returns the variable names used in the expression (or pass variable_names explicitly to override)
  • copy: returns a copy of the expression
  • with_contents: returns the expression with a different tree
  • with_metadata: returns the expression with different metadata

Optional keys:

  • copy_into!: copies an expression into a preallocated container
  • count_nodes: counts the number of nodes in the expression tree
  • count_constant_nodes: counts the number of constant nodes in the expression tree
  • count_depth: calculates the depth of the expression tree
  • index_constant_nodes: indexes constants in the expression tree
  • has_operators: checks if the expression has operators
  • has_constants: checks if the expression has constants
  • get_scalar_constants: gets constants from the expression tree, returning a tuple of: (1) a flat vector of the constants, and (2) a reference object that can be used by set_scalar_constants! to efficiently set them back
  • set_scalar_constants!: sets constants in the expression tree, given: (1) a flat vector of constants, (2) the expression, and (3) the reference object produced by get_scalar_constants
  • string_tree: returns a string representation of the expression tree
  • default_node_type: returns the default node type for the expression
  • constructorof: gets the constructor function for a type
  • tree_mapreduce: applies a function across the tree
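As a rough sketch of how such a check can be run, the example below tests the built-in `Expression` type against `ExpressionInterface`. It assumes that `Interfaces.test` accepts the interface, the type, and a vector of test objects, and that `@parse_expression` takes `operators` and `variable_names` keywords; verify both against your installed versions of DynamicExpressions.jl and Interfaces.jl. For your own type, you would substitute it for `Expression` and pass instances of it.

```julia
using DynamicExpressions: Expression, ExpressionInterface, OperatorEnum, @parse_expression
using Interfaces: Interfaces

# Operators and a small example expression to run the checks on.
operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[sin])
ex = @parse_expression(
    x * sin(y) + 2.0, operators=operators, variable_names=["x", "y"]
)

# Runs the interface's component checks on the supplied objects;
# expected to evaluate to `true` when they all pass.
passed = Interfaces.test(ExpressionInterface, Expression, [ex])
println(passed)
```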

Then, for SymbolicRegression.jl, you would pass expression_type to the Options constructor, as well as any expression_options you need (as a NamedTuple).
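For instance, a sketch using the built-in `ParametricExpression` could look like the following. The `max_parameters` entry is an assumption based on the package's parametric-expression examples, and the operator choices are arbitrary. Newer releases may prefer a different way of specifying the expression type, so check the documentation for your version.

```julia
using SymbolicRegression: Options, ParametricExpression

options = Options(;
    binary_operators=[+, -, *],
    unary_operators=[cos],
    # Expression type used to represent candidate equations:
    expression_type=ParametricExpression,
    # Extra settings for that expression type, passed as a NamedTuple
    # (`max_parameters` is assumed from the package's examples):
    expression_options=(; max_parameters=2),
)
```

A custom expression type would be passed in the same position, with whatever `expression_options` its constructor expects.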

You may also need to overload SymbolicRegression.ExpressionBuilder.extra_init_params if your expression needs additional parameters. See the method for ParametricExpression as an example.

You can look at the files src/ParametricExpression.jl and src/TemplateExpression.jl for more examples of custom expression types, though note that ParametricExpression itself is defined in DynamicExpressions.jl, while that file just overloads some methods for SymbolicRegression.jl.

Other Customizations

Other internal abstract types include the following:

SymbolicRegression.SearchUtilsModule.AbstractRuntimeOptions (Type)
AbstractRuntimeOptions

An abstract type representing runtime configuration parameters for the symbolic regression algorithm.

AbstractRuntimeOptions is used by equation_search to control runtime aspects such as parallelism and iteration limits. By subtyping AbstractRuntimeOptions, advanced users can customize runtime behaviors by passing it to equation_search.

SymbolicRegression.SearchUtilsModule.AbstractSearchState (Type)
AbstractSearchState{T,L,N}

An abstract type encapsulating the internal state of the search process during symbolic regression.

AbstractSearchState instances hold information like populations and progress metrics, used internally by equation_search. Subtyping AbstractSearchState allows customization of search state management.

Look through the source of equation_search to see how this is used.


These let you include custom state variables and runtime options.