API

MLJ interface

SymbolicRegression.MLJInterfaceModule.SRRegressor - Type
SRRegressor

A model type for constructing a Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

SRRegressor = @load SRRegressor pkg=SymbolicRegression

Do model = SRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in SRRegressor(defaults=...).

Single-target Symbolic Regression regressor (SRRegressor) searches for symbolic expressions that predict a single target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.

  • y is the target, which can be any AbstractVector whose element scitype is Continuous; check the scitype with scitype(y). Units in y (use DynamicQuantities for units) will trigger dimensional analysis to be used.

  • w is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous.

Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of trained models. The model chosen from this list is determined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.

Hyper-parameters

  • defaults: What set of defaults to use for Options. The default, nothing, will simply take the default options from the current version of SymbolicRegression. However, you may also select the defaults from an earlier version, such as v"0.24.5".

  • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

  • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

  • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

  • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

  • batch_size: What batch size to use if using batching.

  • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: LPDistLoss{P}(), L1DistLoss(), L2DistLoss() (mean square), LogitDistLoss(), HuberLoss(d), L1EpsilonInsLoss(ϵ), L2EpsilonInsLoss(ϵ), PeriodicLoss(c), QuantileLoss(τ). Classification: ZeroOneLoss(), PerceptronLoss(), L1HingeLoss(), SmoothedL1HingeLoss(γ), ModifiedHuberLoss(), L2MarginLoss(), ExpLoss(), SigmoidLoss(), DWDMarginLoss(q).

  • loss_function: Alternatively, you may redefine the loss used as any function of tree::AbstractExpressionNode{T}, dataset::Dataset{T}, and options::AbstractOptions, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

      function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
          prediction, flag = eval_tree_array(tree, dataset.X, options)
          if !flag
              return L(Inf)
          end
          return sum((prediction .- dataset.y) .^ 2) / dataset.n
      end
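
    If batching=true, your loss should also accept the batch indices. A hedged sketch of that four-argument form (the view-based subsetting here is an illustrative assumption; only the extra idx argument is required by the interface):

      function my_batched_loss(tree, dataset::Dataset{T,L}, options, idx)::L where {T,L}
          # idx is nothing (full dataset) or a vector of row indices for the current mini-batch
          X = idx === nothing ? dataset.X : view(dataset.X, :, idx)
          y = idx === nothing ? dataset.y : view(dataset.y, idx)
          n = idx === nothing ? dataset.n : length(idx)
          prediction, flag = eval_tree_array(tree, X, options)
          !flag && return L(Inf)
          return sum((prediction .- y) .^ 2) / n
      end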
  • expression_type::Type{E}=Expression: The type of expression to use. For example, Expression.

  • node_type::Type{N}=default_node_type(Expression): The type of node to use for the search. For example, Node or GraphNode. The default is computed by default_node_type(expression_type).

  • populations: How many populations of equations to use.

  • population_size: How many equations in each population.

  • ncycles_per_iteration: How many generations to consider per iteration.

  • tournament_selection_n: Number of expressions considered in each tournament.

  • tournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.

  • topn: Number of equations to return to the host process, and to consider for the hall of fame.

  • complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].

  • complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.

  • complexity_of_variables: What complexity should be assigned to use of a variable, which can also be a vector indicating different per-variable complexity. By default, this is 1.

  • complexity_mapping: Alternatively, you can pass a function that takes the expression as input and returns the complexity. Make sure that this operates on AbstractExpression (and unpacks to AbstractExpressionNode), and returns an integer.

  • alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=Inf is equivalent to no annealing.

  • maxsize: Maximum size of equations during the search.

  • maxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.

  • parsimony: A multiplicative factor for how much complexity is punished.

  • dimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.

  • dimensionless_constants_only: Whether to only allow dimensionless constants.

  • use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.

  • use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.

  • adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.

  • turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!

  • bumper: Whether to use Bumper.jl for faster evaluation. Experimental!

  • migration: Whether to migrate equations between processes.

  • hof_migration: Whether to migrate equations from the hall of fame to processes.

  • fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.

  • fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.

  • should_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.

  • should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.

  • optimizer_algorithm: Select algorithm to use for optimizing constants. Default is Optim.BFGS(linesearch=LineSearches.BackTracking()).

  • optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.

  • optimizer_probability: Probability of performing optimization of constants at the end of a given iteration.

  • optimizer_iterations: How many optimization iterations to perform. This gets passed to Optim.Options as iterations. The default is 8.

  • optimizer_f_calls_limit: How many function calls to allow during optimization. This gets passed to Optim.Options as f_calls_limit. The default is 10_000.

  • optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.

  • autodiff_backend: The backend to use for differentiation, which should be an instance of AbstractADType (see ADTypes.jl). Default is nothing, which means Optim.jl will estimate gradients (likely with finite differences). You can also pass a symbolic version of the backend type, such as :Zygote for Zygote, :Enzyme, etc. Most backends will not work, and many will never work due to incompatibilities, though support for some is gradually being added.

  • perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).

  • probability_negate_constant: Probability of negating a constant in the equation when mutating it.

  • mutation_weights: Relative probabilities of the mutations. The struct MutationWeights (or any AbstractMutationWeights) should be passed to these options. See its documentation on MutationWeights for the different weights.

  • crossover_probability: Probability of performing crossover.

  • annealing: Whether to use simulated annealing.

  • warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.

  • verbosity: Whether to print debugging statements or not.

  • print_precision: How many digits to print when printing equations. By default, this is 5.

  • output_directory: The base directory to save output files to. Files will be saved in a subdirectory according to the run ID. By default, this is ./outputs.

  • save_to_file: Whether to save equations to a file during the search.

  • bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).

  • una_constraints: Likewise, for unary operators.

  • seed: What random seed to use. nothing uses no seed.

  • progress: Whether to use a progress bar output (verbosity will have no effect).

  • early_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.

  • timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).

  • max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.

  • input_stream: the stream to read user input from. By default, this is stdin. If you encounter issues with reading from stdin, like a hang, you can simply pass devnull to this argument.

  • skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.

  • nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.

  • deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.

  • define_helper_functions: Whether to define helper functions for constructing and evaluating trees.

  • niterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.

  • parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, all threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".

  • numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).

  • procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.

  • addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.

  • heap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.

  • worker_imports::Union{Vector{Symbol},Nothing}=nothing: If you want to import additional modules on each worker, pass them here as a vector of symbols. By default some of the extensions will automatically be loaded when needed.

  • runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.

  • run_id::Union{String,Nothing}=nothing: A unique identifier for the run. This will be used to store outputs from the run in the outputs directory. If not specified, a unique ID will be generated.

  • loss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.

  • selection_method::Function: Function to select an expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.

  • dimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.
  • predict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.
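
For example, a hedged sketch of overriding the selected expression at prediction time (Xnew stands for any table with the same column names and scitypes as the training features; the index 2 is arbitrary):

y_hat_default = predict(mach, Xnew)               # uses the expression chosen by selection_method
y_hat_second = predict(mach, (data=Xnew, idx=2))  # evaluates the expression at index 2 of the Pareto frontier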

Fitted parameters

The fields of fitted_params(mach) are:

  • best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). T is equal to the element type of the passed data.
  • equation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.

Report

The fields of report(mach) are:

  • best_idx::Int: The index of the best expression in the Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Node{T}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity).
  • equation_strings::Vector{String}: The expressions discovered by the search, represented as strings for easy inspection.
  • complexities::Vector{Int}: The complexity of each expression in the Pareto frontier.
  • losses::Vector{L}: The loss of each expression in the Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.
  • scores::Vector{L}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.

Examples

using MLJ
SRRegressor = @load SRRegressor pkg=SymbolicRegression
X, y = @load_boston
model = SRRegressor(binary_operators=[+, -, *], unary_operators=[exp], niterations=100)
mach = machine(model, X, y)
fit!(mach)
y_hat = predict(mach, X)
# View the equation used:
r = report(mach)
println("Equation used:", r.equation_strings[r.best_idx])

With units and variable names:

using MLJ
using DynamicQuantities
SRRegressor = @load SRRegressor pkg=SymbolicRegression

X = (; x1=rand(32) .* us"km/h", x2=rand(32) .* us"km")
y = @. X.x2 / X.x1 + 0.5us"h"
model = SRRegressor(binary_operators=[+, -, *, /])
mach = machine(model, X, y)
fit!(mach)
y_hat = predict(mach, X)
# View the equation used:
r = report(mach)
println("Equation used:", r.equation_strings[r.best_idx])

See also MultitargetSRRegressor.

source
SymbolicRegression.MLJInterfaceModule.MultitargetSRRegressor - Type
MultitargetSRRegressor

A model type for constructing a Multi-Target Symbolic Regression via Evolutionary Search, based on SymbolicRegression.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression

Do model = MultitargetSRRegressor() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in MultitargetSRRegressor(defaults=...).

Multi-target Symbolic Regression regressor (MultitargetSRRegressor) conducts several searches for expressions that predict each target variable from a set of input variables. All data is assumed to be Continuous. The search is performed using an evolutionary algorithm. This algorithm is described in the paper https://arxiv.org/abs/2305.01582.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X, y)

OR

mach = machine(model, X, y, w)

Here:

  • X is any table of input features (eg, a DataFrame) whose columns are of scitype Continuous; check column scitypes with schema(X). Variable names in discovered expressions will be taken from the column names of X, if available. Units in columns of X (use DynamicQuantities for units) will trigger dimensional analysis to be used.

  • y is the target, which can be any table of target variables whose element scitype is Continuous; check the scitype with schema(y). Units in columns of y (use DynamicQuantities for units) will trigger dimensional analysis to be used.

  • w is the observation weights which can either be nothing (default) or an AbstractVector whose element scitype is Count or Continuous. The same weights are used for all targets.

Train the machine using fit!(mach), inspect the discovered expressions with report(mach), and predict on new data with predict(mach, Xnew). Note that unlike other regressors, symbolic regression stores a list of lists of trained models. The models chosen from each of these lists are determined by the selection_method keyword argument (a function), which by default balances accuracy and complexity. You can override this at prediction time by passing a named tuple with keys data and idx.

Hyper-parameters

  • defaults: What set of defaults to use for Options. The default, nothing, will simply take the default options from the current version of SymbolicRegression. However, you may also select the defaults from an earlier version, such as v"0.24.5".

  • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

  • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

  • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

  • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

  • batch_size: What batch size to use if using batching.

  • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: LPDistLoss{P}(), L1DistLoss(), L2DistLoss() (mean square), LogitDistLoss(), HuberLoss(d), L1EpsilonInsLoss(ϵ), L2EpsilonInsLoss(ϵ), PeriodicLoss(c), QuantileLoss(τ). Classification: ZeroOneLoss(), PerceptronLoss(), L1HingeLoss(), SmoothedL1HingeLoss(γ), ModifiedHuberLoss(), L2MarginLoss(), ExpLoss(), SigmoidLoss(), DWDMarginLoss(q).

  • loss_function: Alternatively, you may redefine the loss used as any function of tree::AbstractExpressionNode{T}, dataset::Dataset{T}, and options::AbstractOptions, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

      function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
          prediction, flag = eval_tree_array(tree, dataset.X, options)
          if !flag
              return L(Inf)
          end
          return sum((prediction .- dataset.y) .^ 2) / dataset.n
      end
  • expression_type::Type{E}=Expression: The type of expression to use. For example, Expression.

  • node_type::Type{N}=default_node_type(Expression): The type of node to use for the search. For example, Node or GraphNode. The default is computed by default_node_type(expression_type).

  • populations: How many populations of equations to use.

  • population_size: How many equations in each population.

  • ncycles_per_iteration: How many generations to consider per iteration.

  • tournament_selection_n: Number of expressions considered in each tournament.

  • tournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.

  • topn: Number of equations to return to the host process, and to consider for the hall of fame.

  • complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].

  • complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.

  • complexity_of_variables: What complexity should be assigned to use of a variable, which can also be a vector indicating different per-variable complexity. By default, this is 1.

  • complexity_mapping: Alternatively, you can pass a function that takes the expression as input and returns the complexity. Make sure that this operates on AbstractExpression (and unpacks to AbstractExpressionNode), and returns an integer.

  • alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=Inf is equivalent to no annealing.

  • maxsize: Maximum size of equations during the search.

  • maxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.

  • parsimony: A multiplicative factor for how much complexity is punished.

  • dimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.

  • dimensionless_constants_only: Whether to only allow dimensionless constants.

  • use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.

  • use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.

  • adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.

  • turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!

  • bumper: Whether to use Bumper.jl for faster evaluation. Experimental!

  • migration: Whether to migrate equations between processes.

  • hof_migration: Whether to migrate equations from the hall of fame to processes.

  • fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.

  • fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.

  • should_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.

  • should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.

  • optimizer_algorithm: Select algorithm to use for optimizing constants. Default is Optim.BFGS(linesearch=LineSearches.BackTracking()).

  • optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.

  • optimizer_probability: Probability of performing optimization of constants at the end of a given iteration.

  • optimizer_iterations: How many optimization iterations to perform. This gets passed to Optim.Options as iterations. The default is 8.

  • optimizer_f_calls_limit: How many function calls to allow during optimization. This gets passed to Optim.Options as f_calls_limit. The default is 10_000.

  • optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.

  • autodiff_backend: The backend to use for differentiation, which should be an instance of AbstractADType (see ADTypes.jl). Default is nothing, which means Optim.jl will estimate gradients (likely with finite differences). You can also pass a symbolic version of the backend type, such as :Zygote for Zygote, :Enzyme, etc. Most backends will not work, and many will never work due to incompatibilities, though support for some is gradually being added.

  • perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).

  • probability_negate_constant: Probability of negating a constant in the equation when mutating it.

  • mutation_weights: Relative probabilities of the mutations. The struct MutationWeights (or any AbstractMutationWeights) should be passed to these options. See its documentation on MutationWeights for the different weights.

  • crossover_probability: Probability of performing crossover.

  • annealing: Whether to use simulated annealing.

  • warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.

  • verbosity: Whether to print debugging statements or not.

  • print_precision: How many digits to print when printing equations. By default, this is 5.

  • output_directory: The base directory to save output files to. Files will be saved in a subdirectory according to the run ID. By default, this is ./outputs.

  • save_to_file: Whether to save equations to a file during the search.

  • bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).

  • una_constraints: Likewise, for unary operators.

  • seed: What random seed to use. nothing uses no seed.

  • progress: Whether to use a progress bar output (verbosity will have no effect).

  • early_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.

  • timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).

  • max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.

  • input_stream: the stream to read user input from. By default, this is stdin. If you encounter issues with reading from stdin, like a hang, you can simply pass devnull to this argument.

  • skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.

  • nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.

  • deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.

  • define_helper_functions: Whether to define helper functions for constructing and evaluating trees.

  • niterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.

  • parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, all threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".

  • numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).

  • procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.

  • addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.

  • heap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.

  • worker_imports::Union{Vector{Symbol},Nothing}=nothing: If you want to import additional modules on each worker, pass them here as a vector of symbols. By default some of the extensions will automatically be loaded when needed.

  • runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.

  • run_id::Union{String,Nothing}=nothing: A unique identifier for the run. This will be used to store outputs from the run in the outputs directory. If not specified, a unique ID will be generated.

  • loss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.

  • selection_method::Function: Function to select an expression from the Pareto frontier for use in predict. See SymbolicRegression.MLJInterfaceModule.choose_best for an example. This function should return a single integer specifying the index of the expression to use. By default, this maximizes the score (a pound-for-pound rating) of expressions reaching the threshold of 1.5x the minimum loss. To override this at prediction time, you can pass a named tuple with keys data and idx to predict. See the Operations section for details.

  • dimensions_type::AbstractDimensions: The type of dimensions to use when storing the units of the data. By default this is DynamicQuantities.SymbolicDimensions.

Operations

  • predict(mach, Xnew): Return predictions of the target given features Xnew, which should have the same scitype as X above. The expression used for prediction is defined by the selection_method function, which can be seen by viewing report(mach).best_idx.
  • predict(mach, (data=Xnew, idx=i)): Return predictions of the target given features Xnew, which should have the same scitype as X above. By passing a named tuple with keys data and idx, you are able to specify the equation you wish to evaluate in idx.

Fitted parameters

The fields of fitted_params(mach) are:

  • best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity. T is equal to the element type of the passed data.
  • equation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.

Report

The fields of report(mach) are:

  • best_idx::Vector{Int}: The index of the best expression in each Pareto frontier, as determined by the selection_method function. Override in predict by passing a named tuple with keys data and idx.
  • equations::Vector{Vector{Node{T}}}: The expressions discovered by the search, represented in a dominating Pareto frontier (i.e., the best expressions found for each complexity). The outer vector is indexed by target variable, and the inner vector is ordered by increasing complexity.
  • equation_strings::Vector{Vector{String}}: The expressions discovered by the search, represented as strings for easy inspection.
  • complexities::Vector{Vector{Int}}: The complexity of each expression in each Pareto frontier.
  • losses::Vector{Vector{L}}: The loss of each expression in each Pareto frontier, according to the loss function specified in the model. The type L is the loss type, which is usually the same as the element type of data passed (i.e., T), but can differ if complex data types are passed.
  • scores::Vector{Vector{L}}: A metric which considers both the complexity and loss of an expression, equal to the change in the log-loss divided by the change in complexity, relative to the previous expression along the Pareto frontier. A larger score aims to indicate an expression is more likely to be the true expression generating the data, but this is very problem-dependent and generally several other factors should be considered.

Examples

using MLJ
MultitargetSRRegressor = @load MultitargetSRRegressor pkg=SymbolicRegression
X = (a=rand(100), b=rand(100), c=rand(100))
Y = (y1=(@. cos(X.c) * 2.1 - 0.9), y2=(@. X.a * X.b + X.c))
model = MultitargetSRRegressor(binary_operators=[+, -, *], unary_operators=[exp], niterations=100)
mach = machine(model, X, Y)
fit!(mach)
y_hat = predict(mach, X)
# View the equations used:
r = report(mach)
for (output_index, (eq, i)) in enumerate(zip(r.equation_strings, r.best_idx))
    println("Equation used for ", output_index, ": ", eq[i])
end

See also SRRegressor.

source

Low-Level API

SymbolicRegression.equation_search - Function
equation_search(X, y[; kws...])

Perform a distributed equation search for functions f_i which describe the mapping f_i(X[:, j]) ≈ y[i, j]. Options are configured using SymbolicRegression.Options(...), which should be passed via the options keyword argument. One can turn off parallelism with numprocs=0, which is useful for debugging and profiling.
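
A minimal, hedged sketch of basic usage (the synthetic data and the choice of operators are illustrative only):

using SymbolicRegression

X = randn(Float32, 2, 100)                # 2 features × 100 rows
y = 2 .* cos.(X[1, :]) .+ X[2, :] .^ 2    # target vector of length 100
options = Options(binary_operators=[+, *, /, -], unary_operators=[cos])
hall_of_fame = equation_search(X, y; options=options, niterations=30)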

Arguments

  • X::AbstractMatrix{T}: The input dataset to predict y from. The first dimension is features, the second dimension is rows.
  • y::Union{AbstractMatrix{T}, AbstractVector{T}}: The values to predict. The first dimension is the output feature to predict with each equation, and the second dimension is rows.
  • niterations::Int=100: The number of iterations to perform the search. More iterations will improve the results.
  • weights::Union{AbstractMatrix{T}, AbstractVector{T}, Nothing}=nothing: Optionally weight the loss for each y by this value (same shape as y).
  • options::AbstractOptions=Options(): The options for the search, such as which operators to use, evolution hyperparameters, etc.
  • variable_names::Union{Vector{String}, Nothing}=nothing: The names of each feature in X, which will be used during printing of equations.
  • display_variable_names::Union{Vector{String}, Nothing}=variable_names: Names to use when printing expressions during the search, but not when saving to an equation file.
  • y_variable_names::Union{String,AbstractVector{String},Nothing}=nothing: The names of each output feature in y, which will be used during printing of equations.
  • parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading will be used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, all threads available to julia are used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".
  • numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want equation_search to set this up automatically. By default this will be 4, but can be any number (you should pick a number <= the number of cores available).
  • procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.
  • addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing), and are not passing procs manually, then they will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument, which is the number of processes to use, as well as the lazy keyword argument. For example, if set up on a slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up slurm processes.
  • heap_size_hint_in_bytes::Union{Int,Nothing}=nothing: On Julia 1.9+, you may set the --heap-size-hint flag on Julia processes, recommending garbage collection once a process is close to the recommended size. This is important for long-running distributed jobs where each process has an independent memory, and can help avoid out-of-memory errors. By default, this is set to Sys.free_memory() / numprocs.
  • worker_imports::Union{Vector{Symbol},Nothing}=nothing: If you want to import additional modules on each worker, pass them here as a vector of symbols. By default some of the extensions will automatically be loaded when needed.
  • runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.
  • saved_state=nothing: If you have already run equation_search and want to resume it, pass the state here. To get this to work, you need to have set return_state=true, which will cause equation_search to return the state. The second element of the state is the regular return value with the hall of fame. Note that you cannot change the operators or dataset, but most other options should be changeable.
  • return_state::Union{Bool, Nothing}=nothing: Whether to return the state of the search for warm starts. By default this is false.
  • loss_type::Type=Nothing: If you would like to use a different type for the loss than for the data you passed, specify the type here. Note that if you pass complex data ::Complex{L}, then the loss type will automatically be set to L.
  • verbosity: Whether to print debugging statements or not.
  • logger::Union{AbstractSRLogger,Nothing}=nothing: An optional logger to record the progress of the search. You can use an SRLogger to wrap a custom logger, or pass nothing to disable logging.
  • progress: Whether to use a progress bar output. Only available for single target output.
  • X_units::Union{AbstractVector,Nothing}=nothing: The units of the dataset, to be used for dimensional constraints. For example, if X_units=["kg", "m"], then the first feature will have units of kilograms, and the second will have units of meters.
  • y_units=nothing: The units of the output, to be used for dimensional constraints. If y is a matrix, then this can be a vector of units, in which case each element corresponds to each output feature.

Returns

  • hallOfFame::HallOfFame: The best equations seen during the search. hallOfFame.members gives an array of PopMember objects, which have their tree (equation) stored in .tree. Their score (loss) is given in .score. The array of PopMember objects is enumerated by size from 1 to options.maxsize.
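
A hedged sketch of inspecting this return value (continuing the example above; calculate_pareto_frontier and string_tree are exported by SymbolicRegression, but the printing format is illustrative):

dominating = calculate_pareto_frontier(hall_of_fame)  # keep only the dominating Pareto-optimal members
for member in dominating
    println(string_tree(member.tree, options), "   # score = ", member.score)
end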
source

Options

SymbolicRegression.CoreModule.OptionsStructModule.Options - Type
Options(;kws...) <: AbstractOptions

Construct options for equation_search and other functions. The current arguments have been tuned using the median values from https://github.com/MilesCranmer/PySR/discussions/115.

Arguments

  • defaults: What set of defaults to use for Options. The default, nothing, will simply take the default options from the current version of SymbolicRegression. However, you may also select the defaults from an earlier version, such as v"0.24.5".

  • binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity - these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.

  • unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).

  • constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. e.g., [(^)=>(-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.

  • batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.

  • batch_size: What batch size to use if using batching.

  • elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument), and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: LPDistLoss{P}(), L1DistLoss(), L2DistLoss() (mean square), LogitDistLoss(), HuberLoss(d), L1EpsilonInsLoss(ϵ), L2EpsilonInsLoss(ϵ), PeriodicLoss(c), QuantileLoss(τ). Classification: ZeroOneLoss(), PerceptronLoss(), L1HingeLoss(), SmoothedL1HingeLoss(γ), ModifiedHuberLoss(), L2MarginLoss(), ExpLoss(), SigmoidLoss(), DWDMarginLoss(q).

  • loss_function: Alternatively, you may redefine the loss used as any function of tree::AbstractExpressionNode{T}, dataset::Dataset{T}, and options::AbstractOptions, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. If you are using batching=true, then your function should accept a fourth argument idx, which is either nothing (indicating that the full dataset should be used), or a vector of indices to use for the batch. For example,

      function my_loss(tree, dataset::Dataset{T,L}, options)::L where {T,L}
          prediction, flag = eval_tree_array(tree, dataset.X, options)
          if !flag
              return L(Inf)
          end
          return sum((prediction .- dataset.y) .^ 2) / dataset.n
      end
  • expression_type::Type{E}=Expression: The type of expression to use. For example, Expression.

  • node_type::Type{N}=default_node_type(Expression): The type of node to use for the search. For example, Node or GraphNode. The default is computed by default_node_type(expression_type).

  • populations: How many populations of equations to use.

  • population_size: How many equations in each population.

  • ncycles_per_iteration: How many generations to consider per iteration.

  • tournament_selection_n: Number of expressions considered in each tournament.

  • tournament_selection_p: The fittest expression in a tournament is to be selected with probability p, the next fittest with probability p*(1-p), and so forth.

  • topn: Number of equations to return to the host process, and to consider for the hall of fame.

  • complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].

  • complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.

  • complexity_of_variables: What complexity should be assigned to use of a variable, which can also be a vector indicating different per-variable complexity. By default, this is 1.

  • complexity_mapping: Alternatively, you can pass a function that takes the expression as input and returns the complexity. Make sure that this operates on AbstractExpression (and unpacks to AbstractExpressionNode), and returns an integer.

  • alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=Inf is equivalent to no annealing.

  • maxsize: Maximum size of equations during the search.

  • maxdepth: Maximum depth of equations during the search, by default this is set equal to the maxsize.

  • parsimony: A multiplicative factor for how much complexity is punished.

  • dimensional_constraint_penalty: An additive factor if the dimensional constraint is violated.

  • dimensionless_constants_only: Whether to only allow dimensionless constants.

  • use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.

  • use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.

  • adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.

  • turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!

  • bumper: Whether to use Bumper.jl for faster evaluation. Experimental!

  • migration: Whether to migrate equations between processes.

  • hof_migration: Whether to migrate equations from the hall of fame to processes.

  • fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.

  • fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.

  • should_simplify: Whether to simplify equations. If you pass a custom objective, this will be set to false.

  • should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.

  • optimizer_algorithm: Select algorithm to use for optimizing constants. Default is Optim.BFGS(linesearch=LineSearches.BackTracking()).

  • optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.

  • optimizer_probability: Probability of performing optimization of constants at the end of a given iteration.

  • optimizer_iterations: How many optimization iterations to perform. This gets passed to Optim.Options as iterations. The default is 8.

  • optimizer_f_calls_limit: How many function calls to allow during optimization. This gets passed to Optim.Options as f_calls_limit. The default is 10_000.

  • optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.

  • autodiff_backend: The backend to use for differentiation, which should be an instance of AbstractADType (see ADTypes.jl). Default is nothing, which means Optim.jl will estimate gradients (likely with finite differences). You can also pass a symbolic version of the backend type, such as :Zygote for Zygote, :Enzyme, etc. Most backends will not work, and many will never work due to incompatibilities, though support for some is gradually being added.

  • perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).

  • probability_negate_constant: Probability of negating a constant in the equation when mutating it.

  • mutation_weights: Relative probabilities of the mutations. Pass an instance of MutationWeights (or any AbstractMutationWeights) for this option. See the MutationWeights documentation for the available weights.

  • crossover_probability: Probability of performing crossover.

  • annealing: Whether to use simulated annealing.

  • warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.

  • verbosity: Whether to print debugging statements or not.

  • print_precision: How many digits to print when printing equations. By default, this is 5.

  • output_directory: The base directory to save output files to. Files will be saved in a subdirectory according to the run ID. By default, this is ./outputs.

  • save_to_file: Whether to save equations to a file during the search.

  • bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).

  • una_constraints: Likewise, for unary operators.

  • seed: What random seed to use. nothing uses no seed.

  • progress: Whether to use a progress bar output (verbosity will have no effect).

  • early_stop_condition: If a Float, stop the search early once the mean loss drops below this value. If a Function, it should take (loss, complexity) as arguments and return true or false to indicate whether to stop (see the sketch following this list for an example).

  • timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).

  • max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.

  • input_stream: the stream to read user input from. By default, this is stdin. If you encounter issues with reading from stdin, like a hang, you can simply pass devnull to this argument.

  • skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.

  • nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract, and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.

  • deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.

  • define_helper_functions: Whether to define helper functions for constructing and evaluating trees.

source
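
As a rough sketch of how several of the options above combine, the following constructs an SRRegressor with custom operators, a size limit, a nesting constraint, and an early-stopping function (all values are illustrative, not recommendations):

using SymbolicRegression

model = SRRegressor(;
    binary_operators=[+, -, *, /],
    unary_operators=[cos, exp],
    maxsize=25,
    parsimony=0.01,
    # Disallow cos from being nested inside another cos:
    nested_constraints=[cos => [cos => 0]],
    # Stop once any expression of complexity below 10 reaches a mean loss below 1e-6:
    early_stop_condition=(loss, complexity) -> loss < 1e-6 && complexity < 10,
)
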
SymbolicRegression.CoreModule.MutationWeightsModule.MutationWeightsType
MutationWeights(;kws...) <: AbstractMutationWeights

This defines how often different mutations occur. These weightings will be normalized to sum to 1.0 after initialization.

Arguments

  • mutate_constant::Float64: How often to mutate a constant.
  • mutate_operator::Float64: How often to mutate an operator.
  • swap_operands::Float64: How often to swap the operands of a binary operator.
  • rotate_tree::Float64: How often to perform a tree rotation at a random node.
  • add_node::Float64: How often to append a node to the tree.
  • insert_node::Float64: How often to insert a node into the tree.
  • delete_node::Float64: How often to delete a node from the tree.
  • simplify::Float64: How often to simplify the tree.
  • randomize::Float64: How often to create a random tree.
  • do_nothing::Float64: How often to do nothing.
  • optimize::Float64: How often to optimize the constants in the tree, as a mutation. Note that this is different from optimizer_probability, which is performed at the end of an iteration for all individuals.
  • form_connection::Float64: Only used for GraphNode, not regular Node. Otherwise, this will automatically be set to 0.0. How often to form a connection between two nodes.
  • break_connection::Float64: Only used for GraphNode, not regular Node. Otherwise, this will automatically be set to 0.0. How often to break a connection between two nodes.

source
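
A minimal sketch of adjusting the mutation weights and passing them to the regressor (the specific values are arbitrary; unspecified weights keep their defaults):

using SymbolicRegression

# Weights are relative and are normalized to sum to 1 after construction.
weights = MutationWeights(;
    mutate_constant=0.10,
    add_node=0.50,
    delete_node=0.30,
)

model = SRRegressor(; mutation_weights=weights)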

Printing

DynamicExpressions.StringsModule.string_treeFunction
string_tree(
    tree::AbstractExpressionNode{T},
    operators::Union{AbstractOperatorEnum,Nothing}=nothing;
    f_variable::F1=string_variable,
    f_constant::F2=string_constant,
    variable_names::Union{Array{String,1},Nothing}=nothing,
    # Deprecated
    varMap=nothing,
)::String where {T,F1<:Function,F2<:Function}

Convert an equation to a string.

Arguments

  • tree: the tree to convert to a string
  • operators: the operators used to define the tree

Keyword Arguments

  • f_variable: (optional) function to convert a variable to a string, with arguments (feature::UInt8, variable_names).
  • f_constant: (optional) function to convert a constant to a string, with arguments (val,).
  • variable_names::Union{Array{String, 1}, Nothing}=nothing: (optional) what variables to print for each feature.

string_tree(
    ex::AbstractExpression,
    operators::Union{AbstractOperatorEnum,Nothing}=nothing;
    variable_names=nothing,
    kws...
)

Convert an expression to a string representation.

This method unpacks the operators and variable names from the expression and calls string_tree for AbstractExpressionNode.

Arguments

  • ex: The expression to convert to a string.
  • operators: (Optional) Operators to use. If nothing, operators are obtained from the expression.
  • variable_names: (Optional) Variable names to use in the string representation. If nothing, variable names are obtained from the expression.
  • kws...: Additional keyword arguments.

Returns

  • A string representation of the expression.

string_tree(tree::AbstractExpressionNode, options::AbstractOptions; kws...)

Convert an equation to a string.

Arguments

  • tree::AbstractExpressionNode: The equation to convert to a string.
  • options::AbstractOptions: The options holding the definition of operators.
  • variable_names::Union{Array{String, 1}, Nothing}=nothing: what variables to print for each feature.
source
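
A short sketch of string_tree on a manually constructed tree (using DynamicExpressions directly; the printed strings in the comments are approximate):

using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, -, *], unary_operators=[cos])
x1 = Node{Float64}(; feature=1)
x2 = Node{Float64}(; feature=2)
tree = x1 * cos(x2 - 3.2)

string_tree(tree, operators)                                    # e.g. "x1 * cos(x2 - 3.2)"
string_tree(tree, operators; variable_names=["alpha", "beta"])  # e.g. "alpha * cos(beta - 3.2)"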

Evaluation

DynamicExpressions.EvaluateModule.eval_tree_arrayFunction
eval_tree_array(
    tree::AbstractExpressionNode{T},
    cX::AbstractMatrix{T},
    operators::OperatorEnum;
    eval_options::Union{EvalOptions,Nothing}=nothing,
) where {T}

Evaluate a binary tree (equation) over a given input data matrix. The operators contain all of the operators used. This function fuses doublets and triplets of operations for lower memory usage.

Arguments

  • tree::AbstractExpressionNode: The root node of the tree to evaluate.
  • cX::AbstractMatrix{T}: The input data to evaluate the tree on, with shape [num_features, num_rows].
  • operators::OperatorEnum: The operators used in the tree.
  • eval_options::Union{EvalOptions,Nothing}: See EvalOptions for documentation on the different evaluation modes.

Returns

  • (output, complete)::Tuple{AbstractVector{T}, Bool}: the result, which is a 1D array, as well as if the evaluation completed successfully (true/false). A false complete means an infinity or nan was encountered, and a large loss should be assigned to the equation.

Notes

This function can be represented by the following pseudocode:

function eval(current_node)
    if current_node is leaf
        return current_node.value
    elif current_node is degree 1
        return current_node.operator(eval(current_node.left_child))
    else
        return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))

The bulk of the code is for optimizations and pre-emptive NaN/Inf checks, which speed up evaluation significantly.

eval_tree_array(tree::AbstractExpressionNode, cX::AbstractMatrix, operators::GenericOperatorEnum; throw_errors::Bool=true)

Evaluate a generic binary tree (equation) over a given input data, whatever that input data may be. The operators enum contains all of the operators used. Unlike eval_tree_array with the normal OperatorEnum, the array cX is sliced only along the first dimension. i.e., if cX is a vector, then the output of a feature node will be a scalar. If cX is a 3D tensor, then the output of a feature node will be a 2D tensor. Note also that tree.feature will index along the first axis of cX.

However, there is no requirement about input and output types in general. You may set up your tree such that some operator nodes work on tensors, while other operator nodes work on scalars. eval_tree_array will simply return nothing if a given operator is not defined for the given input type.

This function can be represented by the following pseudocode:

function eval(current_node)
    if current_node is leaf
        return current_node.value
    elif current_node is degree 1
        return current_node.operator(eval(current_node.left_child))
    else
        return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))

Arguments

  • tree::AbstractExpressionNode: The root node of the tree to evaluate.
  • cX::AbstractArray: The input data to evaluate the tree on.
  • operators::GenericOperatorEnum: The operators used in the tree.
  • throw_errors::Bool=true: Whether to throw errors if they occur during evaluation. Otherwise, MethodErrors will be caught before they happen and evaluation will return nothing, rather than throwing an error. This is useful in cases where you are unsure if a particular tree is valid or not, and would prefer to work with nothing as an output.

Returns

  • (output, complete)::Tuple{Any, Bool}: the result, as well as if the evaluation completed successfully (true/false). If evaluation failed, nothing will be returned for the first argument. A false complete means an operator was called on input types that it was not defined for.

eval_tree_array(
    ex::AbstractExpression,
    cX::AbstractMatrix,
    operators::Union{AbstractOperatorEnum,Nothing}=nothing;
    kws...
)

Evaluate an expression over a given input data matrix.

This method unpacks the operators from the expression and calls eval_tree_array for AbstractExpressionNode.

Arguments

  • ex: The expression to evaluate.
  • cX: The input data matrix.
  • operators: (Optional) Operators to use. If nothing, operators are obtained from the expression.
  • kws...: Additional keyword arguments.

Returns

  • A tuple (output, complete) indicating the result and success of the evaluation.
eval_tree_array(tree::Union{AbstractExpression,AbstractExpressionNode}, X::AbstractArray, options::AbstractOptions; kws...)

Evaluate a binary tree (equation) over a given input data matrix. The operators contain all of the operators used. This function fuses doublets and triplets of operations for lower memory usage.

This function can be represented by the following pseudocode:

function eval(current_node)
    if current_node is leaf
        return current_node.value
    elif current_node is degree 1
        return current_node.operator(eval(current_node.left_child))
    else
        return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))

The bulk of the code is for optimizations and pre-emptive NaN/Inf checks, which speed up evaluation significantly.

Arguments

  • tree::Union{AbstractExpression,AbstractExpressionNode}: The root node of the tree to evaluate.
  • X::AbstractArray: The input data to evaluate the tree on.
  • options::AbstractOptions: Options used to define the operators used in the tree.

Returns

  • (output, complete)::Tuple{AbstractVector, Bool}: the result, which is a 1D array, as well as if the evaluation completed successfully (true/false). A false complete means an infinity or nan was encountered, and a large loss should be assigned to the equation.
source
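
A minimal evaluation sketch, assuming a tree built with DynamicExpressions as in the printing example above:

using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, *], unary_operators=[cos])
x1 = Node{Float64}(; feature=1)
x2 = Node{Float64}(; feature=2)
tree = x1 * cos(x2 + 1.5)

X = randn(Float64, 2, 100)   # shape [num_features, num_rows]
output, completed = eval_tree_array(tree, X, operators)
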
DynamicExpressions.EvaluateModule.EvalOptionsType
EvalOptions

This holds options for expression evaluation, such as evaluation backend.

Fields

  • turbo::Val{T}=Val(false): If Val{true}, use LoopVectorization.jl for faster evaluation.
  • bumper::Val{B}=Val(false): If Val{true}, use Bumper.jl for faster evaluation.
  • early_exit::Val{E}=Val(true): If Val{true}, any element of any step becoming NaN or Inf will terminate the computation. For eval_tree_array, this will result in the second return value, the completion flag, being false. For calling an expression using tree(X), this will result in NaNs filling the entire buffer. This early exit is performed to avoid wasting compute cycles. Setting Val{false} will continue the computation as usual and thus result in NaNs only in the elements that actually have NaNs.
  • buffer::Union{ArrayBuffer,Nothing}: If not nothing, use this buffer for evaluation. This should be an instance of ArrayBuffer which has an array field and an index field used to iterate which buffer slot to use.
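
A minimal sketch of passing an EvalOptions instance to eval_tree_array, assuming the keyword constructor mirrors the fields listed above:

using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, *], unary_operators=[cos])
x1 = Node{Float64}(; feature=1)
tree = x1 * cos(x1 + 1.5)
X = randn(Float64, 2, 100)

# Disable the early NaN/Inf exit, so NaNs appear only in the affected elements:
eval_options = EvalOptions(; early_exit=Val(false))
output, completed = eval_tree_array(tree, X, operators; eval_options)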

Derivatives

SymbolicRegression.jl can automatically and efficiently compute derivatives of expressions with respect to variables or constants. This is done using either eval_diff_tree_array, to compute the derivative with respect to a single variable, or eval_grad_tree_array, to compute the gradient with respect to all variables (or all constants). Both use forward-mode automatic differentiation, but use Zygote.jl to compute derivatives of each operator, so this is very efficient. A short usage sketch follows the docstrings below.

DynamicExpressions.EvaluateDerivativeModule.eval_diff_tree_arrayFunction
eval_diff_tree_array(
    tree::AbstractExpressionNode{T},
    cX::AbstractMatrix{T},
    operators::OperatorEnum,
    direction::Integer;
    turbo::Union{Bool,Val}=Val(false)
) where {T<:Number}

Compute the forward derivative of an expression, using a similar structure and optimization to eval_tree_array. direction is the index of a particular variable in the expression. e.g., direction=1 would indicate derivative with respect to x1.

Arguments

  • tree::AbstractExpressionNode: The expression tree to evaluate.
  • cX::AbstractMatrix{T}: The data matrix, with shape [num_features, num_rows].
  • operators::OperatorEnum: The operators used to create the tree.
  • direction::Integer: The index of the variable to take the derivative with respect to.
  • turbo::Union{Bool,Val}: Use LoopVectorization.jl for faster evaluation. Currently this does not have any effect.

Returns

  • (evaluation, derivative, complete)::Tuple{AbstractVector{T}, AbstractVector{T}, Bool}: the normal evaluation, the derivative, and whether the evaluation completed as normal (or encountered a nan or inf).

eval_diff_tree_array(tree::Union{AbstractExpression,AbstractExpressionNode}, X::AbstractArray, options::AbstractOptions, direction::Int)

Compute the forward derivative of an expression, using a similar structure and optimization to eval_tree_array. direction is the index of a particular variable in the expression. e.g., direction=1 would indicate derivative with respect to x1.

Arguments

  • tree::Union{AbstractExpression,AbstractExpressionNode}: The expression tree to evaluate.
  • X::AbstractArray: The data matrix, with each column being a data point.
  • options::AbstractOptions: The options containing the operators used to create the tree.
  • direction::Int: The index of the variable to take the derivative with respect to.

Returns

  • (evaluation, derivative, complete)::Tuple{AbstractVector, AbstractVector, Bool}: the normal evaluation, the derivative, and whether the evaluation completed as normal (or encountered a nan or inf).
source
DynamicExpressions.EvaluateDerivativeModule.eval_grad_tree_arrayFunction
eval_grad_tree_array(tree::AbstractExpressionNode{T}, cX::AbstractMatrix{T}, operators::OperatorEnum; variable::Union{Bool,Val}=Val(false), turbo::Union{Bool,Val}=Val(false))

Compute the forward-mode derivative of an expression, using a similar structure and optimization to eval_tree_array. variable specifies whether we should take derivatives with respect to features (i.e., cX), or with respect to every constant in the expression.

Arguments

  • tree::AbstractExpressionNode{T}: The expression tree to evaluate.
  • cX::AbstractMatrix{T}: The data matrix, with each column being a data point.
  • operators::OperatorEnum: The operators used to create the tree.
  • variable::Union{Bool,Val}: Whether to take derivatives with respect to features (i.e., cX - with variable=true), or with respect to every constant in the expression (variable=false).
  • turbo::Union{Bool,Val}: Use LoopVectorization.jl for faster evaluation. Currently this does not have any effect.

Returns

  • (evaluation, gradient, complete)::Tuple{AbstractVector{T}, AbstractMatrix{T}, Bool}: the normal evaluation, the gradient, and whether the evaluation completed as normal (or encountered a nan or inf).

eval_grad_tree_array(
    ex::AbstractExpression,
    cX::AbstractMatrix,
    operators::Union{AbstractOperatorEnum,Nothing}=nothing;
    kws...
)

Compute the forward-mode derivative of an expression.

This method unpacks the operators from the expression and calls eval_grad_tree_array for AbstractExpressionNode.

Arguments

  • ex: The expression to evaluate.
  • cX: The data matrix.
  • operators: (Optional) Operators to use. If nothing, operators are obtained from the expression.
  • kws...: Additional keyword arguments.

Returns

  • A tuple (output, gradient, complete) indicating the result, gradient, and success of the evaluation.

eval_grad_tree_array(tree::Union{AbstractExpression,AbstractExpressionNode}, X::AbstractArray, options::AbstractOptions; variable::Bool=false)

Compute the forward-mode derivative of an expression, using a similar structure and optimization to eval_tree_array. variable specifies whether we should take derivatives with respect to features (i.e., X), or with respect to every constant in the expression.

Arguments

  • tree::Union{AbstractExpression,AbstractExpressionNode}: The expression tree to evaluate.
  • X::AbstractArray: The data matrix, with each column being a data point.
  • options::AbstractOptions: The options containing the operators used to create the tree.
  • variable::Bool: Whether to take derivatives with respect to features (i.e., X - with variable=true), or with respect to every constant in the expression (variable=false).

Returns

  • (evaluation, gradient, complete)::Tuple{AbstractVector, AbstractArray, Bool}: the normal evaluation, the gradient, and whether the evaluation completed as normal (or encountered a nan or inf).
source
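
A brief sketch of both derivative functions on a hand-built tree (illustrative only):

using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, *], unary_operators=[cos])
x1 = Node{Float64}(; feature=1)
x2 = Node{Float64}(; feature=2)
tree = x1 * cos(x2 + 0.5)
X = randn(Float64, 2, 50)

# Derivative with respect to the first feature only:
evaluation, derivative, completed = eval_diff_tree_array(tree, X, operators, 1)

# Gradient with respect to all features; use variable=Val(false) for constants instead:
evaluation, gradient, completed = eval_grad_tree_array(tree, X, operators; variable=Val(true))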

SymbolicUtils.jl interface

DynamicExpressions.ExtensionInterfaceModule.node_to_symbolicFunction
node_to_symbolic(tree::AbstractExpressionNode, operators::AbstractOperatorEnum;
            variable_names::Union{Array{String, 1}, Nothing}=nothing,
            index_functions::Bool=false)

The interface to SymbolicUtils.jl. Passing a tree to this function will generate a symbolic equation in SymbolicUtils.jl format.

Arguments

  • tree::AbstractExpressionNode: The equation to convert.
  • operators::AbstractOperatorEnum: OperatorEnum, which contains the operators used in the equation.
  • variable_names::Union{Array{String, 1}, Nothing}=nothing: What variable names to use for each feature. Default is [x1, x2, x3, ...].
  • index_functions::Bool=false: Whether to generate special names for the operators, which then allows one to convert back to an AbstractExpressionNode format using symbolic_to_node. (CURRENTLY UNAVAILABLE - See https://github.com/MilesCranmer/SymbolicRegression.jl/pull/84).

node_to_symbolic(tree::AbstractExpressionNode, options::Options; kws...)

Convert an expression to SymbolicUtils.jl form.

source

Note that use of this function requires SymbolicUtils.jl to be installed and loaded.
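
A small sketch of converting a tree to SymbolicUtils form via the Options-based method (illustrative only):

using SymbolicRegression
using SymbolicUtils  # must be loaded for this functionality to be available

options = Options(; binary_operators=[+, *], unary_operators=[cos])
x1 = Node{Float64}(; feature=1)
tree = x1 * cos(x1 + 0.5)

symbolic_form = node_to_symbolic(tree, options)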

Pareto frontier

Logging

SymbolicRegression.LoggingModule.SRLoggerType
SRLogger(logger::AbstractLogger; log_interval::Int=100)

A logger for symbolic regression that wraps another logger.

Arguments

  • logger: The base logger to wrap
  • log_interval: Number of steps between logging events. Default is 100 (log every 100 steps).
source

The SRLogger allows you to track the progress of symbolic regression searches. It can wrap any AbstractLogger that implements the Julia logging interface, such as from TensorBoardLogger.jl or Wandb.jl.

using SymbolicRegression, TensorBoardLogger

logger = SRLogger(TBLogger("logs/run"); log_interval=2)

model = SRRegressor(;
    logger=logger,
    # ... any other hyper-parameters
)
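
With the TensorBoardLogger example above, the recorded search statistics can then be inspected by pointing TensorBoard at the log directory (e.g., tensorboard --logdir logs).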