API
EquationSearch
SymbolicRegression.EquationSearch — Method

EquationSearch(X, y[; kws...])
Perform a distributed equation search for functions f_i which describe the mapping f_i(X[:, j]) ≈ y[i, j]. Options are configured using SymbolicRegression.Options(...), which should be passed via the options keyword argument. One can turn off parallelism with numprocs=0, which is useful for debugging and profiling.
Arguments
X::AbstractMatrix{T}: The input dataset to predict y from. The first dimension is features, the second dimension is rows.
y::Union{AbstractMatrix{T}, AbstractVector{T}}: The values to predict. The first dimension is the output feature to predict with each equation, and the second dimension is rows.
niterations::Int=10: The number of iterations to perform the search. More iterations will improve the results.
weights::Union{AbstractMatrix{T}, AbstractVector{T}, Nothing}=nothing: Optionally weight the loss for each y by this value (same shape as y).
varMap::Union{Vector{String}, Nothing}=nothing: The names of each feature in X, which will be used during printing of equations.
options::Options=Options(): The options for the search, such as which operators to use, evolution hyperparameters, etc.
parallelism=:multithreading: What parallelism mode to use. The options are :multithreading, :multiprocessing, and :serial. By default, multithreading is used. Multithreading uses less memory, but multiprocessing can handle multi-node compute. If using :multithreading mode, the number of threads available to Julia is used. If using :multiprocessing, numprocs processes will be created dynamically if procs is unset. If you have already allocated processes, pass them to the procs argument and they will be used. You may also pass a string instead of a symbol, like "multithreading".
numprocs::Union{Int, Nothing}=nothing: The number of processes to use, if you want EquationSearch to set this up automatically. By default this will be 4, but it can be any number (you should pick a number <= the number of cores available).
procs::Union{Vector{Int}, Nothing}=nothing: If you have set up a distributed run manually with procs = addprocs() and @everywhere, pass the procs to this keyword argument.
addprocs_function::Union{Function, Nothing}=nothing: If using multiprocessing (parallelism=:multiprocessing) and not passing procs manually, processes will be allocated dynamically using addprocs. However, you may also pass a custom function to use instead of addprocs. This function should take a single positional argument (the number of processes to use), as well as the lazy keyword argument. For example, if set up on a Slurm cluster, you could pass addprocs_function = addprocs_slurm, which will set up Slurm processes.
runtests::Bool=true: Whether to run (quick) tests before starting the search, to see if there will be any problems during the equation search related to the host environment.
saved_state::Union{StateType, Nothing}=nothing: If you have already run EquationSearch and want to resume it, pass the state here. To get this to work, you need to have return_state=true in the options, which will cause EquationSearch to return the state. Note that you cannot change the operators or dataset, but most other options should be changeable.
Returns
hallOfFame::HallOfFame: The best equations seen during the search. hallOfFame.members gives an array of PopMember objects, which have their tree (equation) stored in .tree. Their score (loss) is given in .score. The array of PopMember objects is enumerated by size from 1 to options.maxsize.
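For reference, a minimal sketch of a typical call. The data, operator choices, and hyperparameter values here are illustrative only, not defaults:

using SymbolicRegression

X = randn(Float32, 5, 100)                   # 5 features, 100 rows
y = 2 .* cos.(X[4, :]) .+ X[1, :] .^ 2 .- 2  # a single output vector

options = SymbolicRegression.Options(;
    binary_operators=(+, *, /, -),
    unary_operators=(cos, exp),
    npopulations=20,
)

hall_of_fame = EquationSearch(
    X, y; niterations=40, options=options, parallelism=:multithreading
)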
Options
SymbolicRegression.CoreModule.OptionsStructModule.Options — Method

Options(; kws...)
Construct options for EquationSearch
and other functions. The current arguments have been tuned using the median values from https://github.com/MilesCranmer/PySR/discussions/115.
Arguments
binary_operators: Vector of binary operators (functions) to use. Each operator should be defined for two input scalars, and one output scalar. All operators need to be defined over the entire real line (excluding infinity; these are stopped before they are input), or return NaN where not defined. For speed, define it so it takes two reals of the same type as input, and outputs the same type. For the SymbolicUtils simplification backend, you will need to define a generic method of the operator so it takes arbitrary types.
unary_operators: Same, but for unary operators (one input scalar, gives an output scalar).
constraints: Array of pairs specifying size constraints for each operator. The constraints for a binary operator should be a 2-tuple (e.g., (-1, -1)) and the constraints for a unary operator should be an Int. A size constraint is a limit to the size of the subtree in each argument of an operator. E.g., [(^) => (-1, 3)] means that the ^ operator can have arbitrary size (-1) in its left argument, but a maximum size of 3 in its right argument. Default is no constraints.
batching: Whether to evolve based on small mini-batches of data, rather than the entire dataset.
batch_size: What batch size to use if using batching.
elementwise_loss: What elementwise loss function to use. Can be one of the following losses, or any other loss of type SupervisedLoss. You can also pass a function that takes a scalar target (left argument) and scalar predicted (right argument), and returns a scalar. This will be averaged over the predicted data. If weights are supplied, your function should take a third argument for the weight scalar. Included losses: Regression: LPDistLoss{P}(), L1DistLoss(), L2DistLoss() (mean square), LogitDistLoss(), HuberLoss(d), L1EpsilonInsLoss(ϵ), L2EpsilonInsLoss(ϵ), PeriodicLoss(c), QuantileLoss(τ). Classification: ZeroOneLoss(), PerceptronLoss(), L1HingeLoss(), SmoothedL1HingeLoss(γ), ModifiedHuberLoss(), L2MarginLoss(), ExpLoss(), SigmoidLoss(), DWDMarginLoss(q).
loss_function: Alternatively, you may redefine the loss used as any function of tree::Node{T}, dataset::Dataset{T}, and options::Options, so long as you output a non-negative scalar of type T. This is useful if you want to use a loss that takes into account derivatives, or correlations across the dataset. This also means you could use a custom evaluation for a particular expression. Take a look at _eval_loss in the file src/LossFunctions.jl for an example.
npopulations: How many populations of equations to use. By default this is set equal to the number of cores.
npop: How many equations in each population.
ncycles_per_iteration: How many generations to consider per iteration.
tournament_selection_n: Number of expressions considered in each tournament.
tournament_selection_p: The fittest expression in a tournament is selected with probability p, the next fittest with probability p*(1-p), and so forth.
topn: Number of equations to return to the host process, and to consider for the hall of fame.
complexity_of_operators: What complexity should be assigned to each operator, and the occurrence of a constant or variable. By default, this is 1 for all operators. Can be a real number as well, in which case the complexity of an expression will be rounded to the nearest integer. Input this in the form of, e.g., [(^) => 3, sin => 2].
complexity_of_constants: What complexity should be assigned to use of a constant. By default, this is 1.
complexity_of_variables: What complexity should be assigned to each variable. By default, this is 1.
alpha: The probability of accepting an equation mutation during regularized evolution is given by exp(-delta_loss/(alpha * T)), where T goes from 1 to 0. Thus, alpha=infinite is the same as no annealing.
maxsize: Maximum size of equations during the search.
maxdepth: Maximum depth of equations during the search. By default this is set equal to the maxsize.
parsimony: A multiplicative factor for how much complexity is punished.
use_frequency: Whether to use a parsimony that adapts to the relative proportion of equations at each complexity; this will ensure that there are a balanced number of equations considered for every complexity.
use_frequency_in_tournament: Whether to use the adaptive parsimony described above inside the score, rather than just at the mutation accept/reject stage.
adaptive_parsimony_scaling: How much to scale the adaptive parsimony term in the loss. Increase this if the search is spending too much time optimizing the most complex equations.
fast_cycle: Whether to thread over subsamples of equations during regularized evolution. Slightly improves performance, but is a different algorithm.
turbo: Whether to use LoopVectorization.@turbo to evaluate expressions. This can be significantly faster, but is only compatible with certain operators. Experimental!
migration: Whether to migrate equations between processes.
hof_migration: Whether to migrate equations from the hall of fame to processes.
fraction_replaced: What fraction of each population to replace with migrated equations at the end of each cycle.
fraction_replaced_hof: What fraction to replace with hall of fame equations at the end of each cycle.
should_optimize_constants: Whether to use an optimization algorithm to periodically optimize constants in equations.
optimizer_nrestarts: How many different random starting positions to consider for optimization of constants.
optimizer_algorithm: Select the algorithm to use for optimizing constants. Default is "BFGS", but "NelderMead" is also supported.
optimizer_options: General options for the constant optimization. For details we refer to the documentation on Optim.Options from the Optim.jl package. Options can be provided here as a NamedTuple, e.g. (iterations=16,), as a Dict, e.g. Dict(:x_tol => 1.0e-32,), or as an Optim.Options instance.
output_file: What file to store equations to, as a backup.
perturbation_factor: When mutating a constant, either multiply or divide by (1+perturbation_factor)^(rand()+1).
probability_negate_constant: Probability of negating a constant in the equation when mutating it.
mutation_weights: Relative probabilities of the mutations. A MutationWeights struct should be passed to this option. See its documentation on MutationWeights for the different weights.
crossover_probability: Probability of performing crossover.
annealing: Whether to use simulated annealing.
warmup_maxsize_by: Whether to slowly increase the max size from 5 up to maxsize. If nonzero, specifies the fraction through the search at which the maxsize should be reached.
verbosity: Whether to print debugging statements or not.
save_to_file: Whether to save equations to a file during the search.
bin_constraints: See constraints. This is the same, but specified for binary operators only (for example, if you have an operator that is both a binary and unary operator).
una_constraints: Likewise, for unary operators.
seed: What random seed to use. nothing uses no seed.
progress: Whether to use a progress bar output (verbosity will have no effect).
early_stop_condition: Float - whether to stop early if the mean loss gets below this value. Function - a function taking (loss, complexity) as arguments and returning true or false.
timeout_in_seconds: Float64 - the time in seconds after which to exit (as an alternative to the number of iterations).
max_evals: Int (or Nothing) - the maximum number of evaluations of expressions to perform.
skip_mutation_failures: Whether to simply skip over mutations that fail or are rejected, rather than to replace the mutated expression with the original expression and proceed normally.
enable_autodiff: Whether to enable automatic differentiation functionality. This is turned off by default. If turned on, this will be turned off if one of the operators does not have well-defined gradients.
nested_constraints: Specifies how many times a combination of operators can be nested. For example, [sin => [cos => 0], cos => [cos => 2]] specifies that cos may never appear within a sin, but sin can be nested with itself an unlimited number of times. The second term specifies that cos can be nested up to 2 times within a cos, so that cos(cos(cos(x))) is allowed (as well as any combination of + or - within it), but cos(cos(cos(cos(x)))) is not allowed. When an operator is not specified, it is assumed that it can be nested an unlimited number of times. This requires that there is no operator which is used both in the unary operators and the binary operators (e.g., - could be both subtract and negation). For binary operators, both arguments are treated the same way, and the max of each argument is constrained.
deterministic: Use a global counter for the birth time, rather than calls to time(). This gives perfect resolution, and is therefore deterministic. However, it is not thread safe, and must be used in serial mode.
define_helper_functions: Whether to define helper functions for constructing and evaluating trees.
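As a hedged sketch, several of the keywords above can be combined like so. The particular values are illustrative only, and the loss constructor is assumed to be in scope as listed in the elementwise_loss entry (it comes from LossFunctions.jl):

using SymbolicRegression

options = SymbolicRegression.Options(;
    binary_operators=(+, -, *, ^),
    unary_operators=(cos,),
    constraints=[(^) => (-1, 3)],            # ^ may have any size on the left, size <= 3 on the right
    nested_constraints=[cos => [cos => 2]],  # cos may be nested inside cos at most twice
    elementwise_loss=L2DistLoss(),           # one of the included losses listed above
    maxsize=25,
    npopulations=20,
)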
SymbolicRegression.CoreModule.OptionsStructModule.MutationWeights — Method

MutationWeights(; kws...)
This defines how often different mutations occur. These weightings will be normalized to sum to 1.0 after initialization.
Arguments
mutate_constant::Float64: How often to mutate a constant.
mutate_operator::Float64: How often to mutate an operator.
add_node::Float64: How often to append a node to the tree.
insert_node::Float64: How often to insert a node into the tree.
delete_node::Float64: How often to delete a node from the tree.
simplify::Float64: How often to simplify the tree.
randomize::Float64: How often to create a random tree.
do_nothing::Float64: How often to do nothing.
optimize::Float64: How often to optimize the constants in the tree, as a mutation. Note that this is different from optimizer_probability, which is performed at the end of an iteration for all individuals.
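A brief sketch of customizing these weights and passing them to Options. The values are illustrative only:

using SymbolicRegression

weights = MutationWeights(;
    mutate_constant=0.05,
    mutate_operator=0.5,
    add_node=0.5,
    insert_node=1.0,
    delete_node=0.3,
    simplify=0.01,
    randomize=0.05,
    do_nothing=0.2,
    optimize=0.0,
)
options = SymbolicRegression.Options(; mutation_weights=weights)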
Printing
DynamicExpressions.EquationModule.string_tree — Method

string_tree(tree::Node, options::Options; kws...)
Convert an equation to a string.
Arguments
tree::Node: The equation to convert to a string.
options::Options: The options holding the definition of operators.
varMap::Union{Array{String, 1}, Nothing}=nothing: What variables to print for each feature.
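A minimal sketch, assuming tree is an expression tree (for example, taken from a hall-of-fame member's .tree field) and options was used to create it:

eq_string = string_tree(tree, options; varMap=["x1", "x2", "x3", "x4", "x5"])
println(eq_string)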
Evaluation
DynamicExpressions.EvaluateEquationModule.eval_tree_array — Method

eval_tree_array(tree::Node, X::AbstractArray, options::Options; kws...)
Evaluate a binary tree (equation) over a given input data matrix. The options contain all of the operators used. This function fuses doublets and triplets of operations for lower memory usage.
This function can be represented by the following pseudocode:
function eval(current_node)
    if current_node is leaf
        return current_node.value
    elif current_node is degree 1
        return current_node.operator(eval(current_node.left_child))
    else
        return current_node.operator(eval(current_node.left_child), eval(current_node.right_child))
    end
end
The bulk of the code is for optimizations and pre-emptive NaN/Inf checks, which speed up evaluation significantly.
Arguments
tree::Node: The root node of the tree to evaluate.
X::AbstractArray: The input data to evaluate the tree on.
options::Options: Options used to define the operators used in the tree.
Returns
(output, complete)::Tuple{AbstractVector, Bool}: the result, which is a 1D array, as well as whether the evaluation completed successfully (true/false). A false value of complete means an infinity or NaN was encountered, and a large loss should be assigned to the equation.
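A minimal sketch, assuming tree is an expression tree and X and options are as in the earlier EquationSearch example:

output, complete = eval_tree_array(tree, X, options)
if !complete
    # A NaN or Inf was hit during evaluation; assign a large loss to this equation.
end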
Derivatives
SymbolicRegression.jl can automatically and efficiently compute derivatives of expressions with respect to variables or constants. This is done using either eval_diff_tree_array, to compute the derivative with respect to a single variable, or eval_grad_tree_array, to compute the gradient with respect to all variables (or all constants). Both use forward-mode automatic differentiation, but use Zygote.jl to compute derivatives of each operator, so this is very efficient.
DynamicExpressions.EvaluateEquationDerivativeModule.eval_diff_tree_array — Method

eval_diff_tree_array(tree::Node, X::AbstractArray, options::Options, direction::Int)
Compute the forward derivative of an expression, using a similar structure and optimization to eval_tree_array. direction is the index of a particular variable in the expression. E.g., direction=1 would indicate the derivative with respect to x1.
Arguments
tree::Node: The expression tree to evaluate.
X::AbstractArray: The data matrix, with each column being a data point.
options::Options: The options containing the operators used to create the tree. enable_autodiff must be set to true when creating the options. This is needed to create the derivative operations.
direction::Int: The index of the variable to take the derivative with respect to.
Returns
(evaluation, derivative, complete)::Tuple{AbstractVector, AbstractVector, Bool}: the normal evaluation, the derivative, and whether the evaluation completed as normal (or encountered a NaN or Inf).
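A minimal sketch, assuming tree was built with options created with enable_autodiff=true:

evaluation, derivative, complete = eval_diff_tree_array(tree, X, options, 1)  # derivative with respect to x1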
DynamicExpressions.EvaluateEquationDerivativeModule.eval_grad_tree_array — Method

eval_grad_tree_array(tree::Node, X::AbstractArray, options::Options; variable::Bool=false)
Compute the forward-mode derivative of an expression, using a similar structure and optimization to eval_tree_array. variable specifies whether we should take derivatives with respect to features (i.e., X), or with respect to every constant in the expression.
Arguments
tree::Node: The expression tree to evaluate.
X::AbstractArray: The data matrix, with each column being a data point.
options::Options: The options containing the operators used to create the tree. enable_autodiff must be set to true when creating the options. This is needed to create the derivative operations.
variable::Bool: Whether to take derivatives with respect to features (i.e., X, with variable=true), or with respect to every constant in the expression (variable=false).
Returns
(evaluation, gradient, complete)::Tuple{AbstractVector, AbstractArray, Bool}: the normal evaluation, the gradient, and whether the evaluation completed as normal (or encountered a NaN or Inf).
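A minimal sketch under the same assumptions as above (options created with enable_autodiff=true):

evaluation, gradient, complete = eval_grad_tree_array(tree, X, options; variable=true)
# With variable=false, the derivatives are taken with respect to the constants instead.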
SymbolicUtils.jl interface
DynamicExpressions.InterfaceSymbolicUtilsModule.node_to_symbolic — Method

node_to_symbolic(tree::Node, options::Options; kws...)
Convert an expression to SymbolicUtils.jl form.
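A minimal sketch, assuming tree and options as in the earlier examples:

symbolic_form = node_to_symbolic(tree, options)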
Pareto frontier
SymbolicRegression.HallOfFameModule.calculate_pareto_frontier — Method

calculate_pareto_frontier(X::AbstractMatrix{T}, y::AbstractVector{T},
                          hallOfFame::HallOfFame{T}, options::Options;
                          weights=nothing, varMap=nothing) where {T<:Real}
Compute the dominating Pareto frontier for a given hallOfFame. This is the list of equations where each equation has a better loss than all simpler equations.
SymbolicRegression.HallOfFameModule.calculate_pareto_frontier — Method

calculate_pareto_frontier(dataset::Dataset{T}, hallOfFame::HallOfFame{T},
                          options::Options) where {T<:Real}
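A minimal sketch of computing and printing the dominating frontier from a finished search, reusing X, y, hall_of_fame, and options from the EquationSearch example above:

dominating = calculate_pareto_frontier(X, y, hall_of_fame, options)
for member in dominating
    println(member.score, "\t", string_tree(member.tree, options))
end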