Types

Equations

Equations are specified as binary trees with the Node type, defined as follows.

DynamicExpressions.NodeModule.Node - Type
Node{T} <: AbstractExpressionNode{T}

Node defines a symbolic expression stored in a binary tree. A single Node instance is one "node" of this tree, and has references to its children. By tracing through the children nodes, you can evaluate or print a given expression.

Fields

  • degree::UInt8: Degree of the node. 0 for constants, 1 for unary operators, 2 for binary operators.
  • constant::Bool: Whether the node is a constant.
  • val::T: Value of the node. If degree==0, and constant==true, this is the value of the constant. It has a type specified by the overall type of the Node (e.g., Float64).
  • feature::UInt16: Index of the feature to use in the case of a feature node. Only defined (and only used) if degree == 0 && constant == false.
  • op::UInt8: If degree==1, this is the index of the operator in operators.unaops. If degree==2, this is the index of the operator in operators.binops. In other words, this is an enum of the operators, and is dependent on the specific OperatorEnum object. Only defined if degree >= 1.
  • l::Node{T}: Left child of the node. Only defined if degree >= 1. Same type as the parent node.
  • r::Node{T}: Right child of the node. Only defined if degree == 2. Same type as the parent node. This is to be passed as the right argument to the binary operator.

Constructors

Node([T]; val=nothing, feature=nothing, op=nothing, l=nothing, r=nothing, children=nothing, allocator=default_allocator)
Node{T}(; val=nothing, feature=nothing, op=nothing, l=nothing, r=nothing, children=nothing, allocator=default_allocator)

Create a new node in an expression tree. If T is not specified in either the type or the first argument, it will be inferred from the value of val passed or l and/or r. If it cannot be inferred from these, it will default to Float32.

The children keyword can be used instead of l and r and should be a tuple of children. This is to permit the use of splatting in constructors.

You may also construct nodes via the convenience operators generated by creating an OperatorEnum.

You may also specify a memory allocator for the node other than the default Node{T}() via the allocator keyword argument.
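As a minimal sketch of the constructors above, assuming DynamicExpressions.jl is installed and that the l/r keyword form shown in the signature is available in your version, the following builds cos(x1 * 3.0) by hand and evaluates it. The operator lists and the names x1, c, scaled, and tree are illustrative choices, not part of the API.

```julia
using DynamicExpressions

# `op` values index into these lists: binops[2] is `*`, unaops[1] is `cos`.
operators = OperatorEnum(; binary_operators=[+, *], unary_operators=[cos])

x1 = Node{Float64}(; feature=1)            # leaf: degree 0, constant == false
c = Node{Float64}(; val=3.0)               # leaf: degree 0, constant == true
scaled = Node{Float64}(; op=2, l=x1, r=c)  # binary node: x1 * 3.0
tree = Node{Float64}(; op=1, l=scaled)     # unary node: cos(x1 * 3.0)

# Evaluate on a matrix of shape (nfeatures, n) = (1, 3).
X = [0.0 1.0 2.0]
output, completed = eval_tree_array(tree, X, operators)
```

The same op index means different things for different OperatorEnum objects, so a tree should always be evaluated with the operators it was built against.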

When you create an Options object, the operators passed are also re-defined for Node types. This allows you to use, e.g., t=Node(; feature=1) * 3f0 to create a tree, so long as * was specified as a binary operator. This works automatically for operators defined in Base; for user-defined operators, use @extend_operators:

SymbolicRegression.InterfaceDynamicExpressionsModule.@extend_operators - Macro
@extend_operators options

Extends all operators defined in this options object to work on the AbstractExpressionNode type. By default, this is already done for operators defined in Base when you create an Options object with define_helper_functions=true, but it does not apply to user-defined operators. To extend those, apply this macro in the same module where the operators are defined.

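The snippet below is a sketch of both routes, assuming SymbolicRegression.jl is installed and exports Options, Node, and @extend_operators. square_plus is a hypothetical user-defined operator introduced only for illustration.

```julia
using SymbolicRegression

# Hypothetical user-defined binary operator, not part of any library.
square_plus(x, y) = x^2 + y

options = Options(; binary_operators=[+, *, square_plus], unary_operators=[cos])

# `*` comes from Base, so creating `options` already made it work on nodes.
t = Node(; feature=1) * 3f0

# User-defined operators need the macro, called in the module that defines them.
@extend_operators options
t2 = square_plus(Node(; feature=1), 2f0)
```

Both t and t2 are degree-2 nodes whose right child is a constant leaf. The op field of each is the position of the operator in the binary_operators list passed to Options.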

When using these node constructors, types will automatically be promoted. You can convert the type of a node using convert:

Base.convert - Method
convert(::Type{<:AbstractExpressionNode{T1}}, n::AbstractExpressionNode{T2}) where {T1,T2}

Convert an AbstractExpressionNode{T2} to an AbstractExpressionNode{T1}. This will recursively convert all children nodes to AbstractExpressionNode{T1}, using convert(T1, tree.val) at constant nodes.

Arguments

  • ::Type{AbstractExpressionNode{T1}}: Type to convert to.
  • tree::AbstractExpressionNode{T2}: AbstractExpressionNode to convert.
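A minimal sketch of a type conversion, assuming DynamicExpressions.jl is installed. The tree is an arbitrary example; op=1 is only an index here, so no OperatorEnum is needed to build or convert it.

```julia
using DynamicExpressions

# A Float64 tree: (operator 1 applied to feature 1 and the constant 0.5).
tree64 = Node{Float64}(;
    op=1, l=Node{Float64}(; feature=1), r=Node{Float64}(; val=0.5)
)

# Recursively convert every node; constants go through convert(Float32, val).
tree32 = convert(Node{Float32}, tree64)
```

The conversion returns a new tree of type Node{Float32}; in this sketch tree64 keeps its Float64 constant.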

You can set a tree (in-place) with set_node!:

DynamicExpressions.NodeModule.set_node! - Function
set_node!(tree::AbstractExpressionNode{T}, new_tree::AbstractExpressionNode{T}) where {T}

Set every field of tree equal to the corresponding field of new_tree.

You can create a copy of a node with copy_node:

DynamicExpressions.NodeModule.copy_node - Method
copy_node(tree::AbstractExpressionNode; break_sharing::Val{BS}=Val(false)) where {BS}

Copy a node, recursively copying all children nodes. This is more efficient than the built-in copy.

If break_sharing is set to Val(true), sharing in a tree will be ignored.
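The following sketch, assuming DynamicExpressions.jl is installed, shows copy_node and set_node! together: a copy is taken first, then the original is overwritten in place. The names tree and backup are illustrative.

```julia
using DynamicExpressions

tree = Node{Float64}(;
    op=1, l=Node{Float64}(; feature=1), r=Node{Float64}(; val=2.0)
)

# Independent deep copy: later changes to `tree` do not affect `backup`.
backup = copy_node(tree)

# Overwrite `tree` in place so that it becomes the constant 1.0.
set_node!(tree, Node{Float64}(; val=1.0))
```

After these calls tree is a degree-0 constant node, while backup still holds the original binary node. Taking a copy before an in-place edit is a simple way to keep the previous tree around.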

Expressions

Expressions are represented using the Expression type, which combines the raw Node type with an OperatorEnum.

DynamicExpressions.ExpressionModule.Expression - Type
Expression{T, N, D} <: AbstractExpression{T, N}

(Experimental) Defines a high-level, user-facing, expression type that encapsulates an expression tree (like Node) along with associated metadata for evaluation and rendering.

Fields

  • tree::N: The root node of the raw expression tree.
  • metadata::Metadata{D}: A named tuple of settings for the expression, such as the operators and variable names.

Constructors

  • Expression(tree::AbstractExpressionNode, metadata::NamedTuple): Construct from the fields
  • @parse_expression(expr, operators=operators, variable_names=variable_names, node_type=Node): Parse a Julia expression with a given context and create an Expression object.

Usage

This type is intended for end-users to interact with and manipulate expressions at a high level, abstracting away the complexities of the underlying expression tree operations.

These types allow you to define and manipulate expressions with a clear separation between the structure and the operators used.
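As a hedged sketch of the @parse_expression constructor listed above, assuming DynamicExpressions.jl is installed and that eval_tree_array accepts an Expression and uses the operators stored in its metadata. The operator lists and the variable names x and y are arbitrary choices for illustration.

```julia
using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, *], unary_operators=[cos])

# Parse ordinary Julia syntax into an Expression that carries its own
# operators and variable names as metadata.
ex = @parse_expression(
    cos(x * 2.0) + y, operators=operators, variable_names=["x", "y"]
)

# Rows are features (x, then y); columns are samples.
X = [0.0 1.0; 2.0 3.0]
output, completed = eval_tree_array(ex, X)
```

Unlike a raw Node, the expression does not need the operators passed again at evaluation time, because they travel with it in the metadata.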

Parametric Expressions

Parametric expressions are a type of expression that includes parameters which can be optimized during the search.

These types allow you to define expressions with parameters that can be tuned to fit the data better. You can specify the maximum number of parameters using the expression_options argument in SRRegressor.

Custom Expressions

You can create your own expression types by defining a new type that extends AbstractExpression.

DynamicExpressions.ExpressionModule.AbstractExpression - Type
AbstractExpression{T,N}

(Experimental) Abstract type for user-facing expression types, which contain both the raw expression tree operating on a value type of T, as well as associated metadata to evaluate and render the expression.

See ExpressionInterface for a full description of the interface implementation, as well as tests to verify correctness.

If you wish to use @parse_expression, you can also customize the parsing behavior with

  • parse_leaf

The interface is fairly flexible, and permits you to define specific functional forms, extra parameters, etc. See the documentation of DynamicExpressions.jl for more details on what methods you need to implement. Then, for SymbolicRegression.jl, you would pass expression_type to the Options constructor, as well as any expression_options you need (as a NamedTuple).

You may also need to overload SymbolicRegression.ExpressionBuilder.extra_init_params if your expression needs additional parameters. See the method for ParametricExpression as an example.

Population

Groups of equations are given as a population, which is an array of trees tagged with score, loss, and birthdate; these values are stored in the PopMember.

SymbolicRegression.PopulationModule.Population - Type
Population(pop::Array{PopMember{T,L}, 1})

Create population from list of PopMembers.

Population(dataset::Dataset{T,L};
           population_size, nlength::Int=3, options::Options,
           nfeatures::Int)

Create a random population and score its members on the dataset.

Population(X::AbstractMatrix{T}, y::AbstractVector{T};
           population_size, nlength::Int=3,
           options::Options, nfeatures::Int,
           loss_type::Type=Nothing)

Create a random population and score its members on the dataset.


Population members

SymbolicRegression.PopMemberModule.PopMember - Type
PopMember(t::AbstractExpression{T}, score::L, loss::L)

Create a population member with a birth date at the current time. The type of the Node may be different from the type of the score and loss.

Arguments

  • t::AbstractExpression{T}: The tree for the population member.
  • score::L: The score (normalized to a baseline, and offset by a complexity penalty)
  • loss::L: The raw loss to assign.
PopMember(
    dataset::Dataset{T,L},
    t::AbstractExpression{T},
    options::Options
)

Create a population member with a birth date at the current time. Automatically compute the score for this tree.

Arguments

  • dataset::Dataset{T,L}: The dataset to evaluate the tree on.
  • t::AbstractExpression{T}: The tree for the population member.
  • options::Options: What options to use.

Hall of Fame

SymbolicRegression.HallOfFameModule.HallOfFame - Type
HallOfFame{T<:DATA_TYPE,L<:LOSS_TYPE}

List of the best members seen all time in .members, with .members[c] being the best member seen at complexity c. To include only the members which have actually been set, use .members[exists].
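The members[exists] pattern is ordinary logical indexing. The sketch below uses plain strings as hypothetical stand-ins for PopMember objects, so it runs without any packages; it illustrates the indexing only and is not the library's own code.

```julia
# Hypothetical stand-ins: in a real HallOfFame these would be PopMember objects.
members = ["x1", "<unset>", "x1 * 2.0", "<unset>", "cos(x1) * x2"]
exists = [true, false, true, false, true]

# Logical indexing keeps only the slots where `exists` is true.
found = members[exists]

# The position of each kept slot is its complexity.
complexities = findall(exists)
```

With a real hall of fame, the analogous expression would be hof.members[hof.exists], where hof is an assumed variable name for a HallOfFame instance.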

Fields

  • members::Array{PopMember{T,L},1}: List of the best members seen all time. These are ordered by complexity, with .members[1] the member with complexity 1.
  • exists::Array{Bool,1}: Whether the member at the given complexity has been set.

Dataset

SymbolicRegression.CoreModule.DatasetModule.Dataset - Type
Dataset{T<:DATA_TYPE,L<:LOSS_TYPE}

Fields

  • X::AbstractMatrix{T}: The input features, with shape (nfeatures, n).
  • y::AbstractVector{T}: The desired output values, with shape (n,).
  • index::Int: The index of the output feature corresponding to this dataset, if any.
  • n::Int: The number of samples.
  • nfeatures::Int: The number of features.
  • weighted::Bool: Whether the dataset is non-uniformly weighted.
  • weights::Union{AbstractVector{T},Nothing}: If the dataset is weighted, these specify the per-sample weight (with shape (n,)).
  • extra::NamedTuple: Extra information to pass to a custom evaluation function. Since this is an arbitrary named tuple, you could pass any sort of dataset you wish to here.
  • avg_y: The average value of y (weighted, if weights are passed).
  • use_baseline: Whether to use a baseline loss. This will be set to false if the baseline loss is calculated to be Inf.
  • baseline_loss: The loss of a constant function which predicts the average value of y. This is loss-dependent and should be updated with update_baseline_loss!.
  • variable_names::Array{String,1}: The names of the features, with shape (nfeatures,).
  • display_variable_names::Array{String,1}: A version of variable_names but for printing to the terminal (e.g., with unicode versions).
  • y_variable_name::String: The name of the output variable.
  • X_units: Unit information of X. When used, this is a vector of DynamicQuantities.Quantity{<:Any,<:Dimensions} with shape (nfeatures,).
  • y_units: Unit information of y. When used, this is a single DynamicQuantities.Quantity{<:Any,<:Dimensions}.
  • X_sym_units: Unit information of X. When used, this is a vector of DynamicQuantities.Quantity{<:Any,<:SymbolicDimensions} with shape (nfeatures,).
  • y_sym_units: Unit information of y. When used, this is a single DynamicQuantities.Quantity{<:Any,<:SymbolicDimensions}.
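The shapes listed above can be checked with a small example. This is a sketch that assumes SymbolicRegression.jl is installed and that Dataset can be constructed as Dataset(X, y; variable_names=...); the constructor is not documented in this section, so treat that call as an assumption. The data values and variable names are arbitrary.

```julia
using SymbolicRegression

# Two features (rows) and three samples (columns): shape (nfeatures, n) = (2, 3).
X = Float32[1 2 3; 4 5 6]
y = Float32[5, 7, 9]  # one target per sample, shape (n,)

# Assumed constructor form; see the lead-in above.
dataset = Dataset(X, y; variable_names=["a", "b"])
```

Note the orientation: features run along rows and samples along columns, which is the transpose of the (n, nfeatures) layout common in tabular tools.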