# Types

## Equations

Equations are specified as binary trees with the `Node` type, defined as follows:

`DynamicExpressions.EquationModule.Node` — Type

`Node{T}`

Node defines a symbolic expression stored in a binary tree. A single `Node` instance is one "node" of this tree, and has references to its children. By tracing through the children nodes, you can evaluate or print a given expression.

**Fields**

- `degree::UInt8`: Degree of the node. 0 for constants, 1 for unary operators, 2 for binary operators.
- `constant::Bool`: Whether the node is a constant.
- `val::T`: Value of the node. If `degree==0` and `constant==true`, this is the value of the constant. It has a type specified by the overall type of the `Node` (e.g., `Float64`).
- `feature::UInt16`: Index of the feature to use in the case of a feature node. Only defined if `degree == 0 && constant == false`.
- `op::UInt8`: If `degree==1`, this is the index of the operator in `operators.unaops`. If `degree==2`, this is the index of the operator in `operators.binops`. In other words, this is an enum of the operators, and is dependent on the specific `OperatorEnum` object. Only defined if `degree >= 1`.
- `l::Node{T}`: Left child of the node. Only defined if `degree >= 1`. Same type as the parent node.
- `r::Node{T}`: Right child of the node. Only defined if `degree == 2`. Same type as the parent node. This is to be passed as the right argument to the binary operator.

There are a variety of constructors for `Node` objects, including:

`DynamicExpressions.EquationModule.Node` — Method

`Node([::Type{T}]; val=nothing, feature::Union{Integer,Nothing}=nothing) where {T}`

Create a leaf node: either a constant, or a variable.

**Arguments:**

- `::Type{T}`: optionally specify the type of the node, if not already given by the type of `val`.
- `val`: if you are specifying a constant, pass the value of the constant here.
- `feature::Integer`: if you are specifying a variable, pass the index of the variable here.

`DynamicExpressions.EquationModule.Node` — Method

`Node(op::Integer, l::Node)`

Apply unary operator `op` (enumerating over the order given) to `Node` `l`.

`DynamicExpressions.EquationModule.Node` — Method

`Node(op::Integer, l::Node, r::Node)`

Apply binary operator `op` (enumerating over the order given) to `Node`s `l` and `r`.

`DynamicExpressions.EquationModule.Node` — Method

`Node(var_string::String)`

Create a variable node, using the format `"x1"` to mean feature 1.
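As a brief sketch of how these constructors combine (the operator indices here are hypothetical; they only become meaningful relative to a specific `OperatorEnum`):

```julia
using DynamicExpressions

# Leaf nodes: a variable (feature index 1) and a constant
x1 = Node(Float64; feature=1)
c = Node(; val=3.0)                # element type Float64 inferred from `val`

# Operator nodes reference operators by index. With, e.g.,
# OperatorEnum(; binary_operators=[+, *], unary_operators=[cos]),
# index 2 of the binary operators is `*`:
prod_node = Node(2, x1, c)         # x1 * 3.0
cos_node = Node(1, prod_node)      # cos(x1 * 3.0)

# A variable node can also be parsed from a string:
x2 = Node("x2")                    # feature 2
```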

When you create an `Options` object, the operators passed are also re-defined for `Node` types. This allows you to use, e.g., `t = Node(; feature=1) * 3f0` to create a tree, so long as `*` was specified as a binary operator. This works automatically for operators defined in `Base`, although you can also get this to work for user-defined operators by using `@extend_operators`:

`SymbolicRegression.InterfaceDynamicExpressionsModule.@extend_operators` — Macro

`@extend_operators options`

Extends all operators defined in this options object to work on the `Node` type. While this is already done by default for operators defined in `Base` when you create an options object with `define_helper_functions=true`, it does not apply to user-defined operators. To extend those, you must apply this macro to the operator enum in the same module where the operators are defined.
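A minimal sketch of extending a user-defined operator (`myop` is a hypothetical operator; this assumes the macro is applied at the top level of the module defining it):

```julia
using SymbolicRegression

# A hypothetical user-defined binary operator:
myop(x, y) = x * y + one(x)

options = Options(; binary_operators=[+, *, myop])
@extend_operators options

# Now `myop` can act directly on Node objects:
t = myop(Node(Float64; feature=1), Node(; val=2.0))
```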

When using these node constructors, types will automatically be promoted. You can convert the type of a node using `convert`:

`Base.convert` — Method

`convert(::Type{Node{T1}}, n::Node{T2}) where {T1,T2}`

Convert a `Node{T2}` to a `Node{T1}`. This will recursively convert all children nodes to `Node{T1}`, using `convert(T1, tree.val)` at constant nodes.

**Arguments**

- `::Type{Node{T1}}`: Type to convert to.
- `tree::Node{T2}`: Node to convert.
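For example (a sketch; the operator index is arbitrary here):

```julia
using DynamicExpressions

# A small tree: some unary operator applied to (x1, constant 3.0)
tree = Node(1, Node(Float64; feature=1), Node(; val=3.0))

# Recursively convert the whole tree, including the constant leaf:
tree32 = convert(Node{Float32}, tree)
```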

You can set a `tree` (in-place) with `set_node!`:

`DynamicExpressions.EquationModule.set_node!` — Method

`set_node!(tree::Node{T}, new_tree::Node{T}) where {T}`

Set every field of `tree` equal to the corresponding field of `new_tree`.

You can create a copy of a node with `copy_node`:

`DynamicExpressions.EquationModule.copy_node` — Method

`copy_node(tree::Node; preserve_sharing::Bool=false)`

Copy a node, recursively copying all children nodes. This is more efficient than the built-in copy. With `preserve_sharing=true`, this will also preserve linkage between a node and multiple parents, whereas without it, this would create duplicate child node copies.

`id_map` is a map from `objectid(tree)` to `copy(tree)`. We check against the map before making a new copy; otherwise we can simply reference the existing copy. Thanks to Ted Hopp.

Note that this will *not* preserve loops in graphs.
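For instance, mutating a copy leaves the original untouched (a sketch; the operator index is arbitrary):

```julia
using DynamicExpressions

tree = Node(2, Node(Float64; feature=1), Node(; val=3.0))
tree2 = copy_node(tree)

# Node is a mutable struct, so fields can be set directly:
tree2.r.val = 10.0

# `tree` still has its original constant value of 3.0
```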

## Population

Groups of equations are given as a population, which is an array of trees tagged with score, loss, and birthdate; these values are stored in the `PopMember`.

`SymbolicRegression.PopulationModule.Population` — Type

`Population(pop::Array{PopMember{T,L}, 1})`

Create a population from a list of `PopMember`s.

```
Population(dataset::Dataset{T,L};
           population_size, nlength::Int=3, options::Options,
           nfeatures::Int)
```

Create a random population and score it on the dataset.

```
Population(X::AbstractMatrix{T}, y::AbstractVector{T};
           population_size, nlength::Int=3,
           options::Options, nfeatures::Int,
           loss_type::Type=Nothing)
```

Create a random population and score it on the dataset.
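As a sketch of the matrix-based constructor (assuming a default-constructed `Options`; keyword names follow the signature above):

```julia
using SymbolicRegression

options = Options(; binary_operators=[+, *])
X = randn(Float64, 2, 32)        # shape (nfeatures, n)
y = X[1, :] .* 2.0

pop = Population(X, y; population_size=20, options=options, nfeatures=2)
```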

## Population members

`SymbolicRegression.PopMemberModule.PopMember` — Type

`PopMember(t::Node{T}, score::L, loss::L)`

Create a population member with a birth date at the current time. The type of the `Node` may be different from the type of the score and loss.

**Arguments**

- `t::Node{T}`: The tree for the population member.
- `score::L`: The score (normalized to a baseline, and offset by a complexity penalty).
- `loss::L`: The raw loss to assign.

```
PopMember(dataset::Dataset{T,L},
          t::Node{T}, options::Options)
```

Create a population member with a birth date at the current time. Automatically computes the score for this tree.

**Arguments**

- `dataset::Dataset{T,L}`: The dataset to evaluate the tree on.
- `t::Node{T}`: The tree for the population member.
- `options::Options`: What options to use.
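A sketch of the dataset-based constructor (assuming a default `Options` and the `Dataset` type described below; the operator index 2 assumes `*` is the second binary operator):

```julia
using SymbolicRegression

options = Options(; binary_operators=[+, *])
X = randn(Float64, 2, 32)
y = 2.0 .* X[1, :]
dataset = Dataset(X, y)

# x1 * 2.0, if `*` is binary operator 2 in `options`
tree = Node(2, Node(Float64; feature=1), Node(; val=2.0))
member = PopMember(dataset, tree, options)
```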

## Hall of Fame

`SymbolicRegression.HallOfFameModule.HallOfFame` — Type

`HallOfFame{T<:DATA_TYPE,L<:LOSS_TYPE}`

List of the best members seen over all time, stored in `.members`, with `.members[c]` being the best member seen at complexity `c`. To include only the members which have actually been set, you can run `.members[exists]`.

**Fields**

- `members::Array{PopMember{T,L},1}`: List of the best members seen over all time, ordered by complexity, with `.members[1]` the member with complexity 1.
- `exists::Array{Bool,1}`: Whether the member at the given complexity has been set.

`SymbolicRegression.HallOfFameModule.HallOfFame` — Method

`HallOfFame(options::Options, ::Type{T}, ::Type{L}) where {T<:DATA_TYPE,L<:LOSS_TYPE}`

Create an empty `HallOfFame`. The `HallOfFame` stores a list of `PopMember` objects in `.members`, which is enumerated by size (i.e., `.members[1]` is the constant solution). `.exists` is used to determine whether the particular member has been instantiated or not.

Arguments:

- `options`: Options containing specification about deterministic mode.
- `T`: Type of Nodes to use in the population, e.g., `Float64`.
- `L`: Type of loss to use in the population, e.g., `Float64`.

## Dataset

`SymbolicRegression.CoreModule.DatasetModule.Dataset` — Type

`Dataset{T<:DATA_TYPE,L<:LOSS_TYPE}`

**Fields**

- `X::AbstractMatrix{T}`: The input features, with shape `(nfeatures, n)`.
- `y::AbstractVector{T}`: The desired output values, with shape `(n,)`.
- `n::Int`: The number of samples.
- `nfeatures::Int`: The number of features.
- `weighted::Bool`: Whether the dataset is non-uniformly weighted.
- `weights::Union{AbstractVector{T},Nothing}`: If the dataset is weighted, these specify the per-sample weights (with shape `(n,)`).
- `extra::NamedTuple`: Extra information to pass to a custom evaluation function. Since this is an arbitrary named tuple, you could pass any sort of dataset you wish to here.
- `avg_y`: The average value of `y` (weighted, if `weights` are passed).
- `use_baseline`: Whether to use a baseline loss. This will be set to `false` if the baseline loss is calculated to be `Inf`.
- `baseline_loss`: The loss of a constant function which predicts the average value of `y`. This is loss-dependent and should be updated with `update_baseline_loss!`.
- `variable_names::Array{String,1}`: The names of the features, with shape `(nfeatures,)`.
- `display_variable_names::Array{String,1}`: A version of `variable_names` for printing to the terminal (e.g., with unicode versions).
- `y_variable_name::String`: The name of the output variable.
- `X_units`: Unit information of `X`. When used, this is a vector of `DynamicQuantities.Quantity{<:Any,<:Dimensions}` with shape `(nfeatures,)`.
- `y_units`: Unit information of `y`. When used, this is a single `DynamicQuantities.Quantity{<:Any,<:Dimensions}`.
- `X_sym_units`: Symbolic unit information of `X`. When used, this is a vector of `DynamicQuantities.Quantity{<:Any,<:SymbolicDimensions}` with shape `(nfeatures,)`.
- `y_sym_units`: Symbolic unit information of `y`. When used, this is a single `DynamicQuantities.Quantity{<:Any,<:SymbolicDimensions}`.

`SymbolicRegression.CoreModule.DatasetModule.Dataset` — Method

```
Dataset(X::AbstractMatrix{T}, y::Union{AbstractVector{T},Nothing}=nothing;
        weights::Union{AbstractVector{T}, Nothing}=nothing,
        variable_names::Union{Array{String, 1}, Nothing}=nothing,
        y_variable_name::Union{String,Nothing}=nothing,
        extra::NamedTuple=NamedTuple(),
        loss_type::Type=Nothing,
        X_units::Union{AbstractVector, Nothing}=nothing,
        y_units=nothing,
) where {T<:DATA_TYPE}
```

Construct a dataset to pass between internal functions.
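For example (a sketch; note that `X` is feature-major, with shape `(nfeatures, n)`):

```julia
using SymbolicRegression

X = randn(Float64, 3, 100)   # 3 features, 100 samples
y = X[1, :] .+ X[2, :]

dataset = Dataset(X, y; variable_names=["a", "b", "c"])
```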

`SymbolicRegression.LossFunctionsModule.update_baseline_loss!` — Function

`update_baseline_loss!(dataset::Dataset{T,L}, options::Options) where {T<:DATA_TYPE,L<:LOSS_TYPE}`

Update the baseline loss of the dataset using the loss function specified in `options`.