AbstractTrees.jl

This package provides an interface for handling tree-like data structures in Julia.

Specifically, a tree consists of a set of nodes (each of which can be represented by any data type) which are connected in a graph with no cycles. For example, each object in the nested array [1, [2, 3]] can be represented by a tree in which the object [1, [2, 3]] is itself the root of the tree, 1 and [2,3] are its children and 1, 2 and 3 are the leaves.

Using this package involves implementing the abstract tree interface which, at a minimum, requires defining the function AbstractTrees.children for an object.

See below for a complete guide on how to implement the interface.

The Abstract Tree Interface

Functions

All trees must define children.

By default children returns an empty tuple and parent returns nothing, meaning that all objects which do not define the abstract trees interface can be considered the sole node of a tree.

AbstractTrees.children — Function

children(node)

Get the immediate children of node node.

By default, every object is a parent node of an empty set of children. This is to make it simpler to define trees with nodes of different types, for example arrays are trees regardless of their eltype.

REQUIRED: This is required for all tree nodes with non-empty sets of children. If it is not possible to infer the children from the node alone, this should be defined for a wrapper object which does.

AbstractTrees.nodevalue — Method

nodevalue(node)

Get the value associated with a node in a tree. This removes wrappers such as IndexNode or TreeCursors.

By default, this function is the identity.

OPTIONAL: This should be implemented with any tree for which nodes have some "value" apart from the node itself. For example, integers cannot themselves be tree nodes, to create a tree in which the "nodes" are integers one can do something like

struct IntNode
    value::Int
    children::Vector{IntNode}
end

AbstractTrees.nodevalue(n::IntNode) = n.value

AbstractTrees.parent — Function

parent(node)

Get the immediate parent of a node node.

By default all objects are considered nodes of a trivial tree with no children and no parents. That is, the default method is simply parent(node) = nothing.

OPTIONAL: The 1-argument version of this function must be implemented for nodes with the StoredParents trait.

AbstractTrees.nextsibling — Function

nextsibling(node)

Get the next sibling (child of the same parent) of the tree node node. The returned node should be the same as the node that would be returned after node when iterating over (children ∘ parent)(node).

OPTIONAL: This function is required for nodes with the StoredSiblings trait. There is no default definition.

AbstractTrees.prevsibling — Function

prevsibling(node)

Get the previous sibling (child of the same parent) of the tree node node. The returned node should be the same as the node that would be returned prior to node when iterating over (children ∘ parent)(node).

OPTIONAL: This function is optional in all cases. Default AbstractTrees method assume that it is impossible to obtain the previous sibling and all iterators act in the "forward" direction.

AbstractTrees.childrentype — Function

childrentype(::Type{T})
childrentype(n)

Indicates the type of the children (the collection of children, not individual children) of the tree node n or its type T. children should return an object of this type.

If the childrentype can be inferred from the type of the node alone, the type ::Type{T} definition is preferred (the latter will fall back to it).

OPTIONAL: In most cases, childtype is used instead. If childtype is not defined it will fall back to eltype ∘ childrentype.

AbstractTrees.childtype — Function

childtype(::Type{T})
childtype(n)

Indicates the type of children of the tree node n or its type T.

If childtype can be inferred from the type of the node alone, the type ::Type{T} definition is preferred (the latter will fall back to it).

OPTIONAL: It is strongly recommended to define this wherever possible, as without it almost no tree algorithms can be type-stable. If childrentype is defined and can be known from the node type alone, this function will fall back to eltype(childrentype(T)). If this gives a correct result it's not necessary to define childtype.

AbstractTrees.childstatetype — Function

childstatetype(::Type{T})
childstatetype(n)

Indicates the type of the iteration state of the tree node n or its type T. This is used by tree traversal algorithms which must retain this state. It therefore is necessary to define this to ensure that most tree traversal is type stable.

OPTIONAL: Type inference is used to attempt to

Note

In general nodes of a tree do not all need to have the same type, but it is much easier to achieve type-stability if they do. To specify that all nodes of a tree must have the same type, one should define Base.eltype(::Type{<:TreeIterator{T}}), see Iteration.

Traits

`ParentLinks`

The default value of ParentLinks is ImplicitParents.

Types with the StoredParents trait must define parent.

AbstractTrees.ParentLinks — Type

ParentLinks(::Type{T})
ParentLinks(tree)

A trait which indicates whether a tree node stores references to its parents (StoredParents()) or if the parents must be inferred from the tree structure (ImplicitParents()).

Trees for which ParentLinks returns StoredParents() MUST implement parent.

If StoredParents(), all nodes in the tree must also have StoredParents(), otherwise use ImplicitParents().

OPTIONAL: This should be implemented for a tree if parents of nodes are stored

AbstractTrees.ParentLinks(::Type{<:TreeType}) = AbstractTrees.StoredParents()
parent(t::TreeType) = get_parent(t)

AbstractTrees.ImplicitParents — Type

ImplicitParents <: ParentLinks

Indicates that the tree does not store parents. In these cases parents must be inferred from the tree structure and cannot be inferred through a single node.

AbstractTrees.StoredParents — Type

StoredParents <: ParentLinks

Indicates that this node stores parent links explicitly. The implementation is responsible for defining the parentind function to expose this information.

If a node in a tree has this trait, so must all connected nodes. If this is not the case, use ImplicitParents instead.

Required Methods

parent

`SiblingLinks`

The default value of SiblingLinks is ImplicitSiblings.

Types with the StoredSiblings trait must define nextsibling and may define prevsibling.

AbstractTrees.SiblingLinks — Type

SiblingLinks(::Type{T})
SiblingLinks(tree)

A trait which indicates whether a tree node stores references to its siblings (StoredSiblings()) or must be inferred from the tree structure (ImplicitSiblings()).

If a node has the trait StoredSiblings(), so must all connected nodes in the tree. Otherwise, use ImplicitSiblings() instead.

OPTIONAL: This should be implemented for a tree if siblings of nodes are stored

AbstractTrees.SiblingLinks(::Type{<:TreeType}) = AbstractTrees.StoredSiblings()

AbstractTrees.ImplicitSiblings — Type

ImplicitSiblings <: SiblingLinks

Indicates that tree nodes do not store references to siblings so that they must be inferred from the tree structure.

AbstractTrees.StoredSiblings — Type

StoredSiblings <: SiblingLinks

Indicates that this tree node stores sibling links explicitly, or can compute them quickly (e.g. because the tree has a (small) fixed branching ratio, so the current index of a node can be determined by quick linear search).

If a node has this trait, so must all connected nodes in the tree. Otherwise, use ImplicitSiblings() instead.

Required Methods

nextsibling

`ChildIndexing`

AbstractTrees.ChildIndexing — Type

ChildIndexing(::Type{N})
ChildIndexing(node)

A trait indicating whether the tree node n has children (as returned by children) which can be indexed using 1-based indexing. Options are either NonIndexedChildren (default) or IndexedChildren.

To declare that the tree TreeType supports one-based indexing on the children, define

AbstractTrees.ChildIndexing(::Type{<:TreeType}) = AbstractTrees.IndexedChildren()

If a node has the IndexedChildren() so must all connected nodes in the tree. Otherwise, use NonIndexedChildren() instead.

AbstractTrees.NonIndexedChildren — Type

NonIndexedChildren <: ChildIndexing

Indicates that the object returned by children(n) where n is a tree node is not necessarily indexable. This trait applies to any tree which cannot guarantee indexable children in all cases, regardless of whether the tree is indexable in special cases. For example, Array has this trait even though there is a large class of indexable trees consisting of arrays.

AbstractTrees.IndexedChildren — Type

IndexedChildren <: ChildIndexing

Indicates that the object returned by children(n) where n is a tree node is indexable (1-based).

If a node has this trait so must all connected nodes in the tree. Otherwise, use NonIndexedChildren() instead.

Required Methods

A node node with this trait must return an indexable object from children(node).

The default value of ChildIndexing is NonIndexedChildren.

Types with the IndexedChildren trait must return an indexable object from children (i.e. children(node)[idx] must be valid for positive integers idx).

`NodeType`

AbstractTrees.NodeType — Type

NodeType(::Type)
NodeType(node)

A trait which specifies whether a tree has a predictable node type (HasNodeType()) or not (NodeTypeUnknown()).

This is analogous to Base.IteratorEltype. In particular the IteratorEltype of TreeIterator is dictated by this trait.

The default value is NodeTypeUnknown().

AbstractTrees.NodeTypeUnknown — Type

NodeTypeUnknown <: NodeType

Indicates that this node is connected to a tree for which it cannot be guaranteed that all nodes have the same type.

AbstractTrees.HasNodeType — Type

HasNodeType <: NodeType

Indicates that this node is connected to a tree for which all nodes have types descended from eltype(node).

AbstractTrees.nodetype — Function

nodetype(::Type{T})
nodetype(node) = nodetype(typeof(node))

Returns a type which must be a parent type of all nodes in the tree connected to node. This can be used to, for example, specify the eltype of any TreeIterator on node.

Providing the HasNodeType trait will guarantee that all nodes connected to the node must be of the type returned by nodetype.

An important use case of this trait is to guarantee the return types of a TreeIterator. Tree nodes with NodeTypeUnknown cannot have type-stable iteration over the entire tree.

For example

struct ExampleNode
    x::Int
    children::Vector{ExampleNode}
end

AbstractTrees.nodevalue(x::ExampleNode) = x.x
AbstractTrees.children(x::ExampleNode) = x.children

AbstractTrees.NodeType(::Type{<:ExampleNode}) = HasNodeType()
AbstractTrees.nodetype(::Type{<:ExampleNode}) = ExampleNode

In this example, iteration over a tree of ExampleNodes is type-stable with eltype ExampleNode.

Providing the nodetype(::Type) method is preferable to defining nodetype(::ExampleNode) because it ensures that nodetype can be involved at compile time even if values are not known.

The Indexed Tree Interface

The abstract tree interface assumes that all information about the descendants of a node is accessible through the node itself. The objects which can implement that interface are the nodes of a tree, not the tree itself.

The interface for trees which do not possess nodes for which tree structure can be inferred from the nodes alone is different. This is done by wrapping nodes in the IndexNode nodes which allow nodes to be accessed from a centralized tree object.

AbstractTrees.IndexNode — Type

IndexNode{T,I}

The node of a tree which implements the indexed tree interface. Such a tree consists of an object tree from which nodes can be obtained with the two-argument method of nodevalue which by default calls getindex.

An IndexNode implements the tree interface, and can be thought of an adapter from an object that implements the indexed tree interface to one that implements the tree interface.

IndexNode do not store the value associated with the node but can obtain it by calling nodevalue.

Constructors

IndexNode(tree, node_index)

IndexNode(tree) = IndexNode(tree, rootindex(tree))  # one-argument constructor requires `rootindex`

Here tree is an object which stores or can obtain information for the entire tree structure, and node_index is the index of the node for which node_index is being constructed.

Functions

All indexed trees must implement childindices.

Indexed trees rely on nodevalue(::Any, ::Any) for obtaining the value of a node given the tree and index. By default, nodevalue(tree, idx) = tree[idx], trees which do not store nodes in this way should define nodevalue.

Indexed trees can define ParentLinks or SiblingLinks. The IndexNodes of a tree will inherit these traits from the wrapped tree.

AbstractTrees.childindices — Function

childindices(tree, node_index)

Get the indices of the children of the node of tree tree specified by node_index.

To be consistent with children, by default this returns an empty tuple.

REQUIRED for indexed trees: Indexed trees, i.e. trees that do not implement children must implement this function.

AbstractTrees.nodevalue — Method

nodevalue(tree, node_index)

Get the value of the node specified by node_index from the indexed tree object tree.

By default, this falls back to tree[node_index].

OPTIONAL: Indexed trees only require this if the fallback to getindex is not sufficient.

AbstractTrees.parentindex — Function

parentindex(tree, node_index)

Get the index of the parent of the node of tree tree specified by node_index.

Nodes that have no parent (i.e. the root node) should return nothing.

OPTIONAL: Indexed trees with the StoredParents trait must implement this.

AbstractTrees.nextsiblingindex — Function

nextsiblingindex(tree, node_index)

Get the index of the next sibling of the node of tree tree specified by node_index.

Nodes which have no next sibling should return nothing.

OPTIONAL: Indexed trees with the StoredSiblings trait must implement this.

AbstractTrees.prevsiblingindex — Function

prevsiblingindex(tree, node_index)

Get the index of the previous sibling of the node of tree tree specified by node_index.

Nodes which have no previous sibling should return nothing.

OPTIONAL: Indexed trees that have StoredSiblings can implement this, but no built-in tree algorithms require it.

AbstractTrees.rootindex — Function

rootindex(tree)

Get the root index of the indexed tree tree.

OPTIONAL: The single-argument constructor for IndexNode requires this, but it is not required for any built-in tree algorithms.

The `AbstractNode` Type

It is not required that objects implementing the AbstractTrees.jl interface are of this type, but it can be used to indicate that an object must implement the interface.

AbstractTrees.AbstractNode — Type

AbstractNode{T}

Abstract type of tree nodes that implement the AbstractTrees.jl interface.

It is NOT necessary for tree nodes to inherit from this type to implement the AbstractTrees.jl interface. Conversely, all AbstractNode types are required to satisfy the AbstractTrees.jl interface (i.e. they must at least define children).

Package developers should keep in mind when writing methods that most trees will not be of this type. Therefore, any functions which are intended to work on any tree should not dispatch on AbstractNode.

The type parameter T is the type of the nodevalue of the concrete type descented from AbstractNode.

Type Stability and Performance

Because of the recursive nature of trees it can be quite challenging to achieve type stability when traversing it in any way such as iterating over nodes. Only trees which guarantee that all nodes are of the same type (with HasNodeType) can be type stable.

To make it easier to convert trees with non-uniform node types this package provides the StableNode type.

AbstractTrees.StableNode — Type

StableNode{T} <: AbstractNode{T}

A node belonging to a tree in which all nodes are of type StableNode{T}. This type is provided so that trees with NodeTypeUnknown can implement methods to be converted to type-stable trees with indexable children which allow for efficient traversal and iteration.

Constructors

StableNode{T}(x::T, ch)
StableNode(x, ch=())
StableNode{T}(𝒻, node)

Arguments

x: the value of the constructed node, returned by nodevalue.
ch: the children of the node, each must be of type StableNode.
𝒻: A function which, when called on the node of a tree returns a value which should be wrapped by a StableNode. The return value of 𝒻 must be convertible to T (see example).
T: The value type of the StableNodes in a tree.
node: A node from a tree which is to be used to construct the StableNode tree.

Examples

t = [1, [2,3]]

node = StableNode{Union{Int,Nothing}}(t) do n
    n isa Integer ? convert(Int, n) : nothing
end

In the above example node is a tree with HasNodeType, nodes of type StableNode{Union{Int,Nothing}}. The nodes in the new tree corresponding to arrays have value nothing while other nodes have their corresponding Int value.

To achieve the same performance with custom node types be sure to define at least

AbstractTrees.NodeType(::Type{<:ExampleNode}) = HasNodeType()
AbstractTrees.nodetype(::Type{<:ExampleNode}) = ExampleNode

In some circumstances it is also more efficient for nodes to have ChildIndexing since this also guarantees the type of the iteration state of the iterator returned by children.

Additional Functions

AbstractTrees.getdescendant — Function

getdescendant(node, idx)

Obtain a node from a tree by indexing each level of the tree with the elements of idx.

This function is defined for all trees regardless of whether they have the IndexedChildren. This is because a tree without IndexedChildren might have special cases in which all children are indexable, a prominent example being Array which may not have indexable sub-trees (e.g. an array containing a Dict) but there are common special cases in which array trees are fully indexable (e.g. a tree in which every non-leaf node is an array).

The elements of idx can be any argument to getindex, not necessarily integers. For example, getdescendant(Dict("a"=>1), ("a",)) returns 1.

Note that this is a separate concept from indexed trees which by default do not have IndexedChildren(), see IndexNode.

Example

v = [1, [2, [3, 4]]]

getdescendant(v, (2, 2, 1)) == 3

AbstractTrees.nodevalues — Function

nodevalues(itr::TreeIterator)

An iterator which returns the nodevalue of each node in the tree, equivalent to Iterators.map(nodevalue, itr).

AbstractTrees.ischild — Function

ischild(node1, node2; equiv=(===))

Check if node1 is a child of node2.

By default this iterates through children(node2), so performance may be improved by adding a specialized method for given node type.

Equivalence is established with the equiv function. New methods of this function should include this argument or else it will fall back to ===.

AbstractTrees.isroot — Function

isroot(x)

Whether x is the absolute root of a tree. More specifically, this returns true if parent(x) ≡ nothing, or parent(root, x) ≡ nothing. That is, while any node is the root of some tree, this function only returns true for nodes which have parents which cannot be obtained with the AbstractTrees interface.

AbstractTrees.intree — Function

intree(node, root; equiv=(≡))

Check if node is a member of the tree rooted at root.

By default this traverses through the entire tree in search of node, and so may be slow if a more specialized method has not been implemented for the given tree type.

Equivalence is established with the equiv function. Note that new methods should also define equiv or calls may fall back to the default method.

See also: isdescendant

AbstractTrees.isdescendant — Function

isdescendant(node1, node2; equiv=(≡))

Check if node1 is a descendant of node2. This isequivalent to checking whether node1 is a member of the subtree rooted at node2 (see intree) except that a node cannot be a descendant of itself.

Internally this calls intree(node1, node2) and so may be slow if a specialized method of that function is not available.

Equivalence is established with the equiv function. Note that new methods should also define equiv or calls may fall back to the default method.

AbstractTrees.treesize — Function

treesize(node)

Get the size of the tree rooted at node.

By default this recurses through all nodes in the tree and so may be slow if a more specialized method has not been implemented for the given type.

AbstractTrees.treebreadth — Function

treebreadth(node)

Get the number of leaves in the tree rooted at node. Leaf nodes have a breadth of one.

By default this recurses through all nodes in the tree and so may be slow if a more specialized method has not been implemented for the given type.

AbstractTrees.treeheight — Function

treeheight(node)

Get the maximum depth from node to any of its descendants. Leaf nodes have a height of zero.

By default this recurses through all nodes in the tree and so may be slow if a more specialized method has not been implemented for the given type.

AbstractTrees.descendleft — Function

descendleft(node)

Descend from the node node to the first encountered leaf node by recursively calling children and taking the first child.

AbstractTrees.getroot — Function

getroot(node)

Get the root of the tree containing node node. This requires node to have the trait StoredParents.

AbstractTrees.print_tree — Function

print_tree(tree; kw...)
print_tree(io::IO, tree; kw...)
print_tree(f::Function, io::IO, tree; kw...)
print_tree(f::Function, g::Function, io::IO, tree; kw...)

Print a text representation of tree to the given io object.

Arguments

f::Function - custom implementation of printnode to use. Should have the signature f(io::IO, node; kw...).
g::Function - custom implementation of print_child_key to use. Should have the signature g(io::IO, key;).
io::IO - IO stream to write to.
tree - tree to print.
maxdepth::Integer = 5 - truncate printing of subtrees at this depth.
indicate_truncation::Bool = true - print a vertical ellipsis character beneath truncated nodes.
charset::TreeCharSet - TreeCharSet to use to print branches.
printkeys::Union{Bool, Nothing} - Whether to print keys of child nodes (using pairs(children(node))). A value of nothing uses shouldprintkeys to decide the behavior on a node-by-node basis.
printnode_kw = (;) - keyword arguments to forward to f.

Examples

julia> tree = [1:3, "foo", [[[4, 5], 6, 7], 8]];

julia> print_tree(tree)
Vector{Any}
├─ UnitRange{Int64}
│  ├─ 1
│  ├─ 2
│  └─ 3
├─ "foo"
└─ Vector{Any}
   ├─ Vector{Any}
   │  ├─ Vector{Int64}
   │  │  ├─ 4
   │  │  └─ 5
   │  ├─ 6
   │  └─ 7
   └─ 8

julia> print_tree(tree, maxdepth=2)
Vector{Any}
├─ UnitRange{Int64}
│  ├─ 1
│  ├─ 2
│  └─ 3
├─ "foo"
└─ Vector{Any}
   ├─ Vector{Any}
   │  ⋮
   │
   └─ 8

julia> print_tree(tree, charset=AbstractTrees.ASCII_CHARSET)
Vector{Any}
+-- UnitRange{Int64}
|   +-- 1
|   +-- 2
|   \-- 3
+-- "foo"
\-- Vector{Any}
    +-- Vector{Any}
    |   +-- Vector{Int64}
    |   |   +-- 4
    |   |   \-- 5
    |   +-- 6
    |   \-- 7
    \-- 8

AbstractTrees.printnode — Function

printnode(io::IO, node; kw...)

Print a compact representation of a single node. By default, this prints nodevalue(node).

OPTIONAL: This can be extended for custom types and controls how nodes are shown in print_tree.

The keyword argument printnode_kw of print_tree will be passed to this function.

AbstractTrees.print_child_key — Function

print_child_key(io::IO, key)

Print the key for a child node.

AbstractTrees.TreeCharSet — Type

TreeCharSet(mid, terminator, skip, dash, trunc, pair)

Set of characters (or strings) used to pretty-print tree branches in print_tree.

Fields

mid: "Forked" branch segment connecting to middle children.
terminator: Final branch segment connecting to last child.
skip: Vertical branch segment.
dash: Horizontal branch segment printed to the right of mid and terminator.
trunc: Used to indicate the subtree has been truncated at the maximum depth.
pair: Printed between a child node and its key.

AbstractTrees.shouldprintkeys — Function

shouldprintkeys(children)::Bool

Whether a collection of children should be printed with its keys by default.

The base behavior is to print keys for all collections for which keys() is defined, with the exception of AbstractVectors and tuples.

AbstractTrees.repr_tree — Function

repr_tree(tree; context=nothing, kw...)
repr_tree(f, tree; context=nothing, kw...)
repr_tree(f, g, tree; context=nothing, kw...)

Get the string result of calling print_tree with the supplied arguments.

The context argument works as it does in Base.repr.

AbstractTrees.repr_node — Function

repr_node(node; context=nothing)

Get the string representation of a node using printnode. This works analogously to Base.repr.

context is an IO or IOContext object whose attributes are used for the I/O stream passed to printnode.

Example Implementations

All objects in base which define the abstract trees interface are defined in builtins.jl.
IDTree
OneNode
OneTree
FSNode
BinaryNode