Heaps

# Heaps

Heaps are data structures that efficiently maintain the minimum (or maximum) for a set of data that may dynamically change.

All heaps in this package are derived from `AbstractHeap`, and provide the following interface:

``````# Let `h` be a heap, `v` be a value, and `n` be an integer size

length(h)            # returns the number of elements

isempty(h)           # returns whether the heap is empty

push!(h, v)          # add a value to the heap

first(h)             # return the first (top) value of a heap

pop!(h)              # removes the first (top) value, and returns it

extract_all!(h)      # removes all elements and returns sorted array

extract_all_rev!(h)  # removes all elements and returns reverse sorted array

sizehint!(h, n)      # reserve capacity for at least `n` elements``````

Mutable heaps (values can be changed after being pushed to a heap) are derived from `AbstractMutableHeap <: AbstractHeap`, and additionally provides the following interface:

``````# Let `h` be a heap, `i` be a handle, and `v` be a value.

i = push!(h, v)            # adds a value to the heap and and returns a handle to v

update!(h, i, v)           # updates the value of an element (referred to by the handle i)

delete!(h, i)              # deletes the node with handle i from the heap

v, i = top_with_handle(h)  # returns the top value of a heap and its handle``````

Currently, both min/max versions of binary heap (type `BinaryHeap`) and mutable binary heap (type `MutableBinaryHeap`) have been implemented.

Examples of constructing a heap:

``````h = BinaryMinHeap{Int}()
h = BinaryMaxHeap{Int}()             # create an empty min/max binary heap of integers

h = BinaryMinHeap([1,4,3,2])
h = BinaryMaxHeap([1,4,3,2])         # create a min/max heap from a vector

h = MutableBinaryMinHeap{Int}()
h = MutableBinaryMaxHeap{Int}()      # create an empty mutable min/max heap

h = MutableBinaryMinHeap([1,4,3,2])
h = MutableBinaryMaxHeap([1,4,3,2])  # create a mutable min/max heap from a vector``````

## Using alternate orderings

Heaps can also use alternate orderings apart from the default one defined by `Base.isless`. This is accomplished by passing an instance of `Base.Ordering` as the first argument to the constructor. The top of the heap will then be the element that comes first according to this ordering.

The following example uses 2-tuples to track the index of each element in the original array, but sorts only by the data value:

``````data = collect(enumerate(["foo", "bar", "baz"]))

h1 = BinaryHeap(data) # Standard lexicographic ordering for tuples
first(h1)             # => (1, "foo")

h2 = BinaryHeap(Base.By(last), data) # Order by 2nd element only
first(h2)                            # => (2, "bar")``````

If the ordering type is a singleton it can be passed as a type parameter to the constructor instead:

``````BinaryHeap{T, O}()        # => BinaryHeap{T}(O())
MutableBinaryHeap{T, O}() # => MutableBinaryHeap{T}(O())``````

## Min-max heaps

Min-max heaps maintain the minimum and the maximum of a set, allowing both to be retrieved in constant (`O(1)`) time. The min-max heaps in this package are subtypes of `AbstractMinMaxHeap <: AbstractHeap` and have the same interface as other heaps with the following additions:

``````# Let h be a min-max heap, k an integer
minimum(h)     # return the smallest element
maximum(h)     # return the largest element

popmin!(h)     # remove and return the smallest element
popmin!(h, k)  # remove and return the smallest k elements

popmax!(h)     # remove and return the largest element
popmax!(h, k)  # remove and return the largest k elements

popall!(h)     # remove and return all the elements, sorted smallest to largest
popall!(h, o)  # remove and return all the elements according to ordering o``````

The usual `first(h)` and `pop!(h)` are defined to be `minimum(h)` and `popmin!(h)`, respectively.

This package includes an implementation of a binary min-max heap (`BinaryMinMaxHeap`).

Atkinson, M.D., Sack, J., Santoro, N., & Strothotte, T. (1986). Min-Max > Heaps and Generalized Priority Queues. Commun. ACM, 29, 996-1000. doi: 10.1145/6617.6621

Examples:

``````h = BinaryMinMaxHeap{Int}()        # create an empty min-max heap with integer values

h = BinaryMinMaxHeap([1, 2, 3, 4]) # create a min-max heap from a vector``````

# Functions using heaps

Heaps can be used to extract the largest or smallest elements of an array without sorting the entire array first:

``````data = [0,21,-12,68,-25,14]
nlargest(3, data)  # => [68,21,14]
nsmallest(3, data) # => [-25,-12,0]``````

Both methods also support the `by` and `lt` keywords to customize the sort order, as in `Base.sort`:

``````nlargest(3, data, by=x -> x^2)  # => [68,-25,21]
nsmallest(3, data, by=x -> x^2) # => [0,-12,14]``````

The lower-level `DataStructures.nextreme` function takes a `Base.Ordering` instance as the first argument and returns the first `n` elements according to this ordering:

``DataStructures.nextreme(Base.Forward, n, a) # Equivalent to nsmallest(n, a)``

# Improving performance with Float data

One use case for custom orderings is to achieve faster performance with `Float` elements with the risk of random ordering if any elements are `NaN`. The provided `DataStructures.FasterForward` and `DataStructures.FasterReverse` orderings are optimized for this purpose and may achive a 2x performance boost:

``````h = BinaryHeap{Float64, DataStructures.FasterForward}() # faster min heap
h = BinaryHeap{Float64, DataStructures.FasterReverse}() # faster max heap

h = MutableBinaryHeap{Float64, DataStructures.FasterForward}() # faster mutable min heap
h = MutableBinaryHeap{Float64, DataStructures.FasterReverse}() # faster mutable max heap

DataStructures.nextreme(DataStructures.FasterReverse(), n, a)  # faster nlargest(n, a)
DataStructures.nextreme(DataStructures.FasterForward(), n, a)  # faster nsmallest(n, a)``````