Heaps

The (binary) heap data structure is an array that represents a nearly complete binary tree.
NoteThe heap data structure is completely unrelated to the region of the computer’s memory of the same name.
NoteA binary tree is a tree where every node has at most two children.
A complete binary tree is one where each node has 0 or two children and the total number of elements of a tree of height $h$ is $2^{h + 1}  1$. Notice how every level is completely filled and all leaves are at the same level.
A nearly complete binary tree is one where every level, except possibly the last, is completely filled. On the last level $h$, it must have between $1$ and $2^h$ nodes.
Array Representation

A binary heap is often represented as an array.

Below are the properties of such representation.

The size of the array $n$ is the number of elements in the heap.

The root of the tree is the first element in the array.

The parent of the element at index $i$ is at index $\dfrac {i  1}{2}$.

The left child of the element at index $i$ is at index $2i + 1$.

The right child of the element at index $i$ is at index $2i + 2$.

The height of the tree with $n$ elements is $\lceil \log (n + 1) \rceil$.

Heap Property

There are two kinds of binary heaps: maxheaps and minheaps. In both kinds, the values in the nodes must satisfy a heap property. For the remainder of this section, we will discuss maxheaps. Minheaps are analogous.

In a maxheap, nodes must satisfy the maxheap property: every node must be greater than or equal to its children.

Below are some examples of valid maxheaps: all are nearly complete binary trees where the values in the nodes satisfy the maxheap property.
Heap Operations
maxHeapify

maxHeapify(i) corrects a single violation of the heap property in a subtree with root at
i
. This procedure assumes the left and right children of the element ati
are valid maxheaps. 
To correct a violation at index
i
, we findlargest
, the largest ofleft(i)
andright(i)
. Then we swap the elements at indicesi
andlargest
(A[i]
andA[largest]
) and callmaxHeapify(largest)
until the violation no longer exists. 
Here is an example of correcting a violation at index
1
by callingmaxHeapify(8)
. 
Pseudocode for
maxHeapify
, whereA
is the array representing the heap:maxHeapify(i): l = left(i) r = right(i) if l < A.size and A[l] > A[i]: largest = l else: largest = i if r < A.size and A[r] > A[largest]: largest = r if largest ≠ i swap A[i] and A[largest] maxHeapify(largest)

Time complexity of
maxHeapify
is $O(\log n)$ where $n$ is the number of elements in the heap.A more precise time complexity of calling
maxHeapify(i)
is $O(h)$ where $h$ is the height of the subtree with root at nodei
. If the number of elements in the subtree is $n_i$, the time complexity is $O(\log n_i)$.
buildMaxHeap

buildMaxHeap() produces a maxheap from an unordered array.

We observe that the leaves of the heap are always valid maxheaps. So to build a maxheap, we need to correct violations on nodes that are not leaves. (If the array that we use to represent our heap is 0indexed, these would nodes at indices between $\dfrac{n  2}{2}$ and $0$ inclusive).

Here is an example of building a max heap from an array containing the elements
{4, 1, 3, 2, 16, 9, 10, 14, 8, 7
. 
Pseudocode for
buildMaxHeap
, whereA
is the array representing the heap:buildMaxHeap(): for i = (A.size  2) / 2..≥0: maxHeapify(i)

A simple analysis of time complexity of
buildMaxHeap
on a heap with $n$ elements will lead to a complexity of $O(n \log n)$, since we make $\dfrac{n}{2}$ calls tomaxHeapify
and each call takes $O(\log n)$ time.However, we can get a tighter bound by doing a more careful analysis. Observe that each call to
maxHeapify
requires $O(h)$ time where $h$ is the height of the subtree at node $i$. The number of nodes at height $h$ is $\dfrac{n}{2^h + 1}$. Therefore, the running time is$\begin{aligned} \sum ^{\log n}_{n = 0} \left( O(n) \times \dfrac{n}{2^n + 1} \right) &= O\left( n \sum ^{\log n}_{h=0} \dfrac{h}{2^h} \right) \\ &= O\left( n \sum ^{\infty}_{h=0} \dfrac{h}{2^h} \right) \\ &= O(n) \end{aligned}$
Since $\sum \limits_{h=0}^{\infty} \dfrac{h}{2^h} = 2$.