# Heaps

## Algorithms and Datastructures/Data Structures :: 05-06-2018

In general, let $A$ be an array that represent the heap, and we have for a max-heap

• A heap is a sort of binary tree with the property $$A[x.left] \leq A[x] \geq A[x.right]$$ That is: every child has a smaller or equal to key compared to its parent. This is called the max-heap-property.
• A heap is used, among other things, to sort and implement priority queues.

We have the following simple functions for a max-heap

Parent(i) return floor(i/2) Left(i) return 2 \cdot i Right(i) return 2 \cdot i + 1

### Heapsort

Uses max-heapify and build-max-heap. max-heapify with example

max-heapify assumes that the children of current $i$ are max-heaps, but that $A[i]$ might violate the max-heap-property. The pseudocode is as follows

Max-Heapify(A,i) l = left(i) r = right(i) if l ≤ A.heap-size \land A[l] > A[i] largest = l else largest = i if r ≤ A.heap-size \land A[r] > A[largest] largest = r if largest \neq i exchange A[i] with A[largest] Max-Heapify(A,largest)

This procedure is $O(lg)$

build-max-heap uses max-heapify from $floor(n/2)$ and down. The pseudo code is

Build-Max-Heap(A) A.heap-size = A.length for i = floor(A.length / 2) downto 1 Max-Heapify(A,i)

The correctness can be shown as follows

• Initialization Prior to first iteration $floor(n/2) + 1, .. , n$ is a leaf and therefore the root of a trivial max-heap.
• Maintenance we have that the children of node $i$ are numbered higher than $i$. By the loop invariant they are roots of a max-heap. This is what is required by max-heapify.
• Termination. We have that $i = 0$, and by the loop invariant each node $1 .. n$ is root in a max heap.

Intuitively time complexity: Each call to max-heapify is $O(lg n)$. We do $n$ of such calls, and we get $O(n \cdot lg n)$.

Tighter time complexity is as follows: We rely on the properties that an $n$-element heap has a height of at most $lg n$ and at most $ceil(n / 2^{h + 1})$ nodes of any height $h$ *aside. The time required by max-heapify when called on a node of height $h$ is $O(h)$. Thus we get an upper bound for build-max-heap $$\sum_{h = 0}^{floor(lg n)} ceil \left ( \frac{n}{2^{h + 1}} \right ) O(h) = O \left ( n \sum_{h = 0}^{floor(lg n)} \frac{h}{2^{h}} \right )$$ That is we sum over possible heights. We have that $$\sum_{h = 0}^{\infty} \frac{h}{2^{h}} = \frac{1/2}{(1 - 1/2)^{2}} = 2$$ And all in all we have $$O \left ( n \sum_{h = 0}^{floor(lg n)} \frac{h}{2^{h}} \right ) = O(n \cdot 2) = O(n)$$ That is summing over infinity is an upper bound of summing over a finite set.

Now the code for heapsort is as follows

HeapSort(A) Build-Max-Heap(A) for i = A.length downto 2 exchange A[1] with A[i] A.heap-size = A.heap-size - 1 Max-Heapify(A,1)

Which by the above has a running time of $O(n \cdot lg n)$.

*aside : a heap with $n$ nodes has at most $ceil(n/2^{h + 1})$ nodes of height $h$. Shown very intuitively. A heap with $n$ nodes has height $lg n$. Since the height of a node does not include the node itself, we in converse have that a heap with height $h$ has $2^{h + 1}$ nodes. We concentrate on a fully grown heap since we are looking for an upper bound. So of a total of $n$ nodes, there are $n/m$ trees with $m$ nodes.

### Priority Queues

A priority queue can be implemented if we add the operations insert, extract-min, extract-max and increase-key which all are $O(lg n)$.

Both extract-min and extract-max for their respective heap type catches the first element of the heap, max/min-heapifies and returns the element.

increase-key on the other hand has the following code

Heap-Increase-Key(A,i,key) if key < A[i] error "new key is smaller than current" A[i] = key while i > 1 and A[Parent(i) < A[i]] exchange A[i] with A[Parent(i)] i = Parent(i)

That is, it increases the chosen key, and then it keeps exchanging keys of the chosen key and the parent until the parent key is larger.