Greedy algorithms are
In order to apply a greedy solution the problem must comply to the following two properties
One problem that can't be solved greedy is as follows (the 0/1 knapsack problem): We are out stealing with a knapsack that can hold 50 pounds. We steal with a greedy attitude from the list with the following items (10,60),(20,100),(30,120) where the first item of the tuple is the weight and the second is the prize. Now being greedy we take (10,60),(20,100) which amount to 30 pounds and 160 worth of ill-gotten gains. But the optimal items to put in the knapsack is of course the second and the third.
Example. Huffmann codes are a way to way to compress data. Say we have a string of characters. This string can be compressed with a greedy approach on the frequency of each character. To make this work we see the characters as a binary string.
Now the binary string representation can have fixed length. Or variable length where no two strings have same prefixes. The latter results in a complete binary tree (optimal code). The lower frequency, the lower placement in the tree. For the frequencies a:20,b:13,c:9,d:5 we have
Where $z_3 = 47, z_2 = 27, z_1 = 14$.
For a tree, $T$, we have that the total cost in bits is given as $$ B(T) = \sum_{c \in C} c.freq \cdot d_{T}(c) $$
The code is
We first show the greedy-choice property. We need the following lemma
Let $C$ be alphabet with $c$ as characters and $c.freq$ as frequencies of those. Let $x$ and $y$ be the two characters with lowest frequencies. Then there exist an optimal prefix code where $x$ and $y$ are siblings.
Let $T$ be a tree of an optimal prefix code. Let $a$ and $b$ be siblings in $T$ with maximum depth. And, as stated in the Lemma, let $x$ and $y$ be the two characters with lowest frequencies. Let $a.freq \leq b.freq$ and $x.freq \leq y.freq$ from which follows that $x.freq \leq a.freq$ and $y.freq \leq b.freq$. Note that if $x.freq = b.freq$, then $a.freq = b.freq = x.freq = y.freq$, and the lemma is trivially true.
Now we exchange the position of $a$ and $x$ in $T$ to obtain $T'$. And we exchange the positions of $b$ and $y$ in $T'$ to obtain $T''$ in which $x$ and $y$ now have maximum depth. The difference in cost between $T$ and $T'$ is $$ B(T) - B(T') = \\ \sum_{c \in C} c.freq \cdot d_{T} (c) - \sum_{c \in C} c.freq \cdot d_{T'} (c) = \\ x.freq \cdot d_{T}(x) + a.freq \cdot d_{T}(a) - x.freq \cdot d_{T'}(x) - a.freq \cdot d_{T'}(a) = \\ x.freq \cdot d_{T}(x) + a.freq \cdot d_{T}(a) - x.freq \cdot d_{T}(a) - a.freq \cdot d_{T}(x) = \\ (a.freq - x.freq)(d_{T}(a) - d_{T}(x)) \geq 0 $$ *remember that $a.freq \geq x.freq$ and $d_{T}(a) \geq d_{T}(x)$. In the same way we get that $B(T') - B(T'') \geq 0$ - exchanging $y$ and $b$ does not increase the cost. Now we have that $B(T'') \leq B(T)$. But since $T$ is optimal, we also have that $B(T) \leq B(T'')$, and thus $B(T) = B(T'')$. In conclusion: $T''$ is an optimal tree in which $x$ and $y$ appears as sibling leaves.
By lemma 16.2 we have that an optimal tree can be build starting with merging the two characters with lowest frequencies.
Next let's show optimal substructure.
Let $C$ be alphabet, let $x$ and $y$ have minimum frequencies. Let $C'$ be the alphabet $C$ with $x$ and $y$ removed and $z$ added, that is $C' = C - \{x,y\} \cup {z}$. In $C'$ $z.freq = x.freq + y.freq$. Let $T'$ be a tree representing an optimal prefix code for $C'$. Then $T$ obtained by replacing $z$ with $x$ and $y$ in $T'$ is an optimal prefix code tree for $C$.
For each $c \in C - \{x,y\}$ we have that $d_{T}(c) = d_{T'}(c)$ and hence $c.freq \cdot d_{T}(c) = c.freq \cdot d_{T'}(c)$. We have that $d_{T}(x) = d_{T}(y) = d_{T'}(z) + 1$. And we have that $$ x.freq \cdot d_{T}(x) + y.freq \cdot d_{T}(y) = \\ (x.freq + y.freq)(d_{T'}(z) + 1) = \\ z.freq \cdot d_{T'}(z) + x.freq + y.freq $$ from which follows that $$B(T) = B(T') + x.freq + y.freq$$ or equivalently $$B(T') = B(T) - x.freq - y.freq$$ We finally proof the lemma by contradiction. Let $T$ not represent an optimal prefix code for $C$. Let $T''$ represent an optimal prefix code for $C$, and we have $B(T'') < B(T)$. Let $T'''$ be $T''$ with the common parrent of $x$ and $y$ replaced by $z$ with $z.freq = x.freq + y.freq$. Now $$ B(T''') = B(T'') - x.freq - y.freq < \\ B(T) - x.freq - y.freq = \\ B(T') $$ in contradiction with the assumption that $T'$ represent an optimal prefix code. Thus $T$ represents an optimal prefix code for $C$.
By lemma 16.2 and lemma 16.3 we have that a solution to a problem in the algorithm contains solutions to subproblems.
We have the following factors of concern
...