Consider the following problem:

You are given a finite set of numbers $ (l_k)_{k\in \{ 1, …, n \}}$ such that $ \sum_{k=1}^n2^{-l_k}<1$ . Describe an algorithm to find a set $ (l’_k)_{k\in \{ 1, …, n \}}$ such that $ \forall k \in \{ 1, …, n \}:l’_k\le l_k$ and $ \sum_{k=1}^n2^{-l’_k}=1$ .

(For what it’s worth, this is a problem arising from Information Theory, where the Kraft-McMillan Theorem gives that the result above yields a more efficient binary code than the one with codeword lengths $ (l_k)$ .)

Here are my initial thoughts. We can consider $ \sum_{k=1}^n2^{-l_k}$ as a binary number e.g. $ 0.11010011$ and then we need to reduce the value of some $ l_k$ values whose digit position is preceded by a $ 0$ . So for instance, with the initial $ 0$ in the $ \frac{1}{8}$ position of the example number I gave, we want to decrease the value of some $ l_i=4$ to $ l’_i=3$ to add $ \frac{1}{8}$ and subtract $ \frac{1}{16}$ to the sum. We then have $ 0.11100011$ , so we’ve moved the problematic $ 0$ along a digit. When we get to the end we presumably have something like $ 0.11111110$ , and then need to reduce the value of the longest codeword by 1 to get the overflow to $ 1.00000000$ .

However, I encounter two problems: there may not be such an $ l_i=4$ , for instance, if the $ 1$ in the $ \frac{1}{16}$ digit place arises as the sum of three $ l_i=5$ numbers. Additionally, if we multiple have $ 0$ digits in a row then we presumably need to scan until the next $ 1$ and then decrement a corresponding $ l_i$ multiple times, but it’s conceivable that I would “run out” of large enough $ l_i$ codewords that I can manipulate in this way.

Can anyone describe an algorithm with a simple proof of correctness?

A follow-up problem: how do we generalise this algorithm to bases other than $ 2$ ?