I’m having some difficulty understanding/being convinced the technique used to prove a greedy algorithm is optimal for the fractional knapsack problem. A proof by contradiction is used. I’ve never been great at proofs, and maybe this will help me get on the track to becoming more comfortable with them.

Consider section 3.2, Analysis. https://www2.cs.duke.edu/courses/fall17/compsci330/lecture6note.pdf The algorithm is very intuitive – choose the highest value/weight and only the last item in the knapsack may need be fractional. I don’t have any issues with that. Formally proving that this algorithm is the best thing out there I what I have an issue with.

(note proof is available in more detail: https://www2.cs.duke.edu/courses/fall17/compsci330/lecture6.pdf)

The professor shows that our algorithm chooses items p_1…p_n in various amounts. ALG = (p_1, p_2, … , p_n). So we choose p_1 much of item 1, p_2 much of item2… all the way to p_n. Only the last item in the knapsack could be fractional. All good.

Now we assume that there exists a better algorithm, OPT. We say OPT makes the same choices as ALG up to a certain point OPT(q_1, q_2,…q_n). At some point, q_i, OPT makes a different choice that ALG. Why in the notes **does the professor say q_i cannot be p_1**. Surely OPT could take a different choice immediately. Anyway, that aside, I agree that OPT, like ALG will be putting as much of the items it can into the knapsack upto i. After i, OPT starts doing something else. The professor doens’t talk about this, but maybe OPT changes the ordering of the items it puts into the knapsack – that wouldn’t be a big deal as long as it chose all the same items and put the same amount of the very last item ALG put in the knapsack last.

However changing the order of items chosen wouldn’t make OPT better, just the same. Right? (professor doesn’t talk about this, at least not explicitly). For OPT to be better there must be some item, i it chooses less than ALG did of. And similarly some item j we take more of than ALG did, to keep the capacity the same. Another aside: haven’t we made the implicit assumption that we must use all of the capacity, W, to be optimal. Is this self evident or something? Intuitively I can see why, but formally I cannot.

Anyway we remove a little amount from qj and give it to qi, hence those equations involving epsilon. i.e. if we remove e units of weight of j then we can add (e * wj / wi) units of value. Fine.

Now why on earth is this called OPT’. Isn’t this still OPT? I’m confused here. Why did we bring OPT’ in there? Why do we need yet another modified version of ALG. I thought OPT (not OPT’) is where we modified qi and qj by changing how much of each we picked.

Thanks all!