I’m new to algorithm correctness proof-writing but am keen to improve my skill there.
I have a first attempt here, but I think I’m missing a final step. The algorithm is for the LeetCode: 11. Container With Most Water problem.
I’ve shown to my satisfaction that only an “advance the lesser” move at any given point can possibly result in a greater area (water volume), but it feels like I’m missing the part where I can say “therefore this algorithm will always find the maximum”.
Any pointers on process, notation, or formalisms is also greatly appreciated. This proof strikes me a bit as “workshop-grade” and I wouldn’t mind getting a bit more elegant about it as I do others.
Given an array of non-negative integers hs (heights) where each represents a point at coordinate (i, hs[i]), n vertical lines are drawn such that the two endpoints of line i is at (i, hs[i]) and (i, 0). Find two lines, which together with the x-axis form a container, such that the container contains the most water.
- H – Height
- W – Width
- A – Area
|--W--| | ___ | | | | | | | H | | | | | | +--+--+--+--+ --- 0 1 2 3 4
For example, a maximal cross-section A of H ✕ W = 3 ✕ 2 = 6 is between offsets 1 and 3. Note there’s another one of area 6 for range [1..4], so the maximum is not necessarily unique.
The solution approach which seems to work is the following:
Create index variables left (L) and right (R) initially positioned at the extreme left (0) and right (|hs|-1) of array
Calculate the area as A = H ✕ W where H = min(hs[L], hs[R]) and W = R – L and record it as the maximal area so far.
Move the lesser of L or R toward the other.
Repeat until R == L, then return the maximum recorded.
Code would look something like this in Python
def max_water(hs): L, R = 0, len(hs) - 1 max_A = 0 while L < R: A = min(hs[L], hs[R]) * (R - L) max_A = max(max_A, A) if hs[L] <= hs[R]: L += 1 else: R -= 1 return max_A
My proof approach is to show that only advancing the lesser-height index can possibly increase the current area. The thing I don’t quite get is whether this proves correctness; my sense is I’m missing a last step:
There are four possible cases produced by choosing to advance the lesser or greater-height index (toward the middle) and whether the “advanced-to” height is greater or lesser that the prior. For concise expression, I use L and R to represent the heights of those two positions:
Case 1: Advance greater, new height is greater
|-----W-----| | | | ___ | | | | H | ... | | | +--+--------+ --- L ... R' R L < R, advance R to R', R' > R then: * H' = H -- because L is unchanged and L = H is still the upper bound. * W' < W * => H' ✕ W' < H ✕ W * => A' < A
Case 2: Advance greater, new height is lower (or equal).::
|-----W-----| | | | ___ | | | | H | ... | | | +--+--------+ --- L ... R' R L < R, advance R to R', R' <= R then: * H' <= H -- H' cannot be greater than it was because L = H is still an upper bound. * W' < W * => H' ✕ W' < H ✕ W * => A' < A
Case 3: Advance lesser, new height is lower (or equal).::
|-----W-----| | | ___ | | | H | | ... | | +--+--------+ --- L L' ... R L < R, advance L to L', L' <= L then: * H' <= H * W' < W * => H' ✕ W' < H ✕ W * => A' < A
Case 4: Advance lesser, new height is higher.::
|-----W-----| | | | ___ | | | | H | | ... | | +--+--------+ --- L L' ... R L < R, advance L to L', L' <= L then: * H' > H * W' < W * => H' ✕ W' is either <, =, or > H ✕ W * => A' <, =, or > A
So the only way an area greater than the current area can possibly be found is by following the “advance the lesser” policy. All the other cases lead to a reduction in area.
What I’m not seeing is how that necessarily guarantees this algorithm will find the maximum.