The unweighted maximum coverage problem is defined as follows:
Instance: A set $ E = \{e_1,…,e_n\}$ and $ m$ subsets of $ E$ , $ S = \{S_1,…,S_m\}$ .
Objective: find a subset $ S’ \subseteq S$ such that $ |S’| = k $ and the number of covered elements is maximized.
The problem is NP-hard, but a simple greedy algorithm (at each stage, choose a set which contains the largest number of uncovered elements) achieves an approximation ratio of $ 1-\frac{1}{e}$ .
In the following post, there is an example of when the greedy algorithm fails.
Tight instance for unweighted maximum coverage problem?
I wish to prove that the approximation ration for the greedy algorithm is tight. That is, the greedy algorithm is not an $ \alpha-$ approximation ratio for any $ \alpha > 1-\frac{1}{e}$ .
I think that if I will find, for any $ k$ , (or for an ascending series of $ k’s$ ), an instance where the number of elements covered by greedy algorithm is $ 1-(1- \frac{1}{k})^k$ times the number of elements covered by the optimal solution, the tightness of the ratio will be proved.
Can someone give a clue for such instances?
I thought of an initial idea: let $ E = \{ a_1 ,…a_n,b_1,…,b_n,…,k_1,…,k_n\}$ , a set with $ n\cdot k$ elements. Let $ S$ include $ k$ sets of $ n$ elements each, $ A = \{ a_1 ,…a_n\},…,K= \{k_1,…,k_n\}$ . The optimal solution will select these $ k$ sets and cover all the elements in $ E$ . Now I want to add $ k$ sets to $ S$ , that will be the solution the greedy algorithm will find, and will cover $ 1-(1- \frac{1}{k})^k$ of the elements in $ E$ . The first such set, of size $ n$ : $ S_1 = \{a_1,…a_\frac{n}{k},b_1,…b_\frac{n}{k},…,k_1,…k_\frac{n}{k} \}$ ($ \frac{n}{k}$ elements from each of the first $ k$ sets). The second such set, of size $ n – \frac{n}{k}$ : $ S_2 = \{a_\frac{n}{k},…a_{\frac{n}{k}+ (n – \frac{n}{k})\cdot\frac{1}{k}},b_\frac{n}{k},…,b_{\frac{n}{k}+ (n – \frac{n}{k})\cdot\frac{1}{k}},…,k_\frac{n}{k},…,k_{\frac{n}{k}+ (n – \frac{n}{k})\cdot\frac{1}{k}} \}$ , (that is, $ (n – \frac{n}{k})\cdot\frac{1}{k}$ elements from each of the first $ k$ sets) and so on till we have $ k$ additional such sets.
I don’t think this idea works for every $ k$ and $ n$ , and I’m not sure it’s the right approach.
Thanks.