I came across this question about showing that the language $ L = \{w \epsilon \{a, b, c\}^*: n_a(w) + n_b(w) = n_c(w)\}$ is context-free but not linear in the book by Peter Linz. That is easily doable by the separate pumping lemma for linear languages (as given in Linz’s book), but my question is different.

Evidently this is a CFL, and a pushdown automaton can be constructed for it. But if I apply the pumping lemma for CFLs, I find that I’m able to pump strings that don’t belong to the language, which would mean that the language is not a CFL. Clearly I’m doing something wrong.

Going by the "game-like" format given in Linz, say you pick $ w = a^mb^mc^{2m}$ , $ |w| \ge m$ . The adversary can choose a number of decompositions $ w = uvxyz$ , they can look like -:

- $ v = a^k, y = a^l$ : The case where $ |vxy|$ is contained within the $ a$ ‘s of the string. Pump $ i = 0$ , and then $ w_0 = a^{m – (k + l)}b^mc^{2m}$ cannot be in the language since the equality no longer holds.
- $ v = a^k, y = b^l$ : The case where $ v$ is in the $ a$ section, $ x$ spans across the $ a$ ‘s and $ b$ ‘s, and $ y$ is in the $ b$ section. Again, pump $ i = 0$ . $ w_0 = a^{m – k}b^{m – l}c^{2m}$ cannot be in the language.

There are more cases like these. Where am I going wrong in the application of the CFL PL?