## Is grammar that describes an equation in prefix (Polish) notation always unambiguous?

I recently completed a problem in which I was asked to generate a parse tree for the expression $$+ \, 5 \, * \, 4 \, 3$$ using the following grammar and rightmost derivation:

$$Expr \rightarrow + \, Expr \, Expr \, | \, * \, Expr \, Expr \, | \, 0 \, | \, \dots \, | \, 9 \,$$

While I have no trouble taking the derivation and creating its parse tree, the question also asks whether the grammar is ambiguous. In the scope of what I’ve been taught, my only tool for proving ambiguity has been to find a different parse tree for whatever leftmost or rightmost derivation I have, thus proving multiple valid parses and ambiguity. However, I have not been told how to prove unambiguity. I am fairly confident that the grammar described above is unambiguous based partially on intuition, and partially because it’s designed for prefix notation. I tried to generate new trees for a given string to prove ambiguity, but since the operator is always leftmost, I could not find any string in which multiple parse trees could be created. If I am mistaken, please let me know.

My question is this: Is it possible for grammar that describes strings using prefix (Polish) notation such as the one above to ever be ambiguous? My intuition tells me that it will always be unambiguous, but I was wondering why this might be the case.

Posted on Categories proxies

## Are these special (one production) Context-Free Grammars always unambiguous?

Consider the following (Context-Free) Grammars with only one production rule (not including the epsilon production):

• $$S \rightarrow aSb\;|\;\epsilon$$
• $$S \rightarrow aSbS\;|\;\epsilon$$
• $$S \rightarrow aSaSb\;|\;\epsilon$$
• $$S \rightarrow aaSaaSbb\;|\;\epsilon$$
• $$S \rightarrow aSbScSdSeSf\;|\;\epsilon$$
• etc…

Grammars like these uphold 4 basic rules:

1. The non-terminal symbol $$S$$ can never appear next to itself.
• e.g. $$[\;S \rightarrow aSSb\;|\;\epsilon\;]$$ would not be allowed.
2. The non-terminal symbol $$S$$ can appear at the beginning or end of a production but not on both sides at the same time.
• e.g. $$[\;S \rightarrow SabaS\;|\;\epsilon\;]$$ would not be allowed.
• e.g. $$[\;S \rightarrow Saba\;|\;\epsilon\;]$$ or $$[\;S \rightarrow abaS\;|\;\epsilon\;]$$ would be allowed.
3. The sequence of terminal symbols that exist between each non-terminal $$S$$ cannot all match.
• e.g. $$[\;S \rightarrow aSaSa\;|\;\epsilon\;]$$, $$[\;S \rightarrow abSabSab\;|\;\epsilon\;]$$, etc. would not be allowed.
• e.g. $$[\;S \rightarrow aSaSb\;|\;\epsilon\;]$$, $$[\;S \rightarrow abSabSaf\;|\;\epsilon\;]$$, etc. would be allowed.
4. The sequence of terminal symbols at the beginning and end cannot match.
(i.e. $$[\;S \rightarrow ☐_1SaS☐_2\;|\;\epsilon\;]$$ s.t. $$☐_1 \neq ☐_2$$ where $$☐_1$$ and $$☐_2$$ are a sequence of terminal symbols)
• e.g. $$[\;S \rightarrow aSbSa\;|\;\epsilon\;]$$, $$[\;S \rightarrow aaSbSaaS\;|\;\epsilon\;]$$, etc. would not be allowed.
• e.g. $$[\;S \rightarrow aSbSb\;|\;\epsilon\;]$$, $$[\;S \rightarrow aaSbSaxS\;|\;\epsilon\;]$$, etc. would be allowed.

Are (Context-Free) Grammars, that follow these 4 rules, always unambiguous? It would seem so. I really don’t see any conceivable way that such Grammars could be ambiguous.

(Note: To show why Rule 4 is necessary consider the grammar $$S \rightarrow aSbSa\;|\;\epsilon$$ with the string $$aababaababaa$$. A visual derivation can be found here.)

I’ve spent a lot of time thinking about this question and have had a hard time finding any way of either proving or disproving it. I’ve tried showing that Grammars like these are $$LL(1)$$. However, it seems only Grammars of the form $$S \rightarrow aSb\;|\;\epsilon$$ are $$LL(1)$$. Grammars like $$S \rightarrow aSaSb\;|\;\epsilon$$ are not $$LL(1)$$. Maybe I need to use a higher $$k$$ for $$LL(k)$$?

(This question is a follow-up/reformulation of a previous question.)

I would really appreciate any help I could get here.

Posted on Categories proxies

## Are Context-Free Grammars with only one Production Rule always Unambiguous?

Consider the following (Context-Free) Grammars with only one production rule (not including the epsilon production):

• $$S \rightarrow aSb\space|\space\epsilon$$
• $$S \rightarrow aSSb\space|\space\epsilon$$
• $$S \rightarrow aSbS\space|\space\epsilon$$
• $$S \rightarrow aSaSb\space|\space\epsilon$$
• $$S \rightarrow aaSaaSbb\space|\space\epsilon$$
• $$S \rightarrow aSbScSdSeSf\space|\space\epsilon$$
• $$S \rightarrow aSSbcSd\space|\space\epsilon$$
• etc…

Are all these Grammars unambiguous? Will every Grammar with only one production rule (not including the epsilon production) always be unambiguous? It would seem so, but I’m not totally sure.

Grammars that only contain one unique terminal symbol could be ambiguous. (ex. $$S\rightarrow aSaSa\space|\space\epsilon$$) However, Grammars with at least two distinct terminal symbols seem like they should always be unambiguous.

I’ve tried showing that Grammars like these are $$LL(1)$$. However, it seems only Grammars of the form $$S \rightarrow aSb\space|\space\epsilon$$ are $$LL(1)$$. Grammars like $$S \rightarrow aSaSb\space|\space\epsilon$$ are not $$LL(1)$$. (Illustrated in the parse table below.) Despite the example Grammar above not being $$LL(1)$$, it still seems to be unambiguous. Maybe it’s simply a matter of using a higher $$k$$ for $$LL(k)$$?

In short, are (Context-Free) Grammars with only one production rule (not including the epsilon production) and at least two unique terminal symbols always unambiguous?

I would really love some help, any at all would be greatly appreciated.

Posted on Categories proxies

## How to intuitively come up with an example for an ambiguous grammar and how to make that grammar unambiguous?

I don’t get how to intuitively come up with an example for an ambiguous grammar.

Let’s take as an example this grammar:

Declaration ::= Type D ;          Type ::= "int" | "char"          D ::= "*" D             |  D "[" number "]"             |  D "(" Type ")"             |  "(" D ")"             |  name 

I am told outright that this grammar is ambiguous. What is expected of me is to find one example that proves that it is. What I’m interested is what is the thought process that allows you to find an example. Our teacher just gave us one example that would show that we would obtain two different derivation tree like:

int *foo;  has two derivation tree              Declaration              Declaration              /    |   \               /    |   \            Type   D    ;            Type   D    ;             |    / \                 |    / \____            int  *   D               int  /   \ \ \                    / \____              D    [ 5 ]                   /   \ \ \            / \                  D    [ 5 ]           *   D                  |                        |                 foo                      foo  

However I have no idea how he thought to himself that int*foo would be the example before doing the trees. It all boils down to how they did it without trial and error?

How to make that grammar unambiguous? I was also given the task to make the above grammar unambiguous. However I don’t know yet again what is the intuition behind making it unambiguous.

They gave us this solution:

 Declaration ::= Type D ;        Type ::= "int" | "char"        D ::= "*" D           |  "(" D ")" D'           |  name D'        D' ::= "[" number "]" D'            |  "(" Type ")" D'            |  empty               <== empty string  

There must be a pattern in all of this. What it is? What is the general method to solve this type of problem regardless of which grammar is given?

Posted on Categories proxies

## Language containing all unambiguous grammars

Suppose $$L$$ is the language of the unambiguous grammars. That is, a sentence $$w\in{}L$$ if it is a string that describes an unambiguous context-free grammar.

Considering that deciding whether a context-free grammar is ambiguous is non-decidable, would it be correct to say that $$L$$ exists but is not recursively enumerable?

Posted on Categories proxies

## Does left factoring CFG make it unambiguous?

I came across following problem:

If the CFG is left factored then it must be Unambiguous and Not left Recursive. TRUE/FALSE?

I have many thoughts about this. But I feel they are somewhat contradicting and hence form a bigger doubt. May be I lack some understanding.

Thought 1

Ambiguity has nothing to do with determinism (even if grammar does not have productions with same prefix of their right side, they still can have a string with two parse trees). So CFG is still ambiguous after left refactoring. Hence, the sentence should be false.

Thought 2

Left factoring is used to eliminate common prefixes of right hand side of productions in the grammar. Doesnt that make grammar deterministic? Deterministic grammar derive deterministic context free languages. So isnt turning grammar deterministic makes it unambiguous? If yes, given statement should be true.

Thought 3

I have completely ignored left recursion in thought 1 and 2. Is it ok?

Doubt: I feel thought 1 and thought 2 are contradicting. And I am not able to figure it out which one is correct. Which of the above thoughts are incorrect and why? Am I missing any more subtleties?

## What are the sufficient conditions for a grammar to be unambiguous?

There is no algorithm that, given an arbitrary grammar, decides if it’s ambiguous or not. However, Are there any sufficient conditions that make it easier to tell that a grammar is unambiguous?

For example, if I’m not mistaken, it can be algorithmically checked if a grammar contains no conflicts under LR(1), LALR(1) or LL(1); if so, is there any implication like “If the grammar has no conflicts under LL(1) then it is unambiguous” or something like that?

Posted on Categories proxies

## Finding an unambiguous grammar of a language provided by a CFG

I’m working through ‘Intro to Automata Theory, Language and Computation’ 2nd edition by Hopcroft, Motwani & Ullman.

In section 5.4, exercise 5.4.3 I am tasked with finding an unambiguous grammar for the language of this Context-Free Grammar (where epsilon is the empty string):

$$S \rightarrow aS | aSbS | \epsilon$$

I’m having difficulty describing this grammar formally, but I realize the number of a’s is greater than or equal to the number of b’s. And the string will always begin with an a (except for the empty string).

I have attempted to break the grammar into two parts to remove the ambiguity. After a few brute force attempts, I decided to distinctly have a variable generate only a’s and the other variable generate an equal amount of a’s and b’s.

$$S \rightarrow A | B | \epsilon$$ $$A \rightarrow aA | a$$ $$B \rightarrow BB | aAbA | ab$$

The key part is when a re-write of B to a sentential form when an A occurs, only more a’s can be generated in the string.

My question is does the new grammar I created generate the same language as the previous grammar? If not, what would be the correct grammar? I realize determining if two CFG are equivalent is undecidable; but I have no method of determining if I am correct.

Posted on Categories proxies

## Generate a Unambiguous Grammar

Given the grammar:

$$S \to AS\mid \varepsilon$$
$$B \to A1\mid 0A1 \mid \varepsilon$$

Generate a new unambiguous grammar that generates the same language as the grammar above.

I have no idea how to make it unambiguous, I have proven already that the given language has 2 leftmost derivatios for the string 01101.

Posted on Categories proxies

## Generate a Unambiguous Grammar

Given the grammar:

$$S \to AS\mid \varepsilon$$
$$B \to A1\mid 0A1 \mid \varepsilon$$

Generate a new unambiguous grammar that generates the same language as the grammar above.

I have no idea how to make it unambiguous, I have proven already that the given language has 2 leftmost derivatios for the string 01101.

Posted on Categories proxies