Short Version:

Let all lower latin variables mean strings, except $ n, m, l, r$ , which are indices and index bounds.

Let $ ab$ mean usual string concatenation of $ a$ to $ b$ .

Let $ t \leqslant s \iff \exists$ strings $ u,v$ such that $ utv = s$ .

Let $ \Sigma(s) = \{ a \leqslant s : a = 1\}$ .

Let $ V(n) = \{X_1, \dots, X_n\}$ be disjoint from $ \Sigma(s)$ always.

Let $ G(s, n) = \text{freegroup}(V(n)\cup \Sigma(s))$

Let $ \text{pruned}(s, n) = \{$ grammars $ g$ for $ s$ on $ \leq n$ variables such that $ \forall X \in V(g)$ we have $ X \leqslant g^k(S)$ for some $ k \geq 0$ .

Define a grammar $ g$ on $ n$ variables for $ s$ to be a homomorphism of free groups: $ \alpha : G(s, n) \to G(s, n – 1)$ , such that $ \alpha$ is some mapping $ V(n) \to G(s, n1)$ extended homomorphically, and identity on $ \Sigma(s)$ , and such that $ \alpha^k(S) = s$ for all $ k \geq n_g$ .

Let $ \text{SGP}(s, n) = \{\overline{x} \in G(s, n)^n :$ if $ \overline{x}\cdot \overline{g} = \overline{g’}$ then $ g’$ can be pruned such that an equivalent grammar resides in $ \text{pruned}(s, n)$ whenever $ g$ does $ \}$ .

From now on, identify $ G(s, n1)$ with the first $ n1$ components of $ G(s, n)$ .

Question is at the bottom of post.
Written with StackEdit.
Long Version:
A grammar $ g$ of a string $ s$ over $ \Sigma$ is a string homomorphism from $ (\Sigma \cup V)^* \to (\Sigma \cup V\setminus \{S\})^*$ such that $ g^n(S) = s$ for all sufficiently large $ n$ , where $ V$ is a finite set of variable symbols, and $ S \in V$ is the designated start symbol which is defined to be present for any string $ s$ . A pruned grammar is one such that for all $ X \in V(g), $ the letter $ X$ appears as a substring to some iterate of $ g$ evaluated at $ S$ : $ X \leqslant g^k(S)$ for some $ k \geq 0$ . A pruned grammar just gets rid of unused rules.
Define the size of a grammar $ g$ of $ s$ to be the sum of the lengths of the RHS of rules defininig $ g$ . Recall from contextfree languages that a grammar can also be defined as a list of rules of the form $ X \to x$ . Formally we can express this as $ g := \sum_{X \in V(g)} g(X)$ .
In order to algebratize the problem, I didn’t find it beneficial to study string homomorphisms that are pruned grammars (the semigroup formed is $ (g \circ g’ = g’, \forall g,g’ \in \text{ the semigroup})$ . This is because composing two grammars like functions causes the first in the composition to hide the variables of the second, such that when the result is pruned you get the first operand as the result.
Instead, we’ll study transformations of the rules of grammars with $ n$ rules. Consider the direct product $ G^n$ , where $ G$ is the free group on $ \Sigma(g) = V(g) \cup \Sigma(s)$ , where the union is made disjoint by an unlimited supply of new variable symbols. Clearly if we assign a variable index to factor of the product, then we have an encoding of all posible grammars on $ n$ variables, a lot them being redundant and “very similar” to other grammars.
None the less, the smallest grammar problem should be statable as:
Find a grammar $ g$ of $ s$ such that $ g$ is minimized, or equivalently,
Find a vector $ \overline{g} \in G^n, n\geq 1$ such that $ g$ is minimized and $ g$ is a grammar for $ s$ .
Note the $ \geq n$ part. It implies that any naive algorithm would have to check all $ n = 1…C(s)$ where $ C(s) = O(s^2)$ and $ C(s)$ is the set of substrings of $ s$ . I.e. in the worst case assume you have to variablize approximately all substrings. This is not the case in practice or in theory, but it shouldn’t matter for this analysis.
Now we have to be careful about how we define $ \cdot : G \to \Bbb{Z}$ on a group, because it can literally be always equal to $ 1$ if not done right. Since $ G$ is generated by $ \Sigma(g)$ , we say that $ x = $ the length of a minimial run of letters in $ x_i \in \Sigma(g) \cup \Sigma(g)^{1}$ . I.e. if $ x_1 \cdots, x_r = x$ . And we cannot do the same for $ r’ \lt r$ , then $ x := r$ .
Since the alphabet $ \Sigma(g)$ depends on the grammar, we have to introduce the alphabet that doesn’t change over $ \overline{g} \in G^n$ . That, we’ll just call $ \Sigma_n = \{X_1, \dots, X_n\}$ .
Now to define $ \overline{x}$ for $ \overline{x} \in G$ we need to take out rules that are easily seen to be not required or not helpful in making a grammar the smallest it can be. For instance if $ X_1 \to a \in \Sigma(s)$ , then $ a = 1$ and so any grammar would actually be larger than the equivalent grammar without it. Thus whenever $ t \in \Sigma_n \cup \Sigma$ we define $ t_n = 0$ . To get to the equivalent nonredundantly styled grammar, you just substitute in $ a$ for every occurence of $ X_1$ found on the RHS of other rules.
Define a grammar symmetry from $ \overline{g}$ to be any element $ \sigma \in G^n$ such that when $ \sigma$ is applied componentwise on the left (or right) of $ \overline{g}$ , written $ \overline{h} = \sigma \overline{g}$ we have that $ h^k(S) = s$ for sufficiently large $ k$ is a preserved property.
We have $ G^n$ acting on $ G^n$ itself and a subset $ Y := \text{prunedgrammars}(n, s) = \{ \overline{g} : g$ is a pruned grammar for $ s\}$ . The subset $ H = \text{Stab}_{G^n}(Y)$ i.e. all group elements that stabilize pruned grammars for $ s$ together with their componentwise inverses, is a subgroup of $ G^n$ . Proof. $ iY =Y, hY = Y \implies Y = h^{1} Y \implies ih^{1}Y = Y$ .
Let $ X := \text{smallestgrammars}(n, s)$ . Clearly $ X \subset Y$ since you can’t have extra junk in a smallest grammar of course. Note: elements of $ X$ are precisely the smallest grammars one can achieve in $ n$ or less variables. The reason you can represent the lower variable counts in there as well, is that a component of $ \overline{g}$ can simply map to $ 1 \in \Sigma^*$ and it has length $ 0$ .
Lemma 0. $ H \times Y \to Y$ acts transitively.
Let $ g, h$ be two pruned grammars of $ s$ , each with $ n$ rules, or $ g,h \in Y$ . Consider the component of lowest index $ i$ such that $ g_i \neq h_i$ . Since we’re in a group, $ x g_i = h_i$ has a solution for all $ g_i, h_i \in \text{Stab}_{G^n}(Y)$ . Similarly we have that $ g_i = x^{1} h_i$ . Do the same for each component, and call $ x_j$ the left multiplier to get from $ g_j$ to $ h_j$ . Then $ \overline{x} = (x_1, \dots, x_n)$ is such that $ \overline{x}\overline{g} = \overline{h}$ .
This is flawed, since $ \text{Stab}_{G^n}(Y)$ may not be writeable as a direct product.
Lemma 1. $ \text{Stab}_{G^n}(Y)$ is equal to a direct product of subgroups of $ G^n$ . Let $ K_i = \{ (1,1, \dots, x_i, \dots, 1, 1) : x_i$ , is in the $ i$ th position and $ (c_1, \dots, c_{i1}, x_i, \dots, c_n) \in \text{Stab}_{G^n}(Y)$ for some $ c_j \in G\}$ . Then $ K_i$ forms a group under componentwise multiplicaiton.
Proof. We only need to show this for the $ i$ th component since the rest are all $ 1$ . Without loss of generality we only need to show that $ H_i = \{ x \in G : (c_1, \dots, c_{i1}, x, \dots, c_n) \in \text{Stab}_{G^n}(Y)$ for some $ c_j \in G \}$ is a subgroup of $ G$ . Since $ \text{Stab}_{G^n}(Y)$ is in particular a subgroup of $ G^n$ any multiplication of its elements must be carried out componentwise, by definitino of direct product. Let $ x, y\in H_i$ . Then clearly $ (c_1 d_1, \dots, c_{i1}d_{i1}, xy^{1}, \dots, c_n d_n) \in \text{Stab}_{G^n}(Y)$ , where similarly we can prove, using componentwiseness that $ y^{1} \in H_i$ . $ \blacksquare$
That proves that $ \text{Stab}_{G^n}(Y)$ transitively acts on $ Y$ . Define $ H’ := \text{Stab}_{H}(X) = \{ \overline{x} \in H : \overline{x} X = X \}$ similarly, and we have a proof that $ H’ \times X \to X$ is a transitive group action.
Thus if we are given a grammar $ \overline{y} \in Y$ for $ s$ , say it might be approximately small, but not the smallest. Then there is a group element in $ H = \text{Stab}_{G^n}(Y)$ that will take $ \overline{y}$ to $ \overline{g}$ , a smallest grammar.
Thus the question can be phrased, compute $ \text{SGP}(s,n) = \text{argmin}_{\overline{x} \in H}\{\overline{x} \cdot \overline{y}\} \cdot \overline{y} =$ the set of smallest grammars of $ s$ on up to $ n$ variables, when given a pruned grammar $ y$ for $ s$ . You can go ahead and take $ y = \{ S \to s \}$ , with $ 1$ ‘s in the other components of $ \overline{y}$ .
Questions.
 For any smallest grammar $ g$ does there always exist two other smallest grammars $ h_1, h_2$ such that $ \overline{h_1}\overline{h_2} = \overline{g}$ in the direct product $ G^n$ ? In other words is there a recursive decomposition of the problem for each problem instance.
Written with StackEdit.