It is a known result that, given generically noncommuting operators $A,B$, we have $$A^n B=\sum_{k=0}^n \binom{n}{k} \operatorname{ad}^k(A)(B) A^{n-k}.\tag{A}$$ This can be proved, for example, by induction on $n$ without too much work.
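As a quick sanity check of **(A)**, here is a minimal sketch in plain Python. It uses small random integer matrices so that the comparison is exact rather than up to rounding. All helper names (`matmul`, `ad_power`, `rhs_of_A`, and so on) are my own and purely illustrative.

```python
from math import comb
import random

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(X, p):
    """X**p for an integer p >= 0 (X**0 is the identity)."""
    n = len(X)
    out = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):
        out = matmul(out, X)
    return out

def ad_power(A, B, k):
    """ad^k(A)(B): the k-fold nested commutator [A, [A, ... [A, B]]]."""
    D = B
    for _ in range(k):
        AD, DA = matmul(A, D), matmul(D, A)
        D = [[x - y for x, y in zip(r, s)] for r, s in zip(AD, DA)]
    return D

def lhs_of_A(A, B, n):
    """Left-hand side of (A): A^n B."""
    return matmul(matpow(A, n), B)

def rhs_of_A(A, B, n):
    """Right-hand side of (A): sum_k binom(n, k) ad^k(A)(B) A^(n-k)."""
    size = len(A)
    total = [[0] * size for _ in range(size)]
    for k in range(n + 1):
        term = matmul(ad_power(A, B, k), matpow(A, n - k))
        total = [[t + comb(n, k) * x for t, x in zip(r, s)]
                 for r, s in zip(total, term)]
    return total

def random_matrix(size, rng):
    """Square matrix with small random integer entries."""
    return [[rng.randint(-3, 3) for _ in range(size)] for _ in range(size)]

rng = random.Random(0)
A, B = random_matrix(3, rng), random_matrix(3, rng)
# Exact integer arithmetic, so both sides should agree entry by entry.
print(all(lhs_of_A(A, B, n) == rhs_of_A(A, B, n) for n in range(6)))
```

This only tests the identity on finite-dimensional matrices, of course; it says nothing about domain issues for unbounded operators.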

However, while trying to get a better understanding of this formula, I realised that there is a much easier way to derive it, at least on a formal, intuitive level.

### The trick

Let $\hat{\mathcal S}$ and $\hat{\mathcal C}$ (standing for “shift” and “commute”, respectively) denote operators that act on expressions of the form $A^k D^j A^\ell$ with $k\ge 1$ (writing, for simplicity, $D^j\equiv\operatorname{ad}^j(A)(B)$, so that $D^0=B$ and $D^j$ is *not* a power) as follows:

\begin{align} \hat{\mathcal S} (A^k D^j A^\ell) &= A^{k-1} D^j A^{\ell+1}, \\ \hat{\mathcal C} (A^k D^j A^\ell) &= A^{k-1} D^{j+1} A^\ell. \end{align} In other words, $\hat{\mathcal S}$ “moves” the central $D$ block one position to the left, past the neighboring $A$ factor, while $\hat{\mathcal C}$ makes it “eat” that $A$ factor by taking one more commutator.

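It is not hard to see that $\hat{\mathcal S}+\hat{\mathcal C}=\mathbb 1$ once the resulting strings are evaluated as operators. This is but another way to state the identity $A D^j = D^j A + D^{j+1}$, which for $j=1$ reads $$A[A,B]=[A,B]A+[A,[A,B]].$$ Moreover, crucially, $\hat{\mathcal S}$ and $\hat{\mathcal C}$ commute: applied in either order, they send $A^k D^j A^\ell$ to $A^{k-2} D^{j+1} A^{\ell+1}$. Because of this, I can write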

$$A^n B=(\hat{\mathcal S}+\hat{\mathcal C})^n (A^n B)=\sum_{k=0}^n\binom{n}{k} \hat{\mathcal S}^{n-k} \hat{\mathcal C}^{k}(A^n B).$$ Since $\hat{\mathcal S}^{n-k} \hat{\mathcal C}^{k}(A^n B)=D^k A^{n-k}$, this immediately gives me **(A)** without any need for recursion or other tricks.
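To make the bookkeeping concrete, here is a rough sketch in which a string $A^k D^j A^\ell$ is encoded as the triple `(k, j, l)` and a formal sum of strings as a `Counter` mapping triples to integer coefficients. This encoding, and the function names `S`, `C`, `add` and `expand`, are assumptions of mine, just one possible way to model the manipulation, and not an established formalism. Note that in this model $\hat{\mathcal S}+\hat{\mathcal C}$ is *not* literally the identity on formal sums; it only becomes the identity after the strings are evaluated back in the operator algebra.

```python
from collections import Counter
from math import comb

def S(expr):
    """Shift: A^k D^j A^l -> A^(k-1) D^j A^(l+1), extended linearly."""
    out = Counter()
    for (k, j, l), coeff in expr.items():
        if k == 0:
            raise ValueError("no A factor left on the left of D")
        out[(k - 1, j, l + 1)] += coeff
    return out

def C(expr):
    """Commute: A^k D^j A^l -> A^(k-1) D^(j+1) A^l, extended linearly."""
    out = Counter()
    for (k, j, l), coeff in expr.items():
        if k == 0:
            raise ValueError("no A factor left on the left of D")
        out[(k - 1, j + 1, l)] += coeff
    return out

def add(x, y):
    """Sum of two formal linear combinations of strings."""
    out = Counter(x)
    for key, coeff in y.items():
        out[key] += coeff
    return out

def expand(n):
    """Apply (S + C) n times to the string A^n B, encoded as (n, 0, 0)."""
    expr = Counter({(n, 0, 0): 1})
    for _ in range(n):
        expr = add(S(expr), C(expr))
    return dict(expr)

# S and C commute on a sample string with at least two A's on the left.
word = Counter({(3, 1, 2): 1})
print(S(C(word)) == C(S(word)))

# The coefficient of D^k A^(n-k), i.e. of (0, k, n-k), is binom(n, k).
n = 4
print(expand(n) == {(0, k, n - k): comb(n, k) for k in range(n + 1)})
```

In this picture the binomial coefficients simply count the orderings of $k$ applications of $\hat{\mathcal C}$ and $n-k$ applications of $\hat{\mathcal S}$, all of which end on the same string because the two maps commute.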

### The question

Now, this is all fine and dandy, but it leaves me wondering *why this kind of thing works*. It looks like I am somehow bypassing the nuisance of having to deal with non-commuting operations by switching to a space of “superoperators”, in which the same operation can be expressed in terms of *commuting* “superoperators”.

I am not even sure how one could go about formalising these “superoperators” $\hat{\mathcal S},\hat{\mathcal C}$, as they seem to be objects acting on “strings of operators” rather than on the elements of the operator algebra themselves.

Is there a way to formalise this way of handling the expressions? Is this a well-known method in this context (I have never seen it, but I am not well-versed in this kind of manipulation)?