A monad is just a monoid in the category of endofunctors, what’s the enlightenment?

Pardon the wordplay. I’m a little confused about the implications of the claim, hence the question.

Background: I ventured into Category Theory to understand the theoretical underpinnings of various categorical constructs and their relevance to functional programming (FP). It seems (to me) that one of the "crowning gems" at the intersection of Cat and FP is this statement:

A monad is just a monoid in the category of endofunctors

What is the big deal about this observation and what are its programmatic/design implications? Sources like sigfpe and many texts on FP seem to imply the mindblowingness of this concept but perhaps I’m unable to see the subtlety that’s being alluded to.

Here’s how I understand it:

Knowing that something is a monoid tells us that we can work in a map-reduce setting: associativity lets us split the computation and recombine the partial results under any grouping (though not in any order, unless the operation is also commutative), i.e., (a1+a2)+a3 == a1+(a2+a3). It also lets us distribute the work across machines and achieve high parallelization. (Thus, I could mentally go from a theoretical construct -> computer science understanding -> practical problem solving.)
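
A minimal sketch of what I mean, in Python (the function name `chunked_reduce` and the chunk sizes are my own, purely for illustration): because `+` is associative with identity `0`, folding each chunk separately and then folding the partial results gives the same answer as a single left-to-right fold. The chunks stay in their original order; only the grouping changes.

```python
from functools import reduce
import operator

def chunked_reduce(op, identity, items, chunk_size):
    """Fold `items` with the monoid (op, identity), one chunk at a time.

    Each chunk could in principle be reduced on a different machine;
    associativity guarantees that the regrouping does not change the result.
    """
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    partials = [reduce(op, chunk, identity) for chunk in chunks]
    return reduce(op, partials, identity)

numbers = list(range(1, 11))
sequential = reduce(operator.add, numbers, 0)
regrouped = chunked_reduce(operator.add, 0, numbers, chunk_size=3)
assert sequential == regrouped == 55

# String concatenation is a monoid too, but it is not commutative:
# the grouping may change, the order of the chunks may not.
words = ["a", "b", "c", "d", "e"]
assert chunked_reduce(operator.add, "", words, chunk_size=2) == "abcde"
```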

For me it was obvious (as a result of studying Cat) to see that monads have a monoidal structure in the category of endofunctors. However, what is the implication one can draw from this and what is its programmatic/design/engineering impact when we’re coding with such a mental model?
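
To pin down what I mean by the monoidal structure, here is how I picture it for the list functor, sketched in Python (the names `unit`, `join` and `fmap` are my own labels for η, μ and the functor's action on functions; this is a spot check on sample values, not a proof). `join` plays the role of the monoid's multiplication and `unit` the role of its identity element.

```python
def unit(x):
    """eta: wrap a value in a singleton list."""
    return [x]

def join(xss):
    """mu: flatten exactly one level of nesting."""
    return [x for xs in xss for x in xs]

def fmap(f, xs):
    """Action of the list functor on a function."""
    return [f(x) for x in xs]

xs = [1, 2, 3]
xsss = [[[1], [2, 3]], [[4]], []]

# Identity laws: wrapping and then flattening is a no-op,
# whether the wrapping happens on the outside or on the inside.
assert join(unit(xs)) == xs
assert join(fmap(unit, xs)) == xs

# Associativity: flattening the outer two layers first or the
# inner two layers first gives the same list.
assert join(join(xsss)) == join(fmap(join, xsss)) == [1, 2, 3, 4]
```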

Here’s my interpretation:

  • Theoretical Implication: All computable problems at their heart are monoidal in a sense.
    • Is this correct? If so, I can understand the enlightenment. It’s a different perspective on the structure of computable problems, one that wouldn’t be obvious coming only from a Turing/lambda model of computation, and I can be at peace.
    • Is there more to it?
  • Practical Implication: Is it simply to provide a case for the do-notation style of programming? That is, if things are monoidal we can better appreciate the existence of the do/for constructs in Haskell/Scala. Is that it? But we needn’t invoke the monoidal underpinnings to make this claim, since the monad laws already require bind (>>= in Haskell, flatMap in Scala) to be associative. So what gives? Or does it have more to do with the foldability of monadic constructs, and is that the indirect enlightenment being alluded to?
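
For reference, the associativity of bind mentioned in the last bullet looks like this for lists, sketched in Python (`bind` stands in for `>>=`/`flatMap`; `neighbours` and `halves` are made-up sample functions). The nested comprehension at the end is roughly what a do block or for comprehension unfolds to.

```python
def bind(xs, f):
    """List version of >>= / flatMap: apply f to each element and concatenate."""
    return [y for x in xs for y in f(x)]

def neighbours(n):
    """Hypothetical step: each number yields its two neighbours."""
    return [n - 1, n + 1]

def halves(n):
    """Hypothetical step: even numbers yield their half, odd numbers yield nothing."""
    return [n // 2] if n % 2 == 0 else []

xs = [1, 2, 3]

# Associativity of bind: it does not matter how the chain is grouped.
left = bind(bind(xs, neighbours), halves)
right = bind(xs, lambda x: bind(neighbours(x), halves))
assert left == right == [0, 1, 1, 2]

# Roughly what `do { x <- xs; y <- neighbours x; halves y }` amounts to.
desugared = [z for x in xs for y in neighbours(x) for z in halves(y)]
assert desugared == left
```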

Question(s): What am I missing here? Is it simply the recognition that monads are generalized monoids and that, like monoids in a map-reduce setting, they can be combined under any grouping? How does knowing about the monoidal property help improve the code/design in any way? What’s a good before/after example that shows the difference (before knowing about monads/monoidality and after)?