What do we gain with higher order logics?

Gödel’s speed up theorems seem to say that higher order logics offer shorter shortest proofs of various propositions in number theory. Explicitly, he gave the following

Theorem. Let $ n>0$ be a natural number. If $ f$ is a computable function, then there are infinitely many formulas $ A$ , provable in $ S_n$ ($ n$ -th order logic), such that if $ k$ is the length of the shortest proof of $ A$ in $ S_n$ and $ l$ is the length of the shortest proof of $ A$ in $ S_{n+1}$ , then $ k>f(l)$ .

I am not well versed in computable functions but I believe the identity function is computable, so the above inequality would simplify to tell us that the shortest proofs get shorter as we move into higher logics.

How does this reconcile with the following quote from Bell and Machover’s “A course in Mathematical Logic”:

However, most logicians agree that [second and higher order] languages are, at least in principle, dispensable. Indeed, let $ \mathfrak{U}$ be any structure and let $ \mathfrak{B}$ be a structure obtained from $ \mathfrak{U}$ in the following way. The universe of $ \mathfrak{B}$ consists of all undividuals of $ \mathfrak{U}$ plus all sets of individuals of $ \mathfrak{U}$ . The basic operations of $ \mathfrak{B}$ are defined in such a way that when they are restricted to the universe of $ \mathfrak{U}$ they behave exactly as the corresponding basic operations of $ \mathfrak{U}$ . Finally, the basic relations of $ \mathfrak{B}$ are all the basic relations of $ \mathfrak{U}$ plus two additional relations: the property of being an individual of $ \mathfrak{U}$ , and the relation of membership between an individual of $ \mathfrak{U}$ and a set of individuals of $ \mathfrak{U}$ . Then any statement about $ \mathfrak{U}$ expressed in a second-order language with set variables, can easily be “translated” into a statement about $ \mathfrak{B}$ in first order language. A similar argument applies to other higher-order languages. We therefore do not lose much by confining our attention to first order languages only.

From the argument they outline it seems they lose nothing by restricting their attention to first order languages, but this goes against the grain of the speed up theorems unless I’m missing something – does the “translation” from a second order statement over $ \mathfrak{U}$ to a first order statement over $ \mathfrak{B}$ require more symbols? This goes against what seems to be the implied translation, where we interpret the symbols for subsets of the universe of $ \mathfrak{U}$ as the symbols in $ \mathfrak{B}$ added to denote these subsets, which seems like it would yield a translation of equal length.

Giorgio Mossa’s answer here seems to indicate that the difference lies in the standard semantics of the languages in question, not their syntax, but this is somewhat unclear to me as the semantics of a language are fixed as soon as we choose a structure to interpret the language in, which it seems has been done in the above quote. Any assistance is greatly appreciated.