Optimising tensor operations under memory constraints

Let `riem` be a free variable, with the assumption `riem ∈ Arrays[{4, 4, 4, 4}]`. Let:

`val = TensorContract[TensorProduct[riem, riem, riem], {{4, 5}}]`

Let `riemVals` be an actual `{4, 4, 4, 4}` tensor whose entries are symbolic expressions.

I’m interested in computing `val /. (riem-> riemVals)`. I’m “guessing” there are two ways Mathematica could do this internally:

1) Compute `v1 = TensorProduct[riemVals, riemVals, riemVals]`, then compute the result as `TensorContract[v1, {{4, 5}}]`.

2) Note that `val` is equivalent to:

`TensorProduct[TensorContract[TensorProduct[riem, riem], {{4, 5}}], riem]`.

Compute `v1 = TensorProduct[riemVals, riemVals]`, then `v2 = TensorContract[v1, {{4, 5}}]`, and finally the result as `TensorProduct[v2, riemVals]`.
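To make the two orderings concrete, here is a minimal sketch that runs both on a small symbolic tensor and checks that they agree. The dimension `n = 2` (instead of 4) and the names `r`, `full`, `pair`, `res1`, `res2` and `diff` are assumptions made for illustration only, chosen so the check runs quickly.

```mathematica
(* Assumption: n = 2 instead of 4, and a hypothetical symbol r for the entries *)
n = 2;
riemVals = Array[r, {n, n, n, n}];

(* Ordering 1: full triple product first, then contract slots 4 and 5 *)
full = TensorProduct[riemVals, riemVals, riemVals];
res1 = TensorContract[full, {{4, 5}}];

(* Ordering 2: contract the pair first, then take the product with the third factor *)
pair = TensorContract[TensorProduct[riemVals, riemVals], {{4, 5}}];
res2 = TensorProduct[pair, riemVals];

(* Ranks of the intermediates and of the two results *)
ArrayDepth /@ {full, pair, res1, res2}

(* The entries agree once the products in res2 are expanded *)
diff = Expand[Flatten[res1 - res2]];
AllTrue[diff, # === 0 &]
```

The remaining slots come out in the same order in both cases (1, 2, 3, 6, ..., 12), so the results can be compared entry by entry without any transposition.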

Now, what’s the difference between these two? They give the same result, but the first approach has to hold a $$4^{12}$$-entry tensor in memory as an intermediate value, while in the second the largest intermediate is the $$4^{8}$$-entry pairwise product, so nothing larger than the $$4^{10}$$-entry final result is ever stored. The idea is that, when the maximum memory is constrained, it pays to move `TensorContract` inwards in the expression, so that the contraction is done as early as possible, before the remaining `TensorProduct`.
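The memory argument can be checked empirically. The sketch below compares the peak memory of the two orderings with `MaxMemoryUsed`. It assumes `n = 3` and random machine-precision entries rather than the symbolic `{4, 4, 4, 4}` tensor, purely to keep the larger intermediate at a few megabytes; `memProductFirst` and `memContractFirst` are hypothetical names.

```mathematica
(* Assumptions: n = 3 and random numeric entries are illustrative stand-ins *)
n = 3;
SeedRandom[1];
riemVals = RandomReal[{-1, 1}, {n, n, n, n}];

(* Peak memory when the triple product is formed before contracting *)
memProductFirst = MaxMemoryUsed[
   TensorContract[TensorProduct[riemVals, riemVals, riemVals], {{4, 5}}]];

(* Peak memory when the pair is contracted before the final product *)
memContractFirst = MaxMemoryUsed[
   TensorProduct[
    TensorContract[TensorProduct[riemVals, riemVals], {{4, 5}}], riemVals]];

(* Entry counts: largest intermediate of each ordering, then the shared result *)
{n^12, n^8, n^10}

(* Measured peaks in bytes, for comparison *)
{memProductFirst, memContractFirst}
```

Because the arguments of a function are evaluated before the function itself, writing the expression in the contracted-first form is one way to impose that order by hand. Whether Mathematica ever reorders the first form on its own is not something this sketch settles.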

My question is: does Mathematica take the memory-efficient approach when doing these types of operations? If not, is there a way to control the evaluation so that the result is computed in a memory-efficient way (that is, by preferring forms in which the `TensorContract` is performed early)?