I am implementing a parallel merge sort in C++ that is based off P-Merge-Sort algorithm from *Introduction to Algorithms, CLRS* on page 803. Here’s the algorithm:

And the associated algorithms:

And my problem is that I am trying avoid creating a $ T[1 \, ..\, n]$ for each invocation of P-Merge-Sort. I am told that it should be done via global arrays. So I have tried utilizing a single array for working which is the same size as the array being sorted where it would be used as ‘B’ in P-Merge-Sort.

How can I make this work without dynamically allocating memory during Merge Sorting?