Optimize molecule distance analysis code

I have a very large dataset (31,552,000 lines) of xyz coordinates in the following format:

1 2 3 4 5 6  7 8 9 . . .  

I have to compute the distances using the custom method below, which wraps any per-axis separation of at least half the 40-unit box length back into the box (periodic boundary conditions).

Distance[{a_, b_, c_}, {d_, e_, f_}] :=
 Sqrt[
  (If[Abs[a - d] >= (40/2), Abs[a - d] - 40, Abs[a - d]])^2 +
  (If[Abs[b - e] >= (40/2), Abs[b - e] - 40, Abs[b - e]])^2 +
  (If[Abs[c - f] >= (40/2), Abs[c - f] - 40, Abs[c - f]])^2
 ]
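
For example, with the 40-unit box two points near opposite faces come out as close rather than far apart:

Distance[{1., 0., 0.}, {39., 0., 0.}]
(* 2. -- the 38-unit separation wraps around to 2 *)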

Then I import the data.

data = Partition[Partition[ReadList["input.txt", {Real, Real, Real}], 16], 38];
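
As a quick structural check (assuming the file holds exactly 29,000 complete timesteps), the nesting is timesteps × molecules × atoms × coordinates, so data[[r, m, a]] is atom a of molecule m at timestep r:

Dimensions[data]
(* {29000, 38, 16, 3} *)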

The formatting is kind of strange: every 16 rows is one molecule, and every 38 molecules is one timestep. For each molecule I take the distance between its 16th atom and the 5th atom of each of the 38 molecules, select the distances that are at most 5.55, and determine the length of the resulting list. This is repeated for each of the 29,000 timesteps.

analysis = Flatten[
   Table[
    Table[
     Length[
      Select[
       Table[
        Distance[data[[r, y, 16]], data[[r, x, 5]]],
        {x, 1, 38}],
       # <= 5.55 &]],
     {y, 1, 38}],
    {r, 1, 29000}]
   ];
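
The result is one count per molecule per timestep, flattened into a single list:

Length[analysis]
(* 1102000, i.e. 29000*38 *)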

This last section is the most computationally intensive part. For 29,000 timesteps and 38 molecules it takes 40 minutes to process fully, and it uses too much memory (16+ GB per kernel) to parallelize. Is there any other method that will improve the performance? I have tried Compile, but I realized that Table, the biggest bottleneck, is already compiled to machine code.
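
For concreteness, a compiled version of Distance might look roughly like this (a sketch; distC is just an illustrative name, and the UnitStep form is one way of writing the If):

distC = Compile[{{p, _Real, 1}, {q, _Real, 1}},
  Module[{d = Abs[p - q]},
   (* wrap any per-axis separation of 20 or more back into the 40-unit box *)
   Sqrt[Total[(d - 40. UnitStep[d - 20.])^2]]]
  ];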

Below is an example dataset that takes my computer 2 minutes to process with the analysis code above (with the {r, 1, 29000} bound lowered to 4000). It scales to more timesteps by replacing 4000 with a larger number.

data = Partition[
   Partition[
    Partition[Table[RandomReal[{0, 40}], 3*16*38*4000], 3],
    16], 38];
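
Equivalently (just a more compact way to build the same sample, if that is easier to experiment with), the same {4000, 38, 16, 3} array of coordinates can be generated directly:

data = RandomReal[{0, 40}, {4000, 38, 16, 3}];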