Encoding a arbitrary stack trace into a fixed length value


Background

I would like to store the nodes of a Calling Context Tree using in a key value store. I need to be able to directly access a node by it’s method name and complete stack trace. In addition in need to access all nodes of a method by only it’s name (the key value store supports a loading based on prefix).

Problem

The first idea is to use the method name + an encoded stack trace as key, e.g. the concatenated string representations. Unfortunately this can get quite large and I cannot use keys of arbitrary length. So the second idea was to encode the stack trace in a deterministic and reversible way. So my next idea was to encode the stack trace in a 64 bit integer, by adding the 32 bit hash representations of the methods in the stack. Unfortunately this is not collision free as the traces A -> B -> C and B -> A -> C compute to the same values even though the traces are different. So my current idea is to encode the traces by:

encodeStacktrace(stack_trace) 1. 64bit current = 0 2. For every method m in stack_trace 3.   current = rotateLeft(current) + hash(m) 4. return current 

They key is then method name concatenated with the encoded stack trace value.

Question

Is this implementation collision safe? I think not 100% however I don’t know how to compute the probability under the assumption that the method hash computation is a perfect hashing algorithm.

If it is not safe, are there other implementations/directions I can look into?