I want to store a hash-code for a variable-length text field (max 1000 chars) in a database table. The hash-code will be computed and assigned once on insert, and new rows will be inserted very often.
The hash-code will be used mainly for filtering (
WHERE), grouping (
GROUP BY), and sorting (
ORDER BY) in a couple of queries. The database table will hold a few million rows over time, with the probability of identical hash-codes (for identical text) being around 30% (rest being unique).
I have the choice of making the hash-code data type
NVARCHAR (SHA1 of text) or
BIGINT (converted bytes of SHA1 of text). I think
BIGINT will be better in terms of storage space (less pages).
Generally speaking, which of these two data types will be better in terms of performance, considering the operations mentioned above?