I am creating a database of encrypted values.
Let us say I store “John”, which is encrypted and stored as “Yoky”.
John | Yoky
Now I store “Johnny”, which is encrypted and stored as “Koaddy”.
John | Yoky
Johnny | Koaddy
With the above storage I get no regex-style (wildcard) search functionality at all. If I search for “Jo%”, it will not work, because the stored ciphertext “Yoky” bears no relation to the prefix “Jo”.
But what if I break each value into its prefixes and store those instead, like this:
Jo | Yoky , Koaddy
Joh | Yoky , Koaddy
John | Yoky , Koaddy
Johnn | Koaddy
Johnny | Koaddy
Here the wildcard searches will work: “Jo%” and “Joh%” will both return Yoky and Koaddy, which is what I want.
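To make the idea concrete, here is a minimal sketch of the prefix table in Python. The names `build_prefix_index`, `prefix_search` and `MIN_PREFIX_LEN` are made up for illustration, and the "ciphertexts" are just the placeholder strings from my example, not real encryption output. It also assumes the shortest stub is two characters, as in “Jo”.

```python
from collections import defaultdict

# Assumption: stubs start at two characters, matching "Jo" in the example.
MIN_PREFIX_LEN = 2


def build_prefix_index(records):
    """Map every prefix of each plaintext to the ciphertexts stored for it."""
    index = defaultdict(list)
    for plaintext, ciphertext in records:
        for end in range(MIN_PREFIX_LEN, len(plaintext) + 1):
            index[plaintext[:end]].append(ciphertext)
    return dict(index)


def prefix_search(index, pattern):
    """Answer a 'Jo%'-style query with an exact lookup on the part before '%'."""
    return index.get(pattern.rstrip("%"), [])


# Placeholder ciphertexts taken from the example above.
records = [("John", "Yoky"), ("Johnny", "Koaddy")]
index = build_prefix_index(records)

print(prefix_search(index, "Jo%"))     # both records share the prefix "Jo"
print(prefix_search(index, "Johnn%"))  # only "Johnny" has this prefix
```

The point of this layout is that a wildcard query turns into a plain equality lookup, at the cost of one row per prefix of every stored value.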
I can see the obvious security flaw above: the stubs are stored in plaintext, so anyone with access to the table can map out “Jo”, “Joh” and so on.
So I have decided to store the stubs in encrypted form as well.
I will AES-encrypt each stub and store the result:
qkjklewr!j== | Yoky , Koaddy
klkadsopos== | Yoky , Koaddy
oensd%21op== | Yoky , Koaddy
kaknvp23b02== | Koaddy
kashdi2094j== | Koaddy
When performing a search, say for “Joh”, I will first encrypt “Joh” and then search for the result, so the query maps to the AES-encrypted value of “Joh”, i.e. klkadsopos==.
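A minimal sketch of that lookup, under some loud assumptions: it uses HMAC-SHA256 from the Python standard library as a stand-in for the AES step, only so that the example runs without third-party packages. What matters for the lookup is that the stub transformation is deterministic (the same stub under the same key always gives the same token); with a random IV per encryption, the encrypted query would never match the stored value. `STUB_KEY`, `stub_token`, `build_token_index` and `search` are hypothetical names, and the key is a placeholder.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical placeholder key for the stub column; a real key would come
# from a key-management system, not from source code.
STUB_KEY = b"hypothetical-stub-key"
MIN_PREFIX_LEN = 2  # assumption: shortest stub is two characters


def stub_token(stub, key=STUB_KEY):
    """Deterministic keyed token for a stub (HMAC stand-in for the AES step)."""
    return hmac.new(key, stub.encode("utf-8"), hashlib.sha256).hexdigest()


def build_token_index(records, key=STUB_KEY):
    """Store only tokens of the prefixes, never the plaintext prefixes."""
    index = defaultdict(list)
    for plaintext, ciphertext in records:
        for end in range(MIN_PREFIX_LEN, len(plaintext) + 1):
            index[stub_token(plaintext[:end], key)].append(ciphertext)
    return dict(index)


def search(index, term, key=STUB_KEY):
    """Transform the search term the same way, then do an exact lookup."""
    return index.get(stub_token(term, key), [])


# Placeholder ciphertexts taken from the example above.
records = [("John", "Yoky"), ("Johnny", "Koaddy")]
index = build_token_index(records)

print(search(index, "Joh"))     # matches both stored values
print(search(index, "Johnny"))  # matches only the longer value
```

Note that the table still reveals which rows share a stub token and how many stubs each value produced, even though the stubs themselves are not readable.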
Note: The two columns will use different keys and algorithms to protect the data.
Note: This storage will be TDE-encrypted. HDFS will be encrypted, and I will be using Apache Solr for the rest.
I need to understand if I am missing something fundamental here.