Design Review: Randomly generated primary keys in database


Problem:

In my web application entities have unique IDs, and for some type of entities(e.g. user, order etc.), the IDs are visible to users via URLs. MySQL is used for storing the entities.

Using auto increment integer as primary key is simple and convenient, but if it’s exposed in URL, people are able to estimate the entity amount or increasing rate by observing the IDs(German Tank Problem), and we might want to avoid that.

Having read a lot of posts on the topic, I’m aware of several approaches to mitigate the information disclose, e.g. generate random IDs by the app, using UUIDs, encode/hash last insert ID etc.

But finally I’ve come up with my approach.

Design

Firstly, generate a list of unique integers, and preload them into the database.

Since I’m not going to have billions of records, I could make a large sequential list and shuffle it in memory.

table to store the integers would be:

CREATE TABLE IF NOT EXISTS entity1_id (     id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,     entity_id INT UNSIGNED NOT NULL UNIQUE ); 

The entity_id column is assigned the integer value.

Secondly, prepare the entity table and a counter table:

CREATE TABLE IF NOT EXISTS entity1 (     id INT UNSIGNED PRIMARY KEY,     create_time DATETIME DEFAULT CURRENT_TIMESTAMP ) ENGINE=InnoDB;  CREATE TABLE IF NOT EXISTS id_counter (     entity_name CHAR(30)  PRIMARY KEY,     counter INT NOT NULL DEFAULT 1 );  INSERT INTO id_counter (entity_name) VALUES ('entity1'); 

Finally, use a “BEFORE INSERT” trigger to fetch a ID from the entity1_id table, and increment the ID counter.

The counter is used to keep track of used IDs in the entity1_id table. Think of entity1_id as a map, entity1_id[counter] gives next entity ID.

Here is the trigger:

DELIMITER |  CREATE TRIGGER entity1_idgen BEFORE INSERT ON entity1 FOR EACH ROW BEGIN     DECLARE cnt, eid INT UNSIGNED;      IF (NEW.id IS NULL) THEN         SET cnt = (SELECT counter FROM id_counter WHERE entity_name='entity1' FOR UPDATE);          SET eid = (SELECT entity_id FROM entity1_id WHERE id=cnt);         IF (eid IS NULL) THEN             SET @errMsg = CONCAT("table `entity1_id` doesn't have row where id=", cnt);             SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT=@errMsg;         END IF;          SET NEW.id = eid;         UPDATE id_counter SET counter=counter+1 WHERE entity_name='entity1';     END IF; END|  DELIMITER ; 

I’ve briefly tested this approach, when inserting into the entity1 table, if id is not specified, an ID is taken from the entity1_id table; if a ID is specified, it would be used, and entity1_id and id_counter is not visited; if id is not specified and there’re no enough IDs in entity1_id, the error is reported.

Analysis

I think this approach is good for me:

  1. I get randomly generated IDs
  2. I can decide what IDs to use
  3. It’s not slow

From my observation, this approach has 3 downsides:

  1. it’s not a straightforward solution
  2. the risk of running out of IDs
  3. extra space for the pre-generated IDs

Question

  • Am I doing something wrong?
  • Are there any other risks I’m not aware of?