UNPKG

@clickup/pg-id

Version:

Generate randomly-looking never repeating primary key ids and more

155 lines (109 loc) 6.13 kB
# pg-id: generates random-looking never repeating primary key ids and more This library contains PostgreSQL functions which generate bigint ids having the following format: ``` EssssRRRRRRRRRRRRRR ^ ^^^^^^^^^^^^^^ 4 14 ``` Also, `id_gen_uuid()` generates UUIDs looking like: ``` EssssRRR-RRRR-4RRR-NRRR-RRRRRRRRRRRR ``` Here, - "E" is an "environment number" (e.g. denoting dev, test, staging, production). The range is 1..7. The current value of the environment should be returned by function `id_env_no()`: you need to create it beforehand in the schema where you install the library (or pass it in `DB_ID_ENV_NO` environment variable before installing). - "ssss" is a 4-digit "microshard number". The range is 0..9999. The current value of the microshard should be returned by function `id_shard_no()`: you need to create it beforehand in the schema where you install the library (or the schema name should end with that number). - "RRRRRRRRRRRRRR" is a value which is unique within the chosen "environment number" and "microshard number". Depending on the function used, it is either a randomly-looking number (for `id_gen()` and `id_gen_uuid()`), a number based on the current timestamp (for `id_gen_timestampic()`) or just an auto-incrementing number (for `id_gen_monotonic()`). You can customize the length of "ssss" and "RRRRRRRRRRRRRR" sequences in your `pg-id.config.sql` file, see below. ## Installation First, copy `pg-id.config.sql.example` to `pg-id.config.sql` in the root of your app and update `CONST_MUL`, `CONST_SUM` and `CONST_MOD` with some random numbers that only you know. Those numbers will play the role of crypto constants which will not allow people to easily guess the pattern of ids generated by `id_gen()`. Then run in psql console: ```sql -- Selects the schema you want to install the library to. SET search_path TO your_schema_1234; -- Optional: if you don't want to pass DB_ID_ENV_NO environment variable, -- create this configuration function manually. --CREATE OR REPLACE FUNCTION id_env_no() RETURNS integer LANGUAGE sql -- SET search_path FROM CURRENT AS 'SELECT 1'; -- Optional: if your schema name doesn't end up with a microshard number, -- create this configuration function manually. --CREATE OR REPLACE FUNCTION id_shard_no() RETURNS integer LANGUAGE sql -- SET search_path FROM CURRENT AS 'SELECT 123'; -- Install the library to the current schema. It will load crypto constants -- from pg-id.config.sql, get the environment number from DB_ID_ENV_NO -- environment variable and infer microshard number from the numeric -- suffix of the current schema. \ir .../pg-id-up.sql ``` The `pg-id-up.sql` installation script will search for `pg-id.config.sql` file in the parent folders (it will fail if the file is not found). By default, environment number is read from `DB_ID_ENV_NO` shell env variable. This way, you can assign it to e.g. 1 for development, 2 for staging and 3 for production environments. If for some reason you don't want this behavior, just define `id_env_no()` function manually. Also, unless you created `id_shard_no()` manually, the installation script will try to infer the microshard number from the current schema name suffix. E.g. if your schemas are named `sh0000`, `sh0042`, `your_schema_1234` etc., then you're good with that default behavior. ## id_gen() Generates next globally-unique randomly-looking id. The main idea is to not let external people infer the rate at which the ids are generated, even when they look at some ids sample. The function implicitly uses `id_seq` sequence to get the information about the next available number, and then uses Feistel cipher to generate a randomly-looking non-repeating id based of it. Examples of ids generated (underscores are just for illustration): - `2_0000_17217633124378`: "environment 2, shard 0, number 17217633124378" - `1_0238_17493700363834`: "environment 1, shard 238, number 17493700363834" ## id_gen_timestampic() Similar to `id_gen()`, but instead of generating randomly looking ids, prepends the "sequence" part of the id with the current timestamp (actually, the number of seconds since 2010-01-01 UTC which is 9 decimal digits, i.e. +17 years from 2023). The function reserves up to 5 decimal digits for the number part of the id, so within each second, up to 100k unique ids can be generated. The function implicitly uses `id_seq_timestampic` sequence to get the information about the next available number. The benefit of this function is performance: increasing ids are more friendly to heavy INSERTs since they maximize the chance for btree index to reuse the newly created leaf pages. At the same time, having timestamp in the prefix doesn't allow to infer the number of objects existing in the database so far. Example of id generated (underscores are just for illustration): - `2_0001_435044939_00029`: "environment 2, shard 1, seconds 435044939, num 29" ## id_gen_monotonic() The simplest and fastest function among the above: generates next globally-unique monotonic id, without using any timestamps as a prefix. Monotonic ids are more friendly to heavy INSERTs since they maximize the chance for btree index to reuse the newly created leaf pages. The function implicitly uses `id_seq_monotonic` sequence to get the information about the next available number. Example of id generated (underscores are just for illustration): - `2_0001_00000000000003`: "environment 2, shard 1, number 3" The downside is that the ids of this format basically expose the number of unique objects which were created in the database so far. ## id_gen_uuid() An example UUID generated by that function: ``` 10246RRR-RRRR-4RRR-NRRR-RRRRRRRRRRRR ``` Here, 1 is environment number (e.g. production) and 0246 is microshard number. Basically, the function uses PostgreSQL built-in [gen_random_uuid()](https://www.postgresql.org/docs/current/functions-uuid.html), but replaces the first digits with environment and microshard numbers. This trick doesn't cut too much of the UUID's entropy, but allows to use UUIDs in microsharded environment.