@clickup/pg-snowflake
Version:
Generate Snowflake ID Standard compatible ids with optional cluster number
113 lines (86 loc) • 4.57 kB
Markdown
# pg-snowflake: generate Snowflake ID Standard compatible ids with optional cluster number
Exposes `snowflake_gen()` function that generates [Snowflake ID
Standard](https://en.wikipedia.org/wiki/Snowflake_ID) compatible ids (with the
number of bits customizable, see below):
```
# Snowflake ID compatible (each letter is 1 bit):
0tttttttttttttttttttttttttttttttttt | CCCCC | iiiiiiiiii | ssssssssssssss
^ ^ ^ ^
35 5 10 14 bits
```
Here,
- "t" denotes the bits of timestamp, which is the current Unix time in
milliseconds minus the value of `CONST_SNOWFLAKE_TS_START` psql variable. Only
`CONST_SNOWFLAKE_TS_BITS` **highest** bits of the timestamp are used.
- "C" denotes "cluster number". If `CONST_SNOWFLAKE_CLUSTER_NO_BITS` is 0, then
this field is not present, and the entire leftmost range is used for the
timestamp.
- "i" is "island no" aka "Snowflake Machine ID", must be unique across all
master nodes within the same cluster.
- "s" is "sequence number bits".
You can customize the length of the bit fields in your `pg-snowflake.config.sql`
file, see below.
Also, you can pass custom arguments to `snowflake_gen()` function itself.
The library is typically installed in schema `public` on every PostgreSQL node.
## Installation
First, copy `pg-snowflake.config.sql.example` to `pg-snowflake.config.sql` in
the root of your app and optionally update `CONST_*` according to your needs.
```sql
\set CONST_SNOWFLAKE_TS_START 1475796139634
\set CONST_SNOWFLAKE_TS_BITS 35
\set CONST_SNOWFLAKE_CLUSTER_NO_BITS 5
\set CONST_SNOWFLAKE_ISLAND_NO_BITS 10
\set CONST_SNOWFLAKE_SEQ_BITS 14
```
Then run in psql console:
```sql
-- The library is typically installed in schema "public".
SET search_path TO public;
-- Optional: if you don't want to pass PG_SNOWFLAKE_CLUSTER_NO environment
-- variable, create this configuration function manually.
--CREATE OR REPLACE FUNCTION snowflake_cluster_no() RETURNS integer LANGUAGE sql
-- SET search_path FROM CURRENT AS 'SELECT 1';
-- Optional: if you don't want to pass PG_SNOWFLAKE_ISLAND_NO environment
-- variable, create this configuration function manually.
--CREATE OR REPLACE FUNCTION snowflake_island_no() RETURNS integer LANGUAGE sql
-- SET search_path FROM CURRENT AS 'SELECT 1';
-- Install the library to the current schema.
\ir .../pg-snowflake-up.sql
```
The `pg-snowflake-up.sql` installation script will search for
`pg-snowflake.config.sql` file in the parent folders (it will fail if the file
is not found).
By default, cluster number is read from `PG_SNOWFLAKE_CLUSTER_NO` shell env
variable. This way, you can assign it to e.g. 1 for development, 2 for staging,
3 for production environments (or 3, 4, 5, ... for different clusters in
different AWS regions). If for some reason you don't want this behavior, just
define `snowflake_cluster_no()` function manually.
The same applies to `PG_SNOWFLAKE_ISLAND_NO` and `snowflake_island_no()`
functions.
## snowflake_gen([seq_name regclass][,ts_start bigint][,ts_bits][,cluster_no_bits][,island_no_bits][,seq_bits])
Generates a [Snowflake ID Standard](https://en.wikipedia.org/wiki/Snowflake_ID)
compatible ids.
- `seq_name`: by default, the function uses `snowflake_seq` sequence to avoid
within-millisecond (or within-second) collisions, but you can also pass your own
sequence name instead.
- `ts_start`: optional epoch start timestamp in milliseconds. Defaults to
`CONST_SNOWFLAKE_TS_START` (see pg-snowflake.config.sql.example).
- `ts_bits`: optional number of bits for the timestamp part. Defaults to
`CONST_SNOWFLAKE_TS_BITS`.
- `cluster_no_bits`: optional number of bits for the cluster number part.
Defaults to `CONST_SNOWFLAKE_CLUSTER_NO_BITS`.
- `island_no_bits`: optional number of bits for the island number part. Defaults
to `CONST_SNOWFLAKE_ISLAND_NO_BITS`.
- `seq_bits`: optional number of bits for the sequence part. Defaults to
`CONST_SNOWFLAKE_SEQ_BITS`.
The default format is `t35-C5-i10-s14`. To customize, use PostgreSQL named
arguments syntax, e.g.:
```sql
SELECT snowflake_gen(ts_bits:=30, cluster_no_bits:=10, island_no_bits:=10);
```
The benefit of this function is that you can order by id to get the approximate
timestamp-based order, even across microshards/nodes. It also uses bits in the
bigint id efficiently. The downside is that the id is not easily parsable by
just the human eye, since it's bits-based (not decimal positions based).
Example of id generated:
- `4705186322862407685`