ent-framework

Version:

A PostgreSQL graph-database-alike library with microsharding and row-level security

80 lines (58 loc) • 3.54 kB

Markdown

# Ent API: selectBy() Unique Key Prefix Similar to how `loadBy()` loads a single Ent by its unique key, `selectBy()` call loads _multiple_ ents by their **unique key prefix**. ## Ent.selectBy(vc, { field: "...", ... }): Ent\[] Loads the Ents matching the predicate, considering the predicate is a list of fields from your unique key prefix. Logically, you can load the same Ents by just `select()` call, but then, while batching, it will produce a `UNION ALL` clause, which is less efficient and may cause performance problems when a large number of calls are batched. In contrast, `selectBy()` never produces a `UNION ALL` clause, but the price we pay for it is the implication that we can only select by the unique key _prefix_, not by an arbitrary predicate. All in all, you’ll rarely need to use `selectBy()` in your code. It is used interally though to fetch [Inverses](../architecture/ent-framework-metas-tao-entgo.md#no-explicit-assocs) efficiently.  Let’s actually use Inverses to illustrate, how `selectBy()` works. Internally, the Inverses Ent schema looks like this: ```typescript const schema = new PgSchema( name, { id: { type: ID }, created_at: { type: Date, autoInsert: "now()" }, type: { type: String }, id1: { type: ID }, shard2: { type: Number }, }, ["type", "id1", "shard2"], ) ``` ## Simple Batching Sometimes, when Ent Framework needs to discover the full list of microshards on the opposite end of some field edge, it internally runs the following calls in parallel: <pre class="language-typescript"><code class="lang-typescript">await Promise.all([ EntInverse.selectBy(vc, { type: "user2topics", id1: "123" }), EntInverse.selectBy(vc, { type: "user2topics", id1: "456" }), <strong>]); </strong></code></pre> Notice that in this example, all parallel calls use the same prefix (`type: "user2topics"`), but the very last selection field varies. For such a case (which is actually pretty common), to produce the most optimal PostgreSQL execution plan, Ent Framework builds the following batched SQL query: ```sql SELECT * FROM inverses WHERE type='user2topics' AND id1 IN('123', '456') ``` ## Complex Batching Unfortunately, the above query stops being optimal when the prefix differs across multiple parallel calls. Consider this example: <pre class="language-typescript"><code class="lang-typescript">await Promise.all([ EntInverse.selectBy(vc, { type: "user2topics", id1: "123" }), EntInverse.selectBy(vc, { type: "user2topics", id1: "456" }), EntInverse.selectBy(vc, { type: "topic2comments", id1: "789" }), <strong>]); </strong></code></pre> Assume we try to build the batched query using the same approach as above: ```sql -- DON'T DO IT! SELECT * FROM inverses WHERE (type='user2topics' AND id1 IN('123', '456')) OR (type='topic2comments' AND id1 IN('789')) ``` In this case, PostgreSQL will often times produce a suboptimal plan with "bitmap index scan" instead of "index scan". This is partially due to the fact that our DB unique index is by `(type, id1, shard2)`, and we only utilize its prefix `(type, id1)`. Luckily, there is another query plan which is used by Ent Framework in such a case: ```sql -- Good plan! SELECT * FROM inverses WHERE (type, id1) IN(VALUES( ('user2topics', '123'), ('user2topics', '456'), ('topic2comments', '789') )) ``` It produces an optimal query plan for the cases when prefixes differ. (BTW, it loses in the situations when the prefix is common, for which `AND id1 IN(...)` clause plays better.)