# Notion Page Tree
> Fetch nested Notion pages from the root page/database/block.



---
- [Notion Page Tree](#notion-page-tree)
  - [Why Would I Want This?](#why-would-i-want-this)
    - [👌 Use the official Notion API.](#-use-the-official-notion-api)
    - [🕸 Fetch nested children pages.](#-fetch-nested-children-pages)
    - [🔁 Handle API Errors gracefully.](#-handle-api-errors-gracefully)
  - [Other Features](#other-features)
    - [💾 It saves fetch results to your local disk.](#-it-saves-fetch-results-to-your-local-disk)
    - [🖥 It builds basic page server.](#-it-builds-basic-page-server)
    - [🔍 It builds basic page search indexes (missing feature of the official Notion API).](#-it-builds-basic-page-search-indexes-missing-feature-of-the-official-notion-api)
  - [Advices](#advices)
  - [Usage](#usage)
    - [`.env` File Configuration](#env-file-configuration)
    - [Basic Usage](#basic-usage)
    - [Usage with More Options](#usage-with-more-options)
  - [Shape of Data](#shape-of-data)
    - [Notion's Original Block Tree Structure](#notions-original-block-tree-structure)
    - [Fetched Page Structure](#fetched-page-structure)
  - [Entity Data Types](#entity-data-types)
    - [`Entity`](#entity)
    - [`PlainEntity`](#plainentity)
    - [`Page | Database | Block`](#page--database--block)
  - [Fetch Result Data Types](#fetch-result-data-types)
    - [`NotionPageTree.prototype.page_collection`](#notionpagetreeprototypepage_collection)
    - [`NotionPageTree.prototype.root`](#notionpagetreeprototyperoot)
    - [`NotionPageTree.prototype.search_index`](#notionpagetreeprototypesearch_index)
    - [`NotionPageTree.prototype.search_suggestion`](#notionpagetreeprototypesearch_suggestion)
  - [How It Creates Fetch Queue](#how-it-creates-fetch-queue)
    - [NodeJS Promise Queue Example](#nodejs-promise-queue-example)
    - [Flowchart](#flowchart)
---
## Why Would I Want This?
### 👌 Use the official Notion API.
- The popular `/loadpagechunk/` endpoint is not public and may break in future updates.
- The official API authenticates with a private integration key, so your database can stay private.
### 🕸 Fetch nested children pages.
- Pages inside non-page blocks are fetched too.
- The maximum request depth is configurable.
- The main fetch loop is driven by Node.js timers instead of recursive calls, so it cannot exhaust the call stack.
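
A timer-driven traversal of this kind can be sketched as follows. This is only an illustration under assumptions, not the library's actual implementation: `TreeNode` and `walkWithTimers` are hypothetical names, and the sketch walks an in-memory tree where a real fetcher would issue API requests.

```typescript
// Hypothetical node shape used only for this sketch.
interface TreeNode {
  id: string;
  children: TreeNode[];
}

// Visit nodes breadth-first down to maxDepth. Every step is scheduled with
// setTimeout, so the call stack does not grow with the depth of the tree.
function walkWithTimers(
  root: TreeNode,
  maxDepth: number,
  onDone: (visited: string[]) => void
): void {
  const queue: Array<{ node: TreeNode; depth: number }> = [{ node: root, depth: 0 }];
  const visited: string[] = [];

  const step = (): void => {
    const next = queue.shift();
    if (!next) {
      onDone(visited); // nothing left to visit
      return;
    }
    visited.push(next.node.id);
    if (next.depth < maxDepth) {
      for (const child of next.node.children) {
        queue.push({ node: child, depth: next.depth + 1 });
      }
    }
    setTimeout(step, 0); // schedule the next step instead of recursing
  };

  setTimeout(step, 0);
}
```

Because each step returns before the next one starts, a very deep tree costs more timer ticks but never a deeper stack.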
### 🔁 Handle API Errors gracefully.
- Fetch concurrency is capped to avoid `rate_limited` errors.
- On a `rate_limited` error, fetching pauses for a few minutes before resuming.
- Other errors are retried automatically. The maximum retry count is configurable.
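
The retry behaviour described above might look roughly like the sketch below. It is written under assumptions: `requestWithRetry`, `CodedError`, and `rateLimitWaitMs` are hypothetical names, errors are assumed to carry a `code` string, and the choice not to count a `rate_limited` pause against the retry budget is this sketch's own, not something the library documents.

```typescript
// Assumed error shape: only the `code` field matters for this sketch.
interface CodedError {
  code?: string;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function requestWithRetry<T>(
  request: () => Promise<T>,
  maxRetry: number,
  rateLimitWaitMs: number
): Promise<T> {
  let retry = 0;
  for (;;) {
    try {
      return await request();
    } catch (error) {
      if ((error as CodedError).code === 'rate_limited') {
        // Rate limited: pause, then try again without spending a retry.
        await sleep(rateLimitWaitMs);
        continue;
      }
      if (retry >= maxRetry) throw error; // retry budget exhausted
      retry += 1;
    }
  }
}
```

For example, `requestWithRetry(() => fetchSomething(), 2, 60_000)` would retry ordinary failures twice and wait one minute after each rate-limit response, where `fetchSomething` stands for any promise-returning request.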
---
## Other Features
### 💾 It saves fetch results to your local disk.
- Set the `private_file_path` parameter to a custom path.
### 🖥 It builds basic page server.
- The `/page/:id/` endpoint returns a page and the ids of its children.
- The `/tree/:id/` endpoint returns all pages nested under the page.
### 🔍 It builds basic page search indexes (missing feature of the official Notion API).
- Uses lunr.js.
- Each page's properties and children are converted into plain text to build the search index.
- The `/search?keyword=` endpoint searches page properties and returns matching page ids.
- The `/suggestion?keyword=` endpoint looks up tokens in the search index.
---
## Advices
⚠️ This library is not for fetching the whole nested page content.
- It lists nested pages down to a given depth and retrieves their properties.
- A page's block children are fetched to improve search index results, not to display them.
- If you want to render whole pages, use a library such as `react-notion-x` (which requires sharing your pages publicly on the web).

⚠️ I recommend keeping `maxRequestDepth` lower than 5 and `maxBlockDepth` lower than 2.
- Increasing the max request depth increases the request count exponentially.
- If you want to fetch deeply nested pages, don't put them under plain blocks. Put them directly at the page's root level instead.
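
To get a feel for the exponential growth, here is a rough back-of-the-envelope estimate. It assumes, purely for illustration, that every entity has the same number of children and costs exactly one request; `estimateRequestCount` and `branching` are hypothetical names, and real workspaces will differ.

```typescript
// Sum of branching^0 + branching^1 + ... + branching^maxRequestDepth.
function estimateRequestCount(branching: number, maxRequestDepth: number): number {
  let total = 0;
  for (let depth = 0; depth <= maxRequestDepth; depth += 1) {
    total += Math.pow(branching, depth);
  }
  return total;
}

console.log(estimateRequestCount(5, 3)); // 1 + 5 + 25 + 125 = 156
console.log(estimateRequestCount(5, 5)); // 3906
```

Under these assumptions, two extra levels of depth multiply the request count by about 25.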
---
## Usage
> See `./sample/index.ts` for a full example.
### `.env` File Configuration
Write these values directly in `<package_root>/.env`:
```text
NOTION_ENTRY_ID = <root page/database/block's id>
NOTION_ENTRY_KEY = <root's integration key>
NOTION_ENTRY_TYPE = <page/database/block>
```
### Basic Usage
```js
import NotionPageTree from 'notion-page-tree';

async function simple_use() {
  // Construct the main class instance.
  const notionPageTree = new NotionPageTree();

  // Set up servers for listing and searching pages.
  // (They respond with 503 until pages have been fetched.)
  const server = notionPageTree.setupServer({ port: 8889 });

  // Look for cached documents in private_file_path.
  await notionPageTree.parseCachedDocument();

  // Set the environment variables needed for Notion API requests.
  await notionPageTree.setRequestParameters({ prompt: true });

  // Fetch pages once.
  await notionPageTree.fetchOnce();

  // Start an asynchronous fetch loop that waits 10 seconds between fetches.
  notionPageTree.startFetchLoop(1000 * 10);

  setTimeout(() => {
    // Stop the fetch loop immediately.
    notionPageTree.stopFetchLoop();
    // Stop the servers immediately.
    server.close();
  }, 1000 * 30);
}

simple_use();
```
### Usage with More Options
```js
import NotionPageTree from 'notion-page-tree';
import path from 'path';

async function use_more_options() {
  const notionPageTree = new NotionPageTree({
    // Path where serialized page data is saved.
    private_file_path: path.resolve('./results/'),
    // Turn off search indexing.
    searchIndexing: false,
    createFetchQueueOptions: {
      // The current official rate limit is 3 requests per second.
      // The Notion API is likely to throw errors if you increase this value.
      maxConcurrency: 3,
      // How many times a failed request is retried.
      // (A "rate_limited" error waits some minutes before retrying.)
      maxRetry: 2,
      // Search depth applied to all entities.
      maxRequestDepth: 3,
      // Search depth applied only to plain blocks (not pages or databases),
      // relative to the nearest parent page.
      maxBlockDepth: 2,
      // Filter used when querying databases (see the official Notion API docs).
      databaseQueryFilter: {
        property: 'isPublished',
        checkbox: {
          equals: true
        }
      }
    }
  });

  await notionPageTree.parseCachedDocument();
  const server = notionPageTree.setupServer({ port: 8888 });
  await notionPageTree.setRequestParameters({
    // Prompt and rewrite .env if parameters don't exist.
    prompt: true,
    // Prompt and rewrite .env even if parameters exist.
    forceRewrite: false
  });
  await notionPageTree.fetchOnce();
  notionPageTree.startFetchLoop(1000 * 10);

  setTimeout(() => {
    notionPageTree.stopFetchLoop();
    server.close();
  }, 1000 * 30);
}

use_more_options();
```
---
## Shape of Data
### Notion's Original Block Tree Structure
<div style="overflow-x: scroll !important; white-space: pre-wrap !important; width: 100%">
<pre style="display: inline-block;">
<code>
database A
├── page A
│   └── check list
│       └── toggle list
│           └── page E
└── page B
    └── bullet-list
        ├── page C
        │   └── database B
        ├── page D
        └── list element
            └── list element
</code>
</pre></div>
### Fetched Page Structure
<div style="overflow-x: scroll !important; white-space: pre-wrap !important;">
<pre style="display: inline-block;">
<code>
database A
├── page A    (blockChildren PlainText: check list, toggle list)
│   └── page E
└── page B    (blockChildren PlainText: bullet list, list element, list element)
    ├── page C
    │   └── database B
    └── page D
</code></pre></div>
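
The step from Notion's block tree to the fetched page structure can be illustrated with a small sketch: plain blocks are folded into their nearest page or database ancestor, while nested pages and databases become direct children. This is a simplified, recursive illustration and not the library's timer-based implementation. `RawNode`, `PageNode`, and `collapseBlocks` are hypothetical names, block ids stand in for block text, and the root is assumed to be a page or database.

```typescript
type Kind = 'page' | 'database' | 'block';

interface RawNode {
  id: string;
  kind: Kind;
  children: RawNode[];
}

interface PageNode {
  id: string;
  kind: 'page' | 'database';
  blockText: string[]; // stand-in for the plain text gathered from blocks
  children: PageNode[];
}

function collapseBlocks(node: RawNode): PageNode {
  const kind = node.kind;
  if (kind === 'block') {
    throw new Error('this sketch expects a page or database as root');
  }
  const result: PageNode = { id: node.id, kind, blockText: [], children: [] };

  const visit = (child: RawNode): void => {
    if (child.kind === 'block') {
      // A plain block contributes text and is searched for nested pages.
      result.blockText.push(child.id);
      child.children.forEach(visit);
    } else {
      // A page or database becomes a direct child of the nearest page.
      result.children.push(collapseBlocks(child));
    }
  };

  node.children.forEach(visit);
  return result;
}
```

Applied to a page that contains a check list, which contains a toggle list, which contains another page, the result is a page with `blockText` of two entries and one direct child page.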
---
## Entity Data Types
### `Entity`
An entity whose `parent` and `children` are direct object references.
```typescript
type Entity = Commons & (Page | Database | Block);

interface Commons {
  id: string;
  depth: number;
  blockContentPlainText: string;
  parent?: Entity;
  children: Entity[];
}
```
### `PlainEntity`
An entity whose `parent` and `children` are stored as ids instead of references. In the type definitions it is named `FlatEntity`.
```typescript
type FlatEntity = FlatCommons & (Page | Database | Block);

interface FlatCommons {
  id: string;
  depth: number;
  blockContentPlainText: string;
  parent?: string;
  children: string[];
}
```
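
The relationship between the two shapes can be sketched as a function that links an id-based collection into a reference-based tree. The names `FlatNode`, `LinkedNode`, and `linkEntities` are hypothetical and the types are trimmed to the fields that matter here. Whether the library performs the conversion this way is an assumption.

```typescript
// Trimmed-down stand-ins for FlatEntity and Entity.
interface FlatNode {
  id: string;
  parent?: string;
  children: string[];
}

interface LinkedNode {
  id: string;
  parent?: LinkedNode;
  children: LinkedNode[];
}

function linkEntities(
  collection: Record<string, FlatNode>,
  rootId: string
): LinkedNode {
  // First pass: create one linked node per id.
  const linked: Record<string, LinkedNode> = {};
  for (const id of Object.keys(collection)) {
    linked[id] = { id, children: [] };
  }
  // Second pass: replace ids with object references.
  for (const id of Object.keys(collection)) {
    const flat = collection[id];
    const node = linked[id];
    if (flat.parent !== undefined && flat.parent in linked) {
      node.parent = linked[flat.parent];
    }
    node.children = flat.children
      .filter((childId) => childId in linked)
      .map((childId) => linked[childId]);
  }
  const root = linked[rootId];
  if (!root) throw new Error(`unknown root id: ${rootId}`);
  return root;
}
```

One practical difference between the shapes: the id-based form can be passed to `JSON.stringify`, whereas the linked form contains parent/child cycles and cannot be serialized that way directly.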
### `Page | Database | Block`
The Notion API response for each entity type, stored under `metadata` with typed properties.
```ts
export interface Page {
  type: 'page';
  metadata: Extract<GetPageResponse, { last_edited_time: string }>;
}

export interface Database {
  type: 'database';
  metadata: Extract<GetDatabaseResponse, { last_edited_time: string }>;
}

export interface Block {
  type: 'block';
  metadata: Extract<GetBlockResponse, { type: string }>;
}
```
---
## Fetch Result Data Types
### `NotionPageTree.prototype.page_collection`
A key-value collection that maps each `id` to its `PlainEntity`.
```ts
page_collection: Record<string, FlatEntity> | undefined;
```
### `NotionPageTree.prototype.root`
The root `Entity`, with its descendant entities nested under `children`.
```ts
root: Entity | undefined;
```
### `NotionPageTree.prototype.search_index`
A `lunr.Index` built from the `blockContentPlainText` of the entities in `page_collection`.
```ts
search_index: lunr.Index | undefined;
```
### `NotionPageTree.prototype.search_suggestion`
Search tokens extracted from `lunr.Index`.
```ts
search_suggestion: string[] | undefined;
```
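
As an illustration of how a token list like `search_suggestion` could be used, the sketch below filters tokens by prefix. How the `/suggestion` endpoint actually matches keywords is not documented here, so prefix matching, the lower-casing, and the `suggest` helper with its `limit` parameter are all assumptions.

```typescript
// Return up to `limit` tokens that start with the typed keyword.
function suggest(tokens: string[], keyword: string, limit = 5): string[] {
  const prefix = keyword.trim().toLowerCase();
  if (prefix === '') return [];
  return tokens.filter((token) => token.startsWith(prefix)).slice(0, limit);
}

console.log(suggest(['notion', 'note', 'page'], 'No')); // [ 'notion', 'note' ]
```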
---
## How It Creates Fetch Queue
### NodeJS Promise Queue Example
Try it on StackBlitz: https://stackblitz.com/edit/react-sp2zy3?embed=1&file=src/job.js
<div style="overflow-x: scroll !important; white-space: pre-wrap !important; width: 100%">
<pre style="display: inline-block;">
<code>
┌──────────────────────────────────────────────┐
│             Request Ready Queue              │
│  ┌─────────────┬─────────────┐               │
│  │     job     │     job     │  ...          │
│  │ description │ description │               │
│  └─────────────┴─────────────┘               │
└──────────────────────┬───────────────────────┘
                       │
        If promise queue has empty slot
                       │
                       ▼
┌──────────────────────────────────────────────┐
│            Request Promise Queue             │
│  ┌───────────┐ ┌───────────┐ ┌ ─ ─ ─ ─ ─ ─ ┐ │
│  │ queryable │ │ queryable │   (Empty Slot)  │
│  │  promise  │ │  promise  │ │             │ │
│  └───────────┘ └───────────┘ └ ─ ─ ─ ─ ─ ─ ┘ │
└──────────────────────┬───────────────────────┘
                       │
             If promise is settled
                       │
                       ▼
                 ┌────────────┐
                 │  promise   │
                 │ handler()  │
                 └────────────┘
</code></pre></div>
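
The two-queue idea in the diagram can be sketched in a few lines. This is a minimal illustration rather than the library's code: `Job` and `runQueue` are hypothetical names, and retry handling and timers are left out to keep the concurrency logic visible.

```typescript
type Job<T> = () => Promise<T>;

// Run jobs with at most maxConcurrency promises pending at once.
function runQueue<T>(jobs: Job<T>[], maxConcurrency: number): Promise<T[]> {
  return new Promise((resolve, reject) => {
    // "Request Ready Queue": job descriptions waiting for a free slot.
    const ready = jobs.map((job, index) => ({ job, index }));
    const results: T[] = new Array(jobs.length);
    let pending = 0; // occupied slots in the "Request Promise Queue"

    const fill = (): void => {
      if (ready.length === 0 && pending === 0) {
        resolve(results);
        return;
      }
      // Move jobs into empty promise slots.
      while (pending < maxConcurrency && ready.length > 0) {
        const { job, index } = ready.shift()!;
        pending += 1;
        job().then((value) => {
          // The promise settled: store the result and free its slot.
          results[index] = value;
          pending -= 1;
          fill();
        }, reject);
      }
    };

    fill();
  });
}
```

Calling `runQueue(jobs, 3)` keeps at most three requests in flight, which matches the `maxConcurrency: 3` setting shown earlier.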
### Flowchart
<div style="overflow-x: scroll !important; white-space: pre-wrap !important; width: 100%">
<pre style="display: inline-block;">
<code>/******************************************************************************************************************************************************************************************************************************************************************************************************************************\
* *
* *
* *
* *
* ββββββββMain Routineββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ *
* β β *
* β β *
* β βββββββββββββββββββββββββ βββββββββββββββββββββββββ β *
* β βββββΆβ page_collection β βclearTimeout()ββββΆβ Request Promise Timer β β *
* β β βββββββββββββββββββββββββ β βββββββββββββββββββββββββ β *
* β β Ξ β β *
* β β βββββββββββββββββββββββββ β± β² β βββββββββββββββββββββββββ β *
* β βββββΆβ page_tree β β± β² βclearTimeout()ββββΆβ Request Ready Timer β β *
* β β βββββββββββββββββββββββββ β± β² β βββββββββββββββββββββββββ β *
* β β β± β² β β *
* β β βββββββββββββββββββββββββ β± check β² β βββββββββββββββββββ β *
* β βββββΆβ Request Promise Queue ββββββββ β± routine β² β ββββupdate ββββΆβ page_collection β β *
* β βββββββββββββββββββββββββ β βββββββββββββββββββββββββ β β± β» β² β β βββββββββββββββββββ β *
* β β Fetcher Routine ββββββ€ βββββββΆβ promise = 0 ββββ¬βtrueβββΌββββ€ β *
* β βββββββββββββββββββββββββ β βββββββββββββββββββββββββ β β² ready = 0 β± β β β βββββββββββββββββββ β *
* β β² βββββΆβ Request Ready Queue ββββββββ β² β» β± β β ββββupdateβββββΆβ page_tree β β *
* β β β βββββββββββββββββββββββββ β² 0ms β± β β βββββββββββββββββββ β *
* β β β β² β± β β β *
* β β β βββββββββββββββββββββββββ β² β± β β β *
* β β βββββΆβ Request Promise Timer β β² β± β ββββββββββββββββββββΆ wait for some minutes β β β β *
* β β β βββββββββββββββββββββββββ β² β± β β *
* β β β V β β β *
* β β β βββββββββββββββββββββββββ β² β β *
* β β βββββΆβ Request Ready Timer β βββfalseββββ β β *
* β β βββββββββββββββββββββββββ β *
* β β β β *
* β ββ β β β β β β β β β β β β β β β β β β β β β β β β β β β create new fetcher routine β β β β β β β β β β β β β β β β β β β β β β β β β β *
* β β *
* ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ *
* *
* β *
* *
* β *
* *
* ββββFetcher Routineββ»ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ *
* β ββ.plaintext += plaintextβββββββββββββββββββββββββββββββββββββββββββββββββββββ β *
* β β β β *
* β ββββββββββββββββββββββββββββ β ββββββββββββββββββββββββ β β *
* β βββfalseββββ β Promise (resolved) β ββ.children.push()β βConnector β β β *
* β ββββββββββββββββββββββββββββββββββ β β ββββββββββββββββββββββββββ β β β β ββββββββββββββββββββ β β β *
* β β Request Promise Queue β β β ββparentToAssign: Entity βββββ ββββββββββββββββββββ β βtoAssigned: itselfβ β β β *
* β β ββββββββββββββββββββββββββββ β β Ξ ββββββββββββββββββββββββββ β ββββββΆβ is Page/Database βββcreateβββΆβ ββββββββββββββββββββ ββββββββββββββββββββββββββββββββββ β *
* β β β Promise (pending) β β β β± β² ββββββββββββββββββββββββββ β β ββββββββββββββββββββ ββββββββββββββββββββββββ β β β *
* β βββββββββββββββββββββββββββββββββββ β β ββββββββββββββββββββββββββ β β β± β² ββparentToRequest: Entityβ β β β ββtoRequested: itself ββ β β β *
* β β New Requests ββββββββββββββββββββββββββββββββββββββββββββββββ¬βΆβparentToAssign: Entity ββ β β β± β² ββββββββββββββββ ββββββββββββββββββββββββββ β β β ββββββββββββββββββββββββ β β β *
* β β βββββββββββββββββββββββββββ β β ββββββββββββββββββββββββββββββββββββββββ β β ββββββββββββββββββββββββββ...β β β± β² ββββΆβ is Fulfilled ββββΆββββββββββββββββββββββββββ β β β ββββββββββββββββββββββββ β β β *
* β β βββββββββββββββββββββββββββ β β β β β β ββββββββββββββββββββββββββ β β β± set β² β ββββββββββββββββ ββ children: β β β β β β β *
* β β ββparentToAssign: Entity ββ ββββ β β βββββββββββββββ List Block βββββ¬βΆβparentToRequest: Entityββ β β β± interval β² β ββ QueryablePromise βββ¬ββββ€ β ββββββββββββββββββββ β β *
* β β βββββββββββββββββββββββββββ β ββββΆβis Block/PageβββββΆChildren()ββββββ β β ββββββββββββββββββββββββββ β β β± β» β² β ββ Entity[] β β β β extractPlainText β Concatenated β β β *
* β β βββββββββββββββββββββββββββ ... β β βββββββββββββββ β β β ββββββββββββββββββββββββββ βββ΄ββΆβ promise > 0 βββββ€ ββββββββββββββββββββββββββ β β β βββΆ FromBlock βββββΆ.reduce()ββββΆβ Plain Text β β β *
* β β ββparentToRequest: Entityββ ββββββββ .push()β β β children: ββ β β² isSetteled β± β ββββββββββββββββ ββββββββββββββββββββββββββ β β β β ββββββββββββββββββββ β β *
* β β βββββββββββββββββββββββββββ β β βββββββββββββββ Query ββββββ¬βΆβ QueryablePromise ββ β β² β» β± ββββΆβ is Rejected β ββ retry: number β β β β β β β *
* β β βββββββββββββββββββββββββββ β ββββΆβ is Database βββββΆDatabase()ββββββ β β β Entity[] ββ β β² 0ms β± ββββββββββββββββ ββββββββββββββββββββββββββ β β ββββββββββββββββββββ β β β *
* β β ββ retry: number ββ β βββββββββββββββ β β ββββββββββββββββββββββββββ β β² β± β ββββββββββββββββββββββββββββ ββββββΆβ is Block β ββββ€ ββββββββββββββββββββ β β *
* β β βββββββββββββββββββββββββββ β β β ββββββββββββββββββββββββββ β β² β± β ββββββββββββββββββββ β β is in Traverse β β β *
* β β βββββββββββββββββββββββββββ β β β β retry: number ββ β β² β± β set maxConcurrency to 0 β β ββββββΆβ Exclusion List β βββββββββββββ¬βββββββββββ β *
* β ββββββββββββββββ²βββββββββββββββββββ β β ββββββββββββββββββββββββββ β β² β± βΌ empty promise_queue β β β ββββββββββββββββββββ βConnector β β β *
* β β β ββββββββββββββββββββββββββββ β V rate_limitedββtrueβββΆ move promises to ready_queue β βββΆ.filter()βββββ€ ββββββββββββββββββββ β βββββββββββ΄βββββββββ β β *
* β β ββββββββββββββββββββββββββββββββββ β wait for some minutes β β βis NOT in Traverseβ β βtoAssigned: PARENTβ β β *
* β β β set maxConcurrency to 3 β ββββββΆβ Exclusion List βββcreateβββΆβ βββββββββββ¬βββββββββ β β *
* β β βββββββββββββββββββββββββββ false β ββββββββββββββββββββ βββββββββββββ΄βββββββββββ β *
* β β βββββββββββββββββββββββββββ β β ββtoRequested: itself ββ β *
* β .splice(concurrency - ββparentToAssign: Entity ββ β β βββββββββββββ¬βββββββββββ β *
* β promise.length) βββββββββββββββββββββββββββ βΌ β βββββββββββββ¬βββββββββββ β *
* β β βββββββββββββββββββββββββββ retry < β β β *
* β β βββββββββparentToRequest: Entityββββββtrueβββββ retryCount ββfalseβββΆ console.error β β β *
* β β β βββββββββββββββββββββββββββ β β β *
* β β β βββββββββββββββββββββββββββ β β β *
* β β βββfalseββββ β ββ retry: +=1 ββ β β β *
* β β β β β βββββββββββββββββββββββββββ β β β *
* β β β β .unshift() βββββββββββββββββββββββββββ β β β *
* β β Ξ β βΌ β β β *
* β β β± β² β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β β β *
* β β β± β² β β Request Ready Queue β β β β *
* β β β± β² β β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β β β β *
* β β β± β² β β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β β β β *
* β β β± check β² β β ββparentToAssign: Entity ββ ββparentToAssign: Entity ββ β β β β *
* β β β± routine β² β β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β β β β *
* β β β± β» β² β β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ...β β β β *
* β βββββββtrueββββββββ ready > 0 ββββ΄βββββββββββββ ββparentToRequest: Entityββ ββparentToRequest: Entityββ ββββββββββββββ.push()βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β *
* β β² promise < 3 β± β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β β β *
* β β² β± β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β β β *
* β β² β» β± β ββ retry: 0 ββ ββ retry: 0 ββ β β β *
* β β² 250ms β± β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β β β *
* β β² β± β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β β β *
* β β² β± βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β β *
* β β² β± β² β β *
* β V β ββββββββββββββββββββββββββββ β β *
* β (init) β Page Collection β β β *
* β β β ββββββ¦ββββββββββββββββββ β β β *
* β β β β id β page: Entity β β β β *
* β β β ββββββ©ββββββββββββββββββ β β β *
* β ββββββββββββββββββββ β ββββββ¦ββββββββββββββββββ β β β *
* β β ROOT ββββ(init)ββΆβ β id β page: Entity β ββββββββββββββassign with keyβββββββββββββββββββββββββββββββββββββββββ β *
* β ββββββββββββββββββββ β ββββββ©ββββββββββββββββββ β β *
* β β β ββββββ¦ββββββββββββββββββ β β *
* β β β id β page: Entity β β β *
* β β β ββββββ©ββββββββββββββββββ β β *
* β β .... β β *
* β β ββββββββββββββββββββββββββββ β *
* β β β *
* β β β β β β β β β β¬ β β β β β β β β β β *
* β β *
* β β β *
* β ββββββββββββββββββββββββ β *
* β β Entity β β *
* β ββββββββββββββββββββββββ β *
* β ββ id: string ββ β *
* β ββββββββββββββββββββββββ