icebird
Version:
Apache Iceberg client for javascript
117 lines (89 loc) • 3.52 kB
Markdown

[](https://www.npmjs.com/package/icebird)
[](https://www.npmjs.com/package/icebird)
[](https://github.com/hyparam/icebird/actions)
[](https://opensource.org/licenses/MIT)

Icebird is a library for reading [Apache Iceberg](https://iceberg.apache.org/) tables in JavaScript. It is built on top of [hyparquet](https://github.com/hyparam/hyparquet) for reading the underlying parquet files.
To read an Iceberg table:
```javascript
const { icebergRead } = await import('icebird')
const tableUrl = 'https://s3.amazonaws.com/hyperparam-iceberg/spark/bunnies'
const data = await icebergRead({
tableUrl,
rowStart: 0,
rowEnd: 10,
})
```
To read the Iceberg metadata (schema, etc):
```javascript
import { icebergMetadata } from 'icebird'
const metadata = await icebergMetadata({ tableUrl })
// subsequent reads will be faster if you provide the metadata:
const data = await icebergRead({
tableUrl,
metadata,
})
```
Check out a minimal iceberg table viewer demo that shows how to integrate Icebird into a react web application using [HighTable](https://github.com/hyparam/hightable) to render the table data. You can view any publicly accessible Iceberg table:
- **Live Demo**: [https://hyparam.github.io/demos/icebird/](https://hyparam.github.io/demos/icebird/)
- **Demo Source Code**: [https://github.com/hyparam/demos/tree/master/icebird](https://github.com/hyparam/demos/tree/master/icebird)
To fetch a previous version of the table, you can specify `metadataFileName`:
```javascript
import { icebergRead } from 'icebird'
const data = await icebergRead({
tableUrl,
metadataFileName: 'v1.metadata.json',
})
```
You can add authentication to all http requests by passing a `requestInit` argument that will be passed to `fetch`:
```javascript
import { icebergRead } from 'icebird'
const data = await icebergRead({
tableUrl,
requestInit: {
headers: {
Authorization: 'Bearer my_token',
},
}
})
```
Icebird aims to support reading any Iceberg table, but currently only supports a subset of the features. The following features are supported:
| Feature | Supported |
| ------- | --------- |
| Read Iceberg v1 Tables | ✅ |
| Read Iceberg v2 Tables | ✅ |
| Read Iceberg v3 Tables | ❌ |
| Parquet Storage | ✅ |
| Avro Storage | ✅ |
| ORC Storage | ❌ |
| Puffin Storage | ❌ |
| File-based Catalog (version-hint.text) | ✅ |
| REST Catalog | ❌ |
| Hive Catalog | ❌ |
| Glue Catalog | ❌ |
| Service-based Catalog | ❌ |
| Position Deletes | ✅ |
| Equality Deletes | ✅ |
| Binary Deletion Vectors | ❌ |
| Rename Columns | ✅ |
| Efficient Partitioned Read Queries | ❌ |
| All Parquet Compression Codecs | ✅ |
| All Parquet Types | ✅ |
| Variant Types | ❌ |
| Geometry Types | ❌ |
| Geography Types | ❌ |
| Sorting | ❌ |
| Encryption | ❌ |
## References
- https://iceberg.apache.org/spec/
- https://avro.apache.org/docs/1.12.0/specification/
- https://github.com/hyparam/hyparquet
- https://github.com/apache/iceberg
- https://github.com/apache/iceberg-python