# Firestore to BigQuery export
NPM package for copying and converting [Cloud Firestore](https://firebase.google.com/docs/firestore/) data to [BigQuery](https://cloud.google.com/bigquery/docs/).
<p align="center">
<a href="LICENSE">
<img src="https://img.shields.io/badge/license-MIT-brightgreen.svg?" alt="Software License" />
</a>
<a href="https://npmjs.org/package/firestore-to-bigquery-export">
<img src="https://img.shields.io/npm/v/firestore-to-bigquery-export.svg?" alt="npm version" />
</a>
<a href="https://npmjs.org/package/firestore-to-bigquery-export">
<img src="https://img.shields.io/npm/dm/firestore-to-bigquery-export.svg?" alt="npm downloads" />
</a>
<a href="https://github.com/Johannes-Berggren/firestore-to-bigquery-export/issues">
<img src="https://img.shields.io/github/issues/Johannes-Berggren/firestore-to-bigquery-export.svg?" alt="Issues" />
</a>
</p>
Firestore is awesome. BigQuery is awesome. But transferring data from Firestore to BigQuery sucks.
This package lets you plug and play your way out of config hell.
- Create a BigQuery dataset with tables corresponding to your Firestore collections.
- Table schemas are automatically generated based on your document property data types.
- Convert and copy your Firestore collections to BigQuery.
This package doesn't write anything to Firestore.
## Contents
* [Installation](#installation)
* [How to](#how-to)
  + [API](#api)
  + [Examples](#examples)
* [Keep in mind](#keep-in-mind)
* [Limitations](#limitations)
* [Issues](#issues)
* [To-do](#to-do)
## Installation
> npm i firestore-to-bigquery-export
```javascript
import bigExport from 'firestore-to-bigquery-export'
// or
const bigExport = require('firestore-to-bigquery-export')
// then
const GCPSA = require('./Your-Service-Account-File.json')
bigExport.setBigQueryConfig(GCPSA)
bigExport.setFirebaseConfig(GCPSA)
```
## How to
### API
```javascript
bigExport.setBigQueryConfig(
serviceAccountFile // JSON
)
```
```javascript
bigExport.setFirebaseConfig(
serviceAccountFile // JSON
)
```
```javascript
bigExport.createBigQueryTable(
datasetID, // String
collectionName, // String
verbose // boolean
)
// returns Promise<Array>
```
```javascript
bigExport.copyToBigQuery(
datasetID, // String
collectionName, // String
snapshot // firebase.firestore.QuerySnapshot
)
// returns Promise<number>
```
```javascript
bigExport.deleteBigQueryTable(
datasetID, // String
tableName // String
)
// returns Promise<Array>
```
### Examples
```javascript
/* Create table 'accounts' in BigQuery dataset 'firestore'. You have to create the dataset beforehand.
 * The given table name has to match the Firestore collection name.
 * The table schema will be autogenerated based on the data types found in the collection's documents.
 */
await bigExport.createBigQueryTable('firestore', 'accounts')
```
Then, you can transport your data:
```javascript
/* Copying and converting all documents in the given Firestore collection snapshot.
 * Each document is inserted as a row in the table with the same name as the collection, in the dataset named 'firestore'.
 * Cells (document properties) that don't match the table schema will be rejected.
 */
// `firestore` is assumed to be an initialized Firestore instance, e.g. admin.firestore() from firebase-admin.
const snapshot = await firestore.collection('payments').get()
const result = await bigExport.copyToBigQuery('firestore', 'payments', snapshot)
console.log('Copied ' + result + ' documents to BigQuery.')
/*
 * You can copy multiple collections in one run, one after another, like this.
 * If you get error messages, you should probably copy fewer collections at a time.
 */
const collectionNames = ['payments', 'profiles', 'ratings', 'users']
for (const name of collectionNames) {
  const snapshot = await firestore.collection(name).get()
  await bigExport.copyToBigQuery('firestore', name, snapshot)
}
```
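If copying many collections at once produces errors, one option is to cap how many run concurrently. The following is a minimal sketch, not part of the package: `copyInBatches` and its `batchSize` parameter are hypothetical names, and `bigExport` and `firestore` are passed in as arguments on the assumption that they are already configured.

```javascript
// Hypothetical helper (not part of firestore-to-bigquery-export).
// Copies collections in groups of `batchSize`, waiting for each group
// to finish before starting the next one.
async function copyInBatches (bigExport, firestore, datasetID, collectionNames, batchSize = 2) {
  const copied = {}
  for (let i = 0; i < collectionNames.length; i += batchSize) {
    const batch = collectionNames.slice(i, i + batchSize)
    // Collections within one batch are fetched and copied concurrently.
    const counts = await Promise.all(batch.map(async name => {
      const snapshot = await firestore.collection(name).get()
      return bigExport.copyToBigQuery(datasetID, name, snapshot)
    }))
    batch.forEach((name, j) => { copied[name] = counts[j] })
  }
  // Maps each collection name to the value copyToBigQuery resolved with.
  return copied
}
```

Lowering `batchSize` to 1 gives the same sequential behaviour as the loop above.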
After that, you may want to refresh your data. For the time being, the quick and dirty way is to delete your tables and make new ones:
```javascript
// Deleting the given BigQuery table.
await bigExport.deleteBigQueryTable('firestore', 'accounts')
```
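Put together, a full refresh of one table could look roughly like this. It is only a sketch: `refreshTable` is a hypothetical wrapper, `bigExport` and `firestore` are assumed to be configured already, and the table is assumed to exist before it is deleted.

```javascript
// Hypothetical wrapper (not part of the package): delete, recreate, and refill one table.
async function refreshTable (bigExport, firestore, datasetID, collectionName) {
  // Drop the old table. Assumes it exists; this sketch does not handle a missing table.
  await bigExport.deleteBigQueryTable(datasetID, collectionName)
  // Recreate it so the schema is regenerated from the current documents.
  await bigExport.createBigQueryTable(datasetID, collectionName)
  // Copy the current contents of the collection into the fresh table.
  const snapshot = await firestore.collection(collectionName).get()
  return bigExport.copyToBigQuery(datasetID, collectionName, snapshot)
}
```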
## Keep in mind
* If there's even one prop value that's a FLOAT in your collection during schema generation, the column will be set to FLOAT.
* If there are ONLY INTs, the column will be set to INTEGER.
* All columns will be NULLABLE.
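
The numeric rule above can be illustrated with a small sketch. This is not the package's actual implementation: `inferNumericColumnType` and `describeColumn` are hypothetical helpers, and the values are assumed to be plain JavaScript numbers.

```javascript
// Hypothetical illustration of the INTEGER/FLOAT rule (not the package's code).
function inferNumericColumnType (values) {
  const numbers = values.filter(value => typeof value === 'number')
  if (numbers.length === 0) return null // no numeric values seen
  // A single non-integer value is enough to make the whole column FLOAT.
  return numbers.every(value => Number.isInteger(value)) ? 'INTEGER' : 'FLOAT'
}

// Every column is NULLABLE, regardless of its type.
function describeColumn (name, values) {
  return { name, type: inferNumericColumnType(values), mode: 'NULLABLE' }
}

console.log(describeColumn('amount', [10, 25, 3]))
console.log(describeColumn('rating', [4, 4.5, 5]))
```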
## Limitations
* Your Firestore data model should be consistent. If a property has different data types across documents in the same collection, you'll get errors.
* Patching existing BigQuery tables isn't supported (yet). To refresh a table, call `deleteBigQueryTable()`, then `createBigQueryTable()`, and then `copyToBigQuery()`.
* Changed your Firestore data model? Delete the corresponding BigQuery table and run `createBigQueryTable()` to create a table with a new schema.
* When running this package via a Cloud Function, the function may time out (Deadline Exceeded) if your Firestore database is large. You can then:
  * Increase the timeout for your Cloud Function in the [Google Cloud Platform Cloud Function Console](https://console.cloud.google.com/functions).
  * Run your function locally, using `firebase serve --only functions`.
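
To catch the first limitation before exporting, you could scan a collection for properties whose type varies between documents. The sketch below is an assumption-laden illustration, not part of the package: `findInconsistentProperties` is a hypothetical helper that works on plain objects and only looks at top-level properties, using JavaScript's `typeof` (plus an `'array'` label) rather than BigQuery types.

```javascript
// Hypothetical pre-flight check (not part of the package).
function typeOf (value) {
  return Array.isArray(value) ? 'array' : typeof value
}

// Returns an object mapping each property name to the list of types seen for it,
// including only properties that were seen with more than one type.
function findInconsistentProperties (docs) {
  const typesByProp = new Map()
  for (const doc of docs) {
    for (const [prop, value] of Object.entries(doc)) {
      // Missing values are skipped, since all columns are NULLABLE.
      if (value === null || value === undefined) continue
      if (!typesByProp.has(prop)) typesByProp.set(prop, new Set())
      typesByProp.get(prop).add(typeOf(value))
    }
  }
  const inconsistent = {}
  for (const [prop, types] of typesByProp) {
    if (types.size > 1) inconsistent[prop] = [...types].sort()
  }
  return inconsistent
}
```

With a Firestore `QuerySnapshot`, the documents can be passed in as `snapshot.docs.map(doc => doc.data())`.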
## Issues
Please use the [issue tracker](https://github.com/Johannes-Berggren/firestore-to-bigquery-export/issues).
## To-do
* Improve the handling of arrays.
* Implement patching of tables.