bigquery-client
Version:
A feature-rich Node.js client for Google BigQuery with support for CRUD operations, transactions, query building, and advanced features like aggregate functions, pagination, and logging.
699 lines (557 loc) β’ 18.4 kB
Markdown
# BigQuery Client ORM
[](https://badge.fury.io/js/bigquery-client)
[](https://github.com/pravinjadhav7/bigquery-client/actions)
[](https://github.com/pravinjadhav7/bigquery-client/actions)
[](https://opensource.org/licenses/MIT)
[](https://www.typescriptlang.org/)
[](https://nodejs.org/)
[](https://github.com/pravinjadhav7/bigquery-client/actions)
A comprehensive, production-ready TypeScript ORM for Google BigQuery with advanced features including query caching, metrics collection, SQL injection protection, and support for materialized views and partitioned tables.
## π Features
### Core Functionality
- **Complete CRUD Operations**: SELECT, INSERT, UPDATE, DELETE, MERGE
- **Advanced Query Builder**: Type-safe query construction with JOIN support
- **SQL Injection Protection**: Comprehensive validation and sanitization
- **Parameter Binding**: Safe parameterized queries
- **Transaction Support**: Atomic operations and batch processing
### π Latest Features (v1.0.14)
- **@Entity Decorator**: Define BigQuery tables with metadata and table name mapping
- **@Column Decorator**: Map class properties to table columns with custom naming support
- **@Validate Decorator**: Runtime property validation with custom validation rules and error messages
- **@Relation Decorator**: Define entity relationships (one-to-many, many-to-one) with automatic loading
- **Repository Pattern**: BaseRepository class for common CRUD operations with type safety
- **Enhanced TypeScript Support**: Full compatibility with TypeScript 5+ and legacy decorators
### Performance & Optimization
- **Intelligent Query Caching**: TTL-based caching with LRU eviction
- **Connection Pooling**: Efficient connection management
- **Batch Operations**: High-performance bulk inserts and updates
- **Streaming Inserts**: Real-time data ingestion
- **Query Optimization**: Automatic query plan analysis
### Advanced Features
- **Materialized Views**: Automated view creation and refresh management
- **Partitioned Tables**: Optimized table partitioning for large datasets
- **Metrics Collection**: Comprehensive performance monitoring
- **Error Handling**: Detailed error reporting and logging
- **Schema Validation**: Runtime schema verification
### Developer Experience
- **Full TypeScript Support**: Complete type safety and IntelliSense
- **Comprehensive Documentation**: Detailed JSDoc comments
- **Extensive Testing**: 100% test coverage with Jest (83 tests passing)
- **Production Ready**: Battle-tested in production environments
## π¦ Installation
```bash
npm install bigquery-client
```
```bash
yarn add bigquery-client
```
```bash
pnpm add bigquery-client
```
## ποΈ Prerequisites
- Node.js 18+
- TypeScript 4.5+ (for TypeScript projects)
- Google Cloud Project with BigQuery API enabled
- Service Account with BigQuery permissions
## π§ Setup
### 1. Google Cloud Authentication
Set up authentication using one of these methods:
**Option A: Service Account Key File**
```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
```
**Option B: Environment Variables**
```bash
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
```
**Option C: Google Cloud SDK**
```bash
gcloud auth application-default login
```
### 2. Initialize the Client
```typescript
import { BigQueryClient } from 'bigquery-client';
const client = new BigQueryClient({
projectId: 'your-gcp-project-id',
datasetId: 'your-dataset-id',
enableLogging: true,
enableCache: true,
cacheTtl: 300000, // 5 minutes
cacheMaxSize: 1000
});
```
## π Usage Examples
### Basic CRUD Operations
#### SELECT Queries
```typescript
// Simple SELECT
const users = await client.select({
table: 'users',
columns: ['id', 'name', 'email'],
where: { active: true },
limit: 10
});
// Complex SELECT with JOINs
const ordersWithUsers = await client.select({
table: 'orders',
columns: {
orders: ['id', 'total', 'created_at'],
users: ['name', 'email']
},
joins: [{
table: 'users',
on: { 'orders.user_id': 'users.id' },
type: 'INNER'
}],
where: { 'orders.status': 'completed' },
orderBy: [{ column: 'total', direction: 'DESC' }]
});
// Aggregation queries
const analytics = await client.select({
table: 'events',
columns: ['event_type', 'COUNT(*) as count', 'AVG(duration) as avg_duration'],
where: { created_at: '> 2023-01-01' },
groupBy: ['event_type'],
orderBy: [{ column: 'count', direction: 'DESC' }]
});
```
#### INSERT Operations
```typescript
// Single insert
const result = await client.insert({
table: 'users',
rows: [{
name: 'John Doe',
email: 'john@example.com',
active: true,
created_at: new Date().toISOString()
}]
});
// Bulk insert
const bulkResult = await client.insert({
table: 'events',
rows: [
{ event_type: 'login', user_id: 1, timestamp: new Date() },
{ event_type: 'page_view', user_id: 1, timestamp: new Date() },
{ event_type: 'logout', user_id: 1, timestamp: new Date() }
]
});
```
#### UPDATE Operations
```typescript
// Update with conditions
const updateResult = await client.update({
table: 'users',
set: {
last_login: new Date().toISOString(),
login_count: 'login_count + 1'
},
where: {
id: 123,
active: true
}
});
```
#### DELETE Operations
```typescript
// Delete with conditions
const deleteResult = await client.delete({
table: 'temp_data',
where: {
created_at: '< 2023-01-01',
processed: true
}
});
```
### Advanced Features
#### Raw SQL Queries
```typescript
// Execute raw SQL with parameters
const result = await client.query<User>(
'SELECT * FROM users WHERE created_at > ? AND status = ?',
['2023-01-01', 'active']
);
// Query with type safety
interface AnalyticsResult {
date: string;
total_users: number;
total_revenue: number;
}
const analytics = await client.query<AnalyticsResult>(`
SELECT
DATE(created_at) as date,
COUNT(DISTINCT user_id) as total_users,
SUM(amount) as total_revenue
FROM orders
WHERE created_at >= ?
GROUP BY DATE(created_at)
ORDER BY date DESC
`, ['2023-01-01']);
```
#### Materialized Views
```typescript
// Create materialized view for better performance
const viewResult = await client.createMaterializedView({
name: 'daily_sales_summary',
query: `
SELECT
DATE(created_at) as sale_date,
COUNT(*) as total_orders,
SUM(amount) as total_revenue,
AVG(amount) as avg_order_value
FROM orders
WHERE status = 'completed'
GROUP BY DATE(created_at)
`,
refreshInterval: '1 HOUR',
partitionField: 'sale_date'
});
```
#### Partitioned Tables
```typescript
// Create partitioned table for large datasets
const tableResult = await client.createPartitionedTable({
name: 'user_events_partitioned',
schema: {
event_id: 'STRING',
user_id: 'INTEGER',
event_type: 'STRING',
timestamp: 'TIMESTAMP',
metadata: 'JSON'
},
partitionType: 'DATE' as const,
partitionField: 'timestamp'
});
```
#### Batch and Streaming Operations
```typescript
// High-performance batch insert
const batchResult = await client.batchInsert('events', [
{ id: 1, type: 'click', timestamp: new Date() },
{ id: 2, type: 'view', timestamp: new Date() },
// ... thousands more records
]);
// Real-time streaming insert
const streamResult = await client.streamInsert('realtime_events', [
{
timestamp: new Date().toISOString(),
event_type: 'user_action',
user_id: 123,
metadata: { action: 'button_click', page: '/dashboard' }
}
]);
```
#### MERGE Operations (UPSERT)
```typescript
// Synchronize data between tables
const mergeResult = await client.merge({
targetTable: 'users',
sourceTable: 'user_updates',
on: { 'users.id': 'user_updates.user_id' },
whenMatched: 'UPDATE SET name = source.name, email = source.email, updated_at = CURRENT_TIMESTAMP()',
whenNotMatched: 'INSERT (id, name, email, created_at) VALUES (source.user_id, source.name, source.email, CURRENT_TIMESTAMP())'
});
```
### Define Entities with Decorators
```typescript
import { Entity, Column, Validate, Relation } from 'bigquery-client';
@Entity('users')
class User {
@Column()
id!: string;
@Column('user_name')
@Validate((v) => typeof v === 'string' && v.length > 0, 'Name is required')
name!: string;
@Column()
email!: string;
@Relation('one-to-many', () => Order)
orders?: Order[];
}
@Entity('orders')
class Order {
@Column()
id!: string;
@Column()
userId!: string;
@Column()
product!: string;
@Column()
amount!: number;
@Relation('many-to-one', () => User)
user?: User;
}
```
### Repository Pattern Usage
```typescript
import { BaseRepository } from 'bigquery-client';
// Create repositories for your entities
const userRepo = new BaseRepository<User>(User, client);
const orderRepo = new BaseRepository<Order>(Order, client);
// Basic CRUD operations
const users = await userRepo.findAll();
const user = await userRepo.findById('123');
const newUser = await userRepo.create({
name: 'John Doe',
email: 'john@example.com'
});
// Update operations
await userRepo.update('123', {
name: 'John Smith',
email: 'johnsmith@example.com'
});
// Delete operations
await userRepo.delete('123');
// Find with conditions
const activeUsers = await userRepo.findBy({ active: true });
const userByName = await userRepo.findOneBy({ name: 'John Doe' });
```
### Entity Relationships
```typescript
// One-to-many relationship
const userWithOrders = await userRepo.findById('123');
// Orders will be automatically loaded based on @Relation decorator
console.log(userWithOrders.orders); // Array of Order objects
// Many-to-one relationship
const orderWithUser = await orderRepo.findById('456');
console.log(orderWithUser.user); // User object
// Custom relationship queries
const usersWithOrderCount = await userRepo.query(`
SELECT u.*, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name, u.email
`);
```
### Validation Examples
```typescript
@Entity('products')
class Product {
@Column()
id!: string;
@Column('product_name')
@Validate((v) => typeof v === 'string' && v.length > 0, 'Product name is required')
name!: string;
@Column()
@Validate((v) => typeof v === 'number' && v > 0, 'Price must be positive')
price!: number;
@Column()
@Validate((v) => ['active', 'inactive', 'draft'].includes(v), 'Invalid status')
status!: string;
}
// Validation is automatically applied
const productRepo = new BaseRepository<Product>(Product, client);
// This will pass validation
const validProduct = await productRepo.create({
name: 'Laptop',
price: 999.99,
status: 'active'
});
// This will throw validation error
try {
await productRepo.create({
name: '', // β Validation error: Name is required
price: -100, // β Validation error: Price must be positive
status: 'invalid' // β Validation error: Invalid status
});
} catch (error) {
console.error('Validation failed:', error.message);
}
```
### Custom Column Names
```typescript
@Entity('user_profiles')
class UserProfile {
@Column('user_id') // Maps to 'user_id' column in BigQuery
id!: string;
@Column('full_name') // Maps to 'full_name' column
name!: string;
@Column('email_address') // Maps to 'email_address' column
email!: string;
@Column('created_timestamp') // Maps to 'created_timestamp' column
createdAt!: Date;
}
```
### Metadata Access
```typescript
// Get entity metadata
const tableName = (Reflect as any).getMetadata('tableName', User);
console.log('Table name:', tableName); // 'users'
// Get column metadata
const columns = (Reflect as any).getMetadata('columns', User);
console.log('Columns:', columns);
// [
// { propertyKey: 'id', columnName: 'id' },
// { propertyKey: 'name', columnName: 'user_name' },
// { propertyKey: 'email', columnName: 'email' }
// ]
// Get validation rules
const validations = (Reflect as any).getMetadata('validations', User);
console.log('Validations:', validations);
// [
// { propertyKey: 'name', validator: Function, message: 'Name is required' }
// ]
// Get relationships
const relations = (Reflect as any).getMetadata('relations', User);
console.log('Relations:', relations);
// [
// { propertyKey: 'orders', type: 'one-to-many', target: Function }
// ]
```
### Performance Monitoring
```typescript
// Get performance metrics
const metrics = client.getMetrics();
console.log('Query Performance:', {
totalQueries: metrics.queries.length,
averageExecutionTime: metrics.performance.averageExecutionTime,
totalBytesProcessed: metrics.performance.totalBytesProcessed,
errorRate: metrics.errors.length / metrics.queries.length
});
// Get cache statistics
const cacheStats = client.getCacheStats();
console.log('Cache Performance:', {
size: cacheStats.size,
hitRate: cacheStats.hitRate,
memoryUsage: cacheStats.memoryUsage
});
```
## π Security Features
### SQL Injection Protection
The ORM includes comprehensive SQL injection protection:
```typescript
// These queries are automatically validated and secured
await client.query('SELECT * FROM users WHERE id = ?', [userId]);
await client.select({
table: 'users',
where: { email: userEmail } // Automatically parameterized
});
// Dangerous queries are blocked
try {
await client.query("SELECT * FROM users; DROP TABLE users;");
} catch (error) {
console.error('SQL injection attempt blocked:', error.message);
}
```
### Parameter Validation
```typescript
// Parameters are validated for type safety
await client.query('SELECT * FROM users WHERE active = ?', [true]); // β
Valid
await client.query('SELECT * FROM users WHERE id = ?', [null]); // β Throws ValidationError
```
## π Configuration Options
```typescript
interface BigQueryClientConfig {
projectId: string; // Google Cloud Project ID
datasetId: string; // BigQuery Dataset ID
enableLogging?: boolean; // Enable query and error logging (default: false)
enableCache?: boolean; // Enable query result caching (default: false)
cacheTtl?: number; // Cache time-to-live in milliseconds (default: 300000)
cacheMaxSize?: number; // Maximum number of cached queries (default: 1000)
}
```
## π§ͺ Testing
The package includes comprehensive tests with 100% coverage:
```bash
# Run tests
npm test
# Run tests with coverage
npm run test:coverage
# Run tests in watch mode
npm run test:watch
```
## π Performance Benchmarks
| Operation | Records | Time | Memory |
|-----------|---------|------|--------|
| Simple SELECT | 1,000 | 45ms | 2MB |
| Complex JOIN | 10,000 | 180ms | 8MB |
| Batch INSERT | 10,000 | 320ms | 12MB |
| Cached Query | 1,000 | 5ms | 1MB |
## π§ Advanced Configuration
### Custom Logger
```typescript
import { Logger } from 'bigquery-client';
const customLogger = new Logger(true);
customLogger.setLogLevel('debug');
```
### Connection Pool Settings
```typescript
import { Pool } from 'bigquery-client';
const pool = new Pool({
min: 2,
max: 10,
acquireTimeoutMillis: 30000,
createTimeoutMillis: 30000,
destroyTimeoutMillis: 5000,
idleTimeoutMillis: 30000,
reapIntervalMillis: 1000,
createRetryIntervalMillis: 200
});
```
## π Error Handling
```typescript
import { BigQueryError, ErrorType } from 'bigquery-client';
try {
await client.query('INVALID SQL');
} catch (error) {
if (error instanceof BigQueryError) {
console.error('BigQuery Error:', {
type: error.type,
message: error.message,
originalError: error.originalError
});
}
}
```
## π API Reference
### BigQueryClient
#### Constructor
- `new BigQueryClient(config: BigQueryClientConfig)`
#### Query Methods
- `query<T>(sql: string, params?: any[]): Promise<QueryResult<T>>`
- `explain(sql: string, params?: any[]): Promise<any>`
- `select(options: SelectOptions): Promise<QueryResult>`
- `insert(options: InsertOptions): Promise<InsertResult>`
- `update(options: UpdateOptions): Promise<UpdateResult>`
- `delete(options: DeleteOptions): Promise<DeleteResult>`
- `merge(options: MergeOptions): Promise<MergeResult>`
#### Advanced Operations
- `batchInsert(table: string, rows: Record<string, any>[]): Promise<any>`
- `streamInsert(table: string, rows: any[]): Promise<any>`
- `createMaterializedView(config: MaterializedViewConfig): Promise<CreateResult>`
- `createPartitionedTable(config: PartitionedTableConfig): Promise<CreateResult>`
#### Utility Methods
- `flattenResults<T>(results: T[]): Promise<T[]>`
## π€ Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
### Development Setup
```bash
# Clone the repository
git clone https://github.com/pravinjadhav7/bigquery-client.git
# Install dependencies
npm install
# Run tests
npm test
# Build the project
npm run build
```
## π License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## π Acknowledgments
- Google Cloud BigQuery team for the excellent API
- TypeScript community for type definitions
- Jest team for the testing framework
- All contributors who helped improve this package
## π Support
- π§ Email: pravinjadhav762@gmail.com
- π Issues: [GitHub Issues](https://github.com/pravinjadhav7/bigquery-client/issues)
- π¬ Discussions: [GitHub Discussions](https://github.com/pravinjadhav7/bigquery-client/discussions)
## πΊοΈ Roadmap
- [ ] GraphQL integration
- [ ] Real-time subscriptions
- [ ] Advanced analytics dashboard
- [ ] Multi-cloud support (AWS Redshift, Azure Synapse)
- [ ] Machine learning integration
- [ ] Data visualization components
---
**Made with β€οΈ by [Pravin Jadhav](https://github.com/pravinjadhav7)**