@flightstream/core-server
Version:
Core Apache Arrow Flight server framework with plugin architecture for Node.js
260 lines (186 loc) • 6.68 kB
Markdown
# Arrow Flight Server Core Package
A generic Apache Arrow Flight server framework with plugin architecture for Node.js applications.
## Features
- Generic gRPC server setup with Arrow Flight protocol
- Plugin architecture for data source adapters
- Standard protocol handler delegation
- Configurable server options
- Lifecycle management (start, stop, graceful shutdown)
- **Configurable logging system**
## Quick Start
```javascript
import { FlightServer } from '@flightstream/core-server';
import { MyFlightService } from './my-flight-service.js';
// Create server instance
const server = new FlightServer({
host: 'localhost',
port: 8080
});
// Set your flight service
server.setFlightService(new MyFlightService());
// Start the server
await server.start();
```
## Installation
```bash
npm install @flightstream/core-server
```
**Requirements:**
- Node.js >= 18.0.0
- Apache Arrow >= 14.0.0 (peer dependency)
## Logging Configuration
The Flight Server package supports configurable logging. You can set a custom logger once and it will be used throughout the entire package.
### Using a Custom Logger
```javascript
import { FlightServer } from '@flightstream/core-server';
import winston from 'winston';
// Create a custom logger
const logger = winston.createLogger({
level: 'info',
format: winston.format.json(),
transports: [
new winston.transports.File({ filename: 'flight-server.log' }),
new winston.transports.Console()
]
});
// Set the logger for the entire package
FlightServer.setLogger(logger);
// Now all Flight Server components will use your logger
const server = new FlightServer({
host: 'localhost',
port: 8080
});
```
### Using the Default Console Logger
If no custom logger is provided, the package defaults to using `console`:
```javascript
import { FlightServer } from '@flightstream/core-server';
// Uses console by default
const server = new FlightServer({
host: 'localhost',
port: 8080
});
```
### Logger Interface
Your custom logger should implement these methods:
- `debug(message, meta?)`
- `info(message, meta?)`
- `warn(message, meta?)`
- `error(message, meta?)`
## Configuration Options
**Note:** Data source specific configurations (like CSV settings) should be configured in the respective adapter packages, not in the core server configuration.
### Server Configuration
```javascript
const server = new FlightServer({
// Connection settings
host: 'localhost', // Server host
port: 8080, // Server port
// Message size limits (100MB default)
maxReceiveMessageLength: 100 * 1024 * 1024,
maxSendMessageLength: 100 * 1024 * 1024,
// Advanced settings
keepAlive: true,
keepAliveTimeout: 20000,
keepAliveInterval: 10000,
// Performance settings
maxConcurrentRequests: 100,
requestTimeout: 30000,
// Logging
logLevel: 'info',
enableDebugLogging: false,
// Protocol
protoPath: '/path/to/flight.proto' // Custom proto file path
});
```
### Environment Variables
The server supports configuration via environment variables:
```bash
# Performance settings
MAX_CONCURRENT_REQUESTS=100
REQUEST_TIMEOUT=30000
# Logging
LOG_LEVEL=info
DEBUG=false
```
## API Reference
### FlightServer
The main server class that manages the gRPC server and Flight protocol.
#### Static Methods
- `FlightServer.setLogger(logger)` - Set the logger for the entire package
#### Constructor
```javascript
new FlightServer(options)
```
**Options:**
- `host` (string) - Server host (default: 'localhost')
- `port` (number) - Server port (default: 8080)
- `logger` (Object) - Custom logger instance
- `protoPath` (string) - Path to Flight protocol definition
- `maxReceiveMessageLength` (number) - Max message size for receiving (default: 100MB)
- `maxSendMessageLength` (number) - Max message size for sending (default: 100MB)
- `keepAlive` (boolean) - Enable keep-alive (default: true)
- `keepAliveTimeout` (number) - Keep-alive timeout in ms (default: 20000)
- `keepAliveInterval` (number) - Keep-alive interval in ms (default: 10000)
- `maxConcurrentRequests` (number) - Max concurrent requests (default: 100)
- `requestTimeout` (number) - Request timeout in ms (default: 30000)
- `logLevel` (string) - Log level (default: 'info')
- `enableDebugLogging` (boolean) - Enable debug logging (default: false)
#### Instance Methods
- `setFlightService(flightService)` - Set the Flight service adapter
- `start()` - Start the server (returns Promise<number> with port)
- `stop()` - Stop the server gracefully (returns Promise<void>)
- `getServerInfo()` - Get server information
- `isRunning()` - Check if server is running
#### Server Information
The `getServerInfo()` method returns an object with:
```javascript
{
host: string, // Server host
port: number, // Server port
maxReceiveMessageLength: number, // Max receive message size
maxSendMessageLength: number, // Max send message size
running: boolean, // Whether server is running
flightService: string, // Flight service class name
datasets: Array<string> // Available dataset IDs
}
```
### FlightServiceBase
Abstract base class for Flight service implementations.
```javascript
import { FlightServiceBase } from '@flightstream/core-server';
class MyFlightService extends FlightServiceBase {
async _initialize() {
// Initialize your service
}
async _initializeDatasets() {
// Discover and register datasets
}
async _inferSchemaForDataset(datasetId) {
// Infer Arrow schema for dataset
// Must return an Arrow Schema object
}
async _streamDataset(call, dataset) {
// Stream dataset data as Arrow record batches
// Use call.write() to send data
}
}
```
#### Abstract Methods
- `_initialize()` - Initialize the service (called automatically)
- `_initializeDatasets()` - Discover and register datasets
- `_inferSchemaForDataset(datasetId)` - Infer Arrow schema for a dataset
- `_streamDataset(call, dataset)` - Stream dataset data
#### Utility Methods
- `getDatasets()` - Get list of available dataset IDs
- `hasDataset(datasetId)` - Check if dataset exists
- `refreshDatasets()` - Refresh dataset discovery
## Error Handling
The server uses standard gRPC error codes:
- `5` (NOT_FOUND) - Dataset not found
- `3` (INVALID_ARGUMENT) - Invalid request
- `14` (UNAVAILABLE) - Service unavailable
- `4` (DEADLINE_EXCEEDED) - Request timeout
## Examples
See the `examples/` directory for complete working examples.
## Contributing
See [CONTRIBUTING.md](../../../CONTRIBUTING.md) for development guidelines.