# CDP Forge Plugin Pipeline SDK

SDK for easily implementing pipeline plugins for the CDP Forge platform.

This project serves as an SDK for building plugins that can be integrated into the data processing pipeline of the CDP Forge platform. It is designed to simplify the development of custom data transformation and processing logic within the platform ecosystem.

## 📦 Installation as NPM Library

You can install this library as a dependency in other projects:

```bash
npm install @cdp-forge/plugin-pipeline-sdk
```

### Usage as Library

```typescript
import {
  PipelinePluginI,
  PipelineStage,
  ConfigListener,
  ConfigReader,
  Log,
  start
} from '@cdp-forge/plugin-pipeline-sdk';

// Create a custom plugin
class MyCustomPlugin implements PipelinePluginI {
  async elaborate(log: Log): Promise<Log | null> {
    // Implement your processing logic
    console.log('Processing log:', log);
    return log;
  }

  async init(): Promise<void> {
    console.log('Plugin initialization');
  }
}

// Load configuration
const config = ConfigReader.getInstance('./config/config.yml', './config/plugin.yml').config;

// Create plugin instance and start the server
const customPlugin = new MyCustomPlugin();
start(customPlugin, config).then(({ stage, configListener }) => {
  console.log('Server started successfully');
}).catch(error => {
  console.error('Error during startup:', error);
});
```

## 🚀 Features

- **Pipeline Plugin:** Provides a structure for creating plugins that fit into a sequential or parallel processing pipeline
- **Kafka Integration:** Uses Kafka for asynchronous communication and data streaming between pipeline stages
- **TypeScript:** Written in TypeScript to improve code maintainability, type safety, and developer productivity
- **Docker Support:** Includes Docker configuration for deployment
- **Testing:** Jest configuration for unit tests
- **Configuration Management:** Automatic merging of cluster and plugin configurations

## 📋 Prerequisites

- Node.js 20.11.1 or higher
- npm or yarn
- Docker (optional, for deployment)
- Access to a Kafka cluster

## 🛠️ Installation

1. **Clone the repository:**
   ```bash
   git clone <repository-url>
   cd plugin-pipeline-sdk
   ```

2. **Install dependencies:**
   ```bash
   npm install
   ```

3. **Configure the environment:**
   - Copy and modify configuration files in `config/`
   - Ensure Kafka brokers are accessible

## ⚙️ Configuration

The SDK uses two separate configuration files to manage different aspects of the plugin system:

### Configuration File Structure

#### `config/config.yml` - Cluster Configuration

This file contains the **cluster-level configuration** that is **shared across all plugins** in the CDP Forge platform.

```yaml
kafkaConfig:
  brokers:
    - 'localhost:36715'
manager:
  url: 'https://plugin_template_url'
  config_topic: 'config'
mysql:
  uri: 'mysql://user:password@my-server-ip:3306'
```

**Important**: If you're using the **Helm installer** provided by the CDP Forge platform, this file is **automatically generated** and you should use the generated file in your plugin.

#### `config/plugin.yml` - Plugin-Specific Configuration

This file contains **plugin-specific settings** that define how your individual plugin behaves within the pipeline.

```yaml
plugin:
  name: 'myPlugin'
  priority: 1 # 1 to 100 (not required if parallel)
  type: 'blocking' # or 'parallel'
```

### Field Descriptions

#### Cluster Configuration (`config.yml`)

- **`kafkaConfig.brokers`**: List of Kafka broker addresses to which the plugin will connect. This is configured at the cluster level and shared by all plugins.
- **`manager.url`**: URL used to register or communicate with the plugin manager service.
- **`manager.config_topic`**: Kafka topic used for plugin configuration management across the cluster.
- **`mysql.uri`**: MySQL connection string for database operations.

#### Plugin Configuration (`plugin.yml`)

- **`plugin.name`**: Unique identifier for your plugin instance within the pipeline.
- **`plugin.priority`** (required only for `blocking` plugins): An integer from **1 to 100** that defines the execution order of the plugin within the pipeline. A lower number means higher priority, so the plugin with priority 1 is executed before plugins with priority 2, 3, 4, and so on.
- **`plugin.type`**: Defines the plugin execution mode:
  - `blocking`: The plugin processes data and returns a `Promise<Log>` for the next stage.
  - `parallel`: The plugin runs independently and returns a `Promise<void>`.

### Configuration Management

- **Cluster Config (`config.yml`)**: Managed by the platform, automatically generated by the Helm installer
- **Plugin Config (`plugin.yml`)**: Managed by you, defines your plugin's behavior
- **Environment Variables**: Can override both configurations if needed
- **Runtime Updates**: Plugin configuration can be updated without restarting the cluster

### Using ConfigReader for Convenience

The SDK provides a `ConfigReader` utility that automatically merges both configuration files into a single `config` object, making it easier to access all settings in your plugin code.
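
To make the shape of the merged object and the `plugin.yml` rules more concrete, here is a minimal sketch that runs without the SDK. It is not the SDK's implementation: `ClusterConfig`, `PluginConfig`, `MergedConfig`, `mergeConfigs`, and `validatePlugin` are hypothetical names, the types are inferred from the YAML examples in this README, and a shallow merge is assumed only because the two files use different top-level keys.

```typescript
// Hypothetical shapes inferred from the YAML examples in this README;
// the types actually exported by the SDK may differ.
interface ClusterConfig {
  kafkaConfig: { brokers: string[] };
  manager: { url: string; config_topic: string };
  mysql: { uri: string };
}

interface PluginConfig {
  plugin: {
    name: string;
    type: 'blocking' | 'parallel';
    priority?: number; // required only for blocking plugins
  };
}

type MergedConfig = ClusterConfig & PluginConfig;

// Illustrative stand-in for the merge step (ASSUMPTION: shallow merge).
function mergeConfigs(cluster: ClusterConfig, plugin: PluginConfig): MergedConfig {
  return { ...cluster, ...plugin };
}

// Checks the documented rules: a name is needed, and blocking plugins
// need an integer priority from 1 to 100. Returns a list of problems.
function validatePlugin(config: PluginConfig): string[] {
  const errors: string[] = [];
  const { name, type, priority } = config.plugin;
  if (!name) {
    errors.push('plugin.name is required');
  }
  if (type === 'blocking') {
    const valid =
      priority !== undefined &&
      Math.floor(priority) === priority &&
      priority >= 1 &&
      priority <= 100;
    if (!valid) {
      errors.push('plugin.priority must be an integer from 1 to 100 for blocking plugins');
    }
  }
  return errors;
}

const merged = mergeConfigs(
  {
    kafkaConfig: { brokers: ['localhost:36715'] },
    manager: { url: 'https://plugin_template_url', config_topic: 'config' },
    mysql: { uri: 'mysql://user:password@my-server-ip:3306' },
  },
  { plugin: { name: 'myPlugin', type: 'blocking', priority: 1 } },
);

console.log(merged.kafkaConfig.brokers); // cluster-level setting
console.log(merged.plugin.name); // plugin-level setting
console.log(validatePlugin(merged)); // empty list when the rules are met
```

Returning a list of problems rather than throwing on the first one is a design choice for the sketch: it lets a plugin author see every configuration issue in one pass.
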
```typescript
import { ConfigReader } from '@cdp-forge/plugin-pipeline-sdk';

// The ConfigReader automatically loads and merges:
// - config/config.yml (cluster configuration)
// - config/plugin.yml (plugin configuration)
const config = ConfigReader.getInstance('./config/config.yml', './config/plugin.yml').config;

// Access cluster configuration
console.log(config.kafkaConfig.brokers);
console.log(config.manager.url);

// Access plugin configuration
console.log(config.plugin.name);
console.log(config.plugin.priority);

// Access merged configuration
console.log(config.mysql.uri);
```

### Starting the Server with Configuration

The `start()` function requires the merged configuration to initialize the server:

```typescript
import { start, PipelinePluginI, Log, ConfigReader } from '@cdp-forge/plugin-pipeline-sdk';

const config = ConfigReader.getInstance('./config/config.yml', './config/plugin.yml').config;

class MyPlugin implements PipelinePluginI {
  async elaborate(log: Log): Promise<Log | null> {
    // Your plugin logic here
    return log;
  }

  async init(): Promise<void> {
    // Plugin initialization
  }
}

// Start the server with the merged configuration
start(new MyPlugin(), config).then(({ stage, configListener }) => {
  console.log('Server started with merged configuration');
}).catch(error => {
  console.error('Error starting server:', error);
});
```

The server will:

1. **Load** both configuration files using the specified paths
2. **Merge** them into a single config object
3. **Validate** the configuration
4. **Start** the plugin with the merged settings

## 🔧 Plugin Development

To create a new plugin, follow these steps:

1. **Configure the `config.yml` and `plugin.yml` files correctly**
2. **Implement the `elaborate` function in your plugin class**

### Plugin Implementation

The plugin must implement the `PipelinePluginI` interface:

```typescript
import { PipelinePluginI, Log } from '@cdp-forge/plugin-pipeline-sdk';

export default class MyPlugin implements PipelinePluginI {
  elaborate(log: Log): Promise<Log | null> {
    // Implement your processing logic here
    // For blocking plugins: return Promise<Log>
    // For parallel plugins: return Promise<void>
    return Promise.resolve(log);
  }

  init(): Promise<void> {
    // Plugin initialization
    return Promise.resolve();
  }
}
```

### Plugin Types

Depending on the plugin type:

- **`blocking` plugins**: The `elaborate` function must return a `Promise<Log>`.
- **`parallel` plugins**: The `elaborate` function must return a `Promise<void>`.

## 📁 Project Structure

```
plugin-pipeline-template/
├── config/                     # Configuration files
│   ├── config.yml              # Cluster configuration
│   └── plugin.yml              # Plugin-specific configuration
├── src/                        # TypeScript source code
│   ├── plugin/                 # Plugin implementation
│   │   ├── Plugin.ts           # Main plugin class
│   │   └── PipelinePluginI.ts  # Plugin interface
│   ├── types.ts                # Type definitions
│   ├── config.ts               # Configuration management
│   ├── index.ts                # Library entry point
│   └── ...                     # Other utility files
├── __tests__/                  # Unit tests
├── Dockerfile                  # Docker configuration
├── package.json                # Dependencies and scripts
└── tsconfig.json               # TypeScript configuration
```

## 🚀 Available Scripts

- **`npm run build`**: Compiles TypeScript code
- **`npm test`**: Runs unit tests
- **`npm run clean`**: Cleans the dist folder
- **`npm run prepublishOnly`**: Builds before publishing

## 🐳 Docker Deployment

1. **Build the image:**
   ```bash
   docker build -t plugin-pipeline-sdk .
   ```

2. **Run the container:**
   ```bash
   docker run -p 3000:3000 plugin-pipeline-sdk
   ```

## 📊 Data Structure

The plugin processes `Log` objects that contain:

```typescript
interface Log {
  client: number;
  date: string;
  device: {
    browser?: string;
    id: string;
    ip?: string;
    os?: string;
    type?: string;
    userAgent?: string;
  };
  event: string;
  geo?: {
    city?: string;
    country?: string;
    point?: {
      type: string;
      coordinates: number[];
    };
    region?: string;
  };
  googleTopics?: GoogleTopic[];
  instance: number;
  page: {
    description?: string;
    href?: string;
    image?: string;
    title: string;
    type?: string;
  };
  product?: Product[];
  referrer?: string;
  session: string;
  target?: string;
  order?: string;
  [key: string]: any; // Allows additional properties
}
```

## 📦 Publishing to NPM

To publish this library to npm, see the [Publishing Guide](PUBLISHING.md).

## 🤝 Contributing

Contributions are welcome! To contribute:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is distributed under the GPL-3.0 license. See the `LICENSE` file for more details.

## 📞 Support

For support and questions, please open an issue on the GitHub repository.
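
## 🧪 Example Plugin Sketch

The `elaborate` contract described under Plugin Development can be illustrated with a small filtering plugin. This is a minimal sketch that runs without the SDK, not code shipped with it: `BotFilterPlugin`, `filterAndTag`, `BOT_PATTERN`, and the `processedBy` property are hypothetical names, and the `Log` type is a trimmed local copy of the interface from the Data Structure section. It also assumes that returning `null` from `elaborate` stops the log from being forwarded, which the `Promise<Log | null>` signature suggests but this README does not state.

```typescript
// Trimmed local copy of the Log interface, declared here only so the
// sketch runs without the SDK installed.
interface Log {
  client: number;
  date: string;
  device: { id: string; userAgent?: string };
  event: string;
  instance: number;
  page: { title: string; href?: string };
  session: string;
  [key: string]: any; // Allows additional properties
}

// Hypothetical rule: user agents matching this pattern are automated traffic.
const BOT_PATTERN = /bot|crawler|spider/i;

// Pure helper, kept outside the class so it can be unit tested directly.
function filterAndTag(log: Log): Log | null {
  const userAgent = log.device.userAgent || '';
  if (BOT_PATTERN.test(userAgent)) {
    return null; // ASSUMPTION: null means "do not forward this log"
  }
  // The index signature on Log permits extra properties such as this one.
  return { ...log, processedBy: 'botFilter' };
}

// Has the same two methods as PipelinePluginI. With the SDK installed you
// would add `implements PipelinePluginI` and pass an instance to start().
class BotFilterPlugin {
  async init(): Promise<void> {
    // Nothing to set up in this sketch.
  }

  async elaborate(log: Log): Promise<Log | null> {
    return filterAndTag(log);
  }
}

// Sample data made up for the demonstration.
const sample: Log = {
  client: 1,
  date: '2024-01-01T00:00:00Z',
  device: { id: 'device-1', userAgent: 'Mozilla/5.0' },
  event: 'view',
  instance: 1,
  page: { title: 'Home' },
  session: 'session-1',
};

new BotFilterPlugin().elaborate(sample).then((result) => {
  console.log(result ? result.processedBy : 'dropped');
});
```

Keeping the decision logic in a plain function and letting `elaborate` delegate to it means the rule can be covered by Jest tests without starting Kafka or the plugin server.
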